Replace each nth occurrence of anything between two strings using the nth line from another file

Posted 2023-03-24


Asked: 2021-12-21 17:53:01

Question:

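I want to use awk to replace anything between two strings in 0.txt, group_tree ( and )\t", so that each nth occurrence is replaced with the nth line of another file, 1.txt.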

This is similar to Replace each nth occurrence of 'foo' by numerically respective nth line of a supplied file

I have been searching and tried to adapt https://***.com/a/21876700/10824251, but I could not work out how to apply it to what I need. Here is my attempt:

awk \

'NR==FNR a[NR]=$0; next /^group_tree(/ /gsub("tortoise", a[++i]) /^)\t",/1' \
    1.txt 0.txt

It produces no result, only this message:

Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:      GNU long options: (standard)
.....

My source files:

0.txt:

"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",

1.txt:

(food, apple,)(bag, tortoise,)
(sky, cat,)(sun, sea,)
(car, shape)(milk, market,)
(man, shirt)(hair, life)
(dog, big)(bal, pink)

My desired output, 2.txt:

"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((man, shirt)(hair, life))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((dog, big)(bal, pink))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",

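Comments:

Are the \ts in your input literally the 2 characters \ and t, or does your input contain actual literal tab characters? In your question text you say group_tree ( with a space, but there is no such space in your sample input/output. Please edit your question so the text and the example are consistent and correct.

Answer 1: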

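POSIX awk:

awk '
# First file (1.txt): store each replacement line, indexed from 0.
FNR==NR { a[i++] = $0 }
# Second file (0.txt): replace the text between the delimiters and
# advance j only when a replacement was actually made.
FNR!=NR { if (sub(/group_tree[[:space:]]*\(.*\)\\t",$/,
                  "group_tree("a[j%i]")\\t\",")) j++
          print }' 1.txt 0.txt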

Your description has a space between group_tree and (, but your sample data does not. I allowed for either case.

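Because of the greedy .* between the two patterns, this is not completely robust. If all of your data looks like the sample, it will probably be fine.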
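To illustrate the greedy-match caveat, here is a minimal sketch. The input line is hypothetical (it is not from the sample data) and contains the closing delimiter twice; NEW is a placeholder replacement. Because .* is greedy, the match runs to the last )\t", on the line, so the second item is swallowed.

```shell
# Hypothetical line with two closing delimiters (not from the sample data).
line='"a = house.group_tree((x))\t", "b = other(y)\t",'

# Same pattern shape as the answer, with a fixed placeholder replacement.
out=$(printf '%s\n' "$line" |
  awk '{ sub(/group_tree\(.*\)\\t",$/, "group_tree(NEW)\\t\","); print }')

# The text between the two delimiters, including the second item, is gone.
printf '%s\n' "$out"
```

With data shaped like the sample, where each line has a single closing delimiter at the end, this situation does not arise.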

Note that sub(/foo/, a[j++]) would increment j whether or not sub succeeded.
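The difference can be seen in a small sketch. The input lines and the two replacement values A and B are made up for illustration; a, i and j play the same roles as in the script above, and j % i wraps the index around once the values run out.

```shell
# Made-up input: only the "foo" lines should consume a replacement value.
input=$(printf '%s\n' x foo x foo foo)

# Guarded form: j advances only when sub() reports a replacement.
guarded=$(printf '%s\n' "$input" | awk '
BEGIN { a[0] = "A"; a[1] = "B"; i = 2 }
{ if (sub(/foo/, a[j % i])) j++; out = out sep $0; sep = " " }
END { print out }')

# Naive form: j++ is evaluated on every line, matched or not.
naive=$(printf '%s\n' "$input" | awk '
BEGIN { a[0] = "A"; a[1] = "B"; i = 2 }
{ sub(/foo/, a[j++ % i]); out = out sep $0; sep = " " }
END { print out }')

printf 'guarded: %s\nnaive:   %s\n' "$guarded" "$naive"
```

The guarded form yields x A x B A, while the naive form yields x B x B A, because the non-matching lines also advance the counter.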

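Comments:

I am getting back to work on this question now. I had not noticed the space you mentioned, but I will try to correct that. Your answer works; I will now look at how it works.

That will fail if 1.txt contains &. Try changing sun, sea to sun & sea in 1.txt.

@dan could you explain what j%i means? I know it is a remainder, but I am not sure whether j and i are indexes.

@dan could you let me know whether gsub could also replace sub?

Answer 2: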

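The error message you get from calling awk is because you have a blank line between awk \ and the script, so it is as if awk were called with no script and no arguments. If you change it from this: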

awk \

'NR==FNR a[NR]=$0; next /^group_tree(/ /gsub("tortoise", a[++i]) /^)\t",/1' \
    1.txt 0.txt

to this:

awk \
'NR==FNR a[NR]=$0; next /^group_tree(/ /gsub("tortoise", a[++i]) /^)\t",/1' \
    1.txt 0.txt

or, more idiomatically:

awk '
    NR==FNR a[NR]=$0; next /^group_tree(/ /gsub("tortoise", a[++i]) /^)\t",/1
' 1.txt 0.txt

then you will no longer get that error message (but you will get a different one, because the script still contains syntax errors).
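The effect of the blank line can be reproduced without awk. In this sketch, count_args is a hypothetical helper that just prints how many arguments it received; the argument names are placeholders.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# The backslash joins this line to the next one, which is empty, so the
# command ends there with no arguments. Anything on later lines would be
# run as a separate command.
with_blank=$(count_args \

)

# Without the blank line, the continuation reaches the arguments.
without_blank=$(count_args \
'program' 1.txt 0.txt)

printf 'with blank line: %s\nwithout: %s\n' "$with_blank" "$without_blank"
```

The first call sees 0 arguments and the second sees 3, which is why awk printed its usage message in the original attempt.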

To solve your actual problem, though, using GNU awk for the 3rd argument to match() and for ARGIND:

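$ cat tst.awk
# First file (1.txt): collect the replacement lines.
ARGIND == 1 {
    newVals[++totNew] = $0
    next
}

# Lines with both delimiters: a[1] and a[2] hold the text before and after
# the part to replace; the index cycles through newVals.
match($0,/(.*group_tree\().*(\)\\t",.*)/,a) {
    newIdx = ( (++numNew - 1) % totNew ) + 1
    $0 = a[1] newVals[newIdx] a[2]
}

{ print }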

$ awk -f tst.awk 1.txt 0.txt
"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((man, shirt)(hair, life))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((dog, big)(bal, pink))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",

The above assumes there is only one )\t", after each group_tree(.
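For awks without ARGIND or the 3-argument match(), the same idea can be sketched with the standard match(), RSTART, RLENGTH and substr(). This is an illustrative variant, not the script above: the names vals, n, k, head and tail and the tiny input files are made up, and it assumes 1.txt is non-empty so that FNR == NR identifies the first file. Building the line by concatenation keeps an & in 1.txt literal.

```shell
tmp=$(mktemp -d)
# Made-up inputs: two replacement lines, one of them containing "&".
printf '%s\n' '(a & b)' '(c, d)' > "$tmp/1.txt"
printf '%s\n' '"x = house.group_tree((old))\t",' \
              '"machine(shop)\t",' \
              '"x = house.group_tree((old))\t",' \
              '"x = house.group_tree((old))\t",' > "$tmp/0.txt"

out=$(awk '
FNR == NR { vals[++n] = $0; next }      # first file: replacement lines
n > 0 && match($0, /group_tree\(.*\)\\t",/) {
    # Keep everything up to and including "group_tree(" ...
    head = substr($0, 1, RSTART + length("group_tree(") - 1)
    # ... and everything from the closing delimiter onwards.
    tail = substr($0, RSTART + RLENGTH - length(")\\t\","))
    $0 = head vals[(k++ % n) + 1] tail   # concatenation: "&" stays literal
}
{ print }
' "$tmp/1.txt" "$tmp/0.txt")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Here k is incremented only inside the action that runs on matching lines, so the values cycle (a & b), (c, d), (a & b) across the three group_tree lines while the machine line passes through unchanged.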

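Comments:

Could you explain each step of your script? It runs without showing any error, but 2.txt simply ends up with the contents of 0.txt.

Which part is unclear? I think the structure of what it does is obvious, and I gave it very clear variable names so you know what each one is for, so I cannot imagine what needs explaining.

The other answer was quicker to test, since it worked on the first try and in further tests after changing the two strings. I will test your update and give you feedback as soon as possible.

The main differences between my solution and the currently accepted answer are that I use capture groups, so I do not need to specify the same delimiter strings twice, once in the regexp and again in the replacement, and that I do literal string replacement rather than using *sub() with backreference-enabled replacement, so it will work no matter which characters 1.txt contains, e.g. &. I am also testing ARGIND==1 instead of NR==FNR, so you can duplicate that for any additional files (which I know you will be adding in a follow-up question). My script is gawk-only, though.

What code uses FNR? I do not know what tortoise is or what it has to do with my answer. If you tell me which part of the answer you do not understand, I will explain it.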
