grep/sed/awk - Replace all "$X" in file with new computed value "$X/10"

Posted: 2021-12-31 00:43:51

【Question】:

This is my original file:

#Game No : 1000693
***** 888poker Hand History for Game 1000693 *****
$1/$2 Blinds No Limit Holdem - *** 18 11 2021 10:41:44
Table DD Poker 9 Max (Real Money)
Seat 7 is the button
Total number of players : 9
Seat 1: Monroe ( $2 )
Seat 2: Mpeti ( $2 )
Seat 3: Bowen ( $2 )
Seat 4: Riccardo ( $2 )
Seat 5: Ramman ( $2 )
Seat 6: Reva ( $2 )
Seat 7: Miklos ( $2 )
Seat 9: Samlet ( $2 )
Seat 10: Geralyn ( $2 )
Hamlet posts small blind [$1]
Geralyn posts big blind [$2]
** Dealing down cards **
Monroe calls [$2]
Mpeti folds
Bowen folds
Riccardo folds
Ramman folds
Reva folds
Miklos calls [$2]
Hamlet folds
** Dealing flop ** [ As, 6s, Ah ]
** Dealing turn ** [ 8s ]
** Dealing river ** [ 4h ]
** Summary **
Monroe shows [ Jh, Td ]
Miklos shows [ Tc, Jc ]
Samlet shows [ Th, 8c ]
Geralyn shows [ Js, Ts ]
Geralyn collected [ $17 ]

For every dollar value "$X", I need to replace it with "$X/10" (e.g. $2/10 = 0.2). I can extract the dollar values and divide them by 10 with this pipeline:

grep -o '\$[[:digit:]]*' dollar.txt | sed 's/[^0-9]//g' | awk '{print $1/10}'

0.1
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.1
0.2
0.2
0.2

How can I change the whole original file (or create a new file) so that the rest of the file is kept and every "$X" value is replaced with "$X/10"?

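【Comments】:

There is no reason to use sed or grep here at all; this can be implemented reasonably efficiently in 100% awk... though personally I would reach for Python for this one. (What Python has built in, and what has to be done by hand in awk, is the ability to pass a function that computes the replacement value to the standard library's regex-based find-and-replace.)

【Answer 1】: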

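An implementation as a pure awk script (no bash needed) might look like this:

#!/usr/bin/env awk -f
# Note: Linux passes "awk -f" to env as a single argument, so this shebang may
# fail there; running the script as `awk -f awkscript` avoids the problem.
# Only lines that contain a dollar amount need rewriting.
/[$][[:digit:]]+/ {
  # Replace one match per iteration until none remain.
  while (match($0, /[$][[:digit:]]+/) > 0) {
    prefix = substr($0, 1, RSTART-1)          # text before the match
    suffix = substr($0, RSTART+RLENGTH)       # text after the match
    input = substr($0, RSTART+1, RLENGTH-1)   # the digits, without the "$"
    output = input / 10
    $0 = prefix output suffix
  }
}
{ print }

When run as ./awkscript <in.txt or awk -f awkscript <in.txt, the output is: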

#Game No : 1000693
***** 888poker Hand History for Game 1000693 *****
0.1/0.2 Blinds No Limit Holdem - *** 18 11 2021 10:41:44
Table DD Poker 9 Max (Real Money)
Seat 7 is the button
Total number of players : 9
Seat 1: Monroe ( 0.2 )
Seat 2: Mpeti ( 0.2 )
Seat 3: Bowen ( 0.2 )
Seat 4: Riccardo ( 0.2 )
Seat 5: Ramman ( 0.2 )
Seat 6: Reva ( 0.2 )
Seat 7: Miklos ( 0.2 )
Seat 9: Samlet ( 0.2 )
Seat 10: Geralyn ( 0.2 )
Hamlet posts small blind [0.1]
Geralyn posts big blind [0.2]
** Dealing down cards **
Monroe calls [0.2]
Mpeti folds
Bowen folds
Riccardo folds
Ramman folds
Reva folds
Miklos calls [0.2]
Hamlet folds
** Dealing flop ** [ As, 6s, Ah ]
** Dealing turn ** [ 8s ]
** Dealing river ** [ 4h ]
** Summary **
Monroe shows [ Jh, Td ]
Miklos shows [ Tc, Jc ]
Samlet shows [ Th, 8c ]
Geralyn shows [ Js, Ts ]
Geralyn collected [ 1.7 ]
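
For comparison, the Python route mentioned in the comments can be sketched as follows. This is a minimal sketch and not part of the original answer: the name divide_dollars is hypothetical, and it assumes every amount is a whole number written as "$" followed by digits. re.sub accepts a function as the replacement, so the match/substr bookkeeping of the awk loop is left to the library. Formatting with "g" is an assumption made here so that 2/10 prints as 0.2; like awk's default number formatting, it keeps at most six significant digits.

```python
import re

# A dollar sign followed by a whole-number amount, e.g. "$17".
DOLLAR_INT = re.compile(r"\$(\d+)")


def divide_dollars(text):
    """Replace every "$N" in text with N/10, dropping the dollar sign."""
    def repl(match):
        # "g" formatting prints 0.2 for 2/10 and 1.7 for 17/10.
        return format(int(match.group(1)) / 10, "g")

    # re.sub calls repl once per match and splices in its return value.
    return DOLLAR_INT.sub(repl, text)


print(divide_dollars("$1/$2 Blinds No Limit Holdem"))  # 0.1/0.2 Blinds No Limit Holdem
print(divide_dollars("Geralyn collected [ $17 ]"))     # Geralyn collected [ 1.7 ]
```

Lines without a dollar amount pass through unchanged, which matches the behaviour of the awk script above.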

【Comments】:

【Answer 2】:

I would use GNU AWK for this task. Let the content of file.txt be

#Game No : 1000693
***** 888poker Hand History for Game 1000693 *****
$1/$2 Blinds No Limit Holdem - *** 18 11 2021 10:41:44
Table DD Poker 9 Max (Real Money)
Seat 7 is the button
Total number of players : 9
Seat 1: Monroe ( $2 )
Seat 2: Mpeti ( $2 )
Seat 3: Bowen ( $2 )
Seat 4: Riccardo ( $2 )
Seat 5: Ramman ( $2 )
Seat 6: Reva ( $2 )
Seat 7: Miklos ( $2 )
Seat 9: Samlet ( $2 )
Seat 10: Geralyn ( $2 )
Hamlet posts small blind [$1]
Geralyn posts big blind [$2]
** Dealing down cards **
Monroe calls [$2]
Mpeti folds
Bowen folds
Riccardo folds
Ramman folds
Reva folds
Miklos calls [$2]
Hamlet folds
** Dealing flop ** [ As, 6s, Ah ]
** Dealing turn ** [ 8s ]
** Dealing river ** [ 4h ]
** Summary **
Monroe shows [ Jh, Td ]
Miklos shows [ Tc, Jc ]
Samlet shows [ Th, 8c ]
Geralyn shows [ Js, Ts ]
Geralyn collected [ $17 ]

then

awk 'BEGIN{FPAT="[^0-9]|([0-9]*)";OFS=""}{dollar=0;for(i=1;i<=NF;i+=1){if(dollar){$i*=0.1};dollar=($i=="$")};print}' file.txt

outputs

#Game No : 1000693
***** 888poker Hand History for Game 1000693 *****
$0.1/$0.2 Blinds No Limit Holdem - *** 18 11 2021 10:41:44
Table DD Poker 9 Max (Real Money)
Seat 7 is the button
Total number of players : 9
Seat 1: Monroe ( $0.2 )
Seat 2: Mpeti ( $0.2 )
Seat 3: Bowen ( $0.2 )
Seat 4: Riccardo ( $0.2 )
Seat 5: Ramman ( $0.2 )
Seat 6: Reva ( $0.2 )
Seat 7: Miklos ( $0.2 )
Seat 9: Samlet ( $0.2 )
Seat 10: Geralyn ( $0.2 )
Hamlet posts small blind [$0.1]
Geralyn posts big blind [$0.2]
** Dealing down cards **
Monroe calls [$0.2]
Mpeti folds
Bowen folds
Riccardo folds
Ramman folds
Reva folds
Miklos calls [$0.2]
Hamlet folds
** Dealing flop ** [ As, 6s, Ah ]
** Dealing turn ** [ 8s ]
** Dealing river ** [ 4h ]
** Summary **
Monroe shows [ Jh, Td ]
Miklos shows [ Tc, Jc ]
Samlet shows [ Th, 8c ]
Geralyn shows [ Js, Ts ]
Geralyn collected [ $1.7 ]

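Explanation: using FPAT, I tell GNU AWK that a field is either a run of zero or more digits or any other single character. I then iterate over the fields with a for loop: first, if the dollar variable is true, I replace the field with 0.1 times its value; then I set dollar to true if the field equals $. The order is crucial, because it means the multiplication is applied to the field immediately after a dollar sign. After all fields are processed I print the line. I set the output field separator (OFS) to the empty string to prevent unwanted characters from being inserted.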
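
The field-by-field idea can also be illustrated outside awk. The following is a rough Python analogue under stated assumptions, not a translation of the one-liner: scale_after_dollar is a hypothetical name, the token list only approximates FPAT splitting (empty fields are not modelled), and a "$" that is not followed by digits is left alone here.

```python
import re


def scale_after_dollar(line):
    """Multiply by 0.1 every digit run that directly follows a "$"."""
    # Tokens are either a run of ASCII digits or one non-digit character.
    tokens = re.findall(r"[0-9]+|[^0-9]", line)
    dollar = False  # True when the previous token was "$"
    for i, tok in enumerate(tokens):
        if dollar and "0" <= tok[0] <= "9":
            tokens[i] = format(int(tok) * 0.1, "g")
        # Update the flag after the check, so it refers to the previous token.
        dollar = (tok == "$")
    # Joining with the empty string plays the role of OFS="".
    return "".join(tokens)


print(scale_after_dollar("Seat 1: Monroe ( $2 )"))  # Seat 1: Monroe ( $0.2 )
print(scale_after_dollar("#Game No : 1000693"))     # #Game No : 1000693
```

As in the awk version, the dollar sign is kept and numbers that do not follow a "$" (such as the game number) are untouched.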

(tested in GNU Awk 5.0.1)

【Comments】:

【Answer 3】:

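If .2 is good enough in place of 0.2, you can simply insert a dot before the last digit of each dollar value:

sed -E 's/(\$)([0-9]*)([0-9])/\2.\3/g' file

Otherwise (improved, thanks to potong's suggestion):

sed -E '
    s/\$([0-9]+)([0-9])/\1.\2/g
    s/\$([0-9])/0.\1/g' file

For dollar values with two or more digits, remove the $ and insert a dot before the last digit (in effect dividing by 10). Any remaining dollar values have a single digit: remove the $ and put 0. in front of the digit. This will not work if the input can already contain decimal dollar values (such as $10.00).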
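
The two substitutions are pure string edits, with no arithmetic involved. A small Python sketch of the same two passes makes the effect and the caveat easy to check; shift_decimal is a hypothetical name and the patterns are copied from the sed commands above.

```python
import re


def shift_decimal(text):
    """Divide "$N" amounts by 10 by moving the decimal point as text."""
    # Pass 1: two or more digits. Drop "$" and put a dot before the last digit.
    text = re.sub(r"\$([0-9]+)([0-9])", r"\1.\2", text)
    # Pass 2: what is left has one digit. Drop "$" and prepend "0.".
    return re.sub(r"\$([0-9])", r"0.\1", text)


print(shift_decimal("Geralyn collected [ $17 ]"))  # Geralyn collected [ 1.7 ]
print(shift_decimal("Monroe calls [$2]"))          # Monroe calls [0.2]
print(shift_decimal("$10.00"))                     # 1.0.00
```

The last call shows why existing decimal values are a problem: only the integer part is rewritten, and the original fraction is left attached to it.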

【Comments】:

The second solution could be sed -E 's/\$([0-9]+)([0-9])/\1.\2/g;s/\$([0-9])/0.\1/g' file?

@potong Yes, functionally the same and shorter.

【Answer 4】:

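If you can use perl for this, the solution is about as short as it gets:

perl -i -pe 's{\$(\d+)}{($1/10)}ge' file

See the online demo. Details:

-i - option to edit the file in place
s{\$(\d+)}{($1/10)}ge - substitution command whose RHS (the replacement part) is a Perl expression (enabled by the e flag). It matches every (g) occurrence on each line of a $ sign followed by one or more digits (captured into Group 1, $1) and replaces each match with the result of dividing the Group 1 value by 10.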
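
For readers without perl, in-place editing can be approximated by writing the converted text to a temporary file and then swapping it over the original. The sketch below is an illustration under assumptions, not a description of how perl -i works internally: rewrite_in_place is a hypothetical helper, the file is assumed to be UTF-8, and the original file's permissions are not carried over.

```python
import os
import re
import tempfile


def rewrite_in_place(path):
    """Rewrite the file at path so that every "$N" becomes N/10."""
    def repl(match):
        return format(int(match.group(1)) / 10, "g")

    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it in.
    # newline="" keeps the original line endings untouched.
    with open(path, encoding="utf-8", newline="") as src, \
            tempfile.NamedTemporaryFile("w", encoding="utf-8", newline="",
                                        dir=directory, delete=False) as dst:
        for line in src:
            dst.write(re.sub(r"\$(\d+)", repl, line))
    # Assumption: permissions of the original file are not preserved.
    os.replace(dst.name, path)
```

Calling rewrite_in_place("file") would then play the role of the perl command above. Writing to a new file instead only requires opening a different output path.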

If you want to explicitly avoid replacing floating-point values (e.g. $10.500), you can replace \$(\d+) with \$(\d+)(?!\.?\d)
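
The negative lookahead works the same way in Python's re module, which makes it easy to try. A minimal sketch, with the hypothetical name divide_integer_dollars and the same "g" formatting assumption as any float-to-text conversion needs:

```python
import re

# "$" plus digits, but only when no digit (optionally after a dot) follows,
# so "$10.500" is skipped entirely instead of being matched as "$10" or "$1".
INTEGER_DOLLAR = re.compile(r"\$(\d+)(?!\.?\d)")


def divide_integer_dollars(text):
    """Divide whole-number "$N" amounts by 10; leave decimal amounts alone."""
    return INTEGER_DOLLAR.sub(
        lambda m: format(int(m.group(1)) / 10, "g"), text)


print(divide_integer_dollars("Geralyn collected [ $17 ]"))  # Geralyn collected [ 1.7 ]
print(divide_integer_dollars("pot is $10.500"))             # pot is $10.500
```

The lookahead also blocks backtracking to a shorter digit run: after giving up on "$10", the engine tries "$1", sees that a digit follows, and rejects that too.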

【Comments】:

Here I see the power of perl: replacing each match with the result of division of the Group 1 value. Simple and efficient; +1

【Answer 5】:

Using GNU awk for "inplace" editing and the third argument to match():

$ awk -i inplace 'match($0,/(.*\$)([0-9]+)(.*)/,a){ $0=a[1] a[2]/10 a[3] } 1' file

$ cat file
#Game No : 1000693
***** 888poker Hand History for Game 1000693 *****
$1/$0.2 Blinds No Limit Holdem - *** 18 11 2021 10:41:44
Table DD Poker 9 Max (Real Money)
Seat 7 is the button
Total number of players : 9
Seat 1: Monroe ( $0.2 )
Seat 2: Mpeti ( $0.2 )
Seat 3: Bowen ( $0.2 )
Seat 4: Riccardo ( $0.2 )
Seat 5: Ramman ( $0.2 )
Seat 6: Reva ( $0.2 )
Seat 7: Miklos ( $0.2 )
Seat 9: Samlet ( $0.2 )
Seat 10: Geralyn ( $0.2 )
Hamlet posts small blind [$0.1]
Geralyn posts big blind [$0.2]
** Dealing down cards **
Monroe calls [$0.2]
Mpeti folds
Bowen folds
Riccardo folds
Ramman folds
Reva folds
Miklos calls [$0.2]
Hamlet folds
** Dealing flop ** [ As, 6s, Ah ]
** Dealing turn ** [ 8s ]
** Dealing river ** [ 4h ]
** Summary **
Monroe shows [ Jh, Td ]
Miklos shows [ Tc, Jc ]
Samlet shows [ Th, 8c ]
Geralyn shows [ Js, Ts ]
Geralyn collected [ $1.7 ]

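If you want to remove the $s from the output, just change \$) to )\$ in the regexp used in match().

The above assumes you have only 0 or 1 $<number> per line and that all numbers are integers.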
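
The one-amount-per-line assumption follows from the greedy leading (.*\$): it consumes as much as it can, so only the last $<number> on a line is captured. A small Python sketch with the same regular expression shows the effect; divide_last_dollar is a hypothetical name.

```python
import re


def divide_last_dollar(line):
    """Divide by 10 only the last "$N" on the line, keeping the "$"."""
    # The greedy (.*\$) runs up to the final "$" that is followed by digits.
    m = re.match(r"(.*\$)([0-9]+)(.*)", line)
    if m is None:
        return line
    value = format(int(m.group(2)) / 10, "g")
    # Splice the new value in place of group 2 and keep the rest as is.
    return line[:m.start(2)] + value + line[m.end(2):]


print(divide_last_dollar("$1/$2 Blinds No Limit Holdem"))  # $1/$0.2 Blinds No Limit Holdem
print(divide_last_dollar("Seat 1: Monroe ( $2 )"))         # Seat 1: Monroe ( $0.2 )
```

This is why the blinds line in the output above reads $1/$0.2: the first amount on that line is never visited.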

【Comments】:
