awk, grep, or sed: how to match two files

Posted 2023-05-08

【Title】: awk grep or sed : how to match two files
【Posted】: 2021-08-30 16:05:28
【Question】:

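(You are all awesome.)

How do I pass arguments from a previous command to awk or grep?

I want to match one field (the username) across two files.

Expected result: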

firstname   Lastname  Username IPaddress
Mozes       Bowman    user1    34.244.47.32
Jazzmyn     Parrish   user2    3.249.198.34
Chet        Woods     user3    52.215.73.213

from these two files:

**file1**
IPaddress       Username
34.244.47.32    user1
3.249.198.34    user2
52.215.73.213   user3

**file2**
firstname Lastname   Username
Mozes     Bowman     user1
Jazzmyn   Parrish    user2
Chet      Woods      user3

This is as far as I have got:

awk '{print $2,$1}' file1 | while read Username IP ; do grep $Username file2 && echo $IP; done

which prints each IP on a line of its own:

firstname   Lastname    Username
IPaddress
Mozes   Bowman  user1
34.244.47.32
Jazzmyn Parrish user2
3.249.198.34
Chet    Woods   user3
52.215.73.213

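【Comments】:

If matching lines can appear in a different order in the two files, or if either file can contain lines with no match in the other, include such lines in your example, since either case makes this a harder problem than what you have shown so far.

【Answer 1】:

You can try this awk: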

awk 'NR==FNR {r[$3]=$0; next} {print r[$2] "\t" $1}' file2 file1

Output:

firstname Lastname   Username   IPaddress
Mozes     Bowman     user1      34.244.47.32
Jazzmyn   Parrish    user2      3.249.198.34
Chet      Woods      user3      52.215.73.213

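【Answer 2】:

Please try the following awk code. It is intended to also handle usernames that contain spaces. It was written and tested against the samples shown by the OP.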

awk '
BEGIN{
  OFS="\t"
  print "firstname   Lastname  Username IPaddress"
}
NR==1{ next }
FNR==NR{
  match($0,/^([0-9]+\.)+[0-9]+[[:space:]]+/)
  value=substr($0,RSTART,RLENGTH)
  sub(/[[:space:]]+$/,"",value)
  arr[substr($0,RSTART+RLENGTH)]=value
  next
}
{
  first=$1
  second=$2
  $1=$2=""
  sub(/^[[:space:]]+/,"")
}
($0 in arr){
  print first,second,$0,arr[$0]
}
' file1 file2

Explanation: the same solution with a detailed comment on every line.

awk '                                                ##Start of the awk program.
BEGIN{                                               ##Start of the BEGIN section.
  OFS="\t"                                           ##Set the output field separator to a tab.
  print "firstname   Lastname  Username IPaddress"   ##Print the header line.
}
NR==1{ next }                                        ##Skip the very first line (the header of file1).
FNR==NR{                                             ##This condition is TRUE only while file1 is being read.
  match($0,/^([0-9]+\.)+[0-9]+[[:space:]]+/)         ##Match the leading IP address plus the spaces that follow it.
  value=substr($0,RSTART,RLENGTH)                    ##value is the substring matched by the regex.
  sub(/[[:space:]]+$/,"",value)                      ##Strip the trailing spaces from value.
  arr[substr($0,RSTART+RLENGTH)]=value               ##Index arr by the rest of the line (the username) and store the IP as its value.
  next                                               ##Skip all further statements for this line.
}
{                                                    ##From here on only lines of file2 are processed.
  first=$1                                           ##Save $1 of the current line in first.
  second=$2                                          ##Save $2 of the current line in second.
  $1=$2=""                                           ##Empty $1 and $2.
  sub(/^[[:space:]]+/,"")                            ##Remove the leading whitespace left behind.
}
($0 in arr){                                         ##If what is left of the line is an index of arr, then:
  print first,second,$0,arr[$0]                      ##Print the saved fields, the current line and the stored IP.
}
' file1 file2                                        ##The input file names.
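
To see the match() / RSTART / RLENGTH step in isolation, here is a minimal sketch that splits a single file1-style line into the IP and the remainder of the line. The sample line (with a space inside the username) and the variable names ip and key are made up for illustration; they are not part of the answer above.

```shell
#!/bin/sh
# Hypothetical input line: an IP, some padding, then a username containing a space.
line='34.244.47.32    user one'

out=$(printf '%s\n' "$line" | awk '{
  # match() sets RSTART and RLENGTH to the position and length of the match.
  if (match($0, /^([0-9]+\.)+[0-9]+[[:space:]]+/)) {
    ip = substr($0, RSTART, RLENGTH)      # the IP plus the padding after it
    sub(/[[:space:]]+$/, "", ip)          # drop the padding
    key = substr($0, RSTART + RLENGTH)    # everything after the match
    printf "ip=[%s] key=[%s]\n", ip, key
  }
}')
printf '%s\n' "$out"
```

Because the key is taken as "everything after the IP", a space inside the username stays part of the key instead of starting a new field.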

【Answer 3】:

Given the example in your question, all you need is:

$ paste file2 file1 | sed 's/ *[^ ]*$//'
firstname Lastname   Username   IPaddress
Mozes     Bowman     user1      34.244.47.32
Jazzmyn   Parrish    user2      3.249.198.34
Chet      Woods      user3      52.215.73.213
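
Note that paste pairs lines purely by position, so this only works when both files list the same users in the same order. If that cannot be assumed, one possible alternative is join on sorted copies. The sketch below is only an illustration: the sample data (rows reordered, user2 missing from file1) and the temporary file names are made up, and the header lines are dropped to keep the sort simple.

```shell
#!/bin/sh
export LC_ALL=C                      # same collation for sort and join
tmp=$(mktemp -d)

# Hypothetical sample data.
cat > "$tmp/file1" <<'EOF'
IPaddress       Username
52.215.73.213   user3
34.244.47.32    user1
EOF
cat > "$tmp/file2" <<'EOF'
firstname Lastname   Username
Mozes     Bowman     user1
Jazzmyn   Parrish    user2
Chet      Woods      user3
EOF

# Skip the headers, move the username to column 1, and sort on it.
awk 'NR>1 {print $2, $1}'     "$tmp/file1" | sort -k1,1 > "$tmp/ips"
awk 'NR>1 {print $3, $1, $2}' "$tmp/file2" | sort -k1,1 > "$tmp/names"

# Join on column 1 of both files; -o picks the output columns
# (0 is the join field itself). Users without a match are dropped.
joined=$(join -o 1.2,1.3,0,2.2 "$tmp/names" "$tmp/ips")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

With this data only user1 and user3 are printed, each followed by its IP address.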

【Answer 4】:

You can use this awk:

awk 'NR==FNR {a[$2]=$1; next} {print $0, a[$3]}' file1 file2 | column -t
firstname  Lastname  Username  IPaddress
Mozes      Bowman    user1     34.244.47.32
Jazzmyn    Parrish   user2     3.249.198.34
Chet       Woods     user3     52.215.73.213

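column -t is used to produce the tabular output.

【Comments】:

Thanks, it works. If I understand correctly, the [ ] in the second block gets its argument from the first [ ], through the variable a[]?

Yes, that is correct. We use the array a to store a key-value mapping from file1, with $2 as the key and $1 as the value. Later, while reading file2, we look up the same array using $3 as the key.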
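
The array lookup described above can be sketched on data where the rows are in a different order and some users have no match, the cases raised in the comment under the question. The sample data, the array name ip, and the ($3 in ip) guard are assumptions added for illustration; the guard simply skips file2 lines whose username was never seen in file1.

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Hypothetical sample data: file1 is reordered, has no user2, and has an extra user9.
cat > "$tmp/file1" <<'EOF'
IPaddress       Username
52.215.73.213   user3
34.244.47.32    user1
10.0.0.9        user9
EOF
cat > "$tmp/file2" <<'EOF'
firstname Lastname   Username
Mozes     Bowman     user1
Jazzmyn   Parrish    user2
Chet      Woods      user3
EOF

# First file (NR==FNR): store the IP under its username.
# Second file: print the line only if its username is a key of the array.
result=$(awk '
  NR==FNR   { ip[$2] = $1; next }
  ($3 in ip) { printf "%-10s %-10s %-9s %s\n", $1, $2, $3, ip[$3] }
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

The header is carried along automatically because "Username" is itself stored as a key with the value "IPaddress". Row order in file1 does not matter, since every lookup goes through the array.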
