Text processing: the grep command

Posted 2020-07-08 积少成多


 1 this is a words file.
 2 words words to be 
 3 1 2, 3  4 , 5 , 5 6 , 6 , 7 , 7 , 8 , 8 9 , 9 , 10
 4 beginning linux programming 4th edition
 5 1000 222222 334 5 99999
 6 
 7 this is a line containing pattern
 8 ,.<>?;‘;;;‘ [] {= = \ \ \| [email protected]#$%^&*() [email protected]#$$%%^&*(()*&%@(#$%))
 9 
10 www.regexper.com
11 www.google.com
12 www.baidu.com
13 www.redhat.com

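The test file is named n. It is listed above with its line numbers (the leading numbers are not part of the file) and has 13 lines in total.

grep searches line by line and prints each matching line.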

1. Search for lines that match a pattern

$ grep words n
this is a words file.
words words to be 

2. A single grep command can search several files; each output line is then prefixed with the name of the file it came from

$ grep words n n1 n2
n:this is a words file.
n:words words to be 
n1:this is a words file.
n1:words words to be 
n2:this is a words file.
n2:words words to be 

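3. To use extended regular expressions, add the -E option. The older egrep command behaves the same way but is deprecated in recent GNU grep releases. (In a terminal the matched portions are highlighted in red; the listing here shows only the matching lines.)

$ egrep "[a-o]+" n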
this is a words file.
words words to be 
beginning linux programming 4th edition
this is a line containing pattern
www.regexper.com
www.google.com
www.baidu.com
www.redhat.com
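
By default grep reads the pattern as a basic regular expression, in which + is an ordinary character; -E switches to the extended syntax, where + means "one or more". A minimal sketch of the difference, using a throwaway file whose name and contents are made up for this illustration:

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'google' 'o+' 'bar' > "$sample"

# Basic syntax: "o+" is the two literal characters "o" and "+".
bre_count=$(grep -c 'o+' "$sample")

# Extended syntax: "o+" means one or more "o" characters.
ere_count=$(grep -E -c 'o+' "$sample")

echo "basic: $bre_count, extended: $ere_count"
rm -f "$sample"
```

Only the line "o+" matches in basic mode, while both "google" and "o+" match with -E.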

4. To print only the matched text instead of the whole line, use -o (GNU grep accepts options after the file name, as in these transcripts)

$ grep words n
this is a words file.
words words to be 
$ grep words n -o
words
words
words

5. To print every line except those that match the pattern, use -v

$ grep words n -v
1 2, 3  4 , 5 , 5 6 , 6 , 7 , 7 , 8 , 8 9 , 9 , 10
beginning linux programming 4th edition
1000 222222 334 5 99999

this is a line containing pattern
,.<>?;‘;;;‘ [] {= = \ \ \| [email protected]#$%^&*() [email protected]#$$%%^&*(()*&%@(#$%))

www.regexper.com
www.google.com
www.baidu.com
www.redhat.com
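
Every line either matches or does not, so the counts reported with and without -v should add up to the total number of lines. A quick sanity check along those lines, assuming a made-up sample file rather than the article's n:

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'this is a words file.' 'words words to be' \
    '1000 222222 334 5 99999' '' 'www.google.com' > "$sample"

matching=$(grep -c words "$sample")         # lines that contain "words"
non_matching=$(grep -v -c words "$sample")  # lines that do not (the empty line included)
total=$(( $(wc -l < "$sample") ))           # arithmetic strips any padding wc adds

echo "$matching + $non_matching = $total"
rm -f "$sample"
```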

6. To count the lines that contain a match, use -c (-c counts matching lines, not individual matches)

$ grep words n -c
2
$ grep words n
this is a words file.
words words to be 

7. To count the individual matches, pipe the output of -o into wc -l

$ grep -o words n | wc -l
3
$ grep words n
this is a words file.
words words to be 
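
The two counts can be put side by side in a small script. This is only a sketch: the sample file reproduces the first two lines of n, and the variable names are illustrative.

```shell
# Hypothetical sample file holding the two "words" lines from the test file.
sample=$(mktemp)
printf '%s\n' 'this is a words file.' 'words words to be' > "$sample"

# -c reports how many lines contain at least one match.
line_count=$(grep -c words "$sample")

# -o prints each match on its own line, so wc -l counts the matches.
match_count=$(( $(grep -o words "$sample" | wc -l) ))

echo "matching lines: $line_count, total matches: $match_count"
rm -f "$sample"
```

The second line contains "words" twice, which is why the two numbers differ.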

8. To print the line number of each matching line, use -n

$ grep w -n n n1
n:1:this is a words file.
n:2:words words to be 
n:10:www.regexper.com
n:11:www.google.com
n:12:www.baidu.com
n:13:www.redhat.com
n1:1:this is a words file.
n1:2:words words to be 
n1:10:www.regexper.com
n1:11:www.google.com
n1:12:www.baidu.com
n1:13:www.redhat.com

9. To print the byte offset of each match, counted from the start of the file, use -b together with -o

$ grep words n
this is a words file.
words words to be 
$ grep words -b -o n
10:words
22:words
28:words
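
The offsets are zero-based byte positions within the whole file, which can be checked by reading the bytes back at a reported offset. The sketch below assumes GNU grep (where -b combined with -o reports the offset of the match itself) and a made-up sample file holding the first two lines of n:

```shell
# Hypothetical sample file holding the two "words" lines from the test file.
sample=$(mktemp)
printf '%s\n' 'this is a words file.' 'words words to be' > "$sample"

# Keep only the offset part of each "offset:match" line.
offsets=$(grep -b -o words "$sample" | cut -d: -f1)

# Read 5 bytes starting at the first reported offset.
first=$(printf '%s\n' "$offsets" | head -n 1)
slice=$(dd if="$sample" bs=1 skip="$first" count=5 2>/dev/null)

echo "offsets:" $offsets
echo "bytes at offset $first: $slice"
rm -f "$sample"
```

The first line is 21 characters plus a newline, so the match at the start of the second line sits at offset 22.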

10. To find out which of several files contain the text, use -l

$ grep words n n1
n:this is a words file.
n:words words to be 
n1:this is a words file.
n1:words words to be 
$ grep words -l n n1
n
n1

Use -L (uppercase L) for the opposite result: it lists the files that contain no match. Both n and n1 contain "words", so the -L run prints nothing here.

$ grep words -l n n1
n
n1
$ grep words -L n n1
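
To see -L produce output, at least one file has to lack the pattern. A small sketch with two made-up files (the names has_match.txt and no_match.txt are hypothetical):

```shell
# Hypothetical files, created only for this illustration.
dir=$(mktemp -d)
printf '%s\n' 'this is a words file.' > "$dir/has_match.txt"
printf '%s\n' 'www.google.com' > "$dir/no_match.txt"

# -l lists files with at least one match, -L lists files with none.
with_match=$(cd "$dir" && grep -l words has_match.txt no_match.txt)
without_match=$(cd "$dir" && grep -L words has_match.txt no_match.txt)

echo "contains a match: $with_match"
echo "contains no match: $without_match"
rm -rf "$dir"
```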

11. To search a directory tree recursively, use -R; adding -n prints the line number of each match after the file name

$ grep words . -R -n
./n:1:this is a words file.
./n:2:words words to be 
./n1:1:this is a words file.
./n1:2:words words to be 
./n2:1:this is a words file.
./n2:2:words words to be 

12. To ignore case when matching the pattern, use -i

$ grep WORDS -i n
this is a words file.
words words to be 

13. To match several patterns at once, give each one with its own -e

$ grep -e words -e www -o n
words
words
words
www
www
www
www
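
Several -e options select a line if any one of the patterns matches, which is the same set of lines an extended-regex alternation would select. A minimal comparison, assuming a made-up three-line sample file:

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'this is a words file.' 'beginning linux programming' \
    'www.google.com' > "$sample"

# One -e per pattern: a line is printed if any pattern matches.
with_e=$(grep -e words -e www "$sample")

# The same selection written as a single extended regular expression.
with_alternation=$(grep -E 'words|www' "$sample")

printf '%s\n' "$with_e"
rm -f "$sample"
```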

14. To read the patterns from a file, one per line, use -f; grep prints every line that matches any of them (the contents of the pattern file f are not shown here)

$ grep -f f n
this is a words file.
words words to be 
1 2, 3  4 , 5 , 5 6 , 6 , 7 , 7 , 8 , 8 9 , 9 , 10
1000 222222 334 5 99999
www.regexper.com
www.google.com
www.baidu.com
www.redhat.com
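
Since the pattern file itself is not listed, here is a self-contained sketch with a made-up one. The file names input.txt and patterns.txt and the two patterns are assumptions for illustration, not the contents of the author's f:

```shell
# Hypothetical input and pattern files, created only for this illustration.
dir=$(mktemp -d)
printf '%s\n' 'this is a words file.' 'beginning linux programming' \
    'www.google.com' > "$dir/input.txt"
printf '%s\n' 'words' 'www' > "$dir/patterns.txt"   # one pattern per line

# A line is printed if it matches any pattern in the file.
from_file=$(grep -f "$dir/patterns.txt" "$dir/input.txt")

printf '%s\n' "$from_file"
rm -rf "$dir"
```

An empty line in the pattern file counts as an empty pattern, which matches every input line, so stray blank lines in such a file are worth checking for.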

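15. To include or exclude files in a recursive search, use --include and --exclude. Quote the globs so the shell passes them to grep unexpanded, and give one --include per extension: an unquoted --include *.{c,cpp} expands to --include *.c *.cpp, which turns *.cpp into a file operand instead of a filter.

$ grep "main()" . -r --include "*.c" --include "*.cpp"

$ grep "main()" . -r --exclude "readme"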
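
The effect of the two filters can be tried on a throwaway directory tree. Everything below (the directory layout, the file names app.c and tool.cpp, and their contents) is made up for illustration; --include and --exclude are matched against file names as in GNU grep:

```shell
# Hypothetical source tree, created only for this illustration.
dir=$(mktemp -d)
mkdir "$dir/src"
printf '%s\n' 'int main() { return 0; }' > "$dir/src/app.c"
printf '%s\n' 'int main() { return 0; }' > "$dir/src/tool.cpp"
printf '%s\n' 'call main() to start' > "$dir/readme"

# Search only .c and .cpp files; -l prints just the file names.
included=$(cd "$dir" && grep -r -l --include='*.c' --include='*.cpp' 'main()' . | sort)

# Search everything except files named readme.
excluded=$(cd "$dir" && grep -r -l --exclude='readme' 'main()' . | sort)

printf '%s\n' "$included"
rm -rf "$dir"
```

In this tree both commands list the same two source files, because readme is the only other file that contains the text.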

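16. For silent operation, use -q: grep prints nothing and reports the result only through its exit status (0 when a match is found)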

#!/bin/bash
# The shebang has to be the first line of the script to take effect.
#########################################################################
# File Name: begin.sh
# Author: lizhen
# mail: [email protected]163.com
# Created Time: Wed 18 May 2016 08:29:32 PM CST
#########################################################################
if [ $# -ne 2  ]
then
    echo "usage: $0 match_text filename"
    exit 1
fi

match_text=$1
filename=$2
# Quote both arguments so patterns and file names containing spaces survive.
grep -q "$match_text" "$filename"

if [ $? -eq 0 ]
then
    echo "The text exists in the file"
else
    echo "text does not exist in the file"
fi

echo "done!"

$ ./begin.sh words n
The text exists in the file
done!
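
Because -q communicates only through the exit status, grep can also be used directly as the condition of an if statement, without inspecting $? afterwards. A minimal sketch of that variant; the helper name contains_text and the sample file are hypothetical:

```shell
#!/bin/bash
# contains_text PATTERN FILE: succeed if FILE has a line matching PATTERN.
# (Hypothetical helper name, not part of the original script.)
contains_text() {
    # -q suppresses output; -- keeps a pattern that starts with "-" from
    # being parsed as an option.
    grep -q -- "$1" "$2"
}

# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'this is a words file.' > "$sample"

if contains_text words "$sample"; then present="found"; else present="missing"; fi
if contains_text linux "$sample"; then absent="found"; else absent="missing"; fi

echo "words: $present, linux: $absent"
rm -f "$sample"
```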

17. To print the lines before or after each match as well, use -B (lines before), -A (lines after), or -C (both)

$ grep www -B 3 n
this is a line containing pattern
,.<>?;‘;;;‘ [] {= = \ \ \| [email protected]#$%^&*() [email protected]#$$%%^&*(()*&%@(#$%))

www.regexper.com
www.google.com
www.baidu.com
www.redhat.com
$ grep www -B 1 n

www.regexper.com
www.google.com
www.baidu.com
www.redhat.com
$ grep www n
www.regexper.com
www.google.com
www.baidu.com
www.redhat.com

$ grep words -A 1 n
this is a words file.
words words to be 
1 2, 3  4 , 5 , 5 6 , 6 , 7 , 7 , 8 , 8 9 , 9 , 10

$ grep 2222 n
1000 222222 334 5 99999
$ grep 2222 n -C 1
beginning linux programming 4th edition
1000 222222 334 5 99999

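
In the transcripts above the matches sit next to each other, so the context lines form one block. When matches are far apart, GNU grep prints a line containing -- between the separate groups. A small sketch that feeds the numbers 1 to 9 through grep instead of using the test file:

```shell
# Nine numbered lines stand in for a file; the input is made up for illustration.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9)

# One line of context after each match; the two groups are not adjacent.
after=$(printf '%s\n' "$numbers" | grep -A 1 -e '^2$' -e '^7$')

# One line of context on both sides of a single match.
around=$(printf '%s\n' "$numbers" | grep -C 1 '^5$')

printf '%s\n' "$after"
```

With GNU grep the first command prints 2, 3, a -- separator, then 7 and 8; the second selects lines 4, 5 and 6.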
