An Introduction to Regular Expressions on Linux with grep and egrep

Posted 2020-10-15

About regular expressions:

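A regular expression is commonly defined as a single string that describes, or matches, a whole set of strings conforming to some syntactic rule. Many text editors and other tools use regular expressions to search for and replace text that fits a pattern, and many programming languages support string operations based on them. For system administrators, regular expressions run through everyday operations work: whether you are locating a document or searching a log file and analyzing its contents, you end up using them. A regular expression is really just an idea, a notation. Any tool that supports the notation can process strings with regular expressions. Common tools include grep, sed, and awk, all of which operate on text line by line.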
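To illustrate that the notation is shared between tools, here is a minimal sketch that runs the same pattern through grep, sed, and awk. The sample lines and the variable names are made up for illustration; only the three commands themselves are standard.

```shell
#!/bin/sh
# Hypothetical sample data; not taken from any real system file.
f=$(mktemp)
printf '%s\n' 'root:/bin/bash' 'daemon:/sbin/nologin' 'operator:/root:/sbin/nologin' > "$f"

# The same regular expression, r.o, handed to three line-oriented tools.
with_grep=$(grep 'r.o' "$f")        # grep prints matching lines by default
with_sed=$(sed -n '/r.o/p' "$f")    # sed: -n suppresses output, p prints matches
with_awk=$(awk '/r.o/' "$f")        # awk: a bare pattern prints matching lines

echo "$with_grep"
rm -f "$f"
```

All three commands should select the same two lines, because each reads its input one line at a time and tests the pattern against that line.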

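Using grep:

The main purpose of the grep command is to filter text for a given keyword.

Command format: grep [-cinvrABC] 'word' filename

The common options are:

-c: print the number of matching lines instead of the lines themselves.
-i: ignore case.
-n: print matching lines together with their line numbers.
-v: invert the match, printing the lines that do not match.
-r: search recursively through all subdirectories.
-A: followed by a number (with or without a space); for example, -A2 prints each matching line plus the two lines after it.
-B: followed by a number; for example, -B2 prints each matching line plus the two lines before it.
-C: followed by a number; for example, -C2 prints each matching line plus the two lines before and after it.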

Hands-on verification:

Preparation:

[[email protected] ~]# mkdir grep 
[[email protected] ~]# cd grep/
[[email protected] grep]# cp /etc/passwd .
[[email protected] grep]# pwd
/root/grep
[[email protected] grep]# ls
passwd

Examples:

[[email protected] grep]# grep -c 'nologin' passwd  # print the number of lines that contain the keyword
17
[[email protected] grep]# grep -n 'root' passwd  # print lines containing the keyword with their line numbers; the leading number (shown in green in the terminal) is the line number
1:root:x:0:0:root:/root:/bin/bash
10:operator:x:11:0:operator:/root:/sbin/nologin
[[email protected] grep]# grep -nv 'nologin' passwd  # print lines that do NOT contain the keyword, with line numbers
1:root:x:0:0:root:/root:/bin/bash
6:sync:x:5:0:sync:/sbin:/bin/sync
7:shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
8:halt:x:7:0:halt:/sbin:/sbin/halt
22:aming:x:1000:1005::/home/aming:/bin/bash
23:user1:x:1001:1001::/home/user1:/bin/bash
24:aminglinux:x:1002:1002::/home/aminglinux:/bin/bash
25:user2:x:1003:1003::/home/user2:/bin/bash
26:user3:x:1004:1005::/home/user3:/bin/bash
28:user5:x:1007:1007::/home/user5:/bin/bash
29:user6:x:1008:1010::/home/user6:/bin/bash
[[email protected] grep]# vim passwd  # edit the file and change "no" on line 2 to uppercase "NO"
[[email protected] grep]# grep -n 'nologin' passwd  # line 2, now uppercase, is no longer shown
3:daemon:x:2:2:daemon:/sbin:/sbin/nologin
4:adm:x:3:4:adm:/var/adm:/sbin/nologin
5:lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
9:mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10:operator:x:11:0:operator:/root:/sbin/nologin
.... (omitted)
[[email protected] grep]# grep -ni 'nologin' passwd  # -i ignores case, so the uppercase line is listed again
2:bin:x:1:1:bin:/bin:/sbin/NOlogin
3:daemon:x:2:2:daemon:/sbin:/sbin/nologin
4:adm:x:3:4:adm:/var/adm:/sbin/nologin
5:lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
9:mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10:operator:x:11:0:operator:/root:/sbin/nologin
.... (omitted)
[[email protected] grep]# grep -r 'root' /etc/  # -r searches every file under /etc, including all subdirectories
/etc/pki/ca-trust/ca-legacy.conf:# The upstream Mozilla.org project tests all changes to the root CA
/etc/pki/ca-trust/ca-legacy.conf:# to temporarily keep certain (legacy) root CA certificates trusted,
/etc/pki/ca-trust/ca-legacy.conf:#   It may keep root CA certificate as trusted, which the upstream 
/etc/pki/ca-trust/extracted/README:root CA certificates.
/etc/pki/ca-trust/extracted/java/README:root CA certificates.
.... (omitted)
[[email protected] grep]# grep -A2 'halt' passwd  # -A2 prints each line containing halt plus the two lines after it
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
[[email protected] grep]# grep -B2 'halt' passwd  # -B2 prints each line containing halt plus the two lines before it
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
[[email protected] grep]# grep -C2 'halt' passwd  # -C2 prints each line containing halt plus the two lines before and after it
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
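The options above can also be tried without a real passwd file. The following is a minimal sketch that builds a small sample file and stores the result of each option in a variable. The file contents and variable names are made up for illustration and are not taken from the system used above.

```shell
#!/bin/sh
# Hypothetical sample data, loosely shaped like /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/NOlogin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
EOF

count=$(grep -c 'nologin' "$sample")       # number of matching lines
count_i=$(grep -ci 'nologin' "$sample")    # same count, ignoring case
numbered=$(grep -n 'root' "$sample")       # matching lines with line numbers
inverted=$(grep -vi 'nologin' "$sample")   # lines that do not match
after=$(grep -A1 'bin:x' "$sample")        # each match plus one following line

echo "count=$count count_i=$count_i"
echo "$numbered"
echo "$inverted"
echo "$after"
rm -f "$sample"
```

Because the bin line spells the shell as NOlogin, the case-sensitive count comes out one lower than the case-insensitive one, which mirrors the vim edit made in the transcript above.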

Further experiments:

// Print all lines that contain a digit (a single digit is enough for a match)

[[email protected] grep]# grep '[0-9]' /etc/inittab
# multi-user.target: analogous to runlevel 3
# graphical.target: analogous to runlevel 5

// Print all lines that contain no digit (a line with even one digit is not shown)

[[email protected] grep]# grep -v '[0-9]' /etc/inittab
# inittab is no longer used when using systemd.
#
# ADDING CONFIGURATION HERE WILL HAVE NO EFFECT ON YOUR SYSTEM.
#
# Ctrl-Alt-Delete is handled by /usr/lib/systemd/system/ctrl-alt-del.target
#
# systemd uses 'targets' instead of runlevels. By default, there are two main targets:
#
#
# To view current default target, run:
# systemctl get-default
#
# To set a default target, run:
# systemctl set-default TARGET.target
#

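// Filter out all lines that begin with #; the output still contains blank lines.

[[email protected] grep]# cp /etc/inittab .  # the experiment works on a copy of the file in the grep directory
[[email protected] grep]# ls
inittab  passwd
[[email protected] grep]# vim inittab  # add a few arbitrary lines to the copy; note: do not casually edit the original file in the system directory, or the system may fail to boot
[[email protected] grep]# grep -v '^#' inittab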
%^&$$%%##$%%%%
djffafdfafd
773834442345

$$%%HFHFHf121

767dfadfdf

// Filter out all blank lines and all lines that begin with #

[[email protected] grep]# grep -v '^#' inittab | grep -v '^$'
%^&$$%%##$%%%%
djffafdfafd
773834442345
$$%%HFHFHf121
767dfadfdf
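The two-stage pipeline above can also be written as a single grep call. The sketch below uses a made-up stand-in for the edited inittab copy. It relies on two standard grep features: -e may be repeated to supply several patterns, and with -v a line is printed only if it matches none of them.

```shell
#!/bin/sh
# Hypothetical stand-in for the edited inittab copy.
conf=$(mktemp)
printf '%s\n' '# a comment' 'djffafdfafd' '' '773834442345' '#' '' '767dfadfdf' > "$conf"

# Two greps chained together, as in the transcript above.
piped=$(grep -v '^#' "$conf" | grep -v '^$')

# One grep with two patterns: drop lines starting with # and empty lines.
single=$(grep -v -e '^#' -e '^$' "$conf")

# The same filter as one extended regular expression.
extended=$(grep -Ev '^(#|$)' "$conf")

echo "$single"
rm -f "$conf"
```

All three forms should leave the same three lines. Which one to use is mostly a matter of taste; the single-call forms avoid starting a second process.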

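Additional notes:

In a regular expression, ^ marks the start of a line and $ marks the end, so an empty line can be written as ^$. How, then, do you print the lines that do not begin with an English letter? As follows.

First, create a test file: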

[[email protected] ~]# mkdir /tmp/1
[[email protected] ~]# cd /tmp/1
[[email protected] 1]# vim test.txt  # write a few lines of text into test.txt for the experiment
[[email protected] 1]# cat test.txt 
123
abc
456

abc2323
#fjdkfaf
A423423423

The experiment:

[[email protected] 1]# grep '^[^a-zA-Z]' test.txt
123
456
#fjdkfaf
[[email protected] 1]# grep '[^a-zA-Z]' test.txt
123
456
abc2323
#fjdkfaf
A423423423

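Note: for digits, use the form [0-9] (a form such as [15] matches a single character that is either 1 or 5). To match digits as well as upper- and lowercase letters, write something like [0-9a-zA-Z]. In addition, [^chars] matches any character other than those listed inside the brackets. Note that ^ means different things inside and outside the square brackets: as the first character inside, it negates the set; outside, it anchors the match to the start of the line.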
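The bracket forms from the note can be compared side by side. This is a small sketch that recreates the test.txt contents shown earlier in a temporary file and counts the matching lines for each form. The variable names are illustrative, and LC_ALL=C is set only so that the a-z and A-Z ranges are interpreted as plain ASCII.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

# Same seven lines as the test.txt used above (one of them is empty).
f=$(mktemp)
printf '%s\n' '123' 'abc' '456' '' 'abc2323' '#fjdkfaf' 'A423423423' > "$f"

one_or_five=$(grep -c '[15]' "$f")             # lines containing a 1 or a 5
alnum_start=$(grep -c '^[0-9a-zA-Z]' "$f")     # lines starting with a digit or letter
letter_start=$(grep -c '^[a-zA-Z]' "$f")       # ^ outside: anchor to line start
not_letter_start=$(grep -c '^[^a-zA-Z]' "$f")  # ^ inside: negate the set

echo "$one_or_five $alnum_start $letter_start $not_letter_start"
rm -f "$f"
```

The empty line is counted by neither letter_start nor not_letter_start, since a bracket expression always has to consume one character.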

// Matching any single character and repeated characters

[[email protected] grep]# grep 'r.o' passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

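. matches any single character. In the example above, r.o selects the lines in which r and o are separated by exactly one arbitrary character.

[[email protected] grep]# grep 'ooo*' passwd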
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin

* matches zero or more of the preceding character. In the example above, ooo* matches oo, ooo, oooo, or even more o's.

[[email protected] grep]# grep '.*' passwd | wc -l
29
[[email protected] grep]# wc -l passwd 
29 passwd

In the example above, .* matches zero or more of any character, empty lines included, so it matches every line in the passwd file.
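Because * also allows zero repetitions, a pattern such as o* on its own matches every line, even lines with no o at all. The sketch below shows this on made-up sample lines. It also uses grep -o, an option supported by GNU and BSD grep that prints only the matched text, to make visible what ooo* actually consumed.

```shell
#!/bin/sh
# Hypothetical sample: three words and one empty line.
f=$(mktemp)
printf '%s\n' 'root' 'spool' 'bin' '' > "$f"

star_only=$(grep -c 'o*' "$f")    # zero o's is allowed, so every line matches
two_plus=$(grep -c 'ooo*' "$f")   # two literal o's, then zero or more further o's
dot_star=$(grep -c '.*' "$f")     # anything at all, including the empty line
matched=$(grep -o 'ooo*' "$f")    # only the matched part of each hit

echo "$star_only $two_plus $dot_star"
echo "$matched"
rm -f "$f"
```

This is why the transcript writes ooo* rather than oo* when it wants "two or more": the literal characters in front of the starred one set the minimum.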

// Specifying how many times a character must occur

[[email protected] grep]# grep '0\{2\}' passwd
games:x:12:100:games:/usr/games:/sbin/nologin
aming:x:1000:1005::/home/aming:/bin/bash
user1:x:1001:1001::/home/user1:/bin/bash
aminglinux:x:1002:1002::/home/aminglinux:/bin/bash
user2:x:1003:1003::/home/user2:/bin/bash
user3:x:1004:1005::/home/user3:/bin/bash
user4:x:1006:1005::/home/aming111:/sbin/nologin
user5:x:1007:1007::/home/user5:/bin/bash
user6:x:1008:1010::/home/user6:/bin/bash

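This example uses the {} notation. The number inside it says how many times the preceding character must repeat. Note that in grep both the { and the } must be escaped with a backslash (\). {} can also express a range, written {n1,n2} with n1 < n2, meaning the preceding character repeats between n1 and n2 times; n2 may be left empty, which means n1 or more times.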
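The three interval forms can be sketched on a made-up file whose lines differ only in the number of o's. The anchors ^ and $ are added here on purpose: grep looks for the pattern anywhere in the line, so without them a run of three or four characters still contains a run of two, which is why 1000 and 1001 matched 0\{2\} above.

```shell
#!/bin/sh
# Hypothetical sample: r followed by one to four o's.
f=$(mktemp)
printf '%s\n' 'ro' 'roo' 'rooo' 'roooo' > "$f"

exactly_two=$(grep -c '^ro\{2\}$' "$f")     # only "roo"
two_to_three=$(grep -c '^ro\{2,3\}$' "$f")  # "roo" and "rooo"
two_or_more=$(grep -c '^ro\{2,\}$' "$f")    # "roo", "rooo" and "roooo"
unanchored=$(grep -c 'o\{2\}' "$f")         # any line that contains "oo" somewhere

echo "$exactly_two $two_to_three $two_or_more $unanchored"
rm -f "$f"
```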

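Using egrep:

egrep is the extended version of grep (equivalent to grep -E). It understands extended regular expression syntax, so operators such as +, ?, | and () can be written without backslashes.

The experiments below demonstrate several uses of egrep.

For convenience, edit test.txt so that it contains the following: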

rot:x:o:o:/rot:/bin/bash
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
1111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

// Matching one or more of a given character

[[email protected] grep]# vim test.txt
[[email protected] grep]# egrep 'o+' test.txt
rot:x:o:o:/rot:/bin/bash
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
[[email protected] grep]# egrep 'oo+' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
[[email protected] grep]# egrep 'ooo+' test.txt
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash

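Unlike grep, egrep uses the + symbol here, which matches one or more of the character before it. Plain grep does not treat an unescaped + this way; in basic regular expressions it is just a literal plus sign. The {} notation shown earlier can likewise be used directly in egrep, with no backslash escaping. For example: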

[[email protected] grep]# egrep 'o{2}' passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
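The difference between the two syntaxes can be checked directly. The sketch below runs the same ideas through basic (grep) and extended (grep -E) syntax on a made-up file. It uses grep -E rather than the egrep command because the two are equivalent and, as far as I know, recent GNU grep releases print an obsolescence warning for egrep. The sample line "o+" is there only to show what an unescaped + means in basic syntax.

```shell
#!/bin/sh
# Hypothetical sample; the last line is a literal o followed by a plus sign.
f=$(mktemp)
printf '%s\n' 'rot' 'root' 'rooot' 'o+' > "$f"

bre_interval=$(grep 'o\{2\}' "$f")    # basic syntax: braces need backslashes
ere_interval=$(grep -E 'o{2}' "$f")   # extended syntax: braces written bare
bre_plus=$(grep 'o+' "$f")            # basic syntax: + is a literal plus sign
ere_plus=$(grep -E 'o+' "$f")         # extended syntax: one or more o

echo "$bre_interval"
echo "$bre_plus"
echo "$ere_plus"
rm -f "$f"
```

The two interval commands should select the same lines, while the two + commands should not: the basic one finds only the line containing the text "o+".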

// Matching zero or one of a given character

[[email protected] grep]# egrep 'o?' test.txt
rot:x:o:o:/rot:/bin/bash
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
1111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[[email protected] grep]# egrep 'ooo?' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
[[email protected] grep]# egrep 'oooo?' test.txt
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash

// Matching string 1 or string 2

[[email protected] grep]# egrep 'aaa|111|ooo' test.txt
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
1111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

// Using () in egrep

[[email protected] grep]# egrep 'r(oo|at)o' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash

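Here () groups its contents into a single unit. The example above selects the lines that contain rooo or rato. () can also be combined with other operators; for example, (oo)+ means one or more occurrences of oo:

[[email protected] grep]# egrep '(oo)+' test.txt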
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/rooot:/sbin/nologin
roooot:x:o:o:/roooooot:/bin/bash
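To see what a grouped pattern actually matches, it can help to print only the matched text. The following is a minimal sketch on made-up one-word lines, using grep -E in place of egrep (the two are equivalent) and grep -o, an option of GNU and BSD grep that outputs just the matching part.

```shell
#!/bin/sh
# Hypothetical sample words.
f=$(mktemp)
printf '%s\n' 'rot' 'root' 'rooot' 'roooot' 'operator' > "$f"

alt=$(grep -E 'r(oo|at)o' "$f")    # lines containing "rooo" or "rato"
pairs=$(grep -oE '(oo)+' "$f")     # matched text only: whole "oo" pairs

echo "$alt"
echo "$pairs"
rm -f "$f"
```

Note that operator is selected by the first pattern because of the rato inside it, and that (oo)+ consumes o's two at a time, so the three o's in rooot yield a match of only oo.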

This article comes from the "Gary博客" blog. Please be sure to keep this source attribution: http://taoxie.blog.51cto.com/10245493/1983586
