Getting Started with Shell Programming: grep and Regular Expressions

Posted 2020-10-30


Basic features of bash (3)
1. It provides a programming environment

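Program = instructions + data

Programming styles
Procedural: centered on instructions; data serves the instructions
Object-oriented: centered on data; instructions serve the data

Shell programs: the shell provides programming facilities and executes scripts by interpreting them

How programs are executed:
Computer: runs binary instructions
Programming languages:
Low-level: assembly
High-level:
Compiled: source code --> compiler --> object code (C, C++; Java compiles to bytecode for the JVM)
Interpreted: source code --> interpreter --> machine instructions (shell, Perl, Python)

Procedural programming:
sequential execution, loops, conditional (selective) execution

Shell programming: procedural, interpreted
Basic building blocks of a programming language: data storage (variables, arrays), expressions, statements

Shell script: a text file
shebang: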
#!/bin/bash
#!/usr/bin/python
#!/usr/bin/perl

The leading #! is a magic number: it tells the kernel which interpreter should run the file.

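Running a script:
1. Give the file execute permission and run it by specifying its path
2. Run the interpreter directly and pass the script to it as an argument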
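The two ways of running a script can be tried with a throwaway file. This is a minimal sketch: the file name hello.sh and the temporary directory are illustrative assumptions, and it assumes the temporary directory allows executing files.

```shell
# Create a throwaway script in a temporary directory (names are illustrative).
demo_dir=$(mktemp -d)
cat > "$demo_dir/hello.sh" <<'EOF'
#!/bin/bash
echo "hello from script"
EOF

# Method 1: grant execute permission, then run the file by its path.
chmod +x "$demo_dir/hello.sh"
out_by_path=$("$demo_dir/hello.sh")

# Method 2: pass the script to the interpreter; no execute bit is needed.
out_by_interp=$(bash "$demo_dir/hello.sh")

echo "$out_by_path"
echo "$out_by_interp"
rm -rf "$demo_dir"
```

With method 1 the kernel reads the shebang line to pick the interpreter; with method 2 the interpreter is named on the command line, so the shebang is treated as an ordinary comment.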

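Variable: a named region of memory

How data is stored (taking ASCII as the encoding):
As characters: "110" is stored as three characters, which takes 24 bits
As a number: 110 fits in 8 bits (numeric types: integer, floating point)

What the type of a variable determines:
1. the storage format of the data; 2. the operations it can take part in; 3. the range of values it can represent

Types:
Character
Numeric: integer, floating point

Programming languages:
Strongly typed: C
Weakly typed: bash treats everything it stores as character strings, supports implicit type conversion, and does not support floating-point numbers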
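A small sketch of what weak typing means in practice. The variable names are illustrative; the point is that values are kept as strings and only converted to integers inside an arithmetic context.

```shell
# Values are stored as strings; "+" in a plain assignment is just text.
a=110
b=5
as_text="$a+$b"        # the literal string 110+5
as_sum=$((a + b))      # arithmetic expansion converts the strings to integers

# Shell arithmetic is integer-only, so division truncates.
quotient=$((7 / 2))

echo "$as_text $as_sum $quotient"
```

For fractional results an external tool such as bc or awk is normally used, since the shell itself has no floating-point type.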

Logical operations:
true, false (1, 0)

AND:
1 && 1 = 1
1 && 0 = 0
0 && 1 = 0
0 && 0 = 0

OR:
1 || 1 = 1
1 || 0 = 1
0 || 1 = 1
0 || 0 = 0

NOT:
!1 = 0
!0 = 1
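One point that is easy to trip over: the truth tables use 1 for true, but the shell decides truth by exit status, where 0 means success (true) and any non-zero value means failure (false). A minimal sketch using the built-in commands true and false:

```shell
# $? holds the exit status of the most recently executed command.
true  && status_true=$?      # runs because true succeeded; $? is 0
false || status_false=$?     # runs because false failed; $? is 1
! false && negated=$?        # "!" inverts the status of false; $? is 0

echo "$status_true $status_false $negated"
```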

Short-circuit evaluation:
AND:
If the first operand is 0 (false), the result must be 0, so the second command is never run. In the shell, a command counts as true when it succeeds (exit status 0) and as false when it fails (non-zero exit status).
# catt /etc/issue && cat /etc/issue
-bash: catt: command not found

# catt /etc/issue && cat /etc/issuee
-bash: catt: command not found

If the first operand is 1 (true), the second one must be evaluated.
# cat /etc/issue && echo "true"
CentOS release 6.5 (Final)
Kernel \r on an \m

true

# cat /etc/issue && catt /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m

-bash: catt: command not found

OR:
If the first operand is 1 (true), the result must be 1, so the second command is never run.
# cat /etc/issue || cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m

# cat /etc/issue || catt /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m

If the first operand is 0 (false), the second one must be evaluated.
# catt /etc/issue || cat /etc/issue
-bash: catt: command not found
CentOS release 6.5 (Final)
Kernel \r on an \m

# catt /etc/issue || cat /etc/issuee
-bash: catt: command not found
cat: /etc/issuee: No such file or directory

Combining AND and OR:
# cat /etc/issue &> /dev/null && echo "true" || echo "false"
true

# catt /etc/issue &> /dev/null && echo "true" || echo "false"
false
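A caveat about the combined form, sketched with the built-ins true and false standing in for real commands: A && B || C is not a full if/else, because C also runs when A succeeds but B fails.

```shell
# Chained form: the fallback runs because the middle command fails,
# even though the first command succeeded.
ran_fallback_chained=no
true && false || ran_fallback_chained=yes

# Explicit form: only the condition decides which branch runs.
ran_fallback_if=no
if true; then
    false || b_status=$?     # the middle command fails; the else branch is still skipped
else
    ran_fallback_if=yes
fi

echo "chained=$ran_fallback_chained if=$ran_fallback_if"
```

The chained form is safe when the middle command cannot fail, as with the echo "true" above; otherwise an explicit if/else is the more predictable choice.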

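grep:
The three classic text-processing tools on Linux:
grep: a text filtering tool (driven by a pattern)
    grep, egrep, fgrep
sed: stream editor, a text editing tool
awk: a text report generator; the implementation on Linux is gawk

grep: Global search REgular expression and Print out the line.
Purpose: a text search tool that checks the target text line by line against a user-specified pattern and prints the lines that match.
Pattern: a filter condition written with regular-expression characters and literal text characters.

REGEXP: a pattern written with special characters and literal text characters, in which some characters do not stand for themselves but act as control or wildcard characters.
There are two flavors:
Basic regular expressions: BRE
Extended regular expressions: ERE
    grep -E, egrep

Patterns are evaluated by a regular expression engine. Synopsis:
grep [options] pattern [file...]
Options:
--color=auto: highlight the matched text in color
# grep --color=auto "root" /etc/passwd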
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

-v: show the lines that the pattern does not match
# grep -v "abc" /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m

-i: ignore case
# grep -i "centos" /etc/issue
CentOS release 6.5 (Final)

-o: show only the matched string, not the whole line
# grep -o "release" /etc/issue
release

-q: quiet mode; print nothing and report the result only through the exit status
# grep -q "release" /etc/issue
# echo $?
0
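Because -q reports its result only through the exit status, it is typically used as a condition. A minimal sketch against a temporary file whose contents mimic the /etc/issue shown above (the file itself is an assumption, so the example does not depend on the local system):

```shell
# Build a sample file so the example is self-contained.
sample=$(mktemp)
printf 'CentOS release 6.5 (Final)\nKernel \\r on an \\m\n' > "$sample"

# grep -q prints nothing; the if statement tests its exit status.
if grep -q "release" "$sample"; then
    found_release=yes
else
    found_release=no
fi

if grep -q "Ubuntu" "$sample"; then
    found_ubuntu=yes
else
    found_ubuntu=no
fi

echo "release: $found_release, Ubuntu: $found_ubuntu"
rm -f "$sample"
```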

-A #: show each matching line plus the # lines after it (after); fewer lines are shown if the file ends first
# grep -A 2 "root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin

-B #: show each matching line plus the # lines before it (before); fewer lines are shown if the file starts first
# grep -B 2 "root" /etc/passwd
root:x:0:0:root:/root:/bin/bash

mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin

-C #: show each matching line plus # lines on both sides of it (context); fewer lines are shown where the file runs out
# grep -C 2 "root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin

-E: use extended regular expressions (ERE)
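The context options -A and -B described above can be checked on a small file. The sample data is invented for illustration, with "root" on lines 1 and 6 only. With GNU grep, groups of output lines that are not adjacent in the file are separated by a line containing "--".

```shell
# Hypothetical sample data: "root" appears on lines 1 and 6 only.
sample=$(mktemp)
printf '%s\n' 'root:x:0' 'bin:x:1' 'daemon:x:2' 'adm:x:3' \
              'lp:x:4' 'operator:/root' 'games:x:12' > "$sample"

after=$(grep -A 1 "root" "$sample")     # each match plus 1 line after it
before=$(grep -B 1 "root" "$sample")    # each match plus 1 line before it

echo "$after"
echo "=="
echo "$before"
rm -f "$sample"
```

The first match sits on line 1, so -B has nothing to show before it, which is the "fewer lines" case from the option descriptions.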

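Basic regular expressions
Character matching:
.: matches any single character
# grep "." /tmp/abc
a
b
c

[]: matches any single character in the specified set
# grep "[abc]" /tmp/abc
a
b
c

[^]: matches any single character not in the specified set
# grep "[^abc]" /tmp/abc
d
e

The POSIX character classes below are written inside a bracket expression, for example [[:digit:]]. Quote the pattern so that the shell does not expand the brackets as a filename glob.

[:digit:]: matches any single digit
# grep "[[:digit:]]" /tmp/abc
1
2
3

[:lower:]: matches any single lowercase letter
# grep "[[:lower:]]" /tmp/abc
a
b
c

[:upper:]: matches any single uppercase letter
# grep "[[:upper:]]" /tmp/abc
A
B
C

[:alpha:]: matches any single letter, uppercase or lowercase
# grep "[[:alpha:]]" /tmp/abc
a
b
c
A
B
C

[:alnum:]: matches any single letter or digit
# grep "[[:alnum:]]" /tmp/abc
a
b
c
1
2
3
A
B
C

[:punct:]: matches any single punctuation character
# grep "[[:punct:]]" /tmp/abc
,
.
?

[:space:]: matches any single whitespace character (space, tab, and so on)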
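The character classes can be compared side by side on one small file. This is a sketch with invented sample lines (a, B, 7, a comma, and a line holding a single space); -c, which counts matching lines, is a standard grep option not covered in the notes above.

```shell
# One character per line: lowercase, uppercase, digit, punctuation, space.
sample=$(mktemp)
printf '%s\n' a B 7 , ' ' > "$sample"

# Patterns are quoted so the shell never treats the brackets as a glob.
digits=$(grep '[[:digit:]]' "$sample")
lowers=$(grep '[[:lower:]]' "$sample")
uppers=$(grep '[[:upper:]]' "$sample")
puncts=$(grep '[[:punct:]]' "$sample")
space_lines=$(grep -c '[[:space:]]' "$sample")     # number of lines containing whitespace
non_alnum=$(grep -c '[^[:alnum:]]' "$sample")      # lines with a non-letter, non-digit character

echo "digit=$digits lower=$lowers upper=$uppers punct=$puncts"
echo "space lines=$space_lines, non-alnum lines=$non_alnum"
rm -f "$sample"
```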

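Repetition: a quantifier goes right after the character it applies to and specifies how many times that character must occur. In BRE the quantifiers other than * are written with a backslash; \? and \+ are GNU extensions.
*: matches the preceding character any number of times, including zero; matching is greedy (as long as possible)
# grep "a*b" /tmp/abc
b
b
ab
aab
aaab

.*: any characters, of any length
# grep ".*" /tmp/abc
a
b
c

\?: matches the preceding character zero or one time, i.e. the character is optional
# grep --color=auto "a\?b" /tmp/abc
b
b
ab
aab
aaab
xyab
xyxyab
xyxyxyab

\+: matches the preceding character at least once
# grep "a\+b" /tmp/abc
ab
aab
aaab

\{m\}: matches the preceding character exactly m times
# grep "a\{3\}b" /tmp/abc
aaab

\{m,n\}: matches the preceding character at least m and at most n times
# grep "a\{1,3\}b" /tmp/abc
ab
aab
aaab

\{0,n\}: matches the preceding character at most n times
# grep "a\{0,3\}b" /tmp/abc
b
b
ab
aab
aaab

\{m,\}: matches the preceding character at least m times
# grep "a\{2,\}b" /tmp/abc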
aab
aaab
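The quantifiers are easier to compare when they all run against the same input. A sketch assuming GNU grep (for \+) and an invented five-line sample; the last command shows the same interval written in ERE syntax, where the braces need no backslash.

```shell
sample=$(mktemp)
printf '%s\n' b ab aab aaab xyz > "$sample"

star=$(grep -c 'a*b' "$sample")             # zero or more a, then b: every line containing b
plus=$(grep -c 'a\+b' "$sample")            # at least one a before b
two_or_more=$(grep 'a\{2,\}b' "$sample")    # two or more a before b
one_or_two=$(grep '^a\{1,2\}b$' "$sample")  # anchored: the whole line is one or two a, then b
ere=$(grep -E 'a{2,}b' "$sample")           # ERE spelling of a\{2,\}b

echo "star=$star plus=$plus"
echo "$two_or_more"
echo "$one_or_two"
rm -f "$sample"
```

Without the ^ and $ anchors, an upper bound such as \{1,2\} has little visible effect, because the pattern may match anywhere inside a longer run of a.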

Position anchors:
^: start-of-line anchor, placed at the far left of the pattern
# grep "^root" /etc/passwd
root:x:0:0:root:/root:/bin/bash

$: end-of-line anchor, placed at the far right of the pattern
# grep "bash$" /etc/passwd
root:x:0:0:root:/root:/bin/bash
mary:x:503:503:I am mary.:/home/mary:/bin/bash
centos:x:504:504::/tmp/centos:/bin/bash
test:x:505:505::/tmp/test:/bin/bash
rocket:x:507:507::/home/rocket:/bin/bash

^pattern$: the pattern must match the whole line
# grep "^ab$" /tmp/abc
ab

^$: matches an empty line

\< or \b: start-of-word anchor, placed at the left of a word pattern
# grep "\<root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

\> or \b: end-of-word anchor, placed at the right of a word pattern
# grep "bash\>" /etc/passwd
root:x:0:0:root:/root:/bin/bash
mary:x:503:503:I am mary.:/home/mary:/bin/bash
centos:x:504:504::/tmp/centos:/bin/bash
test:x:505:505::/tmp/test:/bin/bash
rocket:x:507:507::/home/rocket:/bin/bash

\<pattern\>: matches a whole word
# grep "\<aaab\>" /tmp/abc
aaab
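The difference between line anchors and word anchors shows up once similar names are present. A sketch assuming GNU grep (for \< and \>); the four passwd-style lines are invented sample data, including the hypothetical users rooter and basher.

```shell
sample=$(mktemp)
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'operator:x:11:0:operator:/root:/sbin/nologin' \
    'rooter:x:600:600::/home/rooter:/bin/bash' \
    'basher:x:603:603::/home/basher:/bin/sh' > "$sample"

line_start=$(grep -c '^root' "$sample")      # lines starting with root, including rooter
first_word=$(grep '^root\>' "$sample")       # root must be a complete word at line start
whole_word=$(grep -c '\<root\>' "$sample")   # root as a complete word anywhere on the line
any_bash=$(grep -c 'bash' "$sample")         # also counts the line that only has basher
word_end=$(grep -c 'bash\>' "$sample")       # bash must be followed by a word boundary

echo "line_start=$line_start whole_word=$whole_word any_bash=$any_bash word_end=$word_end"
echo "$first_word"
rm -f "$sample"
```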

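Grouping:
\(\): binds one or more characters together so that they are treated as a single unit
# grep --color=auto "\(xy\)*ab" /tmp/abc
ab
aab
aaab
xyab
xyxyab
xyxyxyab

Note: the text matched by the pattern inside each pair of grouping parentheses is saved by the regular expression engine in internal variables named \1, \2, \3, ...
\1: the text matched by the pattern between the first left parenthesis, counting from the left, and its matching right parenthesis
\(ab\+\(xy\)\)
\1: the text matched by ab\+\(xy\)
\2: the text matched by xy

Back-reference: refers to the text matched by an earlier group (not to the pattern of that group)
# grep --color=auto "\(xy\)*ab\1\+" /tmp/abc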
xyabxy
xyxyabxyxy
xyxyxyabxyxyxy
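That a back-reference repeats the matched text, not the pattern, is easiest to see by running both forms on the same input. A sketch with four invented sentences; the pattern l..e matches both "love" and "like".

```shell
sample=$(mktemp)
printf '%s\n' \
    'He loves his lover.' \
    'He likes his lover.' \
    'She likes her liker.' \
    'She loves her liker.' > "$sample"

# Repeating the pattern: any l..e followed later by any l..e.
pattern_twice=$(grep -c 'l..e.*l..e' "$sample")

# Back-reference: the second occurrence must be the same text the group matched.
same_text=$(grep '\(l..e\).*\1' "$sample")

echo "pattern repeated: $pattern_twice lines"
echo "$same_text"
rm -f "$sample"
```

Every line passes the first test, but only the lines where the same four letters occur twice pass the second.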

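Exercises:
1. Display the lines in /proc/meminfo that begin with an uppercase or lowercase s (in two different ways)
grep "^[sS]" /proc/meminfo
grep -i "^s" /proc/meminfo

2. Display the lines in /etc/passwd that do not end with /bin/bash
grep -v "/bin/bash$" /etc/passwd

3. Display the name of the user with the largest ID in /etc/passwd
sort -t: -k3,3n /etc/passwd | tail -1 | cut -d: -f1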

4. If the user root exists, display its default shell
id root &> /dev/null && grep "^root\>" /etc/passwd | cut -d: -f7
grep "^root\>" /etc/passwd &> /dev/null && grep "^root\>" /etc/passwd | cut -d: -f7

5. Find the two- or three-digit numbers in /etc/passwd
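For the exercise of finding two- or three-digit numbers, one possible approach (a sketch, not necessarily the answer the author had in mind) combines the word anchors with an interval and uses -o to print only the numbers. It assumes GNU grep and runs on invented passwd-style lines so the result is predictable.

```shell
sample=$(mktemp)
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'operator:x:11:0:operator:/root:/sbin/nologin' \
    'mary:x:503:503::/home/mary:/bin/bash' \
    'nobody:x:65534:65534::/:/sbin/nologin' > "$sample"

# \< and \> keep the match from landing inside a longer number such as 65534.
numbers=$(grep -o '\<[0-9]\{2,3\}\>' "$sample")

echo "$numbers"
rm -f "$sample"
```

Against the real file the same pattern would be run as grep -o '\<[0-9]\{2,3\}\>' /etc/passwd; dropping -o prints the whole matching lines instead.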

6. Display the lines in /etc/rc.d/rc.sysinit that begin with at least one whitespace character followed by a non-whitespace character
grep "^[[:space:]]\+[^[:space:]]" /etc/rc.d/rc.sysinit

7. In the output of "netstat -tan", find the lines that end with "LISTEN" followed by zero, one, or more whitespace characters
netstat -tan | grep "LISTEN\([[:space:]]\)*$"
netstat -tan | grep "LISTEN[[:space:]]*$"

8. Add the users bash, testbash, basher, and nologin (whose shell is /sbin/nologin), then find the lines in /etc/passwd where the user name is the same as the shell name
useradd bash
useradd testbash
useradd basher
useradd -s /sbin/nologin nologin

# tail -4 /etc/passwd
bash:x:601:601::/home/bash:/bin/bash
testbash:x:602:602::/home/testbash:/bin/bash
basher:x:603:603::/home/basher:/bin/bash
nologin:x:604:604::/home/nologin:/sbin/nologin

grep "\([[:alnum:]]\+\).*\1\?" /etc/passwd    (wrong: \1\? is optional and nothing is anchored, so any line containing a letter or digit matches)
grep "^\([[:alnum:]]\{1,\}\>\).*\1$" /etc/passwd
grep "^\([[:alnum:]]\{1,\}\)\>.*\1$" /etc/passwd
grep "\(\<[[:alnum:]]\{1,\}\>\).*\1$" /etc/passwd
grep "\(\<[[:alnum:]]\+\>\).*\1$" /etc/passwd

Exercises:
1. Write a script that does the following:
If the user user1 exists, report that it exists; otherwise add it.
Then display the ID and related information of that user.
# cat /tmp/test/a.sh
#!/bin/bash

id user1 &> /dev/null && echo "user1 exists." || useradd user1

grep "\<user1\>" /etc/passwd
id user1

# bash /tmp/test/a.sh
user1:x:605:605::/home/user1:/bin/bash
uid=605(user1) gid=605(user1) groups=605(user1)

# bash /tmp/test/a.sh
user1 exists.
user1:x:605:605::/home/user1:/bin/bash
uid=605(user1) gid=605(user1) groups=605(user1)

2. Write a script that does the following:
If the user root is logged in to the current system, report that root is online; otherwise report that root is not logged in.
# cat /tmp/test/b.sh
#!/bin/bash

who | grep "^root\>" &> /dev/null && echo "user online." || echo "user not logged in."

# bash /tmp/test/b.sh
user online.
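The same check can also be written with an explicit if/else, which avoids relying on the && ... || chain. This is a sketch: the function name report_root_login is hypothetical, and it reads who-style output from standard input so that it can be tried with sample text instead of the live login list.

```shell
# Reads `who`-style output on stdin and reports whether root appears as a
# login name. The function name is illustrative, not from the original script.
report_root_login() {
    if grep -q '^root\>'; then
        echo "root is online."
    else
        echo "root is not logged in."
    fi
}

# Try it with sample text; on a real system the input would come from who.
online=$(printf 'root     pts/0   2020-10-30 10:00\n' | report_root_login)
offline=$(printf 'rooter   tty1    2020-10-30 09:00\n' | report_root_login)

echo "$online"
echo "$offline"
```

In a script the function would be fed the real data with who | report_root_login. The \> anchor keeps a login name such as rooter from being mistaken for root.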
