linux基础学习-10.3-正则表达式详解

Posted klanti

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了linux基础学习-10.3-正则表达式详解相关的知识,希望对你有一定的参考价值。

一、正则表达式 RE regular expression

1、什么是正则 为何用它?

你可以通过什么方法选出这里面的身份证号码。

440304199604012792

130528197108126121

3605sss98304033896

342923198310042132

1404ddddddddd5694X

61242619860416291X

5002xxxxxx04279521

330900199806382320

654126197703092303

131127197105115662

数字与X(在最后一位)

通过符号匹配查找出各种文字。

正则表达式通过特殊符号 ^ $ [] . * 表示各种各样的文字。

方便我们处理文本(日志)。

2、正则使用范围

谁可以使用正则?

三剑客正则(grep sed awk )

python java

3、正则表达式与通配符区别

正则-----在文件中进行过滤(查找文件内容) 三剑客支持

通配符---找出文件(文件名) 大部分命令都可以使用

4、使用正则注意事项

1) 正则默认是按照行为单位处理。

2) 一定要注意不要使用中文符号。

. ‘‘""^ `` ( ) {} [] <>

。‘’“”……??( ){}【】《》

3) 给grep/egrep加上别名

cat >>/etc/profile<<EOF

alias grep=‘grep --color=auto‘

alias egrep=‘egrep --color=auto‘

EOF

source /etc/profile

5、正则分类

基础正则: ^ $ . * []

扩展正则: + | () {} ?

技术分享图片技术分享图片基本正则 grep sed

扩展的正则 egrep sed -r awk

适合linux三剑客

以行为单位处理

cat >>/etc/profile<<EOF

alias egrep=‘egrep --color=auto‘

alias grep=‘grep --color=auto‘

EOF

source /etc/profile

export LC_ALL=C

测试文件

cat >>~/test/oldboy.log<<EOF

I am oldboy teacher!

I teach linux.

I like badminton ball ,billiard ball and chinese chess!

my blog ishttp://oldboy.blog.51cto.com

our site ishttp://www.etiantian.org

my qq num is 49000448.

not 4900000448.

my god ,i am not oldbey,but OLDBOY!

gd

good

god

goood

gooood

oldboy1

EOF

二、基础正则

基础正则(BRE)第一波字符说明

1、 ^word 匹配开头word

2、 word$ 匹配word结尾

3、 ^$ 匹配开头结尾 即空行

[[email protected] test]# grep "^m" oldboy.log

my blog ishttp://oldboy.blog.51cto.com

my qq num is 49000448.

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep "m$" oldboy.log

my blog ishttp://oldboy.blog.51cto.com

[[email protected] test]# grep -n "^$" oldboy.log

3:

8:

[[email protected] test]# grep "^m" oldboy.log

my blog ishttp://oldboy.blog.51cto.com

my qq num is 49000448.

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -n "^$" oldboy.log

3:

8:

[[email protected] test]# grep "oldboy" oldboy.log

I am oldboy teacher!

my blog ishttp://oldboy.blog.51cto.com

[[email protected] test]# grep "oldb.y" oldboy.log

I am oldboy teacher!

my blog ishttp://oldboy.blog.51cto.com

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep "oldb.y" oldboy.log -o

oldboy

oldboy

oldbey

[[email protected] test]# grep ".$" oldboy.log

I teach linux.

my qq num is 49000448.

not 4900000448.

基础正则(BRE)第二波字符说明

. 代表且只能代表任意一个字符

转译符号,特殊字符还原本意

\n 表示换行

\n 在 echo sed awk 中使用

* 匹配前面的一个字符0次或多次(任意多次)

.* 匹配任意一个字符任意多次(任意字符串)

^.* 匹配开头任意字符串

[[email protected] test]# grep "^.*o" oldboy.log

I am oldboy teacher!

I like badminton ball ,billiard ball and chinese chess!

my blog ishttp://oldboy.blog.51cto.com

our site ishttp://www.etiantian.org

not 4900000448.

my god ,i am not oldbey,but OLDBOY!

说明:如果匹配的内容在一行中有多处,grep会从左到右匹配到最后一个,多多益善

提示:点(.)的特殊含义小结:

1、当前目录

2、使得文件生效相当于source

3、隐藏文件的开头

4、任意一个字符

[[email protected] test]# grep -n "." oldboy.log

2:I teach linux.

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

9:not 4900000448.

[[email protected] test]# grep -n "." oldboy.log

1:I am oldboy teacher!

2:I teach linux.

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

9:not 4900000448.

10:my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -n ".*" oldboy.log

1:I am oldboy teacher!

2:I teach linux.

3:

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

8:

9:not 4900000448.

10:my god ,i am not oldbey,but OLDBOY!

基础正则(BRE)第三波字符说明

[abc] 匹配a、b、c任意一个字符

[^abc] 匹配非a、非b、非c 任意一个字符

a{n,m} 匹配a这个字符 n到m次

a{n,} 匹配a这个字符至少n次

a{n} 匹配a这个字符n次

a{,m} 匹配a这个字符最多m次

注意: 上面的 是转义 ,但是在 egrep (grep -E) 或 sed -r 或 awk {} 不需要转义

[[email protected] test]# grep -n "[abc]" oldboy.log

1:I am oldboy teacher!

2:I teach linux.

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

10:my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -n "[^abc]" oldboy.log

1:I am oldboy teacher!

2:I teach linux.

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

9:not 4900000448.

10:my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -n "[a-Z0-9]" oldboy.log

1:I am oldboy teacher!

2:I teach linux.

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

9:not 4900000448.

10:my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -n "^[abc]" oldboy.log

[[email protected] test]# grep -n "[^abc]" oldboy.log

1:I am oldboy teacher!

2:I teach linux.

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

9:not 4900000448.

10:my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -n "^[^abc]" oldboy.log

1:I am oldboy teacher!

2:I teach linux.

4:I like badminton ball ,billiard ball and chinese chess!

5:my blog ishttp://oldboy.blog.51cto.com

6:our site ishttp://www.etiantian.org

7:my qq num is 49000448.

9:not 4900000448.

10:my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep "[^a-z]" oldboy.log

I am oldboy teacher!

I teach linux.

I like badminton ball ,billiard ball and chinese chess!

my blog ishttp://oldboy.blog.51cto.com

our site ishttp://www.etiantian.org

my qq num is 49000448.

not 4900000448.

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep "[^0-9]" oldboy.log

I am oldboy teacher!

I teach linux.

I like badminton ball ,billiard ball and chinese chess!

my blog ishttp://oldboy.blog.51cto.com

our site ishttp://www.etiantian.org

my qq num is 49000448.

not 4900000448.

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep "[^0-9]" oldboy.log

I am oldboy teacher!

I teach linux.

I like badminton ball ,billiard ball and chinese chess!

my blog ishttp://oldboy.blog.51cto.com

our site ishttp://www.etiantian.org

my qq num is 49000448.

not 4900000448.

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep -v "[0-9]" oldboy.log

I am oldboy teacher!

I teach linux.

I like badminton ball ,billiard ball and chinese chess!

our site ishttp://www.etiantian.org

my god ,i am not oldbey,but OLDBOY!

[[email protected] test]# grep "[0-9]" oldboy.log

my blog ishttp://oldboy.blog.51cto.com

my qq num is 49000448.

not 4900000448.

[[email protected] test]# grep "0{2,3}" oldboy.log

my qq num is 49000448.

not 4900000448.

[[email protected] test]# grep -o "0{2,3}" oldboy.log

000

000

00

[[email protected] test]# grep "0{1,5}" oldboy.log

my qq num is 49000448.

not 4900000448.

[[email protected] test]# grep -o "0{1,5}" oldboy.log

000

00000

[[email protected] test]# egrep "0{1,5}" oldboy.log

my qq num is 49000448.

not 4900000448.

[[email protected] test]# egrep -o "0{1,5}" oldboy.log

000

00000

三、扩展正则

. 匹配前面一个字符一次或多次

a+ 匹配a这个字符一次或多次

? 匹配前面一个字符0次或1次

a? 匹配a这个字符0次或1次

| 表示或者 用于同时过滤多个

( ) 分组 后向引用

egrep "0+" oldboy.txt

egrep -o "0+" oldboy.txt

egrep -o "[a-z]+" oldboy.txt

egrep "[a-z]+" oldboy.txt

egrep -o "[a-z]+" oldboy.txt

egrep -o "[a-zA-Z]+" oldboy.txt

egrep -o "[a-Z]+" oldboy.txt

[[email protected] test]# egrep "go?d" oldboy.log

my god ,i am not oldbey,but OLDBOY!

gd

god

[[email protected] test]# egrep "go+d" oldboy.log

my god ,i am not oldbey,but OLDBOY!

good

god

goood

gooood

[[email protected] test]# egrep "go*d" oldboy.log

my god ,i am not oldbey,but OLDBOY!

gd

good

god

goood

gooood

[[email protected] test]# egrep "go{0,3}d" oldboy.log

my god ,i am not oldbey,but OLDBOY!

gd

good

god

goood

[[email protected] test]# dumpe2fs /dev/sda1 |grep -i "inode size"

dumpe2fs 1.41.12 (17-May-2010)

Inode size: 128

[[email protected] test]# dumpe2fs /dev/sda1 |grep -i "block size"

dumpe2fs 1.41.12 (17-May-2010)

Block size: 1024

[[email protected] test]# dumpe2fs /dev/sda1 |grep -i "inode count"

dumpe2fs 1.41.12 (17-May-2010)

Inode count: 51200

[[email protected] test]# dumpe2fs /dev/sda1 |grep -i "block count"

dumpe2fs 1.41.12 (17-May-2010)

Block count: 204800

Reserved block count: 10240

[[email protected] ~]# echo "##**##@@##@##*#[email protected]@@@@@@2**@@@**##**" |egrep "[#@*]+"

##**##@@##@##*#[email protected]@@@@@@2**@@@**##**

[[email protected] ~]# echo "##**##@@##@##*#[email protected]@@@@@@2**@@@**##**" |egrep "[#@*]+" -o

##**##@@##@##*#

@@@@@@@

**@@@**##**

[[email protected] test]# egrep "oldb(o|e)y" oldboy.log

I am oldboy teacher!

my blog ishttp://oldboy.blog.51cto.com

my god ,i am not oldbey,but OLDBOY!

oldboy1

[[email protected] test]# dumpe2fs /dev/sda1 |egrep -i "(inode|block) size"

dumpe2fs 1.41.12 (17-May-2010)

Block size: 1024

Inode size: 128

[[email protected] test]# dumpe2fs /dev/sda1 |egrep -i "(inode|block) count"

dumpe2fs 1.41.12 (17-May-2010)

Inode count: 51200

Block count: 204800

Reserved block count: 10240

[[email protected] test]# dumpe2fs /dev/sda1 |egrep -i "(inode|block) (size|count)"

dumpe2fs 1.41.12 (17-May-2010)

Inode count: 51200

Block count: 204800

Reserved block count: 10240

Block size: 1024

Inode size: 128

练习题:

1、什么是正则及正则的作用?

2、基础正则的相关符号

3、扩展正则的相关符号

以上是关于linux基础学习-10.3-正则表达式详解的主要内容,如果未能解决你的问题,请参考以下文章

(基础篇) 正则表达式详解

Linux 正则表达式一些题目的详解

Linux 正则表达式详解

Linux学习8之Shell编程--基础正则表达式

Linux学习笔记:Shell基础正则表达式

shell脚本——正则表达式Sed与Awk文本处理工具详解