Python Regular Expressions Explained

Posted 2020-12-11 laochun


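. matches any single character (except \n)
[] matches any one of the characters listed inside the brackets
\d matches a digit, i.e. 0-9
\D matches a non-digit
\s matches whitespace, e.g. a space or a tab
\S matches non-whitespace
\w matches a word character, i.e. a-z, A-Z, 0-9, _
\W matches a non-word character
Note: for str patterns in Python 3, \d, \s and \w also match the corresponding Unicode characters unless re.ASCII is set.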

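* matches the preceding character zero or more times (it is optional)
+ matches the preceding character one or more times (at least once)
? matches the preceding character zero or one time (present once or absent)
{m} matches the preceding character exactly m times
{m,n} matches the preceding character from m to n times
^ matches the start of the string
$ matches the end of the string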

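| matches either the expression on its left or the one on its right
(ab) treats the characters inside the parentheses as a group
\num refers back to the string matched by group number num
(?P<name>) gives a group the alias name
(?P=name) refers back to the string matched by the group aliased name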

import re
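# re.match(pattern, string_to_process) tries to match from the start of the string
re.match(r"hello","hello world")

# [] lists alternatives, so either a lowercase or an uppercase h matches
re.match(r"[hH]ello","hello world")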

re.match(r"秦时明月1","秦时明月1")

# \d stands in for any single digit
re.match(r"秦时明月\d","秦时明月2")
ret = re.match(r"秦时明月\d","秦时明月2")
ret.group()

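# [123456] matches any one of the listed characters
re.match(r"秦时明月[123456]","秦时明月5").group()
re.match(r"秦时明月[1-6]","秦时明月3").group()
# [1-6a-zA-Z] combines several ranges in one set
re.match(r"秦时明月[1-6a-zA-Z]","秦时明月T").group()
# \w matches a word character, i.e. a-z, A-Z, 0-9, _
re.match(r"秦时明月\w","秦时明月5").group()
# \s matches whitespace, e.g. a space or a tab
re.match(r"秦时明月\s\d","秦时明月 5").group()
# \d{1,2} matches one or two digits; the numbers in {} set the allowed repeat counts
re.match(r"秦时明月\d{1,2}","秦时明月15").group()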

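# {m} fixes the exact number of repetitions
re.match(r"\d{11}","12345678901").group()
re.match(r"021-\d{8}","021-12345678").group()
# ? makes the preceding - optional: present once or absent
re.match(r"021-?\d{8}","02112345678").group()
# \d{3,4} accepts an area code of three or four digits (use a comma; {3-4} is not a quantifier)
re.match(r"\d{3,4}-?\d{8}","0531-12345678").group()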
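
The phone number pattern above can be tried against a few inputs at once. This is only a sketch: the sample numbers and the name PHONE are made up for illustration, and a trailing $ is added (an assumption, not in the snippet above) so that extra trailing characters are rejected.

```python
import re

# Hypothetical pattern name; the trailing $ is an added assumption.
PHONE = re.compile(r"\d{3,4}-?\d{8}$")

# Made-up sample inputs, for illustration only.
samples = ["021-12345678", "0531-12345678", "02112345678", "05-12345678"]

# Map each sample to True/False depending on whether it matches.
results = {s: bool(PHONE.match(s)) for s in samples}
for s, ok in results.items():
    print(s, "->", ok)

# {3-4} is not read as a quantifier, so this pattern does not match the number.
broken = re.match(r"\d{3-4}-?\d{8}", "0531-12345678")
print(broken)
```

The third sample shows backtracking at work: \d{3,4} first takes four digits, then gives one back so that \d{8} still has eight digits left.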

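html_content = """<h1>hahaha
dadsada
asdad</h1>"""
# . does not match a newline, so .* stops at the end of the first line
re.match(r".*",html_content).group()
# with re.S, . matches newlines too, so .* covers the whole string
re.match(r".*",html_content,re.S).group()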

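# Example: checking variable names
import re
def main():
    names = ["age","_age","1age","age1","a_age","age_1_","age!","a#ge123","____"]
    for name in names:
        # Without $, "age!" would pass, because its prefix "age" matches:
        #ret = re.match(r"[a-zA-Z_][a-zA-Z0-9_]*",name)
        # ^ matches the start of the string (re.match already starts there, so it is optional)
        # $ matches the end of the string, so the whole name has to fit the pattern
        ret = re.match(r"^[a-zA-Z_][a-zA-Z0-9_]*$", name)
        if ret:
            print("Variable name: %s is valid... text matched by the regex: %s" % (name,ret.group()))
        else:
            print("Variable name: %s is not valid..." % name)
if __name__ == '__main__':
    main()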
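
As an alternative to anchoring with $, re.fullmatch (available since Python 3.4) requires the whole string to match. The sketch below reuses the name list from the example; IDENTIFIER and is_valid_name are hypothetical names chosen for illustration.

```python
import re

# Same character rules as the example above, without explicit anchors.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def is_valid_name(name):
    """Return True only if the entire string fits the pattern."""
    return IDENTIFIER.fullmatch(name) is not None

names = ["age", "_age", "1age", "age1", "a_age", "age_1_", "age!", "a#ge123", "____"]
valid = [n for n in names if is_valid_name(n)]
print(valid)

# $ also matches just before a trailing newline; fullmatch does not allow that.
dollar_accepts_newline = re.match(r"[a-zA-Z_][a-zA-Z0-9_]*$", "age\n") is not None
print(dollar_accepts_newline, is_valid_name("age\n"))
```

Which of the two to use is a matter of taste; fullmatch is slightly stricter when the input may end with a newline.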

# Example: 163 mailbox
import re

def main():
    email = input("Please enter an email address:")
    # To use characters such as . or ? as ordinary characters in a pattern,
    # put a backslash in front of them to escape them
    #ret = re.match(r"[a-zA-Z0-9]{4,20}@163\.com$",email)
    # other mailboxes as well
    ret = re.match(r"[a-zA-Z0-9_]{4,20}@(163|126)\.com$", email)
    if ret:
        print("%s is valid..."% email)
    else:
        print("%s is not valid..."% email)
if __name__ == '__main__':
    main()

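# Matching HTML
html_str = "<h1>hahaha</h1>"
re.match(r"<\w*>.*</\w*>",html_str).group()
# Requiring the closing tag to equal the opening tag: \1 refers back to group 1
html_str = "<h1>hahhaha</h2>"
# h2 differs from h1, so this returns None (calling .group() on it would raise AttributeError)
re.match(r"<(\w*)>.*</\1>",html_str)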
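
The group table lists (?P<name>) and (?P=name), but the snippets only use numbered groups. Below is a small sketch of the same tag check with named groups; TAG_PAIR, parse_element and the group names tag and body are made up for illustration.

```python
import re

# (?P<tag>...) names the opening tag; (?P=tag) requires the same text again.
TAG_PAIR = re.compile(r"<(?P<tag>\w+)>(?P<body>.*)</(?P=tag)>")

def parse_element(html_str):
    """Return (tag, body) if opening and closing tags agree, else None."""
    m = TAG_PAIR.match(html_str)
    if m is None:
        return None
    return m.group("tag"), m.group("body")

good = parse_element("<h1>hahaha</h1>")
bad = parse_element("<h1>hahaha</h2>")
print(good)
print(bad)
```

Checking for None before calling group() avoids the AttributeError that a failed match would otherwise cause.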

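# search scans the string from left to right and returns the first match, wherever it starts
import re
ret = re.search(r"\d+","阅读次数为999")
ret.group()
# With ^ the match has to start at the beginning of the string; this one starts with text, so the result is None
re.search(r"^\d+","阅读次数为999,点赞数为100")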
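
To make the difference between match and search concrete, here is a minimal sketch. The sample text and the helper first_number are made up for illustration; the second pattern uses two groups so that group(1) and group(2) both exist.

```python
import re

text = "views=999,likes=100"  # made-up sample string

def first_number(s):
    """Return the first run of digits in s, or None if there is none."""
    m = re.search(r"\d+", s)
    return m.group() if m else None

# match only looks at the start of the string, which is not a digit here.
at_start = re.match(r"\d+", text)
print(at_start)
print(first_number(text))
print(first_number("no digits here"))

# Two capture groups separated by non-digits pick up both numbers.
both = re.search(r"(\d+)\D+(\d+)", text)
print(both.group(1), both.group(2))
```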

# findall returns every match as a list
re.findall(r"\d+","python=999,c=100,c#=120")

# sub replaces every match with the given string
re.sub(r"\d+","100","python999")

# split cuts the string at every match (here a colon or a space) and returns a list

import re
ret = re.split(r":| ","info:xiaogu 888 shenzhen")
print(ret)

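import re

# sub also accepts a function: it is called with each match object and returns the replacement text
def add(temp):
    strNum = temp.group()
    num = int(strNum) + 1
    return str(num)

ret = re.sub(r"\d+",add,"python=99")
print(ret)

ret = re.sub(r"\d+",add,"python=998")

print(ret)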
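
When the string holds several numbers, the replacement function is called once per match. A short sketch, with add_one as a hypothetical helper name and a made-up input string; the count argument of re.sub limits how many matches get replaced.

```python
import re

def add_one(match):
    """Take a match object for a run of digits and return that number plus one."""
    return str(int(match.group()) + 1)

source = "python=99,c=100,c#=120"  # made-up sample string

# Every number in the string is incremented.
all_bumped = re.sub(r"\d+", add_one, source)
print(all_bumped)

# count=1 stops after the first replacement.
first_bumped = re.sub(r"\d+", add_one, source, count=1)
print(first_bumped)
```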
