Common modules (subprocess / hashlib / configparser / logging / re)

Posted 2021-01-15 wanlei


1. subprocess (runs system commands)

import subprocess

# Windows example: list D:\xxx and keep only the lines that contain "py"
cmd = r'dir D:\xxx | findstr "py"'
res = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Read data from the pipes; a pipe is the channel two processes use to communicate
# print(type(res.stdout.read().decode("GBK")))
print(res.stdout.read().decode("GBK"))
print(res.stderr.read().decode("GBK"))

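The child process writes its output in the system's default encoding, and the pipes return it as bytes. On a Chinese-locale Windows system that encoding is GBK, so the result has to be decoded with GBK.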

Conclusion

subprocess is mainly used to execute system commands (that is, to start child processes). Unlike os.system,
subprocess can also exchange data with the child process.
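The data exchange described above can be sketched in a portable way with subprocess.run. This is only an illustration, not part of the original post: the helper name run_child is hypothetical, and it starts a second Python interpreter instead of the Windows dir command so that it behaves the same on any platform (Python 3.7+ is assumed for capture_output).

```python
import subprocess
import sys


def run_child(code):
    """Run a Python snippet in a child process (hypothetical helper).

    Returns (return code, stdout text, stderr text).
    """
    completed = subprocess.run(
        [sys.executable, "-c", code],  # argument list, so no shell is involved
        capture_output=True,           # collect stdout and stderr through pipes
        text=True,                     # decode bytes with the locale encoding
    )
    return completed.returncode, completed.stdout.strip(), completed.stderr.strip()


rc, out, err = run_child("print('hello from child')")
print(rc, out)
```

Passing an argument list instead of shell=True avoids shell quoting problems; text=True does the decoding step that the original snippet performs by hand with decode("GBK").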

2. hashlib (hashing, often loosely called encryption)

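A hash is an algorithm that computes a fixed-length fingerprint (digest) from input data of any length.
Properties: different inputs can produce the same result, but the probability is extremely small; identical inputs always produce identical results; and because of how a digest is computed, the original input cannot in principle be recovered from it.

It is used to verify whether two inputs are identical.

Use cases:

1. Password verification

2. Checking whether data has been tampered with, for example whether a game installer has been modified

To make lookup-table attacks on passwords less likely to succeed, first require more complex passwords, and second add a salt to the password (mix some extra content into it).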
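The tamper check in use case 2 might look like the following sketch. The function name file_sha256 and the throwaway file are assumptions made for the demo; SHA-256 is used here instead of the MD5 that appears in the snippets below.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks (hypothetical helper)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Demo on a temporary file standing in for an installer
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"installer bytes")
original = file_sha256(tmp.name)

with open(tmp.name, "ab") as f:
    f.write(b"!")  # simulate tampering by appending one byte
tampered = file_sha256(tmp.name)
os.remove(tmp.name)

print(original == tampered)  # False
```

Reading in chunks keeps memory use flat even for large installers.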

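import hashlib

m = hashlib.md5("aaa".encode("utf-8"))
print(len(m.hexdigest()))  # 32

# How lookup-table cracking works: someone has stored common plaintext/digest
# pairs in a database ahead of time; with luck, a digest can simply be looked up
pwds = {"aaa": "47bce5c74f589f4867dbd57e9ca9f808"}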


h1 = hashlib.sha512("123".encode("utf-8"))
h2 = hashlib.sha3_512("123".encode("utf-8"))

# print(len(h.hexdigest()))
print(h1.hexdigest())
print(h2.hexdigest())

# 2b70683ef3fa64572aa50775acc84855

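# Salting
m = hashlib.md5("321".encode("utf-8"))
# Add the salt
m.update("abcdefplkjoujhh".encode("utf-8"))

print(m.hexdigest())

import hmac
# Works much the same way, except that a key (the salt) must be supplied at creation time.
# Python 3.8+ also requires digestmod to be passed explicitly.
h = hmac.new("abcdefjjjj".encode("utf-8"), digestmod="md5")

h.update("123".encode("utf-8"))

print(h.hexdigest())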
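For the password use case, a sketch with a random per-user salt could look like this. It swaps in hashlib.pbkdf2_hmac (a deliberately slow, iterated hash) for the single MD5 call shown above; the function names and the iteration count are assumptions for illustration, not something the original post prescribes.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest) for a password (hypothetical helper).

    A fresh random salt is generated when none is supplied.
    """
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("123")
print(verify_password("123", salt, stored))  # True
print(verify_password("124", salt, stored))  # False
```

Because every user gets a different salt, a precomputed plaintext/digest table like pwds above no longer helps an attacker.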

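3. configparser (a module for parsing configuration files)

What is a configuration file?
A file that holds a program's configuration settings.
What kind of data belongs in configuration?
Values that need to change, but not often, such as the path to a data file (DB_PATH).

A configuration file contains only two kinds of content:
sections, and
options, each written in key=value form.

The most frequently used function is get, which reads one option from the configuration file.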
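To make the section/option layout concrete, here is a small sketch that parses configuration text from a string instead of a file. The section names match the ones used in the snippet that follows, but the values are made up for the demo.

```python
import configparser

# Hypothetical contents of a file such as test.cfg
SAMPLE = """
[path]
DB_PATH = /tmp/db

[user]
name = wwl
age = 20
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

print(config.sections())                 # ['path', 'user']
print(config.get("path", "DB_PATH"))     # /tmp/db
print(config.getint("user", "age") + 1)  # 21
# fallback avoids an exception when the option is missing
print(config.get("user", "email", fallback="n/a"))  # n/a
```

Option names are case-insensitive by default, which is why DB_PATH can be read back with either spelling.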

import configparser
# Create a parser
config = configparser.ConfigParser()
# Read and parse test.cfg
config.read("test.cfg", encoding="utf-8")
# Fetch the information needed
# List all sections
# print(config.sections())
# List all options in a section
# print(config.options("user"))
# Get the value of one option
# print(config.get("path", "DB_PATH"))
# print(type(config.get("user", "age")))
#
# # get always returns a string; to convert the type, use getboolean / getint / getfloat
# print(type(config.getint("user", "age")))
# print(type(config.get("user", "age")))

# Does a given option exist?
# config.has_option("user", "age")
# Does a given section exist?
# config.has_section("user")

# Less commonly used
# Add
# config.add_section("server")
# config.set("server", "url", "192.168.1.2")
# Remove
# config.remove_option("user", "age")
# Modify
# config.set("server", "url", "192.168.1.2")

# Write back to the file
# with open("test.cfg", "wt", encoding="utf-8") as f:
#     config.write(f)

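Exercise:
Build a login. First check whether the configuration file already contains a username and password. If it does, log in directly; if not, ask for a username and password.
After a successful login, ask whether to save the password; if yes, write it to the configuration file.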

import configparser

config = configparser.ConfigParser()
config.read('login.ini', encoding='utf-8')
username1 = 'wwl'
password1 = '123'
# Credentials were saved earlier: log in directly
if config.has_option('login', 'username') and config.has_option('login', 'password'):
    print('welcome logging')
    exit()
else:
    username = input('>>> Enter username: ').strip()
    password = input('>>> Enter password: ').strip()
    if username == username1 and password == password1:
        print('welcome logging')
        print('Enter 1 to save the password, 2 to quit')
        choice = input('Enter: ')
        if choice == '1':
            # Option names are stored in lowercase, so "Username" becomes "username"
            config.add_section("login")
            config.set("login", "Username", username)
            config.set("login", "Password", password)
            print(config.get('login', 'Username'))
            with open('login.ini', 'wt', encoding='utf-8') as f:
                config.write(f)
        elif choice == '2':
            exit()
    else:
        print('wrong username or password')

login.ini  # the newly generated configuration file

[login]
username = wwl
password = 123

4. logging

4.1 Log levels:

CRITICAL = 50 #FATAL = CRITICAL
ERROR = 40
WARNING = 30 #WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0 # no level set
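The numeric levels above can be checked directly against the module's constants. The logger name level_demo below is arbitrary and only used for this sketch.

```python
import logging

print(logging.getLevelName(30))           # WARNING
print(logging.WARN == logging.WARNING)    # True
print(logging.FATAL == logging.CRITICAL)  # True

# A logger only handles records at or above its own level
demo = logging.getLogger("level_demo")
demo.setLevel(logging.INFO)
print(demo.isEnabledFor(logging.DEBUG))   # False, since 10 < 20
print(demo.isEnabledFor(logging.ERROR))   # True, since 40 >= 20
```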

4.2 The default level is WARNING, and by default output goes to the terminal:

import logging

logging.debug('debug')
logging.info('info')
logging.warning('warn')
logging.error('error')
logging.critical('critical')

'''
Output:
WARNING:root:warn
ERROR:root:error
CRITICAL:root:critical
'''

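4.3 Global configuration for the logging module (applies to all loggers; can send output to a file)

Arguments passed to logging.basicConfig() change the logging module's default behavior. The available parameters include:
filename: creates a FileHandler with the given file name (handlers are explained below), so that log records are stored in that file.
filemode: the mode used to open the file when filename is given; the default is "a", and "w" is also allowed.
format: the log format string used by the handler.
datefmt: the date/time format.
level: the log level of the root logger (explained below).
stream: creates a StreamHandler on the given stream, such as sys.stderr, sys.stdout or an open file; the default is sys.stderr. filename and stream cannot be combined: Python 3.3 and later raise ValueError when both are given (older versions silently ignored stream).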

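# Format attributes
%(name)s: name of the Logger (not a user name)
%(levelno)s: log level as a number
%(levelname)s: log level as text
%(pathname)s: full path of the module that issued the logging call; may be unavailable
%(filename)s: file name of the module that issued the logging call
%(module)s: name of the module that issued the logging call
%(funcName)s: name of the function that issued the logging call
%(lineno)d: line number of the statement that issued the logging call
%(created)f: time the record was created, as a Unix timestamp (float)
%(relativeCreated)d: time the record was created, in milliseconds since the logging module was loaded
%(asctime)s: time the record was created, as a string. The default format is "2003-07-08 16:49:45,896"; the part after the comma is milliseconds
%(thread)d: thread ID; may be unavailable
%(threadName)s: thread name; may be unavailable
%(process)d: process ID; may be unavailable
%(message)s: the message supplied by the user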

logging.basicConfig()
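A filled-in call might look like the sketch below, which combines several of the parameters and format attributes listed above. The log file location is a temporary directory chosen for the demo, and force=True (Python 3.8+) is an addition so that the call takes effect even if the root logger was already configured.

```python
import logging
import os
import tempfile

# Hypothetical log location used only for this demo
log_path = os.path.join(tempfile.mkdtemp(), "demo.log")

logging.basicConfig(
    filename=log_path,       # write to a file instead of the terminal
    filemode="w",            # overwrite rather than append
    format="%(asctime)s - %(name)s - %(levelname)s - %(module)s: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=logging.DEBUG,     # the root logger now lets DEBUG through
    force=True,              # replace any handlers already on the root logger
)

logging.debug("debug goes to the file now")

with open(log_path) as f:
    content = f.read()
print(content)
```

Without force=True, basicConfig does nothing when the root logger already has handlers, which is a common reason for a configuration appearing to be ignored.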

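4.4 The logging module's Formatter, Handler, Logger and Filter objects

# Logger: the object that produces log records

# Filter: the object that filters log records

# Handler: receives log records and controls where they are written; FileHandler writes to a file, StreamHandler writes to the terminal

# Formatter: defines a log format; different Formatter objects can be bound to different Handlers to control each Handler's output format

'''
critical=50
error =40
warning =30
info = 20
debug =10
'''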


import logging

# 1. Logger: produces log records, passes them through Filters, then hands them to the Handlers for output
logger = logging.getLogger(__file__)

# 2. Filter: rarely used, skipped here

# 3. Handler: receives records from the logger and controls the output
h1 = logging.FileHandler('t1.log')  # write to a file
h2 = logging.FileHandler('t2.log')  # write to a file
h3 = logging.StreamHandler()  # write to the terminal

# 4. Formatter: the log format
formatter1 = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s -%(module)s:  %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',)

formatter2 = logging.Formatter('%(asctime)s :  %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',)

formatter3 = logging.Formatter('%(name)s %(message)s',)


# 5. Bind a format to each Handler
h1.setFormatter(formatter1)
h2.setFormatter(formatter2)
h3.setFormatter(formatter3)

# 6. Add the Handlers to the logger and set the log level
logger.addHandler(h1)
logger.addHandler(h2)
logger.addHandler(h3)
logger.setLevel(10)

# 7. Test
logger.debug('debug')
logger.info('info')
logger.warning('warning')
logger.error('error')
logger.critical('critical')
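The Filter step is skipped above. As a rough idea of what one looks like, here is a sketch of a custom filter attached to a handler. The class name ContainsFilter, the keyword, and the logger name are all hypothetical; output goes to an in-memory stream so the result can be inspected.

```python
import io
import logging


class ContainsFilter(logging.Filter):
    """Let a record through only if its message contains `keyword` (hypothetical helper)."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword

    def filter(self, record):
        # A truthy return value keeps the record; a falsy one drops it
        return self.keyword in record.getMessage()


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
handler.addFilter(ContainsFilter("order"))

filter_logger = logging.getLogger("filter_demo")
filter_logger.setLevel(logging.DEBUG)
filter_logger.propagate = False  # keep the demo output out of the root logger
filter_logger.addHandler(handler)

filter_logger.info("order created")
filter_logger.info("heartbeat")

print(stream.getvalue())  # INFO order created
```

A filter added to a handler affects only that handler; adding it to the logger instead would drop the record before any handler sees it.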

4.5 Application

"""
logging configuration
"""

import os
import logging.config

# Three log output formats: start

standard_format = ('[%(asctime)s][%(threadName)s:%(thread)d][task_id:%(name)s][%(filename)s:%(lineno)d]'
                   '[%(levelname)s][%(message)s]')  # name is the name passed to getLogger

simple_format = '[%(levelname)s][%(asctime)s][%(filename)s:%(lineno)d]%(message)s'

id_simple_format = '[%(levelname)s][%(asctime)s] %(message)s'

# Log output formats: end

logfile_dir = os.path.dirname(os.path.abspath(__file__))  # directory of the log file

logfile_name = 'all2.log'  # log file name

# Create the log directory if it does not exist
if not os.path.isdir(logfile_dir):
    os.mkdir(logfile_dir)

# Full path of the log file
logfile_path = os.path.join(logfile_dir, logfile_name)

# Logging configuration dictionary
LOGGING_DIC = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'standard': {
            'format': standard_format
        },
        'simple': {
            'format': simple_format
        },
    },
    'filters': {},
    'handlers': {
        # Log records printed to the terminal
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',  # print to the screen
            'formatter': 'simple'
        },
        # Log records written to a file; collects DEBUG and above
        'default': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',  # save to a file
            'formatter': 'standard',
            'filename': logfile_path,  # log file
            'maxBytes': 1024*1024*5,  # maximum log size: 5 MB
            'backupCount': 5,
            'encoding': 'utf-8',  # log file encoding, so non-ASCII text is not garbled
        },
    },
    'loggers': {
        # Configuration picked up by loggers obtained with logging.getLogger(__name__)
        '': {
            'handlers': ['default', 'console'],  # use both handlers defined above: write to the file and print to the screen
            'level': 'DEBUG',
            'propagate': True,  # pass records up to ancestor loggers
        },
    },
}


def load_my_logging_cfg():
    logging.config.dictConfig(LOGGING_DIC)  # load the logging configuration defined above
    logger = logging.getLogger(__name__)  # create a logger instance
    logger.info('It works!')  # record that this file ran


if __name__ == '__main__':
    load_my_logging_cfg()

logging configuration file

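5. re (regular expressions)

What is a regular expression?

An expression built from symbols that carry special meanings.

Its purpose is to process strings: matching, searching and replacing.
1. Heavily used in web crawlers, although frameworks often wrap the complicated patterns for you
2. Heavily used in the sign-up flows of websites and mobile apps, for example to check whether an email address is valid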

import re

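# ========= Matching single characters =========
print(re.findall("\n", "1\n"))  # match a newline
print(re.findall("\t", "1asasas121   \t"))  # match a tab

# ========== Character classes ===========
print(re.findall(r"\w", "1aA_*"))  # digits, letters and underscore
print(re.findall(r"\W", "1aA_*,"))  # anything except digits, letters and underscore
print(re.findall(r"\s", "   \n\r\tf"))  # any whitespace character
print(re.findall(r"\S", "   \n\r\tf"))  # any non-whitespace character
print(re.findall(r"\d", "123abc1*"))  # any digit
print(re.findall(r"\D", "123abc1*"))  # any non-digit
# print(re.findall("[abc]", "AaBbCc"))  # any one of a, b, c
# print(re.findall("[^abc]", "AaBbCc"))  # anything except a, b, c
# print(re.findall("[0-9]", "AaBbCc12349"))  # any digit from 0 to 9
print(re.findall("[a-z]", "AaBbCc12349"))  # lowercase letters a-z
print(re.findall("[A-z]", "AaBbC:c??2349[]"))  # A-z follows ASCII order, so it also matches [ \ ] ^ _ `

# =========== Matching positions ======
print(re.findall(r"\A\d", "123abc1*"))  # match at the start of the string
print(re.findall(r"\d\Z", "123abc1*9\n"))  # match at the very end of the string; note that \Z goes on the right of the expression
print(re.findall(r"\d$", "123abc1*9"))  # match at the end of the string; a trailing newline, if present, does not take part in the match
print(re.findall(r"^\d", "s1asasas121   \t"))  # match a digit at the start of the string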

import re


# [] matches a range; join the two ends with -
# re.findall("[a-zA-Z0-9]", "a ab abc abcd a123c")
# To match a literal -, put it first or last inside the brackets
# print(re.findall("[-ab]", "a ab abc abcd a123c a--"))

# Repetition: how many times the expression may match
# * means any number of times, so zero times also counts
print(re.findall("[a-zA-Z]*", "a ab abc abcdssdsjad a123c"))
# + means one or more times
print(re.findall("[a-zA-Z]+", "a ab abc abcdssdsjad a123c"))
# ? means zero or one time
print(re.findall("[a-zA-Z]?", "a ab abc abcdssdsjad a123c"))

# {1,2} sets a custom repeat count; {1,} is 1 to unlimited, {,1} is 0 to 1
print(re.findall("[a-zA-Z]{1,2}", "a ab abc abcdsdssjad a123c"))


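In practice, non-greedy matching is the one needed more often:

# + and * are greedy: while the expression still matches, they take as much as possible (they keep going until the match fails)

# print(re.findall(r"\w*", "jjsahdjshdjssadsa dssddsads"))
# print(re.findall(r"\w+", "jjsahdjshdjssadsa dssddsads"))
# Non-greedy matching: add ? after the quantifier
# print(re.findall(r"\w+?", "jjsahdjshdjssadsa dssddsads"))  # non-greedy: each match is a single character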
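The difference is easiest to see on text with delimiters. The HTML-like string below is made up for illustration.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last > in the string
greedy = re.findall(r"<.+>", html)
# Non-greedy: .+? stops at the first > it can
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```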

Grouping

# Adding a group does not change what the pattern matches; it only pulls out the part inside the parentheses
print(re.findall("([a-zA-Z]+)_dsb", "aigen_dsb cxx_dsb alex_dsb zxx_xsb _dsb"))
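A short comparison may help show what the parentheses change in the result of findall. It reuses the sample text from the line above; the named group at the end is an extra not used in the original.

```python
import re

text = "aigen_dsb cxx_dsb alex_dsb zxx_xsb _dsb"

no_group = re.findall(r"[a-zA-Z]+_dsb", text)       # whole matches
one_group = re.findall(r"([a-zA-Z]+)_dsb", text)    # only the group's content
two_groups = re.findall(r"([a-zA-Z]+)_(dsb)", text)  # one tuple per match

print(no_group)    # ['aigen_dsb', 'cxx_dsb', 'alex_dsb']
print(one_group)   # ['aigen', 'cxx', 'alex']
print(two_groups)  # [('aigen', 'dsb'), ('cxx', 'dsb'), ('alex', 'dsb')]

# A named group can be read back by name instead of by index
m = re.search(r"(?P<name>[a-zA-Z]+)_dsb", text)
print(m.group("name"))  # aigen
```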

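Commonly used functions in the module

# Commonly used functions in the re module
# match: matches from the start of the string and returns only the first match
print(re.match(r"\w*", "abc").group(0))  # the matched text
# group returns the content of a given group; the default, group 0, is the whole match
print(re.match("([a-zA-Z]+)(_dsb)", "aigen_dsb cxx_dsb alex_dsb zxx_xsb _dsb").group(2))
print(re.match(r"\w*", "abc").span())  # the (start, end) indexes of the match

print(re.search(r"\w*", "abc").group())
# search: scans the whole text and returns the first match
print(re.search("([a-zA-Z]+)(_dsb)", "xxx aigen_dsb cxx_dsb alex_dsb zxx_xsb _dsb"))
# match only tries at the start of the string, so here it returns None and .group() would raise AttributeError
# print(re.match("([a-zA-Z]+)(_dsb)", "xxx aigen_dsb cxx_dsb alex_dsb zxx_xsb _dsb").group())
# compile turns a regular expression into an object, so the pattern does not have to be written again
# print(re.compile(r"\w*").findall("abcd"))
# print(re.split(r"\|_*\|", "python|____|js|____|java"))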

# Substitution
print(re.sub("python", "PYTHON", "js|python|java"))
# Swapping positions with a regular expression
text = "java|C++|js|C|python"
# text1 = "java|C++|js|C|python"
# Split the whole string into three parts: java     |C++xxxxxx|      python
partten = r"(.+?)(\|.+\|)(.+)"
# ?: turns off capturing for a group, as if the parentheses were not there
# partten = r"(?:.+?)(\|.+\|)(.+)"
# print(re.search(partten, text).group(0))
print(re.sub(partten, r"\3\2\1", text))  # python|C++|js|C|java
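Backreferences in the replacement string are what make this kind of reordering possible. A smaller sketch on a date string (the sample values are made up) shows numbered references, named references, and a function used as the replacement.

```python
import re

date = "2021-01-15"

# \1, \2, \3 refer to the first, second and third group
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)
print(swapped)  # 15/01/2021

# Named groups are referenced with \g<name>
named = re.sub(r"(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})", r"\g<d>/\g<m>/\g<y>", date)
print(named)  # 15/01/2021

# The replacement can also be a function that receives each match object
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "a1 b22")
print(doubled)  # a2 b44
```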


# When the content to match contains a backslash
text = "a\\p"  # the actual text is a\p

print(text)
print(re.findall(r"a\\p", text))
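When the text to find comes from a variable, escaping it by hand is error-prone. One option, sketched here with a made-up Windows-style path, is to let re.escape build the pattern.

```python
import re

text = "a\\p and C:\\temp"  # the actual text is: a\p and C:\temp

# Hand-written pattern: a raw string with a doubled backslash
by_hand = re.findall(r"a\\p", text)
print(by_hand)  # ['a\\p']

# re.escape escapes every character that is special in a pattern
pattern = re.escape("C:\\temp")
escaped = re.findall(pattern, text)
print(escaped)  # ['C:\\temp']
```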

Exercises:

# QQ password: length 6 to 16; digits, letters and special characters; must not contain ^
# If it contains ^, nothing should match
# Any character other than ^ can match
"[^^]{6,16}"
import re
# print(re.search("[^^]{6,16}", "1234567^as^"))
# print(re.search("[^[^]+.{6,16}", "1234567as"))

# print(re.match("[^@]{6,16}", "[email protected]"))
#
# print(re.match("[a-z]{6,16}", "abasadsasasa^"))
# Length must be 6 to 8, and ^ is not allowed
print(re.match("^[^^]{6,8}$", "1111111^56781111"))

# print(re.match("[0-9]{6,7}", "1234567"))
# print(re.match("^\"[^@]{6,16}\"$", '"1234567io1u"'))

# ^ and $ together match the string as a whole, instead of matching piece by piece as before
print(re.match("^[^^]{3,6}$", "1234567"))

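# Mobile number check: 11 characters, starts with 1, all digits
print(re.match(r"^1(89|80|32)\d{8}$", "18921999093"))

# Email address check: letters, digits or underscores (at least 6) @ letters, digits or underscores (at least one) . (one of cn, com, org, edu), such as [email protected]
partten = r"^\w{6,}@\w+\.(cn|com|org|edu)$"
# Variation: accept only qq, sina and 163

print(re.match(partten, "18921999as [email protected]"))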

# National ID number: either 18 or 15 characters, all digits, except that the last of the 18 may be X
# partten = r"^\d{17}(X|\d)$"
partten2 = r"(^\d{15}$)|(^\d{17}(X|\d)$)"
print(re.match(partten2, "123321200010100"))
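The checks above could be wrapped into small validator functions. The sketch below uses re.fullmatch instead of ^...$, which is a substitution on my part: $ also matches just before a trailing newline, while fullmatch requires the entire string to match. The function names and sample inputs are hypothetical.

```python
import re

# Compiled once and reused; same rules as the exercise patterns above
PHONE = re.compile(r"1(89|80|32)\d{8}")
ID_CARD = re.compile(r"\d{15}|\d{17}[\dX]")


def is_phone(s):
    """True if s is an 11-digit number starting with 189, 180 or 132."""
    return PHONE.fullmatch(s) is not None


def is_id_card(s):
    """True if s is 15 digits, or 17 digits followed by a digit or X."""
    return ID_CARD.fullmatch(s) is not None


print(is_phone("18921999093"))           # True
print(is_phone("18921999093\n"))         # False, the trailing newline is rejected
print(is_id_card("123321200010100"))     # True
print(is_id_card("12332120001010012X"))  # True
```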
