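Python Development [5]: String Formatting and Modules
1) The two ways to format strings: % and format
- Introduction to modules;
- Using modules;
- Common modules and their functions, such as os, sys, hashlib, json and pickle (for serialization), shutil, ConfigParser, logging, time, re, random
I. String Formatting
Python offers two ways to format strings: the percent (%) operator and the format method.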
This PEP proposes a new system for built-in string formatting operations, intended as a replacement for the existing '%' string formatting operator.
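Percent-style placeholders take the form %[(name)][flags][width][.precision]typecode:
- (name) [optional]: selects the value stored under the given key
- flags [optional]; the possible values are:
- + : right-aligned; positive numbers get a plus sign, negative numbers a minus sign;
- - : left-aligned; positive numbers get no sign, negative numbers a minus sign;
- space : right-aligned; positive numbers get a leading space, negative numbers a minus sign;
- 0 : right-aligned; positive numbers get no sign, negative numbers a minus sign; the empty space is filled with 0
- width [optional]: the minimum field width
- .precision [optional]: the number of digits kept after the decimal point
- typecode [required]
- s: takes the return value of the object's __str__ method and formats it into the placeholder
- r: takes the return value of the object's __repr__ method and formats it into the placeholder
- c: for an integer, inserts the character with that Unicode code point, decimal range 0 <= i <= 1114111 (Python 2.7 only supports 0-255); for a single character, inserts it as is
- o: converts an integer to octal and formats it into the placeholder
- x: converts an integer to hexadecimal and formats it into the placeholder
- d: formats an integer in decimal; a float is truncated to its integer part first
- e: converts an integer or float to scientific notation (lowercase e)
- E: converts an integer or float to scientific notation (uppercase E)
- f: converts an integer or float to fixed-point notation (6 digits after the decimal point by default)
- F: same as f
- g: switches automatically between fixed-point and scientific notation (scientific notation is used once the value needs more than 6 significant digits); the exponent is written with e
- G: same as g, but the exponent is written with E
- %: when the string contains format placeholders, a literal percent sign must be written as %%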
tpl = "i am %s" % "name1"
tpl = "i am %s age %d" % ("name1", 18)
tpl = "i am %(name)s age %(age)d" % {"name": "name1", "age": 18}
tpl = "percent %.2f" % 99.97623
tpl = "i am %(pp).2f" % {"pp": 123.425556, }
tpl = "i am %.2f %%" % 123.425556
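The flags, width, and precision described above are easier to see side by side. The following is a small sketch; the variable names are only illustrative and are not part of the original examples.

```python
# Each line pairs one percent-style placeholder with a value.
signed = "%+d" % 18            # '+' flag: force a sign on positive numbers
left = "%-8s|" % "name1"       # '-' flag: left-align within a width of 8
zero = "%05d" % 42             # '0' flag: pad with zeros up to width 5
rounded = "%8.2f" % 3.14159    # width 8, two digits after the decimal point
hexa = "%x" % 255              # lowercase hexadecimal
sci = "%e" % 1234.5            # scientific notation, 6 decimals by default

print(signed, left, zero, rounded, hexa, sci, sep="\n")
```

The trailing "|" in the second template is only there to make the padding visible.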
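The format specification used by the format method takes the form [[fill]align][sign][#][0][width][,][.precision][type]:
- fill [optional]: the character used to pad empty space
- align [optional]: the alignment (used together with width)
- <: left-align the content (the default for strings)
- >: right-align the content (the default for numbers)
- =: right-align the content and place the sign to the left of the fill characters; valid for numeric types only. The result is: sign + fill + digits
- ^: center the content
- sign [optional]: how the sign of a number is shown
- +: plus sign on positive numbers, minus sign on negative numbers;
- -: nothing on positive numbers, minus sign on negative numbers;
- space: leading space on positive numbers, minus sign on negative numbers;
- # [optional]: for binary, octal, and hexadecimal output, adds the 0b/0o/0x prefix; without it the prefix is omitted
- , [optional]: adds a thousands separator to numbers, e.g. 1,000,000
- width [optional]: the width of the formatted field
- .precision [optional]: the number of decimal digits to keep
- type [optional]: the presentation type
- For string arguments
- s: formats string data
- blank: when no type is given the default is None, which behaves like s
- For integer arguments
- b: converts the decimal integer to binary, then formats it
- c: converts the decimal integer to the corresponding Unicode character
- d: decimal integer
- o: converts the decimal integer to octal, then formats it;
- x: converts the decimal integer to hexadecimal, then formats it (lowercase x)
- X: converts the decimal integer to hexadecimal, then formats it (uppercase X)
- For float or decimal arguments
- e: converts to scientific notation (lowercase e), then formats;
- E: converts to scientific notation (uppercase E), then formats;
- f: converts to fixed-point notation (6 digits after the decimal point by default), then formats;
- F: converts to fixed-point notation (6 digits after the decimal point by default), then formats;
- g: switches automatically between e and f
- G: switches automatically between E and F
- %: multiplies by 100 and displays a percentage (6 digits after the decimal point by default)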
tpl = "i am {}, age {}, {}".format("seven", 18, 'name1')
tpl = "i am {}, age {}, {}".format(*["seven", 18, 'name1'])
tpl = "i am {0}, age {1}, really {0}".format("seven", 18)
tpl = "i am {0}, age {1}, really {0}".format(*["seven", 18])
tpl = "i am {name}, age {age}, really {name}".format(name="seven", age=18)
tpl = "i am {name}, age {age}, really {name}".format(**{"name": "seven", "age": 18})
tpl = "i am {0[0]}, age {0[1]}, really {0[2]}".format([1, 2, 3], [11, 22, 33])
tpl = "i am {:s}, age {:d}, money {:f}".format("seven", 18, 88888.1)
tpl = "i am {:s}, age {:d}".format(*["seven", 18])
tpl = "i am {name:s}, age {age:d}".format(name="seven", age=18)
tpl = "i am {name:s}, age {age:d}".format(**{"name": "seven", "age": 18})
tpl = "numbers: {:b},{:o},{:d},{:x},{:X}, {:%}".format(15, 15, 15, 15, 15, 15.87623, 2)
tpl = "numbers: {0:b},{0:o},{0:d},{0:x},{0:X}, {0:%}".format(15)
tpl = "numbers: {num:b},{num:o},{num:d},{num:x},{num:X}, {num:%}".format(num=15)
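The examples above mostly exercise the type field. As a complement, here is a short sketch of fill, align, sign, #, and the thousands separator; the variable names are illustrative assumptions, not taken from the original.

```python
centered = "{:*^11}".format("seven")               # '*' fill, centered in width 11
right = "{:>8}".format("name1")                    # right-aligned in width 8
padded = "{:=+8d}".format(42)                      # sign first, then fill, then digits
prefixed = "{:#x} {:#b} {:#o}".format(255, 5, 8)   # '#' adds the 0x/0b/0o prefixes
grouped = "{:,}".format(1000000)                   # thousands separator
percent = "{:.1%}".format(0.256)                   # multiply by 100, one decimal

for value in (centered, right, padded, prefixed, grouped, percent):
    print(repr(value))
```

Printing with repr keeps the padding spaces visible in the output.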
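- The caller does not need to know the internal structure of an iterator; it only keeps calling next() to get the next item
- Values in the collection cannot be accessed at random; they can only be visited in order from start to end
- There is no way to step back once part of the data has been consumed
- Well suited to looping over large data sets, because it saves memory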
>>> a = iter([1,2,3,4,5])
>>> a
<list_iterator object at 0x101402630>
>>> a.__next__()
1
>>> a.__next__()
2
>>> a.__next__()
3
>>> a.__next__()
4
>>> a.__next__()
5
>>> a.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

def func():
    yield 1
    yield 2
    yield 3
    yield 4

>>> temp = func()
>>> temp.__next__()
1
>>> temp.__next__()
2
>>> temp.__next__()
3
>>> temp.__next__()
4
>>> temp.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

def nrange(num):
    temp = -1
    while True:
        temp = temp + 1
        if temp >= num:
            return
        else:
            yield temp

m = nrange(4)
for i in m:
    print(i)
# Output:
0
1
2
3

m = nrange(4)
print(m.__next__())
print(m.__next__())
print(m.__next__())
print(m.__next__())
# Output:
0
1
2
3

Calling m.__next__() a fifth time raises StopIteration:
Traceback (most recent call last):
  File "E:/liuhailong/student13/s13/day4/", line 45, in <module>
    print(m.__next__())
StopIteration
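To connect the iterator and generator examples, the sketch below writes the same countdown twice: once as a class that implements the iterator protocol by hand, and once as a generator. Countdown and countdown_gen are hypothetical names introduced only for this illustration.

```python
class Countdown:
    """Hand-written iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that the data is exhausted
        value = self.current
        self.current -= 1
        return value


def countdown_gen(start):
    """Generator version: yield does the bookkeeping for us."""
    while start > 0:
        yield start
        start -= 1


it = Countdown(3)
first = next(it)             # built-in next() calls it.__next__()
rest = list(it)              # consumes whatever is left
exhausted = next(it, None)   # a default avoids the StopIteration exception
gen_values = list(countdown_gen(3))
print(first, rest, exhausted, gen_values)
```

Passing a default to next() is often more convenient than catching StopIteration when reading values one at a time.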
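II. Modules
The idea is similar to functional versus procedural programming: a function implements one piece of functionality and other code simply calls it, which improves code reuse and reduces coupling between pieces of code. A complex feature may need many functions (which may live in different .py files); a collection of code made up of n .py files is called a module.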
For example, os is the module for operating-system facilities, and shutil is a module for file operations.
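- Custom modules
- Built-in modules
- Open-source (third-party) modules
Rule 1: a module must be a Python program ending in .py that Python can execute;
Rule 2: modules are placed in the same directory as the script being run; a module can live inside a directory and can consist of several .py files;
Rule 3: modules with different purposes are stored in separate directories, for example: configuration code in a conf directory, feature code in a lib directory, the main program in manage, the core execution code in a core directory, and so on;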
import module
from module.xx.xx import xx
from module.xx.xx import xx as rename
from module.xx.xx import *
- Importing a .py file: the interpreter executes that file
- Importing a package: the interpreter executes the package's __init__.py file
import sys
print(sys.path)
# Output:
['E:\\liuhailong\\student13\\s13\\day4', 'E:\\liuhailong\\student13\\s13', 'E:\\liuhailong\\python3\\', 'E:\\liuhailong\\python3\\DLLs', 'E:\\liuhailong\\python3\\lib', 'E:\\liuhailong\\python3', 'E:\\liuhailong\\python3\\lib\\site-packages']
If the path you need is not in the sys.path list, you can add it with sys.path.append('path').
import sys
import os
pre_path = os.path.abspath('../')
sys.path.append(pre_path)
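The effect of sys.path.append can be checked end to end with a throwaway module. This is only a sketch: the module name demo_mod and its greet function are invented for the example, and the module is written to a temporary directory so nothing in the project is touched.

```python
import importlib
import os
import sys
import tempfile

# Create a temporary directory containing a tiny hypothetical module.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_mod.py"), "w") as f:
    f.write("def greet(name):\n    return 'hello ' + name\n")

# Without this append, "import demo_mod" would fail with ImportError.
if module_dir not in sys.path:
    sys.path.append(module_dir)

importlib.invalidate_caches()          # make sure the new file is noticed
demo_mod = importlib.import_module("demo_mod")
greeting = demo_mod.greet("seven")
print(greeting)
```

Appending puts the directory at the end of the search order, so an existing module with the same name earlier in sys.path would still win.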
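Method 1: yum, pip, apt-get ...
Compile the source: python build
Install the source: python install
yum install gcc
yum install python-devel
or
apt-get install python-dev
After a successful installation, the module is placed automatically in one of the directories listed in sys.path.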
III. The paramiko Module
# pycrypto: paramiko depends on pycrypto internally, so download and install pycrypto first
# Download and install pycrypto
wget
tar -xvf pycrypto-2.6.1.tar.gz
cd pycrypto-2.6.1
python build
python install
# Start the Python interpreter and import Crypto to check that the installation succeeded
# Download and install paramiko
wget
tar -xvf paramiko-1.10.1.tar.gz
cd paramiko-1.10.1
python build
python install
# Start the Python interpreter and import paramiko to check that the installation succeeded
- Run a command: connect to the server with a username and password
#!/usr/bin/env python
#coding:utf-8
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('', 22, 'alex', '123')
stdin, stdout, stderr = ssh.exec_command('df')
print(stdout.read())
ssh.close()
- Run a command: connect to the server with a key
import paramiko

private_key_path = '/home/auto/.ssh/id_rsa'
key = paramiko.RSAKey.from_private_key_file(private_key_path)
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# 'hostname', port and 'username' are placeholders; the key must be passed as pkey
ssh.connect('hostname', port, 'username', pkey=key)
stdin, stdout, stderr = ssh.exec_command('df')
print(stdout.read())
ssh.close()
- Upload or download a file: with a username and password
# Upload
import os,sys
import paramiko

t = paramiko.Transport(('',22))
t.connect(username='wupeiqi',password='123')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.put('/tmp/','/tmp/')
t.close()

# Download
import os,sys
import paramiko

t = paramiko.Transport(('',22))
t.connect(username='wupeiqi',password='123')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.get('/tmp/','/tmp/')
t.close()
- Upload or download a file: with a key
# Upload
import paramiko

private_key_path = '/home/auto/.ssh/id_rsa'
key = paramiko.RSAKey.from_private_key_file(private_key_path)
t = paramiko.Transport(('',22))
t.connect(username='wupeiqi',pkey=key)
sftp = paramiko.SFTPClient.from_transport(t)
sftp.put('/tmp/','/tmp/')
t.close()

# Download
import paramiko

private_key_path = '/home/auto/.ssh/id_rsa'
key = paramiko.RSAKey.from_private_key_file(private_key_path)
t = paramiko.Transport(('',22))
t.connect(username='wupeiqi',pkey=key)
sftp = paramiko.SFTPClient.from_transport(t)
sftp.get('/tmp/','/tmp/')
t.close()
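os.getcwd()                          Return the current working directory, i.e. the directory the Python script is running in
os.chdir("dirname")                  Change the script's working directory; like cd in the shell
os.curdir                            The current-directory string: ('.')
os.pardir                            The parent-directory string: ('..')
os.makedirs('dirname1/dirname2')     Create nested directories recursively
os.removedirs('dirname1')            Remove the directory if it is empty, then move up to the parent and remove it if it is empty too, and so on
os.mkdir('dirname')                  Create a single directory; like mkdir dirname in the shell
os.rmdir('dirname')                  Remove a single empty directory; raises an error if the directory is not empty; like rmdir dirname in the shell
os.listdir('dirname')                Return a list of all files and subdirectories in the directory, including hidden files
os.remove()                          Delete a file
os.rename("oldname","newname")       Rename a file or directory
os.stat('path/filename')             Get information about a file or directory
os.sep                               The OS-specific path separator: "\\" on Windows, "/" on Linux
os.linesep                           The line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep                           The string used to separate entries in a search path
os.name                              A string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")            Run a shell command; its output is displayed directly
os.environ                           The system environment variables
os.path.abspath(path)                Return the normalized absolute version of path
os.path.split(path)                  Split path into a (directory, file name) tuple
os.path.dirname(path)                Return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)               Return the final file name of path, i.e. the second element of os.path.split(path); if path ends with / or \, an empty string is returned
os.path.exists(path)                 Return True if path exists, False otherwise
os.path.isabs(path)                  Return True if path is an absolute path
os.path.isfile(path)                 Return True if path is an existing file, False otherwise
os.path.isdir(path)                  Return True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  Join several paths and return the result; arguments before the last absolute path are ignored
os.path.getatime(path)               Return the last access time of the file or directory path points to
os.path.getmtime(path)               Return the last modification time of the file or directory path points to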
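Several of the functions listed above can be tried safely inside a temporary directory. The sketch below assumes nothing beyond the standard library; the directory and file names reuse the placeholder names from the list.

```python
import os
import tempfile

base = tempfile.mkdtemp()                        # scratch area for the demo
nested = os.path.join(base, "dirname1", "dirname2")
os.makedirs(nested)                              # creates both levels at once

file_path = os.path.join(nested, "a.log")
with open(file_path, "w") as f:
    f.write("demo")

directory, filename = os.path.split(file_path)   # (directory part, file name)
listing = os.listdir(nested)
is_file = os.path.isfile(file_path)
is_dir = os.path.isdir(nested)

# A later absolute component discards everything joined before it.
joined = os.path.join("ignored", os.sep + "tmp", "x")

os.remove(file_path)
os.removedirs(nested)                            # removes empty parents as well
print(directory == nested, filename, listing, is_file, is_dir)
```

Because os.removedirs keeps climbing while the parent directories are empty, it also cleans up the scratch directory created at the start.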
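sys.argv           The list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        Exit the program; use exit(0) for a normal exit
sys.version        Version information of the Python interpreter
sys.maxint         The largest int value (Python 2 only; Python 3 provides sys.maxsize instead)
sys.path           The module search path; initialized using the value of the PYTHONPATH environment variable
sys.platform       The name of the operating-system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]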
Used for hashing (message digest) operations. It replaces the old md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.
import hashlib

# ######## md5 ########
hash = hashlib.md5()
hash.update(b'admin')   # Python 3 requires bytes, not str
print(hash.hexdigest())

# ######## sha1 ########
hash = hashlib.sha1()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha256 ########
hash = hashlib.sha256()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha384 ########
hash = hashlib.sha384()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha512 ########
hash = hashlib.sha512()
hash.update(b'admin')
print(hash.hexdigest())

import hashlib

# ######## md5 ########
# The bytes passed to the constructor are hashed before the later update() data
hash = hashlib.md5(b'898oaFs09f')
hash.update(b'admin')
print(hash.hexdigest())
Still not enough? Python also has an hmac module, which internally processes the key and the content we supply before computing the digest.
import hmac

h = hmac.new(b'wueiqi', digestmod='md5')
h.update(b'hellowo')
print(h.hexdigest())
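As a sketch of how these pieces behave under Python 3, where the input must be bytes: the example below shows that repeated update() calls are equivalent to hashing the concatenated data, and that an HMAC depends on the key. Choosing SHA-256 here is an assumption made for the illustration; the original examples use MD5.

```python
import hashlib
import hmac

# update() can be called repeatedly; the result matches a single call
# on the concatenated bytes.
incremental = hashlib.sha256()
incremental.update(b"ad")
incremental.update(b"min")
one_shot = hashlib.sha256("admin".encode("utf-8")).hexdigest()

# HMAC mixes a secret key into the digest.
key = b"wueiqi"
mac = hmac.new(key, b"hellowo", digestmod=hashlib.sha256).hexdigest()
other = hmac.new(b"another-key", b"hellowo", digestmod=hashlib.sha256).hexdigest()

# compare_digest compares in a way that resists timing attacks.
same = hmac.compare_digest(
    mac, hmac.new(key, b"hellowo", digestmod=hashlib.sha256).hexdigest()
)
print(incremental.hexdigest() == one_shot, same, mac != other)
```

Recent Python versions require the digestmod argument for hmac.new, so it is passed explicitly.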
IV. json and pickle
- json: converts between strings and Python data types
- pickle: converts between a Python-specific byte format and Python data types
import json

li = [123,456]
r = json.dumps(li)
print(r,type(r))
ret = json.loads(r)
print(ret,type(ret))
# Output:
[123, 456] <class 'str'>
[123, 456] <class 'list'>

import json

adict = {"k1":"v1","k2":"v2"}
with open('db.txt','w') as f:
    r = json.dump(adict,f)
with open('db.txt','r') as f:
    r = json.load(f)
print(r,type(r))
# Output:
{'k2': 'v2', 'k1': 'v1'} <class 'dict'>
# db.txt is opened for writing; json.dump converts adict from a dict to a string and writes it to the open file, which produces a db.txt file holding the contents of adict;
# db.txt is then opened read-only; json.load converts the file content from a string back to a Python data type;
# The result is printed;

import pickle

adict = {"k1":"v1","k2":"v2"}
r = pickle.dumps(adict)
print(r,type(r))
ret = pickle.loads(r)
print(ret,type(ret))
# Output:
b'\x80\x03}q\x00(X\x02\x00\x00\x00k2q\x01X\x02\x00\x00\x00v2q\x02X\x02\x00\x00\x00k1q\x03X\x02\x00\x00\x00v1q\x04u.' <class 'bytes'>
{'k2': 'v2', 'k1': 'v1'} <class 'dict'>

import pickle

adict = {"k1":"v1","k2":"v2"}
with open('db.txt','wb') as f:
    r = pickle.dump(adict,f)
with open('db.txt','rb') as f:
    r = pickle.load(f)
print(r,type(r))
# Output:
{'k2': 'v2', 'k1': 'v1'} <class 'dict'>
# db.txt is opened for binary writing; pickle.dump converts adict from a dict to pickle bytes and writes them to the open file, which produces a db.txt file holding the contents of adict;
# File content: ?}q (X k1qX v1qX k2qX v2qu.
# db.txt is then opened for binary reading; pickle.load converts the pickle bytes in the file back to a Python data type;
# The result is printed;
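The practical difference between the two modules shows up with types that JSON has no notation for. The sketch below uses a set as the example; the variable names are illustrative only.

```python
import json
import pickle

record = {"k1": "v1", "ids": [123, 456]}

# json produces text that other languages can read as well.
as_text = json.dumps(record, sort_keys=True)
restored = json.loads(as_text)

# A set has no JSON equivalent, so json.dumps raises TypeError.
tags = {"a", "b"}
try:
    json.dumps(tags)
    json_ok = True
except TypeError:
    json_ok = False

# pickle produces Python-specific bytes and round-trips the set.
tags_back = pickle.loads(pickle.dumps(tags))
print(as_text, json_ok, tags_back == tags)
```

Pickle data should only be loaded from trusted sources, because unpickling can execute arbitrary code.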
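- os.system
- os.spawn*
- os.popen* (deprecated)
- popen2.* (deprecated)
- commands.* (deprecated; removed in 3.x)
import commands

result = commands.getoutput('cmd')
result = commands.getstatus('cmd')
result = commands.getstatusoutput('cmd')
The functionality of all the modules and functions above for running shell commands is available in the subprocess module, which also offers richer features.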
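ret = subprocess.call(["ls", "-l"], shell=False)
ret = subprocess.call("ls -l", shell=True)
shell=True allows the command to be given as a single string
Run a command; if the exit status is 0, return 0, otherwise raise an exception
subprocess.check_call(["ls", "-l"])
subprocess.check_call("exit 1", shell=True)
Run a command; if the exit status is 0, return the command's output, otherwise raise an exception
subprocess.check_output(["echo", "Hello World!"])
subprocess.check_output("exit 1", shell=True)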
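- args: the shell command; either a string or a sequence (such as a list or tuple)
- bufsize: the buffering policy. 0 means unbuffered, 1 means line-buffered, any other positive value is the buffer size, a negative value means the system default
- stdin, stdout, stderr: the program's standard input, standard output, and standard error handles
- preexec_fn: Unix only; a callable object that is invoked in the child process just before it runs
- close_fds: on Windows, if close_fds is set to True, the new child process does not inherit the parent's input, output, and error pipes.
For that reason, in older Python versions (before 3.7) close_fds cannot be set to True while also redirecting the child's standard input, output, and error (stdin, stdout, stderr).
- shell: same as above
- cwd: sets the child process's current directory
- env: the environment variables of the child process. If env = None, the child inherits the parent's environment.
- universal_newlines: line endings differ between systems; True -> use \n uniformly
- startupinfo and creationflags are only effective on Windows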
import subprocess

ret1 = subprocess.Popen(["mkdir","t1"])
ret2 = subprocess.Popen("mkdir t2", shell=True)
- Commands that produce output directly from their input, e.g. ifconfig
- Commands that enter an environment and then depend on further input, e.g. python
1.
import subprocess

obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)

2.
import subprocess

# universal_newlines=True lets us write and read str instead of bytes
obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')
obj.stdin.close()
cmd_out = obj.stdout.read()
obj.stdout.close()
cmd_error = obj.stderr.read()
obj.stderr.close()
print(cmd_out)
print(cmd_error)

3.
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')
out_error_list = obj.communicate()
print(out_error_list)

4.
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
out_error_list = obj.communicate('print("hello")')
print(out_error_list)
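On Python 3.5 and later, subprocess.run covers most of the cases above in a single call. The sketch below is one possible way to use it; it runs the current interpreter via sys.executable instead of ls or echo so that it behaves the same on every platform, which is a choice made for this example rather than something the original does.

```python
import subprocess
import sys

# Run a child Python process and capture its output as text.
completed = subprocess.run(
    [sys.executable, "-c", "print('Hello World!')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,   # decode bytes to str
)
output = completed.stdout.strip()

# check_call raises CalledProcessError when the exit status is not 0.
try:
    subprocess.check_call([sys.executable, "-c", "import sys; sys.exit(1)"])
    failed = False
    exit_code = 0
except subprocess.CalledProcessError as exc:
    failed = True
    exit_code = exc.returncode

print(completed.returncode, output, failed, exit_code)
```

Passing the command as a list avoids shell=True and the quoting problems that come with building a command string.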
A high-level module for working with files, directories, and archives
shutil.copyfileobj(fsrc, fdst[, length])
def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
shutil.copyfile(src, dst)
def copyfile(src, dst): """Copy data from src to dst""" if _samefile(src, dst): raise Error("`%s` and `%s` are the same file" % (src, dst)) for fn in [src, dst]: try: st = os.stat(fn) except OSError: # File most likely does not exist pass else: # XXX What about other special files? (sockets, devices...) if stat.S_ISFIFO(st.st_mode): raise SpecialFileError("`%s` is a named pipe" % fn) with open(src, ‘rb‘) as fsrc: with open(dst, ‘wb‘) as fdst: copyfileobj(fsrc, fdst)
shutil.copymode(src, dst)
def copymode(src, dst): """Copy mode bits from src to dst""" if hasattr(os, ‘chmod‘): st = os.stat(src) mode = stat.S_IMODE(st.st_mode) os.chmod(dst, mode)
shutil.copystat(src, dst)
Copies the status information, including: mode bits, atime, mtime, flags
def copystat(src, dst): """Copy all stat info (mode bits, atime, mtime, flags) from src to dst""" st = os.stat(src) mode = stat.S_IMODE(st.st_mode) if hasattr(os, ‘utime‘): os.utime(dst, (st.st_atime, st.st_mtime)) if hasattr(os, ‘chmod‘): os.chmod(dst, mode) if hasattr(os, ‘chflags‘) and hasattr(st, ‘st_flags‘): try: os.chflags(dst, st.st_flags) except OSError, why: for err in ‘EOPNOTSUPP‘, ‘ENOTSUP‘: if hasattr(errno, err) and why.errno == getattr(errno, err): break else: raise
shutil.copy(src, dst)
def copy(src, dst): """Copy data and mode bits ("cp src dst"). The destination may be a directory. """ if os.path.isdir(dst): dst = os.path.join(dst, os.path.basename(src)) copyfile(src, dst) copymode(src, dst)
shutil.copy2(src, dst)
def copy2(src, dst): """Copy data and all stat info ("cp -p src dst"). The destination may be a directory. """ if os.path.isdir(dst): dst = os.path.join(dst, os.path.basename(src)) copyfile(src, dst) copystat(src, dst)
shutil.copytree(src, dst, symlinks=False, ignore=None)
For example: copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))
def ignore_patterns(*patterns): """Function that can be used as copytree() ignore parameter. Patterns is a sequence of glob-style patterns that are used to exclude files""" def _ignore_patterns(path, names): ignored_names = [] for pattern in patterns: ignored_names.extend(fnmatch.filter(names, pattern)) return set(ignored_names) return _ignore_patterns def copytree(src, dst, symlinks=False, ignore=None): """Recursively copy a directory tree using copy2(). The destination directory must not already exist. If exception(s) occur, an Error is raised with a list of reasons. If the optional symlinks flag is true, symbolic links in the source tree result in symbolic links in the destination tree; if it is false, the contents of the files pointed to by symbolic links are copied. The optional ignore argument is a callable. If given, it is called with the `src` parameter, which is the directory being visited by copytree(), and `names` which is the list of `src` contents, as returned by os.listdir(): callable(src, names) -> ignored_names Since copytree() is called recursively, the callable will be called once for each directory that is copied. It returns a list of names relative to the `src` directory that should not be copied. XXX Consider this example code rather than the ultimate tool. 
""" names = os.listdir(src) if ignore is not None: ignored_names = ignore(src, names) else: ignored_names = set() os.makedirs(dst) errors = [] for name in names: if name in ignored_names: continue srcname = os.path.join(src, name) dstname = os.path.join(dst, name) try: if symlinks and os.path.islink(srcname): linkto = os.readlink(srcname) os.symlink(linkto, dstname) elif os.path.isdir(srcname): copytree(srcname, dstname, symlinks, ignore) else: # Will raise a SpecialFileError for unsupported file types copy2(srcname, dstname) # catch the Error from the recursive copytree so that we can # continue with other files except Error, err: errors.extend(err.args[0]) except EnvironmentError, why: errors.append((srcname, dstname, str(why))) try: copystat(src, dst) except OSError, why: if WindowsError is not None and isinstance(why, WindowsError): # Copying file access times may fail on Windows pass else: errors.append((src, dst, str(why))) if errors: raise Error, errors
shutil.rmtree(path[, ignore_errors[, onerror]])
def rmtree(path, ignore_errors=False, onerror=None): """Recursively delete a directory tree. If ignore_errors is set, errors are ignored; otherwise, if onerror is set, it is called to handle the error with arguments (func, path, exc_info) where func is os.listdir, os.remove, or os.rmdir; path is the argument to that function that caused it to fail; and exc_info is a tuple returned by sys.exc_info(). If ignore_errors is false and onerror is None, an exception is raised. """ if ignore_errors: def onerror(*args): pass elif onerror is None: def onerror(*args): raise try: if os.path.islink(path): # symlinks to directories are forbidden, see bug #1669 raise OSError("Cannot call rmtree on a symbolic link") except OSError: onerror(os.path.islink, path, sys.exc_info()) # can‘t continue even if onerror hook returns return names = [] try: names = os.listdir(path) except os.error, err: onerror(os.listdir, path, sys.exc_info()) for name in names: fullname = os.path.join(path, name) try: mode = os.lstat(fullname).st_mode except os.error: mode = 0 if stat.S_ISDIR(mode): rmtree(fullname, ignore_errors, onerror) else: try: os.remove(fullname) except os.error, err: onerror(os.remove, fullname, sys.exc_info()) try: os.rmdir(path) except os.error: onerror(os.rmdir, path, sys.exc_info())
shutil.move(src, dst)
def move(src, dst): """Recursively move a file or directory to another location. This is similar to the Unix "mv" command. If the destination is a directory or a symlink to a directory, the source is moved inside the directory. The destination path must not already exist. If the destination already exists but is not a directory, it may be overwritten depending on os.rename() semantics. If the destination is on our current filesystem, then rename() is used. Otherwise, src is copied to the destination and then removed. A lot more could be done here... A look at a mv.c shows a lot of the issues this implementation glosses over. """ real_dst = dst if os.path.isdir(dst): if _samefile(src, dst): # We might be on a case insensitive filesystem, # perform the rename anyway. os.rename(src, dst) return real_dst = os.path.join(dst, _basename(src)) if os.path.exists(real_dst): raise Error, "Destination path ‘%s‘ already exists" % real_dst try: os.rename(src, real_dst) except OSError: if os.path.isdir(src): if _destinsrc(src, dst): raise Error, "Cannot move a directory ‘%s‘ into itself ‘%s‘." % (src, dst) copytree(src, real_dst, symlinks=True) rmtree(src) else: copy2(src, real_dst) os.unlink(src)
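The copy, copytree, move, and rmtree functions listed above can be exercised together in a temporary directory. This is a minimal sketch; the file names a.log and b.pyc and the directory names are made up for the demonstration.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src = os.path.join(work, "src")
os.makedirs(src)
for name in ("a.log", "b.pyc"):
    with open(os.path.join(src, name), "w") as f:
        f.write(name)

# copy2 copies the data and also the stat info (like "cp -p").
shutil.copy2(os.path.join(src, "a.log"), os.path.join(work, "a_copy.log"))

# copytree requires that the destination does not exist yet;
# ignore_patterns skips everything matching the glob patterns.
dst = os.path.join(work, "dst")
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc"))
copied = sorted(os.listdir(dst))

# move renames when possible, otherwise copies and deletes.
moved = shutil.move(dst, os.path.join(work, "moved"))
moved_exists = os.path.isdir(moved)

shutil.rmtree(work)            # delete the whole scratch tree
print(copied, moved_exists, os.path.exists(work))
```

shutil.rmtree deletes without confirmation, so it is worth double-checking the path before calling it on real data.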
shutil.make_archive(base_name, format,...)
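- base_name: the file name of the archive, or a path to it. If only a file name is given the archive is saved in the current directory, otherwise it is saved at the given path,
e.g. www => saved in the current directory
e.g. /Users/wupeiqi/www => saved in /Users/wupeiqi/
- format: the archive type: "zip", "tar", "bztar", "gztar"
- root_dir: the path of the directory to archive (defaults to the current directory)
- owner: the user, defaults to the current user
- group: the group, defaults to the current group
- logger: used for logging, usually a logging.Logger object
# Archive the files under /Users/wupeiqi/Downloads/test and place the archive in the program's current directory
import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')

# Archive the files under /Users/wupeiqi/Downloads/test and place the archive in /Users/wupeiqi/
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')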
def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0, dry_run=0, owner=None, group=None, logger=None): """Create an archive file (eg. zip or tar). ‘base_name‘ is the name of the file to create, minus any format-specific extension; ‘format‘ is the archive format: one of "zip", "tar", "bztar" or "gztar". ‘root_dir‘ is a directory that will be the root directory of the archive; ie. we typically chdir into ‘root_dir‘ before creating the archive. ‘base_dir‘ is the directory where we start archiving from; ie. ‘base_dir‘ will be the common prefix of all files and directories in the archive. ‘root_dir‘ and ‘base_dir‘ both default to the current directory. Returns the name of the archive file. ‘owner‘ and ‘group‘ are used when creating a tar archive. By default, uses the current owner and group. """ save_cwd = os.getcwd() if root_dir is not None: if logger is not None: logger.debug("changing into ‘%s‘", root_dir) base_name = os.path.abspath(base_name) if not dry_run: os.chdir(root_dir) if base_dir is None: base_dir = os.curdir kwargs = {‘dry_run‘: dry_run, ‘logger‘: logger} try: format_info = _ARCHIVE_FORMATS[format] except KeyError: raise ValueError, "unknown archive format ‘%s‘" % format func = format_info[0] for arg, val in format_info[1]: kwargs[arg] = val if format != ‘zip‘: kwargs[‘owner‘] = owner kwargs[‘group‘] = group try: filename = func(base_name, base_dir, **kwargs) finally: if root_dir is not None: if logger is not None: logger.debug("changing back to ‘%s‘", save_cwd) os.chdir(save_cwd) return filename
shutil handles archives by calling the zipfile and tarfile modules. In detail:
import zipfile

# Compress
z = zipfile.ZipFile('', 'w')
z.write('a.log')
z.write('')
z.close()

# Extract
z = zipfile.ZipFile('', 'r')
z.extractall()
z.close()
- tarfile: compress and extract
import tarfile

# Compress
tar = tarfile.open('your.tar','w')
tar.add('/Users/wupeiqi/PycharmProjects/', arcname='')
tar.add('/Users/wupeiqi/PycharmProjects/', arcname='')
tar.close()

# Extract
tar = tarfile.open('your.tar','r')
tar.extractall()  # an extraction path can be given
tar.close()
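Both archive classes also work as context managers, which closes the archive even if an error occurs. The sketch below builds a zip, a tar, and a gztar archive in a temporary directory; the file names demo.zip, demo.tar, and www are placeholders chosen for this example.

```python
import os
import shutil
import tarfile
import tempfile
import zipfile

work = tempfile.mkdtemp()
log_path = os.path.join(work, "a.log")
with open(log_path, "w") as f:
    f.write("demo")

# zipfile: arcname controls the name stored inside the archive.
zip_path = os.path.join(work, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as z:
    z.write(log_path, arcname="a.log")
with zipfile.ZipFile(zip_path, "r") as z:
    zip_names = z.namelist()

# tarfile: same idea, opened through tarfile.open.
tar_path = os.path.join(work, "demo.tar")
with tarfile.open(tar_path, "w") as tar:
    tar.add(log_path, arcname="a.log")
with tarfile.open(tar_path, "r") as tar:
    tar_names = tar.getnames()

# make_archive appends the extension that matches the format.
source_dir = os.path.join(work, "test")
os.makedirs(source_dir)
shutil.copy(log_path, source_dir)
archive = shutil.make_archive(os.path.join(work, "www"), "gztar", root_dir=source_dir)
archive_created = os.path.isfile(archive)

shutil.rmtree(work)
print(zip_names, tar_names, os.path.basename(archive))
```

Without arcname, the full path given to write or add is stored in the archive, which is rarely what is wanted.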
- ZipFile
class ZipFile(object): """ Class with methods to open, read, write, close, list zip files. z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False) file: Either the path to the file, or a file-like object. If it is a path, the file will be opened and closed by ZipFile. mode: The mode can be either read "r", write "w" or append "a". compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib). allowZip64: if True ZipFile will create files with ZIP64 extensions when needed, otherwise it will raise an exception when this would be necessary. """ fp = None # Set here since __del__ checks it def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False): """Open the ZIP file with mode read "r", write "w" or append "a".""" if mode not in ("r", "w", "a"): raise RuntimeError(‘ZipFile() requires mode "r", "w", or "a"‘) if compression == ZIP_STORED: pass elif compression == ZIP_DEFLATED: if not zlib: raise RuntimeError, "Compression requires the (missing) zlib module" else: raise RuntimeError, "That compression method is not supported" self._allowZip64 = allowZip64 self._didModify = False self.debug = 0 # Level of printing: 0 through 3 self.NameToInfo = {} # Find file info given name self.filelist = [] # List of ZipInfo instances for archive self.compression = compression # Method of compression self.mode = key = mode.replace(‘b‘, ‘‘)[0] self.pwd = None self._comment = ‘‘ # Check if we were passed a file-like object if isinstance(file, basestring): self._filePassed = 0 self.filename = file modeDict = {‘r‘ : ‘rb‘, ‘w‘: ‘wb‘, ‘a‘ : ‘r+b‘} try: self.fp = open(file, modeDict[mode]) except IOError: if mode == ‘a‘: mode = key = ‘w‘ self.fp = open(file, modeDict[mode]) else: raise else: self._filePassed = 1 self.fp = file self.filename = getattr(file, ‘name‘, None) try: if key == ‘r‘: self._RealGetContents() elif key == ‘w‘: # set the modified flag so central directory gets written # even if no files are added to the archive self._didModify = 
True elif key == ‘a‘: try: # See if file is a zip file self._RealGetContents() # seek to start of directory and overwrite, 0) except BadZipfile: # file is not a zip file, just append, 2) # set the modified flag so central directory gets written # even if no files are added to the archive self._didModify = True else: raise RuntimeError(‘Mode must be "r", "w" or "a"‘) except: fp = self.fp self.fp = None if not self._filePassed: fp.close() raise def __enter__(self): return self def __exit__(self, type, value, traceback): self.close() def _RealGetContents(self): """Read in the table of contents for the ZIP file.""" fp = self.fp try: endrec = _EndRecData(fp) except IOError: raise BadZipfile("File is not a zip file") if not endrec: raise BadZipfile, "File is not a zip file" if self.debug > 1: print endrec size_cd = endrec[_ECD_SIZE] # bytes in central directory offset_cd = endrec[_ECD_OFFSET] # offset of central directory self._comment = endrec[_ECD_COMMENT] # archive comment # "concat" is zero, unless zip was concatenated to another file concat = endrec[_ECD_LOCATION] - size_cd - offset_cd if endrec[_ECD_SIGNATURE] == stringEndArchive64: # If Zip64 extension structures are present, account for them concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator) if self.debug > 2: inferred = concat + offset_cd print "given, inferred, offset", offset_cd, inferred, concat # self.start_dir: Position of start of central directory self.start_dir = offset_cd + concat, 0) data = fp = cStringIO.StringIO(data) total = 0 while total < size_cd: centdir = if len(centdir) != sizeCentralDir: raise BadZipfile("Truncated central directory") centdir = struct.unpack(structCentralDir, centdir) if centdir[_CD_SIGNATURE] != stringCentralDir: raise BadZipfile("Bad magic number for central directory") if self.debug > 2: print centdir filename =[_CD_FILENAME_LENGTH]) # Create ZipInfo instance to store file information x = ZipInfo(filename) x.extra =[_CD_EXTRA_FIELD_LENGTH]) x.comment 
=[_CD_COMMENT_LENGTH]) x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET] (x.create_version, x.create_system, x.extract_version, x.reserved, x.flag_bits, x.compress_type, t, d, x.CRC, x.compress_size, x.file_size) = centdir[1:12] x.volume, x.internal_attr, x.external_attr = centdir[15:18] # Convert date/time code to (year, month, day, hour, min, sec) x._raw_time = t x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F, t>>11, (t>>5)&0x3F, (t&0x1F) * 2 ) x._decodeExtra() x.header_offset = x.header_offset + concat x.filename = x._decodeFilename() self.filelist.append(x) self.NameToInfo[x.filename] = x # update total bytes read from central directory total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH] + centdir[_CD_EXTRA_FIELD_LENGTH] + centdir[_CD_COMMENT_LENGTH]) if self.debug > 2: print "total", total def namelist(self): """Return a list of file names in the archive.""" l = [] for data in self.filelist: l.append(data.filename) return l def infolist(self): """Return a list of class ZipInfo instances for files in the archive.""" return self.filelist def printdir(self): """Print a table of contents for the zip file.""" print "%-46s %19s %12s" % ("File Name", "Modified ", "Size") for zinfo in self.filelist: date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6] print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size) def testzip(self): """Read all the files and check the CRC.""" chunk_size = 2 ** 20 for zinfo in self.filelist: try: # Read by chunks, to avoid an OverflowError or a # MemoryError with very large embedded files. 
with, "r") as f: while # Check CRC-32 pass except BadZipfile: return zinfo.filename def getinfo(self, name): """Return the instance of ZipInfo given ‘name‘.""" info = self.NameToInfo.get(name) if info is None: raise KeyError( ‘There is no item named %r in the archive‘ % name) return info def setpassword(self, pwd): """Set default password for encrypted files.""" self.pwd = pwd @property def comment(self): """The comment text associated with the ZIP file.""" return self._comment @comment.setter def comment(self, comment): # check for valid comment length if len(comment) > ZIP_MAX_COMMENT: import warnings warnings.warn(‘Archive comment is too long; truncating to %d bytes‘ % ZIP_MAX_COMMENT, stacklevel=2) comment = comment[:ZIP_MAX_COMMENT] self._comment = comment self._didModify = True def read(self, name, pwd=None): """Return file bytes (as a string) for name.""" return, "r", pwd).read() def open(self, name, mode="r", pwd=None): """Return file-like object for ‘name‘.""" if mode not in ("r", "U", "rU"): raise RuntimeError, ‘open() requires mode "r", "U", or "rU"‘ if not self.fp: raise RuntimeError, "Attempt to read ZIP archive that was already closed" # Only open a new file for instances where we were not # given a file object in the constructor if self._filePassed: zef_file = self.fp should_close = False else: zef_file = open(self.filename, ‘rb‘) should_close = True try: # Make sure we have an info object if isinstance(name, ZipInfo): # ‘name‘ is already an info object zinfo = name else: # Get info object for name zinfo = self.getinfo(name), 0) # Skip the file header: fheader = if len(fheader) != sizeFileHeader: raise BadZipfile("Truncated file header") fheader = struct.unpack(structFileHeader, fheader) if fheader[_FH_SIGNATURE] != stringFileHeader: raise BadZipfile("Bad magic number for file header") fname =[_FH_FILENAME_LENGTH]) if fheader[_FH_EXTRA_FIELD_LENGTH]:[_FH_EXTRA_FIELD_LENGTH]) if fname != zinfo.orig_filename: raise BadZipfile, ‘File name in directory 
"%s" and header "%s" differ.‘ % ( zinfo.orig_filename, fname) # check for encrypted flag & handle password is_encrypted = zinfo.flag_bits & 0x1 zd = None if is_encrypted: if not pwd: pwd = self.pwd if not pwd: raise RuntimeError, "File %s is encrypted, " "password required for extraction" % name zd = _ZipDecrypter(pwd) # The first 12 bytes in the cypher stream is an encryption header # used to strengthen the algorithm. The first 11 bytes are
                # completely random, while the 12th contains the MSB of the CRC,
                # or the MSB of the file time depending on the header type
                # and is used to check the correctness of the password.
                bytes =
                h = map(zd, bytes[0:12])
                if zinfo.flag_bits & 0x8:
                    # compare against the file type from extended local headers
                    check_byte = (zinfo._raw_time >> 8) & 0xff
                else:
                    # compare against the CRC otherwise
                    check_byte = (zinfo.CRC >> 24) & 0xff
                if ord(h[11]) != check_byte:
                    raise RuntimeError("Bad password for file", name)

                return ZipExtFile(zef_file, mode, zinfo, zd,
                        close_fileobj=should_close)
        except:
            if should_close:
                zef_file.close()
            raise

    def extract(self, member, path=None, pwd=None):
        """Extract a member from the archive to the current working directory,
           using its full name. Its file information is