Python Programming: File Operations

Posted 2023-02-27 -飞鹤-

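1. Basic data structures

1.1. bytes and bytearray

bytes is a built-in type that describes a sequence of bytes. Its memory is allocated contiguously, and it is immutable.
bytearray is a built-in type that describes a sequence of bytes. Its memory is allocated contiguously, and it is mutable.
bytes and bytearray objects can be converted to each other.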

>>> b=b'abc'
>>> type(b)
<class 'bytes'>
>>> b=b"abc"
>>> type(b)
<class 'bytes'>
>>> content = bytes([1, 2, 3, 4])
>>> content
b'\x01\x02\x03\x04'
>>> type(content)
<class 'bytes'>
>>> content = bytearray([1, 2, 3, 4])
>>> print(content)
bytearray(b'\x01\x02\x03\x04')
>>> content[1] = 5
>>> content
bytearray(b'\x01\x05\x03\x04')
>>> content_array = bytearray(content)
>>> type(content_array)
<class 'bytearray'>
>>> content = bytes(content_array)
>>> type(content)
<class 'bytes'>
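
The mutability difference described above can be sketched as follows. This is a minimal illustration; the variable names `data`, `buf` and `mutated` are made up for the example.

```python
data = bytes([1, 2, 3, 4])

try:
    data[1] = 5          # bytes is immutable, so item assignment fails
    mutated = True
except TypeError:
    mutated = False

buf = bytearray(data)    # make a mutable copy of the same bytes
buf[1] = 5               # item assignment is allowed on bytearray
buf.append(6)            # a bytearray can also grow in place

print(mutated)           # False
print(bytes(buf))        # b'\x01\x05\x03\x04\x06'
```

The original bytes object is left untouched; only the bytearray copy changes.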

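1.2. str and list

str is the string class. A string can be created directly with single quotes (') or double quotes ("). A str is a sequence of characters stored in contiguous memory, and it is immutable.
list is the list class and can be created with []. It is mutable. A list holds references to its elements, so the elements themselves are not necessarily contiguous in memory.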

>>> nums=['ww','22','2s']
>>> nums
['ww', '22', '2s']
>>> type(nums)
<class 'list'>
>>> name = "mike"
>>> type(name)
<class 'str'>
>>> name
'mike'

str to list

>>> name_list = list(name)
>>> name_list
['m', 'i', 'k', 'e']

list to str

>>> nums = [1, 2, 3, 4]
>>> nums_list = [str(x) for x in nums]
>>> nums_list
['1', '2', '3', '4']
>>> ''.join(nums_list)
'1234'
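
A round trip with a separator may make the list/str conversion clearer. The sketch below assumes a comma as the separator; the names `joined`, `parts` and `restored` are illustrative only.

```python
nums = [1, 2, 3, 4]

joined = ",".join(str(x) for x in nums)   # join() requires every element to be a str
parts = joined.split(",")                 # split back into a list of str
restored = [int(p) for p in parts]        # convert each piece back to int

print(joined)    # 1,2,3,4
print(parts)     # ['1', '2', '3', '4']
print(restored)  # [1, 2, 3, 4]
```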

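1.3. Comparing str and bytes

str and bytes are both immutable types stored in contiguous memory. Although their structure is similar, and the data they hold can even be identical, a str is a sequence of characters and is presented as text, whereas a bytes object is a sequence of bytes and is presented as a sequence of integers. encode and decode use UTF-8 by default; another encoding such as GBK can be specified explicitly.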

>>> name = "mike"
>>> name_bytes = name.encode()
>>> name_bytes
b'mike'
>>> name = name_bytes.decode()
>>> name
'mike'
>>> text = "我是谁"
>>> text.encode('gbk')
b'\xce\xd2\xca\xc7\xcb\xad'
>>> text_bytes = text.encode('gbk')
>>> text_bytes
b'\xce\xd2\xca\xc7\xcb\xad'
>>> new_text = text_bytes.decode('gbk')
>>> new_text
'我是谁'
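
The choice of encoding matters because the same text produces different bytes under UTF-8 and GBK, and decoding with the wrong codec fails. A small sketch (variable names are illustrative):

```python
text = "我是谁"

utf8_bytes = text.encode()        # UTF-8 by default: 3 bytes per character here
gbk_bytes = text.encode("gbk")    # GBK: 2 bytes per character here
print(len(utf8_bytes), len(gbk_bytes))   # 9 6

try:
    gbk_bytes.decode("utf-8")     # wrong codec for this data
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)                 # False

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising
lossy = gbk_bytes.decode("utf-8", errors="replace")
```

Decoding with errors="replace" never raises, but the result is lossy, so it is better suited to diagnostics than to real data handling.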

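1.4. Converting between int and bytes

# int to bytes (0x123456789abc needs 6 bytes; a shorter length raises OverflowError)
val = 0x123456789abc
bytes_val = val.to_bytes(6, "big")
bytes_val = val.to_bytes(6, "little")
# bytes to int (the byte order must match the one used by to_bytes)
val = int.from_bytes(bytes_val, "little")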
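
When the required length is not known in advance, it can be derived from the integer itself. The following sketch computes it with bit_length(); the names `length`, `big`, `little` and `too_short` are made up for the example.

```python
val = 0x123456789abc

length = (val.bit_length() + 7) // 8     # smallest byte count that can hold val
big = val.to_bytes(length, "big")        # most significant byte first
little = val.to_bytes(length, "little")  # least significant byte first

print(length)        # 6
print(big.hex())     # 123456789abc
print(little.hex())  # bc9a78563412

try:
    val.to_bytes(4, "big")               # 4 bytes cannot hold a 45-bit value
    too_short = False
except OverflowError:
    too_short = True
print(too_short)     # True
```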

2. Converting between numbers and strings

2.1. Number to string

Numbers are converted to strings mainly with the str.format method or with f-string syntax.

>>> name = 'Peter'
>>> age = 23
>>> print('{} is {} years old'.format(name, age))
Peter is 23 years old
>>> print(f'{name} is {age} years old')
Peter is 23 years old
>>> a = 123.456
>>> f"a if {a:8.2f}"
'a if   123.46'
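
A few more format specifications, as a quick sketch of what the part after the colon controls (the values and variable names are chosen only for illustration):

```python
n = 255
price = 1234.5

as_hex = f"{n:#x}"          # hexadecimal with the 0x prefix
as_bits = f"{n:08b}"        # binary, zero-padded to 8 digits
padded = f"{n:>6}"          # right-aligned in a field 6 characters wide
grouped = f"{price:,.2f}"   # thousands separator and 2 decimal places
plain = str(n)              # str() is enough when no formatting is needed

print(as_hex)    # 0xff
print(as_bits)   # 11111111
print(padded)    #    255
print(grouped)   # 1,234.50
print(plain)     # 255
```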

2.2. String to number

str to int

>>> phone_number = "13988888888"
>>> int(phone_number, 10)
13988888888
>>> int(phone_number)
13988888888
>>> hex_str = "0x1234"
>>> int(hex_str, 16)
4660
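
Input that is not a valid number raises ValueError, so conversions of untrusted text are usually wrapped in try/except. The sketch below uses a hypothetical helper named `to_int` (not part of the standard library) and also shows float() and base 0, which infers the base from the prefix.

```python
def to_int(text, default=None):
    """Hypothetical helper: return int(text), or default if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

pi = float("3.14")            # str to float
five = int("0b101", 2)        # a 0b prefix is accepted with base 2
auto = int("0x1234", 0)       # base 0 infers the base from the prefix
bad = to_int("12a")           # invalid input falls back to the default
padded = to_int(" 42 ")       # surrounding whitespace is allowed by int()

print(pi, five, auto, bad, padded)   # 3.14 5 4660 None 42
```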

3. File operations

3.1. Creating and deleting files and directories

Creating a file

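import os
import pathlib

# os.mknod is available on Unix only and may need elevated privileges on some platforms
os.mknod("test.txt")
# portable alternative: open in write mode and close immediately
open("test.txt", "w").close()

# create a single directory
os.mkdir("test")
# create nested directories
os.makedirs(r"d:\abc\def")
# create a directory with pathlib
pathlib.Path("temp/").mkdir(parents=True, exist_ok=True)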
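
pathlib can also create an empty file with Path.touch(). The sketch below runs inside a temporary directory so that it leaves nothing behind; the directory and file names are placeholders.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)

    nested = base / "abc" / "def"
    nested.mkdir(parents=True, exist_ok=True)   # create all missing parents
    nested.mkdir(parents=True, exist_ok=True)   # repeating the call is harmless

    target = nested / "test.txt"
    target.touch()                              # create an empty file
    created = target.is_file()
    size = target.stat().st_size

print(created, size)   # True 0
```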

Deleting a file

os.remove("test.txt")

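Deleting an empty directory

# use one of the following, not both
os.rmdir("test")       # removes a single empty directory
os.removedirs("test")  # removes the directory, then any parent directories that become empty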

Deleting a directory and its contents

import shutil
shutil.rmtree('/path/to/your/dir/')
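
The three kinds of deletion can be tried safely inside a temporary directory. This is a sketch using pathlib equivalents (Path.unlink and Path.rmdir) next to shutil.rmtree; all paths are placeholders.

```python
import pathlib
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    sub = base / "tree" / "sub"
    sub.mkdir(parents=True)
    file_path = sub / "a.txt"
    file_path.write_text("x")

    file_path.unlink()                  # delete the file
    file_path.unlink(missing_ok=True)   # Python 3.8+: no error if it is already gone
    sub.rmdir()                         # works only because sub is now empty
    sub_exists = sub.exists()

    (base / "tree" / "b.txt").write_text("y")
    shutil.rmtree(base / "tree")        # removes the directory and everything in it
    tree_exists = (base / "tree").exists()

print(sub_exists, tree_exists)   # False False
```

shutil.rmtree does not ask for confirmation, so the path passed to it deserves a second look before the call.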

3.2. Iterating over files and directories

>>> import pathlib
>>> p = pathlib.Path('.')
>>> [x for x in p.iterdir() if x.is_dir()]

Iterating over all files

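import pathlib

# recursively find every .py file under the current directory
files = pathlib.Path(".").rglob("*.py")
for file in files:
    print(file)

# list the direct children of the current directory
p = pathlib.Path('.')
for x in p.iterdir():
    print(x)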
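
os.walk is an alternative way to traverse a directory tree recursively. The sketch below builds a small throwaway tree first so that the result is predictable; the file names are invented for the example.

```python
import os
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    (base / "pkg").mkdir()
    (base / "main.py").write_text("")
    (base / "pkg" / "util.py").write_text("")
    (base / "pkg" / "notes.txt").write_text("")

    py_files = []
    # os.walk yields (directory path, subdirectory names, file names) for each directory
    for dirpath, dirnames, filenames in os.walk(base):
        for name in filenames:
            if name.endswith(".py"):
                full = pathlib.Path(dirpath) / name
                py_files.append(full.relative_to(base).as_posix())
    py_files.sort()

print(py_files)   # ['main.py', 'pkg/util.py']
```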

3.3. Reading and writing files

Reading and writing binary files

# read the whole file, patch one byte in a mutable copy, then write a new file
with open("scan.bin", "rb") as bin_file:
    content = bin_file.read()
content_array = bytearray(content)
content_array[2] = 5
with open("new.bin", "wb") as new_file:
    new_file.write(content_array)
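
Reading a whole file with read() is fine for small files; for large ones it is common to read fixed-size chunks instead. A minimal sketch, assuming a deliberately tiny chunk size and a throwaway sample file:

```python
import pathlib
import tempfile

CHUNK_SIZE = 4   # tiny on purpose for the demo; real code would use something like 64 * 1024

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "sample.bin"   # hypothetical file name
    path.write_bytes(bytes(range(10)))

    chunk_sizes = []
    total = bytearray()
    with open(path, "rb") as bin_file:
        # iter(callable, sentinel) keeps calling read() until it returns b""
        for chunk in iter(lambda: bin_file.read(CHUNK_SIZE), b""):
            chunk_sizes.append(len(chunk))
            total.extend(chunk)

print(chunk_sizes)   # [4, 4, 2]
```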

Reading and writing text files

# UTF-8 encoding
with open("test6.py", "r", encoding='utf-8') as file:
    text = file.read()
    print(text)
# GBK-family encoding (gb18030 is a superset of GBK)
with open("a.c", "r", encoding='gb18030') as file:
    text = file.read()
    print(text)
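
Writing text works the same way with mode "w" (truncate and write) or "a" (append). A short sketch using a temporary file; the file name and contents are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")

    with open(path, "w", encoding="utf-8") as file:   # "w" creates or truncates
        file.write("first line\n")

    with open(path, "a", encoding="utf-8") as file:   # "a" appends to the end
        file.write("second line\n")

    with open(path, "r", encoding="utf-8") as file:
        text = file.read()

print(text, end="")
```

Passing encoding explicitly avoids depending on the platform default, which differs between systems.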

Reading and writing with pathlib

>>> import pathlib
>>> p = pathlib.Path('a.bin')
>>> p.write_bytes(b'Binary file contents')
20
>>> p.read_bytes()
b'Binary file contents'
>>> p = pathlib.Path('my_text_file.txt')
>>> p.write_text('Text file contents')
18
>>> p.read_text()
'Text file contents'

Reading line by line

with open("test6.py", "r", encoding='utf-8') as file:
    print(file.readline())  # read line 1
    print(file.readline())  # read line 2

# print every line of the file
with open("test6.py", "r", encoding='utf-8') as file:
    lines = file.readlines()
    for line in lines:
        print(line, end="")  # each line already ends with a newline
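
A file object is itself an iterator over its lines, so a large file can be processed without loading every line at once as readlines() does. A small sketch with a throwaway file (names are illustrative):

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "lines.txt"   # hypothetical file name
    path.write_text("alpha\nbeta\ngamma\n", encoding="utf-8")

    numbered = []
    with open(path, "r", encoding="utf-8") as file:
        # iterating the file yields one line at a time, newline included
        for number, line in enumerate(file, start=1):
            numbered.append((number, line.rstrip("\n")))

print(numbered)   # [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
```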

4. Application

4.1. Converting a binary file to a C array

# coding=utf-8
# sys.argv[1]: input binary file, sys.argv[2]: output C header
import sys

per_line_size = 16  # number of bytes written per line

with open(sys.argv[1], "rb") as bin_file:
    content = bin_file.read()

# format each group of 16 bytes as one row of hex literals
rows = []
for start in range(0, len(content), per_line_size):
    chunk = content[start:start + per_line_size]
    rows.append("    " + ", ".join(f"0x{byte:02x}" for byte in chunk))

lines = []
lines.append("#pragma once\n")
lines.append("\n")
lines.append("const static char binArr[] = {\n")
lines.append(",\n".join(rows) + "\n")  # no trailing comma after the last byte
lines.append("};\n")

with open(sys.argv[2], "w") as file:
    file.writelines(lines)

The result is shown in the figure below:
