head, tail and backward read by lines of a text file
【Title】: head, tail and backward read by lines of a text file
【Posted】: 2011-08-19 06:13:31
【Question】: How can I implement the 'head' and 'tail' commands in Python, and read a text file backward line by line?
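【Comments】:
Possible duplicate of "Read a file in reverse order using python".
I need to read a large log file backward.
I guess you are not familiar with tac, since your question amounts to "implement tac in Python".
Possible duplicate of "Get last n lines of a file with Python, similar to tail".
【Answer 1】: This is my personal File class ;-)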
class File(file):
    """ A helper class for file reading (Python 2: subclasses the built-in file) """

    def __init__(self, *args, **kwargs):
        super(File, self).__init__(*args, **kwargs)
        self.BLOCKSIZE = 4096

    def head(self, lines_2find=1):
        self.seek(0)                            # Rewind file
        return [super(File, self).next() for x in xrange(lines_2find)]

    def tail(self, lines_2find=1):
        self.seek(0, 2)                         # Go to end of file
        bytes_in_file = self.tell()
        lines_found, total_bytes_scanned = 0, 0
        # Scan blocks backward until enough newlines have been seen
        while (lines_2find + 1 > lines_found and
               bytes_in_file > total_bytes_scanned):
            byte_block = min(
                self.BLOCKSIZE,
                bytes_in_file - total_bytes_scanned)
            self.seek(-(byte_block + total_bytes_scanned), 2)
            total_bytes_scanned += byte_block
            lines_found += self.read(self.BLOCKSIZE).count('\n')
        self.seek(-total_bytes_scanned, 2)
        line_list = list(self.readlines())
        return line_list[-lines_2find:]

    def backward(self):
        self.seek(0, 2)                         # Go to end of file
        blocksize = self.BLOCKSIZE
        last_row = ''
        while self.tell() != 0:
            try:
                self.seek(-blocksize, 1)
            except IOError:
                # Fewer than blocksize bytes remain before the current position
                blocksize = self.tell()
                self.seek(-blocksize, 1)
            block = self.read(blocksize)
            self.seek(-blocksize, 1)
            rows = block.split('\n')
            # The last piece continues into the partial row kept from the previous block
            rows[-1] = rows[-1] + last_row
            while rows:
                last_row = rows.pop(-1)
                if rows and last_row:
                    yield last_row
        yield last_row
Example usage:
with File('file.name') as f:
    print f.head(5)
    print f.tail(5)
    for row in f.backward():
        print row
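The class above subclasses the Python 2 built-in `file`, which does not exist in Python 3. As a rough illustration of the same backward-reading idea in Python 3, here is a minimal sketch, not the answer author's code. The function name `read_backward` and its `blocksize` and `encoding` parameters are assumptions made for this example. It opens the file in binary mode, reads fixed-size blocks from the end, and carries the incomplete first piece of each block over to the next read. Unlike the class above, it also yields blank lines.

```python
import os


def read_backward(path, blocksize=4096, encoding="utf-8"):
    """Yield the lines of a file from last to first, without line endings.

    Hypothetical helper for illustration; assumes "\n" line endings.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        if position == 0:
            return  # empty file: nothing to yield
        f.seek(position - 1)
        if f.read(1) == b"\n":
            position -= 1  # ignore the newline that terminates the last line
        remainder = b""  # start of a line whose beginning is in an earlier block
        while position > 0:
            step = min(blocksize, position)
            position -= step
            f.seek(position)
            block = f.read(step) + remainder
            pieces = block.split(b"\n")
            # The first piece may be incomplete: keep it for the next block.
            remainder = pieces.pop(0)
            for piece in reversed(pieces):
                yield piece.decode(encoding)
        # Whatever is left is the first line of the file.
        yield remainder.decode(encoding)
```

Lines are decoded only once they are complete, so a multi-byte UTF-8 character that straddles a block boundary is not split. A file with one extremely long line would still be buffered in memory in full.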
【Comments】:
Does anyone have a Python 3 version of this? I get: NameError: name 'file' is not defined
【Answer 2】: head is easy:
from itertools import islice
with open("file") as f:
    for line in islice(f, n):
        print line
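tail is harder if you do not want to hold the whole file in memory. If the input is a file, you can read blocks starting from the end of the file. The original tail also works when the input is a pipe, so a more general solution is to read and discard the entire input except for the last few lines. An easy way to do this is collections.deque: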
from collections import deque
with open("file") as f:
    for line in deque(f, maxlen=n):
        print line
In both code snippets, n is the number of lines to print.
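The two snippets can be wrapped into reusable Python 3 functions. This is a minimal sketch under the same approach (`islice` for the first lines, a bounded `deque` for the last ones). The function names `head_lines` and `tail_lines` are assumptions made for this example and do not come from the answer.

```python
from collections import deque
from itertools import islice


def head_lines(path, n):
    """Return the first n lines of a text file, line endings included."""
    with open(path) as f:
        # islice stops after n lines, so the rest of the file is never read.
        return list(islice(f, n))


def tail_lines(path, n):
    """Return the last n lines of a text file, line endings included."""
    with open(path) as f:
        # A deque with maxlen=n discards older lines as new ones arrive,
        # so at most n lines are held in memory. The whole file is still read.
        return list(deque(f, maxlen=n))
```

Because `tail_lines` only iterates forward, the same `deque(..., maxlen=n)` call also works on a stream that cannot seek, such as `sys.stdin`. The cost is that every line of the input has to be read once.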
【Comments】:
Very elegant, but tail with a deque is too slow for huge log files (hundreds of MB).
【Answer 3】: tail:
import os

def tail(fname, lines):
    """Read last N lines from file fname."""
    BUFSIZ = 1024
    with open(fname, 'r') as f:  # close the file even if an error occurs
        f.seek(0, os.SEEK_END)
        fsize = f.tell()
        block = -1
        data = ""
        exit = False
        while not exit:
            step = (block * BUFSIZ)
            if abs(step) >= fsize:
                # The step reaches past the start of the file: read all of it
                f.seek(0)
                exit = True
            else:
                f.seek(step, os.SEEK_END)
            data = f.read().strip()
            if data.count('\n') >= lines:
                break
            else:
                block -= 1
    return data.splitlines()[-lines:]
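In Python 3, a file opened in text mode rejects non-zero seeks relative to the end of the file, so the function above needs the file opened in binary mode there. The following is a minimal Python 3 sketch of the same block-wise idea, not the answer author's code. The name `tail_blocks` and the `bufsize` and `encoding` parameters are assumptions made for this example. Instead of re-reading from the seek position to the end on every pass, it prepends each new block to a buffer.

```python
import os


def tail_blocks(path, n, bufsize=1024, encoding="utf-8"):
    """Return the last n lines of a file as strings, without line endings.

    Hypothetical helper for illustration.
    """
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # Read blocks from the end until the buffer holds more than n
        # newlines. n + 1 newlines guarantee n complete lines even when
        # the buffer starts in the middle of a line.
        while position > 0 and data.count(b"\n") <= n:
            step = min(bufsize, position)
            position -= step
            f.seek(position)
            data = f.read(step) + data
    # Decode only the selected lines, which are known to be complete.
    return [line.decode(encoding) for line in data.splitlines()[-n:]]
```

For a large log file this reads only the last few blocks, which addresses the speed concern raised about the deque approach, at the price of requiring a seekable file.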