Pythonic method to sum all the odd-numbered lines in a file

Posted 2023-02-25

【Title】: Pythonic method to sum all the odd-numbered lines in a file 【Posted】: 2013-04-15 18:55:04 【Question】:

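I'm learning Python for a graduate-school programming placement exam, and this is literally the first little script I've written to get a feel for it. My background is mostly C# and PHP, but I can't use either language on the test.

My test script reads in the text file below (test_file1.txt). Counting from zero, the even lines contain a sample size and the odd lines contain the "results" of each test in the sample. EOF is marked with a 0. I want to read in the file, output the sample size, and sum the results of each test. How would you do this in Python? I feel like I'm trying to force Python to behave like PHP or C#, and from what I've read I'm guessing there are very "Pythonic" ways of doing these things.

test_file1.txt: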

3
13 15 18
5 
19 52 87 55 1
4
11 8 63 4
2
99 3
0

My simple script:

file = open("test_file1.txt", "r")

i=0
for line in file:
    if i % 2 == 0:
        #num is even
        if line == '0':
            #EOF
            print 'End of experiment'   
    else:
        #num is odd
        numList = line.split( )
        numList = [int(x) for x in numList]
        print 'Sample size: ' + str(len(numList)) + ' Results: ' + str(sum(numList))
    i += 1

file.close()

My results:

Sample size: 3 Results: 46
Sample size: 5 Results: 214
Sample size: 4 Results: 86
Sample size: 2 Results: 102
End of experiment

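Thanks!

【Comments】:

So this is a test? And a code review? Use enumerate to number the lines: for i, line in enumerate(file)

@sr2222, yes, pretty much (well, I made up a sample test question off the top of my head). I wanted to compare how I approached the problem with no Python knowledge against how someone who knows Python would do it.

【Answer 1】:

Use the file as an iterator, then use itertools.islice() to get every second line: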

from itertools import islice

with open("test_file1.txt", "r") as f:
   for line in islice(f, 1, None, 2):
       nums = [int(n) for n in line.split()]
       print 'Sample size: {}  Results: {}'.format(len(nums), sum(nums))

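islice(f, 1, None, 2) skips the first line (start=1), then iterates over all remaining lines (stop=None), returning every second one (step=2).

This works for any file size you throw at it; it needs no more memory than the internal iterator buffer.

Output for your test file: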
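
To see which lines islice(f, 1, None, 2) selects, here is a small Python 3 sketch (the answer's code is Python 2). It assumes the sample data is held in an io.StringIO rather than a real file, and the helper name sum_odd_lines is made up for illustration.

```python
import io
from itertools import islice

# Stand-in for test_file1.txt so the sketch runs without touching the disk.
SAMPLE = "3\n13 15 18\n5\n19 52 87 55 1\n4\n11 8 63 4\n2\n99 3\n0\n"


def sum_odd_lines(f):
    """Yield (sample size, sum) for the lines at index 1, 3, 5, ... of f."""
    for line in islice(f, 1, None, 2):
        nums = [int(n) for n in line.split()]
        yield len(nums), sum(nums)


results = list(sum_odd_lines(io.StringIO(SAMPLE)))
for size, total in results:
    print('Sample size: {}  Results: {}'.format(size, total))
```

Because islice pulls lines from the file iterator one at a time, only the current line is held in memory; the sentinel line "0" sits at an even index and is never visited.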

Sample size: 3  Results: 46
Sample size: 5  Results: 214
Sample size: 4  Results: 86
Sample size: 2  Results: 102

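【Comments】:

Is itertools a third-party library or built into Python?

It's a standard library; it ships with Python. I linked to the method's documentation in the answer.

【Answer 2】:

You could do this: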

with open("test_file1.txt", "r") as inf:
    lines = inf.readlines()
    for line in lines[1::2]:  # take every second line, starting with the second
        numList = map(int, line.split())
        print "Sample size:", len(numList), "Results:", sum(numList)
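
For comparison, a Python 3 sketch of the same slicing idea. In Python 3, map() returns an iterator, so len() would fail on its result; a list comprehension is used here instead. The in-memory SAMPLE string and the function name sum_with_slice are assumptions for illustration.

```python
import io

# Stand-in for test_file1.txt so the sketch runs without touching the disk.
SAMPLE = "3\n13 15 18\n5\n19 52 87 55 1\n4\n11 8 63 4\n2\n99 3\n0\n"


def sum_with_slice(f):
    """Return (sample size, sum) pairs using readlines() plus a slice."""
    lines = f.readlines()           # the whole file is read into a list
    pairs = []
    for line in lines[1::2]:        # the slice builds a second, shorter list
        nums = [int(n) for n in line.split()]
        pairs.append((len(nums), sum(nums)))
    return pairs


slice_results = sum_with_slice(io.StringIO(SAMPLE))
print(slice_results)
```

This is shorter to read than the iterator version, at the cost of holding every line of the file in memory at once.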

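【Comments】:

I see @MartijnPieters's objection a lot. But I wonder: 1) does it matter? 2) when memory isn't an issue, is the islice() approach faster?

Unless you are absolutely sure the input file will always be small, it's best to avoid reading the whole file into memory at once. islice() is implemented entirely in C, so the overhead of using it is small, and most importantly, it doesn't create a copy of the list. lines[1::2] first has to create a new list containing all the odd lines, increasing memory use by 50% over the initial list for the file, and creating that list takes time too.

Not reading the whole file into memory was one thing I was trying to avoid with my simple solution, but as long as you check the file size first to make sure it's…

【Answer 3】:

How about something like this, fairly Pythonic IMHO: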

with open('test.txt') as fh:
    for i, line in enumerate(fh):
        if i % 2:
            nums = map(int, line.split())
            print 'Sample size: %d, Results: %d' % (len(nums), sum(nums))
        elif line == '0':
            print 'End of experiment'
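
One caveat for the sentinel check, sketched in Python 3: lines read from a file keep their trailing newline, so line == '0' only matches when the last line of the file has no newline after it. Stripping the line first removes that dependency. The in-memory sample and the function name process are assumptions for illustration.

```python
import io

# Shortened sample; note the newline after the final 0.
SAMPLE = "3\n13 15 18\n0\n"


def process(f):
    """Return the report lines: odd indexes are results, even ones sizes or the sentinel."""
    out = []
    for i, line in enumerate(f):
        if i % 2:
            nums = [int(n) for n in line.split()]
            out.append('Sample size: %d, Results: %d' % (len(nums), sum(nums)))
        elif line.strip() == '0':   # strip() drops the trailing '\n'
            out.append('End of experiment')
    return out


output = process(io.StringIO(SAMPLE))
print('\n'.join(output))
```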

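【Comments】:

I like this solution. It uses a built-in function (enumerate()) and a simple construct (modulo), so there's no need to know itertools.islice().

【Answer 4】:

I'm not sure how Pythonic people will find this, but I find zip, map, and reduce a very handy way to do this compactly. It may be a bit obfuscated, though.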

with open("test.txt") as fd:
   lines = [map(int, s.strip().split()) for s in fd.readlines()]
   print "\n".join("Sample Size: %d \t Results: %d"%tuple(map(sum,(d[0],d[1])))
                   for d in zip(lines, lines[1:], range(len(lines)))
                   if d[2] % 2 == 0)

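【Comments】:

tuple isn't needed, by the way; you can simply use (...) instead.

If you put (...) around it, it isn't converted to a tuple; the parentheses only group the expression. That leads to a string-formatting error, because the format expects numbers and its input would be a list: "%d"%(map(...)) raises an error.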
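
The point about tuple() versus plain parentheses can be checked with a short Python 3 sketch. The values list here is a made-up stand-in for the result of map(sum, ...).

```python
# Hypothetical stand-in for the two sums produced by map(sum, ...).
values = [3, 46]

# Parentheses alone only group an expression; this is still a list.
grouped = (values)
print(type(grouped).__name__)

# The % operator spreads only a tuple across several placeholders.
formatted = "Sample Size: %d \t Results: %d" % tuple(values)
print(formatted)

# A list counts as one single argument, so %d rejects it.
try:
    "Sample Size: %d \t Results: %d" % (values)
    raised = False
except TypeError:
    raised = True
print(raised)
```

A real one-element tuple needs a trailing comma, as in (values,), which is why bare parentheses cannot replace the tuple() call here.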
