Takewhile lambda function not recognizing string

Posted 2023-02-21

[Asked]: 2022-01-11 23:25:34

[Question]:

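I have a comment section at the start of my file. I want to pull out the line that starts with '# Description: ', but for some reason I don't understand, it isn't working. Passing '#' gives me what I expect, as does '# NOTE', but '# Description: ' and even '# D' seem to return nothing. Can someone help me understand this?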

Here is the comment section of my file:

# NOTE: successive whitespace characters treated as single delimiter
# NOTE: all lines beginning with '#' treated as comments
# NOTE: Description must come after '# Description: ' to be recognized
#
# Description: High dispersion optics with O-16 (4+) at 6 MeV/nucleon. Provided by <first, last> on <datetime>.
#
#

Here is the code I am using:

from itertools import takewhile
with open(pathname, 'r') as fobj:
    # takewhile returns an iterator over all the lines
    # that start with the comment string
    headiter = takewhile(lambda s: s.startswith('# Description: '), fobj)
    description = list(headiter)

[Comments]:

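It seems the function only works for the leading '# NOTE:' lines and for nothing after them. Although '#' returns the description line in the list, '# ' does not. It doesn't seem to matter what string I replace "Description" with. When I delete the blank comment line between the third NOTE line and the Description line, the string '# ' also returns the description line, but '# D' or '# Description:' still does not. I still don't understand why.

[Answer 1]: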

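takewhile yields elements from the iterator only while the condition is True, and it stops for good at the first element for which the condition is False. In your case the condition is False on the very first line (a '# NOTE:' line) and would only become True at the fifth line, the Description line. takewhile therefore stops immediately and the resulting list is empty.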
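The behaviour described above can be reproduced without a file. The following is a minimal sketch, assuming an in-memory list stands in for the file's header (the description text is abbreviated, and the helper name `leading` is made up for illustration):

```python
from itertools import takewhile

# Stand-in for the file's header; each string ends with "\n" as a line read from a file would.
lines = [
    "# NOTE: successive whitespace characters treated as single delimiter\n",
    "# NOTE: all lines beginning with '#' treated as comments\n",
    "# NOTE: Description must come after '# Description: ' to be recognized\n",
    "#\n",
    "# Description: High dispersion optics with O-16 (4+) at 6 MeV/nucleon.\n",
    "#\n",
    "#\n",
]

def leading(prefix):
    """Return the lines takewhile keeps for a given prefix (hypothetical helper)."""
    return list(takewhile(lambda s: s.startswith(prefix), lines))

all_comments = leading("#")           # every line passes, so all 7 are kept
notes_only = leading("# ")            # stops at the bare "#\n" on the fourth line
nothing = leading("# Description: ")  # the first line already fails, so nothing is kept

print(len(all_comments), len(notes_only), len(nothing))  # 7 3 0
```

This also accounts for the observation in the comments: '# ' stops at the bare '#' line, so the description line only shows up once that blank comment line is removed.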

You want to use dropwhile and next:

from itertools import dropwhile
with open(pathname, 'r') as fobj:
    # drop lines until one starts with "# Description: "
    headiter = dropwhile(lambda s: not s.startswith('# Description: '), fobj)
    # take the first remaining line, which is the description line
    description = next(headiter)

Output (shown without the trailing newline that a line read from the file carries):

'# Description: High dispersion optics with O-16 (4+) at 6 MeV/nucleon. Provided by <first, last> on <datetime>.'
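If the file might not contain a description line at all, next raises StopIteration. A possible variant, sketched here under the assumption that returning None is acceptable in that case, filters with a generator expression instead of dropwhile and also strips the prefix and the newline. The function name `read_description` and the sample text are made up for illustration:

```python
import io

PREFIX = "# Description: "

def read_description(fobj, prefix=PREFIX):
    """Return the text after the first line starting with prefix, or None (hypothetical helper)."""
    matches = (line for line in fobj if line.startswith(prefix))
    line = next(matches, None)  # the default avoids StopIteration when nothing matches
    if line is None:
        return None
    return line[len(prefix):].rstrip("\n")

# io.StringIO stands in for an opened file in this sketch.
sample = io.StringIO("# NOTE: a\n#\n# Description: High dispersion optics\n#\n")
found = read_description(sample)
missing = read_description(io.StringIO("# NOTE: only notes\n"))

print(found)    # High dispersion optics
print(missing)  # None
```

With a real file, the same function could be called as read_description(fobj) inside the with open(pathname, 'r') as fobj block.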

[Comments]:

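Thank you so much! That works! I don't quite understand what next() is doing, though. Does it just take the next item, which in this case is the first item of the list?

@DanielCrisp next fetches the next element of the iterator. Try running it again and you will get the following lines, and so on until the iterator is exhausted, at which point it raises a StopIteration exception.

Awesome, that makes sense. Thanks again, I really appreciate it!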
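The point about next() can be checked on a small in-memory example. This is only a sketch with made-up lines, not the asker's actual file:

```python
from itertools import dropwhile

# Made-up stand-in for the file's lines.
lines = ["# NOTE: a\n", "# Description: b\n", "#\n"]
it = dropwhile(lambda s: not s.startswith("# Description: "), lines)

first = next(it)   # the first line that matched the prefix
second = next(it)  # dropwhile passes every later line through unchanged

try:
    next(it)       # nothing left, so this raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

fallback = next(it, None)  # with a default, next returns it instead of raising

print(repr(first), repr(second), exhausted, fallback)
```

Once dropwhile has seen the first matching line it no longer tests the condition, which is why the second call returns the bare '#' line.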
