Python - Deleting the first 2 lines of a string

Posted 2023-02-23

【Title】Python - Deleting the first 2 lines of a string 【Posted】2015-08-30 06:06:34 【Question】:

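I've searched a lot of threads here about removing the first two lines of a string, but I can't get any of the solutions I've tried to work.

This is what my string looks like: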

version 1.00
6992
[-4.32063, -9.1198, -106.59][0.00064, 0.99993, -0.01210][etc...]

I want to remove the first two lines of this Roblox mesh file for a script I'm working on. How can I do that?

【Comments】:

your_string.split('\n')[2:]

【Answer 1】:

This worked for me:

first_line = text.find('\n') + 1
second_line = text.find('\n', first_line) + 1
text = text[second_line:]
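
One caveat worth checking with this approach: str.find returns -1 when nothing is found, so the `+ 1` turns a miss into index 0. The sketch below wraps the snippet above in a helper to show what that means for short inputs. The helper name `drop_two_lines_find` and the sample strings are made up for illustration.

```python
def drop_two_lines_find(text):
    """Slice past the second newline, exactly as in the snippet above."""
    first_line = text.find('\n') + 1               # 0 if there is no newline at all
    second_line = text.find('\n', first_line) + 1  # 0 if there is no second newline
    return text[second_line:]


mesh = 'version 1.00\n6992\n[-4.32063, -9.1198, -106.59]'
print(repr(drop_two_lines_find(mesh)))    # three lines in: the first two are dropped
print(repr(drop_two_lines_find('a\nb')))  # only one newline: the string comes back unchanged
```

If the input may hold fewer than three lines, it is probably worth checking the result of the second find before slicing.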

【Comments】:

【Answer 2】:

''.join(x.splitlines(keepends=True)[2:])

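splitlines produces a list of strings. If keepends=True is given, the line breaks are included in the resulting list l, so ''.join(l) can be used to reproduce the original string.

Note that splitlines works with many different line boundaries, such as \u2028: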

>>> x = 'a\u2028b\u2028c\u2028'
>>> ''.join(x.splitlines(keepends=True)[2:])
'c\u2028'

whereas split('\n') fails in this case:

>>> x = 'a\u2028b\u2028c\u2028'
>>> x.split('\n',2)[2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

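Also note that splitlines and split('\n') behave differently when called on an empty string or on a string that ends with a newline. Compare the following examples (copied from the documentation of splitlines):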

>>> "".splitlines()
[]
>>> "One line\n".splitlines()
['One line']

>>> ''.split('\n')
['']
>>> 'Two lines\n'.split('\n')
['Two lines', '']

However, if keepends=True is given, the trailing newline is preserved:

>>> "One line\n".splitlines(keepends=True)
['One line\n']

More examples and the list of what splitlines treats as line boundaries can be found here: https://docs.python.org/3/library/stdtypes.html?highlight=split#str.splitlines
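
The one-liner above could be wrapped in a small reusable function. This is only a sketch: the name `drop_first_lines`, the parameter `n`, and the CRLF sample string are assumptions for illustration, not part of the original answer.

```python
def drop_first_lines(text, n=2):
    """Return text without its first n lines, keeping the original line endings."""
    return ''.join(text.splitlines(keepends=True)[n:])


sample = 'version 1.00\r\n6992\r\n[-4.32063, -9.1198, -106.59]\r\n'
print(repr(drop_first_lines(sample)))             # the \r\n endings survive
print(repr(drop_first_lines('only one line\n')))  # fewer than n lines: empty string, no error
```

Because slicing a list past its end yields an empty list, this version does not raise on short input.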

【Comments】:

【Answer 3】:

I don't know what your end-of-line character is, but what about something like

postString = inputString.split("\n",2)[2]

The end-of-line character might need to be escaped, but that's where I'd start.
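
Note that split("\n", 2) returns at most three pieces, so indexing with [2] raises IndexError when the input contains fewer than two newlines. A minimal guarded sketch follows; the function name `drop_two_lines_split` and the choice to return an empty string for short input are assumptions, not part of the answer.

```python
def drop_two_lines_split(input_string):
    """Return everything after the second newline, or '' if there is none."""
    parts = input_string.split("\n", 2)  # at most 3 pieces: line 1, line 2, the rest
    return parts[2] if len(parts) == 3 else ""


print(repr(drop_two_lines_split("a\nb\nc\nd")))  # the rest keeps its own newlines
print(repr(drop_two_lines_split("a\nb")))        # short input: '' instead of IndexError
```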

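【Comments】:

Thanks, your answer was the clearest and easiest to understand.

【Answer 4】:

You can find the index of '\n' and skip the first occurrence; then start the new string from the end of the second '\n' substring in the main string.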

import re


def find_sub_string_index(string, sub_string, offset=0, ignore=0):
    """Return the index just past the (ignore + 1)-th match, or -1 if not found."""
    pattern = re.compile(re.escape(sub_string))  # match sub_string literally
    start = offset  # offset is where the search begins
    for _ in range(ignore + 1):  # find the first match at least
        match = pattern.search(string, start)
        if match is None:
            return -1  # Not found
        start = match.end()  # continue right after this match
    return start


string = """The first line.
The second line.
The third line.
The forth line.
The fifth line."""
sub_string = "\n"

ignore = 1  # Ignore times

start = find_sub_string_index(string, sub_string, ignore=ignore)

print("Finding sub-string '{0}' from main text.".format(sub_string))
print("Ignore {0} times.".format(ignore))
print("Start index:", start)
print("Result:")
print(string[start:])

The result is:

$ python3 test.py
Finding sub-string '
' from main text.
Ignore 1 times.
Start index: 33
Result:
The third line.
The forth line.
The fifth line.
$
$
$
$ python3 test.py
Finding sub-string 'The' from main text.
Ignore 1 times.
Start index: 19
Result:
 second line.
The third line.
The forth line.
The fifth line.
$

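【Comments】:

【Answer 5】:

I'd rather not split the string in case it is large, and I want to keep the original newline type afterwards.

Remove the first n lines: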

def find_nth(haystack, needle, n):
    # Return the index of the n-th occurrence of needle, or -1 if there is none
    start = haystack.find(needle)
    while start >= 0 and n > 1:
        start = haystack.find(needle, start+len(needle))
        n -= 1
    return start

s = 'a\nb\nc\nd\n'
assert s[find_nth(s, '\n', 2) + 1:] == 'c\nd\n'

See also: Find the nth occurrence of substring in a string

Or to remove just one line:

s = 'a\nb\nc\nd\n'
assert s[s.find('\n') + 1:] == 'b\nc\nd\n'

Tested on Python 3.6.6.
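
The same idea can be folded into a single function that also handles the case where the string has fewer line breaks than requested. This is a sketch under assumptions: the name `remove_first_n_lines`, the `newline` parameter, and the decision to return an empty string when there are not enough line breaks are all illustrative choices.

```python
def remove_first_n_lines(s, n, newline='\n'):
    """Drop the first n lines of s without splitting it into a list."""
    pos = 0
    for _ in range(n):
        idx = s.find(newline, pos)
        if idx < 0:
            return ''  # assumption: fewer than n line breaks means nothing is left
        pos = idx + len(newline)  # continue right after this line break
    return s[pos:]


s = 'a\nb\nc\nd\n'
print(repr(remove_first_n_lines(s, 2)))
print(repr(remove_first_n_lines('a\r\nb\r\nc', 2, newline='\r\n')))
```

Only one slice of the original string is created at the end, which is the point of avoiding split on large inputs.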

【Comments】:

【Answer 6】:

You can use a rule instead, such as keeping only the lines that start with the '[' character:

lines = [line for line in lines if line.startswith('[')]
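
A complete version of this filter might look like the sketch below. It assumes the string is first broken into lines with splitlines and that every data line starts with '['; the variable names are illustrative. Unlike the other answers, it drops any non-matching line, not only the first two.

```python
text = """version 1.00
6992
[-4.32063, -9.1198, -106.59][0.00064, 0.99993, -0.01210][etc...]"""

lines = text.splitlines()
# Keep only the lines that look like mesh data
data_lines = [line for line in lines if line.startswith('[')]
result = '\n'.join(data_lines)
print(result)
```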

【Comments】:

【Answer 7】:

x="""version 1.00
6992
[-4.32063, -9.1198, -106.59][0.00064, 0.99993, -0.01210][etc...]
abc
asdda"""
print("\n".join(x.split("\n")[2:]))

You can simply do this.

【Comments】:

【Answer 8】:

Remove the lines with split:

lines = """version 1.00
6992
[-4.32063, -9.1198, -106.59][0.00064, 0.99993, -0.01210][etc...]"""

lines = lines.split('\n',2)[-1]
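
Indexing with [-1] never raises, but on short input it returns the last piece rather than an empty string. The sketch below shows that behavior; the function name `drop_two_lines_last_piece` and the sample inputs are made up for illustration.

```python
def drop_two_lines_last_piece(lines):
    """Split on at most two newlines and return the last piece."""
    return lines.split('\n', 2)[-1]


print(repr(drop_two_lines_last_piece('a\nb\nc\nd')))  # three or more lines: the remainder
print(repr(drop_two_lines_last_piece('a\nb')))        # two lines: the second line is returned
print(repr(drop_two_lines_last_piece('a')))           # one line: returned unchanged
```

Whether that fallback is acceptable depends on whether the input is guaranteed to have at least three lines.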

【Comments】:
