What to do when the index is out of range when checking the last index number?

Posted 2023-02-23

【Title】: what to do when index is out of range when checking last index number? 【Posted】: 2020-11-05 03:27:00 【Question】:

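Hello, I'm new to Python and I'm writing a module that should take a string as input and output a list of every word, number, or symbol, without whitespace, i.e. ('10 sweet apples') -> ['10', 'sweet', 'apples']. To do this I have a start value that marks the current index and an end value that is incremented as long as the next character in the string is a letter or a digit. So far I have managed to add the words, numbers, symbols, and so on to a list that is returned at the end of the for loop.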

The problem appears when I reach the last index. I have this code:

def tokenize (lines):
    tokenizedList = []
    for line in lines:
        endValue = 0
        startValue = 0
        while startValue < len(line):   

            if line[endValue].isalpha():
                while line[endValue].isalpha():
                    endValue = endValue + 1
                word = line[startValue : endValue]
                tokenizedList.append(word)
                startValue = endValue
                
            elif line[endValue].isdigit():
                while line[endValue].isdigit():
                    endValue = endValue + 1
                word = line[startValue : endValue]
                tokenizedList.append(word)
                startValue = endValue
            
            elif line[endValue].isspace():
                while line[endValue].isspace():
                    startValue += 1
                    endValue = startValue
            
            else:
                endValue += 1
                word = line[startValue : endValue]
                tokenizedList.append(word)
                startValue = endValue
    
        return tokenizedList

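Because the while loop inside the if statement keeps incrementing endValue, it eventually goes past the last valid index. I don't know how to prevent this error or how to change the while loop so that it knows when to stop after checking the last letter. Any ideas?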

【Question comments】:

【Answer 1】:

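You can simply use the built-in split method. Called without an argument, it splits on any run of whitespace and ignores leading and trailing spaces:

tokenizedList = ' my 3 words'.split()

This returns ['my', '3', 'words']. Note that split only separates on whitespace, so a symbol attached to a word stays part of that word.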
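As a small illustration of how the separator argument changes the result, here is a sketch with made-up sample strings. It shows that `split()` without an argument drops the empty string produced by a leading space, that `split(' ')` keeps it, and that neither form separates symbols from neighbouring words.

```python
text = ' my 3 words'

# No argument: split on runs of whitespace, no empty strings in the result.
no_sep = text.split()

# Explicit single-space separator: the leading space yields an empty string.
space_sep = text.split(' ')

# Punctuation is not whitespace, so it stays attached to the word.
punct = 'hello, world!'.split()

print(no_sep)     # ['my', '3', 'words']
print(space_sep)  # ['', 'my', '3', 'words']
print(punct)      # ['hello,', 'world!']
```

If symbols must become separate tokens, as in the question, split alone is not enough.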

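However, if you want to stick with your own code, you can add a bounds check to the loop condition. Put it before the index access, so that line[endValue] is never evaluated once endValue has reached len(line):

if line[endValue].isalpha():
    while endValue < len(line) and line[endValue].isalpha():
        endValue += 1
    word = line[startValue : endValue]

Don't forget to change the digit branch (and the whitespace branch) in the same way.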
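Putting the bounds check into every branch, the whole function might look like the sketch below. This is one possible arrangement, not the asker's final code: the shortened names `tokens`, `start`, and `end` are chosen here for readability, and it assumes the `return` is meant to run after all lines have been processed rather than inside the for loop.

```python
def tokenize(lines):
    """Split each line into words, numbers, and single symbols, skipping whitespace."""
    tokens = []
    for line in lines:
        start = 0
        while start < len(line):
            end = start
            if line[end].isalpha():
                # Bound check comes first, so line[end] is never read past the end.
                while end < len(line) and line[end].isalpha():
                    end += 1
            elif line[end].isdigit():
                while end < len(line) and line[end].isdigit():
                    end += 1
            elif line[end].isspace():
                while end < len(line) and line[end].isspace():
                    end += 1
                start = end  # whitespace is skipped, not stored
                continue
            else:
                end += 1  # any other character is a one-character token
            tokens.append(line[start:end])
            start = end
    # Assumption: return once all lines are done, not after the first line.
    return tokens


print(tokenize(['10 sweet apples', 'a+b=42!']))
# ['10', 'sweet', 'apples', 'a', '+', 'b', '=', '42', '!']
```

Because `and` short-circuits, the index is only read when `end < len(line)` holds, so a word at the very end of a line is kept whole.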

【讨论】：

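Thank you! But does either of these solutions include the last word in my list, other than the one that uses the split method?

Ah, I see you append to the list right after incrementing endValue. Sorry, I missed that line; I'll edit my answer in a minute.

Or maybe I misunderstood: I thought "break" would exit the whole method. Thanks for your help!! It works now :)

break only "breaks" out of the inner loop. Another note: when endValue is higher than len(line), you could also set it to -1. line[-1] returns the last element of line (-2 the second to last, and so on).

I just edited the answer again to make it clearer which lines have to change and to simplify the logic.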
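The two points raised in the comments, negative indexing and the scope of break, can be checked with a short sketch. The sample values are made up for illustration. One caveat worth seeing: -1 works as an index for the last element, but used as the end of a slice it excludes that element.

```python
line = 'apples'

last = line[-1]          # 's': the last character
second_last = line[-2]   # 'e': the second to last character
without_last = line[0:-1]  # 'apple': a slice ending at -1 drops the last character

found = []
for word in ['ab', 'cd']:
    for ch in word:
        if ch == 'b':
            break  # leaves only the inner loop
        found.append(ch)
    found.append('|')  # still runs, because the outer loop continues

print(last, second_last, without_last)
print(found)  # ['a', '|', 'c', 'd', '|']
```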
