IndexError: list index out of range. Cannot understand where the issue is?

Posted 2023-02-16

Asked: 2020-07-28 05:59:28

Question:

This is the code that raises the IndexError.

# importing the required libraries
import pandas as pd

# Visualisation libraries
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import folium 
from folium import plugins

# Manipulating the default plot size
plt.rcParams['figure.figsize'] = 10, 12

# Disable warnings 
import warnings
warnings.filterwarnings('ignore')
# for date and time operations
from datetime import datetime
# for file and folder operations
import os
# for regular expression operations
import re
# for listing files in a folder
import glob
# for getting web contents
import requests 
# for scraping web contents
from bs4 import BeautifulSoup
# get data

# link at which the web data resides
link = 'https://www.mohfw.gov.in/'
# get web data
req = requests.get(link)
# parse web data
soup = BeautifulSoup(req.content, "html.parser")
# find the table
# ==============
# our target table is the last table in the page

# get the table head
# table head may contain the column names, titles, subtitles
thead = soup.find_all('thead')[-1]
# print(thead)

# get all the rows in table head
# it usually has only one row, which holds the column names
head = thead.find_all('tr')
# print(head)

# get the table tbody
# it contains the contents
tbody = soup.find_all('tbody')[-1]
# print(tbody)

# get all the rows in table body
# each row is each state's entry
body = tbody.find_all('tr')
# print(body)

IndexError

Traceback (most recent call last)
<ipython-input-7-eda41c6e195c> in <module>
     15 # get the table tbody
     16 # it contains the contents
---> 17 tbody = soup.find_all('tbody')[-1]
     18 # print(tbody)
     19 

IndexError: list index out of range

Comments:

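[-1] gets the element at index -1, which makes no sense for a list, hence the error. Perhaps this should be [::-1], which is the slice notation for reversing a list.

Do you want to extract the table information from the site?

@HymnsForDisco [-1] indexing does work in Python, it gets the last element of the list :) It is actually very useful. Where the OP may have gone wrong is the case where the list has no elements. In that case there is no "last element", so Python raises an error.

@GrantSchulte Good point. It seems some time spent in stricter languages made me forget some of Python's tricks.

After I changed it to [:-1], I got another error on this line: body = tbody.find_all('tr') AttributeError: 'list' object has no attribute 'find_all'

Answer 1: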

This error occurs because the list is empty. When you are not sure whether a list is empty, check before indexing it. For a list l:

if len(l) != 0:
    k = l[-1]
else:
    k = None
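
The comments above mix up three similar-looking notations, and the OP's follow-up AttributeError comes from one of them: [:-1] is a slice, so it returns a list rather than a single element. The sketch below is only an illustration with made-up lists; `last_or_none` is a hypothetical helper name, not something from the original code.

```python
def last_or_none(items):
    """Return the last element of a sequence, or None when it is empty."""
    return items[-1] if items else None


rows = ["a", "b", "c"]   # stand-in for the result of soup.find_all(...)
empty = []               # what find_all returns when nothing matches

print(rows[-1])      # indexing: the last element
print(rows[::-1])    # slicing: a reversed copy of the list
print(rows[:-1])     # slicing: everything except the last element, still a list
print(empty[:-1])    # slicing an empty list never raises, it returns []
print(last_or_none(empty))  # None instead of an IndexError

try:
    empty[-1]        # indexing an empty list is what raises
except IndexError as exc:
    print("IndexError:", exc)
```

Because `empty[:-1]` is itself a list, calling `.find_all` on it fails with "'list' object has no attribute 'find_all'", which matches the error reported in the last comment.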

Answer 2:

When you scrape the page, the table in the returned HTML contains no tbody tag.
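
One way to confirm this before indexing is to test whether the tag exists at all. The following is a minimal sketch, not the site's real markup: the HTML string is a hypothetical stand-in for a page whose rows are filled in later by JavaScript, and `last_tag` is a helper name introduced here for illustration.

```python
from bs4 import BeautifulSoup


def last_tag(soup, name):
    """Return the last <name> tag in the parsed document, or None if absent."""
    matches = soup.find_all(name)
    return matches[-1] if matches else None


# Hypothetical static HTML: the table shell is present, but there is no
# <tbody>, as would be the case when rows are loaded by a separate request.
html = "<table><thead><tr><th>State</th><th>Cases</th></tr></thead></table>"
soup = BeautifulSoup(html, "html.parser")

thead = last_tag(soup, "thead")
tbody = last_tag(soup, "tbody")

if tbody is None:
    print("no <tbody> in the static HTML; the rows are probably loaded separately")
else:
    body = tbody.find_all("tr")
    print(len(body), "rows found")
```

With this guard the script reports the missing tag instead of stopping with an IndexError.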

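If you inspect the site closely, you will find that it makes an AJAX call to fetch the table data. The following script saves that JSON data to a file. The nice part is that you do not need to pass any parameters to get this data, and it always returns the latest data.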

import requests, json

url = 'https://www.mohfw.gov.in/data/datanew.json'
res = requests.get(url)

with open("data.json", "w") as f:
    json.dump(res.json(), f)
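
The answer does not describe the shape of the saved JSON. Assuming it is a list of objects (an assumption, not confirmed here), a quick way to inspect a saved file is to count the records and list the field names. The sample records and the names `load_records` and `describe` below are hypothetical and only stand in for the real data.

```python
import json
import os
import tempfile


def load_records(path):
    """Read a JSON file and return the parsed content."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def describe(records):
    """Return (number of records, sorted field names) for a list of dicts."""
    keys = set()
    for record in records:
        if isinstance(record, dict):
            keys.update(record)
    return len(records), sorted(keys)


# Hypothetical sample; the real field names are not given in the answer.
sample = [{"name": "A", "count": "1"}, {"name": "B", "count": "2"}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    loaded = load_records(path)

count, fields = describe(loaded)
print(count, fields)
```

Pointing `load_records` at the `data.json` written by the script above would show which fields are actually available before building a table from them.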
