(Python) Beautiful Soup and encoding (utf-8, cp1252, ascii ...)

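Please help, I'm stuck. I have been running into this problem ever since I started learning Python. It is always the same issue, and nobody online has been able to give a working answer.

My code: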

from bs4 import BeautifulSoup
import requests

page = requests.get(
    'https://forecast.weather.gov/MapClick.php?lat=34.05349000000007&lon=-118.24531999999999#.XswiwMCxWUk')
soup = BeautifulSoup(page.content, 'html.parser')
week = soup.find(id='seven-day-forecast-body')
items = week.find_all(class_='forecast-tombstone')

print(items[0].find(class_='period-name').get_text())
print(items[0].find(class_='short-desc').get_text())
print(items[0].find(class_='temp temp-high').get_text())

period_names = [item.find(class_='period-name').get_text() for item in items]
short_descp = [item.find(class_='short-desc').get_text() for item in items]
temp = [item.find(class_='temp temp-high').get_text() for item in items]
print(period_names)
print(short_descp)
print(temp)

Output:

[Running] python -u "c:\Users\dukasu\Documents\Python\test.py"
ThisAfternoon
Partly Sunny
High: 76 �F
Traceback (most recent call last):
  File "c:\Users\dukasu\Documents\Python\test.py", line 20, in <module>
    temp = [item.find(class_='temp temp-high').get_text() for item in items]
  File "c:\Users\dukasu\Documents\Python\test.py", line 20, in <listcomp>
    temp = [item.find(class_='temp temp-high').get_text() for item in items]
AttributeError: 'NoneType' object has no attribute 'get_text'

[Done] exited with code=1 in 0.69 seconds

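I believe the problem is caused by utf-8 encoding (my PC uses cp1252), but how do I solve it once and for all? I think the cause is that it cannot handle the degree symbol. There was a simple fix for this in Python 2, but how do I solve it in Python 3.x? How can I set the encoding at the start of my code so that this problem goes away? Please excuse my English; it is not my native language.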

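Answer

The error comes from the class name. Use class_='temp' rather than class_='temp temp-high'. The night periods report a low, so they contain no element whose class is 'temp temp-high'; find() returns None for them, and calling get_text() on None raises the AttributeError. The AttributeError itself is unrelated to encoding.

Example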

temp = [item.find(class_='temp').get_text() for item in items]
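To see why the shorter class name works, here is a minimal sketch on a hand-written HTML fragment. The fragment is hypothetical and only modeled on the forecast page; in particular, the class name 'temp temp-low' for the night entry is an assumption, not something confirmed in the question. It shows that searching for the exact string 'temp temp-high' yields None for the night entry, while searching for 'temp' matches both.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the forecast page (not fetched from the site).
# The 'temp temp-low' class for the night entry is an assumption.
html = """
<div id="seven-day-forecast-body">
  <div class="forecast-tombstone">
    <p class="period-name">Today</p>
    <p class="temp temp-high">High: 76 °F</p>
  </div>
  <div class="forecast-tombstone">
    <p class="period-name">Tonight</p>
    <p class="temp temp-low">Low: 58 °F</p>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
items = soup.find_all(class_="forecast-tombstone")

# A multi-word class_ string is compared against the full class attribute,
# so the night entry has no match and find() returns None for it.
exact = [item.find(class_="temp temp-high") for item in items]

# A single class name matches any element that carries that class.
temps = [item.find(class_="temp").get_text() for item in items]

# If the exact search is kept, guard against None before calling get_text().
safe = [tag.get_text() if tag is not None else None for tag in exact]

print(temps)
print(safe)
```

The None check in the last list comprehension is an alternative when only the highs are wanted: the night entries then show up as None instead of crashing the script.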

Full code

from bs4 import BeautifulSoup
import requests

page = requests.get(
    'https://forecast.weather.gov/MapClick.php?lat=34.05349000000007&lon=-118.24531999999999#.XswiwMCxWUk')
soup = BeautifulSoup(page.content, 'html.parser')
week = soup.find(id='seven-day-forecast-body')
items = week.find_all(class_='forecast-tombstone')

print(items[0].find(class_='period-name').get_text())
print(items[0].find(class_='short-desc').get_text())
print(items[0].find(class_='temp').get_text())

period_names = [item.find(class_='period-name').get_text() for item in items]
short_descp = [item.find(class_='short-desc').get_text() for item in items]
temp = [item.find(class_='temp').get_text() for item in items]
print(period_names)
print(short_descp)
print(temp)

Printed output

ThisAfternoon
Partly Sunny
High: 76 °F
['ThisAfternoon', 'Tonight', 'Saturday', 'SaturdayNight', 'Sunday', 'SundayNight', 'Monday', 'MondayNight', 'Tuesday']
['Partly Sunny', 'Patchy Fog', 'Patchy Fogthen MostlySunny', 'Patchy Fog', 'Patchy Fogthen PartlySunny', 'Patchy Fog', 'Patchy Fogthen MostlyCloudy', 'Mostly Cloudy', 'Partly Sunny']
['High: 76 °F', 'Low: 58 °F', 'High: 75 °F', 'Low: 59 °F', 'High: 80 °F', 'Low: 61 °F', 'High: 78 °F', 'Low: 61 °F', 'High: 77 °F']
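On the encoding part of the question: the � shown in the question's output is consistent with cp1252 bytes being read as UTF-8 by whatever displays the output, though that diagnosis is an assumption and is not confirmed by the thread. The sketch below reproduces the effect on a plain string and shows one possible way to make Python 3.7+ write UTF-8 to stdout. The helper name force_utf8_stdout is made up for illustration.

```python
import sys

text = "High: 76 \u00b0F"  # U+00B0 is the degree sign

# cp1252 encodes the degree sign as the single byte 0xB0 ...
cp1252_bytes = text.encode("cp1252")

# ... which is not valid UTF-8 on its own, so a UTF-8 reader that replaces
# bad bytes shows U+FFFD (the replacement character) in its place.
garbled = cp1252_bytes.decode("utf-8", errors="replace")


def force_utf8_stdout():
    """Hypothetical helper: switch stdout to UTF-8 if the stream allows it."""
    # TextIOWrapper.reconfigure() exists since Python 3.7; other stream
    # objects (for example a captured stdout) may not provide it.
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if callable(reconfigure):
        reconfigure(encoding="utf-8")


force_utf8_stdout()
print(text)
```

Setting the PYTHONIOENCODING environment variable to utf-8 before starting Python is another option that needs no change to the script. Whether either one fixes the display depends on how the terminal or editor panel decodes the bytes it receives.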
