UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Question

I am trying to build a web crawler in Python by following a Udacity course. I have this method, get_page(), which returns the content of a page.

from urllib.request import urlopen

def get_page(url):
    '''
    Open the given url and return the content of the page.
    '''

    data = urlopen(url)
    html = data.read()
    return html.decode('utf8')

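The original method just returned data.read(), but then I could not perform string operations such as str.find() on the result. After a quick search I found that I need to decode the data. But now I get this error: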

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

I have found similar questions on SO, but none of them deals specifically with this case. Please help.

Answer 1
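
A likely cause, judging only from the error message: byte 0x8b at position 1 is the second byte of the gzip magic number (0x1f 0x8b), so the server is probably sending a gzip-compressed body and the raw bytes are not UTF-8 text yet. The sketch below assumes that is what is happening. decode_body and GZIP_MAGIC are names made up for this illustration and are not part of the course code, and falling back to UTF-8 when no charset is declared is also an assumption.

```python
import gzip
from urllib.request import urlopen

# Assumption: the body is gzip-compressed. Every gzip stream starts with
# these two bytes, which is why decoding fails at position 1 (0x8b).
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(raw, charset="utf-8"):
    """Hypothetical helper: gunzip raw bytes if needed, then decode to str."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode(charset)


def get_page(url):
    """Open the given url and return the content of the page as a str."""
    with urlopen(url) as response:
        raw = response.read()
        # Use the charset from the Content-Type header if the server sent
        # one; otherwise fall back to UTF-8 (an assumption).
        charset = response.headers.get_content_charset() or "utf-8"
    return decode_body(raw, charset)
```

With this, get_page(url) returns a str, so str.find() works on the result. If the error persists, the page may be in an encoding other than the one being used, which this sketch does not try to detect.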

Another answer