YouTube Subscriptions List Scraping

Posted 2023-02-23


【Title】: YouTube Subscriptions List Scraping 【Posted】: 2022-01-23 00:32:23 【Question】:

I want to scrape my YouTube subscriptions list into a CSV file. I wrote this code (I have not finished it yet):

import requests
from bs4 import BeautifulSoup
import csv

url = 'https://www.youtube.com/feed/channels'
source = requests.get(url)
soup = BeautifulSoup(source, 'lxml')

I got this error:

File "/Users/hendy/YouTube subscriptions scraping.py", line 7, in <module>
    soup = BeautifulSoup(source, 'lxml')
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/bs4/__init__.py", line 312, in __init__
    elif len(markup)

I don't know what the problem is.


【Answer 1】:

What happens?

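You are passing the entire response object to BeautifulSoup, which does not work: BeautifulSoup expects the markup itself as a string or bytes, not a requests Response.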

How to fix it?

To create the BeautifulSoup object, pass the content or text of your response:

BeautifulSoup(source.content, 'lxml')
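The difference between content and text can be checked without touching the network. The following is a minimal sketch, not part of the original answer. It builds a requests Response by hand (setting the private `_content` attribute, a shortcut often used in tests and an assumption here), uses the built-in `html.parser` so that lxml is not required, and the helper name `make_soup` is hypothetical.

```python
import requests
from bs4 import BeautifulSoup


def make_soup(response, parser="html.parser"):
    """Parse the body of a requests.Response, not the Response object itself."""
    # response.content is bytes (BeautifulSoup works out the encoding);
    # response.text is str, already decoded by requests. Either is valid markup.
    return BeautifulSoup(response.content, parser)


# Hand-built response standing in for requests.get(url); no network needed.
# Setting _content directly is a testing shortcut, not a public API.
fake = requests.Response()
fake.status_code = 200
fake.encoding = "utf-8"
fake._content = b"<html><head><title>Subscriptions</title></head></html>"

soup = make_soup(fake)
title_from_content = soup.title.string
title_from_text = BeautifulSoup(fake.text, "html.parser").title.string
```

Both calls should yield the same parsed title. Passing bytes lets BeautifulSoup detect the encoding, while passing text relies on the decoding requests has already done.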

Example

from bs4 import BeautifulSoup
import requests
# Send a browser-like User-Agent; headers must be a dict
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'}
url = 'https://www.youtube.com/feed/channels'
source = requests.get(url, headers=headers)
# Pass the response body (bytes), not the Response object
soup = BeautifulSoup(source.content, 'lxml')
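The question's goal was a CSV file, which the answer does not cover. As a rough sketch of that last step only, the standard `csv` module could be used as below. The function name `write_channels_csv`, the column names, and the rows are all hypothetical placeholders, since the answer does not show how channel names would be extracted from the parsed page.

```python
import csv
import io


def write_channels_csv(rows, fileobj):
    """Write (name, url) pairs to an open text file object as CSV."""
    writer = csv.writer(fileobj)
    writer.writerow(["channel_name", "channel_url"])  # header row
    writer.writerows(rows)


# Placeholder data; real rows would come from the parsed page.
rows = [
    ("Example Channel A", "https://www.youtube.com/channel/EXAMPLE_A"),
    ("Example Channel B", "https://www.youtube.com/channel/EXAMPLE_B"),
]

# An in-memory buffer keeps the sketch free of file-system side effects.
buffer = io.StringIO()
write_channels_csv(rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, open the target with `open(path, "w", newline="", encoding="utf-8")` and pass that file object in place of the buffer.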
