How to extract upload dates, titles, URLs and durations from YouTube videos in a playlist with youtube-dl?
Posted: 2021-07-01 18:13:38
Question: I am trying to extract the upload dates, titles, URLs and durations of all YouTube videos in a specific playlist with youtube-dl. I don't need the videos themselves, only the data above.
So far I have tested the following two approaches suggested by Alen Paul Varghese:
Youtube-dl's GitHub Doc Used as reference
The Playlist URL used for testing
Approach #1
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD > example.json
and
Approach #2
youtube-dl --get-upload_date https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD > example.txt
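Approach #1 outputs a full JSON dump, about 3,000 lines per video, which is very inconvenient to handle for playlists with many videos, although it does contain the four fields I need.
Approach #2 returns the following error: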
youtube-dl: error: no such option: --get-upload_date
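I would prefer Approach #2 so that the output is limited to the needed data only (upload dates, titles, URLs and durations), following Alen Paul Varghese's second suggestion, and after checking that upload_date is listed in youtube-dl's GitHub documentation (used as reference).
Why is the --get-upload_date option not recognized?
What is the alternative for limiting the data?
Thanks a lot for any helpful suggestions.
Here is the JSON dump file: example.json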
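EDIT (thanks to @PIERPY's great guidance; the full process is documented here in case it helps others):
I successfully installed Chocolatey NuGet from an Admin CMD, as required by the "Download jq - Windows" instructions, in order to install jq 1.5 with chocolatey install jq.
My Chocolatey NuGet installation output: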
Microsoft Windows [Version 10.0.19042.867]
(c) 2020 Microsoft Corporation. All rights reserved.
C:\WINDOWS\system32>@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
Forcing web requests to allow TLS v1.2 (Required for requests to Chocolatey.org)
Getting latest version of the Chocolatey package for download.
Not using proxy.
Getting Chocolatey from https://community.chocolatey.org/api/v2/package/chocolatey/0.10.15.
Downloading https://community.chocolatey.org/api/v2/package/chocolatey/0.10.15 to C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall\chocolatey.zip
Not using proxy.
Extracting C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall\chocolatey.zip to C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall
Installing Chocolatey on the local machine
Creating ChocolateyInstall as an environment variable (targeting 'Machine')
Setting ChocolateyInstall to 'C:\ProgramData\chocolatey'
WARNING: It's very likely you will need to close and reopen your shell
before you can use choco.
Restricting write permissions to Administrators
We are setting up the Chocolatey package repository.
The packages themselves go to 'C:\ProgramData\chocolatey\lib'
(i.e. C:\ProgramData\chocolatey\lib\yourPackageName).
A shim file for the command line goes to 'C:\ProgramData\chocolatey\bin'
and points to an executable in 'C:\ProgramData\chocolatey\lib\yourPackageName'.
Creating Chocolatey folders if they do not already exist.
WARNING: You can safely ignore errors related to missing log files when
upgrading from a version of Chocolatey less than 0.9.9.
'Batch file could not be found' is also safe to ignore.
'The system cannot find the file specified' - also safe.
chocolatey.nupkg file not installed in lib.
Attempting to locate it from bootstrapper.
PATH environment variable does not have C:\ProgramData\chocolatey\bin in it. Adding...
WARNING: Not setting tab completion: Profile file does not exist at 'C:\Users\###\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1'.
Chocolatey (choco.exe) is now ready.
You can call choco from anywhere, command line or powershell by typing choco.
Run choco /? for a list of functions.
You may need to shut down and restart powershell and/or consoles
first prior to using choco.
Ensuring Chocolatey commands are on the path
Ensuring chocolatey.nupkg is in the lib folder
C:\WINDOWS\system32>
Then I ran chocolatey install jq, which installed successfully.
My jq installation output:
C:\WINDOWS\system32>chocolatey install jq
Chocolatey v0.10.15
Installing the following packages:
jq
By installing you accept licenses for the packages.
Progress: Downloading jq 1.6... 100%
jq v1.6 [Approved]
jq package files install completed. Performing other installation steps.
The package jq wants to run 'chocolateyinstall.ps1'.
Note: If you don't run this script, the installation will fail.
Note: To confirm automatically next time, use '-y' or consider:
choco feature enable -n allowGlobalConfirmation
Do you want to run the script?([Y]es/[A]ll - yes to all/[N]o/[P]rint): Y
Downloading jq 64 bit
from 'https://github.com/stedolan/jq/releases/download/jq-1.6/jq-win64.exe'
Progress: 100% - Completed download of C:\ProgramData\chocolatey\lib\jq\tools\jq.exe (3.36 MB).
Download of jq.exe (3.36 MB) completed.
Hashes match.
C:\ProgramData\chocolatey\lib\jq\tools\jq.exe
ShimGen has successfully created a shim for jq.exe
The install of jq was successful.
Software install location not explicitly set, could be in package or
default install location if installer.
Chocolatey installed 1/1 packages.
See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).
Then I ran your youtube-dl command, @pierpy:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
It produced a syntax error, with the following output:
Microsoft Windows [Version 10.0.19042.867]
(c) 2020 Microsoft Corporation. All rights reserved.
C:\Users\###>cd documents
C:\Users\###\Documents>cd youtube-dl
C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?) at <top-level>, line 1:
'date:
jq: 1 compile error
Traceback (most recent call last):
File "__main__.py", line 19, in <module>
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\__init__.py", line 475, in main
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\__init__.py", line 465, in _real_main
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 2060, in download
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 799, in extract_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 838, in __extract_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 924, in process_ie_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1058, in __process_playlist
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1068, in __process_iterable_entry
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 910, in process_ie_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 872, in process_ie_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1683, in process_video_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1793, in process_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1765, in __forced_printings
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 520, in to_stdout
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 509, in _write_string
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\utils.py", line 3180, in write_string
OSError: [Errno 22] Invalid argument
C:\Users\###\Documents\youtube-dl>
Then I googled the error
jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?)
and gained some insight from this suggestion:
It's all about the quoting
Then I changed the single quotes in your youtube-dl command, @pierpy, to double quotes accordingly:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}"
Now it outputs the data as desired: upload dates, titles, URLs and durations.
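For readers who want to see why the quoting matters, here is a small Python sketch. It relies on subprocess.list2cmdline, a helper in Python's standard subprocess module that applies the Microsoft C runtime quoting rules; it only builds a command string and runs nothing. The filter text is the one from this thread with webpage_url as the URL key. A likely reason the double-quoted command above works is that the inner quotes are consumed during argument parsing and jq accepts bare identifiers as object keys, but treat that as my reading rather than something stated in the thread.

```python
import subprocess

# The jq filter from this thread, written once as an ordinary Python string.
JQ_FILTER = (
    '{"date": .upload_date, "title": .title, '
    '"URL": .webpage_url, "duration": .duration}'
)

# list2cmdline only formats a command line for Windows; nothing is executed.
# Inner double quotes are escaped with a backslash so that they should
# survive argument parsing and reach jq intact.
command_line = subprocess.list2cmdline(["jq", JQ_FILTER])
print(command_line)
```

When jq is started from Python, passing the arguments as a list to subprocess.run avoids hand-written shell quoting altogether.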
Final output:
C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}"
"date": "20150717",
"title": "3.1: Flow (setup and draw) - Processing Tutorial",
"URL": "https://r1---sn-n0ogpnx-b85s.googlevideo.com/videoplayback?expire=1617730292&ei=lEZsYKDoEZmAp-oP3ayk8AI&ip=188.154.162.181&id=o-AHFxnOR5c5xqmgtu1JG4FbL6lJW0gz1pJQN77cr2-27T&itag=22&source=youtube&requiressl=yes&mh=m6&mm=31%2C29&mn=sn-n0ogpnx-b85s%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=1&pl=23&initcwndbps=1578750&vprv=1&mime=video%2Fmp4&ns=r3pR-nwt6FkDQa33iQQu-qgF&ratebypass=yes&dur=944.007&lmt=1607684088067796&mt=1617708538&fvip=5&fexp=24001373%2C24007246&beids=9466585&c=WEB&txp=5432434&n=3P6HQoLfY8ktFLG5&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAMiNOv8QDjfsn7yxicEOtSjcEYjZlX3CfrI8D-HGBd63AiEA4E6rKv_kYti6rAeieJzPAdTYjoh05Az_11Kcxt-0jBg%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgD43F71OxMExfQyN9FeNWfZX_aiGAD3SKlKOLNR14NT8CICEuD_Ry0oymKZmFfHuP4F6v9MKCrmRI0x27sLG8fvyG",
"duration": 944
"date": "20150717",
"title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
"URL": "https://r4---sn-n0ogpnx-b85l.googlevideo.com/videoplayback?expire=1617730293&ei=lEZsYMO2OczSWaPiueAC&ip=188.154.162.181&id=o-ANuT73vsKQLvQqynOeh00stVP-zqbq3x-iUrdDiYwg8E&itag=22&source=youtube&requiressl=yes&mh=kE&mm=31%2C29&mn=sn-n0ogpnx-b85l%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=4&pl=23&initcwndbps=1617500&vprv=1&mime=video%2Fmp4&ns=tPtC_l82gq-yi-rk_oQXatAF&cnr=14&ratebypass=yes&dur=814.207&lmt=1551720899437893&mt=1617708538&fvip=5&fexp=24001373%2C24007246&beids=9466585&c=WEB&txp=5432432&n=LhJHXWU8TGNOrD9u&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRAIgSHTlBPN0j49hoB02SYDeF3-9fe1iSz1KRiv9iFy8nj0CIHEafdAOBefsos8kO5FGhDljsKpOV7ZQ9dY1BEzQQ0n0&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRgIhAJkd-9posqapJekca_35YNG0g3nLgxTfW06EqRM-a3wDAiEApSrsS5wPlMPXjlI_bvOh53cjxlrHfNSKD4XbhyDyZ6w%3D",
"duration": 815
"date": "20150717",
"title": "3.3: Events (mousePressed, keyPressed) - Processing Tutorial",
"URL": "https://r4---sn-n0ogpnx-b85l.googlevideo.com/videoplayback?expire=1617730293&ei=lUZsYK6WJ4TeWaeflbgF&ip=188.154.162.181&id=o-AD1WgS46WiFogy00v3aHRp6aZXkd_ACN-_m76lPoQvA8&itag=22&source=youtube&requiressl=yes&mh=it&mm=31%2C29&mn=sn-n0ogpnx-b85l%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=4&pl=23&initcwndbps=1617500&vprv=1&mime=video%2Fmp4&ns=AlyS4uv2BH5ENfp_nP53I-sF&cnr=14&ratebypass=yes&dur=441.225&lmt=1472343659978757&mt=1617708538&fvip=4&fexp=24001373%2C24007246&beids=9466585&c=WEB&n=np6rmmeSKhYEvG1K&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAIRmvxmY-VidN3LPhnzCNQ2TLsUB_7i1yU0QOMBVUS6AAiEAm9DE-Kk6cCNb8FC0we4c2O8299n2_2jGnQfzYzz0igo%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIgZzrGEwMcb0Vrj9FleanW2apPMu_55OdH2SRdw66DQ1QCIQCDsAz7X5RxczKtWzokBhyUNcyXLXeZF-ENufpjA0BP2Q%3D%3D",
"duration": 442
C:\Users\###\Documents\youtube-dl>
One last issue:
The URLs obtained are not the standard video URLs.
Why not?
Youtube-dl's GitHub documentation (used as reference) states:
url (string): Video URL
How can I retrieve the standard YouTube video URL?
Answer to the last issue:
I just looked at the example.json file I generated yesterday and found that the standard YouTube video URL is given by webpage_url rather than url.
YOUTUBE-DL FINAL OUTPUT:
C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .webpage_url,"duration": .duration}"
"date": "20150717",
"title": "3.1: Flow (setup and draw) - Processing Tutorial",
"URL": "https://www.youtube.com/watch?v=o8dffrZ86gs",
"duration": 944
"date": "20150717",
"title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
"URL": "https://www.youtube.com/watch?v=ibW4oA7-n8I",
"duration": 815
"date": "20150717",
"title": "3.3: Events (mousePressed, keyPressed) - Processing Tutorial",
"URL": "https://www.youtube.com/watch?v=UvSjtiW-RH8",
"duration": 442
C:\Users\###\Documents\youtube-dl>
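The output above gives the upload date as a compact string such as "20150717" and the duration as a number of seconds. If more readable values are wanted, a minimal Python sketch follows. The function names are hypothetical, and the YYYYMMDD layout is assumed from the sample output in this thread.

```python
from datetime import datetime, timedelta


def format_upload_date(raw):
    """Turn a compact date such as '20150717' into ISO form (YYYY-MM-DD)."""
    return datetime.strptime(raw, "%Y%m%d").date().isoformat()


def format_duration(seconds):
    """Turn a duration in seconds into H:MM:SS."""
    return str(timedelta(seconds=seconds))


# Values copied from the first video in the output above.
print(format_upload_date("20150717"))
print(format_duration(944))
```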
To get the final output into a JSON file:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .webpage_url,"duration": .duration}" > example.json
Comments:
--get-upload_date is not a valid option. upload_date is used for naming the output file.
OK. It would be great to have these added as valid options.
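To illustrate the comment above: names such as upload_date are fields that youtube-dl substitutes into its output file name template, which uses Python-style %(name)s placeholders. The sketch below only demonstrates that substitution mechanism with plain Python string formatting. It is not youtube-dl's own code, the render function is hypothetical, and the real tool additionally sanitizes the result for use as a file name.

```python
def render(template, info):
    """Fill %(name)s placeholders from a dict of video fields."""
    return template % info


# Field values copied from the thread's sample output.
info = {
    "upload_date": "20150717",
    "title": "3.1: Flow (setup and draw) - Processing Tutorial",
    "duration": 944,
}

print(render("%(upload_date)s - %(title)s (%(duration)s s)", info))
```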
Answer 1:
You need to filter the output with a handy tool such as jq:
Paste this command line: youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
jq is available from https://stedolan.github.io/jq/download/
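Where installing jq is not an option, the same filtering can be sketched in Python with only the standard library. This is an illustration under assumptions, not part of the original answer: it assumes each video's JSON object arrives on its own line, the function names are hypothetical, and it uses webpage_url for the URL because that key holds the standard watch URL.

```python
import json

# Output name -> key in youtube-dl's JSON (key names as used in this thread).
FIELDS = {
    "date": "upload_date",
    "title": "title",
    "URL": "webpage_url",
    "duration": "duration",
}


def pick_fields(json_line):
    """Reduce one JSON object (one video) to the four wanted fields."""
    info = json.loads(json_line)
    # A missing key becomes None, similar to jq printing null.
    return {name: info.get(key) for name, key in FIELDS.items()}


def filter_lines(lines):
    """Yield one reduced record per non-empty input line."""
    for line in lines:
        if line.strip():
            yield pick_fields(line)


# Stand-in for real youtube-dl output; values copied from the thread.
sample = [json.dumps({
    "upload_date": "20150717",
    "title": "3.1: Flow (setup and draw) - Processing Tutorial",
    "webpage_url": "https://www.youtube.com/watch?v=o8dffrZ86gs",
    "duration": 944,
    "formats": ["...many more keys in the real dump..."],
})]

for record in filter_lines(sample):
    print(json.dumps(record, ensure_ascii=False))
```

Replacing sample with sys.stdin would let such a script sit at the end of the pipe in place of jq.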
Update:
The key "webpage_url" contains the standard YouTube URL (if needed).
For the full list of available keys, run: youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq keys
This gives the full key names present in the original JSON.
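As a rough Python counterpart to jq keys, the sketch below lists the sorted top-level keys of one JSON object. The function name is hypothetical and the sample object is a shortened stand-in for one line of the real dump.

```python
import json


def list_keys(json_line):
    """Return the sorted top-level keys of one JSON object."""
    return sorted(json.loads(json_line))


# Shortened stand-in for one video's JSON; the real dump has many more keys.
sample_line = json.dumps({
    "webpage_url": "https://www.youtube.com/watch?v=o8dffrZ86gs",
    "upload_date": "20150717",
    "title": "3.1: Flow (setup and draw) - Processing Tutorial",
    "duration": 944,
})

for key in list_keys(sample_line):
    print(key)
```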
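Comments:
Thank you very much for your great help. Just one last issue (also mentioned in my question edit above): the URLs are not the standard YouTube video URLs. What is the right option for getting the standard URLs?
I just looked at the example.json file I generated yesterday and found that the standard YouTube video URL is given by webpage_url rather than url. It works! Thanks again for your great guidance, your help is much appreciated. Be well!
@Lod I was writing another answer about that, but you got there before me :-D
Thank you very much for following up with the possible keys, very helpful. Be well!
@Lod Glad to help!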