Python crawler example: a NetEase Cloud Music chart crawler
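NetEase Cloud Music used to expose an API link that returned JSON you could download from. That is gone now: all you get is the song id and title, so the only option left is to look at the playback request. The parameters of that request are all encrypted, which is a real trap...
After a lot of effort, I finally found a blog post by an expert, written for Python 3.6:
Cracking NetEase Cloud Music with Python: https://segmentfault.com/a/1190000012818254
Python code download: https://github.com/imyxuan/Netease
Running the expert's code, I ran into all sorts of errors: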
from Crypto.Cipher import AES
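This import requires: pip install pycrypto
First error: a message saying that Visual Studio 2015 Microsoft Build Tools 14.0 was not installed
Download: https://www.microsoft.com/en-us/download/details.aspx?id=48159
Second error: the VS installer then complained that the .NET Framework version was too old, so I installed version 4.6 as well
Download: https://www.microsoft.com/en-us/download/details.aspx?id=48130
Third error: running pip install pycrypto in cmd reported that cl.exe could not run properly. After more searching on Baidu, I downloaded a C++ repair tool, DirectX:
Download: http://soft.duote.org/directx_3.7.zip
After the repair tool finished, it reported C++ errors that needed an extended repair..... so I went to Tools --> Options --> Extend --> Start extending, ran the repair again, and it reported that the C++ repair had succeeded!!
Fourth error: I thought that would be the end of it, but damn, cmd failed again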
intmax_t C:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt\inttypes.
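Back to Baidu: https://blog.csdn.net/u010377372/article/details/78470824 , which is the fix I followed:
Solution:
1. Set the environment variable VCINSTALLDIR for Microsoft Visual Studio 14.0 to C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC (the default install location; adjust it to match your own install location)
2. Run the vcvarsall.bat script in the install path above
3. In the command line (cmd), run set CL=-FI"%VCINSTALLDIR%\INCLUDE\stdint.h"
4. Install again with pip install pycrypto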
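The install finally succeeded. Full of confidence, I restarted PyCharm, and the program still could not find the Crypto module at run time. Several hours had already gone by, so there was nothing for it but Baidu again..
I checked Python's module install directory and found that Crypto was already there, yet the project could not load it, so the project's module references had to be wrong. Sure enough, the project did not contain that module. I copied the installed Crypto into the project's directory, and that solved the problem
Python's Crypto install directory: C:\Users\Administrator\AppData\Local\Programs\Python\Python36\Lib\site-packages
The project's Python module directory:
Copy everything under the install directory's site-packages into the project's site-packages, and the problem is solved.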
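A quick way to see why a project cannot import a package that pip reports as installed is to ask the running interpreter where it would load the module from. The sketch below is an illustration rather than part of the original code; `locate_module` is a hypothetical helper name. It assumes (the post does not confirm this) that the PyCharm project was using a different interpreter or site-packages directory from the one pip installed into.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the origin (usually a file path) the current interpreter
    would import `name` from, or None if the module cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Which python.exe is actually running this script?
    print("interpreter:", sys.executable)
    # Where would Crypto come from, if anywhere?
    print("Crypto found at:", locate_module("Crypto"))
    # The directories this interpreter searches for modules.
    for path in sys.path:
        print("search path:", path)
```

Run it once from cmd and once from inside PyCharm. If the two runs print different interpreters, pointing the project at the interpreter that already has Crypto installed may be a cleaner fix than copying site-packages by hand.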
Here is the code:
# coding: utf-8
# https://segmentfault.com/a/1190000012818254
# Disclaimer: the author of that post wrote this; I had little to do with it and only made a few changes.
import base64
import os
import sys

import requests
from Crypto.Cipher import AES

headers = {
    'Cookie': 'appver=1.5.0.75771;',
    'Referer': 'http://music.163.com/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
}

# Request-body template; the song id is filled in per track inside the loop below.
first_param = "{\"ids\":\"[%d]\",\"br\":128000,\"csrf_token\":\"\"}"
second_param = "010001"
third_param = "00e0b509f6259df8642dbc35662901477df22677ec152b5ff68ace615bb7b725152b3ab17a876aea8a5aa76d2e417629ec4ee341f56135fccf695280104e0312ecbda92557c93870114af6c9d05c4f7f0c3685b7a46bee255932575cce10b424d813cfe4875d3e82047b97ddef52741d546b8e289dc6935b3ece0462db0a22b8e7"
forth_param = "0CoJUm6Qyw8W8jud"


def get_params():
    # Encrypt the current request body twice with AES-CBC:
    # first with forth_param as the key, then with sixteen 'F' characters.
    iv = "0102030405060708"
    first_key = forth_param
    second_key = 16 * 'F'
    h_encText = AES_encrypt(first_param, first_key, iv)
    h_encText = AES_encrypt(h_encText, second_key, iv)
    return h_encText


def get_encSecKey():
    # Fixed encSecKey value that is posted alongside params.
    encSecKey = "257348aecb5e556c066de214e531faadd1c55d814f9be95fd06d6bff9f4c7a41f831f6394d5a3fd2e3881736d94a02ca919d952872e7d0a50ebfa1769a7a62d512f5f1ca21aec60bc3819a9c3ffca5eca9a0dba6d6f7249b06f5965ecfff3695b54e1c28f3f624750ed39e7de08fc8493242e26dbc4484a01c76f739e135637c"
    return encSecKey


def AES_encrypt(text, key, iv):
    # The second pass receives base64 bytes, so accept both str and bytes.
    if isinstance(text, bytes):
        text = text.decode('utf-8')
    # Pad to a multiple of the 16-byte block size; each pad character encodes the pad length.
    pad = 16 - len(text) % 16
    text = text + pad * chr(pad)
    # Pass bytes to the cipher explicitly instead of relying on implicit str handling.
    encryptor = AES.new(key.encode('utf-8'), AES.MODE_CBC, iv.encode('utf-8'))
    encrypt_text = encryptor.encrypt(text.encode('utf-8'))
    encrypt_text = base64.b64encode(encrypt_text)
    return encrypt_text


def get_json(url, params, encSecKey):
    data = {
        "params": params,
        "encSecKey": encSecKey
    }
    response = requests.post(url, headers=headers, data=data).json()
    return response['data']


# Batch download of chart songs
# r = requests.get('http://music.163.com/api/playlist/detail?id=2884035')  # NetEase Original Songs chart
# r = requests.get('http://music.163.com/api/playlist/detail?id=19723756')  # Cloud Music Rising chart
# r = requests.get('http://music.163.com/api/playlist/detail?id=3778678')  # Cloud Music Hot Songs chart
# r = requests.get('http://music.163.com/api/playlist/detail?id=3779629')  # Cloud Music New Songs chart
# Batch download of playlist songs
# r = requests.get('http://music.163.com/api/playlist/detail?id=123415635')  # Cloud Music playlist: 【华语】中国风的韵律,中国人的印记
# r = requests.get('http://music.163.com/api/playlist/detail?id=122732380')  # Cloud Music playlist: 那不是爱,只是寂寞说的谎
r = requests.get("http://music.163.com/api/playlist/detail?id=2884035", headers=headers)
arr = r.json()['result']['tracks']

# Make sure the output directory exists before writing files into it.
os.makedirs(sys.path[0] + "/mp3", exist_ok=True)

# Iterate over the tracks actually returned (at most 100) instead of assuming there are exactly 100.
for track in arr[:100]:
    toplistMP3ID = str(track['id'])
    toplistMP3Title = str(track['name'])
    music_id = toplistMP3ID
    first_param = "{\"ids\":\"[%d]\",\"br\":128000,\"csrf_token\":\"\"}" % int(music_id)
    url = 'https://music.163.com/weapi/song/enhance/player/url?csrf_token='
    params = get_params()
    encSecKey = get_encSecKey()
    """
    rsp:{
        'data': [{
            'gain': 2.3073, 'type': 'mp3',
            'url': 'http://m10.music.126.net/20180111133509/24c79548414f7aa7407985818cb16a39/ymusic/333c/66b1/e5ec/72aeb13aca24c989295e58e8384e3f97.mp3',
            'md5': '72aeb13aca24c989295e58e8384e3f97', 'flag': 0, 'code': 200, 'payed': 0,
            'id': 151619, 'expi': 1200, 'size': 3868307, 'uf': None, 'br': 128000,
            'fee': 0, 'canExtend': False}],
        'code': 200}
    """
    rsp = get_json(url, params, encSecKey)
    music_url = rsp[0].get('url')
    if music_url:
        music = requests.get(music_url)
        name = sys.path[0] + "/mp3/%s.mp3" % toplistMP3Title
        print(name)
        with open(name, "wb") as code:
            code.write(music.content)
# music_id = input('Enter song ID: ')
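The padding step inside AES_encrypt is easy to overlook. AES in CBC mode only accepts input whose length is a multiple of 16 bytes, so the code appends between 1 and 16 extra characters, each one encoding the number of characters added. This is the PKCS#7 padding scheme. The sketch below shows just that step, using only the standard library so it runs without pycrypto; `pad` and `unpad` are illustrative names, not functions from the original code.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data, block_size=BLOCK_SIZE):
    """Append n bytes of value n so len(result) is a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def unpad(padded, block_size=BLOCK_SIZE):
    """Strip the padding added by pad(), checking that it is well formed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input length is not a positive multiple of the block size")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


if __name__ == "__main__":
    body = b'{"ids":"[151619]","br":128000,"csrf_token":""}'
    padded = pad(body)
    print(len(body), "->", len(padded))
    print(unpad(padded) == body)
```

Note that input which is already a multiple of 16 bytes still gets a full extra block of padding, so the padding can always be removed unambiguously.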