Updated crawler for Huawei Cloud photos (Python 3.6)

Posted 朝花夕拾

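I. Background:

At the end of every year I tidy up my files and archive them. This time I found that all my phone photos were backed up in Huawei Cloud, and after looking around the official site I could not find an official PC tool for syncing photos.

So I dug out the program I wrote last time to see whether it could still crawl the data. As expected, it no longer worked: Huawei has added more verification to the login process, such as account protection.

I captured the traffic and found that the logic had become much more complicated, with part of it wrapped in JS.

Forget it, I can't be bothered to work it all out. I'll just use selenium.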

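II. Implementation approach:

1. Use Python + selenium + a browser, log in manually, and save the cookies and the signature information.

2. Then call requests with the cookies and signature saved in step 1, and send POST requests straight to the backend to fetch the data.

With the approach settled, let's get to work.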
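
The handoff between the two steps can be sketched as follows. This is only an illustration under stated assumptions: `build_headers` is a hypothetical helper, the cookie list imitates the list of dicts (with `name` and `value` keys) that Selenium's `get_cookies()` returns, and the token is assumed to appear in the page source in the form `CSRFToken = "..."`.

```python
import re


def build_headers(cookies, page_source, user_agent):
    """Turn browser cookies and page source into request headers (hypothetical helper)."""
    # Join all cookies into a single Cookie header value.
    cookie_str = "; ".join("%s=%s" % (c["name"], c["value"]) for c in cookies)
    # Assumption: the page embeds the token as: CSRFToken = "..."
    match = re.search(r'CSRFToken = "(.*?)"', page_source)
    if match is None:
        raise ValueError("CSRFToken not found in page source")
    return {
        "User-Agent": user_agent,
        "CSRFToken": match.group(1),
        "Cookie": cookie_str,
    }


# Made-up values standing in for a real logged-in browser session.
fake_cookies = [{"name": "JSESSIONID", "value": "abc"}, {"name": "token", "value": "xyz"}]
fake_source = '<script>var CSRFToken = "t0k3n";</script>'
headers = build_headers(fake_cookies, fake_source, "Mozilla/5.0")
print(headers["Cookie"])
```

The resulting dict could then be passed as `headers=` to a `requests` POST, so the backend sees the same session the browser established.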

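III. Development environment:

1. Python 3.6. In a recent project I kept running into Chinese encoding problems, which became truly tiresome, so I upgraded my development tools to py3. It really is much more convenient.

As for moving from py2 to py3: some syntax needs adjusting and some packages are not supported under py3, but overall the migration went smoothly, and syntax issues can mostly be solved with a quick Baidu search.

I use the Python distribution from Anaconda.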

Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 15 2017, 03:27:45) [MSC v.1900 64 bit (AMD64)]
Type "help", "copyright", "credits" or "license" for more information.

 

2. selenium 3.9.0, freshly installed with conda.

conda install selenium 

3. Browsers: I tried firefox, edge, chrome and phantomjs, in the following versions:

firefox: 58.0.2 (64-bit)
edge: Microsoft Edge 41.16299.248.0
chrome: version 63.0.3239.132 (official build) (32-bit)
phantomjs: 2.1.1

Operating system: Microsoft Windows [version 10.0.16299.248]

 

 

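4. Browser drivers:

firefox driver: https://github.com/mozilla/geckodriver/releases/ , supports Firefox 55 and above.

edge driver: https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/#downloads , latest version Release 16299, Version: 5.16299, Edge version supported: 16.16299. Note that the edge driver only runs properly when the Edge browser is not already running; otherwise it raises an error.

chrome driver: https://sites.google.com/a/chromium.org/chromedriver/downloads . Note that the latest version is 2.35 (not 2.9), and it is 2.35 that supports chrome 61-63.

phantomjs: http://phantomjs.org/download.html . phantomjs can be thought of as a browser without a UI, so the driver and the browser are one and the same.

Be sure to pick the right driver version, otherwise you will run into all kinds of strange problems.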
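
The "2.35, not 2.9" pitfall comes from reading dotted version numbers as decimals or as plain strings. A small sketch of comparing them component by component is shown below; `version_tuple` is a hypothetical helper for illustration and assumes purely numeric components.

```python
def version_tuple(version):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


# Component-wise comparison: 35 > 9, so 2.35 is the newer release.
print(version_tuple("2.35") > version_tuple("2.9"))
# Plain string comparison gets it wrong, because "3" sorts before "9".
print("2.35" > "2.9")
```

The same helper works for longer versions such as the browser's `63.0.3239.132`.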

 

IV. Implementation code

huaweiphoto_sele.py is as follows:

# -*- coding: utf-8 -*-
# Created by : zhongtang
# Create Date : 2018.2.28

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.proxy import ProxyType
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from PIL import Image
import json,re,os,time,requests,socket

# Class that fetches the album and file lists
from  huaweiphoto_py3 import HuaWei


class hwSele:
    SeleBrowser=None
    TimeOUT=30
    Headers=None
    Username='*****'
    Passwd='****'
    DriverType="Edge".lower()
    def __init__(self,ip=None,port=None,SeleDriver="Edge",SeleHeader=None):
        print ('proxy %s %s...' %(ip,port))
        if not SeleHeader :
            self.Headers = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0"
        else:
            self.Headers = SeleHeader

        if SeleDriver:
            self.DriverType= SeleDriver.lower()

        # The proxy is only there to make capturing traffic easier.
        if self.DriverType=='chrome' :
            chromeOptions = webdriver.ChromeOptions()
            if ip:
                chromeOptions.add_argument('--proxy-server=http://%s:%s' %(ip,port))
                self.SeleBrowser = webdriver.Chrome(chrome_options=chromeOptions)
            else:
                self.SeleBrowser = webdriver.Chrome()
        elif self.DriverType=='phantomjs':
            # Set the userAgent
            dcap = dict(DesiredCapabilities.PHANTOMJS)
            dcap["phantomjs.page.settings.userAgent"] = (self.Headers)
            if ip:
                # Add the proxy to the capabilities before the browser is started
                proxy=webdriver.Proxy()
                proxy.proxy_type=ProxyType.MANUAL
                proxy.http_proxy='%s:%s' %(ip,port)
                proxy.add_to_capabilities(dcap)
            self.SeleBrowser = webdriver.PhantomJS(executable_path=r'D:\python\toupiao\phantomjs\bin\phantomjs.exe',desired_capabilities=dcap)
        elif self.DriverType=='edge':
            self.KillSeleProc() # edge: kill any running browser first by default.
            self.SeleBrowser = webdriver.Edge()
        elif self.DriverType=='firefox':
            webdriver.DesiredCapabilities.FIREFOX['firefox.page.settings.userAgent'] = self.Headers
            profile = webdriver.FirefoxProfile()
            if ip:
                profile.set_preference('network.proxy.type', 1)  # Default is 0 (direct connection); 1 means manual proxy configuration.
                profile.set_preference('network.proxy.http', ip)
                profile.set_preference('network.proxy.http_port', int(port))  # port preferences must be integers
                profile.set_preference('network.proxy.ssl', ip)
                profile.set_preference('network.proxy.ssl_port', int(port))
                profile.update_preferences()
                self.SeleBrowser = webdriver.Firefox(profile)
            else:
                self.SeleBrowser = webdriver.Firefox()
        socket.setdefaulttimeout(self.TimeOUT)
        # Set the page-load timeout (TimeOUT seconds), similar to the timeout option of requests.get(); driver.get() has no timeout option.
        # I have seen driver.get(url) never return without raising an error, which hangs the program; setting this timeout solves that.
        self.SeleBrowser.set_page_load_timeout(self.TimeOUT)
        # Set the script timeout (TimeOUT seconds)
        self.SeleBrowser.set_script_timeout(self.TimeOUT)
        # Implicit wait of TimeOUT seconds (30 by default); adjust as needed
        self.SeleBrowser.implicitly_wait(self.TimeOUT)

    def KillSeleProc(self):
        command = None  # stays None for an unknown driver type
        if self.DriverType=='edge':
            # Kill the edge driver and browser processes
            command = 'taskkill /F /IM MicrosoftWebDriver.exe & taskkill /F /IM MicrosoftEdge.exe'
        elif self.DriverType=='chrome':
            command = 'taskkill /F /IM chromedriver.exe & taskkill /F /IM chrome.exe'
        elif self.DriverType=='firefox':
            command = 'taskkill /F /IM geckodriver.exe & taskkill /F /IM firefox.exe'
        elif self.DriverType=="phantomjs":
            command = 'taskkill /F /IM phantomjs.exe '
        if command: os.system(command)


    def QuitSele(self,e,mess=None,iRet= -1):
        # Print the message, save a screenshot, close the browser and return the error code
        print (mess,e)
        if self.SeleBrowser:
            self.SeleBrowser.save_screenshot('error.png')
            self.SeleBrowser.close()
        self.KillSeleProc()
        return iRet
        
    def LoginHW(self):
        '''
        try:
            element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "loadedButton")))
        finally:
            print(driver.find_element_by_id("content").text)
            driver.close()

        # Wait for the page to finish loading, option 1: explicit wait
        try:
            auth_img = WebDriverWait(self.SeleBrowser, 5).until(EC.presence_of_element_located((By.ID, "randomCodeImg")))
        except Exception as e:
            print ('Timed out loading the captcha...',e)
            SeleBrowser.save_screenshot(r'd:\python\toupiao\error.jpg')
            self.SeleBrowser.close()
            return -1


        # Wait for the page to finish loading, option 2: explicit wait with a custom condition
        dr=WebDriverWait(self.SeleBrowser,20,0.5)
        dr.until(lambda the_driver:the_driver.find_element_by_xpath("//img[@id='randomCodeImg']").is_displayed())
        '''
        try:
            self.SeleBrowser.get('http://cloud.huawei.com')
        except Exception as e:
            return self.QuitSele(e,"Failed to open the home page!")

        try:
            # Wait for the page to finish loading
            dr=WebDriverWait(self.SeleBrowser,self.TimeOUT,0.5)
            dr.until(lambda the_driver:the_driver.find_element_by_id("randomCodeImg").is_displayed())
        except Exception as e:
            return self.QuitSele(e,"Timed out loading the captcha!")

        elem_user = self.SeleBrowser.find_element_by_id("login_userName")
        elem_user.clear()
        elem_user.send_keys(self.Username)

        elem_pwd =  self.SeleBrowser.find_element_by_id("login_password")
        elem_pwd.clear()
        elem_pwd.send_keys(self.Passwd)

        auth_img = self.SeleBrowser.find_element_by_id("randomCodeImg")
        if not auth_img.is_displayed():
            return self.QuitSele(None,"The captcha is not displayed properly!")

        if self.DriverType=='firefox':
            # The firefox driver can save an element directly as an image
            auth_img.screenshot("captcha.png")
            im = Image.open('captcha.png')
        else:
            # chrome and edge do not support that, and phantomjs still saves the whole window
            self.SeleBrowser.save_screenshot('captcha.png')
            im = Image.open('captcha.png')
            # Crop the captcha out of the full screenshot using the element's position and size
            x= auth_img.location['x']
            y= auth_img.location['y']
            width= auth_img.size['width']
            height= auth_img.size['height']
            im = im.crop((x, y, x+width, y+height))
        # Use the most primitive and most accurate method here: show the image, recognize it by eye ^_^, and "intelligently" type in the captcha.
        # You could also call a third-party image recognition API, such as pytesseract or Tencent's image recognition API. It is not complicated, but I could not be bothered to write it.
        im.show()
        authCode= input('Please enter the captcha: ')

        # Focus the field first, then fill it in, then click login
        '''
        js= '$("#randomCode").attr("value","%s");$("#randomCode").trigger("onchange");' %authCode
        self.SeleBrowser.execute_script(js)

        js= '$("#btnLogin").trigger("click");'
        self.SeleBrowser.execute_script(js)
        '''
        randomCode = self.SeleBrowser.find_element_by_id("randomCode")
        randomCode.clear()
        randomCode.send_keys(authCode)

        # Sleep five seconds to let the backend pre-validation exchange finish
        time.sleep(5)

        btnLogin = self.SeleBrowser.find_element_by_id("btnLogin")
        btnLogin.click()

        # Account protection sometimes shows a prompt
        '''
        <div class="global_dialog_confirm_main" style="display: block; margin-top: -163.5px;">
        <div class="global_dialog_confirm_title"> 
        <h3 class="ellipsis" title="帐号保护">帐号保护</h3>    </div>    
        <div class="global_dialog_confirm_content" style="padding-bottom: 0px;"><div>
        <div id="authenDialog"><p class="inptips2">您已开启帐号保护,请输入验证码以完成登录。</p>
        <div class="margin10-EMUI5"><div id="accountDiv" class="fixAccountDrt ddrop-EMU5">        
        '''
        try :
            #loginConfirm = self.SeleBrowser.find_element_by_class_name("global_dialog_confirm_main")
            loginConfirm =WebDriverWait(self.SeleBrowser, 5, 0.5).until(EC.presence_of_element_located((By.CLASS_NAME, 'global_dialog_confirm_main') ))
            # Verification is required. I did not implement this part: sleep 60 seconds and handle it by hand.
            if loginConfirm.is_displayed():
                time.sleep(self.TimeOUT*2)
        except Exception:
            # No verification required, go straight to the next step
            pass

        # Wait for the page to finish loading
        '''
        <span class="index-span" data-bind="lang.common.album">图库</span>
        '''
        try :
            success =WebDriverWait(self.SeleBrowser, 20, 0.5).until(EC.presence_of_element_located((By.XPATH, '//span[@data-bind="lang.common.album"]') ))
        except Exception as e:
            # Login failed
            return self.QuitSele(e,"Login failed!",iRet=-999)

        # Check the login result
        if not success.is_displayed(): return self.QuitSele(None,"Login failed!",iRet=-999)

        # Check once more, as an extra safeguard (the page text itself is Chinese)
        source_code =self.SeleBrowser.page_source
        if '联系人' not in source_code or '图库' not in source_code :
            return self.QuitSele(None,"Login failed!",iRet = -9999 )

        cookie = [item["name"] + "=" + item["value"] for item in self.SeleBrowser.get_cookies()]
        cookiestr = ';'.join(cookie)
        # Save the CSRFToken
        pattern = re.compile('CSRFToken = "(.*?)"',re.S)
        content = re.search(pattern,source_code)
        if content :
            CSRFToken = content.group(1)
        else :
            # Without the token the later requests cannot succeed, so treat this as a failed login
            return self.QuitSele(None,"Failed to get the CSRFToken!")
        self.Headers={
            'User-Agent': '%s' %self.Headers,
            'CSRFToken': '%s' %CSRFToken,
            'Cookie': '%s' %cookiestr
        }
        return 1
            
if __name__ == '__main__':
    photohw= HuaWei()
    count =0
    while (count <100):
        count += 1
        selehw= hwSele(SeleDriver='edge')
        iRet = selehw.LoginHW()
        if iRet !=1:
            print('Huawei login failed!!!\n\n')
            continue
        photohw.loginHeaders = selehw.Headers
        page = photohw.getAlbumList()
        if page=='' :
            print('Failed to get the album list!!!\n\n')
            break
        # Save the album list
        iRet = photohw.getFileList(page,'albumList','albumId')
        if iRet <=0 :
            print('Error saving albums, logging in again')
            continue
        # Save the shared album list
        iRet = photohw.getFileList(page,'ownShareList','shareId')
        if iRet >0 :  # getFileList returns 1 on success and -1 on failure
            print('Finished. You can open the album files with Xunlei to batch-download the photos locally!!!\n\n')
            # Done: close the browser
            selehw.QuitSele(None)
            break
        else:
            continue
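
The screenshot-and-crop step in LoginHW comes down to computing a rectangle from the element's position and size. Below is a minimal sketch of that arithmetic. `crop_box` and its `scale` parameter are hypothetical and not part of the script; `scale` reflects the assumption that on a high-DPI display the screenshot may contain more pixels than the page's CSS coordinates suggest.

```python
def crop_box(location, size, scale=1.0):
    """Return the (left, upper, right, lower) box expected by PIL's Image.crop().

    location and size are dicts shaped like Selenium's WebElement.location
    and WebElement.size; scale is an assumed screenshot-to-CSS pixel ratio.
    """
    left = int(location["x"] * scale)
    upper = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    lower = int((location["y"] + size["height"]) * scale)
    return (left, upper, right, lower)


print(crop_box({"x": 100, "y": 200}, {"width": 80, "height": 30}))
print(crop_box({"x": 100, "y": 200}, {"width": 80, "height": 30}, scale=1.5))
```

If the cropped captcha comes out offset or too small, a scale factor other than 1.0 is one thing worth checking.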

 

 

huaweiphoto_py3.py is as follows:

# -*- coding: utf-8 -*-
# Created by : zhongtang
# Create date : 2018.2.28


import json
import requests
from requests.adapters import HTTPAdapter
import html

class HuaWei:
    # Huawei Cloud service client; the login headers are filled in by the caller
    def __init__(self):
        self.getalbumsUrl= 'https://www.hicloud.com/album/getCloudAlbums.action'
        self.getalbumfileUrl = 'https://www.hicloud.com/album/getCloudFiles.action'
        self.loginHeaders = { }
        self.SReq=requests.session()
        self.SReq.mount('http://', HTTPAdapter(max_retries=3))
        self.SReq.mount('https://', HTTPAdapter(max_retries=3))
        self.OnceMaxFile=100 # maximum number of files fetched per request
        self.FileNum=0
        self.AlbumList={}
    

    # Save the album's photo URLs to a file; each album goes into its own file
    def saveFileList2Txt(self,filename,hjsondata,flag):
        if len(hjsondata)<= 0 : return -1
        hjson2 = {}
        try:
            hjson2 = json.loads(hjsondata)
        except Exception:
            print('Failed to parse the album details\n')
            return -1

        lfilename = filename+".txt"
        if flag == 0 : # create a new file
            print('Creating album file '+lfilename+"\n")
            # A new file means a new album, so restart the count
            self.FileNum = 0
            mode = 'w'
        else: # append to the existing file
            mode = 'a'
        i = 0
        with open(lfilename, mode) as f:
            for each in hjson2.get("fileList") or []:
                fileurl= html.unescape(each["fileUrl"])
                f.write(fileurl+"\n")
                # Write a page separator every thousand lines
                self.FileNum += 1
                if self.FileNum%1000 ==0 :f.write('\n\n\n\n\n\n--------------------page %s ------------------\n\n\n\n\n\n' %(self.FileNum//1000))
                i += 1
        return i
    
    # Loop over the albums and read their files
    def getFileList(self,hjsondata,parentkey,childkey):
        # step 3 getCoverFiles.action: fetch each album's file list in a loop, at most 100 records per request.
        # In the captured traffic below, count is always the maximum of 49 regardless of how many files remain; currentNum increases each time until an empty list is returned.
        #albumIds[]=default-album-2&ownerId=220086000029851117&height=300&width=300&count=49&currentNum=0&thumbType=imgcropa&fileType=0
        #albumIds[]=default-album-1&ownerId=220086000029851117&height=300&width=300&count=49&currentNum=49&thumbType=imgcropa&fileType=0
        #albumIds[]=default-album-1&ownerId=220086000029851117&height=300&width=300&count=49&currentNum=98&thumbType=imgcropa&fileType=0
        #albumIds[]=default-album-2&ownerId=220086000029851117&height=300&width=300&count=49&currentNum=101&thumbType=imgcropa&fileType=0
        # The last request returns an empty list
        #{"albumSortFlag":true,"code":0,"info":"success!","fileList":[]}
        # On the first request, even if the album holds only 2 files in total, count is still the maximum of 49.
        #albumIds[]=default-album-102-220086000029851117&ownerId=220086000029851117&height=300&width=300&count=49&currentNum=0&thumbType=imgcropa&fileType=0
        #[{u'photoNum': 2518, u'albumName': u'default-album-1', u'iversion': -1, u'albumId': u'default-album-1', u'flversion': -1, u'createTime': 1448065264550L, u'size': 0},
        #{u'photoNum': 100, u'albumName': u'default-album-2', u'iversion': -1, u'albumId': u'default-album-2', u'flversion': -1, u'createTime': 1453090781646L, u'size': 0}]
        try:
            hjson = json.loads(hjsondata)
        except Exception:
            print ('Failed to load the json!')
            return -1

        # The key is missing from the dict (or its list is empty)
        if not hjson.get(parentkey):
            print ('Failed to load json root node [%s]!' %parentkey)
            return -1

        # Initialize the shared AlbumList on first use
        if not self.AlbumList :
            self.AlbumList=hjson

        # Give every album a progress counter so that a retry resumes where it stopped
        for idx,album in enumerate(self.AlbumList[parentkey]):
            if 'currentNum' not in self.AlbumList[parentkey][idx].keys():
                self.AlbumList[parentkey][idx]['currentNum']=0

        # Save the albums one by one
        for each in hjson[parentkey]:
            # Request parameters for this album
            paraAlbum={}
            paraAlbum['albumIds[]'] = each[childkey]
            paraAlbum['ownerId'] = hjson['ownerId']
            paraAlbum['height'] = '300'
            paraAlbum['width'] = '300'
            paraAlbum['count'] = self.OnceMaxFile
            paraAlbum['thumbType'] = 'imgcropa'
            paraAlbum['fileType'] = '0'
            itotal= each['photoNum']

            # Look up the progress already recorded for this album
            for idx,album in enumerate(self.AlbumList[parentkey]):
                if each[childkey]==album[childkey]:
                    icurrentnum = self.AlbumList[parentkey][idx]['currentNum']
                    break

            # Save all the files in the album
            while icurrentnum<itotal:
                paraAlbum['currentNum'] = icurrentnum
                # Note: verify=False disables TLS certificate verification
                response=self.SReq.post(self.getalbumfileUrl,headers=self.loginHeaders,data=paraAlbum,verify=False)
                page = response.text
                # Save the download URLs to a text file, without downloading the files themselves
                iret  = self.saveFileList2Txt(each[childkey],page,icurrentnum)
                if iret >0 :
                    self.AlbumList[parentkey][idx]['currentNum']  += iret
                    icurrentnum = self.AlbumList[parentkey][idx]['currentNum']
                else:
                    # Error!!!
                    return -1
        return 1

    # step 1 getCloudAlbums: fetch the album list
    def getAlbumList(self):
        response=self.SReq.post(self.getalbumsUrl,headers=self.loginHeaders,verify=False)
        page=response.text
        '''# Response message
        {"ownerId":"220086000029851117","code":0,
        "albumList":[{"albumId":"default-album-1","albumName":"default-album-1","createTime":1448065264550,"photoNum":2521,"flversion":-1,"iversion":-1,"size":0},
                     {"albumId":"default-album-2","albumName":"default-album-2","createTime":1453090781646,"photoNum":101,"flversion":-1,"iversion":-1,"size":0}],
        "ownShareList":[{"ownerId":"220086000029851117","resource":"album","shareId":"default-album-102-220086000029851117","shareName":"微信","photoNum":2,"flversion":-1,"iversion":-1,"createTime":1448070407055,"source":"HUAWEI MT7-TL00","size":0,"ownerAcc":"****","receiverList":[]}],
        "recShareList":[]}
        '''
        if len(page)<=0 :
            print('Failed to get the album list, empty response!!!\n\n')
        return page
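
The paging logic in getFileList (request `count` records starting at `currentNum`, advance by however many came back, stop at the total or on an empty list) can be sketched independently of the Huawei API. In the sketch below, `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for the POST request.

```python
def fetch_all(fetch_page, total, page_size=100):
    """Collect items page by page until `total` is reached or a page comes back empty."""
    items = []
    current = 0
    while current < total:
        page = fetch_page(current, page_size)
        if not page:          # empty list: nothing more to read
            break
        items.extend(page)
        current += len(page)  # advance by what was actually returned
    return items


# Stand-in for the POST request: serves slices of a fake 250-item album.
album = ["photo_%d.jpg" % i for i in range(250)]


def fake_fetch(current_num, count):
    return album[current_num:current_num + count]


print(len(fetch_all(fake_fetch, total=250)))
```

Advancing by the number of records actually returned, rather than by the requested count, keeps the offset correct even when the server returns fewer records than asked for.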

 

V. Results:

The program generates files of Huawei Cloud album photo download URLs in the current directory, with contents like this:

https://d167.g03.dbankcloud.com/file/MDAwMTZBODissQaaaaaaaaaaaaaaaaaaaaQc2CR-znjyRnw../162807b277aaaaaaaaaaaaaaaaaa9ee1/IMG_20170606_141952.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaaWNLIosPR_EKv8VQ..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODhhoFaaaaaaaaaaaaaaaaaaaa7r6jPU67bWTQA../4039a1be5caaaaaaaaaaaaaaaaaac726/IMG_20170605_203519.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaajvduIL8cXufhNhQ..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODgpoWaaaaaaaaaaaaaaaaaaaaaciAQlIVHRbXg../9e336da286aaaaaaaaaaaaaaaaaaf89d/IMG_20170604_171032.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaaThkDgJKHpBtiG5w..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODgUz2aaaaaaaaaaaaaaaaaaaatyFMDr71YpXGg../b3c17582ccaaaaaaaaaaaaaaaaac278b/IMG_20170603_134831.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaaTI_xPSzF_VzUsJA..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODgGfwaaaaaaaaaaaaaaaaaaaad7DLSHH4rwKVA../2722df087baaaaaaaaaaaaaaaaa915e4/IMG_20170603_133833.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaaluTQ8grDHok9BzQ..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODgsrHaaaaaaaaaaaaaaaaaaaatq2yOJ-OnkDtQ../77e0ef0560aaaaaaaaaaaaaaaaa44702/IMG_20170602_183736.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaa20WJbfxn-qoqIeQ..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODiAcDaaaaaaaaaaaaaaaaaaaaEXW-ONoF0Shuw../df033e69ffaaaaaaaaaaaaaaaaa8c1b1/IMG_20170601_185446.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaaxIC-spleDG_xxVg..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODjY8EaaaaaaaaaaaaaaaaaaaaVbr9kC-JU0M8g../d5230d2032aaaaaaaaaaaaaaaaa903b3/IMG_20170601_102059.jpg?key=AWqIQFqVkEaaaaaaaaaaaaaaaaaaaaaa49iM03bK-Cm-Z9g..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0
https://d167.g03.dbankcloud.com/file/MDAwMTZBODi04Caaaaaaaaaaaaaaaaaaaaaaw41bSNB4pxBw../6ee510e28aaaaaaaaaaaaaaaaa57cb5a/IMG_20170601_102042.jpg?key=AWqIQFqVkaaaaaaaaaaaaaaaaaaaaaaxlapdsHLoRCSITVw..&a=220086000029851117-3da1ab76-92808-5840&nsp_ver=3.0

Copy the download links above into Xunlei and add them as a batch task to download the pictures to your local disk.
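
If you would rather script the download than use Xunlei, the first step is reading the links back out of the generated file and deciding on local file names. The sketch below covers only that step; `read_urls` and `local_name` are hypothetical helpers, and the sample lines use a made-up `example.com` host in the same shape as the real links.

```python
import posixpath
from urllib.parse import urlsplit


def read_urls(lines):
    """Keep only the download links, skipping blank lines and page separators."""
    return [line.strip() for line in lines if line.strip().startswith("http")]


def local_name(url):
    """Use the last path component of the URL as the local file name."""
    return posixpath.basename(urlsplit(url).path)


sample = [
    "https://example.com/file/abc/IMG_20170606_141952.jpg?key=k&nsp_ver=3.0\n",
    "\n",
    "--------------------page 1 ------------------\n",
    "https://example.com/file/def/IMG_20170605_203519.jpg?key=k&nsp_ver=3.0\n",
]
for url in read_urls(sample):
    print(local_name(url))
```

Each URL could then be fetched with an HTTP client of your choice and written to the name returned by `local_name`.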

 

That's all. -- End --
