VBA宏从IE中的链接下载多个文件

Posted

技术标签:

【中文标题】VBA宏从IE中的链接下载多个文件【英文标题】:VBA Macro to download multiple files from links in IE 【发布时间】:2015-12-24 05:15:26 【问题描述】:

我想从链接列表中下载多个文件。我找到链接的网站受到保护。这就是我想使用 IE(使用当前会话/cookie)的原因。每个链接的目标是一个 xml 文件。文件太大,无法打开然后保存。所以我需要直接保存(右键,目标另存为)。

链接列表如下所示:

<html>
<body>
<p> <a href="https://example.com/report?_hhhh=XML"Link A</a><br>> </p>
<p> <a href="https://example.com/report?_aaaa=XML"Link B</a><br>> </p>
...
</body>
</html>

我想遍历所有链接并保存每个目标。目前我有“另存为”的问题。我真的不知道该怎么做。到目前为止,这是我的代码:

Sub DownloadAllLinks()

Dim IE As Object
Dim Document As Object
Dim List As Object
Dim Link As Object

' Before I logged in to the website
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.Navigate ("https:\\......\links.html")

Do While IE.Busy
  DoEvents
Loop

' Detect all links on website
Set Document = IE.Document
Set List = Document.getElementsByTagName("a")

' Loop through all links to download them

For Each Link In List

' Now I need to automate "save target as" / right-click and then "save as"
...

Next Link
End Sub

您有什么想法可以为每个链接自动“另存为”吗?

感谢任何帮助。非常感谢, 乌力

【问题讨论】:

这是一个兔子洞,我已经下过很多次了。简短的回答是停止尝试让 IE 充当下载文件的代理。使用 xmlHttp 对象登录并使用 GetResponseHeader 收集/返回身份验证,然后使用 ADO 流保存文件。 This 可能会有所帮助。 【参考方案1】:

下面是我为您的案例改编的一个非常常见的示例,它显示了使用 XHR 和 RegEx 来检索网页 HTML 内容,从中提取所有链接,并下载每个链接的目标文件:

Option Explicit

Sub Test()
    ' declare vars
    Dim sUrl As String
    Dim sReqProt As String
    Dim sReqAddr As String
    Dim sReqPath As String
    Dim sContent As String
    Dim oLinks As Object
    Dim oMatch As Object
    Dim sHref As String
    Dim sHrefProt As String
    Dim sHrefAddr As String
    Dim sHrefPath As String
    Dim sHrefFull As String
    Dim n As Long
    Dim aContent() As Byte
    ' set source URL
    sUrl = "https:\\......\links.html"
    ' process source URL
    SplitUrl sUrl, sReqProt, sReqAddr, sReqPath
    If sReqProt = "" Then sReqProt = "http:"
    sUrl = sReqProt & "//" & sReqAddr & "/" & sReqPath
    ' retrieve source page HTML content
    With CreateObject("Microsoft.XMLHTTP")
        .Open "GET", sUrl, False
        .Send
        sContent = .ResponseText
    End With
    ' parse source page HTML content to extract all links
    Set oLinks = CreateObject("Scripting.Dictionary")
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = "<a.*?href *= *(?:'|"")(.*?)(?:'|"").*?>"
        For Each oMatch In .Execute(sContent)
            sHref = oMatch.subMatches(0)
            SplitUrl sHref, sHrefProt, sHrefAddr, sHrefPath
            If sHrefProt = "" Then sHrefProt = sReqProt
            If sHrefAddr = "" Then sHrefAddr = sReqAddr
            sHrefFull = sHrefProt & "//" & sHrefAddr & "/" & sHrefPath
            oLinks(oLinks.Count) = sHrefFull
        Next
    End With
    ' save each link target into file
    For Each n In oLinks
        sHref = oLinks(n)
        With CreateObject("Microsoft.XMLHTTP")
            .Open "GET", sHref, False
            .Send
            aContent = .ResponseBody
        End With
        With CreateObject("ADODB.Stream")
            .Type = 1 ' adTypeBinary
            .Open
            .Write aContent
            .SaveToFile "C:\Test\" & n & ".xml", 2 ' adSaveCreateOverWrite
            .Close
        End With
    Next
End Sub

Sub SplitUrl(sUrl, sProt, sAddr, sPath)
    ' extract protocol, address and path from URL
    Dim aSplit
    aSplit = Split(sUrl, "//")
    If UBound(aSplit) = 0 Then
        sProt = ""
        sAddr = sUrl
    Else
        sProt = aSplit(0)
        sAddr = aSplit(1)
    End If
    aSplit = Split(sAddr, "/")
    If UBound(aSplit) = 0 Then
        sPath = sAddr
        sAddr = ""
    Else
        sPath = Mid(sAddr, Len(aSplit(0)) + 2)
        sAddr = aSplit(0)
    End If
End Sub

此方法不使用 IE 自动化。通常Microsoft.XMLHTTP 处理的 IE 的 cookie 足以引用当前会话,因此如果您的网站不使用其他程序进行身份验证并生成链接列表,那么该方法应该适合您。

【讨论】:

【参考方案2】:
Private Declare PtrSafe Function Test Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub AutoOpen()
    Dim strFile As String
    Dim strURL As String
    Dim strPath As String
    Dim ret As Long
    
    Dim strFile1 As String
    Dim strURL1 As String
    Dim strPath1 As String
    Dim ret1 As Long
    
    Dim Shex As Object
    Dim Test2 As String
    
    
    strFile = "1st_file"
    strURL = "first-url" & strFile
    strPath = Environ("UserProfile") & "your-path" & strFile
    ret = Test(0, strURL, strPath, 0, 0)
    
    strFile1 = "something_else"
    strURL1 = "your-url" & strFile1
    strPath1 = Environ("UserProfile") & "your-path" & strFile1
    re1t = Test(0, strURL1, strPath1, 0, 0)
    If ret <> 0 Then MsgBox "Something went wrong!", vbInformation

End Sub

您可以使用此宏下载多个文件。要下载更多内容,只需复制此部分

Dim strFile As String
        Dim strURL As String
        Dim strPath As String
        Dim ret As Long

这部分:

strFile = "1st_file"
        strURL = "first-url" & strFile
        strPath = Environ("UserProfile") & "your-path" & strFile
        ret = Test(0, strURL, strPath, 0, 0)

显然只需更改变量即可。

【讨论】:

以上是关于VBA宏从IE中的链接下载多个文件的主要内容,如果未能解决你的问题,请参考以下文章

使用 VBA 宏从 Excel 工作簿中取消选择所有复选框

VBA IE 在对话框窗口中批准下载

使用 Excel 宏从 SAP 中提取数据

使用VB自动化IE“目标另存为”

使用VBA单击IE中的按钮

VBA-从IE中的部分中提取段落