将模块的输出连接到另一个并循环此逻辑以获得值列表

Posted

技术标签:

【中文标题】将模块的输出连接到另一个并循环此逻辑以获得值列表【英文标题】:Connecting output of module to another and looping this logic for a list of values 【发布时间】:2022-01-12 01:56:01 【问题描述】:

我正在尝试在 vba 中连接 2 个模块,以便第一个模块 (geturl) 的输出馈送到另一个 (getdata)。

获取 URL 以在线查找档案 URL,以查找在 A 列中输入的物质,例如可以使用Acetone 或 B 列中的 CAS 编号(见下图)。注意:目前仅在 A1 或 B1 中查找物质信息。

Public Function GetUrl() As String
    
    Const Url = "https://echa.europa.eu/information-on-chemicals/registered-substances?p_p_id=dis-s-registeredsubstances_WAR_dis-s-regsubsportlet&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&_dis-s-registeredsubstances_WAR_dis-s-regsubsportlet_javax.portlet.action=dissRegisteredSubstancesAction"
    
    Set ohtml = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.XMLHTTP")
    Set MyDict = CreateObject("Scripting.Dictionary")
    
    SubstanceName = Cells(1, 1)
    CASNumber = Cells(1, 2)
        
    MyDict("_dis-s-registeredsubstances_WAR_dis-s-regsubsportlet_disreg_name") = SubstanceName
    MyDict("_dis-s-registeredsubstances_WAR_dis-s-regsubsportlet_disreg_cas-number") = CASNumber
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimer") = "true"
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimerCheckbox") = "on"
    
    payload = vbNullString
        
    For Each DictKey In MyDict
        payload = IIf(Len(DictKey) = 0, WorksheetFunction.EncodeURL(DictKey) & "=" & WorksheetFunction.EncodeURL(MyDict(DictKey)), _
                      payload & "&" & WorksheetFunction.EncodeURL(DictKey) & "=" & WorksheetFunction.EncodeURL(MyDict(DictKey)))
    Next DictKey
        
    With oHttp
        .Open "POST", Url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .send (payload)
        oHtml.body.innerHTML = .responseText
    End With
        
    
    GetUrl = oHtml.querySelector(".details").getAttribute("href")
    
    Debug.Print oHtml.querySelector(".substanceNameLink ").innerText
    Debug.Print oHtml.querySelector(".details").getAttribute("href")
  
End Function

如果运行应该返回 Acetone https://echa.europa.eu/registration-dossier/-/registered-dossier/15460

Get Data 使用来自 geturl 的 Url 返回“DNEL”值:

Sub GetData()
        
'Start ECHA Search via XML HTTP Request

Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument


Dim Route(1 To 3) As String

Route(1) = "sGeneralPopulationHazardViaInhalationRoute"
Route(2) = "sGeneralPopulationHazardViaDermalRoute"
Route(3) = "sGeneralPopulationHazardViaOralRoute"


XMLReq.Open "Get", GetUrl & "/7/1", False
XMLReq.send
 
If XMLReq.Status <> 200 Then
        
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub

    End If
 
HTMLDoc.body.innerHTML = XMLReq.responseText

'Loops through each element

For c = 1 To UBound(Route, 1)

Set Info = HTMLDoc.getElementById(Route(c))
Debug.Print Info.innerText

Set Info = HTMLDoc.getElementById(Route(c)).NextSibling.NextSibling.NextSibling
Set Data = Info.getElementsByTagName("dd")(0)
Debug.Print Data.innerText

Set Data = Info.getElementsByTagName("dd")(1)
Debug.Print Data.innerText

'Cells(r, c + 2) = Data.innerText

Next c

End Sub

对于 Cell(1,1) 中的丙酮,这应该返回:

Acetone 
https://echa.europa.eu/registration-dossier/-/registered-dossier/15460
General Population - Hazard via inhalation route
DNEL (Derived No Effect Level)
200 mg/m³
General Population - Hazard via dermal route
DNEL (Derived No Effect Level)
62 mg/kg bw/day
General Population - Hazard via oral route
DNEL (Derived No Effect Level)
62 mg/kg bw/day

然而,我不希望仅仅依赖单元格 A1,而是希望每个单元格的整个代码循环在 columnA/ColumnB 中都有一个物质。所以在这种情况下,找到了丙酮的 URL,然后提取了相应的数据,然后 Oxydipropanol 也会发生同样的情况。

此图片中的注意事项 可以使用物质名称、B 列中的 CAS 编号或两者的组合在线查找物质。

试图连接这两个模块,到目前为止,我只能让 geturl 模块循环遍历每种物质。我也尝试将两者合并到 1 个模块中,但无法弄清楚如何正确嵌套 for 循环。

一个快速的谷歌搜索表明你不能在 vba 中嵌套函数。这让我想知道我正在做的事情是否是解决这个问题的正确方法。但我在过去看到过类似的事情,所以我确信这是可能的。

注意:如果进行测试,请使用示例物质进行测试。使用随机化学物质说苯可能会导致错误,因为该物质的毒性特征不存在。我仍然需要实现处理错误,但现在可以忽略。

如果有任何进一步的进展,我会在这里更新您,谢谢。

【问题讨论】:

【参考方案1】:

这对我有用:

Sub PopulateExposures()
    Dim url, rw As Range
    
    Set rw = Sheets("data").Range("A1:E1") 'first row with inputs
    Do While Application.CountA(rw) > 0
        url = SubstanceUrl(rw.Cells(1).Value, rw.Cells(2).Value) 'get the URL
        rw.Cells(3).Resize(1, 3).Value = ExposureData(url) 'get exposure data (as array) and add to row
        Set rw = rw.Offset(1, 0) 'next substance
    Loop

End Sub

Public Function SubstanceUrl(SubstanceName, CASNumber) As String
    
    Const url = "https://echa.europa.eu/information-on-chemicals/registered-substances?" & _
                "p_p_id=dis-s-registeredsubstances_WAR_dis-s-regsubsportlet&p_p_lifecycle=1&" & _
                "p_p_state=normal&p_p_mode=view&" & _
                "__dis-s-registeredsubstances_WAR_dis-s-regsubsportlet_javax.portlet.action=dissRegisteredSubstancesAction"
    
    Dim oHTML, oHttp, MyDict, payload, DictKey, sep
    
    Set oHTML = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.XMLHTTP")
    Set MyDict = CreateObject("Scripting.Dictionary")
    
    MyDict("_dis-s-registeredsubstances_WAR_dis-s-regsubsportlet_disreg_name") = SubstanceName
    MyDict("_dis-s-registeredsubstances_WAR_dis-s-regsubsportlet_disreg_cas-number") = CASNumber
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimer") = "true"
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimerCheckbox") = "on"
    
    payload = ""
    For Each DictKey In MyDict
        payload = payload & sep & DictKey & "=" & WorksheetFunction.EncodeURL(MyDict(DictKey))
        sep = "&"
    Next DictKey
        
    With oHttp
        .Open "POST", url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .send payload
        oHTML.body.innerHTML = .responseText
    End With
    SubstanceUrl = oHTML.querySelector(".details").getAttribute("href")
End Function

Function ExposureData(urlToGet)
    
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As HTMLDocument, dds
    Dim Route(1 To 3) As String, Results(1 To 3) As String, c, Info, Data
    
    Route(1) = "sGeneralPopulationHazardViaInhalationRoute"
    Route(2) = "sGeneralPopulationHazardViaDermalRoute"
    Route(3) = "sGeneralPopulationHazardViaOralRoute"
    
    XMLReq.Open "Get", urlToGet & "/7/1", False
    XMLReq.send
     
    If XMLReq.Status <> 200 Then
        Results(1) = "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Else
        Set HTMLDoc = New HTMLDocument
        HTMLDoc.body.innerHTML = XMLReq.responseText
        For c = 1 To UBound(Route, 1)
            Set Info = HTMLDoc.getElementById(Route(c))
            If Not Info Is Nothing Then
                Set Info = Info.NextSibling.NextSibling.NextSibling
                Set dds = Info.getElementsByTagName("dd")
                If dds.Length > 1 Then
                    Results(c) = dds(1).innerText
                Else
                    Results(c) = "hazard unknown"
                End If
            Else
                Results(c) = "no info"
            End If
        Next c
    End If
    ExposureData = Results
End Function

【讨论】:

非常感谢您 你知道我如何在这段代码中添加错误处理,以便如果我有一个化学名称,例如Benzene。只是- 作为三列的曝光数据返回?如果不覆盖有效物质的结果,我就无法在 for 循环中进行错误处理。 您是指苯或任何没有毒性特征的化学品? 在最后一个函数中添加了一些检查——你应该明白了 是的,任何没有 Tox Profile 的值。苯只是一个例子。

以上是关于将模块的输出连接到另一个并循环此逻辑以获得值列表的主要内容,如果未能解决你的问题,请参考以下文章

遇到多线程问题同时连接到多个设备

循环遍历数据框列表以动态创建新列

如何将带有列表值的熊猫列连接到一个列表中?

进程连接到另一个进程以收集配置文件信息?

将输入类型连接到模型并在asp net中为它们分配一个值[重复]

如何将一个模块内的类中的 pyqtSignal 连接到另一个模块内的类中的 pyqtSlot?