Parsing HTML in Swift with Kanna

Posted 2023-02-24

【Title】: Parsing HTML in Swift with Kanna 【Posted】: 2016-11-22 18:07:15 【Question】:

I am trying to use Kanna to parse HTML in my Swift project, using "What is the best practice to parse html in swift?" as a guide.

This is the code I use to parse the HTML:

if let doc = Kanna.HTML(html: myHTMLString, encoding: String.Encoding.utf8) {
    var bodyNode = doc.body

    if let inputNodes = bodyNode?.xpath("//a/@href[ends-with(.,'.txt')]") {
        for node in inputNodes {
            print(node.content)
        }
    }
}

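I have no experience with this, but I believe I have to change .xpath("//a/@href[ends-with(.,'.txt')]") to get the specific information I need.

This is the HTML I am trying to parse:

View source: https://en.wikipedia.org/wiki/List_of_inorganic_compounds

I want to get the title "Aluminium antimonide" and the chemical formula "AlSb" from the corresponding line on that page.

Can anyone tell me what to write in .xpath(...), or explain how it works?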

【Answer 1】:

Swift 3

Get all items in a loop:

for item in doc.xpath("//div[@class='mw-content-ltr']/ul/li") {
    print(item.at_xpath("a")?["title"])
    print(item.text) // this returns the whole text, you may need further actions here
}

Or access a specific item:

print(doc.xpath("//div[@class='mw-content-ltr']/ul/li")[0].at_xpath("a")?["title"])
print(doc.xpath("//div[@class='mw-content-ltr']/ul/li")[0].text)
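The comment in the loop notes that item.text returns the whole text of the list item, so the formula still has to be separated from the name. Below is a minimal sketch of one way to do that using only Foundation. It assumes each list item reads like "Aluminium antimonide – AlSb", that is, name and formula separated by an en dash (U+2013); that separator is an assumption about the page markup, and splitCompound is a hypothetical helper name, not part of Kanna.

```swift
import Foundation

/// Hypothetical helper: splits list-item text such as
/// "Aluminium antimonide – AlSb" into a compound name and a formula.
/// Assumption: the two parts are separated by an en dash (U+2013).
func splitCompound(_ text: String) -> (name: String, formula: String)? {
    // Find the first en dash; everything before it is treated as the name.
    guard let dash = text.range(of: "\u{2013}") else { return nil }

    let name = text[..<dash.lowerBound]
        .trimmingCharacters(in: .whitespacesAndNewlines)
    let formula = text[dash.upperBound...]
        .trimmingCharacters(in: .whitespacesAndNewlines)

    // Reject items where either side of the dash is empty.
    guard !name.isEmpty, !formula.isEmpty else { return nil }
    return (name: name, formula: formula)
}

if let compound = splitCompound("Aluminium antimonide \u{2013} AlSb") {
    print(compound.name)    // Aluminium antimonide
    print(compound.formula) // AlSb
}
```

Inside the loop, the helper could be called with the unwrapped item.text. Items that do not match the assumed pattern return nil and can simply be skipped.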

You can check an XPath tutorial and the documentation for more information.
