Python XML filtering using XPath [duplicate]

Posted 2023-02-24


【Title】Python XML filtering using XPath [duplicate] 【Posted】2013-09-06 12:34:03 【Question】:

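I'm struggling to get this working. I have an XML file and I need to filter on the element "title" using XPath. After that I need to copy everything under the C element to an external file, but that is not the point right now. I need to do this with xml.etree.cElementTree or xml.etree.ElementTree. I have read a bunch of posts on *** and other sites, but I'm stuck. So, first the XML structure: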

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<delivery xmlns="http://url" publicationdate="2013-08-28T09:10:32Z">
    <A>
        <B>
            <C>
                <Cid>XXXXXXXXX</Cid>
                <cref>111111-2222222</cref>
                <D>
                    <E/>
                    <F/>
                    <G/>
                    <H>
                        <Href>XXXXXXXXXXXX</Href>
                        <hcont name="XXXXXX" country="EN"/>
                    </H>
                    <I/>
                    <J/>
                    <K>XXXXXXXXX</K>
                    <oldK>XXXXXXX</oldK>
                    <title>
                        <content lang="en">TITLE</content>
                    </title>
                    <L>
                        <isL>false</isL>
                    </L>
                </D>
                <M>
                    <startTime>2013-08-28T03:00:00Z</startTime>
                    <endTime>2013-08-29T00:58:00Z</endTime>
                </M>
            </C>
        </B>
    </A>
</delivery>

I can't even find the Cid element via XPath. The script keeps returning None, or [], or nothing at all.

import xml.etree.ElementTree as ET

doc = ET.ElementTree(file='short.xml') 
for x in doc.findall('./A/B/C'):
  print x.get('Cid').text

This one returns nothing. How can I get this to work? How do I "find" even the Cid element?


【Answer 1】:

You should pass the namespaces argument to findall():

namespaces = {'name_space_name_here': 'http://url'}
for x in doc.findall('./A/B/C', namespaces=namespaces):
    pass  # do smth with x

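However, this will not work for a default namespace (a bare xmlns, as in your case), because the unprefixed names in the path are not mapped to it.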

In that case, you can put the namespace explicitly in the XPath expression:

for x in doc.findall('.//{%(uri)s}C' % {'uri': 'http://url'}):
    pass  # do smth with each C element
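To make the namespace handling concrete, here is a minimal sketch run against a trimmed copy of the question's XML. The inline XML string and the prefix `d` are assumptions chosen for illustration; any prefix works as long as the path uses the same one. It also reads Cid with find(), since Cid is a child element and get() only looks up attributes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (same default namespace).
XML = """<delivery xmlns="http://url" publicationdate="2013-08-28T09:10:32Z">
    <A>
        <B>
            <C>
                <Cid>XXXXXXXXX</Cid>
                <cref>111111-2222222</cref>
            </C>
        </B>
    </A>
</delivery>"""

root = ET.fromstring(XML)

# Unqualified names match nothing: every element lives in 'http://url'.
print(root.findall('./A/B/C'))  # []

# Option 1: qualify each step with the {uri} notation.
uri = 'http://url'
for c in root.findall('./{%(uri)s}A/{%(uri)s}B/{%(uri)s}C' % {'uri': uri}):
    # find() returns a child element; get() would read an attribute instead.
    print(c.find('{%s}Cid' % uri).text)  # XXXXXXXXX

# Option 2: map an arbitrary prefix ('d' is a hypothetical choice) to the URI
# and use that prefix on every step of the path.
ns = {'d': uri}
cids = [c.find('d:Cid', ns).text for c in root.findall('./d:A/d:B/d:C', ns)]
print(cids)  # ['XXXXXXXXX']
```

Both options select the same C elements; the prefix form is usually easier to read when the path has several steps.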

See also:

Parsing XML with namespace in Python via 'ElementTree'
ElementTree: Working with Namespaces and Qualified Names

