Unable to read XML with namespace prefix using DOM parser

Posted 2023-02-24


【Title】: Unable to read xml with namespace prefix using DOM parser 【Posted】: 2013-05-13 02:10:21 【Question】:

Here is the input XML:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
   <SOAP-ENV:Header/>
   <SOAP-ENV:Body>
      <ns2:SendResponse xmlns:ns2="http://mycompany.com/schema/">
         <ns2:SendResult>
            <ns2:Token>A00179-02</ns2:Token>
         </ns2:SendResult>
      </ns2:SendResponse>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Here is the code I use to read the XML (the variable xmlString holds the XML above):

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xmlString));
Document doc = db.parse(is);

System.out.println("Element :" + doc.getElementsByTagName("Token").item(0));
System.out.println("Element :" + doc.getElementsByTagName("ns2:Token").item(0));

Output:

Element :null
Element :[ns2:Token: null]

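If I use "ns2:Token" as the tag name I can read the element, but I don't want to hard-code the prefix in my code, because I can't be sure it will stay the same in the future. Is there a way to read the XML element without hard-coding the namespace prefix in the tag name?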

【Question comments】:

【Answer 1】:

First, get the namespace prefix of the document element:

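// Enable namespace awareness before creating the DocumentBuilder that parses doc.
docFactory.setNamespaceAware(true);
// "prefix:" of the root element, or "" when the root element has no prefix.
StringBuilder nameSpace = new StringBuilder(
        doc.getDocumentElement().getPrefix() != null ? doc.getDocumentElement().getPrefix() + ":" : "");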

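Then use the nameSpace variable accordingly. Note that this only finds elements that share the root element's prefix.

For example: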

Node node= doc.getElementsByTagName(nameSpace + "Node1").item(0)
                    .getFirstChild();
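
To see how the root-prefix approach behaves on the XML from the question, here is a minimal sketch. The class name `RootPrefixLookup` and its helper methods are made up for illustration. The XML is the question's response, condensed onto fewer lines.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RootPrefixLookup {
    // The SOAP response from the question, condensed onto fewer lines.
    static final String SAMPLE_XML =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\">"
      + "<SOAP-ENV:Header/><SOAP-ENV:Body>"
      + "<ns2:SendResponse xmlns:ns2=\"http://mycompany.com/schema/\">"
      + "<ns2:SendResult><ns2:Token>A00179-02</ns2:Token></ns2:SendResult>"
      + "</ns2:SendResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>";

    // Parses the XML with a namespace-aware DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Returns "prefix:" for the document element, or "" when it has no prefix.
    static String rootPrefix(Document doc) {
        String prefix = doc.getDocumentElement().getPrefix();
        return prefix != null ? prefix + ":" : "";
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE_XML);
        String prefix = rootPrefix(doc);
        // Body shares the root's prefix (SOAP-ENV), so this lookup finds it.
        System.out.println(doc.getElementsByTagName(prefix + "Body").getLength());
        // Token uses a different prefix (ns2), so this lookup finds nothing.
        System.out.println(doc.getElementsByTagName(prefix + "Token").getLength());
    }
}
```

For the question's document the root prefix is SOAP-ENV, so this technique reaches Envelope, Header, and Body, but not the ns2 elements inside the body.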

【Comments】:

【Answer 2】:

Try using an XPath expression. See the sample code below.

// dBuilder must come from a namespace-aware DocumentBuilderFactory.
Document doc = dBuilder.parse(new ByteArrayInputStream(responseXML.getBytes()));
doc.getDocumentElement().normalize();
XPath xPath =  XPathFactory.newInstance().newXPath();
// The ns6/ns3 prefixes only resolve if a NamespaceContext is registered via xPath.setNamespaceContext(...).

String expression = "/ns6:ReadPersonReturn/ns6:object/ns3:Person/ns3:Phone/ns3:item";
NodeList nodes = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);
Element firstNode = null;
if (nodes != null && nodes.getLength() > 0)
    firstNode = (Element) nodes.item(0);
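
Prefixes inside an XPath expression are resolved through a NamespaceContext, not through the prefixes used in the document. The following is a minimal sketch applied to the Token element from the question. The class name `TokenByXPath` and the prefix `m` are arbitrary choices for illustration; the namespace URI is the one declared for ns2 in the question's XML.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TokenByXPath {
    static String readToken(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        // Bind our own prefix "m" to the namespace URI; the document's prefix is irrelevant.
        xPath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "m".equals(prefix) ? "http://mycompany.com/schema/" : XMLConstants.NULL_NS_URI;
            }
            // Reverse lookups are not needed for evaluating this expression.
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) { return Collections.emptyIterator(); }
        });

        NodeList nodes = (NodeList) xPath.evaluate("//m:Token", doc, XPathConstants.NODESET);
        return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        // Same namespace as in the question, but with a different prefix in the document.
        String xml = "<a:SendResult xmlns:a=\"http://mycompany.com/schema/\">"
                   + "<a:Token>A00179-02</a:Token></a:SendResult>";
        System.out.println(readToken(xml));
    }
}
```

Because the expression is matched on the namespace URI, the lookup keeps working if the sender renames ns2 to any other prefix.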

【Comments】:

【Answer 3】:

The W3C DOM method for namespaced elements:

getElementsByTagNameNS

NodeList getElementsByTagNameNS(String namespaceURI,
                                String localName)

    Returns a NodeList of all the Elements with a given local name and namespace URI in document order.

    Parameters:
        namespaceURI - The namespace URI of the elements to match on. The special value "*" matches all namespaces.
        localName - The local name of the elements to match on. The special value "*" matches all local names. 
    Returns:
        A new NodeList object containing all the matched Elements.
    Since:
        DOM Level 2

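IIRC, early versions of the W3C DOM had poor namespace support, so I don't use it myself. However, if you call the method above with the full namespace URI of the element you are looking for, it should work. For Token that URI is http://mycompany.com/schema/ (http://schemas.xmlsoap.org/soap/envelope/ only covers Envelope, Header, and Body). The prefix is unimportant: it has no permanence outside the document that uses it.

So try:

System.out.println("Element :" + doc.getElementsByTagNameNS(
        "http://mycompany.com/schema/", "Token").item(0));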

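A self-contained sketch of the getElementsByTagNameNS lookup, assuming the namespace URI declared for ns2 in the question's XML (http://mycompany.com/schema/), which is the namespace the Token element belongs to. The class name `TokenByNamespace` and the method `readToken` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TokenByNamespace {
    // Namespace URI of the Token element, taken from the question's XML.
    static final String NS = "http://mycompany.com/schema/";

    // Returns the text of the first Token element in namespace NS, or null if there is none.
    static String readToken(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for the *NS methods to match
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        Node token = doc.getElementsByTagNameNS(NS, "Token").item(0);
        return token == null ? null : token.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The prefix here is "abc" instead of "ns2"; only the URI matters for the lookup.
        String xml = "<abc:SendResult xmlns:abc=\"http://mycompany.com/schema/\">"
                   + "<abc:Token>A00179-02</abc:Token></abc:SendResult>";
        System.out.println(readToken(xml));
    }
}
```

The lookup takes the local name and the namespace URI separately, so no prefix appears anywhere in the calling code.
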
【Comments】:

【Answer 4】:

You can always assign the namespace to a variable, which makes it easy to change at any point in the future.
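
One way to read this suggestion, as a rough sketch: keep the namespace URI in a single constant and pass it to a lookup helper. The class `SchemaLookup`, the helper `firstText`, and the system property name `schema.ns` are all made up for illustration; the `"*"` wildcard is the special value that getElementsByTagNameNS documents as matching all namespaces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SchemaLookup {
    // Single place to change if the schema namespace ever moves.
    // The property name "schema.ns" is a hypothetical override hook.
    static final String SCHEMA_NS =
        System.getProperty("schema.ns", "http://mycompany.com/schema/");

    // Parses the XML with a namespace-aware DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Text of the first element with the given local name in namespace ns, or null.
    static String firstText(Document doc, String ns, String localName) {
        Node node = doc.getElementsByTagNameNS(ns, localName).item(0);
        return node == null ? null : node.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<x:SendResult xmlns:x=\"http://mycompany.com/schema/\">"
                           + "<x:Token>A00179-02</x:Token></x:SendResult>");
        // Lookup through the configurable namespace constant.
        System.out.println(firstText(doc, SCHEMA_NS, "Token"));
        // "*" matches any namespace, at the cost of also matching unrelated Token elements.
        System.out.println(firstText(doc, "*", "Token"));
    }
}
```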

【Comments】:
