How to get specific elements from a file with an XML-like structure in Java

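I have a .sic file whose structure resembles XML but is not quite the same. It contains a Channel2 section from which I want to read a few elements. The section looks like this: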

.
.
.
<SI name = "Channel2" type = "list">
           <SI name = "SecsPortConfig" type = "list">
              <SI name = "PortType" type = "string">'XXX'</SI>
              <SI name = "Protocol" type = "string">'XXX'</SI>
              <SI name = "Serial" type = "list">
                 <SI name = "Port" type = "int">'XXX'</SI>
                 <SI name = "Speed" type = "int">'XXXX'</SI>
              </SI>
              <SI name = "Socket" type = "list">
                 <SI name = "ConnectionMode" type = "string">'XXX'</SI>
                 <SI name = "LocalHost" type = "string">'XXX.XXX.XXX.XXX'</SI>
                 <SI name = "LocalPort" type = "int">'XXX'</SI>
                 <SI name = "RemoteHost" type = "string">'XXX.XXX.XXX'</SI>
                 <SI name = "RemotePort" type = "int">'XXX'</SI>
              </SI>
              <SI name = "HSMS" type = "list">
                 <SI name = "T5" type = "int">'XXX'</SI>
                 <SI name = "T6" type = "int">'XXX'</SI>
                 <SI name = "T7" type = "int">'XXX'</SI>
                 <SI name = "T8" type = "int">'XXX'</SI>
                 <SI name = "LinkTestTime" type = "int">'XXX'</SI>
              </SI>
              <SI name = "SECSI" type = "list">
                 <SI name = "T1" type = "int">'XXX'</SI>
                 <SI name = "T2" type = "int">'XXX'</SI>
                 <SI name = "T4" type = "int">'XXX'</SI>
                 <SI name = "RTY" type = "int">'XXX'</SI>
                 <SI name = "IsHost" type = "bool">'XXX'</SI>
                 <SI name = "IsMaster" type = "bool">'XXX'</SI>
                 <SI name = "InterleaveBlocks" type = "bool">'XXX'</SI>
              </SI>
              <SI name = "SECSII" type = "list">
                 <SI name = "DeviceID" type = "int">'XXX'</SI>
                 <SI name = "T3" type = "int">'XXX'</SI>
                 <SI name = "MultipleOpen" type = "bool">'XXX'</SI>
                 <SI name = "AutoDeviceID" type = "bool">'XXX'</SI>
              </SI>
              <SI name = "Log" type = "list">
                 <SI name = "LogCharError" type = "bool">'XXX'</SI>
                 <SI name = "LogCharEvent" type = "bool">'XXX'</SI>
                 <SI name = "LogCharReceive" type = "bool">'XXX'</SI>
                 <SI name = "LogCharSend" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIHsmsError" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIHsmsEvent" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIHsmsReceive" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIHsmsSend" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIIError" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIIEvent" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIIReceive" type = "bool">'XXX'</SI>
                 <SI name = "LogSecsIISend" type = "bool">'XXX'</SI>
              </SI>
           </SI>
           <SI name = "UseSeparateSECSLogFile" type = "bool">'XXX'</SI>
           <SI name = "Connected" type = "bool">'XXX'</SI>
           <SI name = "MessageFilters" type = "list">
              <SI name = "DeviceIDList" type = "list"/>
              <SI name = "StreamFunctionList" type = "list"/>
           </SI>
           <SI name = "SafeMessageFilters" type = "list">
              <SI name = "DeviceIDList" type = "list"/>
              <SI name = "StreamFunctionList" type = "list"/>
           </SI>
        </SI>
        .
        .
        .

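If this were an XML file, I could parse it and read out the elements, but how do I work with a file like this? I want to extract the elements RemoteHost and RemotePort. So far I have tried a BufferedReader: I copy the Channel2 section from the file into a String, but how do I extract the values of the specific elements I need? I could probably do it with substring and a few other String methods, but is there no simpler way? This is my code so far: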

    File file = new File("C:\\Users\\but\\Desktop\\ExternalswPassThroughSrv.sic");

    // Number of lines still to copy; set when the Channel2 section starts
    int counter = 0;

    BufferedReader br = new BufferedReader(new FileReader(file));

    String cl;
    String finalString = "";
    while ((cl = br.readLine()) != null) {
        if (cl.contains("Channel2")) {
            // The Channel2 section shown above is 63 lines long
            counter = 63;
        }
        if (counter != 0) {
            //System.out.println(cl);
            finalString += cl + "\n";
            counter--;
        }
    }
    br.close();
    System.out.println(finalString);

Answer

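// Assumes implLS is a DOMImplementationLS, parser is an LSParser created from it,
// and requestXML is a String holding the XML text.
LSInput input = implLS.createLSInput();
input.setStringData(requestXML);

Document myDoc = parser.parse(input);

// getElementsByTagName already returns a NodeList, so no cast is needed.
// It matches on the tag name, so this lookup only works if the element is actually named MessageFilters.
String si = myDoc.getElementsByTagName("MessageFilters").item(0).getFirstChild().getNodeValue();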

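You can get an XML element's node value by using getElementsByTagName. For that to work, however, the elements would need distinct names. This is not a full answer, just a hint. Please give it a try.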
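
Because every element in the file is named SI, a tag-name lookup alone cannot tell the entries apart. One possible workaround is to fetch all SI elements and filter on the name attribute. The following is a minimal sketch under that assumption; the class name `SiByName`, the helper `valueOf`, and the sample values are hypothetical and not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SiByName {
    // Returns the text of the first <SI> element (in document order) whose
    // name attribute equals siName, or null if no element matches.
    public static String valueOf(String xml, String siName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // Every element shares the tag name "SI", so collect them all ...
        NodeList all = doc.getElementsByTagName("SI");
        for (int i = 0; i < all.getLength(); i++) {
            Element si = (Element) all.item(i);
            // ... and tell them apart by their name attribute.
            if (siName.equals(si.getAttribute("name"))) {
                return si.getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample values; the real file contains its own.
        String xml = "<SI name = \"Socket\" type = \"list\">"
                + "<SI name = \"RemoteHost\" type = \"string\">'192.0.2.10'</SI>"
                + "<SI name = \"RemotePort\" type = \"int\">'5000'</SI>"
                + "</SI>";
        System.out.println(valueOf(xml, "RemoteHost"));
        System.out.println(valueOf(xml, "RemotePort"));
    }
}
```

Note that the returned text still contains the surrounding single quotes, and that the first match in document order wins. If several channels each contain a RemoteHost entry, a path-based lookup such as XPath is likely the safer choice.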

Another answer

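Since we do not know how the whole file is structured: even if it is not a complete XML document, you can extract the XML fragment from the rest of the file and turn it into a well-formed XML document by adding a root element.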

After that, you can parse it into a Document and use XPath to extract the information you need.

Here is a Java code sample for you (for clarity, I have not included the XML):

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.IOException;
import java.io.StringReader;

public class ConvertXml {
    public static void main(String[] args) throws ParserConfigurationException, IOException, SAXException, XPathExpressionException {
        // Your XML-like content
        String xmlString = "xml here";

        // Transform the XML fragment into well-formed XML by adding a root element
        String xmlStringWellformed = "<content>" + xmlString + "</content>";

        // Parse the well-formed XML
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xmlStringWellformed)));

        // Build the XPath expressions
        String xPathRemoteHost = "//SI[@name='Channel2']/SI[@name='SecsPortConfig']/SI[@name='Socket']/SI[@name='RemoteHost']/text()";
        String xPathRemotePort = "//SI[@name='Channel2']/SI[@name='SecsPortConfig']/SI[@name='Socket']/SI[@name='RemotePort']/text()";
        XPath xPath = XPathFactory.newInstance().newXPath();

        // Use XPath for extraction
        String remoteHost = (String) xPath.compile(xPathRemoteHost).evaluate(document, XPathConstants.STRING);
        String remotePort = (String) xPath.compile(xPathRemotePort).evaluate(document, XPathConstants.STRING);

        System.out.println("RemoteHost: " + remoteHost);
        System.out.println("RemotePort: " + remotePort);
    }
}

Source: Baeldung - Intro to XPath with Java
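
The values in the excerpt are wrapped in single quotes (for example 'XXX'), so the text returned by XPath keeps those quotes. The sketch below repeats the root-element and XPath steps on a small stand-in fragment and strips the quotes afterwards. The class name `SicSocketReader`, the helper `socketValue`, and the sample host and port are hypothetical, and it assumes the values are always quoted the way the excerpt shows.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SicSocketReader {
    // Looks up one entry below Channel2/SecsPortConfig/Socket and removes
    // the surrounding single quotes from its text, if present.
    // entryName is assumed to be a plain name without quote characters.
    public static String socketValue(String fragment, String entryName) throws Exception {
        // Add a root element so the fragment becomes a well-formed document.
        String wellFormed = "<content>" + fragment + "</content>";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormed)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        String expr = "//SI[@name='Channel2']/SI[@name='SecsPortConfig']"
                + "/SI[@name='Socket']/SI[@name='" + entryName + "']/text()";
        String raw = xPath.evaluate(expr, doc).trim();

        // Values look like 'XXX' in the file; drop the quotes when both are there.
        if (raw.length() >= 2 && raw.startsWith("'") && raw.endsWith("'")) {
            raw = raw.substring(1, raw.length() - 1);
        }
        return raw;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the Channel2 section of the .sic file.
        String fragment =
                "<SI name = \"Channel2\" type = \"list\">"
              + "<SI name = \"SecsPortConfig\" type = \"list\">"
              + "<SI name = \"Socket\" type = \"list\">"
              + "<SI name = \"RemoteHost\" type = \"string\">'192.0.2.10'</SI>"
              + "<SI name = \"RemotePort\" type = \"int\">'5000'</SI>"
              + "</SI></SI></SI>";

        String host = socketValue(fragment, "RemoteHost");
        int port = Integer.parseInt(socketValue(fragment, "RemotePort"));
        System.out.println("RemoteHost: " + host);
        System.out.println("RemotePort: " + port);
    }
}
```

To try this against the real file, the fragment could be loaded with `Files.readString(Path.of(...))` on Java 11 or later instead of being hard-coded. That only works if the rest of the file is also well-formed once the root element is added, which the excerpt alone cannot confirm.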
