What's a simple way in Java to evaluate an XPath on a string and return a result string?
【Posted】: 2011-05-10 23:54:54 【Question】: A simple question that needs a simple answer.
For example:
String xml = "<car><manufacturer>toyota</manufacturer></car>";
String xpath = "/car/manufacturer";
assertEquals("toyota",evaluate(xml, xpath));
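How can I write, in a simple and readable way, an evaluate method that works for any given well-formed XML and XPath?
There are obviously many ways to achieve this, but most of them look very verbose.
Am I missing a simple approach or library that does this?
For the case where multiple nodes are returned, I just want a string representation of them.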
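【Answer 1】: Here you go. The following can be done with Java SE alone (the assertEquals call additionally assumes JUnit is on the classpath):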
import static org.junit.Assert.assertEquals;

import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class Demo {
    public static void main(String[] args) throws Exception {
        String xml = "<car><manufacturer>toyota</manufacturer></car>";
        String xpath = "/car/manufacturer";
        XPath xPath = XPathFactory.newInstance().newXPath();
        // evaluate(String, InputSource) returns the string value of the first match
        assertEquals("toyota", xPath.evaluate(xpath, new InputSource(new StringReader(xml))));
    }
}
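The question asks for an evaluate(xml, xpath) method and for a string representation when several nodes match. The following is a minimal sketch that wraps the Java SE calls above; the class name XPathEval and the ", " separator are assumptions made for illustration, not part of the original answer. It requests a node set, so it only suits expressions that select nodes; an expression such as string(/car) would make the NODESET conversion throw an XPathExpressionException.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is an assumption for this sketch.
class XPathEval {

    /** Evaluates xpath against xml and joins the text of every matching node with ", ". */
    static String evaluate(String xml, String xpath) throws XPathExpressionException {
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(
                xpath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getTextContent() returns the concatenated text of an element,
            // or the value of an attribute or text node
            values.add(nodes.item(i).getTextContent());
        }
        return String.join(", ", values);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<car><manufacturer>toyota</manufacturer></car>";
        System.out.println(evaluate(xml, "/car/manufacturer")); // prints "toyota"
    }
}
```

When nothing matches, this sketch returns an empty string rather than failing; whether that is the right behaviour depends on the caller.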
【Answer 2】: For this use case, the XMLUnit library may be a perfect fit: http://xmlunit.sourceforge.net/userguide/html/index.html#Xpath%20Tests
It provides some additional assertion methods.
For example:
assertXpathEvaluatesTo("toyota", "/car/manufacturer",
"<car><manufacturer>toyota</manufacturer></car>");
【Answer 3】: Using the Xml class from https://github.com/guppy4j/libraries/tree/master/messaging-impl:
Xml xml = new Xml("<car><manufacturer>toyota</manufacturer></car>");
assertEquals("toyota", xml.get("/car/manufacturer"));
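【Answer 4】: I have written assertXPath() in three languages so far. Ruby and Python are the best, because they can also parse HTML, with all its idiosyncrasies, via libxml2 and then run XPath on it. For XML, or for carefully controlled HTML that has no glitches such as a bare < used as a JavaScript "less than", here is my assertion suite: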
private static final XPathFactory xpathFactory = XPathFactory.newInstance();
private static final XPath xpath = xpathFactory.newXPath();

private static @NonNull Document assertHtml(@NonNull String xml) {
    try {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Escape the bare "<" in inline JavaScript so the parser accepts it.
            // Because JavaScript ruined HTML's ability to someday be real XML...
            ByteArrayInputStream stream = new ByteArrayInputStream(xml.replaceAll("i < len;", "i &lt; len;").getBytes());
            return builder.parse(stream);
        } catch (SAXParseException e) {
            // Retry once with a synthetic root element around the fragment
            if (e.getLocalizedMessage().startsWith("Unexpected token") && !xml.startsWith("<xml>"))
                return assertHtml("<xml>" + xml + "</xml>");
            throw e; // a GOTO to 2 lines down...
        }
    } catch (Throwable e) {
        fail(e.getLocalizedMessage());
    }
    return null;
}

private static @NonNull List<String> assertXPaths(@NonNull Node node, @NonNull String xpathExpression) {
    NodeList nodes = evaluateXPath(node, xpathExpression);
    List<String> values = new ArrayList<>();
    if (nodes != null)
        for (int i = 0; i < nodes.getLength(); i++) {
            Node item = nodes.item(i);
            // item.getTextContent();
            // item.getNodeName();
            values.add(item.getNodeValue());
        }
    if (values.size() == 0)
        fail("XPath not found: " + xpathExpression + "\n\nin: " + nodeToString(node) + "\n");
    return values;
}

private static @NonNull Node assertXPath(@NonNull Node node, @NonNull String xpathExpression) {
    NodeList nodes = evaluateXPath(node, xpathExpression);
    if (nodes != null && nodes.getLength() > 0)
        return nodes.item(0);
    fail("XPath not found: " + xpathExpression + "\n\nin: " + nodeToString(node) + "\n");
    return null; // this can't happen
}

private static NodeList evaluateXPath(@NonNull Node node, @NonNull String xpathExpression) {
    NodeList nodes = null;
    try {
        XPathExpression expr = xpath.compile(xpathExpression);
        nodes = (NodeList) expr.evaluate(node, XPathConstants.NODESET);
    } catch (XPathExpressionException e) {
        fail(e.getLocalizedMessage());
    }
    return nodes;
}

private static void assertXPath(Node node, String xpathExpression, String reference) {
    List<String> nodes = assertXPaths(node, xpathExpression);
    assertEquals(1, nodes.size()); // CONSIDER decorate these assertion diagnostics with nodeToString(). And don't check for one text() - join them all together
    assertEquals(reference, nodes.get(0).trim()); // CONSIDER same complaint: We need to see the nodeToString() here
}

private static void refuteXPath(@NonNull Node node, @NonNull String xpathExpression) {
    NodeList nodes = evaluateXPath(node, xpathExpression);
    if (nodes.getLength() != 0)
        fail("XPath should not be found: " + xpathExpression); // CONSIDER decorate this with the contents of the node
}

private static @NonNull String nodeToString(@NonNull Node node) {
    StringWriter sw = new StringWriter();
    Transformer t = null;
    try {
        t = TransformerFactory.newInstance().newTransformer();
    } catch (TransformerConfigurationException e) {
        fail(e.getLocalizedMessage());
    }
    t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    t.setOutputProperty(OutputKeys.INDENT, "yes");
    try {
        t.transform(new DOMSource(node), new StreamResult(sw));
    } catch (TransformerException e) {
        fail(e.getLocalizedMessage());
    }
    return sw.toString();
}
Use them recursively, like this:
Document doc = assertHtml(myHtml);
Node blockquote = assertXPath(doc, "//blockquote[ 'summary_7' = @id ]");
assertXPath(blockquote, ".//span[ contains(., 'Mammal') and strong/text() = 'anteater' ]");
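The benefit of finding one node and then asserting a path relative to that node (via .//) is that on failure nodeToString() reports only that node's contents, such as my <blockquote>. The assertion diagnostic does not contain the entire document, so it is much easier to read.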
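To see the scoping effect in isolation, here is a small self-contained sketch that uses only Java SE. The class ScopedXPathDemo, its helper methods, and the sample markup are assumptions made for illustration and are not part of the suite above. It finds one node, evaluates a relative expression against it, and serializes only that subtree.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names here are assumptions for this sketch.
class ScopedXPathDemo {
    private static final XPath XPATH = XPathFactory.newInstance().newXPath();

    /** Parses a well-formed XML or XHTML string into a DOM document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the first node matching the expression relative to context, or null. */
    static Node find(Node context, String expression) throws Exception {
        return (Node) XPATH.evaluate(expression, context, XPathConstants.NODE);
    }

    /** Serializes only the given node and its descendants, not the whole document. */
    static String serialize(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(sw));
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><body><p>intro</p>"
                + "<blockquote id=\"summary_7\"><span><strong>anteater</strong> Mammal</span></blockquote>"
                + "</body></html>";
        Document doc = parse(html);
        Node blockquote = find(doc, "//blockquote[@id = 'summary_7']");
        // The leading ".//" keeps the search inside the blockquote subtree
        Node span = find(blockquote, ".//span[contains(., 'Mammal') and strong/text() = 'anteater']");
        System.out.println("span found: " + (span != null));
        // Only the blockquote markup is printed; the <p>intro</p> sibling is left out
        System.out.println(serialize(blockquote));
    }
}
```

An expression that starts with // instead of .// would search the whole document again even when a context node is passed, which defeats the purpose of narrowing the scope.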