Compare two documents where both parent elements and child elements are ordered differently
[Posted]: 2014-03-05 06:43:31
[Question]: I am trying to unit test some methods that generate XML. I have an expected XML string and a result string, and after googling and searching Stack Overflow I found XMLUnit. However, it does not seem to handle one particular case: repeated elements that appear in a different order and that themselves contain elements in a different order. For example:
Expected XML:
<graph>
  <parent>
    <foo>David</foo>
    <bar>Rosalyn</bar>
  </parent>
  <parent>
    <bar>Alexander</bar>
    <foo>Linda</foo>
  </parent>
</graph>
Actual XML:
<graph>
  <parent>
    <foo>Linda</foo>
    <bar>Alexander</bar>
  </parent>
  <parent>
    <bar>Rosalyn</bar>
    <foo>David</foo>
  </parent>
</graph>
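You can see that the parent node repeats and that its contents can be in any order. These two XML fragments should be equivalent, but nothing in the Stack Overflow examples I have seen handles this. (Best way to compare 2 XML documents in Java) (How can I compare two similar XML files in XMLUnit)
I have resorted to creating documents from the XML strings, stepping through each expected parent node, and comparing it against every actual parent node to see whether one of them is equivalent.
To me this looks like reinventing the wheel for what should be a common comparison. XMLUnit seems to do a lot, and maybe I have missed something, but as far as I can tell it falls short in this particular case.
Is there a simpler or better way to do this?
My solution: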
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();

// parse and normalize expected xml
Document expectedXMLDoc = db.parse(new ByteArrayInputStream(resultXML.getBytes()));
expectedXMLDoc.normalizeDocument();

// parse and normalize actual xml
Document actualXMLDoc = db.parse(new ByteArrayInputStream(actual.getXml().getBytes()));
actualXMLDoc.normalizeDocument();

// expected and actual parent nodes
NodeList expectedParentNodes = expectedXMLDoc.getLastChild().getChildNodes();
NodeList actualParentNodes = actualXMLDoc.getLastChild().getChildNodes();

// assert same amount of nodes in actual and expected
assertEquals("actual XML does not have expected amount of Parent nodes", expectedParentNodes.getLength(), actualParentNodes.getLength());

// loop through expected parent nodes
for (int i = 0; i < expectedParentNodes.getLength(); i++) {
    // create doc from node
    Node expectedParentNode = expectedParentNodes.item(i);
    Document expectedParentDoc = db.newDocument();
    Node importedExpectedNode = expectedParentDoc.importNode(expectedParentNode, true);
    expectedParentDoc.appendChild(importedExpectedNode);

    boolean hasSimilar = false;
    StringBuilder messages = new StringBuilder();

    // for each expected parent, find a similar parent
    for (int j = 0; j < actualParentNodes.getLength(); j++) {
        // create doc from node
        Node actualParentNode = actualParentNodes.item(j);
        Document actualParentDoc = db.newDocument();
        Node importedActualNode = actualParentDoc.importNode(actualParentNode, true);
        actualParentDoc.appendChild(importedActualNode);

        // XMLUnit Diff
        Diff diff = new Diff(expectedParentDoc, actualParentDoc);
        messages.append(diff.toString());
        boolean similar = diff.similar();
        if (similar) {
            hasSimilar = true;
        }
    }

    // assert it found a similar parent node
    assertTrue("expected and actual XML nodes are not equivalent " + messages, hasSimilar);
}
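[Answer 1]: Use an XSL identity transform with an added <xsl:sort .../> to reorder the nodes in each document by name, then compare the sorted output. You will probably need specific sort keys for certain nodes (such as the parent nodes here) to order their inner content.
Here is a skeleton to get you started: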
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Identity Transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <!-- Copy attributes first: adding an attribute after a child node is an error in XSLT 1.0 -->
      <xsl:apply-templates select="@*"/>
      <!-- Then copy child nodes, sorted by name -->
      <xsl:apply-templates select="node()">
        <xsl:sort select="name(.)"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
  <!-- Special handling for graph/parent nodes -->
  <xsl:template match="graph">
    <!-- Copy the graph element itself so the output keeps a single root -->
    <xsl:copy>
      <!-- Copy attributes using default above -->
      <xsl:apply-templates select="@*"/>
      <!-- Sort parent nodes by text of bar node -->
      <xsl:apply-templates select="parent">
        <xsl:sort select="bar/text()"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
This works for the sample you posted. Adjust as needed for your real data.
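To show how the sorted output might actually be compared from a test, here is a minimal Java sketch that uses only the JDK's built-in javax.xml.transform API. The class name SortedXmlCompare and the methods normalize and equivalent are hypothetical names chosen for illustration. The embedded stylesheet is a compact variant of the skeleton above, with three assumptions added: xsl:strip-space to drop indentation, an xsl:copy around graph so the output keeps its root element, and attributes copied before child nodes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SortedXmlCompare {
    // Assumed stylesheet: sorts children by name and <parent> elements by the text of <bar>.
    static final String SORT_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='@*'/>"
      + "<xsl:apply-templates select='node()'><xsl:sort select='name(.)'/></xsl:apply-templates>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='graph'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='@*'/>"
      + "<xsl:apply-templates select='parent'><xsl:sort select='bar/text()'/></xsl:apply-templates>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the sorting stylesheet over one XML string and returns the serialized result.
    static String normalize(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(SORT_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not transform XML", e);
        }
    }

    // Two documents count as equivalent when their sorted forms serialize identically.
    static boolean equivalent(String expectedXml, String actualXml) {
        return normalize(expectedXml).equals(normalize(actualXml));
    }
}
```

In a unit test this could be used as assertTrue(SortedXmlCompare.equivalent(expectedXml, actualXml)). Comparing the two normalized strings with assertEquals instead would also show where they differ. The sort key bar/text() is specific to the sample, so other document structures would need their own keys.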
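[Comments]:
This looks like it would cut out a lot of my code, which I like. Would I be able to create an XSL generic enough to sort any XML document?
You can't, because you still have to decide how to sort a series of elements that share the same name. How to sort them depends on the inner data that determines a unique sort order, and that depends on the actual document structure.
[Answer 2]: You could use a recursive function, so it works for any XML structure where element order does not matter. Here is a sketch in pseudocode: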
public boolean isEqual(Node node1, Node node2) {
    if nodes are not of the same type
        return false;
    if their values are not the same
        return false;
    if their numbers of children are not the same
        return false;
    if they have no children
        return true;
    // compare each child of node1 with the first child of node2
    matchFound = false;
    for each child of node1
        if (isEqual(node2.child(0), child)) {
            matchFound = true;
            break;
        }
    if (!matchFound)
        return false;
    // this mutates both nodes, so work on copies if the originals must stay intact
    remove the matched child from the children of node1;
    remove node2.child(0) from the children of node2;
    return isEqual(node1, node2);
}
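The pseudocode above can be turned into runnable Java along the following lines. This is only a sketch using the JDK's DOM API, not the answerer's own code: the class name UnorderedXmlCompare and the helpers significantChildren and equivalent are hypothetical. Instead of removing nodes from the trees, it tracks unmatched children in a list. It ignores whitespace-only text and comments, and it does not compare attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UnorderedXmlCompare {
    // True when both nodes have the same type, name, value, and the same children in any order.
    static boolean isEqual(Node node1, Node node2) {
        if (node1.getNodeType() != node2.getNodeType()
                || !node1.getNodeName().equals(node2.getNodeName())
                || !Objects.equals(node1.getNodeValue(), node2.getNodeValue())) {
            return false;
        }
        List<Node> children1 = significantChildren(node1);
        List<Node> unmatched = significantChildren(node2);
        if (children1.size() != unmatched.size()) {
            return false;
        }
        // Pair every child of node1 with a distinct, not yet matched child of node2.
        for (Node child : children1) {
            boolean matchFound = false;
            for (int j = 0; j < unmatched.size(); j++) {
                if (isEqual(child, unmatched.get(j))) {
                    unmatched.remove(j); // remove by index so it cannot be matched twice
                    matchFound = true;
                    break;
                }
            }
            if (!matchFound) {
                return false;
            }
        }
        return true;
    }

    // Child nodes without whitespace-only text nodes and comments.
    static List<Node> significantChildren(Node node) {
        List<Node> result = new ArrayList<>();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            boolean blankText = child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty();
            if (!blankText && child.getNodeType() != Node.COMMENT_NODE) {
                result.add(child);
            }
        }
        return result;
    }

    // Parses both strings and compares their root elements.
    static boolean equivalent(String xml1, String xml2) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Node root1 = db.parse(new InputSource(new StringReader(xml1))).getDocumentElement();
            Node root2 = db.parse(new InputSource(new StringReader(xml2))).getDocumentElement();
            return isEqual(root1, root2);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Taking the first match is safe here because the comparison is symmetric and transitive: any two children that both match the same node also match each other, so it does not matter which one is consumed. The nested loops make this quadratic in the number of siblings at each level, which is usually acceptable for test fixtures.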
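[Comments]:
That is basically what XMLUnit does for me. It also handles checking attributes and tracking differences.
Yes, XMLUnit should do that! But sometimes your own simple method is easier to debug than a complicated library. If you want to track down your XMLUnit problem, overriding the differenceFound method may help.
[Answer 3]: Just realized I had not selected an answer for this. I ended up using something very similar to my original solution. Here is the final solution that worked for me. I have wrapped it in a class for use with JUnit, so the methods can be used like any other JUnit assertion.
If all children need to be in order, as in my case, you can run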
assertEquivalentXml(expectedXML, testXML, null, null);
If some nodes are expected to have children in random order and/or some attributes need to be ignored:
assertEquivalentXml(expectedXML, testXML,
    new String[]{"dataset", "categories"}, new String[]{"color", "anchorBorderColor", "anchorBgColor"});
The class is as follows:
// Imports assume XMLUnit 1.x (org.custommonkey.xmlunit) and junit.framework.Assert.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import junit.framework.Assert;

import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.Difference;
import org.custommonkey.xmlunit.DifferenceConstants;
import org.custommonkey.xmlunit.DifferenceListener;
import org.custommonkey.xmlunit.ElementNameAndAttributeQualifier;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 * A set of methods that assert XML equivalence specifically for XmlProvider classes. Extends
 * <code>junit.framework.Assert</code>, meaning that these methods are recognised as assertions by junit.
 *
 * @author munick
 */
public class XmlProviderAssertions extends Assert {

    /**
     * Asserts two xml strings are equivalent. Nodes are not expected to be in order. Order can be compared among the
     * children of the top parent node by adding their names to nodesWithOrderedChildren
     * (e.g. in <graph><dataset><set value="1"/><set value="2"/></dataset></graph> the top parent node is graph
     * and we can expect the children of dataset to be in order by adding "dataset" to nodesWithOrderedChildren).
     *
     * All attribute names and values are compared unless their name is in attributesToIgnore in which case only the
     * name is compared and any difference in value is ignored.
     *
     * @param expectedXML the expected xml string
     * @param testXML the xml string being tested
     * @param nodesWithOrderedChildren names of nodes whose children should be in order
     * @param attributesToIgnore names of attributes whose values should be ignored
     */
    public static void assertEquivalentXml(String expectedXML, String testXML, String[] nodesWithOrderedChildren, String[] attributesToIgnore) {
        Set<String> setOfNodesWithOrderedChildren = new HashSet<String>();
        if (nodesWithOrderedChildren != null) {
            Collections.addAll(setOfNodesWithOrderedChildren, nodesWithOrderedChildren);
        }

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setCoalescing(true);
        dbf.setIgnoringElementContentWhitespace(true);
        dbf.setIgnoringComments(true);
        DocumentBuilder db = null;
        try {
            db = dbf.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            fail("Error testing XML");
        }

        Document expectedXMLDoc = null;
        Document testXMLDoc = null;
        try {
            expectedXMLDoc = db.parse(new ByteArrayInputStream(expectedXML.getBytes()));
            expectedXMLDoc.normalizeDocument();
            testXMLDoc = db.parse(new ByteArrayInputStream(testXML.getBytes()));
            testXMLDoc.normalizeDocument();
        } catch (SAXException e) {
            fail("Could not parse testXML");
        } catch (IOException e) {
            fail("Could not read testXML");
        }

        NodeList expectedChildNodes = expectedXMLDoc.getLastChild().getChildNodes();
        NodeList testChildNodes = testXMLDoc.getLastChild().getChildNodes();
        assertEquals("Test XML does not have expected amount of child nodes", expectedChildNodes.getLength(), testChildNodes.getLength());

        // compare parent nodes (shallow copy: the root element and its attributes only)
        Document expectedDEDoc = getNodeAsDocument(expectedXMLDoc.getDocumentElement(), db, false);
        Document testDEDoc = getNodeAsDocument(testXMLDoc.getDocumentElement(), db, false);
        Diff diff = new Diff(expectedDEDoc, testDEDoc);
        assertTrue("Test XML parent node doesn't match expected XML parent node. " + diff.toString(), diff.similar());

        // compare child nodes
        for (int i = 0; i < expectedChildNodes.getLength(); i++) {
            // expected child node
            Node expectedChildNode = expectedChildNodes.item(i);
            // skip text nodes
            if (expectedChildNode.getNodeType() == Node.TEXT_NODE) {
                continue;
            }
            // convert to document to use in Diff
            Document expectedChildDoc = getNodeAsDocument(expectedChildNode, db, true);

            boolean hasSimilar = false;
            StringBuilder messages = new StringBuilder();

            for (int j = 0; j < testChildNodes.getLength(); j++) {
                // find child node in test xml
                Node testChildNode = testChildNodes.item(j);
                // skip text nodes
                if (testChildNode.getNodeType() == Node.TEXT_NODE) {
                    continue;
                }
                // create doc from node
                Document testChildDoc = getNodeAsDocument(testChildNode, db, true);

                diff = new Diff(expectedChildDoc, testChildDoc);
                // if it doesn't contain order specific nodes, then use the elem and attribute qualifier, otherwise use the default
                if (!setOfNodesWithOrderedChildren.contains(expectedChildDoc.getDocumentElement().getNodeName())) {
                    diff.overrideElementQualifier(new ElementNameAndAttributeQualifier());
                }
                if (attributesToIgnore != null) {
                    diff.overrideDifferenceListener(new IgnoreNamedAttributesDifferenceListener(attributesToIgnore));
                }

                messages.append(diff.toString());
                boolean similar = diff.similar();
                if (similar) {
                    hasSimilar = true;
                }
            }

            assertTrue("Test XML does not match expected XML. " + messages, hasSimilar);
        }
    }

    private static Document getNodeAsDocument(Node node, DocumentBuilder db, boolean deep) {
        // create doc from node
        Document nodeDoc = db.newDocument();
        Node importedNode = nodeDoc.importNode(node, deep);
        nodeDoc.appendChild(importedNode);
        return nodeDoc;
    }
}
/**
 * Custom difference listener that ignores differences in attribute values for specified attribute names. Used to
 * ignore color attribute differences in FusionChartXml equivalence.
 */
class IgnoreNamedAttributesDifferenceListener implements DifferenceListener {
    Set<String> attributeBlackList;

    public IgnoreNamedAttributesDifferenceListener(String[] attributeNames) {
        attributeBlackList = new HashSet<String>();
        Collections.addAll(attributeBlackList, attributeNames);
    }

    public int differenceFound(Difference difference) {
        int differenceId = difference.getId();
        if (differenceId == DifferenceConstants.ATTR_VALUE_ID) {
            // treat a value difference on a blacklisted attribute as no difference
            if (attributeBlackList.contains(difference.getControlNodeDetail().getNode().getNodeName())) {
                return DifferenceListener.RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL;
            }
        }
        return DifferenceListener.RETURN_ACCEPT_DIFFERENCE;
    }

    public void skippedComparison(Node node, Node node1) {
        // left empty
    }
}