Compare two documents where both parent elements and child elements are ordered differently

Posted: 2014-03-05 06:43:31

Question:

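I am trying to unit test some methods that generate XML. I have an expected XML string and a result string, and after googling and searching Stack Overflow I found XMLUnit. However, it does not seem to handle one particular case: repeated elements that appear in a different order and that themselves contain elements in a different order. For example:

Expected XML: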

<graph>
  <parent>
    <foo>David</foo>
    <bar>Rosalyn</bar>
  </parent>
  <parent>
    <bar>Alexander</bar>
    <foo>Linda</foo>
  </parent>
</graph>

Actual XML:

<graph>
  <parent>
    <foo>Linda</foo>
    <bar>Alexander</bar>
  </parent>
  <parent>
    <bar>Rosalyn</bar>
    <foo>David</foo>
  </parent>
</graph>

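As you can see, the parent nodes repeat and their contents can be in any order. These two XML fragments should be considered equivalent, but nothing in the Stack Overflow examples I have seen handles this (Best way to compare 2 XML documents in Java; How can I compare two similar XML files in XMLUnit).

I have resorted to creating Documents from the XML strings, stepping through each expected parent node, and comparing it against every actual parent node to see whether one of them is equivalent.

This feels like reinventing the wheel for what should be a fairly common comparison. XMLUnit seems to do a lot, and perhaps I have missed something, but as far as I can tell it falls short in this particular case.

Is there a simpler or better way to do this?

My solution: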

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();
// parse and normalize expected xml
Document expectedXMLDoc = db.parse(new ByteArrayInputStream(resultXML.getBytes()));
expectedXMLDoc.normalizeDocument();
// parse and normalize actual xml
Document actualXMLDoc = db.parse(new ByteArrayInputStream(actual.getXml().getBytes()));
actualXMLDoc.normalizeDocument();
// expected and actual parent nodes
NodeList expectedParentNodes = expectedXMLDoc.getLastChild().getChildNodes();
NodeList actualParentNodes = actualXMLDoc.getLastChild().getChildNodes();

// assert same amount of nodes in actual and expected
assertEquals("actual XML does not have expected amount of Parent nodes", expectedParentNodes.getLength(), actualParentNodes.getLength());

// loop through expected parent nodes
for(int i=0; i < expectedParentNodes.getLength(); i++) {
    // create doc from node
    Node expectedParentNode = expectedParentNodes.item(i);
    Document expectedParentDoc = db.newDocument();
    Node importedExpectedNode = expectedParentDoc.importNode(expectedParentNode, true);
    expectedParentDoc.appendChild(importedExpectedNode);

    boolean hasSimilar = false;
    StringBuilder messages = new StringBuilder();

    // for each expected parent, find a similar parent
    for(int j=0; j < actualParentNodes.getLength(); j++) {
        // create doc from node
        Node actualParentNode = actualParentNodes.item(j);
        Document actualParentDoc = db.newDocument();
        Node importedActualNode = actualParentDoc.importNode(actualParentNode, true);
        actualParentDoc.appendChild(importedActualNode);

        // XMLUnit Diff
        Diff diff = new Diff(expectedParentDoc, actualParentDoc);
        messages.append(diff.toString());
        boolean similar = diff.similar();
        if(similar) {
            hasSimilar = true;
        }
    }
    // assert it found a similar parent node
    assertTrue("expected and actual XML nodes are not equivalent " + messages, hasSimilar);
}
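One detail worth checking in code like the above: according to the JDK Javadoc, setIgnoringElementContentWhitespace only takes effect when the parser is in validating mode, so with a default parser getChildNodes() on an indented document also returns whitespace-only text nodes. The sketch below illustrates this with the JDK alone and shows one possible way to keep only element children. The class and method names (ElementChildren, parseRoot, elementChildren) are hypothetical helpers made up for this illustration, not part of XMLUnit or of the code above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementChildren {

    /** Parses an XML string with a default (non-validating) parser and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns only the element children of a node, skipping text and comment nodes. */
    static List<Element> elementChildren(Node parent) {
        List<Element> elements = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                elements.add((Element) children.item(i));
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        Element graph = parseRoot("<graph>\n  <parent><foo>David</foo></parent>\n"
                + "  <parent><foo>Linda</foo></parent>\n</graph>");
        // 5: two <parent> elements plus three whitespace-only text nodes
        System.out.println(graph.getChildNodes().getLength());
        // 2: only the <parent> elements
        System.out.println(elementChildren(graph).size());
    }
}
```

Iterating over a filtered list like this avoids comparing whitespace text nodes against parent elements, and avoids trying to append a text node as the root of a new Document.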

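Answer 1:

Use an XSL identity transform with an added <xsl:sort .../> to reorder the nodes in each document by name, then compare the sorted output. For some nodes (e.g. the parent nodes) you may need a specific sort key so that they are ordered by their inner content.

Here is a skeleton to get you started: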

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- Identity Transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()">
                <xsl:sort select="name(.)"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <!-- Special handling for graph/parent nodes -->
    <xsl:template match="graph">
        <!-- Copy the graph element itself so the output keeps its root -->
        <xsl:copy>
            <!-- Sort attributes using default above -->
            <xsl:apply-templates select="@*"/>
            <!-- Sort parent nodes by text of bar node -->
            <xsl:apply-templates select="parent">
                <xsl:sort select="bar/text()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

This works for the sample you posted. Adjust as needed for your real data.
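As a rough illustration of how the sort-then-compare idea could be driven from a Java test, here is a minimal sketch using only the JDK's javax.xml.transform API. The class name SortedXmlCompare and the method sort are made up for this example. The embedded stylesheet is a condensed variant of the skeleton above with a few assumptions added: it strips whitespace-only text nodes, omits the XML declaration, wraps the graph template in xsl:copy, and sorts parent elements by the string value of bar.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SortedXmlCompare {

    // Condensed sorting stylesheet (an assumption-laden variant of the skeleton in this answer).
    static final String SORT_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' indent='no' omit-xml-declaration='yes'/>"
        + "<xsl:strip-space elements='*'/>"
        // identity transform: copy attributes, then child nodes sorted by name
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*'/>"
        + "<xsl:apply-templates select='node()'><xsl:sort select='name()'/></xsl:apply-templates>"
        + "</xsl:copy></xsl:template>"
        // graph: order the repeated parent elements by the value of their bar child
        + "<xsl:template match='graph'>"
        + "<xsl:copy><xsl:apply-templates select='@*'/>"
        + "<xsl:apply-templates select='parent'><xsl:sort select='bar'/></xsl:apply-templates>"
        + "</xsl:copy></xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the sorting stylesheet over an XML string and returns the serialized result. */
    static String sort(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(SORT_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not sort XML", e);
        }
    }

    public static void main(String[] args) {
        String expected = "<graph><parent><foo>David</foo><bar>Rosalyn</bar></parent>"
                + "<parent><bar>Alexander</bar><foo>Linda</foo></parent></graph>";
        String actual = "<graph><parent><foo>Linda</foo><bar>Alexander</bar></parent>"
                + "<parent><bar>Rosalyn</bar><foo>David</foo></parent></graph>";
        // Both documents are normalized to the same ordering before the comparison.
        System.out.println(sort(expected).equals(sort(actual)));
    }
}
```

In a unit test, the two sorted strings could be passed to assertEquals, or handed to XMLUnit for a more detailed difference report. The sort key for parent is still specific to this document structure.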

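Comments:

This looks like it would cut down a lot of my code, which I like. Would I be able to create an XSL generic enough to sort any XML document?

You can't, because you still have to decide how to sort a run of elements with the same name. How to sort them depends on the inner data that determines a unique ordering, and that in turn depends on the actual document structure.

Answer 2: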

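You can use a recursive function, so that it works for any XML structure in which the order of elements does not matter. Here is a sketch:

import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public final class UnorderedXmlCompare {

    // Returns true if both subtrees match, ignoring the order of child nodes.
    // Attributes are not compared, and whitespace text nodes count as children.
    public static boolean isEqual(Node node1, Node node2) {
        if (node1.getNodeType() != node2.getNodeType())
            return false;
        if (!Objects.equals(node1.getNodeName(), node2.getNodeName()))
            return false;
        if (!Objects.equals(node1.getNodeValue(), node2.getNodeValue()))
            return false;

        List<Node> children1 = childrenOf(node1);
        List<Node> children2 = childrenOf(node2);
        if (children1.size() != children2.size())
            return false;

        // match each child of node1 with a distinct, not yet matched child of node2
        for (Node child1 : children1) {
            boolean matchFound = false;
            for (int i = 0; i < children2.size(); i++) {
                if (isEqual(child1, children2.get(i))) {
                    // remove the matched node so it cannot be matched twice
                    children2.remove(i);
                    matchFound = true;
                    break;
                }
            }
            if (!matchFound)
                return false;
        }
        return true;
    }

    // copies the children into a list so matches can be removed without modifying the DOM
    private static List<Node> childrenOf(Node node) {
        List<Node> children = new ArrayList<>();
        NodeList list = node.getChildNodes();
        for (int i = 0; i < list.getLength(); i++)
            children.add(list.item(i));
        return children;
    }
}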
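A related way to get an order-insensitive comparison, shown here only as a sketch and not as what this answer or XMLUnit does, is to reduce each element to a canonical string in which attributes and children are sorted, then compare the two strings. The class name CanonicalXml and its methods are hypothetical. The sketch assumes that whitespace-only text can be ignored, that the relative order of text and elements in mixed content does not matter, and that CDATA sections and comments can be skipped. The bracketed string format is for comparison only and is not valid XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class CanonicalXml {

    /** Builds a string for an element in which attributes and children appear in sorted order. */
    static String canonical(Element element) {
        List<String> attributes = new ArrayList<>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            attributes.add(map.item(i).getNodeName() + "=" + map.item(i).getNodeValue());
        }
        Collections.sort(attributes);

        List<String> children = new ArrayList<>();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                // recurse first, so siblings are ordered by their full content
                children.add(canonical((Element) child));
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && !child.getNodeValue().trim().isEmpty()) {
                children.add("text:" + child.getNodeValue().trim());
            }
        }
        Collections.sort(children);
        return "<" + element.getNodeName() + attributes + children + ">";
    }

    /** Parses an XML string and returns the canonical string of its root element. */
    static String canonical(String xml) {
        try {
            return canonical(DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String expected = "<graph><parent><foo>David</foo><bar>Rosalyn</bar></parent>"
                + "<parent><bar>Alexander</bar><foo>Linda</foo></parent></graph>";
        String actual = "<graph><parent><foo>Linda</foo><bar>Alexander</bar></parent>"
                + "<parent><bar>Rosalyn</bar><foo>David</foo></parent></graph>";
        System.out.println(canonical(expected).equals(canonical(actual))); // prints true
    }
}
```

Because siblings are sorted by their whole canonical content rather than by name alone, repeated elements with the same name need no document-specific sort key. The trade-off is that a failing comparison only tells you the strings differ, not where.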

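Comments:

This is basically what XMLUnit does for me. It also handles checking attributes and keeping track of the differences.

Yes, XMLUnit should do this! But sometimes your own simple method is easier to debug than a complex library. If you want to track down your XMLUnit problem, overriding the differenceFound method might help.

Answer 3: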

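I just realized I never selected an answer for this. I ended up using something very similar to my own solution. Here is the final version that worked for me. I wrapped it in a class for use with JUnit, so the methods can be called like any other JUnit assertion.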

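If no node needs its children compared in order and no attributes need to be ignored, you can run

assertEquivalentXml(expectedXML, testXML, null, null);

If certain nodes are expected to have their children in order and/or the values of certain attributes need to be ignored:

assertEquivalentXml(expectedXML, testXML,
                new String[]{"dataset", "categories"}, new String[]{"color", "anchorBorderColor", "anchorBgColor"});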

The class is as follows:

// XMLUnit 1.x and JUnit 3 style imports
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import junit.framework.Assert;

import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.Difference;
import org.custommonkey.xmlunit.DifferenceConstants;
import org.custommonkey.xmlunit.DifferenceListener;
import org.custommonkey.xmlunit.ElementNameAndAttributeQualifier;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 * A set of methods that assert XML equivalence specifically for XmlProvider classes. Extends 
 * <code>junit.framework.Assert</code>, meaning that these methods are recognised as assertions by junit.
 *
 * @author munick
 */
public class XmlProviderAssertions extends Assert {

    /**
     * Asserts two xml strings are equivalent. Nodes are not expected to be in order. Order can be compared among the 
     * children of the top parent node by adding their names to nodesWithOrderedChildren 
     * (e.g. in <graph><dataset><set value="1"/><set value="2"/></dataset></graph> the top parent node is graph 
     * and we can expect the children of dataset to be in order by adding "dataset" to nodesWithOrderedChildren).
     * 
     * All attribute names and values are compared unless their name is in attributesToIgnore in which case only the 
     * name is compared and any difference in value is ignored.
     * 
     * @param expectedXML the expected xml string 
     * @param testXML the xml string being tested
     * @param nodesWithOrderedChildren names of nodes whose children should be in order
     * @param attributesToIgnore names of attributes whose values should be ignored
     */
    public static void assertEquivalentXml(String expectedXML, String testXML, String[] nodesWithOrderedChildren, String[] attributesToIgnore) {
        Set<String> setOfNodesWithOrderedChildren = new HashSet<String>();
        if(nodesWithOrderedChildren != null ) {
            Collections.addAll(setOfNodesWithOrderedChildren, nodesWithOrderedChildren);
        }

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setCoalescing(true);
        dbf.setIgnoringElementContentWhitespace(true);
        dbf.setIgnoringComments(true);
        DocumentBuilder db = null;
        try {
            db = dbf.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            fail("Error testing XML");
        }

        Document expectedXMLDoc = null;
        Document testXMLDoc = null;
        try {
            expectedXMLDoc = db.parse(new ByteArrayInputStream(expectedXML.getBytes()));
            expectedXMLDoc.normalizeDocument();

            testXMLDoc = db.parse(new ByteArrayInputStream(testXML.getBytes()));
            testXMLDoc.normalizeDocument();
        } catch (SAXException e) {
            fail("Could not parse testXML");
        } catch (IOException e) {
            fail("Could not read testXML");
        }
        NodeList expectedChildNodes = expectedXMLDoc.getLastChild().getChildNodes();
        NodeList testChildNodes = testXMLDoc.getLastChild().getChildNodes();

        assertEquals("Test XML does not have expected amount of child nodes", expectedChildNodes.getLength(), testChildNodes.getLength());

        // compare parent nodes (shallow copies, so only the root element and its attributes)
        Document expectedDEDoc = getNodeAsDocument(expectedXMLDoc.getDocumentElement(), db, false);
        Document testDEDoc = getNodeAsDocument(testXMLDoc.getDocumentElement(), db, false);
        Diff diff = new Diff(expectedDEDoc, testDEDoc);
        assertTrue("Test XML parent node doesn't match expected XML parent node. " + diff.toString(), diff.similar());

        // compare child nodes
        for(int i=0; i < expectedChildNodes.getLength(); i++) {
            // expected child node
            Node expectedChildNode = expectedChildNodes.item(i);
            // skip text nodes
            if( expectedChildNode.getNodeType() == Node.TEXT_NODE ) {
                continue;
            }
            // convert to document to use in Diff
            Document expectedChildDoc = getNodeAsDocument(expectedChildNode, db, true);

            boolean hasSimilar = false;
            StringBuilder messages = new StringBuilder();

            for(int j=0; j < testChildNodes.getLength(); j++) {
                // find child node in test xml
                Node testChildNode = testChildNodes.item(j);
                // skip text nodes
                if( testChildNode.getNodeType() == Node.TEXT_NODE ) {
                    continue;
                }
                // create doc from node
                Document testChildDoc = getNodeAsDocument(testChildNode, db, true);

                diff = new Diff(expectedChildDoc, testChildDoc);
                // if it doesn't contain order specific nodes, then use the elem and attribute qualifier, otherwise use the default
                if( !setOfNodesWithOrderedChildren.contains( expectedChildDoc.getDocumentElement().getNodeName() ) ) {
                    diff.overrideElementQualifier(new ElementNameAndAttributeQualifier());
                }
                if(attributesToIgnore != null) {
                    diff.overrideDifferenceListener(new IgnoreNamedAttributesDifferenceListener(attributesToIgnore));
                }
                messages.append(diff.toString());
                boolean similar = diff.similar();
                if(similar) {
                    hasSimilar = true;
                }
            }
            assertTrue("Test XML does not match expected XML. " + messages, hasSimilar);
        }
    }

    private static Document getNodeAsDocument(Node node, DocumentBuilder db, boolean deep) {
        // create doc from node
        Document nodeDoc = db.newDocument();
        Node importedNode = nodeDoc.importNode(node, deep);
        nodeDoc.appendChild(importedNode);
        return nodeDoc;
    }

}


/**
 * Custom difference listener that ignores differences in attribute values for specified attribute names. Used to 
 * ignore color attribute differences in FusionChartXml equivalence.
 */
class IgnoreNamedAttributesDifferenceListener implements DifferenceListener {
    Set<String> attributeBlackList;

    public IgnoreNamedAttributesDifferenceListener(String[] attributeNames) {
        attributeBlackList = new HashSet<String>();
        Collections.addAll(attributeBlackList, attributeNames);
    }

    public int differenceFound(Difference difference) {
        int differenceId = difference.getId();
        if (differenceId == DifferenceConstants.ATTR_VALUE_ID) {
            if(attributeBlackList.contains(difference.getControlNodeDetail().getNode().getNodeName())) {
                return DifferenceListener.RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL;
            }
        }

        return DifferenceListener.RETURN_ACCEPT_DIFFERENCE;
    }

    public void skippedComparison(Node node, Node node1) {
        // left empty
    }
}