XSLT to remove multiple elements of an XML

【Title】: XSLT To Remove multiple Elements of an XML 【Posted】: 2019-08-14 06:47:47 【Question】:

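I have the following XML/KML file (what is shown below is only part of the whole data).

I want to remove specific elements and their content with XSLT (I am using Notepad++ with the XML Tools plugin). The file is large, so it has to be done with XSLT.

I want to remove the <Snippet> element, and remove specific tags and their content from the <description> element: the <p> tags.

For example, an original entry looks like this: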

<Placemark><name>Wando</name><Snippet>Record 325</Snippet><description><![CDATA[<p>Data source: <a href="https://mrdata.usgs.gov/ofr-2005-1294/" title="">Major mineral deposits worldwide</a></p><p>[<a href="https://mrdata.usgs.gov/major-deposits/show-ofr20051294.php?gid=325">All data for record 325</a>]</p><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>Korean Peninsula</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td></td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>126.6833,34.35,0</coordinates></Point></Placemark>

What I want to achieve after the XSLT:

<Placemark><name>Wando</name><description><![CDATA[<table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>Korean Peninsula</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td></td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>126.6833,34.35,0</coordinates></Point></Placemark>

P.S.: It is also fine to remove the <![CDATA[ and ]]> wrapper, as long as the <table> itself is kept.

I do need the <table>, for example:

<Placemark><name>Wando</name><description><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>Korean Peninsula</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td></td></tr></table></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>126.6833,34.35,0</coordinates></Point></Placemark>

The whole raw data:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2/">
  <Document>
    <name>Major mineral deposits of the world</name>
    <description>Regional locations and general geologic setting of known deposits of major nonfuel mineral commodities. Originally compiled in five parts by diverse authors, combined here for convenience despite likely inconsistencies among the regional reports.</description>
    <Placemark><name>Wando</name><Snippet>Record 325</Snippet><description><![CDATA[<p>Data source: <a href="https://mrdata.usgs.gov/ofr-2005-1294/" title="">Major mineral deposits worldwide</a></p><p>[<a href="https://mrdata.usgs.gov/major-deposits/show-ofr20051294.php?gid=325">All data for record 325</a>]</p><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>Korean Peninsula</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td></td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>126.6833,34.35,0</coordinates></Point></Placemark>
    <Placemark><name>McDonald</name><Snippet>Record 549</Snippet><description><![CDATA[<p>Data source: <a href="https://mrdata.usgs.gov/ofr-2005-1294/" title="">Major mineral deposits worldwide</a></p><p>[<a href="https://mrdata.usgs.gov/major-deposits/show-ofr20051294.php?gid=549">All data for record 549</a>]</p><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>United States</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td>Montana</td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>-112.525,47,0</coordinates></Point></Placemark>
    <Placemark><name>Montana Mountains</name><Snippet>Record 575</Snippet><description><![CDATA[<p>Data source: <a href="https://mrdata.usgs.gov/ofr-2005-1294/" title="">Major mineral deposits worldwide</a></p><p>[<a href="https://mrdata.usgs.gov/major-deposits/show-ofr20051294.php?gid=575">All data for record 575</a>]</p><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>United States</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td>Nevada</td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>-118.108,41.767,0</coordinates></Point></Placemark>
    <Placemark><name>Basay</name><Snippet>Record 429</Snippet><description><![CDATA[<p>Data source: <a href="https://mrdata.usgs.gov/ofr-2005-1294/" title="">Major mineral deposits worldwide</a></p><p>[<a href="https://mrdata.usgs.gov/major-deposits/show-ofr20051294.php?gid=429">All data for record 429</a>]</p><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Hydrothermal</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>Philippines</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td></td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>122.6333,9.5667,0</coordinates></Point></Placemark>
    <Placemark><name>Georgina Basin</name><Snippet>Record 52</Snippet><description><![CDATA[<p>Data source: <a href="https://mrdata.usgs.gov/ofr-2005-1294/" title="">Major mineral deposits worldwide</a></p><p>[<a href="https://mrdata.usgs.gov/major-deposits/show-ofr20051294.php?gid=52">All data for record 52</a>]</p><table border='1' padding='3' cellspacing='0'><tr valign='top'><th align='right' bgcolor='#ddffee' title='Generalized type of deposit'>Deposit type</th><td>Sedimentary</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='Country in which the site is located'>Country</th><td>Australia</td></tr><tr valign='top'><th align='right' bgcolor='#ddffee' title='State in which the site is located, for US sites'>State</th><td></td></tr></table>]]></description><styleUrl>#defaultStyleMap</styleUrl><Point><altitudeMode>relativeToGround</altitudeMode><coordinates>139.9667,-21.8833,0</coordinates></Point></Placemark>
    <Style id="default_highlight"><BalloonStyle><text>Major Mineral Deposits</text></BalloonStyle><IconStyle><scale>1.5</scale><Icon><href>https://mrdata.usgs.gov/images/mine-32.png</href></Icon></IconStyle><LabelStyle><color>ffffffff</color></LabelStyle></Style><Style id="default_normal"><IconStyle><scale>1</scale><Icon><href>https://mrdata.usgs.gov/images/mine-32.png</href></Icon></IconStyle><LabelStyle><color>00ffffff</color></LabelStyle></Style><StyleMap id="defaultStyleMap"><Pair><key>normal</key><styleUrl>#default_normal</styleUrl></Pair><Pair><key>highlight</key><styleUrl>#default_highlight</styleUrl></Pair></StyleMap> </Document> </kml>

【Answer 1】:

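Removing the Snippet element is simple: use the identity transform template and add an empty template that matches Snippet.

Converting the raw text data in the CDATA section into markup is not. In XSLT 1.0, try using disable-output-escaping when writing the output to a file, then process the resulting file with another stylesheet. Alternatively, upgrade to a processor that supports XSLT 3.0, where parse-xml-fragment() turns escaped markup back into nodes (or use a processor that offers an extension function for the same purpose).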


Demo: https://xsltfiddle.liberty-development.net/6r5Gh39
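
As a rough illustration of the XSLT 3.0 route mentioned above, the sketch below parses the escaped text with parse-xml-fragment() and copies only the resulting table element. It is not the stylesheet from the demo. It assumes an XSLT 3.0 processor (for example Saxon 9.8 or later), assumes the input is the full KML document in the namespace http://earth.google.com/kml/2.2/, and assumes the CDATA content is well-formed markup, as it is in the sample entries.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://earth.google.com/kml/2.2/">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- XSLT 3.0 shorthand for the identity transform -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- empty template: Snippet elements produce no output -->
  <xsl:template match="Snippet"/>

  <!-- only the description of a Placemark holds escaped markup -->
  <xsl:template match="Placemark/description">
    <xsl:copy>
      <!-- parse the escaped text into nodes and keep the table only;
           *:table is used because the parsed table is in no namespace -->
      <xsl:copy-of select="parse-xml-fragment(.)/*:table"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because description is in the KML namespace and the parsed table is in no namespace, the serialized table carries an xmlns="" declaration. The pattern is limited to Placemark/description so that the plain-text description of Document is left untouched.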


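Another option you could consider is to "hack" the escaped markup with simple string manipulation, cutting off the substring that precedes the table part: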

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform: copy every node and attribute unchanged -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- empty template: Snippet elements produce no output -->
<xsl:template match="Snippet"/>

<!-- keep only the part of the escaped text that starts at '<table' -->
<xsl:template match="description">
    <xsl:copy>
        <!-- number of characters that precede '<table' -->
        <xsl:variable name="len" select="string-length(substring-before(., '&lt;table'))" />
        <!-- write the remainder unescaped so it is serialized as markup -->
        <xsl:value-of select="substring(., $len + 1)" disable-output-escaping="yes"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

Demo: https://xsltfiddle.liberty-development.net/6r5Gh39/1
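
The templates above match un-namespaced names, which fits the single Placemark excerpt. The full raw data declares a default namespace on the kml element, so with that input the Snippet and description templates would never fire and the document would be copied unchanged. A possible namespace-aware variant is sketched below. The kml prefix is an arbitrary choice made here, and the URI has to be identical to the one in the input, including the trailing slash. disable-output-escaping is an optional feature, so whether the table ends up as markup depends on the processor serializing the result itself.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:kml="http://earth.google.com/kml/2.2/">

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- empty template: Snippet elements in the KML namespace are dropped -->
  <xsl:template match="kml:Snippet"/>

  <!-- only the description of a Placemark holds escaped markup -->
  <xsl:template match="kml:Placemark/kml:description">
    <xsl:copy>
      <!-- number of characters that precede '<table' -->
      <xsl:variable name="len"
          select="string-length(substring-before(., '&lt;table'))"/>
      <!-- write the remainder unescaped so it is serialized as markup -->
      <xsl:value-of select="substring(., $len + 1)"
          disable-output-escaping="yes"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Restricting the pattern to kml:Placemark/kml:description leaves the plain-text description of Document to the identity template.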

【Comments】:

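Dear @michael.hor257k, when I put in multiple rows I get an error; with a single row everything works fine. xsltfiddle.liberty-development.net/6r5Gh39/4

As the error message says, your input is not well-formed XML. An XML document must have a single root element. Also note that if your input is a KML document, you must take the KML namespace into account - for example: ***.com/questions/34758492/…

【Answer 2】:
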
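<xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <!-- There is no identity template here: nodes outside Placemark are
         handled by the built-in rules, which output only their text. -->
    <xsl:template match="Placemark">
        <!-- rebuild Placemark from the children to keep; Snippet is left out -->
        <xsl:element name="Placemark">
            <xsl:copy-of select="name"/>
            <xsl:element name="description">
                <!-- number of characters that precede '<table' -->
                <xsl:variable name="finalLength" select="string-length(substring-before(description, '&lt;table'))" />
                <!-- write the remainder unescaped so it is serialized as markup -->
                <xsl:value-of select="substring(description, $finalLength + 1)" disable-output-escaping="yes"/>
            </xsl:element>
            <xsl:copy-of select="styleUrl"/>
            <xsl:copy-of select="Point"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>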

You can also use this.

【Comments】:

So what is the main difference between your answer and mine?
