Trim or remove leading/trailing spaces inside the element [duplicate]
Posted: 2021-07-09 15:29:53
Question: I am converting HTML to XML and am struggling to strip the whitespace. When I use the normalize-space() function the whitespace is removed, but the single space between text and an inline element is removed as well, e.g. of<strong>Agricultural</strong>studies and limited<i>according standard commercial</i>practices. My input is defined below:
<html>
<div class="Sec">
<p class="stitle">The need of <strong> Agricultural </strong> studies </p>
<div class="subs1"> (a) term for leases </div>
<div class="subs1"> (b) be limited <i> according standard commercial </i> practices with maximum </div>
<table class="table"><tr><td><p class="tablepara"> (1) General Lease </p></td>
<td><p class="tablepara"> 49 years </p></td></tr>
<tr><td><p class="tablepara"> General Permit </p></td><td/></tr>
<tr><td><p class="tablepara"> Forest<sup> 1 </sup> Management Agreement </p></td>
<td/></tr><tr><td><p class="tablepara"> (2) Agricultural Lease </p></td></tr></table>
</div>
</html>
I tried this XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
<xsl:output indent="no" omit-xml-declaration="yes" method="html"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
</xsl:stylesheet>
The output I get is:
<html>
<div class="Sec">
<p class="stitle">The need of<strong>Agricultural</strong>studies</p>
<div class="subs1">(a) term for leases</div>
<div class="subs1">(b) be limited<i>according standard commercial</i>practices with maximum</div>
<table class="table"><tr><td><p class="tablepara">(1) General Lease</p></td><td><p class="tablepara">49 years</p></td></tr>
<tr><td><p class="tablepara">General Permit</p></td><td></td></tr><tr><td><p class="tablepara">Forest<sup>1</sup>Management Agreement</p></td><td></td></tr>
<tr><td><p class="tablepara">(2) Agricultural Lease</p></td></tr></table></div>
</html>
I noticed that it also removes the spaces next to the text, i.e. around the <i> and <strong> elements:
of<strong>Agricultural</strong>studies, limited<i>according standard commercial</i>practices
I need to keep those spaces:
of <strong>Agricultural</strong> studies, limited <i>according standard commercial</i> practices
My expected output is:
<html>
<div class="Sec">
<p class="stitle">The need of <strong>Agricultural</strong> studies</p>
<div class="subs1">(a) term for leases</div>
<div class="subs1">(b) be limited <i>according standard commercial</i> practices with maximum</div>
<table class="table"><tr><td><p class="tablepara">(1) General Lease</p></td><td><p class="tablepara">49 years</p></td></tr>
<tr><td><p class="tablepara">General Permit</p></td><td></td></tr><tr><td><p class="tablepara">Forest<sup>1</sup> Management Agreement</p></td><td></td></tr>
<tr><td><p class="tablepara">(2) Agricultural Lease</p></td></tr></table></div>
</html>
Could someone please help me strip the whitespace?
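To see why the boundary spaces vanish, it can help to print each text node next to its normalize-space() result. The stylesheet below is a minimal diagnostic sketch and not part of the original question. It assumes the same input document and simply lists every non-whitespace-only text node in brackets, before and after normalization.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Plain-text report, one line per text node -->
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes, as the stylesheet in the question does -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:for-each select="//text()">
      <!-- Raw value on the left, normalize-space() result on the right -->
      <xsl:value-of select="concat('[', ., '] -> [', normalize-space(.), ']', '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each text node is normalized on its own, so a node such as "The need of " loses its trailing space even though that space was the only separator before the following <strong> element.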
Answer 1: This seems to work reasonably well:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
<xsl:output indent="yes" omit-xml-declaration="yes" method="html"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
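<!-- priority="1" avoids a conflict with the two single-sided templates below, which share the same default priority -->
<xsl:template match="text()[preceding-sibling::* and following-sibling::*]" priority="1">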
<xsl:text> </xsl:text>
<xsl:value-of select="normalize-space()" />
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template match="text()[preceding-sibling::*]">
<xsl:text> </xsl:text>
<xsl:value-of select="normalize-space()" />
</xsl:template>
<xsl:template match="text()[following-sibling::*]">
<xsl:value-of select="normalize-space()" />
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space()" />
</xsl:template>
</xsl:stylesheet>
Output (wrapped the way you did in your question, not the way the XSLT processor produces it):
<html>
<div class="Sec"><p class="stitle">The need of <strong>Agricultural</strong> studies</p>
<div class="subs1">(a) term for leases</div>
<div class="subs1">(b) be limited <i>according standard commercial</i> practices with maximum</div>
<table class="table"><tr><td><p class="tablepara">(1) General Lease</p></td><td><p class="tablepara">49 years</p></td></tr>
<tr><td><p class="tablepara">General Permit</p></td><td></td></tr>
<tr><td><p class="tablepara">Forest <sup>1</sup> Management Agreement</p></td><td></td></tr><tr><td><p class="tablepara">(2) Agricultural Lease</p></td></tr></table></div>
</html>
Comments:
It works well, but one small correction is needed: it adds an extra space before the <sup> element. That is, Forest<sup>1</sup> Management Agreement comes out as Forest <sup>1</sup> Management Agreement.
@Reegan So you want some text <sup>1</sup> to become some text<sup>1</sup>? Change <xsl:template match="text()[following-sibling::*]"> to <xsl:template match="text()[following-sibling::*[not(self::sup)]]">.
Yes, that works for known elements, but unknown elements may also turn up.
In that case a predicate along the lines of text()[normalize-space() != '' and substring(., string-length(), 1) != ' '] may be a starting point. Keep experimenting until you find a solution that covers all cases.
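The idea raised in the comments, adding a boundary space only where the source text actually had one, can be sketched as a single text() template. This is a hedged variant, not the answerer's stylesheet. It assumes an XSLT 2.0 processor (for the matches() function) and keeps the identity template and xsl:strip-space from the answer. Whitespace-only text nodes between two inline elements are still stripped, so that case would need separate handling.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="no" omit-xml-declaration="yes" method="html"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy elements and attributes unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()">
    <!-- Keep one leading space only if the source text began with whitespace
         and an element sibling precedes it -->
    <xsl:if test="preceding-sibling::* and matches(., '^\s')">
      <xsl:text> </xsl:text>
    </xsl:if>
    <xsl:value-of select="normalize-space()"/>
    <!-- Keep one trailing space only if the source text ended with whitespace
         and an element sibling follows it -->
    <xsl:if test="following-sibling::* and matches(., '\s$')">
      <xsl:text> </xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because the test looks at the text itself rather than at the name of the neighbouring element, Forest<sup> 1 </sup> Management Agreement keeps no space before <sup> (the source has none there), while The need of <strong> keeps its space, without listing sup or any other element explicitly.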