C# getting extra whitespace values with XmlReader but not with XmlDocument
【Posted】: 2018-02-08 05:14:45
【Question】: I have a situation I don't quite understand. When reading the following XML:
<?xml version="1.0" encoding="utf-8" ?>
<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Countries>
    <Country>
      <CountryCode>CN</CountryCode>
      <CurrentStatus>Active</CurrentStatus>
    </Country>
  </Countries>
  <Countries>
    <Country>
      <CountryCode>AU</CountryCode>
      <CurrentStatus>Cancelled</CurrentStatus>
    </Country>
    <Country>
      <CountryCode>CN</CountryCode>
      <CurrentStatus>Cancelled</CurrentStatus>
    </Country>
    <Country>
      <CountryCode>US</CountryCode>
      <CurrentStatus>Active</CurrentStatus>
    </Country>
  </Countries>
  <Countries xsi:nil="true" />
</Root>
with the following code:
// No whitespace
string xml = File.ReadAllText(fileInfo.FullName);
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);
string json1 = JsonConvert.SerializeXmlNode(xmlDoc);

// With whitespace
XmlDocument doc = new XmlDocument();
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlReader reader = XmlReader.Create(fileInfo.FullName, settings))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element)
        {
            XmlNode node = doc.ReadNode(reader);
            string json2 = JsonConvert.SerializeXmlNode(node);
        }
    }
}
The JSON I get looks like this:
json1:
{"?xml":{"@version":"1.0","@encoding":"utf-8"},"Root":{"@xmlns:xsi":"http://www.w3.org/2001/XMLSchema-instance","Countries":[{"Country":{"CountryCode":"CN","CurrentStatus":"Active"}},{"Country":[{"CountryCode":"AU","CurrentStatus":"Cancelled"},{"CountryCode":"CN","CurrentStatus":"Cancelled"},{"CountryCode":"JP","CurrentStatus":"Cancelled"},{"CountryCode":"SG","CurrentStatus":"Cancelled"},{"CountryCode":"US","CurrentStatus":"Active"}]},{"@xsi:nil":"true"}]}}
json2:
{"Root":{"@xmlns:xsi":"http://www.w3.org/2001/XMLSchema-instance","#whitespace":["\n ","\n ","\n ","\n"],"Countries":[{"#whitespace":["\n ","\n "],"Country":{"#whitespace":["\n ","\n ","\n "],"CountryCode":"CN","CurrentStatus":"Active"}},{"#whitespace":["\n ","\n ","\n ","\n ","\n ","\n "],"Country":[{"#whitespace":["\n ","\n ","\n "],"CountryCode":"AU","CurrentStatus":"Cancelled"},{"#whitespace":["\n ","\n ","\n "],"CountryCode":"CN","CurrentStatus":"Cancelled"},{"#whitespace":["\n ","\n ","\n "],"CountryCode":"JP","CurrentStatus":"Cancelled"},{"#whitespace":["\n ","\n ","\n "],"CountryCode":"SG","CurrentStatus":"Cancelled"},{"#whitespace":["\n ","\n ","\n "],"CountryCode":"US","CurrentStatus":"Active"}]},{"@xsi:nil":"true"}]}}
Why does XmlReader produce the whitespace entries while XmlDocument does not? Given the XML values, I don't think they should be there.
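A possible explanation, sketched under stated assumptions: XmlDocument.PreserveWhitespace defaults to false, so LoadXml discards whitespace-only nodes between elements, whereas a document with PreserveWhitespace set to true keeps them. The short XML string and the CountWhitespace helper below are illustrative names that do not appear in the original post.

```csharp
using System;
using System.Xml;

class WhitespaceDefaults
{
    // Hypothetical helper: counts whitespace-only nodes in a subtree.
    static int CountWhitespace(XmlNode node)
    {
        int count = 0;
        foreach (XmlNode child in node.ChildNodes)
        {
            if (child.NodeType == XmlNodeType.Whitespace ||
                child.NodeType == XmlNodeType.SignificantWhitespace)
            {
                count++;
            }
            count += CountWhitespace(child);
        }
        return count;
    }

    static void Main()
    {
        // Illustrative input, much shorter than the XML in the question.
        string xml = "<Root>\n  <Country>\n    <CountryCode>CN</CountryCode>\n  </Country>\n</Root>";

        // PreserveWhitespace is false by default.
        var dropped = new XmlDocument();
        dropped.LoadXml(xml);

        // Same input, but whitespace-only nodes are kept.
        var kept = new XmlDocument { PreserveWhitespace = true };
        kept.LoadXml(xml);

        Console.WriteLine(CountWhitespace(dropped)); // 0: whitespace-only nodes were dropped
        Console.WriteLine(CountWhitespace(kept));    // non-zero: they were kept
    }
}
```

Judging by the json2 output in the question, nodes built with ReadNode from a reader whose IgnoreWhitespace is left at its default of false behave like the second document here.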
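【Discussion】:
Try settings.IgnoreWhitespace = true;. But basically you already have the answer. Do you really need the reader, i.e. is your data > 100 MB?
I don't know why XmlReader behaves this way by default, but I think you just need to set XmlReaderSettings.IgnoreWhitespace to true.
@HenkHolterman Thanks. I need the reader because the XML I actually read has no root element, so XmlDocument throws an error. Since my question is about whitespace, I added a root element to show the difference between XmlReader and XmlDocument.
Have you looked at XElement? It is usually easier to work with.
OK, let us know whether the ignore setting helps at all.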
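For the XElement suggestion in the comments, a minimal sketch might look like the following. It assumes a well-formed input with a root element (unlike the asker's real data), and the class name and sample string are made up for illustration. By default, XElement.Parse uses LoadOptions.None, which drops insignificant whitespace.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class XElementSketch
{
    // Hypothetical helper: counts whitespace-only text nodes below an element.
    static int CountWhitespaceText(XElement element)
    {
        return element.DescendantNodes()
                      .OfType<XText>()
                      .Count(t => string.IsNullOrWhiteSpace(t.Value));
    }

    static void Main()
    {
        // Illustrative input, not the XML from the question.
        string xml = "<Root>\n  <Country>\n    <CountryCode>CN</CountryCode>\n  </Country>\n</Root>";

        // LoadOptions.None (the default) drops insignificant whitespace.
        XElement compact = XElement.Parse(xml);

        // LoadOptions.PreserveWhitespace keeps it as XText nodes.
        XElement verbose = XElement.Parse(xml, LoadOptions.PreserveWhitespace);

        Console.WriteLine(CountWhitespaceText(compact)); // 0
        Console.WriteLine(CountWhitespaceText(verbose)); // non-zero
    }
}
```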
【Answer 1】:
Solved with:
settings.IgnoreWhitespace = true;
Thanks to @HenkHolterman and @finrod.
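To see the effect of that setting in isolation, here is a small sketch that reads the same string twice, once with IgnoreWhitespace left at false and once with it set to true. The ReadFirstElement helper and the inline XML are assumptions made for the example; the original code reads from a file instead.

```csharp
using System;
using System.IO;
using System.Xml;

class IgnoreWhitespaceSketch
{
    // Hypothetical helper: reads the first element of the input into an XmlNode.
    static XmlNode ReadFirstElement(string xml, bool ignoreWhitespace)
    {
        var settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            IgnoreWhitespace = ignoreWhitespace
        };
        var doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();      // position the reader on the first element
            return doc.ReadNode(reader); // build the node from the reader's events
        }
    }

    static void Main()
    {
        // Illustrative input, not the XML from the question.
        string xml = "<Root>\n  <Country>\n    <CountryCode>CN</CountryCode>\n  </Country>\n</Root>";

        // Whitespace nodes reach ReadNode, so the line breaks should survive.
        Console.WriteLine(ReadFirstElement(xml, false).OuterXml);

        // Whitespace nodes are filtered by the reader before ReadNode sees them.
        Console.WriteLine(ReadFirstElement(xml, true).OuterXml);
    }
}
```

With the setting enabled, the second line should print the elements back to back, which is also why the "#whitespace" entries disappear from the serialized JSON.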
【Answer 2】:
XmlDocument doc = new XmlDocument();
doc.PreserveWhitespace = false;
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Document;
settings.IgnoreWhitespace = true;
using (XmlReader reader = XmlReader.Create("XMLFile1.xml", settings))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element)
        {
            XmlNode node = doc.ReadNode(reader);
            string json2 = JsonConvert.SerializeXmlNode(node);
            Console.WriteLine(json2.Trim());
        }
    }
}
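One caveat when this pattern is applied to input without a root element, as the asker describes: ReadNode already leaves the reader positioned on the node after the element it consumed, so with whitespace ignored, a following call to Read() moves past the start tag of the next sibling. The sketch below avoids that by calling Read() only when the current node is not an element. The ReadTopLevelElements helper and the fragment string are illustrative and not part of the original thread.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

class FragmentLoopSketch
{
    // Hypothetical helper: collects every top-level element of a rootless fragment.
    static List<XmlNode> ReadTopLevelElements(string fragment)
    {
        var settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment, // allow several top-level elements
            IgnoreWhitespace = true
        };
        var doc = new XmlDocument();
        var nodes = new List<XmlNode>();

        using (XmlReader reader = XmlReader.Create(new StringReader(fragment), settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // ReadNode consumes the element and advances to the next node,
                    // so no extra Read() is needed in this branch.
                    nodes.Add(doc.ReadNode(reader));
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return nodes;
    }

    static void Main()
    {
        // Illustrative fragment with three top-level elements and no root.
        string fragment =
            "<Countries><Country>CN</Country></Countries>\n" +
            "<Countries><Country>AU</Country></Countries>\n" +
            "<Countries><Country>US</Country></Countries>";

        Console.WriteLine(ReadTopLevelElements(fragment).Count); // 3
    }
}
```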