XmlWriter async operations fail with XmlWriterSettings.OutputMethod = Html

Posted: 2022-01-01 12:34:57

Question:

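When an XmlWriter is created with XmlWriterSettings.OutputMethod set to XmlOutputMethod.Html, async operations fail. When the same writer is created with XmlOutputMethod.AutoDetect (the default), async operations succeed.

Failing code (fiddle):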

var transform = new XslCompiledTransform();
using var reader = XmlReader.Create(new StringReader(@"
  <xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
    <xsl:output method=""html"" indent=""yes"" doctype-system=""html""/>
    <xsl:template match=""/"">
      <bar/>
    </xsl:template>
  </xsl:stylesheet>"));
transform.Load(reader);

var settings = transform.OutputSettings.Clone();
settings.CloseOutput = false;
settings.Async = true;

using var stream = new MemoryStream();
using (var writer = XmlWriter.Create(stream, settings))
{
    await writer.WriteStartDocumentAsync();
    await writer.WriteStartElementAsync(null, "foo", null);
    await writer.WriteEndElementAsync();
    await writer.WriteEndDocumentAsync();
}
stream.Position = 0;
var content = new StreamReader(stream).ReadToEnd();
Assert.Contains("foo", content);

Stack trace:

Message: 
System.NotImplementedException : The method or operation is not implemented.

  Stack Trace: 
XmlWriter.WriteStartElementAsync(String prefix, String localName, String ns)
XmlWellFormedWriter.WriteStartElementAsync_NoAdvanceState(String prefix, String localName, String ns)
XmlWellFormedWriter.WriteStartElementAsync(String prefix, String localName, String ns)
XmlAsyncCheckWriter.WriteStartElementAsync(String prefix, String localName, String ns)

Working code (working fiddle):

var settings = new XmlWriterSettings();
settings.CloseOutput = false;
settings.Async = true;

using var stream = new MemoryStream();
using (var writer = XmlWriter.Create(stream, settings))
{
    await writer.WriteStartDocumentAsync();
    await writer.WriteStartElementAsync(null, "foo", null);
    await writer.WriteEndElementAsync();
    await writer.WriteEndDocumentAsync();
}
stream.Position = 0;
var content = new StreamReader(stream).ReadToEnd();
Assert.Contains("foo", content);

Inspecting things in the debugger, both code paths appear to use System.Xml.XmlAsyncCheckWriter under the hood.

Comments:

An issue has been filed in the dotnet runtime repo.

Answer 1:

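Interestingly, it is not OutputMethod that causes this but doctype-system (edit: sorry, it is actually the combination of the two, as you will see below). Remove that attribute and your async calls will magically work.

I can tell you what is happening, but not why they chose to do it this way.

First, the writer is created by XmlWriterSettings.CreateWriter(Stream). With all the fluff cut away, it looks like this: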

internal XmlWriter CreateWriter(Stream output)
{
    XmlWriter writer;
    if (Encoding.WebName == "utf-8")
    {
        switch (OutputMethod)
        {
            case XmlOutputMethod.Html:
                writer = new HtmlUtf8RawTextWriter(output, this);
                break;
        }
    }

    // Wrap with Xslt/XQuery specific writer if needed;
    // XmlOutputMethod.AutoDetect writer does this lazily when it creates the underlying Xml or Html writer.
    if (OutputMethod != XmlOutputMethod.AutoDetect)
    {
        if (IsQuerySpecific)
        {
            // Create QueryOutputWriter if CData sections or DocType need to be tracked
            writer = new QueryOutputWriter((XmlRawWriter)writer, this);
        }
    }

    // wrap with well-formed writer
    writer = new XmlWellFormedWriter(writer, this);

    if (_useAsync)
        writer = new XmlAsyncCheckWriter(writer);

    return writer;
}

So in the end you get layers, like an onion (or an ogre):

XmlAsyncCheckWriter(
    XmlWellFormedWriter(
        QueryOutputWriter(
            HtmlUtf8RawTextWriter)))

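When you call Write...Async(), you would expect the call to cascade from the outermost writer all the way down to the innermost one, HtmlUtf8RawTextWriter, which does have the async implementations you want.

Unfortunately, the QueryOutputWriter wrapper does not delegate async calls to the inner writer, and it is the one that actually throws the NotImplementedException. Is this a bug, or a deliberate choice? I don't know.

If you don't need a DOCTYPE and don't use CDATA in the output (both are handled by our problematic QueryOutputWriter), simply removing doctype-system from the XSL solves your problem. That makes the following IsQuerySpecific evaluate to false, which prevents the unwanted wrapping.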
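The failure mode can be reproduced in miniature. The sketch below uses hypothetical stand-in classes (WriterBase, RawWriter, SyncOnlyWrapper, ForwardingWrapper are invented for illustration and are not the real System.Xml types). It only assumes what the stack trace already shows: the base class declares the async member as virtual with a default body that throws NotImplementedException, so a wrapper that forgets to override it breaks the chain.

```csharp
using System;
using System.Threading.Tasks;

// Outer wrapper forwards everything; the middle wrapper forwards only the sync call.
WriterBase chain = new ForwardingWrapper(new SyncOnlyWrapper(new RawWriter()));

chain.WriteStartElement("foo"); // reaches RawWriter through both wrappers

var asyncFailed = false;
try
{
    await chain.WriteStartElementAsync("foo");
}
catch (NotImplementedException)
{
    // Thrown by the base-class default, reached via SyncOnlyWrapper.
    asyncFailed = true;
}
Console.WriteLine($"async call failed: {asyncFailed}");

// Hypothetical stand-in for XmlWriter: the async member is virtual and throws by default.
abstract class WriterBase
{
    public abstract void WriteStartElement(string name);
    public virtual Task WriteStartElementAsync(string name) => throw new NotImplementedException();
}

// Hypothetical stand-in for the innermost raw writer: implements both paths.
sealed class RawWriter : WriterBase
{
    public override void WriteStartElement(string name) => Console.WriteLine($"<{name}>");
    public override Task WriteStartElementAsync(string name)
    {
        Console.WriteLine($"<{name}> (async)");
        return Task.CompletedTask;
    }
}

// Hypothetical stand-in for the problematic wrapper: no async override.
sealed class SyncOnlyWrapper : WriterBase
{
    private readonly WriterBase _inner;
    public SyncOnlyWrapper(WriterBase inner) => _inner = inner;
    public override void WriteStartElement(string name) => _inner.WriteStartElement(name);
}

// Hypothetical stand-in for a well-behaved outer wrapper: forwards both paths.
sealed class ForwardingWrapper : WriterBase
{
    private readonly WriterBase _inner;
    public ForwardingWrapper(WriterBase inner) => _inner = inner;
    public override void WriteStartElement(string name) => _inner.WriteStartElement(name);
    public override Task WriteStartElementAsync(string name) => _inner.WriteStartElementAsync(name);
}
```

Even though the innermost writer supports async, a single non-forwarding layer is enough to make the whole chain fail, which matches the shape of the stack trace in the question.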

private bool IsQuerySpecific => 
    CDataSectionElements.Count != 0
    || _docTypePublic != null
    || _docTypeSystem != null
    || _standalone == XmlStandalone.Yes;

...

if (IsQuerySpecific)
    xmlWriter = new QueryOutputWriter((XmlRawWriter)xmlWriter, this);
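Applied to the code from the question, the workaround might look like the sketch below. It is the failing snippet with doctype-system dropped from xsl:output and the xUnit assertion swapped for a console print; it relies on the explanation above that IsQuerySpecific is then false and no QueryOutputWriter is inserted, and it has not been checked against every runtime version.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Same stylesheet as in the question, minus doctype-system on xsl:output.
var transform = new XslCompiledTransform();
using (var reader = XmlReader.Create(new StringReader(@"
  <xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
    <xsl:output method=""html"" indent=""yes""/>
    <xsl:template match=""/"">
      <bar/>
    </xsl:template>
  </xsl:stylesheet>")))
{
    transform.Load(reader);
}

var settings = transform.OutputSettings.Clone();
settings.CloseOutput = false;
settings.Async = true;

using var stream = new MemoryStream();
using (var writer = XmlWriter.Create(stream, settings))
{
    await writer.WriteStartDocumentAsync();
    await writer.WriteStartElementAsync(null, "foo", null);
    await writer.WriteEndElementAsync();
    await writer.WriteEndDocumentAsync();
}

stream.Position = 0;
var content = new StreamReader(stream).ReadToEnd();
Console.WriteLine(content);
```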

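If you really do need DOCTYPE/CDATA, then reimplementing some of the layers and overriding the functions would be an interesting exercise.

Comments:

Nice find; great explanation.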
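A less ambitious alternative, not part of the original answer, is to keep the DOCTYPE but avoid the async writer methods altogether: write synchronously into a MemoryStream and then copy the bytes to the real destination asynchronously. The sketch below assumes that the synchronous Write... methods work through QueryOutputWriter (the stack trace only implicates the async ones) and uses a second MemoryStream named destination as a placeholder for the real output stream.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Stylesheet from the question, doctype-system included.
var transform = new XslCompiledTransform();
using (var reader = XmlReader.Create(new StringReader(@"
  <xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
    <xsl:output method=""html"" indent=""yes"" doctype-system=""html""/>
    <xsl:template match=""/"">
      <bar/>
    </xsl:template>
  </xsl:stylesheet>")))
{
    transform.Load(reader);
}

var settings = transform.OutputSettings.Clone();
settings.CloseOutput = false;
// settings.Async is deliberately left at its default (false).

// Synchronous writes go to memory only, so no blocking I/O happens here.
using var buffer = new MemoryStream();
using (var writer = XmlWriter.Create(buffer, settings))
{
    writer.WriteStartDocument();
    writer.WriteStartElement("foo");
    writer.WriteEndElement();
    writer.WriteEndDocument();
}

// Placeholder destination; in real code this would be a file or network stream.
buffer.Position = 0;
using var destination = new MemoryStream();
await buffer.CopyToAsync(destination);

Console.WriteLine($"copied {destination.Length} bytes");
```

The trade-off is that the whole document is buffered in memory before it is sent, which is fine for small outputs but may not suit very large ones.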
