KissXML fails with XML containing processing instructions

Posted: 2012-09-03 12:09:45

Question:

I've run into a problem where XPath queries fail in KissXML when the XML contains processing instructions, but work fine if the exact same XML has no processing instructions.

As an example, I have the following XML without processing instructions (demo1.xml):

<?xml version="1.0"?>
<tst:TstForm xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:tst="http://schemas.somewhere.com/tst/1" xml:lang="en-US">
    <tst:TstDetails>
        <tst:Title>Labelling error on boxes</tst:Title>
        <tst:Description>Box labels</tst:Description>
    </tst:TstDetails>
</tst:TstForm>

I parse the XML and execute an XPath query as follows:

// Parse the DEMO XML - no processing instructions,
// so this will work
NSLog(@"**** About to parse %@ ****\n\n", xmlFilename);
NSString *XMLPath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:xmlFilename];
NSData *XMLData   = [NSData dataWithContentsOfFile:XMLPath];

NSError *err = nil;
NSURL *furl = [NSURL fileURLWithPath:XMLPath];
if (!furl) {
    NSLog(@"Can't create an URL from file: %@.", XMLPath);
    return;
}

xmlViewDoc = [[DDXMLDocument alloc] initWithData:XMLData
                                         options:0
                                           error:&err];

// Prove that the XML was parsed correctly...
NSLog(@"Root node'%@' found\n\n", xmlViewDoc.rootElement.name);
NSLog(@"About to iterate through elements 2 levels down from the root node...");
NSArray *nodes = xmlViewDoc.rootElement.children;
for (DDXMLElement *node in nodes) {
    NSArray *childNodes = node.children;
    for (DDXMLElement *childNode in childNodes) {
        NSLog(@" XPath: %@", childNode.XPath);
    }
}
NSLog(@"Root node elements END\n\n");

// Show that the namespaces are parsed correctly
NSLog(@"Root node contains the following namespaces...");
nodes = xmlViewDoc.rootElement.namespaces;
for (DDXMLElement *node in nodes) {
    NSLog(@" Found NS: '%@' = '%@'", node.name, node.stringValue);
}
NSLog(@"Namespaces END\n\n");

// Now execute an XPath query using the namespace
NSString *xPathQuery = @"/tst:TstForm/tst:TstDetails";
NSLog(@"Based on the above namespace and the XML structure, we should be able to execite the following XPath query:\n %@", xPathQuery);
NSLog(@"The XPath query should return two elements (Title & Description)...");
nodes = [xmlViewDoc.rootElement nodesForXPath:xPathQuery error:nil];
for (DDXMLElement *node in nodes) {
    NSLog(@" XPath: %@", node.XPath);
}
NSLog(@"XPathQuery END\n\n");

This code produces the following log output, as expected:

2012-09-03 13:04:25.662 NDPad[37359:c07] **** About to parse demo1.xml ****

2012-09-03 13:04:25.690 NDPad[37359:c07] Root node'tst:TstForm' found

2012-09-03 13:04:25.690 NDPad[37359:c07] About to iterate through elements 2 levels down from the root node...
2012-09-03 13:04:25.691 NDPad[37359:c07]  XPath: /TstForm[1]/TstDetails[1]/Title[1]
2012-09-03 13:04:25.691 NDPad[37359:c07]  XPath: /TstForm[1]/TstDetails[1]/Description[1]
2012-09-03 13:04:25.691 NDPad[37359:c07] Root node elements END

2012-09-03 13:04:25.691 NDPad[37359:c07] Root node contains the following namespaces...
2012-09-03 13:04:25.692 NDPad[37359:c07]  Found NS: 'xsi' = 'http://www.w3.org/2001/XMLSchema-instance'
2012-09-03 13:04:25.692 NDPad[37359:c07]  Found NS: 'tst' = 'http://schemas.somewhere.com/tst/1'
2012-09-03 13:04:25.692 NDPad[37359:c07] Namespaces END

2012-09-03 13:04:25.692 NDPad[37359:c07] Based on the above namespace and the XML structure, we should be able to execite the following XPath query:
 /tst:TstForm/tst:TstDetails
2012-09-03 13:04:25.692 NDPad[37359:c07] The XPath query should return two elements (Title & Description)...
2012-09-03 13:04:25.693 NDPad[37359:c07]  XPath: /TstForm[1]/TstDetails[1]
2012-09-03 13:04:25.693 NDPad[37359:c07] XPathQuery END

However, if I then use the following XML, which contains a single processing instruction (demo1-5.xml):

<?xml version="1.0"?>
<?tst-example-instruction?>
<tst:TstForm xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:tst="http://schemas.somewhere.com/tst/1" xml:lang="en-US">
    <tst:TstDetails>
        <tst:Title>Labelling error on boxes</tst:Title>
        <tst:Description>Box labels</tst:Description>
    </tst:TstDetails>
</tst:TstForm>

the same code fails and only produces the following output:

2012-09-03 13:04:25.693 NDPad[37359:c07] **** About to parse demo1-5.xml ****

2012-09-03 13:04:25.756 NDPad[37359:c07] Root node'tst:TstForm' found

2012-09-03 13:04:25.756 NDPad[37359:c07] About to iterate through elements 2 levels down from the root node...
2012-09-03 13:04:25.756 NDPad[37359:c07]  XPath: /TstForm[1]/TstDetails[1]/Title[1]
2012-09-03 13:04:25.756 NDPad[37359:c07]  XPath: /TstForm[1]/TstDetails[1]/Description[1]
2012-09-03 13:04:25.756 NDPad[37359:c07] Root node elements END

2012-09-03 13:04:25.757 NDPad[37359:c07] Root node contains the following namespaces...
2012-09-03 13:04:25.757 NDPad[37359:c07]  Found NS: 'xsi' = 'http://www.w3.org/2001/XMLSchema-instance'
2012-09-03 13:04:25.757 NDPad[37359:c07]  Found NS: 'tst' = 'http://schemas.somewhere.com/tst/1'
2012-09-03 13:04:25.757 NDPad[37359:c07] Namespaces END

2012-09-03 13:04:25.757 NDPad[37359:c07] Based on the above namespace and the XML structure, we should be able to execite the following XPath query:
 /tst:TstForm/tst:TstDetails
2012-09-03 13:04:25.758 NDPad[37359:c07] The XPath query should return two elements (Title & Description)...
2012-09-03 13:04:25.758 NDPad[37359:c07] XPathQuery END

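I can't see anything wrong with this XML that would cause the XPath query to fail, especially since walking the DOM shows that it was parsed correctly.

Thanks, Paul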

Answer 1:

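For now I have a hack, in which I do the following:

1. Read the XML in as a string.
2. Scan it with a regular expression to find all processing instructions.
3. Add every processing instruction to an array of strings.
4. Use the regular expression to strip the processing instructions from the XML string.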

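This lets me execute XPath queries without any problems. However, if I need to write the XML back out, I have to:

1. Create a new mutable string to hold the output XML document.
2. Loop over the processing instructions and append them to the new output XML string.
3. Get the string from my real XML document and append it to the new output XML string.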
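The stripping steps above could be wrapped in a small helper along the following lines. This is only a sketch: `StripProcessingInstructions` is a hypothetical name, not part of KissXML, and it uses nothing beyond Foundation. Unlike a plain `<\?.*?\?>` pattern, it leaves the `<?xml ...?>` declaration in place, on the assumption that the declaration should stay with the document rather than be treated as a processing instruction.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of KissXML): removes processing instructions
// from xmlString but keeps the <?xml ...?> declaration. The removed
// instructions are returned through outInstructions in document order.
static NSString *StripProcessingInstructions(NSString *xmlString,
                                             NSArray **outInstructions)
{
    // "<?" not followed by the exact target "xml", then everything up to
    // the first "?>". [\s\S] lets an instruction span several lines.
    NSString *pattern = @"<\\?(?!xml[\\s?])[\\s\\S]*?\\?>";

    NSError *regexError = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&regexError];
    if (!regex) {
        NSLog(@"Could not build regex: %@", regexError);
        if (outInstructions) *outInstructions = [NSArray array];
        return xmlString;
    }

    NSRange fullRange = NSMakeRange(0, xmlString.length);

    // Collect the matched instructions before removing them.
    NSMutableArray *found = [NSMutableArray array];
    for (NSTextCheckingResult *match in
         [regex matchesInString:xmlString options:0 range:fullRange]) {
        [found addObject:[xmlString substringWithRange:match.range]];
    }
    if (outInstructions) *outInstructions = [found copy];

    return [regex stringByReplacingMatchesInString:xmlString
                                           options:0
                                             range:fullRange
                                      withTemplate:@""];
}
```

Being regex-based, this would also strip anything that merely looks like a processing instruction inside a comment or CDATA section, so it is only suitable for documents where that cannot occur.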

But this is far from optimal, and I don't understand why processing instructions break XPath queries in KissXML. Here is my code sample:

    // Load the XML file from the application bundle
    NSString *xmlFilename = @"demo1-5.xml";
    NSLog(@"**** About to parse %@ ****\n\n", xmlFilename);
    NSString *XMLPath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:xmlFilename];
    NSString *xmlStringData = [NSString stringWithContentsOfFile:XMLPath encoding:NSUTF8StringEncoding error:nil];

    // Now that we have the XML as a string, use regex to:

    // 1) Find all processing instructions
    //    (note: this pattern also matches the <?xml ...?> declaration)
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<\\?.*?\\?>" options:NSRegularExpressionCaseInsensitive error:nil];
    processingInstructions = [[NSMutableArray alloc] init];
    [regex enumerateMatchesInString:xmlStringData options:0 range:NSMakeRange(0, [xmlStringData length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        // 2) Put all the matches in an array
        NSString *toStore = [xmlStringData substringWithRange:[match range]];
        [processingInstructions addObject:toStore];
    }];
    // 3) Remove all processing instructions from the XML string
    NSString *newString = [regex stringByReplacingMatchesInString:xmlStringData options:0 range:NSMakeRange(0, xmlStringData.length) withTemplate:@""];

    // Having removed the processing instructions, we can now parse the XML
    NSError *err = nil;
    NSURL *furl = [NSURL fileURLWithPath:XMLPath];
    if (!furl) {
        NSLog(@"Can't create an URL from file: %@.", XMLPath);
        return;
    }
    xmlViewDoc = [[DDXMLDocument alloc] initWithXMLString:newString
                                                  options:0
                                                    error:&err];

    // Now execute an XPath query using the namespace
    NSString *xPathQuery = @"/tst:TstForm/tst:TstDetails";
    NSArray *nodes = [xmlViewDoc.rootElement nodesForXPath:xPathQuery error:nil];
    for (DDXMLElement *node in nodes) {
        NSLog(@"Title Value: %@", node.stringValue);
    }

    // Finally we'll generate a new XML string which is a concatenation of
    // the processing instructions and the DDXMLDocument.
    NSMutableString *xmlOutputString = [[NSMutableString alloc] init];
    for (NSString *piString in processingInstructions) {
        [xmlOutputString appendFormat:@"%@\n", piString];
    }
    [xmlOutputString appendString:[xmlViewDoc XMLStringWithOptions:DDXMLNodePrettyPrint]];

    NSLog(@"XML:\n%@", xmlOutputString);
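
To narrow down why the query returns nothing on the original, unstripped document, it may help to stop passing `nil` for the error and to compare the prefixed query with one that avoids namespace prefixes. The sketch below assumes KissXML is imported via `DDXML.h` (the import path depends on how the library is integrated); `LogXPathResult` is a hypothetical helper. Whether KissXML actually fills in the error for this case is not something I have confirmed.

```objc
#import <Foundation/Foundation.h>
#import "DDXML.h"   // assumption: KissXML header as added to the project

// Hypothetical helper: runs one XPath query against the root element and
// logs either the reported error or the XPath of every matched node.
static void LogXPathResult(DDXMLDocument *doc, NSString *query)
{
    NSError *xpathError = nil;
    NSArray *nodes = [doc.rootElement nodesForXPath:query error:&xpathError];

    if (xpathError) {
        NSLog(@"XPath '%@' failed: %@", query, xpathError);
        return;
    }

    NSLog(@"XPath '%@' matched %lu node(s)", query, (unsigned long)nodes.count);
    for (DDXMLNode *node in nodes) {
        NSLog(@"  %@", node.XPath);
    }
}

// Usage, with xmlViewDoc parsed from the unmodified demo1-5.xml:
//   LogXPathResult(xmlViewDoc, @"/tst:TstForm/tst:TstDetails");
//   LogXPathResult(xmlViewDoc, @"/*[local-name()='TstForm']/*[local-name()='TstDetails']");
```

If the `local-name()` form matches while the prefixed form does not, that would point at how the `tst` prefix is resolved when a processing instruction precedes the root element, rather than at the parse itself.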
