How to process data from SOAP XML using regular expressions
I plan to write a script that fetches data over SOAP. I can get a file containing a large amount of data like the following:
<?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><ns0:GetList_Operation_0Response xmlns:ns0="urn:TEST:AST:Attributes" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>
<ns0:ReconciliationIdentity>OI-ASDAQWDASDWA</ns0:ReconciliationIdentity>
<ns0:instanceId>OI-SDWDSDWDSDWD</ns0:instanceId>
<ns0:ModifiedDate>2017-12-12T03:31:32+01:00</ns0:ModifiedDate>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Being Assembled</ns0:AssetLifecycleStatus>
<ns0:ReconciliationIdentity>OI-ASDQWAWDSADW</ns0:ReconciliationIdentity>
<ns0:instanceId>OI-SDWDSWDSWDWD</ns0:instanceId>
<ns0:ModifiedDate>2017-12-10T03:30:21+01:00</ns0:ModifiedDate>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>
<ns0:ReconciliationIdentity>OI-ASDWASDWDASDW</ns0:ReconciliationIdentity>
<ns0:instanceId>OI-SDWDSDWDSDWD</ns0:instanceId>
<ns0:ModifiedDate>2017-12-12T03:31:31+01:00</ns0:ModifiedDate>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
</ns0:GetList_Operation_0Response></soapenv:Body></soapenv:Envelope>
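Could you suggest how to extract only the status between the AssetLifecycleStatus tags, for example Deployed? It should go through every occurrence and print each value on a new line. Example:
Deployed
Being Assembled
Deployed
Which language is best for manipulating this data, is it Perl? Thanks for any information!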
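Answer
There is no single best language for parsing XML. Perl is well known for its regular expressions, but XML is better parsed with a dedicated parser. In Perl, XML::LibXML is a widely used one. Other parsers exist for Perl, as well as for Python, Ruby, JS, and so on.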
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Parse the XML that follows the __DATA__ marker at the end of this script.
my $doc = XML::LibXML->load_xml(IO => *DATA);
# can also load with (string => $xml_string) or (location => 'file.xml');

# Collect every status element and print its text, one value per line.
my @nodes = $doc->getElementsByTagName('ns0:AssetLifecycleStatus');
for my $node (@nodes) {
    print $node->textContent . "\n";
}
__DATA__
<?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><ns0:GetList_Operation_0Response xmlns:ns0="urn:TEST:AST:Attributes" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>
<ns0:ReconciliationIdentity>OI-ASDAQWDASDWA</ns0:ReconciliationIdentity>
<ns0:instanceId>OI-SDWDSDWDSDWD</ns0:instanceId>
<ns0:ModifiedDate>2017-12-12T03:31:32+01:00</ns0:ModifiedDate>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Being Assembled</ns0:AssetLifecycleStatus>
<ns0:ReconciliationIdentity>OI-ASDQWAWDSADW</ns0:ReconciliationIdentity>
<ns0:instanceId>OI-SDWDSWDSWDWD</ns0:instanceId>
<ns0:ModifiedDate>2017-12-10T03:30:21+01:00</ns0:ModifiedDate>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>
<ns0:ReconciliationIdentity>OI-ASDWASDWDASDW</ns0:ReconciliationIdentity>
<ns0:instanceId>OI-SDWDSDWDSDWD</ns0:instanceId>
<ns0:ModifiedDate>2017-12-12T03:31:31+01:00</ns0:ModifiedDate>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
</ns0:GetList_Operation_0Response></soapenv:Body></soapenv:Envelope>
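The answer notes that Python also has dedicated XML parsers. As a rough sketch of the same extraction, here is a version that uses only Python's standard-library `xml.etree.ElementTree`. The sample is a shortened copy of the response from the question, and the names `lifecycle_statuses` and `NAMESPACES` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the SOAP response from the question (assumed sample data).
SOAP_XML = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns0:GetList_Operation_0Response xmlns:ns0="urn:TEST:AST:Attributes">
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Being Assembled</ns0:AssetLifecycleStatus>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
<ns0:getListValues>
<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>
<ns0:ClassId>BMC_COMPUTERSYSTEM</ns0:ClassId>
</ns0:getListValues>
</ns0:GetList_Operation_0Response>
</soapenv:Body>
</soapenv:Envelope>"""

# Map a prefix used in the query to the namespace URI declared in the response.
NAMESPACES = {"ns0": "urn:TEST:AST:Attributes"}


def lifecycle_statuses(xml_text):
    """Return the text of every ns0:AssetLifecycleStatus element, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//ns0:AssetLifecycleStatus", NAMESPACES)]


statuses = lifecycle_statuses(SOAP_XML)
print("\n".join(statuses))
```

The query matches on the namespace URI rather than on the literal `ns0` prefix, so it should keep working if the service renames the prefix in a later response.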
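Another answer
If you have the Mojolicious framework installed, you can write a simple Perl one-liner using the ojo package. It is called ojo because when you load it with the -M command-line switch, it reads -Mojo.
The program assumes that a file named foo.xml containing the XML is in the current directory.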
$ perl -Mojo -E 'x(f("foo.xml")->slurp)->find("AssetLifecycleStatus")->map(sub{say $_->text})'
Deployed
Being Assembled
Deployed
I will walk you through the program.
perl                                 # call the Perl interpreter
  -Mojo                              # load module ojo
  -E '                               # run the following program with all features turned on
    x(                               # ojo function to turn string into Mojo::DOM object
      f("foo.xml")                   # ojo function to create a Mojo::File object
      ->slurp                        # read in the whole file and return content
    )
    ->find("AssetLifecycleStatus")   # find all instances of string in DOM with CSS selector
    ->map(                           # iterate and run on all instances ...
      sub {                          # ... this function ...
        say $_->text                 # ... output text node of element with newline attached
      }
    )
  '
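The one-liner reads the whole file into memory before searching it. The question mentions a large amount of data, so a streaming parse may be worth considering. The following is only a sketch using Python's standard-library `xml.etree.ElementTree.iterparse`. The helper name `iter_statuses` is hypothetical, and the in-memory sample stands in for a real file such as foo.xml.

```python
import io
import xml.etree.ElementTree as ET

# Fully qualified tag name: {namespace URI}local-name.
STATUS_TAG = "{urn:TEST:AST:Attributes}AssetLifecycleStatus"


def iter_statuses(source):
    """Yield status texts one at a time from a file path or binary file object."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == STATUS_TAG:
            yield elem.text
        # Free the text and children of elements that are already finished.
        elem.clear()


# Assumed sample: a small in-memory stand-in for the file on disk.
sample = io.BytesIO(
    b'<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    b'<soapenv:Body>'
    b'<ns0:GetList_Operation_0Response xmlns:ns0="urn:TEST:AST:Attributes">'
    b'<ns0:getListValues><ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus></ns0:getListValues>'
    b'<ns0:getListValues><ns0:AssetLifecycleStatus>Being Assembled</ns0:AssetLifecycleStatus></ns0:getListValues>'
    b'</ns0:GetList_Operation_0Response>'
    b'</soapenv:Body>'
    b'</soapenv:Envelope>'
)

statuses = list(iter_statuses(sample))
for status in statuses:
    print(status)
```

Passing a path such as `"foo.xml"` instead of the `BytesIO` object should work the same way, since `iterparse` accepts either a filename or a file object.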
Another answer
If you can use XSLT 2.0, use this code.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    exclude-result-prefixes="xs"
    version="2.0">
    <xsl:template match="/">
        <xsl:for-each select="//*:AssetLifecycleStatus">
            <xsl:text>&#10;</xsl:text><xsl:value-of select="."/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
The output is:
Deployed
Being Assembled
Deployed
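The `*:AssetLifecycleStatus` test in the stylesheet matches the element in any namespace. A comparable wildcard exists in Python's standard-library `xml.etree.ElementTree` as the `{*}` syntax, available from Python 3.8. The snippet below is a minimal sketch under that assumption, with a shortened sample document and a hypothetical helper name `statuses_any_namespace`.

```python
import xml.etree.ElementTree as ET

# Shortened sample (assumed data) based on the response in the question.
XML_TEXT = (
    '<ns0:GetList_Operation_0Response xmlns:ns0="urn:TEST:AST:Attributes">'
    '<ns0:getListValues>'
    '<ns0:AssetLifecycleStatus>Deployed</ns0:AssetLifecycleStatus>'
    '</ns0:getListValues>'
    '<ns0:getListValues>'
    '<ns0:AssetLifecycleStatus>Being Assembled</ns0:AssetLifecycleStatus>'
    '</ns0:getListValues>'
    '</ns0:GetList_Operation_0Response>'
)


def statuses_any_namespace(xml_text):
    """Return AssetLifecycleStatus texts regardless of the element's namespace."""
    root = ET.fromstring(xml_text)
    # "{*}name" matches the local name in any namespace (Python 3.8+).
    return [el.text for el in root.iterfind(".//{*}AssetLifecycleStatus")]


statuses = statuses_any_namespace(XML_TEXT)
print("\n".join(statuses))
```

Ignoring the namespace is convenient for a quick script, though it would also pick up a same-named element from an unrelated namespace if the response ever contained one.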