Parsing XML with 'nil' attributes and empty 'element' tags using 'xmlbf'

I am trying to parse an XML API response into Haskell data types.

This builds on a question I asked earlier.

I am using the xmlbf library.

Here is a sample response:

<GoodreadsResponse>
  <Request>
    <authentication>true</authentication>
    <key>api_key</key>
    <method>search_index</method>
  </Request>
  <search>
    <query>Ender's Game</query>
    <results-start>1</results-start>
    <results-end>20</results-end>
    <source>Goodreads</source>
    <results>
      <work>
        <id type="integer">2422333</id>
        <books_count type="integer">252</books_count>
        <ratings_count type="integer">1070421</ratings_count>
        <text_reviews_count type="integer">42249</text_reviews_count>
        <original_publication_year type="integer">1985</original_publication_year>
        <original_publication_month type="integer" nil="true"/>
        <original_publication_day type="integer" nil="true"/>
        <average_rating>4.30</average_rating>
        <best_book type="Book">
          <id type="integer">375802</id>
          <title>Ender's Game (Ender's Saga, #1)</title>
          <author>
            <id type="integer">589</id>
            <name>Orson Scott Card</name>
          </author>
        </best_book>
      </work>
      <work>
        <id type="integer">938064</id>
        <books_count type="integer">64</books_count>
        <ratings_count type="integer">82572</ratings_count>
        <text_reviews_count type="integer">867</text_reviews_count>
        <original_publication_year type="integer">1984</original_publication_year>
        <original_publication_month type="integer">12</original_publication_month>
        <original_publication_day type="integer" nil="true"/>
        <average_rating>4.18</average_rating>
        <best_book type="Book">
          <id type="integer">44687</id>
          <title>Enchanters' End Game (The Belgariad, #5)</title>
          <author>
          </author>
        </best_book>
      </work>
    </results>
  </search>
</GoodreadsResponse>

I want to parse it into these types:

data GoodreadsBookAuthor =
  GoodreadsBookAuthor -- <author> element.
    { goodreadsBookAuthorID   :: Text
    , goodreadsBookAuthorName :: Text
    }
  deriving (Show)

data GoodreadsBook =
  GoodreadsBook -- <best_book> element.
    { goodreadsBookID     :: Text
    , goodreadsBookTitle  :: Text
    , goodreadsBookAuthor :: Maybe GoodreadsBookAuthor -- Could be empty or missing.
    }
  deriving (Show)

data GoodreadsWork =
  GoodreadsWork -- <work> element.
    { goodreadsWorkID               :: Text
    -- Ignore <books_count> element.
    , goodreadsWorkRatingCount      :: Text
    -- Ignore <text_reviews_count> element.
    , goodreadsWorkPublicationYear  :: Maybe Int -- Could be missing.
    , goodreadsWorkPublicationMonth :: Maybe Int -- Could be missing.
    , goodreadsWorkPublicationDay   :: Maybe Int -- Could be missing.
    , goodreadsWorkAverageRating    :: Text
    }
  deriving (Show)

newtype GoodreadsSearchResults =
  GoodreadsSearchResults -- <results> element.
    { goodreadsWorks :: [GoodreadsWork]
    }
  deriving (Show) -- Needed because GoodreadsSearch derives Show.

data GoodreadsSearch =
  GoodreadsSearch -- <search> element.
    { goodreadsSearchQuery        :: Text
    , goodreadsSearchResultsStart :: Text
    , goodreadsSearchResultsEnd   :: Text
    -- Ignore <source></source>
    , goodreadsSearchResults      :: GoodreadsSearchResults
    }
  deriving (Show)

data GoodreadsRequest =
  GoodreadsRequest -- <Request> element.
    { authentication :: Text
    , key            :: Text
    , method         :: Text
    }
  deriving (Show)

data GoodreadsResponse =
  GoodreadsResponse -- <GoodreadsResponse> element.
    { goodreadsRequest :: GoodreadsRequest
    , goodreadsSearch  :: GoodreadsSearch
    }
  deriving (Show)

These are the instances I have written so far:

instance FromXml GoodreadsRequest where
  fromXml =
    pElement "Request"
      $   GoodreadsRequest
      <$> pElement "authentication" pText
      <*> pElement "key"            pText
      <*> pElement "method"         pText

instance FromXml GoodreadsBookAuthor where
  fromXml =
    pElement "author"
      $   GoodreadsBookAuthor
      <$> pElement "id"   pText
      <*> pElement "name" pText

instance FromXml GoodreadsBook where
  fromXml =
    pElement "best_book"
      $   GoodreadsBook
      <$> pElement "id"    pText
      <*> pElement "title" pText
      <*> fromXml

instance FromXml GoodreadsWork where
  fromXml =
    pElement "work"
      $   GoodreadsWork
      <$> pElement "id"                         pText -- kept Text for simplicity
      <*> pElement "ratings_count"              pText -- kept Text for simplicity
      <*> pElement "original_publication_year"  pText -- isn't handling missing value.
      <*> pElement "original_publication_month" pText -- isn't handling missing value.
      <*> pElement "original_publication_day"   pText -- isn't handling missing value.
      <*> pElement "average_rating"             pText -- kept Text for simplicity

instance FromXml GoodreadsSearchResults where
  fromXml = pElement "results" $ GoodreadsSearchResults <$> many fromXml

instance FromXml GoodreadsSearch where
  fromXml =
    pElement "search"
      $   GoodreadsSearch
      <$> pElement "query"              pText
      <*> pElement "results-start"      pText
      <*> pElement "results-end"        pText
      <*> fromXml

instance FromXml GoodreadsResponse where
  fromXml =
    pElement "GoodreadsResponse" $ GoodreadsResponse <$> fromXml <*> fromXml

What I would like to know is how I should handle missing values like these:

<original_publication_month type="integer" nil="true"/>
<original_publication_day type="integer" nil="true"/>

as well as empty element tags like this one:

<author>
</author>

Some of these values should be parsed as Int or Double, but I kept them as Text for simplicity.

Answer

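To handle an empty element, try defining a combinator that either parses a text node or is equally happy when the element contains nothing. Perhaps pMaybeText = optional pText is what you are after, where optional is either defined as optional contents = fmap Just contents <|> pure Nothing or imported from Control.Applicative.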
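To show how optional behaves without depending on a particular xmlbf version, here is a minimal sketch built on a toy parser. The Parser type, textChunk and maybeText below are hypothetical stand-ins written only for this illustration (textChunk plays the role of pText); they are not part of xmlbf. The only library function used is optional from Control.Applicative, and the sketch relies on the assumption that xmlbf's own Parser is likewise an Alternative instance.

```haskell
import Control.Applicative (Alternative (..), optional)

-- Toy parser over the list of text chunks inside one element.
-- Hypothetical stand-in for illustration; NOT xmlbf's Parser type.
newtype Parser a = Parser { runParser :: [String] -> Maybe (a, [String]) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

instance Alternative Parser where
  empty = Parser (const Nothing)
  -- Left-biased choice: the right parser runs only if the left one fails.
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s
    r       -> r

-- Succeeds only when a text chunk is present (hypothetical analogue of pText).
textChunk :: Parser String
textChunk = Parser $ \s -> case s of
  (t : rest) -> Just (t, rest)
  []         -> Nothing

-- Accepts both a filled element and an empty one.
maybeText :: Parser (Maybe String)
maybeText = optional textChunk
```

Running maybeText on an element that holds the text "12" yields Just "12", while running it on an element with no content still succeeds and yields Nothing instead of failing the whole parse.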

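In general, <|> separates alternative syntaxes. The combinator before it is tried first, and if that fails to parse the input, the combinator after it gets a chance.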
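The ordering of alternatives can be seen with the Alternative instance of Maybe from base. This sketch applies it to the question's remark that some fields should really be Int: the content of an element is read as an integer first, and only if that fails is it accepted as absent. The name intContent is hypothetical, and treating whitespace-only content as absent is an assumption made for this illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isSpace)
import Text.Read (readMaybe)

-- Interpret the textual content of an element.
--   Just (Just n) : the content is a valid integer
--   Just Nothing  : the content is empty or whitespace only (value absent)
--   Nothing       : the content is present but not an integer (parse failure)
intContent :: String -> Maybe (Maybe Int)
intContent s = present <|> absent
  where
    -- Tried first: a well-formed integer.
    present = Just <$> readMaybe s
    -- Tried only when 'present' fails: accept blank content as "no value".
    absent
      | all isSpace s = Just Nothing
      | otherwise     = Nothing
```

With this ordering, the content of a filled original_publication_month element such as "12" takes the first branch, the content of an element written as nil="true" with no text takes the second, and malformed content is rejected by both.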
