Extracting data from list of dictionaries from Tone Analyser's JSON response [duplicate]

【Title】: Extracting data from list of dictionaries from Tone Analyser's JSON response [duplicate] 【Posted】: 2019-04-18 00:19:14 【Question】:

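I am analysing text with IBM Watson's Tone Analyzer, and I am trying to extract all of the information related to sentence tone (for example sentence_id, text, tones, tone_id, tone_name, score) and add it to a dataframe (with the columns sentence_id, text, tones, tone_id, score, tone_name). Here is a sample of my output: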

[{'document_tone': {'tones': [{'score': 0.551743,
     'tone_id': 'analytical',
     'tone_name': 'Analytical'}]},
  'sentences_tone': [{'sentence_id': 0,
    'text': '@jozee25 race is the basis on which quotas are implemented.',
    'tones': []},
   {'sentence_id': 1, 'text': 'helloooooo', 'tones': []}]},
 {'document_tone': {'tones': []}},
 {'document_tone': {'tones': [{'score': 0.802429,
     'tone_id': 'analytical',
     'tone_name': 'Analytical'},
    {'score': 0.60167, 'tone_id': 'confident', 'tone_name': 'Confident'}]},
  'sentences_tone': [{'sentence_id': 0,
    'text': '@growawaysa @cricketandre i have the answer on top yard from dpw:it is not currently "surplus to govt requirements".it is still being used for garaging until a new facility is ready in maitland.the',
    'tones': [{'score': 0.631014,
      'tone_id': 'analytical',
      'tone_name': 'Analytical'}]},
   {'sentence_id': 1,
    'text': 'cost of the housing options will of course depend on prospects for cross subsidisation.',
    'tones': [{'score': 0.589295,
      'tone_id': 'analytical',
      'tone_name': 'Analytical'},
     {'score': 0.509368, 'tone_id': 'confident', 'tone_name': 'Confident'}]}]},
 {'document_tone': {'tones': [{'score': 0.58393,
     'tone_id': 'tentative',
     'tone_name': 'Tentative'},
    {'score': 0.641954, 'tone_id': 'analytical', 'tone_name': 'Analytical'}]}},
 {'document_tone': {'tones': [{'score': 0.817073,
     'tone_id': 'joy',
     'tone_name': 'Joy'},
    {'score': 0.920556, 'tone_id': 'analytical', 'tone_name': 'Analytical'},
    {'score': 0.808202, 'tone_id': 'tentative', 'tone_name': 'Tentative'}]},
  'sentences_tone': [{'sentence_id': 0,
    'text': 'thanks @khayadlangaand colleagues for the fascinating tour yesterday.really',
    'tones': [{'score': 0.771305, 'tone_id': 'joy', 'tone_name': 'Joy'},
     {'score': 0.724236, 'tone_id': 'analytical', 'tone_name': 'Analytical'}]},
   {'sentence_id': 1,
    'text': 'eyeopening and i learnt a lot.',
    'tones': [{'score': 0.572756, 'tone_id': 'joy', 'tone_name': 'Joy'},
     {'score': 0.842108, 'tone_id': 'analytical', 'tone_name': 'Analytical'},
     {'score': 0.75152, 'tone_id': 'tentative', 'tone_name': 'Tentative'}]}]},

This is the code I wrote to get this output:

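result = []
for i in helen['Tweets']:
    # Send each tweet to the Tone Analyzer service as a JSON tone input
    tone_analysis = tone_analyzer.tone(
        {'text': i},
        'application/json'
    ).get_result()
    # Collect the raw response dictionary for later processing
    result.append(tone_analysis)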

【Comments】:

What is your question?

【Answer 1】:

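First, because the JSON in your question is not well formed, I am using the sample JSON from the Tone Analyzer API reference instead.

Using that JSON and Pandas json_normalize, this is the code I came up with: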

# pandas 1.0+ exposes json_normalize at the top level;
# the older pandas.io.json import path is deprecated.
from pandas import json_normalize

# Sample response from the Tone Analyzer API reference
jsonfile = {
  "document_tone": {
    "tones": [
      {
        "score": 0.6165,
        "tone_id": "sadness",
        "tone_name": "Sadness"
      },
      {
        "score": 0.829888,
        "tone_id": "analytical",
        "tone_name": "Analytical"
      }
    ]
  },
  "sentences_tone": [
    {
      "sentence_id": 0,
      "text": "Team, I know that times are tough!",
      "tones": [
        {
          "score": 0.801827,
          "tone_id": "analytical",
          "tone_name": "Analytical"
        }
      ]
    },
    {
      "sentence_id": 1,
      "text": "Product sales have been disappointing for the past three quarters.",
      "tones": [
        {
          "score": 0.771241,
          "tone_id": "sadness",
          "tone_name": "Sadness"
        },
        {
          "score": 0.687768,
          "tone_id": "analytical",
          "tone_name": "Analytical"
        }
      ]
    },
    {
      "sentence_id": 2,
      "text": "We have a competitive product, but we need to do a better job of selling it!",
      "tones": [
        {
          "score": 0.506763,
          "tone_id": "analytical",
          "tone_name": "Analytical"
        }
      ]
    }
  ]
}

# One row per sentence; the nested 'tones' list stays in a single column
mydata = json_normalize(jsonfile['sentences_tone'])
mydata.head(3)
print(mydata)

# One row per tone, flattened out of each sentence's 'tones' list
tones_data = json_normalize(data=jsonfile['sentences_tone'], record_path='tones')
tones_data.head(3)
print(tones_data)

The output dataframes will be:

   sentence_id                        ...                                              tones
0            0                        ...  [{'score': 0.801827, 'tone_id': 'analytical', ...
1            1                        ...  [{'score': 0.771241, 'tone_id': 'sadness', 'to...
2            2                        ...  [{'score': 0.506763, 'tone_id': 'analytical', ...

[3 rows x 3 columns]
      score     tone_id   tone_name
0  0.801827  analytical  Analytical
1  0.771241     sadness     Sadness
2  0.687768  analytical  Analytical
3  0.506763  analytical  Analytical
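
The tone table above no longer shows which sentence each tone belongs to, while the question asks for sentence_id and text next to every tone. A minimal sketch of one way to keep them, assuming pandas 1.0 or later, is to pass the meta argument of json_normalize together with record_path. The names response and flat and the trimmed two-sentence input are illustrative only and are not part of the original answer:

```python
import pandas as pd

# Trimmed copy of the API-reference response (two sentences only, for brevity).
response = {
    "sentences_tone": [
        {
            "sentence_id": 0,
            "text": "Team, I know that times are tough!",
            "tones": [
                {"score": 0.801827, "tone_id": "analytical", "tone_name": "Analytical"}
            ],
        },
        {
            "sentence_id": 1,
            "text": "Product sales have been disappointing for the past three quarters.",
            "tones": [
                {"score": 0.771241, "tone_id": "sadness", "tone_name": "Sadness"},
                {"score": 0.687768, "tone_id": "analytical", "tone_name": "Analytical"},
            ],
        },
    ]
}

# record_path expands each sentence's 'tones' list into rows;
# meta copies the parent sentence's fields onto every one of those rows.
flat = pd.json_normalize(
    response["sentences_tone"],
    record_path="tones",
    meta=["sentence_id", "text"],
)

# Reorder the columns to match the layout requested in the question.
flat = flat[["sentence_id", "text", "tone_id", "score", "tone_name"]]
print(flat)
```

Sentences whose tones list is empty contribute no rows with this approach, so they would need separate handling if they must appear in the result.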

I have also created a REPL so that you can change the input and run the code in the browser: https://repl.it/@aficionado/DarkturquoiseUnnaturalDistributeddatabase

See the Kaggle page on flattening JSON in Python using Pandas for more information.
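
The question collects one response per tweet in a list, and some of those responses contain only document_tone with no sentences_tone key. As a rough sketch (not the answerer's code), the whole list could be flattened in plain Python as follows. The function name flatten_tone_results, the extra tweet_index field, and the shortened sample data are assumptions made for illustration:

```python
def flatten_tone_results(results):
    """Return one flat dict per (sentence, tone) pair across all responses."""
    rows = []
    for tweet_index, response in enumerate(results):
        # Some responses carry only 'document_tone', so default to an empty list.
        for sentence in response.get("sentences_tone", []):
            for tone in sentence.get("tones", []):
                rows.append({
                    "tweet_index": tweet_index,  # hypothetical helper column
                    "sentence_id": sentence["sentence_id"],
                    "text": sentence["text"],
                    "tone_id": tone["tone_id"],
                    "score": tone["score"],
                    "tone_name": tone["tone_name"],
                })
    return rows


# Two responses shaped like those in the question (texts shortened).
sample_results = [
    {"document_tone": {"tones": []}},
    {
        "document_tone": {
            "tones": [{"score": 0.817073, "tone_id": "joy", "tone_name": "Joy"}]
        },
        "sentences_tone": [
            {
                "sentence_id": 0,
                "text": "thanks for the tour",
                "tones": [{"score": 0.771305, "tone_id": "joy", "tone_name": "Joy"}],
            },
            {"sentence_id": 1, "text": "helloooooo", "tones": []},
        ],
    },
]

rows = flatten_tone_results(sample_results)
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed to pandas.DataFrame(rows) to obtain the dataframe; tweet_index records which entry of the input list each row came from.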

