Reading JSON data and extracting nested values, then saving them to Excel. Segmenting sentences with jieba and saving the result to Excel

Posted Coding With you.....


1. Data format

{"source": "PMC",
 "date": "20140719",
 "key": "pmc.key",
 "infons": {},
 "documents": [{"id": "555756", "infons": {},
                "passages": [{"offset": 0,
                              "infons": {"name_3": "sunames:Seppo A",
                              "text": "Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study",
                              "sentences": [],
                              "annotations": [{"id": "MIC1",
                                               "infons": {"type": "MeSH_Indexing_Chemical",
                                                          "entry_term": "Amino Acids"},
                                               "text": "", "locations": []},
                                              {"id": "MIC2",
                                               "infons": {"type":

2. Code

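The code mainly extracts the text and entry_term fields and saves them to an Excel file. The first script explores the structure of the data; the second script, after the separator line, performs the export.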

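import jieba
import json
import jsonpath


# Load the JSON file; the with-block closes it automatically
with open('555756_v1.json', encoding='utf-8') as file_:
    text_ = json.load(file_)

# jsonpath.jsonpath returns a list of matches, or False when nothing matches
texteach = jsonpath.jsonpath(text_, "$.documents[0].passages[0].text")  # sentence text
lenn_entry = jsonpath.jsonpath(text_, "$.documents[0].passages")  # all passages of the first document

eachentry = jsonpath.jsonpath(text_, "$.documents[0].passages[0].annotations[0].infons.entry_term")  # first entity

print("Result:", len(text_['documents'][0]['passages']))  # number of passages, used as the loop bound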
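
One detail worth guarding against: jsonpath.jsonpath returns a list of matches, and it returns False rather than an empty list when the path matches nothing, so indexing the result directly can fail. A minimal sketch of a small wrapper follows. The name first_match and its default parameter are hypothetical and not part of the jsonpath package, and the lookup results are simulated so the sketch runs without the JSON file.

```python
def first_match(result, default=""):
    """Return the first element of a jsonpath.jsonpath() result.

    jsonpath.jsonpath() gives a list of matches, or False when the path
    matches nothing. `first_match` and `default` are hypothetical names
    introduced for this sketch.
    """
    if not result:  # covers both False and an empty list
        return default
    return result[0]


# Simulated lookup results (assumptions, not read from the real file)
found = ["Amino Acids"]  # shape of a successful lookup
missing = False          # shape of a failed lookup

print(first_match(found))
print(first_match(missing, "n/a"))
```

In the script above it would wrap each lookup, for example first_match(jsonpath.jsonpath(text_, path)).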


-----------------------------------------------------------------------------------

import json
import jsonpath
import xlwt

# Load the JSON file; the with-block closes it automatically
with open('555756_v1.json', encoding='utf-8') as file_:
    text_ = json.load(file_)

# Create a workbook and set its encoding
workbook = xlwt.Workbook(encoding='utf-8')
worksheet = workbook.add_sheet("my1")

for i in range(len(text_['documents'][0]['passages'])):
    # Passage text
    w = "$.documents[0].passages[" + str(i) + "].text"
    texteach = jsonpath.jsonpath(text_, w)

    # Entity (entry_term of the passage's first annotation)
    m = "$.documents[0].passages[" + str(i) + "].annotations[0].infons.entry_term"
    eachentry = jsonpath.jsonpath(text_, m)

    # jsonpath returns a list of matches, or False when nothing matches;
    # write the first match, or an empty string if there is none
    worksheet.write(i, 0, eachentry[0] if eachentry else "")  # row i, column 0
    worksheet.write(i, 1, texteach[0] if texteach else "")    # row i, column 1
print("Result: ok")
# Save
workbook.save('Excel_test.xls')
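
The export loop only looks at annotations[0] of each passage, so any further entities in the same passage are skipped. As a rough sketch, assuming the file follows the layout shown in section 1, plain dictionary access can collect one row per annotation instead. The function name collect_rows and the sample data (including "Example Term") are made up for illustration; writing the rows would reuse worksheet.write as in the script above.

```python
def collect_rows(data):
    """Return (entry_term, text) tuples, one per annotation of the first document."""
    rows = []
    for passage in data["documents"][0]["passages"]:
        text = passage.get("text", "")
        for annotation in passage.get("annotations", []):
            entry_term = annotation.get("infons", {}).get("entry_term")
            if entry_term is not None:  # skip annotations without an entry_term
                rows.append((entry_term, text))
    return rows


# Hypothetical, trimmed-down sample shaped like the data in section 1
sample = {
    "documents": [{
        "id": "555756",
        "passages": [
            {"text": "First passage",
             "annotations": [
                 {"id": "MIC1", "infons": {"entry_term": "Amino Acids"}},
                 {"id": "MIC2", "infons": {"entry_term": "Example Term"}},
             ]},
            {"text": "Second passage", "annotations": []},
        ],
    }]
}

for row_index, (entry_term, text) in enumerate(collect_rows(sample)):
    print(row_index, entry_term, text)
```

Each tuple maps to one spreadsheet row: the entity in column 0 and the passage text in column 1, matching the column order used by the export script.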


    

3. Word segmentation with jieba

import jieba

sentence = "Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study"
cut_text = jieba.cut(sentence)  # returns a generator of tokens
result = " ".join(cut_text)
#print("Sentence:", sentence)
#print("Result:", result)
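
The title mentions saving the segmentation result to Excel, a step the snippet above stops short of. One possible way to do it is sketched here under stated assumptions: segment_rows, save_segmentation, the sheet name and the output file name are all hypothetical. The tokenizer is passed in as a parameter so the row-building logic can be tried without jieba, and jieba and xlwt are only imported when the save function is called. Whitespace-only tokens are dropped, since jieba tends to return the spaces in English text as tokens of their own.

```python
def segment_rows(sentences, cut):
    """Build (sentence, space-joined tokens) rows using the tokenizer `cut`."""
    rows = []
    for sentence in sentences:
        tokens = [tok for tok in cut(sentence) if tok.strip()]  # drop whitespace tokens
        rows.append((sentence, " ".join(tokens)))
    return rows


def save_segmentation(sentences, path="jieba_result.xls"):
    """Segment with jieba and write sentence / tokens into two columns."""
    import jieba  # imported lazily; both packages must be installed
    import xlwt

    workbook = xlwt.Workbook(encoding="utf-8")
    worksheet = workbook.add_sheet("segmented")
    for i, (sentence, tokens) in enumerate(segment_rows(sentences, jieba.cut)):
        worksheet.write(i, 0, sentence)  # row i, column 0: original sentence
        worksheet.write(i, 1, tokens)    # row i, column 1: segmented result
    workbook.save(path)


# Stand-in tokenizer (str.split) to show the row format without jieba
print(segment_rows(["a prospective follow-up study"], str.split))
```

Calling save_segmentation with a list containing the sentence from this section would then produce a one-row workbook.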

 
