读取json数据并嵌套读取值，保存到excel中。将句子进行jieba分词，保存到excel中

Posted 2021-09-08 Coding With you.....

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了读取json数据并嵌套读取值，保存到excel中。将句子进行jieba分词，保存到excel中相关的知识，希望对你有一定的参考价值。

1.数据样式

{"source": "PMC", 
"date": "20140719", 
"key": "pmc.key", 
"infons": {}, 
"documents": [{"id": "555756", "infons": {}, 
                          "passages": [{"offset": 0, 
                                                "infons": {"name_3": "sunames:Seppo A",
                                                "text": "Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study", "sentences": [],                                                 "annotations": [{"id": "MIC1", 
                                                                           "infons": {"type": "MeSH_Indexing_Chemical",                                                                                                "entry_term": "Amino Acids"}, 
                                                                           "text": "", "locations": []}, 
                                                                         {"id": "MIC2", 
                                                                           "infons": {"type":

2.代码

主要提取text 和entry_term并且保存到数据库中，其中

import jieba
import json
import jsonpath


file_ = open('555756_v1.json')
text_ = json.load(file_ )
texteach = jsonpath.jsonpath(text_,"$.documents[0].passages[0].text") #读取句子内容
lenn_entry = jsonpath.jsonpath(text_,"$.documents[2].passages") 

eachentry = jsonpath.jsonpath(text_,"$.documents[0].passages[0].annotations[0].infons.entry_term") 

print("结果：",len(text_ ['documents'][0]['passages']))#输出实体的个数，便于遍历


-----------------------------------------------------------------------------------

import json
import jsonpath
import xlwt

file_ = open('555756_v1.json')
text_ = json.load(file_ )

# 创建一个workbook 设置编码
workbook = xlwt.Workbook(encoding = 'utf-8')
worksheet = workbook.add_sheet("my1")
eachentry1 = jsonpath.jsonpath(text_,"$.documents[0].passages[0].annotations[0].infons.entry_term") 
for i in range(len(text_ ['documents'][0]['passages'])):
    #文本
    w="$.documents[0].passages["+str(i)+"].text"
    texteach = jsonpath.jsonpath(text_,w) 
   
    #实体
    m="$.documents[0].passages["+str(i)+"].annotations[0].infons.entry_term"
    
    eachentry = jsonpath.jsonpath(text_,m) 
    worksheet.write(i, 0,eachentry)  # 第i行0列
    worksheet.write(i, 1, texteach) # 第i行1列
print("结果：ok")
# 保存
workbook.save('Excel_test.xls')

3.结巴分词

cut_text = jieba.cut("Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study")
result = " ".join(cut_text)
#print("句子：Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study")
#print("结果：",result)

以上是关于读取json数据并嵌套读取值，保存到excel中。将句子进行jieba分词，保存到excel中的主要内容，如果未能解决你的问题，请参考以下文章