读取json数据并嵌套读取值,保存到excel中。将句子进行jieba分词,保存到excel中
Posted Coding With you.....
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了读取json数据并嵌套读取值,保存到excel中。将句子进行jieba分词,保存到excel中相关的知识,希望对你有一定的参考价值。
1.数据样式
{"source": "PMC",
"date": "20140719",
"key": "pmc.key",
"infons": {},
"documents": [{"id": "555756", "infons": {},
"passages": [{"offset": 0,
"infons": {"name_3": "sunames:Seppo A",
"text": "Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study", "sentences": [], "annotations": [{"id": "MIC1",
"infons": {"type": "MeSH_Indexing_Chemical", "entry_term": "Amino Acids"},
"text": "", "locations": []},
{"id": "MIC2",
"infons": {"type":
2.代码
主要提取text 和entry_term并且保存到数据库中,其中
import jieba
import json
import jsonpath
file_ = open('555756_v1.json')
text_ = json.load(file_ )
texteach = jsonpath.jsonpath(text_,"$.documents[0].passages[0].text") #读取句子内容
lenn_entry = jsonpath.jsonpath(text_,"$.documents[2].passages")
eachentry = jsonpath.jsonpath(text_,"$.documents[0].passages[0].annotations[0].infons.entry_term")
print("结果:",len(text_ ['documents'][0]['passages']))#输出实体的个数,便于遍历
-----------------------------------------------------------------------------------
import json
import jsonpath
import xlwt
file_ = open('555756_v1.json')
text_ = json.load(file_ )
# 创建一个workbook 设置编码
workbook = xlwt.Workbook(encoding = 'utf-8')
worksheet = workbook.add_sheet("my1")
eachentry1 = jsonpath.jsonpath(text_,"$.documents[0].passages[0].annotations[0].infons.entry_term")
for i in range(len(text_ ['documents'][0]['passages'])):
#文本
w="$.documents[0].passages["+str(i)+"].text"
texteach = jsonpath.jsonpath(text_,w)
#实体
m="$.documents[0].passages["+str(i)+"].annotations[0].infons.entry_term"
eachentry = jsonpath.jsonpath(text_,m)
worksheet.write(i, 0,eachentry) # 第i行0列
worksheet.write(i, 1, texteach) # 第i行1列
print("结果:ok")
# 保存
workbook.save('Excel_test.xls')
3.结巴分词
cut_text = jieba.cut("Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study")
result = " ".join(cut_text)
#print("句子:Gluten-free diet may alleviate depressive and behavioural symptoms in adolescents with coeliac disease: a prospective follow-up case-series study")
#print("结果:",result)
以上是关于读取json数据并嵌套读取值,保存到excel中。将句子进行jieba分词,保存到excel中的主要内容,如果未能解决你的问题,请参考以下文章
使用 python/pandas 从特定文件夹中读取几个嵌套的 .json 文件到 excel 中