Issues with creating json and/or xml
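[Posted]: 2022-01-10 14:38:24
[Question]: I need help writing some Python code. For a sentence I am given, I need to produce a json or xml document that records, for each word, its position/index in the sentence, whether all of its characters are alphabetic, and the word itself. My first idea was to store the keys and values in a plain dictionary and then convert the dictionary to json: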
import json

data = {}
liste = []  # stores all the words after splitting the sentence on spaces
sentence = "As its price tag has been slashed to $1.7trn over a decade, half as much as first pitched, the hunger—or squid—games between progressives and moderates have turned fiercer."
liste = sentence.split(" ")
for word, index in zip(liste, range(0, len(liste))):
    # one entry per lowercased word; a repeated word reuses the same key
    data[word.lower()] = {"alpha": word.lower().isalpha()}
    data[word.lower()]['Word'] = word.lower()
    data[word.lower()]['Number'] = index
json_data = json.dumps(data, ensure_ascii=False)
print(json_data)
This prints the following json:
{"as": {"alpha": true, "Word": "as", "Number": 15}, "its": {"alpha": true, "Word": "its", "Number": 1}, "price": {"alpha": true, "Word": "price", "Number": 2}, "tag": {"alpha": true, "Word": "tag", "Number": 3}, "has": {"alpha": true, "Word": "has", "Number": 4}, "been": {"alpha": true, "Word": "been", "Number": 5}, "slashed": {"alpha": true, "Word": "slashed", "Number": 6}, "to": {"alpha": true, "Word": "to", "Number": 7}, "$1.7trn": {"alpha": false, "Word": "$1.7trn", "Number": 8}, "over": {"alpha": true, "Word": "over", "Number": 9}, "a": {"alpha": true, "Word": "a", "Number": 10}, "decade,": {"alpha": false, "Word": "decade,", "Number": 11}, "half": {"alpha": true, "Word": "half", "Number": 12}, "much": {"alpha": true, "Word": "much", "Number": 14}, "first": {"alpha": true, "Word": "first", "Number": 16}, "pitched,": {"alpha": false, "Word": "pitched,", "Number": 17}, "the": {"alpha": true, "Word": "the", "Number": 18}, "hunger—or": {"alpha": false, "Word": "hunger—or", "Number": 19}, "squid—games": {"alpha": false, "Word": "squid—games", "Number": 20}, "between": {"alpha": true, "Word": "between", "Number": 21}, "progressives": {"alpha": true, "Word": "progressives", "Number": 22}, "and": {"alpha": true, "Word": "and", "Number": 23}, "moderates": {"alpha": true, "Word": "moderates", "Number": 24}, "have": {"alpha": true, "Word": "have", "Number": 25}, "turned": {"alpha": true, "Word": "turned", "Number": 26}, "fiercer.": {"alpha": false, "Word": "fiercer.", "Number": 27}}
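As you can see, this json is not correct: some words are missing (the other two "as"). After some research, I think I am starting to understand why: if I understand correctly, neither a dictionary nor a json object can hold the same key more than once. The problem is that in most English sentences some words are repeated.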
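Example English sentence: As its price tag has been slashed to $1.7trn over a decade, half as much as first pitched, the hunger—or squid—games between progressives and moderates have turned fiercer.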
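In this sentence the word "as" appears 3 times, so I think that in my code the dictionary key is overwritten twice. Is my reasoning correct? If so, what can I do to solve this? Can I somehow get around the unique-key constraint of dictionaries and json? Which data structure should I use, and how do I get json or xml as output?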
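One way to sidestep the unique-key constraint is to not key the data by word at all. The sketch below keeps one record per word in a list, in sentence order, so repeated words survive, and then serializes the list to both json and xml. The helper names `word_records` and `records_to_xml` and the `<sentence>`/`<word>` element layout are assumptions made for illustration; nothing in the question prescribes them.

```python
import json
import xml.etree.ElementTree as ET

sentence = ("As its price tag has been slashed to $1.7trn over a decade, "
            "half as much as first pitched, the hunger—or squid—games "
            "between progressives and moderates have turned fiercer.")


def word_records(text):
    """Return one dict per word, in sentence order, so repeats are kept."""
    records = []
    for index, word in enumerate(text.split(" ")):
        lowered = word.lower()
        records.append({"Number": index, "Word": lowered,
                        "alpha": lowered.isalpha()})
    return records


def records_to_xml(records):
    """Build a <sentence> element with one <word> child per record.

    The element and attribute names are hypothetical, chosen for this sketch.
    """
    root = ET.Element("sentence")
    for record in records:
        child = ET.SubElement(root, "word",
                              number=str(record["Number"]),
                              alpha=str(record["alpha"]).lower())
        child.text = record["Word"]
    return root


records = word_records(sentence)
json_data = json.dumps(records, ensure_ascii=False)
xml_data = ET.tostring(records_to_xml(records), encoding="unicode")
print(json_data)
print(xml_data)
```

A json array may contain any number of objects with the same "Word" value, and an xml element may have any number of children with the same tag, so all three "as" entries are kept along with their positions.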
[Comments]:
You could look at collections.defaultdict or collections.Counter.
Thanks @oc11, that's what I was looking for!
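A minimal sketch of what the comment's suggestion might look like: `collections.defaultdict(list)` can collect every position of a word under a single key, and `collections.Counter` can count repeats. The short `sentence` here is a trimmed excerpt of the question's sentence, with punctuation removed to keep the example small.

```python
import json
from collections import Counter, defaultdict

# Trimmed excerpt of the question's sentence (an assumption for brevity).
sentence = "half as much as first pitched"

words = [word.lower() for word in sentence.split(" ")]

# Map each word to the list of all positions where it appears.
positions = defaultdict(list)
for index, word in enumerate(words):
    positions[word].append(index)

# Count how many times each word appears.
counts = Counter(words)

# defaultdict is a dict subclass, so json.dumps accepts it directly.
print(json.dumps(positions, ensure_ascii=False))
print(counts["as"])
```

Each key is still unique, but its value now holds all the indices, so no occurrence is lost.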
[Answer 1]:
In json you cannot get around this constraint, but you can add an attribute to each word's entry that counts its occurrences:
data.setdefault(word.lower(), {})["occurrences"] = data.get(word.lower(), {}).get("occurrences", 0) + 1
As a side note, I strongly suggest factoring repeated expressions out into a variable (here, at least word.lower()).
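Putting the two suggestions together, the question's loop might be rewritten roughly as follows. This is a sketch rather than the answerer's exact code: it assumes the entry should keep the index of the first occurrence under "Number", and it uses a trimmed excerpt of the question's sentence.

```python
import json

# Trimmed excerpt of the question's sentence (an assumption for brevity).
sentence = ("As its price tag has been slashed to $1.7trn over a decade, "
            "half as much as first pitched")

data = {}
for index, word in enumerate(sentence.split(" ")):
    key = word.lower()  # computed once instead of repeating word.lower()
    if key in data:
        # Repeated word: keep the existing entry and bump its counter.
        data[key]["occurrences"] += 1
    else:
        # First time this word is seen: create its entry.
        data[key] = {"alpha": key.isalpha(), "Word": key,
                     "Number": index, "occurrences": 1}

json_data = json.dumps(data, ensure_ascii=False)
print(json_data)
```

Note that this records how often a word appears but not where the later occurrences are; if every position matters, a list of records or a list of indices per word is a better fit.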