Python Group and aggregate unidirectionally a list of dictionaries by multiple keys
Posted: 2022-01-15 02:35:45
Question: I am building a tree selector and need to structure my data into a tree of grouped items. I have the following input, which is a list of dictionaries.
data = [
    {'region': 'R1', 'group': 'G1', 'category': 'C1', 'item': 'I2'},
    {'region': 'R1', 'group': 'G1', 'category': 'C1', 'item': 'I1'},
    {'region': 'R1', 'group': 'G2', 'category': 'C2', 'item': 'I3'},
    {'region': 'R2', 'group': 'G1', 'category': 'C1', 'item': 'I1'},
    {'region': 'R2', 'group': 'G2', 'category': 'C2', 'item': 'I3'},
    {'region': 'R2', 'group': 'G2', 'category': 'C2', 'item': 'I4'},
    {'region': 'R2', 'group': 'G2', 'category': 'C3', 'item': 'I5'},
]
I would like to get the following output:
result = {
    "regions": [
        {
            "name": "R1",
            "groups": [
                {
                    "name": "G1",
                    "categories": [
                        {"name": "C1", "items": [{"name": "I2"}, {"name": "I1"}]}
                    ]
                },
                {
                    "name": "G2",
                    "categories": [
                        {"name": "C2", "items": [{"name": "I3"}]}
                    ]
                }
            ]
        },
        {
            "name": "R2",
            "groups": [
                {
                    "name": "G1",
                    "categories": [
                        {"name": "C1", "items": [{"name": "I1"}]}
                    ]
                },
                {
                    "name": "G2",
                    "categories": [
                        {"name": "C2", "items": [{"name": "I3"}, {"name": "I4"}]},
                        {"name": "C3", "items": [{"name": "I5"}]}
                    ]
                }
            ]
        }
    ]
}
After some research, I came up with this solution:
from collections import OrderedDict

# Step 1: collect the items under each (region, group, category) triple.
d = OrderedDict()
for aggr in data:
    d.setdefault(
        key=(aggr['region'], aggr['group'], aggr['category']),
        default=list()
    ).append({"name": aggr['item']})

# Step 2: collect the categories under each (region, group) pair.
d1 = OrderedDict()
for k, v in d.items():
    d1.setdefault(
        key=(k[0], k[1]),
        default=list()
    ).append({"name": k[2], "items": v})

# Step 3: collect the groups under each region.
d2 = OrderedDict()
for k, v in d1.items():
    d2.setdefault(
        key=k[0],
        default=list()
    ).append({"name": k[1], "categories": v})

result = {"regions": [{"name": k, "groups": v} for k, v in d2.items()]}
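It works, but I don't believe it is the most Pythonic solution, and I haven't managed to simplify it.
Any help with an alternative solution, or with improvements to the code above, would be much appreciated.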
Answer 1: As long as the items are sorted, as in your example, you can use groupby from itertools in a recursive function, for example:
from itertools import groupby
from operator import itemgetter

def plural(word):
    # "group" -> "groups", "category" -> "categories"
    return f"{word}s" if word[-1] != 'y' else f"{word[:-1]}ies"

def grouping(records, *keys):
    # Last key: emit the leaf entries.
    if len(keys) == 1:
        return [{"name": record[keys[0]]} for record in records]
    # Otherwise group on the first key and recurse on the remaining keys.
    return [
        {"name": key, plural(keys[1]): grouping(group, *keys[1:])}
        for key, group in groupby(records, itemgetter(keys[0]))
    ]

result = {"regions": grouping(data, "region", "group", "category", "item")}
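The ordering requirement comes from how groupby works: it only merges consecutive records that share a key. The sketch below illustrates this on a small made-up list (rows and by_region are hypothetical names, not part of the answer) in which the two 'R1' rows are not adjacent.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample: the two 'R1' rows are separated by an 'R2' row.
rows = [
    {"region": "R1", "item": "I1"},
    {"region": "R2", "item": "I2"},
    {"region": "R1", "item": "I3"},
]

by_region = itemgetter("region")

# Unsorted input: 'R1' is reported as two separate groups.
unsorted_keys = [key for key, _ in groupby(rows, by_region)]

# Sorted input: each region is reported exactly once.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=by_region), by_region)]

print(unsorted_keys)  # ['R1', 'R2', 'R1']
print(sorted_keys)    # ['R1', 'R2']
```

With unsorted input, the recursive function would therefore emit duplicate nodes for the same region instead of one aggregated node.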
If the ordering can't be guaranteed, grouping can be adjusted as follows:
def grouping(records, *keys):
    if len(keys) == 1:
        return [{"name": record[keys[0]]} for record in records]
    key_func = itemgetter(keys[0])
    # Sort on the current key so that groupby sees equal keys side by side.
    records = sorted(records, key=key_func)
    return [
        {"name": key, plural(keys[1]): grouping(group, *keys[1:])}
        for key, group in groupby(records, key_func)
    ]
Or sort data beforehand:
keys = ["region", "group", "category", "item"]
data = sorted(data, key=itemgetter(*keys))
result = {"regions": grouping(data, *keys)}
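Both sorting variants reorder the output by key value. If the first-seen order of the input should be kept instead, one possible alternative (not part of the answer above) is to build nested dictionaries first and convert them to the list-of-nodes shape afterwards. This is a minimal sketch that assumes at least two keys; build_tree and to_nodes are hypothetical helper names, and the three-row data list is made up for illustration.

```python
def plural(word):
    # "group" -> "groups", "category" -> "categories"
    return f"{word}s" if word[-1] != "y" else f"{word[:-1]}ies"

def build_tree(records, *keys):
    # Hypothetical helper: nest plain dicts by every key except the last two,
    # then keep a list of leaf entries under the second-to-last key.
    *parents, leaf = keys
    root = {}
    for record in records:
        node = root
        for key in parents[:-1]:
            node = node.setdefault(record[key], {})
        node.setdefault(record[parents[-1]], []).append({"name": record[leaf]})
    return root

def to_nodes(tree, keys):
    # Hypothetical helper: turn the nested dicts into [{"name": ..., "<plural>": ...}].
    child_label = plural(keys[1])
    if len(keys) == 2:
        # At this depth the values are already lists of leaf entries.
        return [{"name": name, child_label: leaves} for name, leaves in tree.items()]
    return [
        {"name": name, child_label: to_nodes(subtree, keys[1:])}
        for name, subtree in tree.items()
    ]

keys = ["region", "group", "category", "item"]
# Made-up, deliberately unsorted sample: the two 'R2' rows are not adjacent.
data = [
    {"region": "R2", "group": "G1", "category": "C1", "item": "I1"},
    {"region": "R1", "group": "G1", "category": "C1", "item": "I2"},
    {"region": "R2", "group": "G1", "category": "C1", "item": "I3"},
]
result = {"regions": to_nodes(build_tree(data, *keys), keys)}
print(result)
```

Because dictionaries preserve insertion order (Python 3.7+), 'R2' comes before 'R1' in this result, and both 'R2' rows end up under a single node without any sorting.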
Result for the first version of data provided in the question:
result = {
    "regions": [
        {
            "name": "R1",
            "groups": [
                {
                    "name": "G1",
                    "categories": [
                        {"name": "C1", "items": [{"name": "I2"}, {"name": "I1"}]}
                    ]
                },
                {
                    "name": "G2",
                    "categories": [
                        {"name": "C2", "items": [{"name": "I3"}]}
                    ]
                }
            ]
        },
        {
            "name": "R2",
            "groups": [
                {
                    "name": "G1",
                    "categories": [
                        {"name": "C1", "items": [{"name": "I1"}]}
                    ]
                },
                {
                    "name": "G2",
                    "categories": [
                        {"name": "C2", "items": [{"name": "I3"}, {"name": "I4"}]},
                        {"name": "C3", "items": [{"name": "I5"}]}
                    ]
                }
            ]
        }
    ]
}
Comments:
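Indeed, your answer is very Pythonic and simple. Thanks for taking the time to help. However, it doesn't output the exact structure required, because it only aggregates on the last element and truncates the intermediate nodes. In fact, the result of the code above doesn't include the last entry of the data, {'region': 'R2', 'group': 'G2', 'category': 'C3', 'item': 'I5'}. The aggregation should happen at every intermediate level, not only at the last one (item).
@Rukamakama Thanks for the feedback. I'm a bit surprised: doesn't the result match your expected output exactly?
I'm really sorry. The problem was on my side: I used a different data input from the one I posted. Your answer is indeed correct, and I have to upvote it. Thanks a lot.
@Rukamakama No problem! I just realized I forgot to upvote your question: a very interesting one!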