How to calculate the mean of elements of lists inside a dictionary in Python?

Posted 2023-03-24

【Title】: How to calculate the mean of elements of lists inside a dictionary in Python? 【Posted】: 2021-12-03 07:02:21 【Question】:

Hi, I have a dict in Python that looks like this:

'NN3-001': 'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80],
'NN3-002': 'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]

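Here NN3-X is the id of the time series, diffe and mas are the names of the models, and the number after the _ identifies which execution of the model the list comes from.

I want the mean of the i-th element of each list with the i-th element of the other list that has the same model name. For example, 1 from diffe_1 and 5 from diffe_2 have a mean of 3, so the final result looks like this: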

'NN3-001': 'diffe':[3,4,5,6], 'mas':[30,40,50,60],
'NN3-002': 'diffe':[16,17,18,19], 'mas':[300,400,500,600]

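Thanks.

【Comments】:

Your dictionary is incorrect. Python can't use 'NN3-001': 'diffe_1': [1,2,3,4]; it needs something like 'NN3-001': {'diffe_1':[1,2,3,4]}.

What did you try? Where is your code?

How about data['NN3-001']['diffe_1'] to get the list [1,2,3,4]? Then you can calculate the mean for that list. If you want to calculate all of them, you have to use for-loops and dict.items().

You still have an incorrect dictionary at the start, and you also show an incorrect dictionary as the result. :)

【Answer 1】: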

First: your example is not a valid dictionary. You are missing { } in some places.

You should have

    {
        'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
        'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
    }

To get a single list you can use

values = data['NN3-001']['diffe_1']

and then you can calculate the mean

mean = sum(values)/len(values)
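As a small sketch of the same step, the standard library's statistics.fmean computes the same value as the manual formula. The data dict below is a cut-down sample in the shape of the question's dictionary, and the names manual and builtin are only illustrative.

```python
from statistics import fmean

# Sample data shaped like the question's dictionary (one series only).
data = {'NN3-001': {'diffe_1': [1, 2, 3, 4], 'mas_1': [10, 20, 30, 40]}}

values = data['NN3-001']['diffe_1']

manual = sum(values) / len(values)  # raises ZeroDivisionError on an empty list
builtin = fmean(values)             # raises statistics.StatisticsError on an empty list

print(manual, builtin)  # 2.5 2.5
```

Both forms fail on an empty list, just with different exceptions, so it may be worth checking for empty lists first if they can occur in your data.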

For all lists you have to use for-loops and dict.items()

dictionary = {
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    for key, data in values.items():
        print('  key:', key)
        print(' data:', data)
        print(' mean:', sum(data)/len(data))
        print('---')

Result:

=== time serie: NN3-001 ===
  key: diffe_1
 data: [1, 2, 3, 4]
 mean: 2.5
---
  key: mas_1
 data: [10, 20, 30, 40]
 mean: 25.0
---
  key: diffe_2
 data: [5, 6, 7, 8]
 mean: 6.5
---
  key: mas_2
 data: [50, 60, 70, 80]
 mean: 65.0
---
=== time serie: NN3-002 ===
  key: diffe_1
 data: [14, 15, 16, 17]
 mean: 15.5
---
  key: mas_1
 data: [100, 200, 300, 400]
 mean: 250.0
---
  key: diffe_2
 data: [18, 19, 20, 21]
 mean: 19.5
---
  key: mas_2
 data: [500, 600, 700, 800]
 mean: 650.0
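If you would rather keep the per-list means than print them, the same two nested loops can be written as a nested dict comprehension. This is only a sketch of one way to do it; the name means is not from the original answer.

```python
dictionary = {
    'NN3-001': {'diffe_1': [1, 2, 3, 4], 'mas_1': [10, 20, 30, 40],
                'diffe_2': [5, 6, 7, 8], 'mas_2': [50, 60, 70, 80]},
    'NN3-002': {'diffe_1': [14, 15, 16, 17], 'mas_1': [100, 200, 300, 400],
                'diffe_2': [18, 19, 20, 21], 'mas_2': [500, 600, 700, 800]},
}

# Map every series id to {list name: mean of that list}.
means = {
    name: {key: sum(data) / len(data) for key, data in values.items()}
    for name, values in dictionary.items()
}

print(means['NN3-001'])
# {'diffe_1': 2.5, 'mas_1': 25.0, 'diffe_2': 6.5, 'mas_2': 65.0}
```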

EDIT:

After the question was changed, I see that you need zip(diffe_1, diffe_2) to create the pairs.

dictionary = {
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

result = {}

for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    
    result[name] = {'diff': [], 'mas': []}
    
    print('--- diffe_1, diffe_2 ---')
    for a, b in zip(values['diffe_1'],values['diffe_2']):
        mean = int( (a+b)/2 )
        print(a, '&', b, '=>', mean)
        result[name]['diff'].append(mean)
        
    print('--- mas_1, mas_2 ---')
    for a, b in zip(values['mas_1'],values['mas_2']):
        mean = int( (a+b)/2 )
        print(a, '&', b, '=>', mean)
        result[name]['mas'].append(mean)

print(result)

This gives:

=== time serie: NN3-001 ===
--- diffe_1, diffe_2 ---
1 & 5 => 3
2 & 6 => 4
3 & 7 => 5
4 & 8 => 6
--- mas_1, mas_2 ---
10 & 50 => 30
20 & 60 => 40
30 & 70 => 50
40 & 80 => 60
=== time serie: NN3-002 ===
--- diffe_1, diffe_2 ---
14 & 18 => 16
15 & 19 => 17
16 & 20 => 18
17 & 21 => 19
--- mas_1, mas_2 ---
100 & 500 => 300
200 & 600 => 400
300 & 700 => 500
400 & 800 => 600

{
'NN3-001': {'diff': [3, 4, 5, 6], 'mas': [30, 40, 50, 60]},
'NN3-002': {'diff': [16, 17, 18, 19], 'mas': [300, 400, 500, 600]}
}
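One thing to keep in mind: int() truncates, so it only reproduces the exact mean when the sum of each pair is even, as it happens to be in this data. A minimal sketch of the difference follows; pairwise_mean is a hypothetical helper name, not part of the answer's code.

```python
def pairwise_mean(first, second):
    """Element-wise mean of two lists, kept as floats."""
    return [(a + b) / 2 for a, b in zip(first, second)]

print(pairwise_mean([1, 2, 3, 4], [5, 6, 7, 8]))        # [3.0, 4.0, 5.0, 6.0]
print(pairwise_mean([1, 2], [4, 4]))                    # [2.5, 3.0]
print([int(m) for m in pairwise_mean([1, 2], [4, 4])])  # [2, 3]
```

If fractional means matter for your models, it is probably safer to keep the floats and convert only when displaying the result.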

You can also use a loop, for prefix in ['diffe', 'mas']:, to reduce the code.

dictionary = {
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

result = {}

for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    
    
    result[name] = {}
    
    for prefix in ['diffe', 'mas']:

        print('--- prefix:', prefix, '---')
        
        result[name][prefix] = []

        for a, b in zip(values[prefix+'_1'],values[prefix+'_2']):
            mean = int( (a+b)/2 )
            print(a, '&', b, '=>', mean)
            result[name][prefix].append(mean)
        
print(result)
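The prefix loop above still hard-codes the model names and assumes exactly two runs (_1 and _2). As a hedged sketch of a more general variant, the keys can be grouped by the text before the last underscore and the element-wise mean taken over however many runs exist. The function name mean_by_prefix is made up for illustration, and the code assumes every key follows the name_number pattern.

```python
from collections import defaultdict

def mean_by_prefix(dictionary):
    """Element-wise mean over all lists that share a key prefix."""
    result = {}
    for name, values in dictionary.items():
        # Collect the lists of each model: 'diffe_1', 'diffe_2' -> 'diffe'.
        groups = defaultdict(list)
        for key, data in values.items():
            prefix = key.rsplit('_', 1)[0]
            groups[prefix].append(data)
        # zip(*lists) yields one tuple per position i across all runs.
        result[name] = {
            prefix: [sum(column) / len(column) for column in zip(*lists)]
            for prefix, lists in groups.items()
        }
    return result

dictionary = {
    'NN3-001': {'diffe_1': [1, 2, 3, 4], 'mas_1': [10, 20, 30, 40],
                'diffe_2': [5, 6, 7, 8], 'mas_2': [50, 60, 70, 80]},
    'NN3-002': {'diffe_1': [14, 15, 16, 17], 'mas_1': [100, 200, 300, 400],
                'diffe_2': [18, 19, 20, 21], 'mas_2': [500, 600, 700, 800]},
}

print(mean_by_prefix(dictionary))
```

Note that zip() stops at the shortest list, so runs of different lengths would be silently truncated, and the means come back as floats rather than ints.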

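【Comments】:

Thanks for your answer, but I think I didn't explain it well: I want the mean between the 2 elements of the lists that belong to the same model and the same time-series id.

That needs zip(diffe_1, diffe_2) to get the pairs. But it may need more names set manually in zip().

The one with the prefix works great. Thanks =)

You can mark my answer as accepted, and in a few minutes you will be able to upvote it.

【Answer 2】: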

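First, yes, your dictionary is invalid, but that is probably because you wrote it on only two lines. You probably meant to write it like this: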

dictionary = {
    'NN3-001': {
        'diffe_1':[1,2,3,4],
        'mas_1':[10,20,30,40],
        'diffe_2':[5,6,7,8],
        'mas_2':[50,60,70,80],
        },
    'NN3-002': {
        'diffe_1':[14,15,16,17],
        'mas_1':[100,200,300,400],
        'diffe_2':[18,19,20,21],
        'mas_2':[500,600,700,800],
        },
}
For the function that computes the mean:

import numpy as np

def compute_mean(dictionary):
    new_dictionary = {}
    # Loop on 'NN3-' level
    for key, sub_dictionary in dictionary.items():
        new_sub_dictionary, accumulated_arrays = {}, {}
        # Loop on 'diffe_' level
        # (the loop variable is named values so it does not shadow the built-in list)
        for sub_key, values in sub_dictionary.items():
            # Extract the sub_key without the _n
            sub_key = sub_key.split('_')[0]
            # If we already encountered this sub_key
            if sub_key in new_sub_dictionary:
                new_sub_dictionary[sub_key] = new_sub_dictionary[sub_key] + np.array(values)
                accumulated_arrays[sub_key] += 1
            # If we haven't encountered this sub_key
            else:
                new_sub_dictionary[sub_key] = np.array(values)
                # Start the count at 1: one array has been accumulated so far
                accumulated_arrays[sub_key] = 1
        # Compute mean and convert back to a plain Python list
        for sub_key, array in new_sub_dictionary.items():
            new_sub_dictionary[sub_key] = (array / accumulated_arrays[sub_key]).tolist()
        # Add to the main dictionary
        new_dictionary[key] = new_sub_dictionary
    return new_dictionary
