Turning my dictionary into a pandas dataframe

Posted 2023-03-12


[Title] Turning my dictionary into a pandas dataframe [Posted] 2020-05-11 14:56:03 [Question]:

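I have a function that builds several dicts of dicts based on some conditions.

However, once the data is collected I would really like to turn the dict into a dataframe, and I can't find a simple way to do it. Right now I am thinking the solution is to repeat each key in the dictionary by the number of keys in the innermost dict, but I hope there is a better way.

Since my function creates the dict, I can change it in any way if there is a better approach.

This is my dict as it stands now: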

d = {'TSLA': {2011: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2012: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2013: {'negative': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 37.24,
    'highDate': '03/26/12',
    'change': -0.12},
   'positive': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 190.9,
    'highDate': '09/23/13',
    'change': 4.8}}}}

My desired output should look something like this, with the actual values filled in of course:

                    lowPrice lowDate highPrice highDate change
ATVI  2012 Negative      NaN     NaN       NaN      NaN  NaN
           Positive      NaN     NaN       NaN      NaN  NaN
      2013 Negative      NaN     NaN       NaN      NaN  NaN
TSLA  2014 Positive      NaN     NaN       NaN      NaN  NaN
      2012 Negative      NaN     NaN       NaN      NaN  NaN
      2013 Positive      NaN     NaN       NaN      NaN  NaN
      2014 Positive      NaN     NaN       NaN      NaN  NaN


[Answer 1]:

You can flatten the nested dictionary so that each key becomes a tuple of the three outer keys, then pass the result to DataFrame.from_dict:

import pandas as pd

# keys become (k1, k2, k3) tuples, values are the innermost dicts
d1 = {(k1, k2, k3): v3
      for k1, v1 in d.items()
      for k2, v2 in v1.items()
      for k3, v3 in v2.items()}

df = pd.DataFrame.from_dict(d1, orient='index')
#alternative
#df = pd.DataFrame(d1).T

print (df)
                   lowPrice   lowDate highPrice  highDate change
TSLA 2011 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2012 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2013 negative    32.91  01/07/13     37.24  03/26/12  -0.12
          positive    32.91  01/07/13     190.9  09/23/13    4.8
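
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same flattening. The trimmed sample data and the level names `ticker`, `year` and `direction` are assumptions chosen for illustration; they are not part of the original answer.

```python
import pandas as pd

# Trimmed copy of the question's data (two years only).
d = {
    "TSLA": {
        2012: {
            "negative": {"lowPrice": 185.16, "lowDate": "05/27/19",
                         "highPrice": 365.71, "highDate": "12/10/18",
                         "change": -0.49},
        },
        2013: {
            "negative": {"lowPrice": 32.91, "lowDate": "01/07/13",
                         "highPrice": 37.24, "highDate": "03/26/12",
                         "change": -0.12},
            "positive": {"lowPrice": 32.91, "lowDate": "01/07/13",
                         "highPrice": 190.9, "highDate": "09/23/13",
                         "change": 4.8},
        },
    }
}

# One entry per innermost dict, keyed by the tuple of its three outer keys.
flat = {
    (k1, k2, k3): v3
    for k1, v1 in d.items()
    for k2, v2 in v1.items()
    for k3, v3 in v2.items()
}

# Tuple keys become the levels of a MultiIndex on the rows.
df = pd.DataFrame.from_dict(flat, orient="index")

# Hypothetical level names, added only to make the index easier to read.
df = df.rename_axis(["ticker", "year", "direction"])
print(df)
```

Naming the levels is optional; it mainly helps later when selecting by level name.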


[Answer 2]:

Similar, but you can also build the dict inline in the call to from_dict:

df = pd.DataFrame.from_dict({(i, j, x): y
                             for i in d.keys()
                             for j in d[i].keys()
                             for x, y in d[i][j].items()},
                            orient='index')

print (df)

                    lowPrice   lowDate  highPrice  highDate  change
TSLA 2011 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2012 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2013 negative     32.91  01/07/13      37.24  03/26/12   -0.12
          positive     32.91  01/07/13     190.90  09/23/13    4.80
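
A different route to the same shape, sketched here as an alternative and not as what the answers above do, is to nest `pd.concat` calls: when `pd.concat` is given a dict, the dict keys are prepended as an outer index level. The variable names and the trimmed data are assumptions for illustration.

```python
import pandas as pd

# Trimmed copy of the question's data (two years only).
d = {
    "TSLA": {
        2012: {
            "negative": {"lowPrice": 185.16, "lowDate": "05/27/19",
                         "highPrice": 365.71, "highDate": "12/10/18",
                         "change": -0.49},
        },
        2013: {
            "negative": {"lowPrice": 32.91, "lowDate": "01/07/13",
                         "highPrice": 37.24, "highDate": "03/26/12",
                         "change": -0.12},
            "positive": {"lowPrice": 32.91, "lowDate": "01/07/13",
                         "highPrice": 190.9, "highDate": "09/23/13",
                         "change": 4.8},
        },
    }
}

# Innermost step: each year's {'negative': {...}, ...} becomes a small frame
# indexed by 'negative' / 'positive'. Each pd.concat over a dict then adds
# the dict keys as a new outer index level: first the year, then the ticker.
df = pd.concat({
    ticker: pd.concat({
        year: pd.DataFrame.from_dict(by_direction, orient="index")
        for year, by_direction in by_year.items()
    })
    for ticker, by_year in d.items()
})
print(df)
```

This builds many small frames, so for large dictionaries the single flattened comprehension is likely the cheaper option.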


[Answer 3]:

Reference: Construct pandas DataFrame from items in nested dictionary

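# Note: the key (i, j) leaves out the third level, so when a year has both
# 'negative' and 'positive' entries only the last one is kept (see 2013 below).
df = pd.DataFrame.from_dict({(i, j): dict_[i][j][z]
                             for i in dict_.keys()
                             for j in dict_[i].keys()
                             for z in dict_[i][j].keys()},
                            orient='index')
df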


           lowPrice   lowDate  highPrice  highDate  change
TSLA 2011    185.16  05/27/19     365.71  12/10/18   -0.49
     2012    185.16  05/27/19     365.71  12/10/18   -0.49
     2013     32.91  01/07/13     190.90  09/23/13    4.80
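
The 2013 row above carries the 'positive' values only. A small sketch, using a cut-down version of the question's data and hypothetical variable names, shows why: a dict comprehension that produces the same key twice keeps only the last value.

```python
# Cut-down data: one ticker, one year, two innermost entries.
d = {"TSLA": {2013: {"negative": {"change": -0.12},
                     "positive": {"change": 4.8}}}}

# Keyed on (ticker, year) only: both 2013 entries map to the same key,
# so the later one ('positive') replaces the earlier one.
two_level = {
    (k1, k2): v3
    for k1, v1 in d.items()
    for k2, v2 in v1.items()
    for k3, v3 in v2.items()
}

# Keyed on all three levels: every innermost dict gets its own key.
three_level = {
    (k1, k2, k3): v3
    for k1, v1 in d.items()
    for k2, v2 in v1.items()
    for k3, v3 in v2.items()
}

print(len(two_level), len(three_level))
```

If both rows are wanted, as in the question's desired output, the third key has to be part of the tuple.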

[Answer 4]:

x = {'TSLA': {2011: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2012: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2013: {'negative': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 37.24,
    'highDate': '03/26/12',
    'change': -0.12},
   'positive': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 190.9,
    'highDate': '09/23/13',
    'change': 4.8}}}}

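import numpy as np
import pandas as pd

y = []  # one (k0, k1, k2) tuple per row, used for the MultiIndex
z = []  # all cell values, collected row after row
for k0 in x:
    for k1 in x[k0]:
        for k2 in x[k0][k1]:
            y.append((k0, k1, k2))
            # column names; assumes every innermost dict has the same keys
            col = list(x[k0][k1][k2].keys())
            for c in col:
                z.append(x[k0][k1][k2][c])

index = pd.MultiIndex.from_tuples(y)
# dtype=object keeps the floats as floats; a plain np.array(z) would turn
# the mixed numbers and date strings into strings
z = np.array(z, dtype=object).reshape(len(y), len(col))
df = pd.DataFrame(columns=col, index=index, data=z)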

print(df)

                   lowPrice   lowDate highPrice  highDate change
TSLA 2011 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2012 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2013 negative    32.91  01/07/13     37.24  03/26/12  -0.12
          positive    32.91  01/07/13     190.9  09/23/13    4.8
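
A variation on the same loop, offered as a sketch and not as the answer's own method, collects each innermost dict as one row and skips the NumPy reshape, which lets pandas infer a dtype per column. The level names and the trimmed data are assumptions for illustration.

```python
import pandas as pd

# Trimmed copy of the question's data (one year, two entries).
x = {
    "TSLA": {
        2013: {
            "negative": {"lowPrice": 32.91, "lowDate": "01/07/13",
                         "highPrice": 37.24, "highDate": "03/26/12",
                         "change": -0.12},
            "positive": {"lowPrice": 32.91, "lowDate": "01/07/13",
                         "highPrice": 190.9, "highDate": "09/23/13",
                         "change": 4.8},
        },
    }
}

keys = []  # one (ticker, year, direction) tuple per row
rows = []  # the matching innermost dicts, one per row
for ticker, by_year in x.items():
    for year, by_direction in by_year.items():
        for direction, record in by_direction.items():
            keys.append((ticker, year, direction))
            rows.append(record)

# Hypothetical level names; from_tuples works without them as well.
index = pd.MultiIndex.from_tuples(keys, names=["ticker", "year", "direction"])

# A list of dicts becomes one row per dict, with the dict keys as columns.
df = pd.DataFrame(rows, index=index)
print(df.dtypes)
```

With this layout the price and change columns come out as floats, while the date columns stay as strings.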
