Turning my dictionary into a pandas dataframe
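【Title】: Turning my dictionary into a pandas dataframe 【Posted】: 2020-05-11 14:56:03 【Question】: I have a function that creates a couple of dicts of dicts based on some conditions.
However, I would really like to turn the dict into a dataframe after collecting it, but I can't find an easy way to do so. Right now I'm thinking the solution is to repeat each key of the dictionary once per key in the innermost dict, but I hope there is a better way.
Since my function creates the dict, I can change it in any way if there is a better approach.
Here is my dict as of now: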
{'TSLA': {2011: {'negative': {'lowPrice': 185.16,
                              'lowDate': '05/27/19',
                              'highPrice': 365.71,
                              'highDate': '12/10/18',
                              'change': -0.49}},
          2012: {'negative': {'lowPrice': 185.16,
                              'lowDate': '05/27/19',
                              'highPrice': 365.71,
                              'highDate': '12/10/18',
                              'change': -0.49}},
          2013: {'negative': {'lowPrice': 32.91,
                              'lowDate': '01/07/13',
                              'highPrice': 37.24,
                              'highDate': '03/26/12',
                              'change': -0.12},
                 'positive': {'lowPrice': 32.91,
                              'lowDate': '01/07/13',
                              'highPrice': 190.9,
                              'highDate': '09/23/13',
                              'change': 4.8}}}}
My desired output would look something like this, with actual values of course:
                     lowPrice  lowDate  highPrice  highDate  change
ATVI 2012 Negative        NaN      NaN        NaN       NaN     NaN
          Positive        NaN      NaN        NaN       NaN     NaN
     2013 Negative        NaN      NaN        NaN       NaN     NaN
TSLA 2014 Positive        NaN      NaN        NaN       NaN     NaN
     2012 Negative        NaN      NaN        NaN       NaN     NaN
     2013 Positive        NaN      NaN        NaN       NaN     NaN
     2014 Positive        NaN      NaN        NaN       NaN     NaN
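【Answer 1】: You can flatten the nested dictionary twice so that the keys become tuples, then pass the result to DataFrame.from_dict:
import pandas as pd

# d is the nested dict from the question.
# Each (ticker, year, direction) tuple becomes one row label.
d1 = {(k1, k2, k3): v3
      for k1, v1 in d.items()
      for k2, v2 in v1.items()
      for k3, v3 in v2.items()}

df = pd.DataFrame.from_dict(d1, orient='index')
# alternative
# df = pd.DataFrame(d1).T
print(df)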
                    lowPrice   lowDate  highPrice  highDate  change
TSLA 2011 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2012 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2013 negative     32.91  01/07/13      37.24  03/26/12   -0.12
          positive     32.91  01/07/13      190.9  09/23/13     4.8
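For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same flattening idea. The sample dict is a trimmed copy of the question's data (only 2013 is kept), and the level names ticker, year and direction are my own labels, not something from the original post.

```python
import pandas as pd

# Trimmed copy of the question's data (only 2013 kept, for brevity).
d = {'TSLA': {2013: {'negative': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 37.24, 'highDate': '03/26/12',
                                  'change': -0.12},
                     'positive': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 190.9, 'highDate': '09/23/13',
                                  'change': 4.8}}}}

# Flatten the three key levels into tuple keys; each value is one row dict.
flat = {(ticker, year, direction): row
        for ticker, years in d.items()
        for year, directions in years.items()
        for direction, row in directions.items()}

df = pd.DataFrame.from_dict(flat, orient='index')

# Hypothetical level names, chosen here only for readability.
df.index.names = ['ticker', 'year', 'direction']

# Look up a single cell by its full three-part row key.
high_2013_positive = df.loc[('TSLA', 2013, 'positive'), 'highPrice']
print(df)
print(high_2013_positive)
```

Naming the index levels is optional, but it makes later calls such as df.reset_index() produce readable column names.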
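【Answer 2】: Similar, but you can also build the flattened dict directly inside the from_dict call:
import pandas as pd

# d is the nested dict from the question.
df = pd.DataFrame.from_dict({(i, j, x): y
                             for i in d.keys()
                             for j in d[i].keys()
                             for x, y in d[i][j].items()},
                            orient='index')
print(df)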
                    lowPrice   lowDate  highPrice  highDate  change
TSLA 2011 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2012 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2013 negative     32.91  01/07/13      37.24  03/26/12   -0.12
          positive     32.91  01/07/13     190.90  09/23/13    4.80
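【Answer 3】: See: Construct pandas DataFrame from items in nested dictionary
import pandas as pd

# dict_ is the nested dict from the question.
# Note: the key is only (i, j), so when one year holds both 'negative' and
# 'positive', the later entry overwrites the earlier one (see 2013 below).
df = pd.DataFrame.from_dict({(i, j): dict_[i][j][z]
                             for i in dict_.keys()
                             for j in dict_[i].keys()
                             for z in dict_[i][j].keys()},
                            orient='index')
df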
           lowPrice   lowDate  highPrice  highDate  change
TSLA 2011    185.16  05/27/19     365.71  12/10/18   -0.49
     2012    185.16  05/27/19     365.71  12/10/18   -0.49
     2013     32.91  01/07/13     190.90  09/23/13    4.80
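The output above has a single 2013 row even though the question's data has two entries for that year. A small sketch of why, using toy data of my own with the same shape: a dict comprehension keyed on only two of the three levels silently keeps the last entry per key, while a three-part key keeps every row.

```python
# Toy data with two entries under the same (ticker, year) pair.
dict_ = {'TSLA': {2013: {'negative': {'change': -0.12},
                         'positive': {'change': 4.8}}}}

# Two-part key: both entries map to ('TSLA', 2013), so the last one wins.
two_level = {(i, j): dict_[i][j][z]
             for i in dict_
             for j in dict_[i]
             for z in dict_[i][j]}

# Three-part key: every entry keeps its own key, hence its own row.
three_level = {(i, j, z): dict_[i][j][z]
               for i in dict_
               for j in dict_[i]
               for z in dict_[i][j]}

print(len(two_level), len(three_level))  # 1 2
print(two_level[('TSLA', 2013)])         # {'change': 4.8}
```

So the two-level version only fits data where each (ticker, year) pair has exactly one innermost entry.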
【Answer 4】:
x = {'TSLA': {2011: {'negative': {'lowPrice': 185.16,
                                  'lowDate': '05/27/19',
                                  'highPrice': 365.71,
                                  'highDate': '12/10/18',
                                  'change': -0.49}},
              2012: {'negative': {'lowPrice': 185.16,
                                  'lowDate': '05/27/19',
                                  'highPrice': 365.71,
                                  'highDate': '12/10/18',
                                  'change': -0.49}},
              2013: {'negative': {'lowPrice': 32.91,
                                  'lowDate': '01/07/13',
                                  'highPrice': 37.24,
                                  'highDate': '03/26/12',
                                  'change': -0.12},
                     'positive': {'lowPrice': 32.91,
                                  'lowDate': '01/07/13',
                                  'highPrice': 190.9,
                                  'highDate': '09/23/13',
                                  'change': 4.8}}}}
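import numpy as np
import pandas as pd

y = []  # one (ticker, year, direction) tuple per row
z = []  # all cell values, row by row
for k0 in x:
    for k1 in x[k0]:
        for k2 in x[k0][k1]:
            y.append((k0, k1, k2))
            # Column names; assumes every innermost dict has the same keys.
            col = list(x[k0][k1][k2].keys())
            for c in col:
                z.append(x[k0][k1][k2][c])

index = pd.MultiIndex.from_tuples(y)
# dtype=object keeps the floats as floats; the default would turn every
# value into a string because the list mixes numbers and date strings.
z = np.array(z, dtype=object).reshape(len(index), len(col))
df = pd.DataFrame(columns=col, index=index, data=z)
print(df)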
                    lowPrice   lowDate  highPrice  highDate  change
TSLA 2011 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2012 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2013 negative     32.91  01/07/13      37.24  03/26/12   -0.12
          positive     32.91  01/07/13      190.9  09/23/13     4.8
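One thing to watch with this approach is how NumPy handles a list that mixes floats and date strings. The sketch below uses three values taken from the question's data to show that, unless dtype=object is passed, np.array picks a common string dtype, so the prices end up as text in the resulting DataFrame. The variable names are mine.

```python
import numpy as np

# A float, a date string and another float, as collected from one row.
mixed = [185.16, '05/27/19', 365.71]

as_default = np.array(mixed)               # NumPy picks a common string dtype
as_object = np.array(mixed, dtype=object)  # each element keeps its Python type

print(as_default.dtype.kind)  # U
print(type(as_object[0]))     # <class 'float'>
```

If the prices are needed as numbers later, either build the array with dtype=object or convert the affected columns afterwards, for example with pd.to_numeric.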