Plotting a grouped pandas dataframe

【Title】: Plotting a grouped pandas dataframe 【Posted】: 2019-12-14 00:37:34 【Question】:

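I've spent hours looking for an answer but can't seem to find one.

Long story short, I have a dataframe. The following code generates the dataframe in question (anonymised with random numbers):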

import pandas as pd

variable1 = ["Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 2","Attribute 2",
         "Attribute 2","Attribute 2","Attribute 2","Attribute 2","Attribute 3","Attribute 3","Attribute 3","Attribute 3",
         "Attribute 3","Attribute 3","Attribute 4","Attribute 4","Attribute 4","Attribute 4","Attribute 4","Attribute 4",
         "Attribute 5","Attribute 5","Attribute 5","Attribute 5","Attribute 5","Attribute 5"]


variable2 = ["Property1","Property2","Property3","Property4","Property5","Property6","Property1","Property2","Property3",
         "Property4","Property5","Property6","Property1","Property2","Property3",
         "Property4","Property5","Property6","Property1","Property2","Property3","Property4",
         "Property5","Property6","Property1","Property2","Property3","Property4","Property5","Property6"]

number = [93,224,192,253,186,266,296,100,135,169,373,108,211,194,164,375,211,71,120,334,59,164,348,50,249,18,251,343,172,41]

bar = pd.DataFrame({"variable1": variable1, "variable2": variable2, "number": number})

bar_grouped = bar.groupby(["variable1","variable2"]).sum()

The result should look like this:

And the second one:

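I have been trying to plot them in a bar chart, with the Properties as the groups and the different Attributes as the bars, similar to this (though drawn by hand in Excel). I would prefer to do it on the grouped dataframe, so that I can plot with different groupings without having to reset the index each time.

I hope this is clear.

Any help with this is much appreciated.

Thanks! :)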

【Comments】:

Try bar_grouped['number'].unstack(0).plot(kind='bar')

【Answer 1】:

I wouldn't bother creating your groupby result (since you aren't aggregating anything). This is a pivot:


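import matplotlib.pyplot as plt

# keyword arguments: recent pandas versions no longer accept these positionally
bar.pivot(index='variable2', columns='variable1', values='number').plot(kind='bar')

plt.tight_layout()
plt.show()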


If aggregation is needed, you can still start from your bar and use pivot_table:

bar.pivot_table(index='variable2', columns='variable1', values='number', aggfunc='sum')
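
To make the difference between the two calls concrete, here is a minimal sketch on a made-up toy frame (the name df and its values are illustrative assumptions, not the asker's data). It assumes one index/column pair occurs twice, which is the situation where pivot gives up and pivot_table aggregates instead:

```python
import pandas as pd

# Hypothetical toy data: the ("A", "x") pair appears twice.
df = pd.DataFrame({
    "variable1": ["A", "A", "A", "B", "B"],
    "variable2": ["x", "x", "y", "x", "y"],
    "number": [1, 2, 3, 4, 5],
})

# pivot cannot reshape when an index/column pair is repeated.
try:
    df.pivot(index="variable2", columns="variable1", values="number")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table combines the duplicates with the given aggfunc instead.
table = df.pivot_table(index="variable2", columns="variable1",
                       values="number", aggfunc="sum")

print(pivot_failed)
print(table)
```

Calling .plot(kind='bar') on table would then draw one group per variable2 value and one bar per variable1 value, just as with the pivot result.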

【Discussion】:

Aggregation is needed, since this is only part of a larger DataFrame that ends up having more values to aggregate, so thank you! :)

【Answer 2】:

Use unstack first:

bar_grouped['number'].unstack(0).plot(kind='bar')
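
Since the question asks about switching groupings without resetting the index, here is a small sketch of how the level passed to unstack decides which variable ends up as the groups and which as the bars. It uses only the first two Attributes and Properties from the question's data; the names by_property and by_attribute are made up for illustration:

```python
import pandas as pd

# A cut-down version of the question's frame (first two Attributes/Properties).
bar = pd.DataFrame({
    "variable1": ["Attribute 1", "Attribute 1", "Attribute 2", "Attribute 2"],
    "variable2": ["Property1", "Property2", "Property1", "Property2"],
    "number": [93, 224, 296, 100],
})
bar_grouped = bar.groupby(["variable1", "variable2"]).sum()

# unstack(0) moves variable1 into the columns: the Properties stay in the
# index (x-axis groups) and the Attributes become the bars in each group.
by_property = bar_grouped["number"].unstack(0)

# unstack(1) moves variable2 instead, which swaps the two roles.
by_attribute = bar_grouped["number"].unstack(1)

print(by_property)
print(by_attribute)
```

Either result can be passed straight to .plot(kind='bar'); the index supplies the groups on the x-axis and each column becomes one bar per group.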

【Discussion】:

【Answer 3】:

The following code will do what you are trying to build:

import numpy as np
import matplotlib.pyplot as plt

# set width of bar
barWidth = 0.25
f = plt.figure(figsize=(15,8))

# bar heights and x positions, keyed by the position of each variable2 value
bars = {}
bar_pos = {}
for i, prop in enumerate(bar_grouped.unstack().columns.droplevel(0).tolist()):
    bars[i] = bar_grouped.unstack()['number', prop].tolist()
    if i == 0:
        # groups are spaced 2 units apart
        bar_pos[i] = 2 * np.arange(len(bars[i]))
    else:
        # shift each further series one bar width to the right
        bar_pos[i] = [x + barWidth for x in bar_pos[i - 1]]
    plt.bar(bar_pos[i], bars[i], width=barWidth, edgecolor='white', label=prop)

# Add xticks on the middle of the group bars
plt.xlabel('group', fontweight='bold')
plt.xticks([2*r + 2*barWidth for r in range(len(bars[0]))], bar_grouped.unstack().index.tolist())
# plt.figure(figsize=(10,5))

# Create legend & Show graphic
plt.legend(loc=0)
plt.show()

I took the solution from here and modified it to fit your needs. Hope this helps!
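
The position bookkeeping in the loop above can also be done in one step with NumPy broadcasting. The following is only a sketch of that idea: grouped_bar_positions is a hypothetical helper, not part of matplotlib or pandas, and the default width and spacing simply mirror the values used in the answer.

```python
import numpy as np

def grouped_bar_positions(n_groups, n_bars, bar_width=0.25, group_spacing=2.0):
    """Return (positions, tick_centers) for a grouped bar chart.

    positions[i][g] is the x coordinate of bar i within group g.
    """
    group_starts = group_spacing * np.arange(n_groups)   # left-most bar of each group
    offsets = bar_width * np.arange(n_bars)              # shift of each bar inside a group
    positions = group_starts[None, :] + offsets[:, None]
    # With centre-aligned bars, the middle of a group is the mean of its bar positions.
    tick_centers = positions.mean(axis=0)
    return positions, tick_centers

# Five Attributes on the x-axis, six Properties as bars, as in the question.
positions, ticks = grouped_bar_positions(n_groups=5, n_bars=6)
print(positions.shape)
print(ticks)
```

Each row of positions could be passed to plt.bar in place of bar_pos[i], and ticks to plt.xticks, which places the labels at the centre of each group.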

【Discussion】:
