Pandas groupby: How to sort weekdays in the correct order when creating groupby with two columns?

Posted: 2019-05-03 16:20:22

Question:

The following DataFrame contains one value (kWh) for every hour of a year.

cons2016.head()

    Date        Hour    kWh     Month   Weekday
0   2016-01-01  00:00   71.48   January Friday
1   2016-01-01  01:00   65.32   January Friday
2   2016-01-01  02:00   65.38   January Friday
3   2016-01-01  03:00   62.44   January Friday
4   2016-01-01  04:00   57.56   January Friday

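I want to create a Seaborn heatmap from this DataFrame, with the weekdays in the correct order on the vertical axis and the hours on the horizontal axis. So I group: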

weekdayhour = cons2016.groupby(["Weekday", "Hour"]).mean()
weekdayhour = weekdayhour.reset_index()
weekdayhour.head()

    Weekday Hour    kWh
0   Friday  00:00   61.188113
1   Friday  01:00   57.231698
2   Friday  02:00   55.818679
3   Friday  03:00   55.074151
4   Friday  04:00   55.049811

But now the weekdays are sorted alphabetically (in the heatmap as well):

heat_weekdayhour = weekdayhour.pivot(index="Weekday", columns="Hour", values="kWh")
sns.heatmap(heat_weekdayhour)

How can I get the weekdays in their normal order, Monday through Sunday? I tried adding .reindex like this:

weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
weekdayhour = cons2016.groupby(["Weekday", "Hour"]).mean().reindex(labels=weekdays)

But this gives me TypeError: Expected tuple, got str

Thanks for your help!

Answer 1:

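Use pd.Categorical to give the Weekday column an explicit order, then sort by it:

weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
# ordered=True makes the categories compare as Monday < Tuesday < ... < Sunday
weekdayhour["Weekday"] = pd.Categorical(weekdayhour["Weekday"], categories=weekdays, ordered=True)
# Sorting a categorical column follows the category order, not the alphabet
weekdayhour = weekdayhour.sort_values("Weekday")

Result for the five sample rows from the question (all Fridays):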
  Weekday   Hour    kWh
0  Friday  00:00  71.48
1  Friday  01:00  65.32
2  Friday  02:00  65.38
3  Friday  03:00  62.44
4  Friday  04:00  57.56

More information:

weekdayhour.Weekday
0    Friday
1    Friday
2    Friday
3    Friday
4    Friday
Name: Weekday, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
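
To connect this back to the heatmap in the question, here is a minimal sketch of the whole path: make Weekday categorical first, then group and pivot. The frame `cons` is a hypothetical two-week stand-in for cons2016, and selecting the kWh column before calling mean() is my own choice so that non-numeric columns do not get in the way. Treat it as an illustration under those assumptions, not as the answerer's exact code.

```python
import numpy as np
import pandas as pd

weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical stand-in for cons2016: 14 days of hourly readings
idx = pd.date_range("2016-01-01", periods=14 * 24, freq=pd.Timedelta(hours=1))
cons = pd.DataFrame({
    "Weekday": idx.day_name(),
    "Hour": idx.strftime("%H:%M"),
    "kWh": np.arange(len(idx), dtype=float),
})

# Give the weekday names an explicit Monday..Sunday order
cons["Weekday"] = pd.Categorical(cons["Weekday"], categories=weekdays, ordered=True)

# Group on both columns, average only the numeric kWh column
weekdayhour = (
    cons.groupby(["Weekday", "Hour"], observed=False)["kWh"]
    .mean()
    .reset_index()
)

# Rows of the pivot table now follow the category order
heat = weekdayhour.pivot(index="Weekday", columns="Hour", values="kWh")
print(list(heat.index))

# With seaborn installed, the plot would then be: sns.heatmap(heat)
```

Because the order lives in the column's dtype, both groupby and pivot pick it up without any extra sorting step.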

Comments:

Thanks. So this doesn't need groupby at all!

Answer 2:
import pandas as pd

#You first create your list in the order you want it
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

#Using Categorical() function to set the order according to how it is arranged above
df["DOTW_Appointment"] = pd.Categorical(df.DOTW_Appointment, categories=days, ordered=True)
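
As a small illustration of what ordered=True buys you, the sketch below uses a hypothetical three-value Series `s` instead of the `df` above. Sorting follows the list order, and comparisons against a weekday name become meaningful.

```python
import pandas as pd

days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical sample values, deliberately out of calendar order
s = pd.Series(pd.Categorical(["Friday", "Monday", "Sunday"], categories=days, ordered=True))

# Sorting uses the category order rather than alphabetical order
print(s.sort_values().tolist())

# Ordered categoricals support comparisons against a category value
print((s < "Saturday").tolist())

# min() and max() also respect the category order
print(s.min())
```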
