Plot a stacked bar chart using matplotlib keeping the pandas dataframe order as it is using python

Posted: 2021-05-19 00:22:21

Question:

Hi, I have a df with 3 columns: "Day-Shift", "State", "seconds".

I need to visualize this data as a stacked bar chart while keeping the data in its original order.

    Day-Shift    State          seconds
    Day 01-05    A              7439
    Day 01-05    STOPPED        0
    Day 01-05    B              10
    Day 01-05    C              35751
    Night 01-05  C              43200
    Day 01-06    STOPPED        7198
    Day 01-06    F              18
    Day 01-06    A              14
    Day 01-06    A              29301
    Day 01-06    STOPPED        6
    Day 01-06    A              6663
    Night 01-06  A              43200

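Chart X axis: "Day-Shift", Y: "seconds", color: "State". I created the stacked bar chart with Python plotly, but the order of the data differs from the order in the dataframe.

For example: on Night 01-08, the segments should be drawn in the order they have in this df (STOPPED -> B -> RUNNING -> D), but the chart shows D -> STOPPED -> B -> RUNNING. Is there a way to create the stacked bar chart with matplotlib so that it keeps the same order as the dataframe? I have not used matplotlib before.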

DF:

import pandas as pd

df = pd.DataFrame({
    'Day-Shift': {
        0: 'Day 01-05', 1: 'Day 01-05', 2: 'Day 01-05', 3: 'Day 01-05',
        4: 'Night 01-05', 5: 'Day 01-06', 6: 'Day 01-06', 7: 'Day 01-06',
        8: 'Day 01-06', 9: 'Day 01-06', 10: 'Day 01-06', 11: 'Night 01-06',
        12: 'Day 01-07', 13: 'Night 01-07', 14: 'Night 01-07', 15: 'Night 01-07',
        16: 'Night 01-07', 17: 'Night 01-07', 18: 'Night 01-08', 19: 'Night 01-08',
        20: 'Night 01-08', 21: 'Night 01-08', 22: 'Day 01-08', 23: 'Day 01-08',
        24: 'Day 01-08', 25: 'Night 01-09', 26: 'Night 01-09', 27: 'Night 01-09',
        28: 'Day 01-09', 29: 'Day 01-09', 30: 'Day 01-09', 31: 'Day 01-09',
        32: 'Day 01-10', 33: 'Night 01-10', 34: 'Day 01-11', 35: 'Day 01-11',
        36: 'Day 01-11', 37: 'Day 01-11', 38: 'Day 01-11', 39: 'Night 01-11',
        40: 'Day 01-12', 41: 'Night 01-12', 42: 'Day 01-13', 43: 'Day 01-13',
        44: 'Day 01-13', 45: 'Day 01-13', 46: 'Day 01-13', 47: 'Day 01-13',
        48: 'Day 01-13', 49: 'Night 01-13', 50: 'Day 01-14', 51: 'Day 01-14',
        52: 'Day 01-14', 53: 'Day 01-14', 54: 'Day 01-14', 55: 'Day 01-14',
        56: 'Day 01-14', 57: 'Day 01-14', 58: 'Day 01-14', 59: 'Night 01-14'},
    'State': {
        0: 'D', 1: 'STOPPED', 2: 'B', 3: 'A', 4: 'A', 5: 'A',
        6: 'A1', 7: 'A2', 8: 'A3', 9: 'A4', 10: 'B1', 11: 'B1',
        12: 'B1', 13: 'B1', 14: 'B2', 15: 'STOPPED', 16: 'RUNNING', 17: 'B',
        18: 'STOPPED', 19: 'B', 20: 'RUNNING', 21: 'D', 22: 'STOPPED', 23: 'B',
        24: 'RUNNING', 25: 'STOPPED', 26: 'RUNNING', 27: 'B', 28: 'RUNNING', 29: 'STOPPED',
        30: 'B', 31: 'D', 32: 'B', 33: 'B', 34: 'B', 35: 'RUNNING',
        36: 'STOPPED', 37: 'D', 38: 'A', 39: 'A', 40: 'A', 41: 'A',
        42: 'A', 43: 'A1', 44: 'A2', 45: 'A3', 46: 'A4', 47: 'B1',
        48: 'B2', 49: 'B2', 50: 'B2', 51: 'B', 52: 'STOPPED', 53: 'A',
        54: 'A1', 55: 'A2', 56: 'A3', 57: 'A4', 58: 'B1', 59: 'B1'},
    'seconds': {
        0: 7439, 1: 0, 2: 10, 3: 35751, 4: 43200, 5: 7198,
        6: 18, 7: 14, 8: 29301, 9: 6, 10: 6663, 11: 43200,
        12: 43200, 13: 5339, 14: 8217, 15: 0, 16: 4147, 17: 1040,
        18: 24787, 19: 1500, 20: 14966, 21: 1410, 22: 2499, 23: 1310,
        24: 39391, 25: 3570, 26: 17234, 27: 47390, 28: 36068, 29: 270,
        30: 6842, 31: 20, 32: 43200, 33: 43200, 34: 2486, 35: 8420,
        36: 870, 37: 30, 38: 31394, 39: 43200, 40: 43200, 41: 43200,
        42: 36733, 43: 23, 44: 6, 45: 4, 46: 4, 47: 3,
        48: 6427, 49: 43200, 50: 620, 51: 0, 52: 4, 53: 41336,
        54: 4, 55: 4, 56: 4, 57: 23, 58: 1205, 59: 43200}})

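Comments:

Please research matplotlib and then make a serious attempt at a solution. Update the post with a specific question related to your attempt.

Hi @Parfait, I have researched matplotlib stacked bars a lot, but my requirement is a bit tricky. I have seen many articles about matplotlib stacked bar charts, but not one about keeping the stack order the same as the dataframe order.

Answer 1: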

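You can plot the bar for each shift independently as its own stacked bar, which preserves the order of the original dataframe. The code is as follows: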

# Get all the possible states and associate a color with each of them
all_states = df.State.unique()
cm = plt.get_cmap('tab20')  # you can choose the colormap you want
colors = {
    s: cm(1. * i / len(all_states))  # get a different color for each state, sampling the color map
    for i, s in enumerate(all_states)
}

fig, ax = plt.subplots(1, 1)
day_shifts = df['Day-Shift'].unique()
# Plot the bar of each shift independently, preserving the order of the stack
for i, d in enumerate(day_shifts):
    total_height = 0  # total height of the stacked bars so far
    # stack each state on top of the previous ones, in dataframe row order
    for t in df[df['Day-Shift'] == d].itertuples():
        ax.bar(i, t.seconds, bottom=total_height, color=colors[t.State], label=t.State, linewidth=2, edgecolor='w')
        total_height += t.seconds
# Add xticks with labels
ax.set_xticks(list(range(len(day_shifts))))
ax.set_xticklabels(day_shifts, rotation=45, ha='right')

# Create a single legend, removing duplicate labels
handles, labels = ax.get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys(), bbox_to_anchor=(1, 1))

You have to add the following imports to your code:

import matplotlib.pyplot as plt
from collections import OrderedDict
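
The loop in the answer keeps a running height per shift. As a rough sketch of the same idea without the inner loop, the bottom of every segment can be computed with a grouped cumulative sum. The `sample` frame and the `bottom` and `x` column names below are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Small made-up sample with the same columns as the question's df (not the real data)
sample = pd.DataFrame({
    'Day-Shift': ['Day 01-05', 'Day 01-05', 'Night 01-05', 'Day 01-06', 'Day 01-06'],
    'State': ['A', 'STOPPED', 'C', 'STOPPED', 'A'],
    'seconds': [100, 20, 300, 50, 70],
})

# Running total within each shift, following the dataframe's row order
running_total = sample.groupby('Day-Shift', sort=False)['seconds'].cumsum()

# The bottom of a segment is the running total before the segment itself is added
sample['bottom'] = running_total - sample['seconds']

# Integer x position of each shift, numbered in order of first appearance
sample['x'] = pd.factorize(sample['Day-Shift'])[0]

print(sample)
```

With these columns in place, a single call such as `ax.bar(sample['x'], sample['seconds'], bottom=sample['bottom'])` should draw every segment in dataframe order; per-segment colors could be passed as a list built from the `colors` dict.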

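Comments:

This is really great, PieCot. I tried many options to achieve this but could not get it to work. Thanks a lot, man. Do you know how to set the colors of the labels manually? For example: red for "STOPPED", yellow for "A".

Also, your comments were very helpful for understanding the code, I really appreciate it!

You can create the dict colors manually, choosing the color you prefer for each State. For example, colors = {'STOPPED': 'r', 'A': 'y', ...}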
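
One possible way to combine the two approaches, sketched here under assumptions: fix the colors of a few states by hand and let the colormap fill in the rest. The `manual_colors` name and the list of states are hypothetical; in the answer's code the states would come from `df.State.unique()`.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

# Hypothetical list of states; in the answer's code this would be df.State.unique()
all_states = ['D', 'STOPPED', 'B', 'A', 'RUNNING']

# Hand-picked colors for selected states (illustrative choices)
manual_colors = {'STOPPED': 'r', 'A': 'y'}

cm = plt.get_cmap('tab20')
colors = {}
for i, s in enumerate(all_states):
    if s in manual_colors:
        # Convert the short color name to an RGBA tuple, matching the colormap output
        colors[s] = to_rgba(manual_colors[s])
    else:
        # Fall back to sampling the colormap, as in the answer
        colors[s] = cm(i / len(all_states))
```

Note that a color sampled from the colormap may end up close to a hand-picked one, so it is worth checking the legend after plotting.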
