How to plot a stacked bar with annotations for multiple groups
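【Title】: How to plot a stacked bar with annotations for multiple groups
【Posted】: 2021-12-20 15:32:57
【Question】:
In the histogram, a gap appears between two of the bars. Does anyone know why?
I get this error:
The number of FixedLocator locations (11), usually from a call to set_ticks, does not match the number of ticklabels (10).
The csv file has only 2 columns: one holds the country name and the other the type of medal won. Each row contains one medal type and its country.
The file is available at: https://github.com/jpiedehierroa/files/blob/main/Libro1.csv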
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
my_csv = Path("C:/Usersjosep/Desktop/Libro1.csv")
df = pd.read_csv("Libro1.csv", sep=',')
# or load from github repo link
url = 'https://raw.githubusercontent.com/jpiedehierroa/files/main/Libro1.csv'
df = pd.read_csv(url)
# Prepare data
x_var = 'countries'
groupby_var = 'type'
df_agg = df.loc[:,[x_var, groupby_var]].groupby(groupby_var)
vals = [df[x_var].values.tolist() for i, df in df_agg]
# Draw
plt.figure(figsize=(10,10), dpi= 100)
colors= ("#CD7F32","silver","gold")
n, bins, patches = plt.hist(vals, df[x_var].unique().__len__(), stacked=True, density=False, color=colors[:len(vals)])
# Decoration
plt.legend(["bronze", "silver","gold"], loc="upper right")
plt.title(f"Histogram of medals achieved by $x_var$ colored by $groupby_var$ in Tokyo 2020", fontsize=18)
plt.text(2,80,"138")
plt.xlabel(x_var)
plt.ylabel("amount of medals by type")
plt.ylim(0, 130)
plt.xticks(ticks=bins, labels=np.unique(df[x_var]).tolist(), rotation=90, horizontalalignment='left')
plt.show()
Test data
In case the link dies:
countries,type
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
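【Comments】:
If you are using jupyter, then all the plot and labeling code should be in the same cell. If you are using anaconda, update at the anaconda prompt with conda update --all
Hi Trenton, you are completely right. It worked after updating spyder. The problem is that I have to present it in jupyter, so, to close the topic: does "in the same cell" mean in the same lines of code? Thanks.
As shown in the link from my previous comment, a cell can contain multiple lines of code. All of my code and examples are run in jupyter lab
OK, but how do you separate these cells? I didn't even know they existed. If I understand correctly, I start coding, but always in the same "block"/cell. I see that you separate the load/reshape step from the plot. Is that what you mean?
As the screenshot in that example shows, when making a plot in a Jupyter Notebook, any code that affects the plot's formatting must be in the same cell.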
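【Answer 1】:
This is more easily implemented as a stacked bar plot, so reshape the dataframe with pandas.crosstab and plot using pandas.DataFrame.plot with kind='bar' and stacked=True.
This should not be implemented with plt.hist, because that is more convoluted; using the pandas plot method directly is easier.
A histogram is more appropriate when the x values are a continuous range of numbers, not discrete categorical values.
ct.iloc[:, :-1] selects all but the last column, 'tot', to be plotted as bars.
Use matplotlib.pyplot.bar_label to add annotations.
ax.bar_label(ax.containers[2], padding=3) uses label_type='edge' by default, which annotates the edge with the cumulative sum ('center' annotates with the patch value), as shown in this answer.
The [2] in ax.containers[2] selects only the top containers to annotate with the cumulative sum. The containers are indexed from 0, starting at the bottom.
See this answer for additional details and examples.
This answer shows how to do annotations the old way, without .bar_label. I do not recommend it.
This answer shows how to customize labels to prevent annotations for values under a given size.
Tested in python 3.10, pandas 1.3.5, matplotlib 3.5.1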
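The mismatch reported in the question (11 tick locations, 10 labels) is consistent with how histogram binning works: N bins are delimited by N + 1 edges, and plt.hist returns those edges as its second value. Passing them as tick positions therefore gives one more tick than there are countries. A minimal sketch with numpy.histogram, using made-up integer codes as a stand-in for the 10 countries (illustrative data, not the medal file):

```python
import numpy as np

# Hypothetical stand-in data: integer codes 0..9, one code per country,
# each repeated three times.
codes = np.repeat(np.arange(10), 3)

# Ask for 10 bins, one per country.
counts, edges = np.histogram(codes, bins=10)

print(len(counts))  # number of bins
print(len(edges))   # number of bin edges, always one more than the bins
```

A bar plot avoids the issue because each category gets exactly one tick position.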
Load and reshape the DataFrame
import pandas as pd
# load from github repo link
url = 'https://raw.githubusercontent.com/jpiedehierroa/files/main/Libro1.csv'
df = pd.read_csv(url)
# reshape the dataframe
ct = pd.crosstab(df.countries, df.type)
# total medals per country, which is necessary to sort the bars
ct['tot'] = ct.sum(axis=1)
# sort
ct = ct.sort_values(by='tot', ascending=False)
# display(ct)
type         bronze  gold  silver  tot
countries
USA              33    39      41  113
China            18    38      32   88
ROC              23    20      28   71
GB               22    22      21   65
Japan            17    27      14   58
Australia        22    17       7   46
Italy            20    10      10   40
Germany          16    10      11   37
Netherlands      14    10      12   36
France           11    10      12   33
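The reshape step can also be tried offline on a few hand-written rows. The sketch below assumes the same two column names as the CSV (countries, type); the sample rows and the name ct_small are made up for illustration.

```python
import pandas as pd

# Made-up sample rows using the same column names as the CSV.
sample = pd.DataFrame({
    'countries': ['USA', 'USA', 'USA', 'China', 'China', 'Japan'],
    'type': ['gold', 'silver', 'gold', 'gold', 'bronze', 'silver'],
})

# Count rows per (country, medal type) pair; pairs that never occur become 0.
ct_small = pd.crosstab(sample['countries'], sample['type'])

# Add row totals, then sort so the country with the most medals comes first.
ct_small['tot'] = ct_small.sum(axis=1)
ct_small = ct_small.sort_values(by='tot', ascending=False)
print(ct_small)
```

Note that crosstab orders the medal columns alphabetically (bronze, gold, silver), which matters when colors are assigned by position.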
Plot
# map each medal type to its own color explicitly; zipping a color tuple against
# the alphabetically sorted columns (bronze, gold, silver) would swap gold and silver
cd = {'bronze': '#CD7F32', 'silver': 'silver', 'gold': 'gold'}
# plot the medals columns
title = 'Country Medal Count for Tokyo 2020'
ax = ct.iloc[:, :-1].plot(kind='bar', stacked=True, color=cd, title=title,
                          figsize=(12, 5), rot=0, width=1, ec='k')
# annotate each container with individual values
for c in ax.containers:
    ax.bar_label(c, label_type='center')
# annotate the top containers with the cumulative sum
ax.bar_label(ax.containers[2], padding=3)
# pad the spacing between the number and the edge of the figure
ax.margins(y=0.1)
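To see which numbers each bar_label call writes, here is a self-contained sketch with the first three rows of the crosstab output typed in by hand. It assumes matplotlib 3.4 or later (for bar_label) and selects the non-interactive Agg backend so it can run without a display; the names ct3, center and edge are illustrative.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless run, no window needed
import pandas as pd

# Counts copied from the first three rows of the crosstab output.
ct3 = pd.DataFrame(
    {'bronze': [33, 18, 23], 'gold': [39, 38, 20], 'silver': [41, 32, 28]},
    index=['USA', 'China', 'ROC'],
)

ax = ct3.plot(kind='bar', stacked=True, rot=0)

# label_type='center' writes each segment's own value inside the segment.
center = [ax.bar_label(c, label_type='center') for c in ax.containers]
# The default label_type='edge' on the last (top) container writes the
# height of the whole stack, i.e. the cumulative sum.
edge = ax.bar_label(ax.containers[-1], padding=3)

print([t.get_text() for t in center[1]])  # values of the gold segments
print([t.get_text() for t in edge])       # totals per country
```

bar_label returns the list of text annotations it created, which makes it easy to inspect the labels without looking at the figure.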
Another way to annotate the top with the total is to use the 'tot' column for custom labels but, as shown, this is not necessary.
labels = ct.tot.tolist()
ax.bar_label(ax.containers[2], labels=labels, padding=3)
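The answer also mentions customizing labels so that small values are not annotated. One way to sketch this is a small helper that blanks out values below a cutoff. The helper name small_value_labels and the threshold of 20 are hypothetical, chosen only for illustration.

```python
def small_value_labels(values, threshold):
    """Return one label per value, using '' for values below the threshold."""
    return [f'{v:g}' if v >= threshold else '' for v in values]

# Bronze counts for the first four countries in the table.
bronze = [33, 18, 23, 22]
print(small_value_labels(bronze, threshold=20))
```

With the plot code above, the helper could be applied per container as ax.bar_label(c, labels=small_value_labels(c.datavalues, 20), label_type='center'), since each bar container stores its bar heights in datavalues.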