Pandas plot bar without NaN values spaces

[Title]: Pandas plot bar without NaN values spaces [Posted]: 2019-04-23 06:03:40 [Question]:

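I have a pandas DataFrame that contains NaN values. I want to make a bar plot with the index on the x axis and one bar per column, grouped by index. I only want to plot the bars that have an actual value.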

In my case, starting from this example:

import pandas
import matplotlib.pyplot as plt

df = pandas.DataFrame({'foo':[1,None,None], 'bar':[None,2,0.5], 'col': [1,1.5,None]}, index=["A","B","C"])
df.plot.bar()
plt.show()

I get this plot:

I would like to remove the spaces left for the NaN columns, that is, compress the bars and center each group above its x tick.

[Comments]:

This can't be done directly with pandas. You can look at how matplotlib grouped barcharts are made and adapt that to your case.

Does this answer your question? How do you remove spaces between bars in bar charts for where plotted values are zero?

[Answer 1]:

You can do this by iterating over each row of the DataFrame and checking each column for NaN, as in the code below.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {"foo": [1, None, None], "bar": [None, 2, 0.5], "col": [1, 1.5, None]},
    index=["A", "B", "C"],
)


# define the colors for each column
colors = {"foo": "blue", "bar": "orange", "col": "green"}

fig = plt.figure(figsize=(10, 6))
ax = plt.gca()

# width of bars
width = 1

# create empty lists for x tick names and positions
x_ticks, x_ticks_pos = [], []

# counter for helping with x tick positions
count = 0

# reset the index
# so that we can iterate through the numbers.
# this will help us to get the x tick positions
df = df.reset_index()
# go through each row of the dataframe
for idx, row in df.iterrows():
    # leave a one-bar gap before every group except the first;
    # count is then the first bar position for this row
    if idx > 0:
        count += 1

    # this will be the start of the first bar for this row
    start_idx = count - width / 2
    # this will be the end of the last bar for this row
    end_idx = start_idx
    # for each column in the wanted columns,
    # if the row is not null,
    # add the bar to the plot
    # also update the end position of the bars for this row
    for column in df.drop(["index"], axis=1).columns:
        # pd.notna is False for NaN/None, so missing cells draw no bar
        if pd.notna(row[column]):
            plt.bar(count, row[column], color=colors[column], width=width, label=column)
            count += 1
            end_idx += width
    # this checks if the row had any not NULL value in the desired columns
    # in other words, it checks if there was any bar for this row
    # if yes, add the center of all the row's bars and the row's name (A,B,C) to the respective lists
    if end_idx != start_idx:
        x_ticks_pos.append((end_idx + start_idx) / 2)
        x_ticks.append(row["index"])

# now set the x_ticks
plt.xticks(x_ticks_pos, x_ticks)

# also plot the legends
# and make sure to not display duplicate labels
# the below code is taken from:
# https://***.com/questions/13588920/stop-matplotlib-repeating-labels-in-legend
handles, labels = plt.gca().get_legend_handles_labels()
by_label = dict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys())
plt.show()

Result:
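The question also asks for each group to stay centered above its own x tick. A minimal sketch of one way to do that follows; it is a variation on the loop above, not part of the original answer. The helper names `compact_bar_positions` and `plot_compact_bars` and the `group_width=0.8` default are assumptions made for illustration. Row i is centered on x = i, and the bar width is derived from the row with the most non-NaN values so that all bars have the same width.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {"foo": [1, None, None], "bar": [None, 2, 0.5], "col": [1, 1.5, None]},
    index=["A", "B", "C"],
)


def compact_bar_positions(df, group_width=0.8):
    """Return ([(x, height, column), ...], bar_width) for the non-NaN cells.

    Hypothetical helper: row i is centered on x = i, and NaN cells
    take up no horizontal space.
    """
    # the fullest row decides the bar width, so every bar is equally wide
    max_bars = int(df.notna().sum(axis=1).max())
    bar_width = group_width / max(max_bars, 1)
    bars = []
    for i, (_, row) in enumerate(df.iterrows()):
        present = row.dropna()  # only the cells with an actual value
        n = len(present)
        for j, (column, value) in enumerate(present.items()):
            # offsets are symmetric around 0, so the group is centered on i
            offset = (j - (n - 1) / 2) * bar_width
            bars.append((i + offset, value, column))
    return bars, bar_width


def plot_compact_bars(df, ax=None, group_width=0.8):
    """Draw the compacted bars on ax (hypothetical helper)."""
    if ax is None:
        ax = plt.gca()
    bars, bar_width = compact_bar_positions(df, group_width)
    # one color per column, taken from matplotlib's default color cycle
    cycle = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    colors = {c: cycle[k % len(cycle)] for k, c in enumerate(df.columns)}
    seen = set()
    for x, height, column in bars:
        # labels starting with "_" are skipped by the legend,
        # so each column appears there only once
        label = column if column not in seen else "_nolegend_"
        seen.add(column)
        ax.bar(x, height, width=bar_width, color=colors[column], label=label)
    ax.set_xticks(range(len(df)))
    ax.set_xticklabels(df.index)
    ax.legend()
    return ax
```

Calling `plot_compact_bars(df)` followed by `plt.show()` should draw two bars around the tick for A, two around B, and a single bar directly on C. Keeping the position calculation in its own function makes it easy to check the x positions without opening a figure.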
