Seaborn.countplot: order categories by count, also by category?

Posted: 2019-08-20 11:25:24

【Question】:

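I understand how to sort a bar plot (i.e. here). What I can't find is how to sort the bars by one of the subcategories.

For example, given the following dataframe, I can get the bar plot. What I'd like to do is sort it from greatest to least by the count of Type 'Classic'.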

import pandas as pd

test_df = pd.DataFrame([
['Jake',    38, 'MW',   'Classic'],
['John',    38,'NW',    'Classic'],
['Sam', 34, 'SE',   'Classic'],
['Sam', 22, 'E' ,'Classic'],
['Joe', 43, 'ESE2', 'Classic'],
['Joe', 34, 'MTN2', 'Classic'],
['Joe', 38, 'MTN2', 'Classic'],
['Scott',   38, 'ESE2', 'Classic'],
['Chris',   34, 'SSE1', 'Classic'],
['Joe', 43, 'S1',   'New'],
['Paul',    34, 'NE2',  'New'],
['Joe', 38, 'MC1',  'New'],
['Joe', 34, 'NE2',  'New'],
['Nick',    38, 'MC1',  'New'],
['Al',  38, 'SSE1', 'New'],
['Al',  34, 'ME',   'New'],
['Al',  34, 'MC1',  'New'],
['Joe', 43, 'S1',   'New']], columns = ['Name','Code_A','Code_B','Type'])


import seaborn as sns
sns.set(style="darkgrid")
palette = {"Classic": "#FF9999", "New": "#99CC99"}


g = sns.countplot(y="Name",
                  palette=palette,
                  hue="Type",
                  data=test_df)

So instead of the default plot order, 'Joe' would be at the top, followed by 'Sam', and so on.

【Answer 1】:

Add the order parameter. Use pandas.crosstab and sort_values to build it:

import pandas as pd

test_df = pd.DataFrame([
['Jake',    38, 'MW',   'Classic'],
['John',    38,'NW',    'Classic'],
['Sam', 34, 'SE',   'Classic'],
['Sam', 22, 'E' ,'Classic'],
['Joe', 43, 'ESE2', 'Classic'],
['Joe', 34, 'MTN2', 'Classic'],
['Joe', 38, 'MTN2', 'Classic'],
['Scott',   38, 'ESE2', 'Classic'],
['Chris',   34, 'SSE1', 'Classic'],
['Joe', 43, 'S1',   'New'],
['Paul',    34, 'NE2',  'New'],
['Joe', 38, 'MC1',  'New'],
['Joe', 34, 'NE2',  'New'],
['Nick',    38, 'MC1',  'New'],
['Al',  38, 'SSE1', 'New'],
['Doug',    34, 'ME',   'New'],
['Fred',    34, 'MC1',  'New'],
['Joe', 43, 'S1',   'New']], columns = ['Name','Code_A','Code_B','Type'])


import seaborn as sns
sns.set(style="darkgrid")
palette = {"Classic": "#FF9999", "New": "#99CC99"}

order = pd.crosstab(test_df.Name, test_df.Type).sort_values('Classic', ascending=False).index
g = sns.countplot(y="Name",
                  palette=palette,
                  hue="Type",
                  data=test_df,
                  order=order
                 )
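
To see why this works, it can help to inspect the crosstab on its own, without plotting. The following is a minimal sketch on a small made-up frame (the `df` data and the names `counts`, `order_classic` and `order_new` are assumptions for illustration, not the data from the question). It builds the order by 'Classic' count, plus a variant that sorts by 'New' first and uses 'Classic' to break ties.

```python
import pandas as pd

# Hypothetical toy data, for illustration only
df = pd.DataFrame({
    "Name": ["Joe"] * 3 + ["Sam"] * 2 + ["Jake"] + ["Al"] * 3 + ["Joe"] * 2 + ["Sam"],
    "Type": ["Classic"] * 6 + ["New"] * 6,
})

# Rows are names, columns are types, cells are row counts
counts = pd.crosstab(df["Name"], df["Type"])

# Names ordered by their 'Classic' count, largest first
order_classic = counts.sort_values("Classic", ascending=False).index.tolist()

# Names ordered by 'New' count first, then by 'Classic' count for ties
order_new = counts.sort_values(["New", "Classic"], ascending=False).index.tolist()

print(counts)
print(order_classic)
print(order_new)
```

Either list can then be passed as the `order` argument of `sns.countplot`, in the same way as in the answer above.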

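【Comments】:

Oh my. I thought it would be far more complicated. OK, got it.

I accepted this answer because it is exactly what I wanted. Just curious, though: this sorts by "Classic". What if I wanted it to sort by "New" first? (I adjusted the test data above.)

Actually, I updated my answer; I had misread the part about it being sorted by the "Classic" count.

Oh, perfect! Thanks again.

【Answer 2】:
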
import pandas as pd

test_df = pd.DataFrame([
['Jake',    38, 'MW',   'Classic'],
['John',    38,'NW',    'Classic'],
['Sam', 34, 'SE',   'Classic'],
['Sam', 22, 'E' ,'Classic'],
['Joe', 43, 'ESE2', 'Classic'],
['Joe', 34, 'MTN2', 'Classic'],
['Joe', 38, 'MTN2', 'Classic'],
['Scott',   38, 'ESE2', 'Classic'],
['Chris',   34, 'SSE1', 'Classic'],
['Joe', 43, 'S1',   'New'],
['Paul',    34, 'NE2',  'New'],
['Joe', 38, 'MC1',  'New'],
['Joe', 34, 'NE2',  'New'],
['Nick',    38, 'MC1',  'New'],
['Al',  38, 'SSE1', 'New'],
['Al',  34, 'ME',   'New'],
['Al',  34, 'MC1',  'New'],
['Joe', 43, 'S1',   'New']], columns = ['Name','Code_A','Code_B','Type'])


import seaborn as sns
sns.set(style="darkgrid")
palette = {"Classic": "#FF9999", "New": "#99CC99"}

# Order the names by their total number of rows (both types combined)
sns.countplot(y='Name', hue='Type', data=test_df,
              order=test_df['Name'].value_counts().index)
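
Note that `value_counts()` orders the names by their total count across both types, which is not necessarily the same as ordering by the 'Classic' count alone. A minimal sketch contrasting the two on made-up data (the frame `df` and the names `by_total` and `by_classic` are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data: 'Al' has the most rows overall but no 'Classic' rows
df = pd.DataFrame({
    "Name": ["Al"] * 4 + ["Joe"] * 2 + ["Joe"] + ["Sam"],
    "Type": ["New"] * 4 + ["Classic"] * 2 + ["New"] + ["Classic"],
})

# Order by total rows per name (what value_counts gives)
by_total = df["Name"].value_counts().index.tolist()

# Order by the 'Classic' count only
by_classic = (
    pd.crosstab(df["Name"], df["Type"])
    .sort_values("Classic", ascending=False)
    .index.tolist()
)

print(by_total)
print(by_classic)
```

With this data, 'Al' comes first by total count but last by 'Classic' count, so the choice between the two orderings depends on which ranking the plot should show.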
