How to sort a grouped multi-index pandas Series by index level and values?

Posted 2023-03-12

【Title】: How to sort grouped multi-index pandas series by index level and values? 【Posted】: 2019-06-29 03:53:23 【Question】:

I have a pandas Series:

import numpy as np
import pandas as pd
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)
s
Out[3]: 
first  second
bar    one      -1.111475
       two      -0.644368
baz    one       0.027621
       two       0.130411
foo    one      -0.942718
       two      -1.335731
qux    one       1.277417
       two      -0.242090
dtype: float64

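How can I sort this Series by the values within each group?

For example, the qux group should list row two (-0.242090) first, followed by row one (1.277417). The bar group is already sorted correctly, because -1.111475 is lower than -0.644368.

I need something like s.groupby(level=0).sort_values().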

【Comments on the question】:

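【Answer 1】:

Use sort_values: reset the index to get a DataFrame, sort by the first level and then by value, and restore the MultiIndex: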

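import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the output below is reproducible
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)

# Flatten to a DataFrame, sort by the 'first' level and then by value,
# and rebuild the MultiIndex.
s = (s.reset_index(name='value')
      .sort_values(['first', 'value'])
      .set_index(['first', 'second'])['value'])
s.name = None  # drop the temporary 'value' name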

print(s)
first  second
bar    two       0.400157
       one       1.764052
baz    one       0.978738
       two       2.240893
foo    two      -0.977278
       one       1.867558
qux    two      -0.151357
       one       0.950088
dtype: float64
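
The question asks for something like s.groupby(level=0).sort_values(), which is not a method of a grouped Series. A minimal sketch of the closest equivalent is to apply sort_values to each group. It assumes that group_keys=False keeps the original two-level index rather than prepending the group key. The data values are made up for illustration so that the result is reproducible.

```python
import pandas as pd

# Illustrative data (hypothetical values, not the random ones from the question).
index = pd.MultiIndex.from_product(
    [['bar', 'baz'], ['one', 'two']], names=['first', 'second'])
s = pd.Series([2.0, 1.0, 3.0, 4.0], index=index)

# Sort the values inside each 'first' group; group_keys=False avoids
# adding the group key as an extra index level.
result = s.groupby(level='first', group_keys=False).apply(
    lambda g: g.sort_values())

print(result)
```

This version calls a Python function once per group, so with many groups it is likely to be slower than the single sort_values call above.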

【Comments】:

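【Answer 2】:

You can use np.lexsort to sort by your first index level, first, and then by the values. np.lexsort treats the last key in the tuple as the primary sort key, which is why the level values are passed last.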

np.random.seed(0)
s = pd.Series(np.random.randn(8), index=index)  # index as built above

# The last key is the primary one: sort by level 0, then by the values.
s = s.iloc[np.lexsort((s.values, s.index.get_level_values(0)))]

print(s)

# first  second
# bar    two       0.400157
#        one       1.764052
# baz    one       0.978738
#        two       2.240893
# foo    two      -0.977278
#        one       1.867558
# qux    two      -0.151357
#        one       0.950088
# dtype: float64
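
The key order passed to np.lexsort is easy to get backwards, so here is a small sketch on plain arrays. The names groups and values and their contents are made up for illustration: groups plays the role of the first index level and values the role of the Series values.

```python
import numpy as np

groups = np.array([1, 0, 1, 0])          # primary key (stands in for 'first')
values = np.array([2.0, 3.0, 1.0, 0.5])  # secondary key (stands in for the values)

# The last key in the tuple is the primary sort key.
order = np.lexsort((values, groups))

print(order.tolist())          # [3, 1, 2, 0]
print(values[order].tolist())  # [0.5, 3.0, 1.0, 2.0]
```

Group 0 comes first with its values in ascending order (0.5, 3.0), followed by group 1 (1.0, 2.0).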

【Comments】: