Pandas dataframe group by order [duplicate]
Posted: 2019-03-20 01:40:15
Question:
I have an input dataframe:
import pandas
df1 = pandas.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Mallory", "Bob", "Bob", "Mallory", "Alice"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland", "Portland", "Seattle", "Seattle"],
})
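I want to group by name, but not by unique name: each consecutive run of the same name should form its own group, so the output should be: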
["Alice","Bob","Mallory","Bob","Mallory", "Alice"]
I couldn't find an efficient way to do this. Is there a way that doesn't iterate over all rows?
Comments:
df1.groupby(df1.Name.ne(df1.Name.shift()).cumsum()).Name.first()
Answer 1:
You can do the following:
df1.groupby((df1['Name'] != df1['Name'].shift()).cumsum()).first()
Output:
         Name      City
Name
1       Alice   Seattle
2         Bob   Seattle
3     Mallory  Portland
4         Bob  Portland
5     Mallory   Seattle
6       Alice   Seattle
If you only want the 'Name' column:
df1.groupby((df1['Name'] != df1['Name'].shift()).cumsum())['Name'].first().values
Output:
['Alice' 'Bob' 'Mallory' 'Bob' 'Mallory' 'Alice']
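To see why this works, it can help to break the one-liner into its intermediate steps. The sketch below reuses the data from the question; the variable names (`starts_new_run`, `run_id`, `run_names`, `run_lengths`) are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Mallory", "Bob", "Bob", "Mallory", "Alice"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland", "Portland", "Seattle", "Seattle"],
})

# True wherever a row's Name differs from the previous row's Name.
# The first row is compared against NaN, so it is always True.
starts_new_run = df1["Name"] != df1["Name"].shift()

# The cumulative sum of those booleans gives each consecutive run its own id:
# 1, 2, 3, 3, 3, 4, 4, 5, 6
run_id = starts_new_run.cumsum()

# Group on the run id instead of on the name itself.
run_names = df1.groupby(run_id)["Name"].first().tolist()
run_lengths = df1.groupby(run_id).size().tolist()

# If only the names are needed, the boolean mask alone is enough (no groupby).
run_names_from_mask = df1.loc[starts_new_run, "Name"].tolist()

print(run_names)
print(run_lengths)
```

Grouping on `run_id` is what keeps the two separate "Bob" runs apart; grouping on `Name` directly would merge them. All steps are vectorized, so no explicit Python loop over rows is needed.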