Pandas Groupby columns and get a frequency of 0
[Title]: Pandas Groupby columns and get a frequency of 0
[Posted]: 2020-11-17 11:14:29
[Question]: I have a DataFrame that I want to group by Col1, Col2 and Col3, and for each group get the frequency of 0 in the Value column: df =
Col1 Col2 Col3 Value
Val1 Val2 A 0
Val1 Val2 A 1
Val1 Val2 A 2
Val1 Val2 A 0
Val1 Val2 A 1
Val1 Val2 B 0
Val1 Val2 B 0
Val1 Val2 B 0
Val1 Val2 B 0
Val1 Val2 B 1
...
How can I apply groupby to get the following?
Col1 Col2 Col3 Percentage_of_0
Val1 Val2 A 0.2
Val1 Val2 B 0.8
...
Thanks!
[Comments]:
df['Value'].eq(0).groupby([df['Col1'],df['Col2'],df['Col3']]).mean()
?
@QuangHoang Thanks! Where did you learn that?
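The one-liner in the comment works because `eq(0)` returns a boolean Series, and the mean of a boolean group is the share of `True` values in it. A minimal sketch, assuming only the ten sample rows shown in the question (the names `is_zero` and `share_of_zero` are illustrative and not part of the original comment):

```python
import pandas as pd

# The ten sample rows from the question (rows hidden behind "..." are not included)
df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

# True where Value is 0, False everywhere else
is_zero = df["Value"].eq(0)

# Group the boolean Series by the three key columns; the mean of each
# boolean group is the fraction of rows in that group where Value == 0
share_of_zero = is_zero.groupby([df["Col1"], df["Col2"], df["Col3"]]).mean()
print(share_of_zero)
```

On these rows group A has two zeros out of five values and group B has four out of five.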
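[Answer 1]:
A simple lambda function does this for you. Build a list of the items in the group where Value==0, then divide the len of that list by the len of the whole group. The ratio is the percentage you want.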
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"Col1":["Val1","Val1","Val1","Val1","Val1","Val1","Val1","Val1","Val1","Val1"],
                   "Col2":["Val2","Val2","Val2","Val2","Val2","Val2","Val2","Val2","Val2","Val2"],
                   "Col3":["A","A","A","A","A","B","B","B","B","B"],
                   "Value":[0,1,2,0,1,0,0,0,0,1]})
# Per group: number of zeros divided by the number of rows
df.groupby(["Col1","Col2","Col3"]).\
    agg({"Value":lambda x: len([v for v in x if v==0])/len(x)})
Output
Value
Col1 Col2 Col3
Val1 Val2 A 0.4
B 0.8
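The list comprehension inside the lambda can also be expressed as a vectorized comparison, which avoids a Python-level loop over each group. This is a sketch on the same sample data; the helper name `frac_zero` is hypothetical and not from the answer:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

def frac_zero(values: pd.Series) -> float:
    """Return the fraction of entries in `values` that equal 0."""
    # (values == 0) is a boolean Series; its mean is the share of True values
    return (values == 0).mean()

# Same grouping and aggregation shape as the lambda version
result = df.groupby(["Col1", "Col2", "Col3"]).agg({"Value": frac_zero})
print(result)
```

The result keeps the same layout as the lambda version: a MultiIndex of the three key columns and a single Value column holding the fraction.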
[Answer 2]: Use groupby on the DataFrame, then apply the size() method to the grouped result. For example, suppose you create a DataFrame named df that contains these values
import pandas as pd

df = pd.DataFrame({'Col1': ['Val1','Val1','Val1','Val1','Val1','Val1','Val1','Val1'],
                   'Col2': ['Val2','Val2','Val2','Val2','Val2','Val2','Val2','Val2'],
                   'Col3': ['A','A','A','A','B','B','B','B'],
                   'Value':[0,1,2,0,0,0,0,1]})
You can then find the frequency count of each individual element with
df.groupby(['Col1','Col2','Col3','Value']).size()
Col1 Col2 Col3 Value
Val1 Val2 A 0 2
1 1
2 1
B 0 3
1 1
dtype: int64
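The counts above are absolute frequencies, while the question asks for the share of zeros. One way to get from one to the other, sketched here as an assumption rather than as part of the original answer, is to divide each count by the total of its (Col1, Col2, Col3) group and then keep the rows for Value == 0. The names `counts`, `group_sizes` and `share_of_zero` are illustrative:

```python
import pandas as pd

# The eight sample rows used in this answer
df = pd.DataFrame({
    "Col1": ["Val1"] * 8,
    "Col2": ["Val2"] * 8,
    "Col3": ["A"] * 4 + ["B"] * 4,
    "Value": [0, 1, 2, 0, 0, 0, 0, 1],
})

# Count of every distinct Value within each (Col1, Col2, Col3) group
counts = df.groupby(["Col1", "Col2", "Col3", "Value"]).size()

# Total number of rows per (Col1, Col2, Col3) group, aligned to `counts`
group_sizes = counts.groupby(level=["Col1", "Col2", "Col3"]).transform("sum")

# Relative frequency of every Value, then only the Value == 0 rows
shares = counts / group_sizes
share_of_zero = shares.xs(0, level="Value")
print(share_of_zero)
```

One caveat of this route: a group that contains no zeros has no Value == 0 row, so it is absent from the result instead of appearing with 0.0.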
[Answer 3]: Here is another approach that does not use a lambda, which I find easier to understand:
df['is_zero'] = df['Value'] == 0
df.groupby(['Col1', 'Col2', 'Col3'])['is_zero'].mean()
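Note that the two lines above add a permanent is_zero column to df and return a Series. If a flat table shaped like the one in the question is preferred, a possible variant (a sketch on the question's sample rows; the output column name Percentage_of_0 is taken from the question, the variable `out` is illustrative) is:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

# Helper column: True where Value is 0
df["is_zero"] = df["Value"] == 0

# Mean of the helper column per group, turned back into ordinary columns
out = (
    df.groupby(["Col1", "Col2", "Col3"])["is_zero"]
      .mean()
      .reset_index(name="Percentage_of_0")
)

# Drop the helper column again so df is left as it was
df = df.drop(columns="is_zero")
print(out)
```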
[Answer 4]: Create a boolean column that marks where Value equals 0, then group by the Col columns and take its mean:
(
df.assign(Percentage_Of_0=lambda x: x.Value.eq(0))
.groupby(["Col1", "Col2", "Col3"], as_index=False)
.Percentage_Of_0.mean()
)
Col1 Col2 Col3 Percentage_Of_0
0 Val1 Val2 A 0.4
1 Val1 Val2 B 0.8
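A fraction is easier to interpret next to the number of rows it was computed from. As a hedged extension of the chain above, named aggregation can return both in one pass. The column names `is_zero` and `n_rows` are assumptions introduced for this sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

summary = (
    df.assign(is_zero=lambda x: x.Value.eq(0))        # temporary boolean column
      .groupby(["Col1", "Col2", "Col3"], as_index=False)
      .agg(
          Percentage_Of_0=("is_zero", "mean"),        # share of zeros per group
          n_rows=("is_zero", "size"),                 # rows per group
      )
)
print(summary)
```

Because `assign` returns a new DataFrame, the original df is not modified by this chain.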