Standard evaluation with mutate_ to calculate percentages by group

Posted: 2016-06-15 14:07:11

Question:

I am trying to use dplyr's standard evaluation to calculate percentages as a function of two grouping variables. The problem is in my mutate_ statement.

Here is a dataset:

structure(list(
    var1 = structure(c(2L, 1L, 1L, 2L, 1L, 2L, 1L, 
    2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 
    2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 
    2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 
    2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 
    1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 
    2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L
    ), 
    .Label = c("No", "Yes"), class = "factor"), 
    var2 = structure(c(2L, 2L, 1L, 2L, 
    2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 
    1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 
    1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 
    2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 
    2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 
    1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L
    ), 
    .Label = c("Female", "Male"), class = "factor")), 
    .Names = c("var1", "var2"), row.names = c(NA, -100L), class = "data.frame")

Here is the code I am using:

for_plots = function(data, var1, var2) {
  grouped_data = data %>% group_by_(var1, var2) %>% 
  summarise_(n_in_group = ~n()) %>% 
  mutate_(.dots = setNames(list(
    interp(quote(n_in_group / sum(n_in_group, na.rm = TRUE) * 100),
           n_in_group = as.name(n_in_group)))
    ))
  return(grouped_data)
}

When I run the code, I get an error:

Error in setNames(list(interp(quote(n_in_group/sum(n_in_group, na.rm = TRUE) * : argument "nm" is missing, with no default

Any ideas?
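
The error message names the nm argument of setNames(), which is called with a list but no names. As a minimal base R sketch of that behavior (the column name "percent" and the variable names here are illustrative assumptions, not part of the original post):

```r
# setNames(object, nm) attaches the names in `nm` to `object`.
# Calling it without `nm` fails as soon as the names are needed.
msg <- tryCatch(
  setNames(list(1)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Supplying `nm` gives the named list that a .dots argument expects.
# "percent" is an assumed name for the new column.
exprs <- setNames(
  list(quote(n_in_group / sum(n_in_group, na.rm = TRUE) * 100)),
  "percent"
)
print(names(exprs))
```

Passing a name vector as the second argument removes this particular error; whether the rest of the mutate_ call then works is a separate question.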

Comments:

There is no reason to use SE there. You define the variable name n_in_group inside the function, so it does not need to be treated as dynamic input...

@Frank Thanks. The following code works: for_plots = function(data, var1, var2) { grouped_data = data %>% group_by_(var1, var2) %>% summarise_(n_in_group = ~n()) %>% mutate(percent = (n_in_group / sum(n_in_group, na.rm = TRUE)) * 100); return(grouped_data) }

Answer 1:

Here is some code based on @Frank's reply:

for_plots = function(data, var1, var2) {
   grouped_data = data %>% group_by_(var1, var2) %>% 
     summarise_(n_in_group = ~n()) %>% 
     mutate(percent = (n_in_group / sum(n_in_group, na.rm = TRUE)) * 100) 
   return(grouped_data) 
}

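The underscore verbs (group_by_, summarise_, mutate_) have been deprecated since dplyr 0.7.0. As a rough sketch of the same idea in newer dplyr, assuming dplyr 1.0.0 or later and column names passed as strings, something like the following may work. The function name for_plots_tidy and the toy data are made up for illustration.

```r
library(dplyr)

# Hypothetical helper: var1 and var2 are column names given as strings.
for_plots_tidy <- function(data, var1, var2) {
  data %>%
    # all_of() selects the columns named by the character vector
    group_by(across(all_of(c(var1, var2)))) %>%
    # drop only the last grouping level, so the result stays grouped by var1
    summarise(n_in_group = n(), .groups = "drop_last") %>%
    # percentage of each var2 level within its var1 group
    mutate(percent = n_in_group / sum(n_in_group, na.rm = TRUE) * 100) %>%
    ungroup()
}

# Small made-up data frame, for illustration only
toy <- data.frame(
  answer = c("Yes", "Yes", "No", "No", "No", "Yes"),
  sex    = c("Male", "Female", "Male", "Male", "Female", "Male"),
  stringsAsFactors = TRUE
)

result <- for_plots_tidy(toy, "answer", "sex")
print(result)
```

Because the summarise step keeps the grouping by the first variable, the percentages add up to 100 within each level of that variable, which matches what the answer's code computes.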