Using summarise from the dplyr package

Posted 2023-03-24


【Title】Using summarise from the dplyr package 【Posted】2021-12-13 16:56:16 【Question】:

When I use the summarise function, I would like to know how many values were used to compute each mean.

    table <- df %>% group_by(x) %>% summarise_if(is.numeric, mean, na.rm = TRUE)

【Comments】:

Can you provide a reproducible R example?

【Answer 1】:

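Add a count summary alongside the mean. Test each value with is.na() and sum the non-missing ones, which gives the number of values that actually went into the mean.

Note that summarise_if has been superseded by across().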

    table <- df %>%
      group_by(x) %>%
      summarise(across(where(is.numeric),
                       list(mean = ~ mean(.x, na.rm = TRUE), n = ~ sum(!is.na(.x)))))
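A minimal reproducible sketch of this approach. The data frame `df`, its grouping column `x`, and the numeric column `y` are made up for illustration, since the question does not show its data:

```r
library(dplyr)

# Hypothetical data: column names and values are assumptions, not from the question
df <- data.frame(
  x = c("a", "a", "a", "b", "b"),
  y = c(1, 2, NA, 4, NA)
)

result <- df %>%
  group_by(x) %>%
  summarise(across(where(is.numeric),
                   list(mean = ~ mean(.x, na.rm = TRUE),   # mean of the non-missing values
                        n = ~ sum(!is.na(.x)))))           # how many values entered that mean

print(result)
```

With a named list of functions, across() names the output columns by joining the column name and the function name, so here the result should contain `y_mean` and `y_n`.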

【Comments】:

Thanks! This is exactly what I was looking for.

【Answer 2】:

I may be wrong, but I believe simply using dplyr's count() should work. See below:

library(dplyr)

# Creating a demonstrative data frame
colors <- c('red', 'green', 'red', 'green', 'red', 'green', 'green')
obs <- c(1, 2, 3, 1, 5, 2, 6)
mytable <- data.frame(colors, obs)

# Checking the summarise function
mytable %>%
  group_by(colors) %>%
  summarise_if(is.numeric, mean)

# First approach, using summarise, n = n
mytable %>%
  group_by(colors) %>%
  summarise(n = n())

# Second, more elegant approach using count
mytable %>% 
  count(colors)

If needed, you can add a filter() or subset() call to check whether the data are numeric.
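One caveat that may be worth checking: n() and count() count rows per group, so they equal the number of values behind the mean only when the column has no missing values. A small sketch, reusing the demonstrative data with one value replaced by NA (that NA is an assumption added for illustration):

```r
library(dplyr)

# Demonstrative data with one made-up NA in obs
colors <- c("red", "green", "red", "green", "red", "green", "green")
obs <- c(1, 2, NA, 1, 5, 2, 6)
mytable <- data.frame(colors, obs)

counts <- mytable %>%
  group_by(colors) %>%
  summarise(
    mean_obs = mean(obs, na.rm = TRUE),
    n_rows = n(),                # rows in the group, NA included
    n_used = sum(!is.na(obs))    # values that actually enter the mean
  )

print(counts)
```

For the red group, `n_rows` and `n_used` differ because one of its three rows holds NA; for green they agree.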

【Comments】:
