R: spreading multiple value columns with dplyr and tidyr


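library(tidyverse)  # provides dplyr, tidyr, purrr, and tibble

# Draw N values from x with replacement
gen_cats <- function(x, N = 1000) {
    sample(x, N, replace = TRUE)
}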

set.seed(101)
N <- 1000

income <- rnorm(N, 100, 50)

vars <- list(stratum = c(1:8),
          sex = c("M", "F"),
          race =  c("B", "W"),
          educ = c("HS", "BA"))

df <- as_tibble(map_dfc(vars, gen_cats))
df <- add_column(df, income)

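## Columns: stratum, sex, race, educ, income
# The data.table way is straightforward: dcast() accepts several aggregation functions at once
dt <- data.table::as.data.table(df)  # work on a copy so df itself stays a tibble
dt_wide <- data.table::dcast(dt, sex + race + stratum ~ educ,
              fun.aggregate = list(mean, length),
              value.var = "income")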
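
To see what the multi-function dcast() call produces, here is a minimal sketch on a six-row table. The names `toy_dt` and `toy_wide` and the income values are made up for illustration; only `data.table()` and `dcast()` come from the data.table package.

```r
library(data.table)

# Hypothetical toy data: every sex/educ combination occurs at least once
toy_dt <- data.table(
  sex    = c("F", "F", "F", "M", "M", "M"),
  educ   = c("BA", "HS", "HS", "BA", "BA", "HS"),
  income = c(120, 80, 100, 90, 110, 70)
)

# One row per sex; one column per (aggregation function, educ level) pair
toy_wide <- dcast(toy_dt, sex ~ educ,
                  fun.aggregate = list(mean, length),
                  value.var = "income")
print(toy_wide)
```

The result should have one id column plus two functions times two education levels, that is five columns in total.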
              
              
              
# dplyr
## Simple tidy summary
tv_wide1 <- df %>% group_by(sex, race, stratum, educ) %>%
    summarize(mean_inc = mean(income), N = n())
    
## 1. gather()
tv_wide2 <- df %>% group_by(sex, race, stratum, educ) %>%
    summarize(mean_inc = mean(income), N = n()) %>%
    gather(variable, value, -(sex:educ))

tv_wide2

## 2. unite()
tv_wide2 <- df %>% group_by(sex, race, stratum, educ) %>%
    summarize(mean_inc = mean(income), N = n()) %>%
    gather(variable, value, -(sex:educ)) %>%
    unite(temp, educ, variable)

tv_wide2

## 3. spread()
tv_wide2 <- df %>% group_by(sex, race, stratum, educ) %>%
    summarize(mean_inc = mean(income), N = n()) %>%
    gather(variable, value, -(sex:educ)) %>%
    unite(temp, educ, variable) %>%
    spread(temp, value)

tv_wide2
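
In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). As a hedged sketch, assuming tidyr >= 1.1.0 (for `names_glue`) and dplyr >= 1.0.0 (for `.groups`), the gather/unite/spread steps above can be collapsed into one pivot_wider() call with several `values_from` columns. The data and the names `toy` and `wide` are invented for this example and are not part of the original script.

```r
library(dplyr)
library(tidyr)

set.seed(101)
# Hypothetical toy data, smaller than the df used above
toy <- tibble::tibble(
  sex    = sample(c("M", "F"), 200, replace = TRUE),
  educ   = sample(c("HS", "BA"), 200, replace = TRUE),
  income = rnorm(200, 100, 50)
)

wide <- toy %>%
  group_by(sex, educ) %>%
  summarize(mean_inc = mean(income), N = n(), .groups = "drop") %>%
  pivot_wider(
    names_from  = educ,
    values_from = c(mean_inc, N),
    # build names such as "BA_mean_inc", matching the unite() output above
    names_glue  = "{educ}_{.value}"
  )

print(wide)
```

Unlike the gather() route, this keeps `N` as an integer column, because the two value columns are never stacked into one.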


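# Spread several value columns at once: gather them, paste the key onto
# each variable name, then spread the combined names back out.
multi_spread <- function(df, key, value) {
    # capture the key column as a quosure
    keyq <- rlang::enquo(key)
    # capture the value selection, e.g. c(mean_inc, N), as a quosure
    valueq <- rlang::enquo(value)
    s <- rlang::quos(!!valueq)
    df %>% gather(variable, value, !!!s) %>%
        unite(temp, !!keyq, variable) %>%
        spread(temp, value)
}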
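
The same helper can be sketched with the embrace operator `{{ }}` instead of enquo() and `!!`. This assumes rlang >= 0.4.0 and tidyr >= 1.0.0; the name `multi_spread2` and the `toy_summary` table are hypothetical and exist only to make the example self-contained.

```r
library(dplyr)
library(tidyr)

# Hypothetical variant of multi_spread() using {{ }} and the pivot_* verbs
multi_spread2 <- function(df, key, value) {
  df %>%
    # stack the selected value columns into variable/value pairs
    pivot_longer({{ value }}, names_to = "variable", values_to = "value") %>%
    # combine the key with the variable name, e.g. "BA_mean_inc"
    unite("temp", {{ key }}, variable) %>%
    # one column per combined name
    pivot_wider(names_from = temp, values_from = value)
}

# Hand-written summary table, already ungrouped
toy_summary <- tibble::tibble(
  sex      = c("F", "F", "M", "M"),
  educ     = c("BA", "HS", "BA", "HS"),
  mean_inc = c(110, 90, 105, 95),
  N        = c(10L, 12L, 8L, 9L)
)

wide_toy <- multi_spread2(toy_summary, educ, c(mean_inc, N))
print(wide_toy)
```

As with gather(), stacking `mean_inc` and `N` into one column coerces the counts to double, so this mirrors the original helper rather than improving on it.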

## Final version
tv_wide3 <- df %>% group_by(sex, race, stratum, educ) %>%
    summarize(mean_inc = mean(income), N = n()) %>%
    multi_spread(educ, c(mean_inc, N))

tv_wide3
