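R: Get column names where value is not null
【Posted】: 2021-06-14 01:55:27
【Question】:
I have a table with 7 columns: the first column is an id, followed by 3 columns of vegetable types, and the last 3 columns are fruit types. The values indicate whether a person has that vegetable or fruit. Is there a way to group the vegetables and the fruits and, for each group, output the names of the columns where the person has that vegetable or fruit?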
Input data frame:
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) = c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")
Expected output data frame:
output_id1 <- c("id_1", "lettuce", "apple")
output_id2 <- c("id_2", "tomato, bellpeper", NA)
output <- data.frame(rbind(output_id1, output_id2))
colnames(output) <- c("id", "veg", "fruit")
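【Comments】:
If you are not interested in having "veg" and "fruit" in the returned names, could you drop those words entirely from the start? For example, by renaming the input data frame before doing any analysis.
Yes, that is feasible. I will edit the question.
【Answer 1】:
Using the original input data you posted (also shown under Data below), you can do this with the tidyr package: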
library(tidyr)
input %>%
  tidyr::pivot_longer(cols = matches("^veg|^fruit"),
                      names_sep = "_",
                      names_to = c("type", "val"),
                      values_drop_na = TRUE) %>%
  tidyr::pivot_wider(id_cols = id,
                     names_from = type,
                     values_from = val,
                     values_fn = function(x) paste0(x, collapse = ","))
Output
  id    veg              fruit
  <chr> <chr>            <chr>
1 id_1  lettuce          apple
2 id_2  tomato,bellpeper NA
Data
input <- structure(list(
  id = c("id_1", "id_2"),
  veg_lettuce = c("1", NA),
  veg_tomato = c(NA, "1"),
  veg_bellpeper = c(NA, "1"),
  fruit_pineapple = c(NA_character_, NA_character_),
  fruit_apple = c("1", NA),
  fruit_banana = c(NA_character_, NA_character_)
), class = "data.frame", row.names = c("id1", "id2"))
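If you start from the data frame in the edited question, whose columns no longer carry the veg_ and fruit_ prefixes, the prefixes can be added back before pivoting. The following is only a sketch in base R. It assumes, as the question states, that the three columns after id are vegetables and the last three are fruits. The names input_plain and input_prefixed are made up for this illustration.

```r
# Build the data frame from the edited question (column names without prefixes).
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input_plain <- data.frame(rbind(id1, id2), stringsAsFactors = FALSE)
colnames(input_plain) <- c("id", "lettuce", "tomato", "bellpeper",
                           "pineapple", "apple", "banana")

# Assumption: columns 2-4 are vegetables and columns 5-7 are fruits.
input_prefixed <- input_plain
names(input_prefixed)[2:4] <- paste0("veg_", names(input_plain)[2:4])
names(input_prefixed)[5:7] <- paste0("fruit_", names(input_plain)[5:7])

# Inspect the new column names.
names(input_prefixed)
```

With the prefixes in place, a pattern such as "^veg|^fruit" and a split on "_" have something to match against.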
【Comments】:
【Answer 2】:
This should do the trick!
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, 1, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")
# Remove the id column; it is not needed to find the names
input_without_id <- dplyr::select(input, -c("id"))
# For each row (MARGIN = 1), return the column names at the positions
# where the row's value (x) is not NA.
# Note: apply() returns a list here because the rows have different
# numbers of non-NA values.
result <- apply(input_without_id, MARGIN = 1, function(x) {
  names(input_without_id)[which(!is.na(x))]
})
# Label each element of the result with the corresponding id from input
names(result) <- input$id
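The same row-wise idea can be applied separately to each group of columns to get the veg and fruit columns the question asks for. This is a minimal base R sketch, not part of the original answer. The vectors veg_cols and fruit_cols and the helper collapse_present are hypothetical names introduced here, and the group membership is assumed from the question's description. It uses the question's data, in which id_2 has no fruit.

```r
# Data frame from the question (id_2 has no fruit).
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2), stringsAsFactors = FALSE)
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper",
                     "pineapple", "apple", "banana")

# Assumed group membership (hypothetical names).
veg_cols <- c("lettuce", "tomato", "bellpeper")
fruit_cols <- c("pineapple", "apple", "banana")

# Hypothetical helper: for each row, join the names of the non-NA columns
# in `cols`; return NA when the row has none of them.
collapse_present <- function(df, cols) {
  unname(apply(df[, cols, drop = FALSE], MARGIN = 1, function(x) {
    present <- cols[!is.na(x)]
    if (length(present) == 0) NA_character_ else paste(present, collapse = ", ")
  }))
}

output <- data.frame(
  id = input$id,
  veg = collapse_present(input, veg_cols),
  fruit = collapse_present(input, fruit_cols),
  stringsAsFactors = FALSE
)
print(output)
```

Because the helper always returns exactly one string (or NA) per row, apply() yields a plain character vector that fits directly into a data frame column.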
【Comments】:
【Answer 3】:
Here is a tidyverse solution:
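library(tidyverse)
# Expects column names of the form veg_* / fruit_*, as in the original input
input %>%
  pivot_longer(-id) %>%
  group_by(id) %>%
  # Split e.g. "veg_lettuce" into type = "veg" and class = "lettuce"
  separate(name, into = c('type', 'class'), sep = "_") %>%
  # Drop the rows whose value is NA
  na.omit() %>%
  select(-value) %>%
  group_by(id, type) %>%
  # Join the remaining names within each id and type
  summarise(class = toString(class)) %>%
  ungroup() %>%
  pivot_wider(names_from = type, values_from = class) %>%
  select(id, veg, fruit)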
This gives us:
# A tibble: 2 x 3
  id    veg               fruit
  <chr> <chr>             <chr>
1 id_1  lettuce           apple
2 id_2  tomato, bellpeper NA
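The only visible difference between this output and the first answer's output is the separator inside the veg cell. A small sketch of where that comes from: toString() joins with a comma followed by a space, while paste0(..., collapse = ",") joins with a bare comma. The variable names are made up for this illustration.

```r
items <- c("tomato", "bellpeper")

# Bare comma, as produced by paste0(x, collapse = ",")
joined_comma <- paste0(items, collapse = ",")

# Comma plus space, as produced by toString()
joined_tostring <- toString(items)

print(joined_comma)
print(joined_tostring)
```

Either form can be swapped into the other pipeline if a particular separator is required.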
【Comments】: