R: Get column names where value is not null

【Title】R: Get column names where value is not null 【Posted】2021-06-14 01:55:27 【Question】:

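I have a table with 7 columns: the first is an id, followed by 3 columns of vegetable types and then 3 columns of fruit types. The values indicate whether a person has that vegetable or fruit. Is there a way to group the vegetables and the fruits and, for each person, output the names of the columns they have a value in?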

Input data frame:

id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

Expected output data frame:

output_id1 <- c("id_1", "lettuce", "apple")
output_id2 <- c("id_2", "tomato, bellpeper", NA)
output <- data.frame(rbind(output_id1, output_id2))
colnames(output) <- c("id", "veg", "fruit")

【Comments】:

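If you are not interested in having "veg" and "fruit" included in the returned names, could you drop those words entirely from the start, say by renaming the input data frame before doing any analysis?
Yes, that's doable. I will edit the question.

【Answer 1】: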

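Using the original `input` data you posted, with the `veg_` and `fruit_` prefixes (also shown under Data below), you can do this with the tidyr package: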

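library(tidyr)

input %>% 
  # Reshape to long form, splitting names such as "veg_lettuce" into
  # type = "veg" and val = "lettuce"; rows with NA values are dropped
  tidyr::pivot_longer(cols = matches("^veg|^fruit"),
                      names_sep = "_",
                      names_to = c("type", "val"),
                      values_drop_na = TRUE) %>% 
  # Back to one row per id, collapsing the item names for each type
  tidyr::pivot_wider(id_cols = id,
                     names_from = type,
                     values_from = val,
                     values_fn = function(x) paste0(x, collapse = ","))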

Output

  id    veg              fruit
  <chr> <chr>            <chr>
1 id_1  lettuce          apple
2 id_2  tomato,bellpeper NA   
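
Note that the expected output in the question separates items with a comma and a space, whereas `paste0(x, collapse = ",")` uses a bare comma. A small base-R illustration of the difference (the vector `items` is made up for this example):

```r
# Hypothetical vector standing in for the values collapsed per id and type
items <- c("tomato", "bellpeper")

# Bare comma separator, as used in the pipeline above
with_comma <- paste0(items, collapse = ",")

# toString() is equivalent to paste(items, collapse = ", ")
with_comma_space <- toString(items)

print(with_comma)
print(with_comma_space)
```

If the question's exact format matters, passing `values_fn = toString` to `pivot_wider()` should give the comma-and-space form instead.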

Data

input <- structure(list(id = c("id_1", "id_2"), veg_lettuce = c("1", NA
), veg_tomato = c(NA, "1"), veg_bellpeper = c(NA, "1"), fruit_pineapple = c(NA_character_, 
NA_character_), fruit_apple = c("1", NA), fruit_banana = c(NA_character_, 
NA_character_)), class = "data.frame", row.names = c("id1", "id2"
))
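
If you start from the unprefixed data frame now shown in the question, the pipeline above needs the `veg_` and `fruit_` prefixes restored first. A minimal base-R sketch, assuming columns 2 to 4 are vegetables and columns 5 to 7 are fruit, as the question describes (`veg_cols` and `fruit_cols` are names introduced here for illustration):

```r
# Unprefixed input, as given in the question
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper",
                     "pineapple", "apple", "banana")

# Assumption: column positions follow the question's description
veg_cols   <- 2:4
fruit_cols <- 5:7

# Prepend the type to each column name so it can later be split on "_"
names(input)[veg_cols]   <- paste0("veg_", names(input)[veg_cols])
names(input)[fruit_cols] <- paste0("fruit_", names(input)[fruit_cols])

print(names(input))
```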

【Comments】:

【Answer 2】:

This should do the trick!

id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, 1, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

# Remove the id column, it's not necessary
input_without_id <- dplyr::select(input, -c("id"))

# For each row (MARGIN = 1) of the input, return the names vector
# (names(input_without_id)), but only at the positions where the row (x) is not NA
result <- apply(input_without_id, MARGIN = 1, function(x) 
    return(names(input_without_id)[which(!is.na(x))])
)

# Rename the result with the corresponding ids originally found in input.
names(result) <- input$id
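
The result above is a list of column names per id, without the veg/fruit grouping the question asks for. One way to add it, sketched in base R under the assumption that the vegetable and fruit column names are known up front (`groups` and `collapse_present` are illustrative names, not part of the answer, and the data is the question's version of `input`):

```r
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper",
                     "pineapple", "apple", "banana")

# Assumption: the grouping of columns is known in advance
groups <- list(
  veg   = c("lettuce", "tomato", "bellpeper"),
  fruit = c("pineapple", "apple", "banana")
)

# For one group of columns, return for each row the comma-separated names
# of the non-NA columns, or NA when the row has none
collapse_present <- function(df, cols) {
  unname(apply(df[, cols, drop = FALSE], 1, function(x) {
    present <- cols[!is.na(x)]
    if (length(present) == 0) NA_character_ else toString(present)
  }))
}

# One output column per group, next to the id column
output <- data.frame(
  id = input$id,
  lapply(groups, function(cols) collapse_present(input, cols)),
  row.names = NULL
)
print(output)
```

Keeping the grouping in a named list means a third category can be added without touching the helper.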

【Comments】:

【Answer 3】:

Here is a tidyverse solution (it expects the `veg_`/`fruit_` prefixed column names):

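library(tidyverse)

input %>% 
  # One row per (id, column) pair
  pivot_longer(-id) %>% 
  # Split e.g. "veg_lettuce" into type = "veg" and class = "lettuce"
  separate(name, into = c("type", "class"), sep = "_") %>% 
  # Keep only the items each person actually has
  na.omit() %>% 
  select(-value) %>% 
  group_by(id, type) %>% 
  # Collapse the item names per id and type into one string
  summarise(class = toString(class), .groups = "drop") %>% 
  pivot_wider(names_from = type, values_from = class) %>% 
  select(id, veg, fruit)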

This gives us:

# A tibble: 2 x 3
  id    veg               fruit
  <chr> <chr>             <chr>
1 id_1  lettuce           apple
2 id_2  tomato, bellpeper NA  

【Comments】:
