R: Get column names where value is not null

Posted 2023-03-29


【Title】R: Get column names where value is not null 【Posted】2021-06-14 01:55:27 【Question】:

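I have a table with 7 columns: the first is an id, the next 3 are vegetable types, and the last 3 are fruit types. The values indicate whether a person has that vegetable/fruit. Is there a way to group the vegetables and the fruits and, wherever the person has a vegetable/fruit, output the column name?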

Input data frame:

id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) = c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

Expected output data frame:

output_id1 <- c("id_1", "lettuce", "apple")
output_id2 <- c("id_2", "tomato, bellpeper", NA)
output <- data.frame(rbind(output_id1, output_id2))
colnames(output) <- c("id", "veg", "fruit")

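【Comments】:

If you are not interested in having "veg" and "fruit" in the returned names, could you drop those words entirely from the start? For example, by renaming the columns of the input data frame before doing any analysis.

Yes, that is feasible. I will edit the question.

【Answer 1】: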

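Using the original `input` data you posted (with the veg_/fruit_ column prefixes, also shown under Data below), you can do this with the tidyr package: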

library(tidyr)

input %>% 
  tidyr::pivot_longer(cols = matches("^veg|^fruit"),
                      names_sep = "_",
                      names_to = c("type", "val"),
                      values_drop_na = TRUE) %>% 
  tidyr::pivot_wider(id_cols = id,
                     names_from = type,
                     values_from = val,
                     values_fn = function(x) paste0(x, collapse = ","))

Output

  id    veg              fruit
  <chr> <chr>            <chr>
1 id_1  lettuce          apple
2 id_2  tomato,bellpeper NA

Data

input <- structure(list(id = c("id_1", "id_2"), veg_lettuce = c("1", NA
), veg_tomato = c(NA, "1"), veg_bellpeper = c(NA, "1"), fruit_pineapple = c(NA_character_, 
NA_character_), fruit_apple = c("1", NA), fruit_banana = c(NA_character_, 
NA_character_)), class = "data.frame", row.names = c("id1", "id2"
))
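The data frame in the question (as edited) uses unprefixed column names, while this answer matches on the veg_ and fruit_ prefixes. A minimal base R sketch for adding the prefixes back, assuming (as the question describes) that the first three non-id columns are vegetables and the last three are fruits:

```r
# The question's input as posted (unprefixed column names)
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

# Assumption: columns 2-4 are vegetables and columns 5-7 are fruits
names(input)[2:4] <- paste0("veg_", names(input)[2:4])
names(input)[5:7] <- paste0("fruit_", names(input)[5:7])
```

After this step the column names have the `type_name` shape that `names_sep = "_"` splits on.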

【Answer 2】:

This should do the trick!

id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, 1, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) = c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

# Remove the id column, it's not necessary
input_without_id <- dplyr::select(input, -c("id"))

# For each row (MARGIN = 1) of the input, return the names vector (names(input_without_id))
# but only at the positions where the row (x) is not NA
result <- apply(input_without_id, MARGIN = 1, function(x) 
    return(names(input_without_id)[which(!is.na(x))])
)

# Rename the result with the corresponding ids originally found in input.
names(result) <- input$id
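The result above is a list of column names per id, not yet split into veg and fruit. A possible base R sketch that produces the two grouped columns from the question's expected output follows. The `groups` mapping and the helper `collapse_present` are hypothetical names introduced here for illustration, not part of the original answer.

```r
# The question's input (all columns become character after rbind)
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

# Assumption: the column-to-group mapping is supplied by hand
groups <- list(
  veg   = c("lettuce", "tomato", "bellpeper"),
  fruit = c("pineapple", "apple", "banana")
)

# Hypothetical helper: for each row, join the names of the non-NA columns
# into one string, or return NA when the row has none in this group
collapse_present <- function(df, cols) {
  apply(df[, cols, drop = FALSE], 1, function(x) {
    present <- cols[!is.na(x)]
    if (length(present) == 0) NA_character_ else paste(present, collapse = ", ")
  })
}

# One output column per group, next to the id column
output <- data.frame(
  id = input$id,
  lapply(groups, function(cols) unname(collapse_present(input, cols))),
  row.names = NULL
)
```

Because every row yields exactly one string per group, `apply` always returns a plain character vector here, regardless of how many columns are non-NA in each row.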

【Answer 3】:

Here is a tidyverse solution:

library(tidyverse)

# Expects the veg_*/fruit_* prefixed column names
input %>% 
  pivot_longer(-id) %>%                                   # one row per id/column pair
  separate(name, into = c('type', 'class'), sep = "_") %>% # "veg_lettuce" -> veg, lettuce
  na.omit() %>%                                           # keep only items the person has
  select(-value) %>% 
  group_by(id, type) %>% 
  summarise(class = toString(class), .groups = "drop") %>% # join names with ", "
  pivot_wider(names_from = type, values_from = class) %>% 
  select(id, veg, fruit)

This gives us:

# A tibble: 2 x 3
  id    veg               fruit
  <chr> <chr>             <chr>
1 id_1  lettuce           apple
2 id_2  tomato, bellpeper NA
