In R, get a subset of a data frame where the values in a column are contained in a list [duplicate]

Posted 2023-03-11

【Posted】: 2017-03-04 07:00:00 【Question】:

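For example, suppose I have a data frame called df with an integer column "ID", and I want the subset of my data frame where the value in "ID" is in the vector [123,198,204,245,87,91,921].

What is the syntax for this in R?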

【Answer 1】:

I believe you want the %in% operator:

df <- data.frame(ID=1:1000, STUFF=runif(1000))
df2 <- df[df$ID %in% c(123,198,204,245,87,91,921), ]
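
As a small check of the same idea, the sketch below uses a tiny hand-made data frame so the result is predictable. The names `small`, `wanted`, `kept` and `kept2` are made up for illustration and are not part of the question. Rows come back in the data frame's own order, not in the order of the lookup vector, and IDs that do not occur in the data frame are simply ignored.

```r
# Hypothetical toy data, not taken from the question
small <- data.frame(ID = c(87, 5, 123, 300, 921),
                    STUFF = c("a", "b", "c", "d", "e"))
wanted <- c(123, 198, 204, 245, 87, 91, 921)

# Logical row index: TRUE where ID appears in wanted
kept <- small[small$ID %in% wanted, ]

# subset() is an equivalent base R spelling, handy for interactive use
kept2 <- subset(small, ID %in% wanted)

print(kept$ID)                         # 87 123 921 (data frame order)
print(identical(kept$ID, kept2$ID))    # both approaches select the same rows
```

The trailing comma in `small[..., ]` matters: it selects rows and keeps every column.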

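【Answer 2】:

Let me know if this solves your problem.

First, we need the which function.

?which

Which indices are TRUE?

Description

Give the TRUE indices of a logical object, allowing for array indices.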

i <- 1:10

which(i < 5)

[1] 1 2 3 4

We also need the %in% operator:

?"%in%"

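%in% is a more intuitive interface as a binary operator, which returns a logical vector indicating if there is a match or not for its left operand.

2 %in% 1:5

[1] TRUE

2 %in% 5:10

[1] FALSE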
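
The left operand can also be a vector, in which case `%in%` tests each element in turn and never returns NA. A minimal sketch, with variable names chosen only for illustration; the `?match` help page defines `x %in% table` as `match(x, table, nomatch = 0) > 0`, which the last line checks.

```r
x <- c(2, 7, 5, NA)   # illustrative values, not from the question
tbl <- 1:5

# One TRUE/FALSE per element of x
res <- x %in% tbl
print(res)            # TRUE FALSE TRUE FALSE

# Same result via match(), as documented in ?match
print(identical(res, match(x, tbl, nomatch = 0) > 0))
```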

Putting it all together

# some starting ids
id <- c(123, 204, 11, 12, 13, 15, 87, 123)

# the df constructed with the ids
df <- data.frame(id)

# the valid ids 
valid.ids <- c(123,198,204,245,87,91,921)

# positions is a logical vector that records, for each element, whether it is a match
positions <- df$id %in% valid.ids

positions

[1]  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE

# BONUS
# we can easily count how many matches we have:
sum(positions)

[1] 4

# using the which function we get only the indices 'which' contain TRUE
matched_elements_positions <- which(positions)

matched_elements_positions

[1] 1 2 7 8

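# last step, we select only the matching rows from our dataframe
# (df has a single column, so the result is simplified to a plain vector;
# add drop = FALSE to keep a data frame)
df[matched_elements_positions,]

[1] 123 204  87 123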
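
The `which()` step is optional, since a logical vector can index rows directly. The sketch below repeats the example in compact form and adds `drop = FALSE`, because selecting rows of a one-column data frame otherwise returns a plain vector. The name `matched` is introduced here only for illustration.

```r
id <- c(123, 204, 11, 12, 13, 15, 87, 123)
df <- data.frame(id)
valid.ids <- c(123, 198, 204, 245, 87, 91, 921)

# Logical indexing without which(); drop = FALSE keeps the data frame structure
matched <- df[df$id %in% valid.ids, , drop = FALSE]

print(class(matched))   # "data.frame"
print(matched$id)       # 123 204 87 123
print(nrow(matched))    # 4
```

The original row names (1, 2, 7, 8) are preserved, so they can still be used to trace each match back to its position in `df`.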
