Nested subsetting in a dataframe in R
【Posted】: 2022-01-14 21:45:33
【Question】: I would like to subset my data (below) so that I end up with 4 studies, made up of:
(A) 2 unique studies with study_type == standard: 1 study with reporting == subscale and 1 study with reporting == composite (for example, studies 1 and 3),
and
(B) 2 unique studies with study_type == alternative: 1 study with reporting == subscale and 1 study with reporting == composite (for example, studies 5 and 7).
Is this possible in R?
m="
study subscale reporting obs include yi vi study_type
1 A subscale 1 yes 1.94 0.33503768 standard
1 A subscale 2 yes 1.06 0.01076604 standard
2 A subscale 3 yes 2.41 0.23767389 standard
2 A subscale 4 yes 2.34 0.37539841 standard
3 A&C composite 5 yes 3.09 0.31349510 standard
3 A&C composite 6 yes 3.99 0.01349510 standard
4 A&B composite 7 yes 2.90 0.91349510 standard
4 A&B composite 8 yes 3.01 0.99349510 standard
5 G&H composite 9 yes 1.01 0.99910197 alternative
5 G&H composite 10 yes 2.10 0.97910095 alternative
6 E&G composite 11 yes 0.11 0.27912095 alternative
6 E&G composite 12 yes 3.12 0.87910095 alternative
7 E subscale 13 yes 0.08 0.21670360 alternative
7 G subscale 14 yes 1.00 0.91597190 alternative
8 F subscale 15 yes 1.08 0.81670360 alternative
8 E subscale 16 yes 0.99 0.91297170 alternative"
data <- read.table(text = m, header = TRUE)
【Answer 1】: If I understand correctly, you can use dplyr::distinct:
library(tidyverse)
data %>%
  distinct(study_type, reporting, .keep_all = TRUE)
#> study subscale reporting obs include yi vi study_type
#> 1 1 A subscale 1 yes 1.94 0.3350377 standard
#> 2 3 A&C composite 5 yes 3.09 0.3134951 standard
#> 3 5 G&H composite 9 yes 1.01 0.9991020 alternative
#> 4 7 E subscale 13 yes 0.08 0.2167036 alternative
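Note that distinct() keeps only the first row of each study_type/reporting combination, so each study above is represented by a single observation. If the goal is to keep every row of the chosen studies (as in studies 1, 3, 5 and 7), one possible sketch is to pick one study id per combination and join back to the full data. This assumes that taking the first study listed in each combination is acceptable; the names picked and result are illustrative only.

```r
library(dplyr)

# Recreate the example data from the question
m <- "
study subscale reporting obs include yi vi study_type
1 A subscale 1 yes 1.94 0.33503768 standard
1 A subscale 2 yes 1.06 0.01076604 standard
2 A subscale 3 yes 2.41 0.23767389 standard
2 A subscale 4 yes 2.34 0.37539841 standard
3 A&C composite 5 yes 3.09 0.31349510 standard
3 A&C composite 6 yes 3.99 0.01349510 standard
4 A&B composite 7 yes 2.90 0.91349510 standard
4 A&B composite 8 yes 3.01 0.99349510 standard
5 G&H composite 9 yes 1.01 0.99910197 alternative
5 G&H composite 10 yes 2.10 0.97910095 alternative
6 E&G composite 11 yes 0.11 0.27912095 alternative
6 E&G composite 12 yes 3.12 0.87910095 alternative
7 E subscale 13 yes 0.08 0.21670360 alternative
7 G subscale 14 yes 1.00 0.91597190 alternative
8 F subscale 15 yes 1.08 0.81670360 alternative
8 E subscale 16 yes 0.99 0.91297170 alternative"
data <- read.table(text = m, header = TRUE)

# One study id per (study_type, reporting) combination:
# assumption: the first study encountered in each combination is the one to keep
picked <- data %>%
  distinct(study_type, reporting, study) %>%
  group_by(study_type, reporting) %>%
  slice(1) %>%
  ungroup()

# Keep all rows that belong to the picked studies, in their original order
result <- semi_join(data, picked, by = c("study_type", "reporting", "study"))
result
```

Replacing slice(1) with slice_sample(n = 1) would pick a random study per combination instead of the first.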
【Answer 2】: If you are asking how to filter the data into the subsets you describe, you can do it like this:
> study1 <- dplyr::filter(data, study_type == "standard" & reporting == "subscale")
> study1
study subscale reporting obs include yi vi study_type
1 1 A subscale 1 yes 1.94 0.33503768 standard
2 1 A subscale 2 yes 1.06 0.01076604 standard
3 2 A subscale 3 yes 2.41 0.23767389 standard
4 2 A subscale 4 yes 2.34 0.37539841 standard
> study2 <- dplyr::filter(data, study_type == "standard" & reporting == "composite")
> study2
study subscale reporting obs include yi vi study_type
1 3 A&C composite 5 yes 3.09 0.3134951 standard
2 3 A&C composite 6 yes 3.99 0.0134951 standard
3 4 A&B composite 7 yes 2.90 0.9134951 standard
4 4 A&B composite 8 yes 3.01 0.9934951 standard
> study3 <- dplyr::filter(data, study_type == "alternative" & reporting == "subscale")
> study3
study subscale reporting obs include yi vi study_type
1 7 E subscale 13 yes 0.08 0.2167036 alternative
2 7 G subscale 14 yes 1.00 0.9159719 alternative
3 8 F subscale 15 yes 1.08 0.8167036 alternative
4 8 E subscale 16 yes 0.99 0.9129717 alternative
> study4 <- dplyr::filter(data, study_type == "alternative" & reporting == "composite")
> study4
study subscale reporting obs include yi vi study_type
1 5 G&H composite 9 yes 1.01 0.999102 alternative
2 5 G&H composite 10 yes 2.10 0.979101 alternative
3 6 E&G composite 11 yes 0.11 0.279121 alternative
4 6 E&G composite 12 yes 3.12 0.879101 alternative
【Comments】:
See also split(df, df[c("reporting", "study_type")])
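The split() suggestion can be sketched in base R as follows. The comment uses df as a placeholder; here it is applied to the question's data. Keeping the first study listed in each piece is an assumption made for illustration, and the names groups, first_study and subset4 are hypothetical.

```r
# Recreate the example data from the question
m <- "
study subscale reporting obs include yi vi study_type
1 A subscale 1 yes 1.94 0.33503768 standard
1 A subscale 2 yes 1.06 0.01076604 standard
2 A subscale 3 yes 2.41 0.23767389 standard
2 A subscale 4 yes 2.34 0.37539841 standard
3 A&C composite 5 yes 3.09 0.31349510 standard
3 A&C composite 6 yes 3.99 0.01349510 standard
4 A&B composite 7 yes 2.90 0.91349510 standard
4 A&B composite 8 yes 3.01 0.99349510 standard
5 G&H composite 9 yes 1.01 0.99910197 alternative
5 G&H composite 10 yes 2.10 0.97910095 alternative
6 E&G composite 11 yes 0.11 0.27912095 alternative
6 E&G composite 12 yes 3.12 0.87910095 alternative
7 E subscale 13 yes 0.08 0.21670360 alternative
7 G subscale 14 yes 1.00 0.91597190 alternative
8 F subscale 15 yes 1.08 0.81670360 alternative
8 E subscale 16 yes 0.99 0.91297170 alternative"
data <- read.table(text = m, header = TRUE)

# One data frame per reporting x study_type combination (a named list of 4)
groups <- split(data, data[c("reporting", "study_type")])
names(groups)

# Assumption: within each piece, keep all rows of the first study listed
first_study <- lapply(groups, function(g) g[g$study == g$study[1], ])

# Stack the four pieces back into a single data frame
subset4 <- do.call(rbind, first_study)
subset4
```

This gives the same four pieces as the separate filter() calls in Answer 2, but as one named list that can be processed with lapply().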