How do I select columns by name or by standard deviation at the same time?

Question

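Solution

I went with the solution provided by @thelatemail because I am trying to stick with the tidyverse, and therefore dplyr. I am still new to R, so I am taking small steps and leaning on helper libraries. Thank you all for taking the time to provide solutions.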

df_new <- df_inh %>%
select(
  isolate,
  Phenotype,
  which(
    sapply( ., function( x ) sd( x ) != 0 )
  )
)
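For readers on dplyr 1.0.0 or later, the same idea can probably be written with the `where()` selection helper instead of `which(sapply(...))`. This is only a sketch under stated assumptions: the small `df_inh` below is a made-up stand-in for the real data, the non-ID columns are assumed to be numeric 0/1 values with no missing entries, and the `is.numeric()` guard is added so that `sd()` is never called on the character columns.

```r
library(dplyr)

# Hypothetical stand-in for df_inh: two ID columns plus 0/1 marker columns.
df_inh <- data.frame(
  isolate   = c("a", "b", "c"),
  Phenotype = c("R", "S", "R"),
  m1 = c(1, 1, 1),  # constant column, sd is 0
  m2 = c(0, 1, 0),  # varying column
  m3 = c(0, 0, 0),  # constant column, sd is 0
  m4 = c(1, 0, 1)   # varying column
)

df_new <- df_inh %>%
  select(
    isolate,
    Phenotype,
    # keep numeric columns whose standard deviation is non-zero
    where(~ is.numeric(.x) && sd(.x) != 0)
  )

names(df_new)
```

With this toy data the constant columns `m1` and `m3` are dropped, while `isolate` and `Phenotype` are kept by name. If a column contained `NA`, `sd()` would return `NA` and the predicate would fail, so `na.rm = TRUE` might be needed in that case.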

Problem

I am trying to select a column if its name is "isolate" or "Phenotype", or if the standard deviation of its values is not 0.

I tried the following code.

df_new <- df_inh %>%
# remove isolate and Phenotype column for now, don't want to calculate their standard deviation
select(
  -isolate,
  -Phenotype
) %>%
# remove columns with all 1's or all 0's by calculating column standard deviation
select_if(
  function( col ) return( sd( col ) != 0 )
) %>%
# add back the isolate and Phenotype columns
select(
  isolate,
  Phenotype
)

I also tried this:

df_new <- df_inh %>%
select_if(
  function( col ) {
  if ( col == 'isolate' | col == 'Phenotype' ) {
    return( TRUE )
  }
  else {
    return( sd( col ) != 0 )
  }
}
)

I can select columns by standard deviation or by column name, but I cannot do both at the same time.
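One likely reason the second attempt does not work is that the predicate given to `select_if()` receives each column's values, not its name, so `col == 'isolate'` compares the data against the string. A minimal base R sketch of combining the two conditions is to build one logical mask from the names and another from the standard deviations, then take their union. The `df_inh` below is an invented example, and the `is.numeric()` check is an added assumption so that `sd()` only runs on numeric columns.

```r
# Hypothetical stand-in for df_inh.
df_inh <- data.frame(
  isolate   = c("a", "b", "c"),
  Phenotype = c("R", "S", "R"),
  m1 = c(1, 1, 1),  # constant column, sd is 0
  m2 = c(0, 1, 0),  # varying column
  stringsAsFactors = FALSE
)

# Condition 1: the column is one of the named ID columns.
keep_by_name <- names(df_inh) %in% c("isolate", "Phenotype")

# Condition 2: the column is numeric and its values vary.
keep_by_sd <- vapply(
  df_inh,
  function(col) is.numeric(col) && sd(col) != 0,
  logical(1)
)

# Keep a column if either condition holds.
df_new <- df_inh[, keep_by_name | keep_by_sd, drop = FALSE]

names(df_new)
```

Because the mask is computed over all columns at once, the original column order is preserved and no column has to be removed and added back.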

Answer 1

Another answer