Trouble mutating new column using case_when in dplyr

Posted: 2019-12-22 02:15:32

Question:

I want to mutate a new column based on the values of the Código column of this data frame, which vary in length.

ods <- readODS::read_ods('http://www.arcotel.gob.ec/wp-content/uploads/2016/09/proyeccion_cantonal_total_2010-2020_seg%C3%BAn_INEC1.ods', skip = 2)

I tried using case_when inside mutate like this:

mutate(ods, Provincia = case_when(
        length(ods$Código) == 3 ~ str_extract(ods$Código, '[[:digit:]]{1}'),
        length(ods$Código) == 4 ~ str_extract(ods$Código, '[[:digit:]]{2}')
))

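Whenever a value is 3 characters long, this should create a new Provincia column from the first digit of Código; otherwise it should extract the first two digits. When I run the code above, I only get NAs.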

Comments:

I think you want nchar here instead of length.

Very nice! Would you post your comment as an answer so I can upvote it?

Answer 1:

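Use nchar, which counts the number of characters in each observation. length(ods$Código) instead returns the number of elements in the whole column, a single number that is neither 3 nor 4 here, so no condition is ever TRUE and case_when returns NA for every row: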

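library(dplyr)    # mutate(), case_when()
library(stringr)  # str_extract()

# nchar() is vectorised: it yields one character count per row
ods <- mutate(ods, Provincia = case_when(
         nchar(ods$Código) == 3 ~ str_extract(ods$Código, '[[:digit:]]{1}'),
         nchar(ods$Código) == 4 ~ str_extract(ods$Código, '[[:digit:]]{2}')
 ))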
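
To see the difference between length and nchar without downloading the spreadsheet, here is a minimal sketch on made-up data. The tibble `toy`, its ASCII column name `codigo` (a stand-in for Código), and the five sample codes are assumptions for illustration only, not values taken from the real file.

```r
library(dplyr)
library(stringr)

# Hypothetical sample codes, NOT the real data from the .ods file
toy <- tibble(codigo = c("101", "205", "950", "1001", "2403"))

# length() describes the vector as a whole: how many elements it holds
n_elements <- length(toy$codigo)

# nchar() works element by element: characters in each individual code
n_chars <- nchar(toy$codigo)

# With nchar() each row is tested on its own, so each row gets a match
toy <- mutate(toy, Provincia = case_when(
  nchar(codigo) == 3 ~ str_extract(codigo, "[[:digit:]]{1}"),
  nchar(codigo) == 4 ~ str_extract(codigo, "[[:digit:]]{2}")
))

print(toy)
```

Here `n_elements` is a single number for the whole column, while `n_chars` holds one value per row, which is what a row-wise condition in case_when needs.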

Result:

> ods %>% pull(Provincia)
  [1] "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "1"  "2"  "2"  "2"  "2"  "2"  "2" 
 [22] "2"  "3"  "3"  "3"  "3"  "3"  "3"  "3"  "4"  "4"  "4"  "4"  "4"  "4"  "5"  "5"  "5"  "5"  "5"  "5"  "5" 
 [43] "6"  "6"  "6"  "6"  "6"  "6"  "6"  "6"  "6"  "6"  "7"  "7"  "7"  "7"  "7"  "7"  "7"  "7"  "7"  "7"  "7" 
 [64] "7"  "7"  "7"  "8"  "8"  "8"  "8"  "8"  "8"  "8"  "8"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9" 
 [85] "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "9"  "10" "10" "10" "10" "10" "10"
[106] "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "12" "12" "12" "12" "12"
[127] "12" "12" "12" "12" "12" "12" "12" "12" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13"
[148] "13" "13" "13" "13" "13" "13" "13" "13" "13" "14" "14" "14" "14" "14" "14" "14" "14" "14" "14" "14" "14"
[169] "15" "15" "15" "15" "15" "16" "16" "16" "16" "17" "17" "17" "17" "17" "17" "17" "17" "18" "18" "18" "18"
[190] "18" "18" "18" "18" "18" "19" "19" "19" "19" "19" "19" "19" "19" "19" "20" "20" "20" "21" "21" "21" "21"
[211] "21" "21" "21" "22" "22" "22" "22" "23" "24" "24" "24" "90" "90" "90"
