How to extract several substrings with a for loop in R

Posted 2023-02-14

【Asked】: 2022-01-07 23:34:12 【Question】:

I have the following 100 strings:

 [3] "Department_Complementary_Demand_Converted_Sum" 
 [4] "Department_Home_Demand_Converted_Sum"                   
 [5] "Department_Store A_Demand_Converted_Sum"                
 [6] "Department_Store B_Demand_Converted_Sum"
 ...                
 [100] "Department_Unisex_Demand_Converted_Sum"

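Obviously, I could call substr() on each string and supply different start and end indices every time. But as you can see, all the strings start with Department_ and end with _Demand_Converted_Sum. I only want to extract what lies in between. If there were a way to always skip the first 11 characters from the left and the last 21 characters from the right, I could run a for loop over all 100 strings above.

Example

Given input: Department_Unisex_Demand_Converted_Sum

Expected output: Unisex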

【Comments】:

Can you show the expected output for the input shown?
@sindri_baldur - Sure. Please check my edit.
gsub("^Department_|_Demand_Converted_Sum$", "", string) or stringr::str_sub(string, 12, -22).
@RitchieSacramento - Awesome, thanks!

【Answer 1】:

Using strsplit(),

sapply(strsplit(str, '_'), '[', 2)
# [1] "Complementary" "Home"          "Store A"

or stringi::stri_sub_all().

unlist(stringi::stri_sub_all(str, 12, -22))
# [1] "Complementary" "Home"          "Store A"
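
The fixed-offset idea from the question can also be sketched in base R with substr() and nchar(), without any extra packages. This is only a minimal sketch: the vector name `x` and the helper variables `prefix` and `suffix` are illustrative assumptions, and the input is limited to three of the 100 strings. It assumes every string really does carry the same prefix and suffix.

```r
# Sample input copied from the question (three of the 100 strings).
# `x`, `prefix` and `suffix` are illustrative names, not from the original post.
x <- c("Department_Complementary_Demand_Converted_Sum",
       "Department_Home_Demand_Converted_Sum",
       "Department_Store A_Demand_Converted_Sum")

prefix <- "Department_"            # 11 characters
suffix <- "_Demand_Converted_Sum"  # 21 characters

# Vectorised: start right after the prefix, stop right before the suffix.
middle <- substr(x, nchar(prefix) + 1, nchar(x) - nchar(suffix))

# The explicit for loop the question asks about gives the same result.
middle_loop <- character(length(x))
for (i in seq_along(x)) {
  middle_loop[i] <- substr(x[i], nchar(prefix) + 1, nchar(x[i]) - nchar(suffix))
}

print(middle)
```

Because substr() and nchar() are vectorised, the loop is not strictly needed; it is shown only because the question asks for one.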

【Answer 2】:

This looks like a classic case for lookarounds:

library(stringr)
str_extract(str, "(?<=Department_)[^_]+(?=_)")
# [1] "Complementary" "Home"          "Store A"
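
One caveat may be worth sketching: splitting on "_" or matching [^_]+ stops at the first underscore inside the middle part, whereas stripping the fixed prefix and suffix does not. The input `tricky` below is a hypothetical string that does not appear in the question, and the lookaround is run through base R's regexpr(perl = TRUE) instead of stringr so that the sketch needs no packages.

```r
# Hypothetical input, not from the question: the middle part contains "_".
tricky <- "Department_Store_A_Demand_Converted_Sum"

# Splitting on "_" keeps only the piece before the inner underscore.
by_split <- sapply(strsplit(tricky, "_"), `[`, 2)

# The same lookaround pattern, run through base R's PCRE engine.
by_lookaround <- regmatches(
  tricky,
  regexpr("(?<=Department_)[^_]+(?=_)", tricky, perl = TRUE)
)

# Stripping the fixed prefix and suffix keeps the whole middle part.
by_strip <- gsub("^Department_|_Demand_Converted_Sum$", "", tricky)

print(c(split = by_split, lookaround = by_lookaround, strip = by_strip))
```

If none of the 100 names contains an underscore, all three approaches should agree, so this only matters for inputs like the hypothetical one above.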

Data:

str <- c("Department_Complementary_Demand_Converted_Sum",
         "Department_Home_Demand_Converted_Sum",
         "Department_Store A_Demand_Converted_Sum")
