Use multiple for loops to get combinations of all elements
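【Title】: Use Multiple for loop to get combinations of all elements 【Posted】: 2017-11-16 15:38:18
【Question】: I am trying to build combinations from the elements of three character vectors, but I only get the last iteration for each element. I also want to add a condition: budg_min should not be greater than budg_max when the list of combinations is created.
Here is my code: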
text1="http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom="
text3="&proptype="
text4="Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment"
text5="&cityName=Thane&BudgetMin="
text6="&BudgetMax="
uuu=list()
bhk=c("1","2","3","4","5",">5")
budg_min=c("5-Lacs","10-Lacs","20-Lacs","30-Lacs","40-Lacs","50-Lacs","60-Lacs","70-Lacs","80-Lacs","90-Lacs","1-Crores","1.2-Crores","1.4-Crores","1.6-Crores","1.8-Crores","2-Crores","2.3-Crores","2.6-Crores","3-Crores","3.5-Crores","4-Crores","4.5-Crores","5-Crores","10-Crores","20-Crores")
budg_max=c("5-Lacs","10-Lacs","20-Lacs","30-Lacs","40-Lacs","50-Lacs","60-Lacs","70-Lacs","80-Lacs","90-Lacs","1-Crores","1.2-Crores","1.4-Crores","1.6-Crores","1.8-Crores","2-Crores","2.3-Crores","2.6-Crores","3-Crores","3.5-Crores","4-Crores","4.5-Crores","5-Crores","10-Crores","20-Crores")
for(i in bhk){
  for(j in budg_min){
    for(k in budg_max){
      if(budg_min>budg_max){"Skip that combination "}
      else{
        uuu[i]=paste(text1,i,text3,text4,text5,j,text6,k,sep = "")
      }
    }
  }
}
I am expecting output like this:
[1]
http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMax=5-Lacs
[2]
http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs
[3]
http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=20-Lacs
.
.
.
.
[n]
http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=%3E5&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMax=20-Crores
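Note: in the output above, the first element of the list contains only the BudgetMax parameter, and the last (nth) element also contains only the BudgetMax parameter. The remaining elements are combinations of bhk, budg_min and budg_max.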
But my code gives only 6 records:
$`1`
[1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=20-Crores&BudgetMax=20-Crores"
$`2`
[1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=2&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=20-Crores&BudgetMax=20-Crores"
$`3`
[1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=3&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=20-Crores&BudgetMax=20-Crores"
$`4`
[1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=4&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=20-Crores&BudgetMax=20-Crores"
$`5`
[1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=5&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=20-Crores&BudgetMax=20-Crores"
$`>5`
[1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=>5&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=20-Crores&BudgetMax=20-Crores"
What changes should I make to my code so that it gives me all the combinations? Any help would be much appreciated. Thanks.
【Comments】:
【Answer 1】: There is no need to use a for loop. With expand.grid and sprintf:
eg <- expand.grid(bhk = bhk, budg_min = budg_min, budg_max = budg_max)
eg <- eg[as.integer(eg$budg_min) <= as.integer(eg$budg_max),]
uuu <- sprintf("%s%s%s%s%s%s%s%s", text1,eg[,1],text3,text4,text5,eg[,2],text6,eg[,3])
you also get the desired result:
> head(uuu,10)
 [1] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs"
 [2] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=2&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs"
 [3] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=3&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs"
 [4] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=4&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs"
 [5] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=5&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs"
 [6] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=>5&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs"
 [7] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs"
 [8] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=2&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs"
 [9] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=3&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs"
[10] "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=4&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs"
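Explanation:
With expand.grid you create all combinations of the vectors bhk, budg_min and budg_max.
Because the levels of the factor variables budg_min and budg_max are in ascending order of monetary value, you can filter out the cases where budg_min > budg_max by converting these factors to integers.
sprintf pastes all the vectors together according to the specified format ("%s%s%s%s%s%s%s%s"). Each %s in the format is replaced by the elements of the corresponding vector.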
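The three steps in the explanation can be tried on a smaller scale. The following is a minimal sketch, assuming shortened toy vectors (rooms and prices are illustrative names, not from the original post) and a shortened URL pattern; it only shows how the level order of the factors drives the filter.

```r
# Toy inputs (hypothetical, shortened versions of the vectors in the question)
rooms  <- c("1", "2")
prices <- c("5-Lacs", "10-Lacs", "1-Crores")

# All combinations; character inputs become factors whose levels keep input order
grid <- expand.grid(bhk = rooms, budg_min = prices, budg_max = prices,
                    stringsAsFactors = TRUE)

# Integer codes follow the level order, so they rank the prices correctly
# as long as the input vector is already sorted by value
keep <- as.integer(grid$budg_min) <= as.integer(grid$budg_max)
grid <- grid[keep, ]

# sprintf is vectorised: one string per remaining row
urls <- sprintf("bedroom=%s&BudgetMin=%s&BudgetMax=%s",
                as.character(grid$bhk),
                as.character(grid$budg_min),
                as.character(grid$budg_max))
```

Note that the filter relies on the input vector being sorted by monetary value. If it were not, the integer codes would no longer reflect the price order and a numeric conversion would be needed instead.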
To convert all monetary values to lacs, you can do this (inspired by @MattJewett):
eg <- expand.grid(bhk = bhk, budg_min = budg_min, budg_max = budg_max)
# Convert values to lacs prior to min/max comparison
eg$min_lacs <- as.numeric(gsub('([0-9.]+).*','\\1',eg$budg_min))
eg$min_lacs[grepl('Crores',eg$budg_min)] <- eg$min_lacs[grepl('Crores',eg$budg_min)]*100
eg$max_lacs <- as.numeric(gsub('([0-9.]+).*','\\1',eg$budg_max))
eg$max_lacs[grepl('Crores',eg$budg_max)] <- eg$max_lacs[grepl('Crores',eg$budg_max)]*100
# Compare the numeric values directly; truncating with as.integer is not needed
eg <- eg[eg$min_lacs <= eg$max_lacs,]
uuu <- sprintf("%s%s%s%s%s%s%s%s", text1,eg[,1],text3,text4,text5,eg[,2],text6,eg[,3])
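The same conversion can be wrapped in a small function so that it is written only once for both columns. This is a sketch under the assumption that every label has the form "number-Lacs" or "number-Crores" and that one crore equals 100 lacs; to_lacs is a hypothetical helper name, not part of the original answer.

```r
# Hypothetical helper: convert labels such as "5-Lacs" or "1.2-Crores" to lacs
to_lacs <- function(label) {
  label <- as.character(label)                 # also accepts factors
  value <- as.numeric(sub("-.*$", "", label))  # numeric part before the hyphen
  # One crore is 100 lacs, so scale only the crore entries
  ifelse(grepl("Crores", label, fixed = TRUE), value * 100, value)
}

lacs <- to_lacs(c("5-Lacs", "90-Lacs", "1-Crores", "1.2-Crores"))
```

With such a helper, the filter could be written as eg[to_lacs(eg$budg_min) <= to_lacs(eg$budg_max), ] without adding the two temporary columns.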
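【Comments】:
Your code works fine, but please look at the post again, as I have also edited it a bit... Thanks for your effort!
Yes, this works.. this is the savior of the day.. thank you very much for your effort!!!
@deepesh I have now included a conversion as well.
Yes... saw the update, that is really nice.... Thanks again!!! :)
【Answer 2】: If you use apply, you can build it from scratch.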
expand.grid gives you all combinations of the three vectors:
allcombs <-expand.grid(bhk = bhk,bmin = budg_min, bmax =budg_max)
You create an index for the rows where budg_min is less than or equal to budg_max:
ix <- apply(allcombs,1,function(x) which(budg_min %in% x[2]) <= which(budg_max %in% x[3]))
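Then form only the valid combinations according to your condition (paste0 is used so that no spaces end up inside the URLs):
res <- apply(allcombs[ix,],1,function(x) paste0(text1,x[1],text3,text4,text5,x[2],text6,x[3]))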
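【Comments】:
Thank you very much for your effort. Your script works fine, but I have also edited the post with some other conditions. Thanks for your effort!!!
Absolutely @Val, this works very well with the condition... thank you very much for your effort!!!
【Answer 3】: Every time a new k element is processed, you overwrite the value at uuu[i].
For example, on the first pass through the loops i == 1, j == 1, k == 1, and the first value is assigned to uuu[1]. On the second pass i == 1, j == 1, k == 2, and the second value is also assigned to uuu[1] (because i is still equal to 1).
To fix this, you need a separate counter that keeps track of the items in the list.
Something like this should give you every combination in a new element.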
text1 <- "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom="
text3 <- "&proptype="
text4 <- "Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment"
text5 <- "&cityName=Thane&BudgetMin="
text6 <-"&BudgetMax="
uuu <- list()
bhk <- c("1","2","3","4","5",">5")
budg_min <- c("5-Lacs","10-Lacs","20-Lacs","30-Lacs","40-Lacs","50-Lacs","60-Lacs","70-Lacs","80-Lacs","90-Lacs","1-Crores","1.2-Crores","1.4-Crores","1.6-Crores","1.8-Crores","2-Crores","2.3-Crores","2.6-Crores","3-Crores","3.5-Crores","4-Crores","4.5-Crores","5-Crores","10-Crores","20-Crores")
budg_max <- c("5-Lacs","10-Lacs","20-Lacs","30-Lacs","40-Lacs","50-Lacs","60-Lacs","70-Lacs","80-Lacs","90-Lacs","1-Crores","1.2-Crores","1.4-Crores","1.6-Crores","1.8-Crores","2-Crores","2.3-Crores","2.6-Crores","3-Crores","3.5-Crores","4-Crores","4.5-Crores","5-Crores","10-Crores","20-Crores")
item <- 1
for(i in bhk){
  for(j in budg_min){
    # Split budg_min to separate value from unit
    min <- unlist(strsplit(j,"-"))
    # Convert Crores to Lacs to get min value in Lacs
    min <- ifelse(min[2] == "Crores", as.numeric(min[1]) * 100, as.numeric(min[1]))
    for(k in budg_max){
      # Split budg_max to separate value from unit
      max <- unlist(strsplit(k,"-"))
      # Convert Crores to Lacs to get max value in Lacs
      max <- ifelse(max[2] == "Crores", as.numeric(max[1]) * 100, as.numeric(max[1]))
      # If min is less than max, insert the combination
      if(min < max){
        uuu[item] <- paste(text1,i,text3,text4,text5,j,text6,k,sep = "")
        # Advance the counter only when an element was stored
        item <- item + 1
      }
    }
  }
}
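The overwriting problem can be seen in isolation with two short vectors. This is a minimal sketch, assuming toy inputs (letters_a, letters_b, overwritten and kept are illustrative names only); it contrasts indexing by the outer loop variable with indexing by a separate counter.

```r
letters_a <- c("x", "y")
letters_b <- c("1", "2", "3")

# Indexing by the outer loop variable: each inner iteration overwrites the same slot
overwritten <- list()
for (a in letters_a) {
  for (b in letters_b) {
    overwritten[a] <- paste0(a, b)
  }
}

# Indexing by a separate counter: every combination gets its own slot
kept <- list()
item <- 1
for (a in letters_a) {
  for (b in letters_b) {
    kept[item] <- paste0(a, b)
    item <- item + 1
  }
}

length(overwritten)  # one element per value of letters_a
length(kept)         # one element per combination
```

The first list ends up with one entry per outer value, holding only the last inner combination, which mirrors the 6 records in the question.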
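【Comments】:
Thanks for the effort, @Matt Jeweett. I have also edited the post with some other conditions. Please go through it once more.. Thanks for your effort!
I have updated my solution to handle the min/max comparison. It also has some code to handle the conversion from crores to lacs, which should make the comparison of these values more accurate. You may want to find a way to merge it into the solution provided by @Jaap, to make sure the values are compared in the same monetary unit.
Thanks for the suggested edit. I hope you don't mind that I rejected it, because imo it can be done more efficiently. See the update to my answer.
【Answer 4】: It seems uuu[i] is what causes the problem. Try this: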
df <- data.frame()
for(i in bhk){
  for(j in budg_min){
    for(k in budg_max){
      uuu=data.frame(paste(text1,i,text3,text4,text5,j,text6,k,sep = ""))
      df <- rbind(df, uuu)
    }
  }
}
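【Comments】:
The combinations created in "df" contain some URLs that do not exist, so I have also edited my post with other conditions..... For example, rows 26, 51, 52, 76, 77 and 78 of the data frame "df" contain links where budg_min is greater than budg_max, which ideally should not happen, so such URLs return no records. Any suggestions on this? Thanks for your effort!!!
【Answer 5】: Consider sapply/vapply with multiple args: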
mat <- sapply(budg_min, function(j, k, i, t1, t3, t4, t5, t6)
paste0(t1,i,t3,t4,t5,j,t6,k), budg_max, bhk, text1, text3, text4, text5, text6)
mat <- vapply(budg_min, function(j, k, i, t1, t3, t4, t5, t6)
paste0(t1,i,t3,t4,t5,j,t6,k), character(25), USE.NAMES = TRUE, budg_max, bhk, text1, text3, text4, text5, text6)
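Either of the above returns a matrix with named columns (the names come from the first input, here budg_min). If you want a long character vector instead, use as.vector(mat).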
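To illustrate the shape of the result and the effect of as.vector, here is a small sketch with two toy vectors (mins and maxs are illustrative names, and the pasted string is shortened); it covers only the two budget vectors and is not a full replacement for the code above.

```r
mins <- c("5-Lacs", "10-Lacs")
maxs <- c("5-Lacs", "10-Lacs", "20-Lacs")

# One column per element of mins; each column holds the strings for all of maxs
mat <- sapply(mins, function(j, k) paste0("BudgetMin=", j, "&BudgetMax=", k), maxs)

dim(mat)       # rows follow maxs, columns follow mins
colnames(mat)  # taken from the first input

# as.vector drops the names and flattens the matrix column by column
flat <- as.vector(mat)
```

Because the flattening is column-major, all strings for the first element of mins come first, followed by those for the second.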
【Comments】: