Generating all distinct permutations of a list in R


【Title】: Generating all distinct permutations of a list in R 【Posted】: 2012-06-21 04:46:07 【Question】:

I am trying to create a list of permutations of a list, such that, for example, perms(list("a", "b", "c")) returns

list(list("a", "b", "c"), list("a", "c", "b"), list("b", "a", "c"),
     list("b", "c", "a"), list("c", "a", "b"), list("c", "b", "a"))

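I am not sure how to proceed; any help would be greatly appreciated.

【Comments】:

There are several packages for generating permutations in R. I wrote a summary that includes benchmarks as well as usage demonstrations of each available method.

【Answer 1】:

A while back I had to do this in base R without loading any packages.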

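# Returns a matrix with one permutation of 1:n per row
permutations <- function(n) {
    if (n == 1) {
        return(matrix(1))
    } else {
        sp <- permutations(n - 1)   # permutations of 1:(n-1)
        p <- nrow(sp)
        A <- matrix(nrow = n * p, ncol = n)
        for (i in 1:n) {
            # put i first, then shift the remaining indices that are >= i up by one
            A[(i - 1) * p + 1:p, ] <- cbind(i, sp + (sp >= i))
        }
        return(A)
    }
}

Usage: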

> matrix(letters[permutations(3)],ncol=3)
     [,1] [,2] [,3]
[1,] "a"  "b"  "c" 
[2,] "a"  "c"  "b" 
[3,] "b"  "a"  "c" 
[4,] "b"  "c"  "a" 
[5,] "c"  "a"  "b" 
[6,] "c"  "b"  "a" 
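
The question asks for a list of lists rather than a matrix. As a minimal sketch (the helper name perms_as_list is made up here and is not part of the answer), each row of the index matrix can be used to reorder the input list:

```r
# Index-permutation generator from the answer above, repeated so the sketch is self-contained.
permutations <- function(n) {
    if (n == 1) {
        return(matrix(1))
    }
    sp <- permutations(n - 1)
    p <- nrow(sp)
    A <- matrix(nrow = n * p, ncol = n)
    for (i in 1:n) {
        A[(i - 1) * p + 1:p, ] <- cbind(i, sp + (sp >= i))
    }
    A
}

# Hypothetical helper: apply every index permutation to the list `x`.
perms_as_list <- function(x) {
    idx <- permutations(length(x))
    lapply(seq_len(nrow(idx)), function(r) x[idx[r, ]])
}

res <- perms_as_list(list("a", "b", "c"))
str(res)
```

Because the indexing is done on the list itself, the elements do not have to be single characters; any list should work the same way.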

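【Comments】:

Nice function. Looks fast too.
This function is much faster than combinat::permn when the number of permutations is larger. For example: microbenchmark::microbenchmark(permn(letters[1:9]), matrix(letters[permutations(9)],ncol=9), times=20)

【Answer 2】:

combinat::permn will do the job: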

> library(combinat)
> permn(letters[1:3])
[[1]]
[1] "a" "b" "c"

[[2]]
[1] "a" "c" "b"

[[3]]
[1] "c" "a" "b"

[[4]]
[1] "c" "b" "a"

[[5]]
[1] "b" "c" "a"

[[6]]
[1] "b" "a" "c"

Note that this becomes computationally expensive when the number of elements is large.
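
To give a sense of the cost: a vector of n distinct elements has n! permutations, so the size of the result grows very quickly. A small base R illustration (the values of n are arbitrary):

```r
# The number of permutations of n distinct elements is n!
n <- c(3, 5, 8, 10, 12)
counts <- factorial(n)
data.frame(n = n, permutations = counts)
```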

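【Comments】:

What about the case where, from a 3-letter string, we want not only all 3-letter elements but also the 2-letter and 1-letter ones?

【Answer 3】:

Base R can also provide the answer: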

all <- expand.grid(p1 = letters[1:3], p2 = letters[1:3], p3 = letters[1:3], stringsAsFactors = FALSE) 
perms <- all[apply(all, 1, function(x) length(unique(x)) == 3),]

【Comments】:

【Answer 4】:

You can try permutations() from the gtools package, but unlike permn() from combinat, it does not output a list:

> library(gtools)
> permutations(3, 3, letters[1:3])
     [,1] [,2] [,3]
[1,] "a"  "b"  "c" 
[2,] "a"  "c"  "b" 
[3,] "b"  "a"  "c" 
[4,] "b"  "c"  "a" 
[5,] "c"  "a"  "b" 
[6,] "c"  "b"  "a" 

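【Comments】:

It is worth noting that permutations is more flexible. It allows permuting m out of n elements and allows elements to be reused. I found this after trying permn without success.
It cannot generate all possible permutations when the source vector v has repeated elements. So suppose I want to get all possible permutations of the word letters

【Answer 5】:

A solution in base R, with no dependencies on other packages: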

> getPerms <- function(x) {
    if (length(x) == 1) {
        return(x)
    } else {
        res <- matrix(nrow = 0, ncol = length(x))
        for (i in seq_along(x)) {
            # fix x[i] in the first column, permute the remaining elements
            res <- rbind(res, cbind(x[i], Recall(x[-i])))
        }
        return(res)
    }
}

> getPerms(letters[1:3])
     [,1] [,2] [,3]
[1,] "a"  "b"  "c" 
[2,] "a"  "c"  "b" 
[3,] "b"  "a"  "c" 
[4,] "b"  "c"  "a" 
[5,] "c"  "a"  "b" 
[6,] "c"  "b"  "a"
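
The title asks for distinct permutations. When the input contains repeated values, a recursive generator like the one above emits some rows more than once. One simple (not the most efficient) way to handle this, sketched here with the made-up helper names all_perms and distinct_perms, is to drop duplicate rows with unique():

```r
# Recursive generator in the style of the answer above (one permutation per row).
all_perms <- function(x) {
    if (length(x) <= 1) {
        return(matrix(x, nrow = 1))
    }
    res <- matrix(nrow = 0, ncol = length(x))
    for (i in seq_along(x)) {
        res <- rbind(res, cbind(x[i], all_perms(x[-i])))
    }
    res
}

# Hypothetical helper: keep each distinct row only once.
distinct_perms <- function(x) unique(all_perms(x))

distinct_perms(c("a", "a", "b"))
```

This still generates all n! rows before filtering, so it is only a reasonable choice for short inputs.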

I hope this helps.

【Comments】:

Outperforms the gtools solution.
I had not tested it before, but it seems so. Cool.

【Answer 6】:
# Another recursive implementation
# for those who like to roll their own, no package required
    permutations <- function( x, prefix = c() )
    {
        if(length(x) == 0 ) return(prefix)
        do.call(rbind, sapply(1:length(x), FUN = function(idx) permutations( x[-idx], c( prefix, x[idx])), simplify = FALSE))
    }

    permutations(letters[1:3])
    #    [,1] [,2] [,3]
    #[1,] "a"  "b"  "c" 
    #[2,] "a"  "c"  "b" 
    #[3,] "b"  "a"  "c" 
    #[4,] "b"  "c"  "a" 
    #[5,] "c"  "a"  "b" 
    #[6,] "c"  "b"  "a" 

【Comments】:

Great answer! How about dropping sapply(..., simplify = FALSE) and using lapply(...) instead?

【Answer 7】:

Try:

> a = letters[1:3]
> eg = expand.grid(a,a,a)
> eg[!(eg$Var1==eg$Var2 | eg$Var2==eg$Var3 | eg$Var1==eg$Var3),]
   Var1 Var2 Var3
6     c    b    a
8     b    c    a
12    c    a    b
16    a    c    b
20    b    a    c
22    a    b    c

As @Adrian suggested in the comments, the last line can be replaced with:

eg[apply(eg, 1, anyDuplicated) == 0, ]
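
This filter-the-grid approach is best kept for small inputs: for n elements, expand.grid builds n^n rows, of which only n! are permutations. A quick sketch of how that ratio shrinks (variable names are chosen here for illustration):

```r
# Rows generated by expand.grid (n^n) versus rows kept after filtering (n!).
n <- 1:10
grid_rows <- n^n
kept_rows <- factorial(n)
hit_rate <- kept_rows / grid_rows
data.frame(n, grid_rows, kept_rows, hit_rate = signif(hit_rate, 3))
```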

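【Comments】:

Or, for the last line: eg[apply(eg, 1, anyDuplicated) == 0, ]

@dusadrian A note on scalability: I would think twice before using this approach in "serious" code, because the searched space (eg) grows unreasonably large as the sample size/sampled set increases (hit rate: n! vs. n^n, which worsens roughly exponentially, as estimated from Stirling's formula). For the 10-out-of-10 case, the hit rate is already only prod(1:10) / (10 ^ 10) = 0.036%. It also seems that all of the examined variants are stored in memory in a data frame at some point. That said, I have always liked this one for small manual tasks, because it is so easy to understand.

@brezniczky Yes, this is for demonstration purposes only. I have a completely different solution (in this thread) that is self-contained. Both use plain R, but for more memory-intensive operations one should of course implement some compiled code (in fact, most of R's internal functions are written in C).

【Answer 8】:

A fun "probabilistic" solution, using base R's sample: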

elements <- c("a", "b", "c")
k <- length(elements)
res=unique(t(sapply(1:200, function(x) sample(elements, k))))
# below, check you have all the permutations you need (if not, try again)
nrow(res) == factorial(k)
res

Basically, you draw many random samples, hoping to hit every permutation, and then keep only the unique ones.
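
Since a fixed number of draws may miss some permutations, one variation is to keep sampling until all factorial(k) rows have been seen. A minimal sketch, assuming the elements are distinct (otherwise the loop would never reach factorial(k) rows); the batch size of 50 and the seed are arbitrary choices made here:

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
elements <- c("a", "b", "c")
k <- length(elements)

# Start with an empty k-column matrix and add batches of random permutations
# until every one of the factorial(k) distinct rows has appeared.
res <- matrix(character(0), nrow = 0, ncol = k)
while (nrow(res) < factorial(k)) {
  batch <- t(replicate(50, sample(elements, k)))  # 50 random permutations, one per row
  res <- unique(rbind(res, batch))
}
res
```

The row order depends on the random draws, so sort the result if a stable order matters.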

【Comments】:

【Answer 9】:

We can use the base function combn with a slight modification:

   # One permutation per column. Note: the result has 2 * length(x) columns,
   # so it covers all permutations only when length(x) == 3.
   combn_n <- function(x) {
      m <- length(x) - 1 # number of elements to choose: n-1
      xr <- rev(x) # reversed x
      part_1 <- rbind(combn(x, m), xr, deparse.level = 0)
      part_2 <- rbind(combn(xr, m), x, deparse.level = 0)
      cbind(part_1, part_2)
   }
   combn_n(letters[1:3])

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "a"  "a"  "b"  "c"  "c"  "b"
[2,] "b"  "c"  "c"  "b"  "a"  "a"
[3,] "c"  "b"  "a"  "a"  "b"  "c"

【Comments】:

【Answer 10】:

In case it helps, there is the arrangements package, which lets you simply do:

> library(arrangements)
> abc  = letters[1:3]

> permutations(abc)
     [,1] [,2] [,3]
[1,] "a"  "b"  "c" 
[2,] "a"  "c"  "b" 
[3,] "b"  "a"  "c" 
[4,] "b"  "c"  "a" 
[5,] "c"  "a"  "b" 
[6,] "c"  "b"  "a" 

【Comments】:

【Answer 11】:

A generalized version of rnso's answer is:

get_perms <- function(x){
  stopifnot(is.atomic(x)) # for the matrix call to make sense
  out <- as.matrix(expand.grid(
    replicate(length(x), x, simplify = FALSE), stringsAsFactors = FALSE))
  out[apply(out,1, anyDuplicated) == 0, ]
}

Here are two examples:

get_perms(letters[1:3])
#R>      Var1 Var2 Var3
#R> [1,] "c"  "b"  "a" 
#R> [2,] "b"  "c"  "a" 
#R> [3,] "c"  "a"  "b" 
#R> [4,] "a"  "c"  "b" 
#R> [5,] "b"  "a"  "c" 
#R> [6,] "a"  "b"  "c" 
get_perms(letters[1:4])
#R>       Var1 Var2 Var3 Var4
#R>  [1,] "d"  "c"  "b"  "a" 
#R>  [2,] "c"  "d"  "b"  "a" 
#R>  [3,] "d"  "b"  "c"  "a" 
#R>  [4,] "b"  "d"  "c"  "a" 
#R>  [5,] "c"  "b"  "d"  "a" 
#R>  [6,] "b"  "c"  "d"  "a" 
#R>  [7,] "d"  "c"  "a"  "b" 
#R>  [8,] "c"  "d"  "a"  "b" 
#R>  [9,] "d"  "a"  "c"  "b" 
#R> [10,] "a"  "d"  "c"  "b" 
#R> [11,] "c"  "a"  "d"  "b" 
#R> [12,] "a"  "c"  "d"  "b" 
#R> [13,] "d"  "b"  "a"  "c" 
#R> [14,] "b"  "d"  "a"  "c" 
#R> [15,] "d"  "a"  "b"  "c" 
#R> [16,] "a"  "d"  "b"  "c" 
#R> [17,] "b"  "a"  "d"  "c" 
#R> [18,] "a"  "b"  "d"  "c" 
#R> [19,] "c"  "b"  "a"  "d" 
#R> [20,] "b"  "c"  "a"  "d" 
#R> [21,] "c"  "a"  "b"  "d" 
#R> [22,] "a"  "c"  "b"  "d" 
#R> [23,] "b"  "a"  "c"  "d" 
#R> [24,] "a"  "b"  "c"  "d" 

One can also slightly alter Rick's answer to use lapply, do only a single rbind, and reduce the number of [s]/[l]apply calls:

permutations <- function(x, prefix = c()){
  if(length(x) == 1) # was zero before
    return(list(c(prefix, x)))
  out <- do.call(c, lapply(1:length(x), function(idx) 
    permutations(x[-idx], c(prefix, x[idx]))))
  if(length(prefix) > 0L)
    return(out)
  
  do.call(rbind, out)
}

【Comments】:

【Answer 12】:

Behold, a purrr solution:

> library(purrr)
> map(1:3, ~ c('a', 'b', 'c')) %>%
    cross() %>%
    keep(~ length(unique(.x)) == 3) %>%
    map(unlist)
#> [[1]]
#> [1] "c" "b" "a"
#> 
#> [[2]]
#> [1] "b" "c" "a"
#> 
#> [[3]]
#> [1] "c" "a" "b"
#> 
#> [[4]]
#> [1] "a" "c" "b"
#> 
#> [[5]]
#> [1] "b" "a" "c"
#> 
#> [[6]]
#> [1] "a" "b" "c"

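【Comments】:

【Answer 13】:

How about

pmsa <- function(l) {
  # pms(n): list of all permutations of 1:n, built by inserting n at every position
  pms <- function(n) if(n==1) return(list(1)) else unlist(lapply(pms(n-1),function(v) lapply(0:(n-1),function(k) append(v,n,k))),recursive = FALSE)
  lapply(pms(length(l)),function(.) l[.])
}

This returns a list. Then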

pmsa(letters[1:3])

【Comments】:
