How to generate permutations or combinations of objects in R?

Posted 2023-02-15

Asked: 2014-04-29 10:26:45

Question:

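How do I generate sequences of r objects from n objects? I am looking for a way to produce either permutations or combinations, with or without replacement, for distinct and non-distinct items (aka multisets).

This is related to the twelvefold way. The "distinct" cases are covered by the twelvefold way, while the "non-distinct" cases are not.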

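Comments:

Arguably, there are twelve questions of this type.

Yes, it is a really useful way to organize and think about all these different combinatorial objects. FYI, most of the first-page Google hits for "Twelvefold Way" include more readable tables and clearer explanations than the page I linked.

Thanks for the information. I think what I am missing are the surjective cases. Right..? [Update]: that seems to be wrong.

You are right, that was wrong ;) The characteristics on which the twelvefold classification is based are +/- different from the ones you chose. For me, the best way to think about it so far is to look at n balls being placed into urns. There are three possible restrictions on how they can be placed (no restriction, must be injective, or must be surjective), and 4 possible combinations of labeled/unlabeled balls and urns. Here and here are 2 sources that view the problem through that lens.

I finally figured out the difference between the 8 questions here and the twelvefold way. Four of the questions here belong to the twelvefold way (the "distinct" ones), while the "non-distinct" ones do not.

Answer 1: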

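A Walk Through a Slice of Combinatorics in R*

Below, we examine packages equipped with the capability of generating combinations and permutations. If I have left out any package, please forgive me and leave a comment or, better yet, edit this post.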

Outline of analysis:

1. Introduction
2. Combinations
3. Permutations
4. Multisets
5. Summary
6. Memory

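Before we begin, we note that combinations/permutations with replacement chosen m at a time are equivalent for distinct and non-distinct items. This is so because, with replacement, the number of copies of an element in the source does not matter. No matter how many times a particular element originally occurs, that element can appear anywhere from 1 to m times in a result.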
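To make the equivalence concrete, here is a minimal base R sketch. The helper comb_rep is hypothetical, written only for this illustration, and is not part of any package reviewed below. It enumerates combinations with replacement by brute force, first for a source with distinct items and then for a source in which one item is duplicated.

```r
# Hypothetical helper: combinations with replacement, m at a time, by brute force.
# Only suitable for tiny inputs; the packages reviewed below are far faster.
comb_rep <- function(v, m) {
  # every ordered m-tuple drawn from v (duplicates in v included)
  tuples <- as.matrix(expand.grid(rep(list(v), m), stringsAsFactors = FALSE))
  # sort within each tuple so that order is ignored
  sorted <- matrix(apply(tuples, 1, sort), ncol = m, byrow = TRUE)
  # drop repeated rows, then put the rows in lexicographic order
  out <- unique(sorted)
  out <- out[do.call(order, as.data.frame(out)), , drop = FALSE]
  dimnames(out) <- NULL
  out
}

distinct    <- comb_rep(c("a", "b"), 2)
nonDistinct <- comb_rep(c("a", "a", "b"), 2)   # "a" occurs twice in the source
identical(distinct, nonDistinct)
```

Both calls should yield the same three rows (a a, a b, b b), which is why the with-replacement cases below are only benchmarked once.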

1. Introduction

gtools

combinat

multicool

partitions

RcppAlgos

arrangements

gRbase

I have not included permute, permutations, or gRbase::aperm/ar_perm, as they are not really meant to attack these types of problems.

|--------------------------------------- OVERVIEW ----------------------------------------|

|_______________| gtools | combinat | multicool | partitions | 
|      comb rep |  Yes   |          |           |            | 
|   comb NO rep |  Yes   |   Yes    |           |            | 
|      perm rep |  Yes   |          |           |            |  
|   perm NO rep |  Yes   |   Yes    |    Yes    |    Yes     |
| perm multiset |        |          |    Yes    |            |  
| comb multiset |        |          |           |            |  
|accepts factors|        |   Yes    |           |            |  
|   m at a time |  Yes   |  Yes/No  |           |            |  
|general vector |  Yes   |   Yes    |    Yes    |            |
|    iterable   |        |          |    Yes    |            |
|parallelizable |        |          |           |            |
|  big integer  |        |          |           |            |

|_______________| iterpc | arrangements | RcppAlgos | gRbase |
|      comb rep |  Yes   |     Yes      |    Yes    |        |
|   comb NO rep |  Yes   |     Yes      |    Yes    |  Yes   |   
|      perm rep |  Yes   |     Yes      |    Yes    |        |
|   perm NO rep |  Yes   |     Yes      |    Yes    |   *    |
| perm multiset |  Yes   |     Yes      |    Yes    |        |
| comb multiset |  Yes   |     Yes      |    Yes    |        |
|accepts factors|        |     Yes      |    Yes    |        |
|   m at a time |  Yes   |     Yes      |    Yes    |  Yes   |
|general vector |  Yes   |     Yes      |    Yes    |  Yes   |
|    iterable   |        |     Yes      | Partially |        |
|parallelizable |        |     Yes      |    Yes    |        |
|  big integer  |        |     Yes      |           |        |

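The rows m at a time and general vector refer to the capability of generating results "m at a time" (when m is less than the length of the vector) and of rearranging a "general vector" as opposed to 1:n. In practice, we are generally concerned with rearrangements of a general vector, so all examinations below reflect this (when possible).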
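As a small illustration of the "general vector" idea, the following base R sketch shows that a package which only handles 1:n can still serve a general vector, because the results for 1:n can be used as indices into it. The vector v is arbitrary example data chosen for this sketch.

```r
v <- c(2L, 3L, 5L, 7L, 11L)   # arbitrary example vector (assumption for illustration)
m <- 3

# combinations of the indices 1:n, one combination per column
idx <- utils::combn(length(v), m)

# map the indices back onto v; v[idx] is read column by column
byIndex <- matrix(v[idx], nrow = m)

# combn applied to the general vector directly
direct <- utils::combn(v, m)

all(byIndex == direct)
```

Packages that accept a general vector simply save this extra indexing step.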

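All benchmarks were run on 3 different set-ups.

1. Macbook Pro i7 16Gb

2. Macbook Air i5 4Gb

3. Lenovo running Windows 7 i5 8Gb

The listed results were obtained from set-up #1 (i.e. the MBPro). The results for the other two systems were similar. Also, gc() is called periodically to ensure all memory is available (see ?gc).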

2. Combinations

First, we examine combinations without replacement chosen m at a time.

RcppAlgos

combinat

utils

gtools

arrangements

gRbase

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(13)
testVector1 <- sort(sample(100, 17))
m <- 9
t1 <- comboGeneral(testVector1, m)  ## returns matrix with m columns
t3 <- combinat::combn(testVector1, m)  ## returns matrix with m rows
t4 <- gtools::combinations(17, m, testVector1)  ## returns matrix with m columns
identical(t(t3), t4) ## must transpose to compare
#> [1] TRUE
t5 <- combinations(testVector1, m)
identical(t1, t5)
#> [1] TRUE
t6 <- gRbase::combnPrim(testVector1, m)
identical(t(t6)[do.call(order, as.data.frame(t(t6))),], t1)
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = comboGeneral(testVector1, m),
               cbGRbase = gRbase::combnPrim(testVector1, m),
               cbGtools = gtools::combinations(17, m, testVector1),
               cbCombinat = combinat::combn(testVector1, m),
               cbArrangements = combinations(x = testVector1, k = m),
               unit = "relative")
#> Unit: relative
#>            expr     min      lq    mean  median      uq    max neval
#>     cbRcppAlgos   1.064   1.079   1.160   1.012   1.086  2.318   100
#>        cbGRbase   7.335   7.509   5.728   6.807   5.390  1.608   100
#>        cbGtools 426.536 408.807 240.101 310.848 187.034 63.663   100
#>      cbCombinat  97.756  97.586  60.406  75.415  46.391 41.089   100
#>  cbArrangements   1.000   1.000   1.000   1.000   1.000  1.000   100

Now, we examine combinations with replacement chosen m at a time.

RcppAlgos

gtools

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(97)
testVector2 <- sort(rnorm(10))
m <- 8
t1 <- comboGeneral(testVector2, m, repetition = TRUE)
t3 <- gtools::combinations(10, m, testVector2, repeats.allowed = TRUE)
identical(t1, t3)
#> [1] TRUE
## arrangements
t4 <- combinations(testVector2, m, replace = TRUE)
identical(t1, t4)
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = comboGeneral(testVector2, m, TRUE),
               cbGtools = gtools::combinations(10, m, testVector2, repeats.allowed = TRUE),
               cbArrangements = combinations(testVector2, m, replace = TRUE),
               unit = "relative")
#> Unit: relative
#>            expr     min      lq   mean  median      uq     max neval
#>     cbRcppAlgos   1.000   1.000  1.000   1.000   1.000 1.00000   100
#>        cbGtools 384.990 269.683 80.027 112.170 102.432 3.67517   100
#>  cbArrangements   1.057   1.116  0.618   1.052   1.002 0.03638   100

3. Permutations

First, we examine permutations without replacement chosen m at a time.

RcppAlgos

gtools

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(101)
testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))

## RcppAlgos... permuteGeneral same as comboGeneral above
t1 <- permuteGeneral(testVector3, 6)
## gtools... permutations same as combinations above
t3 <- gtools::permutations(10, 6, testVector3)
identical(t1, t3)
#> [1] TRUE
## arrangements
t4 <- permutations(testVector3, 6)
identical(t1, t4)
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector3, 6),
               cbGtools = gtools::permutations(10, 6, testVector3),
               cbArrangements = permutations(testVector3, 6),
               unit = "relative")
#> Unit: relative
#>            expr     min     lq   mean median     uq   max neval
#>     cbRcppAlgos   1.079  1.027  1.106  1.037  1.003  5.37   100
#>        cbGtools 158.720 92.261 85.160 91.856 80.872 45.39   100
#>  cbArrangements   1.000  1.000  1.000  1.000  1.000  1.00   100

Next, we examine permutations without replacement of a general vector (returning all permutations).

RcppAlgos

gtools

combinat

multicool

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(89)
testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))
testVector3Prime <- testVector3[1:7]
## For RcppAlgos, & gtools (see above)

## combinat
t4 <- combinat::permn(testVector3Prime) ## returns a list of vectors
## convert to a matrix
t4 <- do.call(rbind, t4)
## multicool.. we must first call initMC
t5 <- multicool::allPerm(multicool::initMC(testVector3Prime)) ## returns a matrix with n columns
all.equal(t4[do.call(order,as.data.frame(t4)),],
          t5[do.call(order,as.data.frame(t5)),])
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector3Prime, 7),
               cbGtools = gtools::permutations(7, 7, testVector3Prime),
               cbCombinat = combinat::permn(testVector3Prime),
               cbMulticool = multicool::allPerm(multicool::initMC(testVector3Prime)),
               cbArrangements = permutations(x = testVector3Prime, k = 7),
               unit = "relative")
#> Unit: relative
#>            expr      min       lq     mean   median       uq     max neval
#>     cbRcppAlgos    1.152    1.275   0.7508    1.348    1.342  0.3159   100
#>        cbGtools  965.465  817.645 340.4159  818.137  661.068 12.7042   100
#>      cbCombinat  280.207  236.853 104.4777  238.228  208.467  9.6550   100
#>     cbMulticool 2573.001 2109.246 851.3575 2039.531 1638.500 28.3597   100
#>  cbArrangements    1.000    1.000   1.0000    1.000    1.000  1.0000   100

Now, we examine permutations without replacement of 1:n (returning all permutations).

RcppAlgos

gtools

combinat

multicool

partitions

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(89)
t1 <- partitions::perms(7)  ## returns an object of type 'partition' with n rows
identical(t(as.matrix(t1)), permutations(7,7))
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(7, 7),
               cbGtools = gtools::permutations(7, 7),
               cbCombinat = combinat::permn(7),
               cbMulticool = multicool::allPerm(multicool::initMC(1:7)),
               cbPartitions = partitions::perms(7),
               cbArrangements = permutations(7, 7),
               unit = "relative")
#> Unit: relative
#>            expr      min       lq     mean   median       uq      max
#>     cbRcppAlgos    1.235    1.429    1.412    1.503    1.484    1.720
#>        cbGtools 1152.826 1000.736  812.620  939.565  793.373  499.029
#>      cbCombinat  347.446  304.866  260.294  296.521  248.343  284.001
#>     cbMulticool 3001.517 2416.716 1903.903 2237.362 1811.006 1311.219
#>    cbPartitions    2.469    2.536    2.801    2.692    2.999    2.472
#>  cbArrangements    1.000    1.000    1.000    1.000    1.000    1.000
#>  neval
#>    100
#>    100
#>    100
#>    100
#>    100
#>    100

Lastly, we examine permutations with replacement.

RcppAlgos

iterpc

gtools

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(34)
testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))
t1 <- permuteGeneral(testVector3, 5, repetition = TRUE)
t3 <- gtools::permutations(10, 5, testVector3, repeats.allowed = TRUE)
t4 <- permutations(x = testVector3, k = 5, replace = TRUE)

The next benchmark is a little surprising given the results so far.

microbenchmark(cbRcppAlgos = permuteGeneral(testVector3, 5, TRUE),
               cbGtools = gtools::permutations(10, 5, testVector3, repeats.allowed = TRUE),
               cbArrangements = permutations(x = testVector3, k = 5, replace = TRUE),
               unit = "relative")
#> Unit: relative
#>            expr   min     lq  mean median    uq   max neval
#>     cbRcppAlgos 1.106 0.9183 1.200  1.030 1.063 1.701   100
#>        cbGtools 2.426 2.1815 2.068  1.996 2.127 1.367   100
#>  cbArrangements 1.000 1.0000 1.000  1.000 1.000 1.000   100

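That is not a typo... gtools::permutations is almost as fast as the compiled functions. I encourage the reader to check out the source code of gtools::permutations, as it is one of the most elegant displays of programming out there (in R or otherwise).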
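One plausible reason interpreted R code can keep up here is that permutations with replacement have a very regular structure that vectorised calls can fill in column by column. The sketch below illustrates that idea in base R. It is not the gtools implementation, and perm_rep is a hypothetical name used only for this example.

```r
# Hypothetical helper: permutations with replacement of v, m at a time.
# Not the gtools algorithm, just a vectorised base R sketch.
perm_rep <- function(v, m) {
  n <- length(v)
  out <- matrix(v[1], nrow = n^m, ncol = m)
  for (j in seq_len(m)) {
    # column j repeats each value n^(m - j) times and recycles to n^m rows
    out[, j] <- rep(v, each = n^(m - j), length.out = n^m)
  }
  out
}

res <- perm_rep(c(2L, 3L, 5L), 2)
dim(res)   # 3^2 = 9 rows, 2 columns
```

The loop runs only m times and each iteration is a single call to rep, so almost all of the work happens in compiled code even though the function itself is plain R.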

4. Multisets

First, we examine combinations of multisets.

RcppAlgos

arrangements

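To find combinations/permutations of multisets with RcppAlgos, use the freqs argument to specify how many times each element of the source vector, v, is repeated.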

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(496)
myFreqs <- sample(1:5, 10, replace = TRUE)
## This is how many times each element will be repeated
myFreqs
#>  [1] 2 4 4 5 3 2 2 2 3 4
testVector4 <- as.integer(c(1, 2, 3, 5, 8, 13, 21, 34, 55, 89))
t1 <- comboGeneral(testVector4, 12, freqs = myFreqs)
t3 <- combinations(freq = myFreqs, k = 12, x = testVector4)
identical(t1, t3)
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = comboGeneral(testVector4, 12, freqs = myFreqs),
               cbArrangements = combinations(freq = myFreqs, k = 12, x = testVector4),
               unit = "relative")
#> Unit: relative
#>            expr   min    lq  mean median    uq   max neval
#>     cbRcppAlgos 1.000 1.000 1.000  1.000 1.000 1.000   100
#>  cbArrangements 1.254 1.221 1.287  1.259 1.413 1.173   100
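For readers who want to see what a freqs-style argument means without installing anything, here is a naive base R sketch. The helper multiset_comb is hypothetical and written only for this illustration. It expands the multiset explicitly, generates ordinary combinations, and drops the duplicates, which is far less efficient than the dedicated generators benchmarked above.

```r
# Hypothetical helper: combinations of a multiset, m at a time (brute force).
multiset_comb <- function(v, freqs, m) {
  pool <- rep(v, times = freqs)            # repeat each element freqs[i] times
  idx  <- utils::combn(length(pool), m)    # index combinations, one per column
  cmb  <- matrix(pool[idx], ncol = m, byrow = TRUE)
  cmb  <- unique(cmb)                      # identical rows arise from repeated items
  cmb[do.call(order, as.data.frame(cmb)), , drop = FALSE]
}

res <- multiset_comb(c("a", "b", "c"), freqs = c(2, 1, 1), m = 2)
res
```

With "a" available twice, the rows should be a a, a b, a c and b c. The brute-force route touches choose(sum(freqs), m) candidates, so it is only practical for small inputs.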

For permutations of multisets chosen m at a time, we have:

RcppAlgos

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(8128)
myFreqs <- sample(1:3, 5, replace = TRUE)
testVector5 <- sort(runif(5))
myFreqs
#> [1] 2 2 2 1 3
t1 <- permuteGeneral(testVector5, 7, freqs = myFreqs)
t3 <- permutations(freq = myFreqs, k = 7, x = testVector5)
identical(t1, t3)
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector5, 7, freqs = myFreqs),
               cbArrangements = permutations(freq = myFreqs, k = 7, x = testVector5),
               unit = "relative")
#> Unit: relative
#>            expr   min    lq  mean median    uq   max neval
#>     cbRcppAlgos 1.461 1.327 1.282  1.177 1.176 1.101   100
#>  cbArrangements 1.000 1.000 1.000  1.000 1.000 1.000   100

For permutations of multisets returning all permutations, we have:

RcppAlgos

multicool

arrangements

How to:

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(8128)
myFreqs2 <- c(2,1,2,1,2)
testVector6 <- (1:5)^3
## For multicool, you must have the elements explicitly repeated
testVector6Prime <- rep(testVector6, times = myFreqs2)
t3 <- multicool::allPerm(multicool::initMC(testVector6Prime))

## for comparison
t1 <- permuteGeneral(testVector6, freqs = myFreqs2)
identical(t1[do.call(order,as.data.frame(t1)),],
          t3[do.call(order,as.data.frame(t3)),])
#> [1] TRUE

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector6, freqs = myFreqs2),
               cbMulticool = multicool::allPerm(multicool::initMC(testVector6Prime)),
               cbArrangements = permutations(freq = myFreqs2, x = testVector6),
               unit = "relative")
#> Unit: relative
#>            expr      min       lq    mean   median      uq     max neval
#>     cbRcppAlgos    1.276    1.374   1.119    1.461    1.39  0.8856   100
#>     cbMulticool 2434.652 2135.862 855.946 2026.256 1521.74 31.0651   100
#>  cbArrangements    1.000    1.000   1.000    1.000    1.00  1.0000   100

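5. Summary

Both gtools and combinat are well-established packages for rearranging the elements of a vector. gtools offers a few more options (see the overview above), while combinat can rearrange factors. multicool can rearrange multisets. Although partitions and gRbase are limited for the purposes of this question, they are powerhouses packed with highly efficient functions for dealing with partitions and array objects, respectively.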

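`arrangements`

The layout argument sets the output format (r = row-major, c = column-major, l = list).

Iterators offer the collect and getnext methods.

More than 2^31 - 1 results can be reached through getnext. RcppAlgos (via lower/upper) and multicool (via nextPerm) can do this as well.

getnext returns a specific number of results through its d argument.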

Observe:

library(arrangements)
icomb <- icombinations(1000, 7)
icomb$getnext(d = 5)
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,]    1    2    3    4    5    6    7
#> [2,]    1    2    3    4    5    6    8
#> [3,]    1    2    3    4    5    6    9
#> [4,]    1    2    3    4    5    6   10
#> [5,]    1    2    3    4    5    6   11

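This feature is really nice when you only want a few combinations/permutations. With traditional methods, you would have to generate all combinations/permutations and then subset. That would make the previous example impossible, as there are more than 10^17 results (i.e. ncombinations(1000, 7, bigz = TRUE) = 194280608456793000).

This feature, along with the improvements to the generators in arrangements, makes it very efficient with respect to memory.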
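The idea behind such an iterator can be sketched in base R: given the current combination, compute the next one in lexicographic order without storing anything else. This is a generic textbook successor rule, not the code used inside arrangements, and next_comb and first_n_combs are hypothetical names for this illustration.

```r
# Hypothetical helper: successor of a combination of 1:n in lexicographic order.
next_comb <- function(cur, n) {
  k <- length(cur)
  i <- k
  # find the right-most position that can still be incremented
  while (i >= 1 && cur[i] == n - k + i) i <- i - 1
  if (i == 0) return(NULL)            # cur was the last combination
  cur[i] <- cur[i] + 1L
  # reset everything to the right to the smallest admissible values
  if (i < k) cur[(i + 1):k] <- cur[i] + seq_len(k - i)
  cur
}

# Hypothetical helper: collect only the first d combinations.
first_n_combs <- function(n, k, d) {
  out <- matrix(NA_integer_, nrow = d, ncol = k)
  cur <- seq_len(k)
  for (r in seq_len(d)) {
    out[r, ] <- cur
    cur <- next_comb(cur, n)
    if (is.null(cur)) break           # ran out of combinations early
  }
  out[seq_len(r), , drop = FALSE]
}

firstFive <- first_n_combs(1000, 7, 5)
firstFive
```

Only a vector of length k is kept between steps, so memory use does not depend on the total number of combinations.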

`RcppAlgos`

The upper argument (formerly rowCap) limits the number of results, analogous to the d argument of getnext.

Observe:

library(RcppAlgos)
comboGeneral(1000, 7, upper = 5)
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,]    1    2    3    4    5    6    7
#> [2,]    1    2    3    4    5    6    8
#> [3,]    1    2    3    4    5    6    9
#> [4,]    1    2    3    4    5    6   10
#> [5,]    1    2    3    4    5    6   11

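As of version 2.0.0, the lower argument sets the point at which generation starts, so results beyond 2^31 - 1 can be generated in independent chunks.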

Parallel example with more than 6 billion combinations:

system.time(parallel::mclapply(seq(1, 6397478649, 4390857), function(x) {
        a <- comboGeneral(25, 15, freqs = c(rep(1:5, 5)), lower = x, upper = x + 4390856)
        ## do something
        x
    }, mc.cores = 7))
#>     user  system elapsed 
#>  510.623 140.970 109.496
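Starting generation at an arbitrary position, as the lower argument does in the call above, relies on being able to compute the result at a given index directly. The base R sketch below shows one standard way to do this for plain combinations of 1:n, by counting how many combinations begin with each candidate value. It is an illustration under that assumption, not the RcppAlgos implementation (which also handles multisets), and nth_comb is a hypothetical name.

```r
# Hypothetical helper: the idx-th combination of 1:n taken k at a time,
# in lexicographic order, computed without generating its predecessors.
nth_comb <- function(n, k, idx) {
  out  <- integer(k)
  rank <- idx - 1          # zero-based rank still to be skipped
  x    <- 1L               # smallest value allowed at the current position
  for (i in seq_len(k)) {
    repeat {
      # number of combinations that have x at position i
      cnt <- choose(n - x, k - i)
      if (rank < cnt) break
      rank <- rank - cnt
      x <- x + 1L
    }
    out[i] <- x
    x <- x + 1L
  }
  out
}

nth_comb(1000, 7, 5)
```

For index 5 this should reproduce the fifth row shown earlier (1 2 3 4 5 6 11). Because each chunk can compute its own starting point this way, the chunks can be generated independently, which is what makes the parallel call possible.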

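In case you were wondering how each package scales, I will leave you with this final example, which measures how fast each package can generate over 100 million results (N.B. gtools::combinations is left out here because it throws an error). Also, we explicitly call combn from the utils package because I was unable to get a successful run from combinat::combn. The difference in memory usage between these two is quite bizarre given that they are only marginally different (see ?utils::combn under the "Authors" section).

Observe: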

library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(2187)
testVector7 <- sort(sample(10^7, 10^3))
system.time(utils::combn(testVector7, 3))
#>    user  system elapsed 
#> 179.956   5.687 187.159
system.time(RcppAlgos::comboGeneral(testVector7, 3))
#>    user  system elapsed 
#>   1.136   0.758   1.937
system.time(arrangements::combinations(x = testVector7, k = 3))
#>    user  system elapsed 
#>   1.963   0.930   2.910
system.time(RcppAlgos::permuteGeneral(testVector7[1:500], 3))
#>    user  system elapsed 
#>   1.095   0.631   1.738
system.time(arrangements::permutations(x = testVector7[1:500], k = 3))
#>    user  system elapsed 
#>   1.399   0.584   1.993

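6. Memory

When executing comboGeneral as well as arrangements::combinations, memory jumps by almost 2 Gbs before gc is called. This seems about right, as #rows * #cols * bytesPerCell / 2^30 bytes = choose(1000,3) * 3 * 4 / 2^30 bytes = (166167000 * 3 * 4)/2^30 = 1.857 Gbs. However, when executing combn, the memory behavior was erratic (e.g. sometimes it would use all 16 Gb of memory and other times it would only spike a couple of Gbs). When I tested this on the Windows set-up, it would often crash.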
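The estimate above can be reproduced with a few lines of base R. The only assumption is the 4 bytes per cell, which holds because the test vector is an integer vector.

```r
nRows        <- choose(1000, 3)   # number of combinations: 166167000
nCols        <- 3                 # m = 3 elements per combination
bytesPerCell <- 4                 # R stores each integer in 4 bytes

gbs <- nRows * nCols * bytesPerCell / 2^30
round(gbs, 3)                     # 1.857
```

A double matrix of the same shape would need 8 bytes per cell and therefore twice the memory.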

We can confirm this using Rprof along with summaryRprof. Observe:

Rprof("RcppAlgos.out", memory.profiling = TRUE)
t1 <- RcppAlgos::comboGeneral(testVector7, 3)
Rprof(NULL)
summaryRprof("RcppAlgos.out", memory = "both")$by.total
                          total.time total.pct mem.total self.time self.pct
"CombinatoricsRcpp"              1.2       100    1901.6       1.2      100
"RcppAlgos::comboGeneral"        1.2       100    1901.6       0.0        0

Rprof("arrangements.out", memory.profiling = TRUE)
t3 <- arrangements::combinations(x = testVector7, k = 3)
Rprof(NULL)
summaryRprof("arrangements.out", memory = "both")$by.total
                             total.time total.pct mem.total self.time self.pct
".Call"                            2.08     99.05    1901.6      2.08    99.05

With both RcppAlgos and arrangements, mem.total registers just over 1900 Mb.

Here is the memory profile on a smaller vector, comparing gtools, utils, and combinat.

testVector7Prime <- testVector7[1:300]

Rprof("combinat.out", memory.profiling = TRUE)
t3 <- combinat::combn(testVector7Prime, 3)
Rprof(NULL)
summaryRprof("combinat.out", memory = "both")$by.total
                  total.time total.pct mem.total self.time self.pct
"combinat::combn"       3.98    100.00    1226.9      3.72    93.47

Rprof("utils.out", memory.profiling = TRUE)
t4 <- utils::combn(testVector7Prime, 3)
Rprof(NULL)
summaryRprof("utils.out", memory = "both")$by.total
               total.time total.pct mem.total self.time self.pct
"utils::combn"       2.52    100.00    1952.7      2.50    99.21

Rprof("gtools.out", memory.profiling = TRUE)
t5 <- gtools::combinations(300, 3, testVector7Prime)
Rprof(NULL)
summaryRprof("gtools.out", memory = "both")$by.total
                      total.time total.pct mem.total self.time self.pct
"rbind"                     4.94     95.00    6741.6      4.40    84.62

Interestingly, utils::combn and combinat::combn use different amounts of memory and take different amounts of time to execute. This does not hold for smaller vectors:

microbenchmark(combinat::combn(2:13, 6), utils::combn(2:13, 6))
Unit: microseconds
                    expr     min      lq     mean  median       uq      max neval
combinat::combn(2:13, 6) 527.378 567.946 629.1268 577.163 604.3270 1816.744   100
   utils::combn(2:13, 6) 663.150 712.872 750.8008 725.716 771.1345 1205.697   100

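With gtools, the total memory used is a little over 3x that of utils. It should be noted that for these 3 packages, I obtained different results every time I ran them (e.g. for combinat::combn I would sometimes get 9000 Mb and then 13000 Mb).

Still, none of them can match RcppAlgos or arrangements. Both use only 51 Mb when run on the example above.

Benchmark script: https://gist.github.com/randy3k/bd5730a6d70101c7471f4ae6f453862e (rendered by https://github.com/tidyverse/reprex)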

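*: An homage to A Walk through Combinatorics by Miklós Bóna

Comments:

Excellent review! I think I understand why, in some cases, iterpc does not perform as efficiently as RcppAlgos, due to the nature of the generator. iterpc needs to initialize a generator object before performing the actual algorithm. I am actually refactoring iterpc into a new package and, paradoxically, I am trying to get rid of Rcpp and use only the R C API. Again, excellent package, RcppAlgos!

@RandyLai, thanks for the kind words. I am glad this review can help in some way. I have heard the C API in R can be tricky, to say the least. I wish you the best with your goals.

@JosephWood I have a problem with permutations. I wonder whether the permuteGeneral() function can be applied to a list within a list to calculate all possible permutations. I.e. expand.grid(1:10,1:100,1:5) works with vectors of different lengths. It also works with lists. Say I have a list mylist = list(list(c(1,2,3,3,4),c(10,20,30,30,40,40,40,55)),list(c(2,4,6,6),1:10,1:50)); using sapply(mylist,expand.grid) gives the expected result. I wonder whether this can be done with permuteGeneral(), since expand.grid() takes a lot of time in higher dimensions.

@maydin, expand.grid and permuteGeneral do two different things. The former gives the Cartesian product and the latter gives pure permutations. I have flirted with implementing an analog of the Cartesian product for permuteGeneral, but I have hit many roadblocks. It is on my list though!!

I am flabbergasted! What a thorough exploration of the topic! Thanks!

Answer 2: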

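EDIT: I have updated the answer to use a more efficient package, arrangements.

Getting started with `arrangements`

arrangements contains some efficient generators and iterators for permutations and combinations. It has been demonstrated that arrangements outperforms most existing packages of a similar kind. Some benchmarks can be found here.

Here are the answers to the questions above: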

# 1) combinations: without replacement: distinct items

combinations(5, 2)

      [,1] [,2]
 [1,]    1    2
 [2,]    1    3
 [3,]    1    4
 [4,]    1    5
 [5,]    2    3
 [6,]    2    4
 [7,]    2    5
 [8,]    3    4
 [9,]    3    5
[10,]    4    5


# 2) combinations: with replacement: distinct items

combinations(5, 2, replace=TRUE)

      [,1] [,2]
 [1,]    1    1
 [2,]    1    2
 [3,]    1    3
 [4,]    1    4
 [5,]    1    5
 [6,]    2    2
 [7,]    2    3
 [8,]    2    4
 [9,]    2    5
[10,]    3    3
[11,]    3    4
[12,]    3    5
[13,]    4    4
[14,]    4    5
[15,]    5    5



# 3) combinations: without replacement: non distinct items

combinations(x = c("a", "b", "c"), freq = c(2, 1, 1), k = 2)

     [,1] [,2]
[1,] "a"  "a" 
[2,] "a"  "b" 
[3,] "a"  "c" 
[4,] "b"  "c" 



# 4) combinations: with replacement: non distinct items

combinations(x = c("a", "b", "c"), k = 2, replace = TRUE)  # as `freq` does not matter

     [,1] [,2]
[1,] "a"  "a" 
[2,] "a"  "b" 
[3,] "a"  "c" 
[4,] "b"  "b" 
[5,] "b"  "c" 
[6,] "c"  "c" 

# 5) permutations: without replacement: distinct items

permutations(5, 2)

      [,1] [,2]
 [1,]    1    2
 [2,]    1    3
 [3,]    1    4
 [4,]    1    5
 [5,]    2    1
 [6,]    2    3
 [7,]    2    4
 [8,]    2    5
 [9,]    3    1
[10,]    3    2
[11,]    3    4
[12,]    3    5
[13,]    4    1
[14,]    4    2
[15,]    4    3
[16,]    4    5
[17,]    5    1
[18,]    5    2
[19,]    5    3
[20,]    5    4



# 6) permutations: with replacement: distinct items

permutations(5, 2, replace = TRUE)

      [,1] [,2]
 [1,]    1    1
 [2,]    1    2
 [3,]    1    3
 [4,]    1    4
 [5,]    1    5
 [6,]    2    1
 [7,]    2    2
 [8,]    2    3
 [9,]    2    4
[10,]    2    5
[11,]    3    1
[12,]    3    2
[13,]    3    3
[14,]    3    4
[15,]    3    5
[16,]    4    1
[17,]    4    2
[18,]    4    3
[19,]    4    4
[20,]    4    5
[21,]    5    1
[22,]    5    2
[23,]    5    3
[24,]    5    4
[25,]    5    5


# 7) permutations: without replacement: non distinct items

permutations(x = c("a", "b", "c"), freq = c(2, 1, 1), k = 2)

     [,1] [,2]
[1,] "a"  "a" 
[2,] "a"  "b" 
[3,] "a"  "c" 
[4,] "b"  "a" 
[5,] "b"  "c" 
[6,] "c"  "a" 
[7,] "c"  "b" 



# 8) permutations: with replacement: non distinct items

permutations(x = c("a", "b", "c"), k = 2, replace = TRUE)  # as `freq` doesn't matter

      [,1] [,2]
 [1,] "a"  "a" 
 [2,] "a"  "b" 
 [3,] "a"  "c" 
 [4,] "b"  "a" 
 [5,] "b"  "b" 
 [6,] "b"  "c" 
 [7,] "c"  "a" 
 [8,] "c"  "b" 
 [9,] "c"  "c"
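As a quick sanity check, the row counts of the outputs for the distinct cases can be compared with the standard counting formulas in base R. This sketch covers cases 1, 2, 5 and 6 with n = 5 and k = 2; the multiset cases 3 and 7 have no equally simple closed form and are left out.

```r
n <- 5
k <- 2

combNoRep <- choose(n, k)                      # case 1: 10 rows
combRep   <- choose(n + k - 1, k)              # case 2: 15 rows
permNoRep <- factorial(n) / factorial(n - k)   # case 5: 20 rows
permRep   <- n^k                               # case 6: 25 rows

c(combNoRep, combRep, permNoRep, permRep)
```

The same formulas with n = 3 give choose(4, 2) = 6 and 3^2 = 9, matching the row counts of cases 4 and 8.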

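Comparison with other packages

There are a few advantages of using arrangements over the existing packages.

Integrated framework: you do not have to use different packages for different methods.

It is very efficient. See https://randy3k.github.io/arrangements/articles/benchmark.html for some benchmarks.

It is memory efficient: it is able to generate all 13! permutations of 1 to 13, which existing packages fail to do because of the limit on matrix size. The getnext() method of the iterators lets users fetch the arrangements one by one.

The generated arrangements are in dictionary order, which may be desirable for some users.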
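The matrix size remark can be checked with simple arithmetic, assuming it refers to the cap of 2^31 - 1 on each dimension of an R matrix. The number of rows needed for all permutations of 1 to 13 exceeds that cap, while 1 to 12 still fits.

```r
rows13 <- factorial(13)            # 6227020800 permutations of 1:13
rows12 <- factorial(12)            # 479001600 permutations of 1:12
cap    <- .Machine$integer.max     # 2147483647, i.e. 2^31 - 1

rows13 > cap                       # TRUE: too many rows for a single matrix
rows12 > cap                       # FALSE: still fits
```

This is why an iterator that hands out permutations in pieces is needed once n reaches 13.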

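Comments:

I think this answer could be improved by showing some output that tells the story of each "question".

This answer is an advertisement for your package. If you are going to do that, please demonstrate the various capabilities and why they are superior to previous methods. As it stands, in my opinion, this question and answer does not supersede all the other questions about combinations/permutations (and it looks like that is your intention).

Hi Matthew, sorry to make you feel that this is an ad (indeed it is :)..) If you look at the edit history of my answer, you will see that the older versions used other packages. In particular, no package does k-permutations of a multiset; see the home-brewed function here. Since I was not satisfied with the existing packages, I decided to write my own.

But I agree with you, I should compare my package with the existing ones.

May I suggest that you change your function names. The combinations/permutations functions from gtools are so widely used that your package could potentially break dependencies/legacy code/etc. When developing packages, I like to follow the adage articulated by @DirkEddelbuettel: "do no harm".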
