R - Given a matrix and a power, produce multiple matrices containing all unique combinations of matrix columns

Posted 2023-02-14


【Title】: R - Given a matrix and a power, produce multiple matrices containing all unique combinations of matrix columns
【Posted】: 2018-09-07 09:34:14
【Question】:

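Following on from the related question linked below (see @Aleh's solution): I would like to compute only the unique products between the columns of a matrix for a given power.

For example, for N=5, M=3, p=2 we get products of the columns (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3). I want to modify (@Aleh's) code so that it only computes the products between columns (1,1), (1,2), (1,3), (2,2), (2,3), (3,3). But I want to do this for each p-th order.

Can someone help me accomplish this in R?

Many thanks!

Related question: R - Given a matrix and a power, produce multiple matrices containing all combinations of matrix columns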
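For reference, the "unique" products correspond to non-decreasing index tuples, i.e. combinations with repetition, so there are choose(M + p - 1, p) of them rather than M^p. A minimal sketch of that count, cross-checked by brute-force enumeration; the helper names `n_unique` and `unique_tuples` are hypothetical and not part of the thread:

```r
# Number of non-decreasing index tuples of length p over M columns
# (combinations with repetition).
n_unique <- function(M, p) choose(M + p - 1, p)

# Brute-force check: enumerate all M^p tuples and keep those with
# i1 <= i2 <= ... <= ip. Only meant for small M and p.
unique_tuples <- function(M, p) {
  grid <- expand.grid(rep(list(seq_len(M)), p))
  keep <- !apply(grid, 1, is.unsorted)
  unname(as.matrix(grid[keep, , drop = FALSE]))
}

n_unique(3, 2)             # 6
n_unique(4, 2)             # 10
nrow(unique_tuples(3, 2))  # 6
n_unique(25, 4)            # 20475, versus 25^4 = 390625 unrestricted tuples
```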

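【Comments】:

If M=4 and p=2, you would expect 16 columns, correct?

@MikeH. You spotted an error! For my example above I meant M=3. It has been corrected. When M=4 and p=2, the original 16 columns should be reduced to only 10 unique columns [(1,1), (1,2), (1,3), (1,4), (2,2), (2,3), (2,4), (3,3), (3,4), (4,4)].

@MikeH. The original 16 columns that need to be reduced to the 10 unique columns given above are: [(1,1), (1,2), (1,3), (1,4), (2,1), (2,2), (2,3), (2,4), (3,1), (3,2), (3,3), (3,4), (4,1), (4,2), (4,3), (4,4)]

Can you quantify your efficiency requirements? What are realistic values for M, N and p?

Thanks for your solutions!

@RalfStubner M is typically below 25, while N can be between 5000 and 10,000. p is usually no larger than 3, but at most 4.

【Answer 1】: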

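We can create the following function, which finds all "unique" permutations for the chosen p and multiplies the relevant columns of the matrix: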

fun <- function(mat,p) {
  mat <- as.data.frame(mat)
  combs <- do.call(expand.grid,rep(list(seq(ncol(mat))),p)) # all combinations including permutations of same values
  combs <- combs[!apply(combs,1,is.unsorted),]              # "unique" permutations only
  rownames(combs) <- apply(combs,1,paste,collapse="-")      # Just for display of output, we keep info of combinations in rownames
  combs <- combs[order(rownames(combs)),]                   # sort to have desired column order on output
  apply(combs,1,function(x) Reduce(`*`,mat[,x]))            # multiply the relevant columns
}

Example

N = 5
M = 3
mat1 = matrix(1:(N*M),N,M)
#      [,1] [,2] [,3]
# [1,]    1    6   11
# [2,]    2    7   12
# [3,]    3    8   13
# [4,]    4    9   14
# [5,]    5   10   15

M = 4
mat2 = matrix(1:(N*M),N,M)
#      [,1] [,2] [,3] [,4]
# [1,]    1    6   11   16
# [2,]    2    7   12   17
# [3,]    3    8   13   18
# [4,]    4    9   14   19
# [5,]    5   10   15   20

lapply(2:4,fun,mat=mat1)
# [[1]]
#      1-1 1-2 1-3 2-2 2-3 3-3
# [1,]   1   6  11  36  66 121
# [2,]   4  14  24  49  84 144
# [3,]   9  24  39  64 104 169
# [4,]  16  36  56  81 126 196
# [5,]  25  50  75 100 150 225
# 
# [[2]]
#      1-1-1 1-1-2 1-1-3 1-2-2 1-2-3 1-3-3 2-2-2 2-2-3 2-3-3 3-3-3
# [1,]     1     6    11    36    66   121   216   396   726  1331
# [2,]     8    28    48    98   168   288   343   588  1008  1728
# [3,]    27    72   117   192   312   507   512   832  1352  2197
# [4,]    64   144   224   324   504   784   729  1134  1764  2744
# [5,]   125   250   375   500   750  1125  1000  1500  2250  3375
# 
# [[3]]
#      1-1-1-1 1-1-1-2 1-1-1-3 1-1-2-2 1-1-2-3 1-1-3-3 1-2-2-2 1-2-2-3 1-2-3-3 1-3-3-3 2-2-2-2 2-2-2-3 2-2-3-3 2-3-3-3 3-3-3-3
# [1,]       1       6      11      36      66     121     216     396     726    1331    1296    2376    4356    7986   14641
# [2,]      16      56      96     196     336     576     686    1176    2016    3456    2401    4116    7056   12096   20736
# [3,]      81     216     351     576     936    1521    1536    2496    4056    6591    4096    6656   10816   17576   28561
# [4,]     256     576     896    1296    2016    3136    2916    4536    7056   10976    6561   10206   15876   24696   38416
# [5,]     625    1250    1875    2500    3750    5625    5000    7500   11250   16875   10000   15000   22500   33750   50625

fun(mat2,2)
#      1-1 1-2 1-3 1-4 2-2 2-3 2-4 3-3 3-4 4-4
# [1,]   1   6  11  16  36  66  96 121 176 256
# [2,]   4  14  24  34  49  84 119 144 204 289
# [3,]   9  24  39  54  64 104 144 169 234 324
# [4,]  16  36  56  76  81 126 171 196 266 361
# [5,]  25  50  75 100 100 150 200 225 300 400

【Answer 2】:

If I understand you correctly, this is what you are looking for:

# all combinations of p elements out of M with repetition
# c.f. http://www.mathsisfun.com/combinatorics/combinations-permutations.html
comb_rep <- function(p, M) {
  # combn() gives strictly increasing tuples from 1:(M + p - 1);
  # subtracting 0:(p - 1) row-wise maps them to non-decreasing tuples from 1:M
  combn(M + p - 1, p) - 0:(p - 1)
}


# use cols from mat to form a new matrix
# take row products
col_prod <- function(cols, mat) {
  apply(mat[ ,cols], 1, prod)
}


N <- 5
M <- 3
p <- 3
mat <- matrix(1:(N*M),N,M)

col_comb <- lapply(2:p, comb_rep, M)
col_comb
#> [[1]]
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1    1    1    2    2    3
#> [2,]    1    2    3    2    3    3
#> 
#> [[2]]
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    1    1    1    1    1    2    2    2     3
#> [2,]    1    1    1    2    2    3    2    2    3     3
#> [3,]    1    2    3    2    3    3    2    3    3     3

# prepend original matrix
res_mat <- list()
res_mat[[1]] <- mat
c(res_mat, 
  lapply(col_comb, function(cols) apply(cols, 2, col_prod, mat)))
#> [[1]]
#>      [,1] [,2] [,3]
#> [1,]    1    6   11
#> [2,]    2    7   12
#> [3,]    3    8   13
#> [4,]    4    9   14
#> [5,]    5   10   15
#> 
#> [[2]]
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1    6   11   36   66  121
#> [2,]    4   14   24   49   84  144
#> [3,]    9   24   39   64  104  169
#> [4,]   16   36   56   81  126  196
#> [5,]   25   50   75  100  150  225
#> 
#> [[3]]
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    6   11   36   66  121  216  396  726  1331
#> [2,]    8   28   48   98  168  288  343  588 1008  1728
#> [3,]   27   72  117  192  312  507  512  832 1352  2197
#> [4,]   64  144  224  324  504  784  729 1134 1764  2744
#> [5,]  125  250  375  500  750 1125 1000 1500 2250  3375

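It is not very efficient, though, since for example the third power is computed from three columns of the original matrix rather than from one column of the original matrix and one column of the second power.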
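The reuse described in the previous sentence could look roughly like this: each order-k product is one column of the order-(k-1) result times one original column. This is a minimal sketch, assuming a plain numeric matrix; the function name `incremental_products` is hypothetical and it is not the code from either answer:

```r
# Build the products order by order, reusing the previous order's columns.
# Returns a list whose k-th element holds all unique order-k products.
incremental_products <- function(mat, p) {
  M <- ncol(mat)
  idx <- matrix(seq_len(M), nrow = 1)  # index tuples of the current order, one per column
  res <- list(mat)
  for (k in seq_len(p - 1)) {
    last <- idx[nrow(idx), ]           # last index of every current tuple
    # each tuple is extended by every column index >= its last index,
    # which keeps the tuples non-decreasing and therefore unique
    parents  <- rep(seq_len(ncol(idx)), times = M - last + 1)
    new_last <- unlist(lapply(last, function(l) l:M))
    # one elementwise multiplication per new column instead of k of them
    res[[k + 1]] <- res[[k]][, parents, drop = FALSE] * mat[, new_last, drop = FALSE]
    idx <- rbind(idx[, parents, drop = FALSE], new_last)
  }
  res
}

mat <- matrix(1:15, 5, 3)
res <- incremental_products(mat, 3)
res[[3]][1, ]
# 1 6 11 36 66 121 216 396 726 1331, the first row of the third-order result shown above
```

The columns come out in the same order as `comb_rep` produces them, so the results can be compared directly for small inputs.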

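Edit: Testing with the realistic sizes mentioned in the comments shows that the multiplication method from @Moody_Mudskipper is much faster, while my method for generating the combinations is a bit faster. So it makes sense to combine the two: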

# original function from @Moody_Mudskipper's answer
fun <- function(mat,p) {
  mat <- as.data.frame(mat)
  combs <- do.call(expand.grid,rep(list(seq(ncol(mat))),p)) # all combinations including permutations of same values
  combs <- combs[!apply(combs,1,is.unsorted),]              # "unique" permutations only
  rownames(combs) <- apply(combs,1,paste,collapse="-")      # Just for display of output, we keep info of combinations in rownames
  combs <- combs[order(rownames(combs)),]                   # sort to have desired column order on output
  apply(combs,1,function(x) Reduce(`*`,mat[,x]))            # multiply the relevant columns
}

combined <- function(mat, p) {
  mat <- as.data.frame(mat)
  combs <- combn(ncol(mat) + p - 1, p) - 0:(p - 1)          # all combinations with repetition
  colnames(combs) <- apply(combs, 2, paste, collapse = "-") # Just for display of output, we keep info of combinations in colnames
  apply(combs, 2, function(x) Reduce(`*`, mat[ ,x]))        # multiply the relevant columns
}

N <- 10000
M <- 25
p <- 4
mat <- matrix(runif(N*M),N,M)
microbenchmark::microbenchmark(
  fun(mat, p),
  combined(mat, p),
  times = 10
)
#> Unit: seconds
#>              expr      min       lq     mean   median       uq      max neval
#>       fun(mat, p) 3.456853 3.698680 4.067995 4.032647 4.341944 4.869527    10
#>  combined(mat, p) 2.543994 2.738313 2.870446 2.793768 3.090498 3.254232    10

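Note that for M > 9 the two functions do not give identical results, because the column order differs: fun sorts the column labels lexically, so that 1-10 < 1-2. If the same lexical sorting is inserted into combined, the results are identical.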
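Inserting that sorting step might look like the following. This is a sketch only: the name `combined_sorted` is hypothetical, it assumes p >= 2 as in the examples above, and it relies on `order()` using the same collation as in `fun`, which holds when both run in the same R session:

```r
# Variant of `combined` whose columns are reordered by their labels,
# mirroring the order(rownames(combs)) step in `fun`.
combined_sorted <- function(mat, p) {
  mat <- as.data.frame(mat)
  combs <- combn(ncol(mat) + p - 1, p) - 0:(p - 1)           # all combinations with repetition
  colnames(combs) <- apply(combs, 2, paste, collapse = "-")  # labels such as "1-10"
  combs <- combs[, order(colnames(combs)), drop = FALSE]     # lexical sort of the labels
  apply(combs, 2, function(x) Reduce(`*`, mat[, x]))         # multiply the relevant columns
}

set.seed(1)
m <- matrix(runif(5 * 11), 5, 11)   # M = 11, so labels like "1-10" occur
res <- combined_sorted(m, 2)
head(colnames(res))                 # "1-10" now comes before "1-2"
```

The sort only touches the small index matrix, not the N-row data, so it should add little to the runtime measured above.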

【Comments】:

Nice job combining the two approaches :)

Very nice! Bounty awarded. Thank you both for your efforts.
