Combining grep and a for-loop to construct a matrix (R)

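I have a huge list of small data frames that I would like to combine meaningfully into one, but the logic of how to do so escapes me.

For instance, suppose I have a list of data frames that looks something like this, albeit with far more files, many of which I do not want in my data frame: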

MyList = c("AthosVersusAthos.csv", "AthosVersusPorthos.csv", "AthosVersusAramis.csv", "PorthosVersusAthos.csv", "PorthosVersusPorthos.csv", "PorthosVersusAramis.csv", "AramisVersusAthos.csv", "AramisVersusPorthos.csv", "AramisVersusAramis.csv", "BobVersusMary.csv", "LostCities.txt")

What I want is to assemble them into one large data frame, laid out like this:

                    |                      |
 AthosVersusAthos   | PorthosVersusAthos   | AramisVersusAthos
                    |                      |
 ------------------------------------------------------------------
                    |                      |
 AthosVersusPorthos | PorthosVersusPorthos | AramisVersusPorthos
                    |                      |
 ------------------------------------------------------------------
                    |                      |
 AthosVersusAramis  | PorthosVersusAramis  | AramisVersusAramis
                    |                      |

Or, more correctly (with sample numbers filled in for only one block of the matrix):

           |       Athos      |      Porthos       |    Aramis
    -------|------------------------------------------------------
           | 10     9      5  |                    |
    Athos  | 2      10     4  |                    | 
           | 3      0      10 |                    |
    -------|------------------------------------------------------
           |                  |                    |
   Porthos |                  |                    |                  
           |                  |                    |
    -------|------------------------------------------------------
           |                  |                    |
   Aramis  |                  |                    |                  
           |                  |                    |
    -------------------------------------------------------------

What I have managed so far is:

Musketeers = c("Athos", "Porthos", "Aramis")

  for(i in 1:length(Musketeers)) {
    for(j in 1:length(Musketeers)) {

    CombinedMatrix <- cbind (

      rbind(MyList[grep(paste0("^(", Musketeers[i],
      ")(?=.*Versus[", Musketeers[j], "]"), names(MyList),
      value = T, perl=T)])

  )
 }
}

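What I was trying to do was combine my grep command (very important, given the number of files and how specifically I need to select them) with rbind and cbind, so that the rows and columns of the matrix are joined up meaningfully.

My general plan was to merge all data frames starting with 'Athos' into one column, do the same for the data frames starting with 'Porthos' and 'Aramis', and then combine those three columns into a single data frame.

I know I am quite far off, but I can't work out where to go from here.

Edit: @PierreGramme generated a useful mock data set, which I am adding below, since it would have been useful to provide it originally.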

Musketeers = c("Athos", "Porthos", "Aramis")
MyList = c("AthosVersusAthos.csv", "AthosVersusPorthos.csv", "AthosVersusAramis.csv", 
                    "PorthosVersusAthos.csv", "PorthosVersusPorthos.csv", "PorthosVersusAramis.csv", 
                    "AramisVersusAthos.csv", "AramisVersusPorthos.csv", "AramisVersusAramis.csv",
                    "BobVersusMary.csv", "LostCities.txt")
MyList = lapply(setNames(nm=MyList), function(x) matrix(rnorm(9), nrow=3, dimnames=list(c("a","b","c"), c("x","y","z"))) )

Answer

First, let's make a reproducible example. Is it faithful? If so, I will add code to answer.

Musketeers = c("Athos", "Porthos", "Aramis")
MyList = c("AthosVersusAthos.csv", "AthosVersusPorthos.csv", "AthosVersusAramis.csv", 
                    "PorthosVersusAthos.csv", "PorthosVersusPorthos.csv", "PorthosVersusAramis.csv", 
                    "AramisVersusAthos.csv", "AramisVersusPorthos.csv", "AramisVersusAramis.csv",
                    "BobVersusMary.csv", "LostCities.txt")
MyList = lapply(setNames(nm=MyList), function(x) matrix(rnorm(9), nrow=3, dimnames=list(c("a","b","c"), c("x","y","z"))) )

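So is it correct that you want to concatenate nine of these matrices into the combined matrix you described?

Edit: here is code that solves your problem: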

# Helper function to extract the relevant portion of MyList and rbind() it
makeColumns = function(n){
    re = paste0("^",n,"Versus")
    sublist = MyList[grep(re, names(MyList))]
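    # Strip the ".csv" suffix and the "<n>Versus" prefix, leaving the second name
    # (the dot must be escaped as "\\." inside an R string literal)
    names(sublist) = sub(re, "", sub("\\.csv$","", names(sublist)))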

    # Make sure sublist is sorted correctly and contains info on all musketeers
    sublist = sublist[Musketeers]

    # Change row and col names so that they are unique in the final result
    sublist = lapply(names(sublist), function(m) {
        res = sublist[[m]]
        rownames(res) = paste0(m,"_",rownames(res))
        colnames(res) = paste0(n,"_",colnames(res))
        res
    })

    do.call(rbind, sublist)
}

lColumns = lapply(setNames(nm=Musketeers), makeColumns)
CombinedMatrix = do.call(cbind, lColumns)
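
As a quick sanity check, the sketch below rebuilds the mock data and assembles the same layout, then compares one block against its source matrix. It is a variation on the helper above rather than a replacement: it looks files up by their exact names instead of using a regex, which assumes every file follows the `<First>Versus<Second>.csv` pattern. The fixed seed and the name `blockColumn` are assumptions added for illustration only.

```r
# Assumption: a fixed seed, only so that the check is reproducible.
set.seed(1)

Musketeers <- c("Athos", "Porthos", "Aramis")

# All nine "<First>Versus<Second>.csv" names, plus two files to be ignored.
files <- c(
  outer(Musketeers, Musketeers, function(a, b) paste0(a, "Versus", b, ".csv")),
  "BobVersusMary.csv", "LostCities.txt"
)
MyList <- lapply(setNames(nm = files), function(x) {
  matrix(rnorm(9), nrow = 3, dimnames = list(c("a", "b", "c"), c("x", "y", "z")))
})

# Hypothetical helper: stack the three matrices whose name starts with `n`,
# in the order given by Musketeers, with unique row and column names.
blockColumn <- function(n) {
  blocks <- lapply(Musketeers, function(m) {
    res <- MyList[[paste0(n, "Versus", m, ".csv")]]
    rownames(res) <- paste0(m, "_", rownames(res))
    colnames(res) <- paste0(n, "_", colnames(res))
    res
  })
  do.call(rbind, blocks)
}

CombinedMatrix <- do.call(cbind, lapply(Musketeers, blockColumn))

# Inspect the overall shape and check that the top-left block is AthosVersusAthos.
print(dim(CombinedMatrix))
print(all(CombinedMatrix[1:3, 1:3] == MyList[["AthosVersusAthos.csv"]]))
```

Exact-name lookup skips unrelated files such as BobVersusMary.csv without any pattern matching, but it returns NULL for a missing or misspelled file name, so the regex version may be the safer choice when file names are less regular.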
