在 igraph 中使用两个相关矩阵绘制网络

Posted 2023-02-24

技术标签:

【中文标题】在 igraph 中使用两个相关矩阵绘制网络【英文标题】：plot networks using two correlation matrices in igraph 【发布时间】：2013-12-27 19:57:45 【问题描述】：

我需要从两个矩阵开始绘制一个网络，代表细菌种类和产生的代谢物。我想要一个网络显示细菌和代谢物之间的共现和相关性（不是 bateria VS 细菌和代谢物 VS 代谢物）。

这些是我的数据示例：

> mydata_bac

Taxon                                                          CD1         CD2
Actinomycetaceae;g__Actinomyces                        0.072998825 0.031399459
Coriobacteriaceae;g__Atopobium                         0.040946468 0.002703265
Corynebacteriaceae;g__Corynebacterium                  0.002517201 0.006446247
Micrococcaceae;g__Rothia                               0.001174694 0.002703265
Porphyromonadaceae;g__Porphyromonas                    0.023326061 0.114368892
Prevotellaceae;g__Prevotella                           0.252894781 0.102308172
Flavobacteriaceae;g__Capnocytophaga                    0.001174694 0.029320025
Aerococcaceae;g__Abiotrophia                           0.002013761 0.003327095
Carnobacteriaceae;g__Granulicatella                    0.042960228 0.049490539
Gemellaceae;g__Gemella                                 0.027857023 0.067165731
Streptococcaceae;g__Streptococcus                      0.220506796 0.182782283
ClostridialesFamilyXI.IncertaeSedis;g__                0.000000000 0.000623830
ClostridialesFamilyXIII.IncertaeSedis;g__Mogibacterium 0.006880349 0.002495321
Lachnospiraceae;Other                                  0.000335627 0.000831774
Clostridia                                             0.004363148 0.002079434
Lachnospiraceae;g__Oribacterium                        0.003524081 0.002079434
Peptostreptococcaceae;g__Peptostreptococcus            0.000167813 0.005198586
Veillonellaceae;Other                                  0.001342507 0.001455604
Veillonellaceae;g__Veillonella                         0.047323376 0.082553545
Fusobacteriaceae;g__Fusobacterium                      0.009229737 0.010813059
Fusobacteriaceae;g__Leptotrichia                       0.092465179 0.076523186
Neisseriaceae;g__Neisseria                             0.013592885 0.027656477
Pasteurellaceae;g__Haemophilus                         0.014431952 0.092534831
SR1;c__;f__;g__                                        0.000000000 0.002079434
TM7;c__TM7-3;f__;g__                                   0.065782849 0.018299023
Erysipelotrichaceae;g__Bulleidia                       0.007551603 0.004366812
Bacteroidia                                            0.000000000 0.000415887
Porphyromonadaceae;g__Tannerella                       0.000671254 0.002079434
Flavobacteriaceae                                      0.002013761 0.001247661
Bacilli                                                0.002181574 0.002911208
Clostridia;f__;g__                                     0.000671254 0.002703265
ClostridialesFamilyXIII.IncertaeSedis;g__Eubacterium   0.003020641 0.002079434
Lachnospiraceae;g__Moryella                            0.003188454 0.000623830
Veillonellaceae;g__Selenomonas                         0.004866588 0.021834061
Fusobacteriaceae                                       0.000335627 0.001871491
Campylobacteraceae;g__Campylobacter                    0.001510321 0.001247661
Pasteurellaceae;g__Actinobacillus                      0.002852828 0.000207943
Burkholderiaceae;g__Lautropia                          0.000000000 0.002495321
Lactobacillaceae;g__Lactobacillus                      0.000000000 0.000000000
Staphylococcaceae;g__Staphylococcus                    0.000000000 0.000000000


> mydata_metab

Molecule CD1 CD2
1-chloro-decane 0.15 0.18
1-methyl-2-1-methylethyl-benzene 0.05 0.06
1-Octadecene 1.47 1.6
1.2.3-Trimethylbenzene 0.39 0.47
1.3-Bis1.1-dimethylethyl-benzene 0.24 0.06
1.3H-Isobenzofuranone 0.04 0.05
2-Butanone 0.01 0.03
2-ethyl-1-hexanol 1.44 1.34
2-Hexanol 0 0.02
2-Pentylfuran 0.42 0.56
2-pentylthiophene 8.08 7.21
25bis11dimethylethylphenol 0.15 0.27
2.6-dimethyl-4-heptanone 0.23 0
4-1.1-dimethylpropyl-phenol 0.39 0.3
4-1.1.3.3-tertamethylbutyl-phenol 1.6 2.04
4-Methoxybenzhydrol 0.02 0.02
4-methyl-1-pentene 0.02 0
4-methyl-2-hexanone 0.03 0.01
4-methyl-3-penten-2-one 0.67 0.27
aceticacidethylester 0.5 0.7
Benzene 0.13 0.1
Benzene-octyloxy 0.03 0.03
butanoicacid-2-methyloctylester 0.04 0.03
carbondisulfide 3.05 3.72
Caryophyllene 0.02 0.02
ethyl-alcohol 0.45 0.61
hexadecane 0.02 0.02
menthol 0.08 0.06
methyl-isobutyl-ketone 0.09 0.08
n.d.3-ion97 1.99 1.92
n.d.4-ion102 2.12 2.04
Nonadecane 0 0
Nonanal 0.82 0.71
Octanal 0.70 0.71
Pentadecane 0.03 0.03
phenol 0.12 0.09
Propyl-alcohol 0.20 0.22
Tetradecane 0.12 0.09
Toluene 0.25 0.15
Trichloromethane 0.15 0.12

这是我尝试过的：

> library(psych)
> library(igraph)

> mydata_bac <- read.csv(file="L5_filt.txt", header=T, row.names=1, sep="\t")
> mydata_metab <- read.csv(file="metab.txt", header=T, row.names=1, sep="\t")
> mydata_t <- t(as.matrix(mydata_bac))
> metab_t <- t(as.matrix(mydata_metab))
> cor.matrix <- corr.test(mydata_t,metab_t, method = "spearman")
> graph.f<-graph.adjacency(cor.matrix$r, weighted=TRUE, mode="upper")
> t.names <- colnames(cor.matrix)[as.numeric(V(graph.f)$name)]

> graph.f = simplify(graph.f)
> plot (graph.f, vertex.size=5, vertex.shape="circle", vertex.label.color="red", vertex.label=t.names, vertex.label.cex=0.9, edge.width=1, layout=layout.fruchterman.reingold)

但我得到了一个只有代谢物的网络…… 错误在哪里？

【问题讨论】：

CD1 和 CD2 列代表什么？它们是两个样本。在完整的数据集中，我有 15 个样本。 【参考方案1】：

好的，所以我认为这里发生的事情是您拥有的代谢物数量，恰好与您拥有的分类群数量相同。否则graph.adjacency 会抛出错误：它想要一个邻接方阵，cor.matrix$r 的行AND 应该包含图中的所有节点。

您的图表是bipartite：代谢物可以与分类群相关联，分类群可以与代谢物相关联，但代谢物永远不会与其他代谢物相关联，分类群永远不会与其他分类群相关联。当您运行corr.test 时，您的行就是您的分类群，您的列就是您的代谢物。

我不熟悉igraph 包：可能有比graph.adjacency 更适合处理二分图的方法。取而代之的是，我们可以强制我们的邻接矩阵 cor.matrix 包含所有节点：

获得cor.matrix后立即运行：

nNodes <- ncol(cor.matrix$r) + nrow(cor.matrix$r)
nodeNames <- c(colnames(cor.matrix$r), rownames(cor.matrix$r))
adj <- matrix(0, nNodes, nNodes, dimnames=list(nodeNames, nodeNames))
adj[(ncol(cor.matrix$r)+1):nrow(adj), 1:ncol(cor.matrix$r)] <- cor.matrix$r
adj[1:ncol(cor.matrix$r), (ncol(cor.matrix$r)+1):ncol(adj)] <- t(cor.matrix$r)

现在您可以从adj 获取graph.f，然后运行其余代码：

graph.f <- graph.adjacency(adj, weighted=TRUE, mode="upper")

【讨论】：

我收到一个错误...当我运行adj <- matrix(0, nNodes, nNodes, dimnames=list(nodeNames, nodeNames)) 时，我收到：Error in matrix(0, nNodes, nNodes, dimnames = list(nodeNames, nodeNames)) : invalid 'nrow' value (too large or NA) 有什么建议吗？啊，所有的cor.matrix 都应该是cor.matrix$r。现在试试？成功了！！我只有一个问题：使用这段代码，我是只显示两个矩阵之间的相关性还是在同一个矩阵内？因为我对细菌-细菌或代谢物-代谢物之间的相关性不感兴趣，而只对细菌和代谢物之间的相关性感兴趣...... 对：在adj 中，所有这些边都设置为 0。我希望 igraph 不会创建它们。但是，您应该能够通过将答案调整为 previous question 来删除所有为 0 的边：)

以上是关于在 igraph 中使用两个相关矩阵绘制网络的主要内容，如果未能解决你的问题，请参考以下文章

将 arma::mat 邻接矩阵转换为 C 中的 igraph 图 (Rcpp)

使用Python-iGraph绘制好友关系图