How can I have gene name labels on a volcano plot?

Posted: 2020-12-30 06:36:39

Question:

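I am trying to add labels to my volcano plot, but some labels appear on the plot and others do not. Can anyone tell me what the problem is?

For example, in this plot the gene "Nr1h4" does not appear, and it is marked FALSE instead of TRUE.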

New.df.7vsNO$genelabels <- ""
New.df.7vsNO$genelabels <- ifelse(New.df.7vsNO$Genes == "Shh"
                                  | New.df.7vsNO$Genes == "Ascl3"
                                  | New.df.7vsNO$Genes == "Klk1b27"
                                  | New.df.7vsNO$Genes == "Tenm1"
                                  | New.df.7vsNO$Genes == "Nr1h4", T, F)

                
library(ggplot2)
library(ggrepel)
                          
ggplot(New.df.7vsNO) + 
  geom_point(aes(log2FC,logpv,col= diffexpressed)) +
  geom_text_repel(aes(log2FC, logpv),label = ifelse(New.df.7vsNO$genelabels == TRUE, as.character(New.df.7vsNO$Genes),""), box.padding = unit(.7, "lines"),hjust= 0.30) + 
  theme(legend.title=element_blank(),text = element_text(size= 13))+
  scale_color_manual(values=c("red", "blue"))         

My data:

structure(list(log2FC = c(2.5576, -1.7629, 4.5593, -1.6414, 4.7747, 
1.9217, 2.5951, -2.4236, 4.2056, -2.8089, -2.1215, -1.7551, 7.6618, 
1.9732, 1.768, -1.7532, 2.1137, -7.4119, -5.0595, -1.6435), logpv = c(6.23062267392386, 
2.4454139371159, 6.87289520163519, 2.41294040382783, 9.84466396253494, 
3.31880400398931, 5.49214412830417, 5.38090666937326, 10.3914739664228, 
7.39254497678533, 4.19928292171762, 2.43023996241365, 3.67370511218151, 
3.17656489822122, 2.45950785169463, 2.70542356079838, 3.13990167030148, 
3.04151256697968, 14.8041003475908, 2.43438827509794), diffexpressed = c("UP", 
"DOWN", "UP", "DOWN", "UP", "UP", "UP", "DOWN", "UP", "DOWN", 
"DOWN", "DOWN", "UP", "UP", "UP", "DOWN", "UP", "DOWN", "DOWN", 
"DOWN"), Genes = c("Ngfr", "Axin2", "Igsf5", "Dlat", "Scnn1g", 
"Ckmt1", "Tmprss2", "Pparg", "Sema4f", "Hk2", "Pxmp4", "Scn4a", 
"Slc13a2", "Timp1", "Uhrf1", "Cnn1", "Ube2c", "Rhbg", "Tmem79", 
"Cyp51"), genelabels = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE)), row.names = c(NA, 20L), class = "data.frame")

Answer 1:

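Your plot code is fine. The problem is that your dataset does not contain any of the genes you specified in the ifelse statement. If you check the dataset for those genes, the result is character(0), i.e. there are no such genes in the dataset.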

New.df.7vsNO$Genes[New.df.7vsNO$Genes %in% c("Shh", "Ascl3", "Klk1b27", 
                                             "Tenm1", "Nr1h4")]

But if you label genes that are in the data, it works:

# Flag genes that are actually present in the Genes column
New.df.7vsNO$genelabels <- New.df.7vsNO$Genes %in% c("Ngfr", "Axin2", "Igsf5")

ggplot(New.df.7vsNO) +
  geom_point(aes(log2FC, logpv, col = diffexpressed)) +
  # Label only the flagged genes; every other point gets an empty label
  geom_text_repel(aes(log2FC, logpv),
                  label = ifelse(New.df.7vsNO$genelabels,
                                 as.character(New.df.7vsNO$Genes), ""),
                  box.padding = unit(0.7, "lines"), hjust = 0.30) +
  theme(legend.title = element_blank(), text = element_text(size = 13)) +
  scale_color_manual(values = c("red", "blue"))
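
Because a misspelled or absent gene name fails silently here, it can help to build the label vector through a small function that reports names it could not find. The following is only a sketch: `make_gene_labels` is a hypothetical helper, not part of ggplot2 or ggrepel, and the short `genes` vector is just the first five entries of the question's Genes column.

```r
# Hypothetical helper (an assumption, not from the original answer):
# returns the gene name where it is in `wanted` and "" otherwise,
# and warns about requested names that match nothing in the data.
make_gene_labels <- function(genes, wanted) {
  genes <- as.character(genes)
  not_found <- setdiff(wanted, genes)
  if (length(not_found) > 0) {
    warning("Genes not found in data: ", paste(not_found, collapse = ", "))
  }
  ifelse(genes %in% wanted, genes, "")
}

# First five gene names from the question's data
genes <- c("Ngfr", "Axin2", "Igsf5", "Dlat", "Scnn1g")

# "Nr1h4" is not in the data, so it triggers the warning (suppressed here)
labels <- suppressWarnings(make_gene_labels(genes, c("Ngfr", "Nr1h4")))
```

The result could then be passed as `label = make_gene_labels(New.df.7vsNO$Genes, c("Ngfr", "Axin2"))` in `geom_text_repel()`, so a missing gene produces a warning rather than a silently unlabeled plot.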

Answer 2:

Try this:

library(dplyr)    # for filter() and %>%
library(ggplot2)
library(ggrepel)

ggplot(New.df.7vsNO) +
  geom_point(aes(log2FC, logpv, col = diffexpressed)) +
  # Give the label layer only the rows for the genes to be labelled
  geom_text_repel(data = New.df.7vsNO %>%
                    filter(Genes %in% c("Ngfr", "Axin2", "Igsf5", "Dlat", "Tmem79", "Hk2")),
                  aes(label = Genes, x = log2FC, y = logpv),
                  box.padding = unit(0.7, "lines"), hjust = 0.30) +
  scale_color_manual(values = c("red", "blue"))
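
Since the label layer accepts any subset of the data, the subset can also be derived from the data itself, which avoids typing names that may not exist. The sketch below picks the most significant genes by `logpv` in base R; the six rows are copied (with `logpv` rounded) from the question's data, and `n_labels` is an assumed value chosen only for illustration.

```r
# Six rows copied from the question's data (logpv rounded to two decimals)
df <- data.frame(
  Genes  = c("Ngfr", "Axin2", "Igsf5", "Scnn1g", "Sema4f", "Tmem79"),
  log2FC = c(2.5576, -1.7629, 4.5593, 4.7747, 4.2056, -5.0595),
  logpv  = c(6.23, 2.45, 6.87, 9.84, 10.39, 14.80),
  stringsAsFactors = FALSE
)

n_labels <- 3  # assumption: how many points to label

# Sort by significance (largest logpv first) and keep the top rows
top_genes <- head(df[order(df$logpv, decreasing = TRUE), ], n_labels)
top_genes$Genes
```

A data frame built this way could be supplied as `data = top_genes` to `geom_text_repel()` in place of the `filter()` call above.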
