Combine scatter, boxplot and linear regression line on one chart ggplot R

【Posted】: 2021-11-23 02:42:19

【Question】:

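I want to use ggplot to put a scatter plot, a box plot and a linear regression line on one chart. I can get two of the three onto one chart, but I can't combine the regression line with the box plot.

Below is a sample of my data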

df <- structure(list(Sample = c(2113, 2113, 2114, 2114, 2115, 2115, 
2116, 2116, 2117, 2117, 2118, 2118, 2119, 2119, 2120, 2120, 2121, 
2121, 2122, 2122, 2123, 2123, 2124, 2124), Rep_No = c("A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B"), Fe = c(57.24, 57.12, 
57.2, 57.13, 57.21, 57.14, 57.16, 57.31, 57.11, 57.18, 57.21, 
57.12, 57.14, 57.17, 57.1, 57.18, 57, 57.06, 57.13, 57.09, 57.17, 
57.23, 57.09, 57.1), SiO2 = c("6.85", "6.83", "6.7", "6.69", 
"6.83", "6.8", "6.76", "6.79", "6.82", "6.82", "6.8", "6.86", 
"6.9", "6.82", "6.81", "6.83", "6.79", "6.76", "6.8", "6.88", 
"6.83", "6.79", "6.8", "6.83"), Al2O3 = c("2.9", "2.88", "2.88", 
"2.88", "2.92", "2.9", "2.89", "2.87", "2.9", "2.89", "2.9", 
"2.89", "2.89", "2.88", "2.89", "2.91", "2.91", "2.91", "2.9", 
"2.9", "2.91", "2.91", "2.88", "2.86")), row.names = c(NA, -24L
), class = "data.frame")

My code so far

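library(tidyverse)  # provides %>%, mutate() and ggplot()

x <- df$Sample
y <- df$Fe

# Build a plotmath label with the fitted equation and R^2.
# Note: the formula uses the global vectors x and y defined above.
lm_eqn <- function(df, ...) {
  m <- lm(y ~ x, df)
  eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
                   list(a = format(unname(coef(m)[1]), digits = 2),
                        b = format(unname(coef(m)[2]), digits = 2),
                        r2 = format(summary(m)$r.squared, digits = 3)))
  as.character(as.expression(eq))
}

a <- lm_eqn(df)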


p <- df %>%
  mutate(Sample = factor(Sample)) %>%
  ggplot()+
  geom_boxplot(mapping = aes(x = "All Data", y = Fe))+
  geom_point(mapping = aes(x = Sample, y = Fe, color = Sample))+
  ggtitle("Lab Test Order Fe") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(legend.position = "none")+
  xlab(label = "Sample No") +
  ylab("Homogeneity Test Fe %")
p

And my code to get the linear trend line

p2 <- df %>% 
  ggplot(aes(Sample, y = Fe))+
  geom_point(mapping = aes(x = Sample, y = Fe))+
  geom_smooth(method = lm, se = FALSE)+
  theme(legend.position = "none")+
  geom_text(x = 2115, y = 57.05, check_overlap = T, label = a, parse = TRUE)

p2

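How can I put all three on the same chart? I would also like the box plot to come first, to keep the colors of the points, and to have the regression line text placed at a sensible position automatically rather than at hard-coded coordinates.

Any help is appreciated.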

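【Comments】:

Your example doesn't work for me. When I call the lm_eqn function I get: "Error in eval(predvars, data, env): object 'y' not found"

@dario I had left out two lines of code; I have edited the post.

【Answer 1】: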

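I suggest two options. First, with the help of the scales and ggpmisc packages, put everything into a single plot/frame. This is literally what you asked for. Second, with the help of patchwork, build two aligned plots: one is the box plot, the other is the scatter plot plus the regression line.

Option 1. Everything bundled together.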

library(tidyverse)
library(scales)  # To get nice looking x-axis breaks
library(ggpmisc) # To help with optimal position for the regression formula

ggplot(data = df, aes(x = Sample, y = Fe)) +
  geom_point(mapping = aes(x = Sample, y = Fe, color = as.factor(Sample))) +
  # `a` is the label string built by lm_eqn() in the question
  stat_poly_eq(formula = y ~ x, mapping = aes(label = a), parse = TRUE, method = "lm", hjust = -0.35) +
  geom_smooth(method = lm, se = FALSE) +
  # Draw the box plot one unit to the left of the first sample
  geom_boxplot(mapping = aes(x = min(Sample) - 1, y = Fe)) +
  theme(legend.position = "none") +
  labs(title = "Lab Test Order Fe", x = "Sample No", y = "Homogeneity Test Fe %") +
  # Label the extra tick at min(Sample) - 1 as "All Data"
  scale_x_continuous(labels = c("All Data", as.integer(df$Sample)),
                     breaks = c(min(df$Sample) - 1, df$Sample))
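
Option 1 passes the precomputed string a (from lm_eqn() in the question) as the label. As a possible variation, stat_poly_eq() can build the label from its own fit through the computed variables eq.label and rr.label. The sketch below assumes ggplot2 3.3.0 or later (for after_stat()) and a reasonably recent ggpmisc; the data frame is rebuilt from the Sample and Fe values in the question, and the name p_all is only illustrative.

```r
library(ggplot2)
library(ggpmisc)

# Sample and Fe columns copied from the question's data
df <- data.frame(
  Sample = rep(2113:2124, each = 2),
  Fe = c(57.24, 57.12, 57.2, 57.13, 57.21, 57.14, 57.16, 57.31, 57.11, 57.18,
         57.21, 57.12, 57.14, 57.17, 57.1, 57.18, 57, 57.06, 57.13, 57.09,
         57.17, 57.23, 57.09, 57.1)
)

p_all <- ggplot(df, aes(x = Sample, y = Fe)) +
  # Box plot of all Fe values, drawn one unit left of the first sample
  geom_boxplot(aes(x = min(Sample) - 1)) +
  geom_point(aes(color = factor(Sample))) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Equation and R^2 are computed by stat_poly_eq itself (no lm_eqn() needed)
  stat_poly_eq(
    aes(label = paste(after_stat(eq.label), after_stat(rr.label), sep = "~~~")),
    formula = y ~ x, parse = TRUE
  ) +
  scale_x_continuous(
    breaks = c(min(df$Sample) - 1, unique(df$Sample)),
    labels = c("All Data", unique(df$Sample))
  ) +
  theme(legend.position = "none") +
  labs(title = "Lab Test Order Fe", x = "Sample No", y = "Homogeneity Test Fe %")

p_all
```

With no label.x or label.y given, stat_poly_eq() places the text at its default corner position, so no data coordinates need to be hard-coded.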

Option 2. Assemble the plots with patchwork.

library(tidyverse)
library(scales)    # To get nice looking x-axis breaks
library(ggpmisc)   # To help with optimal position for the regression formula
library(patchwork) # To assemble a composite plot

p_boxplot <- 
  ggplot(data = df, aes(x = Sample, y = Fe))+
  geom_boxplot(data = df, mapping = aes(x = "All Data", y = Fe)) +
  labs(subtitle = "Box Plot", 
       x = "", 
       y = "Homogeneity Test Fe %")

p_scatter <- 
  ggplot(data = df, aes(x = Sample, y = Fe))+
  geom_point(mapping = aes(x = Sample, y = Fe, color = as.factor(Sample))) +
  # `a` is the label string built by lm_eqn() in the question
  stat_poly_eq(formula = y ~ x, mapping = aes(label = a), parse = TRUE, method = "lm") +
  geom_smooth(method = lm, se = FALSE) +
  theme(legend.position = "none") +
  labs(subtitle = "Scatter Plot", 
       x = "Sample No", y = "") +
  scale_x_continuous(labels = as.integer(df$Sample),
                     breaks = df$Sample)


p_boxplot + p_scatter + 
  plot_layout(widths = c(1,5)) + 
  plot_annotation(title = "Lab Test Order Fe")

【Discussion】:

Thank you very much. Option 1 is what I asked for, but I think option 2 would work as well.
