Make a ggplot2 graph where date columns have separate colours

Posted: 2020-08-24 10:17:17

【Question】

I am trying to draw a scatter plot and keep failing. My data, df[1:10,], looks like this:

# A tibble: 13 x 5
   `Ticket Created` `Ticket Closed` `Case Owner`                              Frequency
   <chr>            <chr>           <fct>                                       <dbl>
 1 NA               NA              Animal_Services                             16395   
 2 NA               NA              Public_Works_Road_And_Bridges_16_60         6090
 3 NA               NA              COM_Code_Enforcement                        4099
 4 NA               2017-02-06      COM_Code_Enforcement                        123
 5 NA               2015-09-07      COM_Code_Enforcement                        96
 6 NA               2015-03-12      Animal_Services                             88
 7 NA               2017-01-06      COM_Code_Enforcement                        88
 8 2014-07-04       2014-07-04      Public_Works_Road_And_Bridges_16_60         78
 9 NA               2014-07-10      COM_Code_Enforcement                        65
10 NA               2014-08-09      COM_Code_Enforcement                        65
11 2013-11-03       2013-11-03      Public_Works_Road_And_Bridges_16_60         60
12 2014-07-01       2014-07-01      Public_Works_Road_And_Bridges_16_60         59
13 NA               2015-12-02      COM_Code_Enforcement                        55

I need a chart with Ticket Created and Ticket Closed on the x-axis, in different colours, and Frequency on the y-axis. This is how I built the ggplot:

ggplot2::ggplot()+
  geom_point(data= c, aes(lubridate::date(`Ticket Created`), Frequency, 
                          color=destring(`Ticket Created`)))+ 
  geom_point(data= c, aes(lubridate::date(`Ticket Closed`), Frequency, 
                          color=destring(`Ticket Closed`)))+ 
  theme_bw()+
  scale_x_date(date_breaks = "1 month", date_labels =  "%d %b %Y") +
  ylim(0, 150)+
  scale_alpha(guide = 'none')+
  theme(plot.title = element_text(hjust = 0.5), legend.position = "top", legend.title = element_text(face = "bold.italic"),
        axis.text.x=element_text(angle=60, hjust=1))+
  facet_wrap(~`Case Owner`, ncol = 1, scales = "free_y")+
  guides(fill= F)+
  labs(x="Day",y= "Freq. of Closing", caption = "**distributed by month-year")+
  ggtitle("Monthly Frequency of Ticket Closing by Case Owners, per Year")

This is the result I get, together with these warnings:

Warning messages:
1: In destring(`Ticket Created`) : NAs introduced by coercion
2: In destring(`Ticket Created`) : NAs introduced by coercion
3: In destring(`Ticket Closed`) : NAs introduced by coercion
4: Removed 50 rows containing missing values
(geom_point). 
5: Removed 22 rows containing missing values
(geom_point). 
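The "NAs introduced by coercion" warnings are what R reports when text such as "2014-07-04" is converted directly to a number. Here is a minimal base R sketch of that effect, assuming destring() behaves like as.numeric() on these strings (its definition is not shown in the post):

```r
# Date strings cannot be read as plain numbers, so coercion yields NA
x <- c("2014-07-04", NA, "2013-11-03")
as_number <- suppressWarnings(as.numeric(x))  # every element is NA

# Parsing as Date first gives values that do have a numeric form
as_date <- as.Date(x)        # the genuine NA stays NA
days <- as.numeric(as_date)  # days since 1970-01-01
```

Because every colour value ends up NA, the colour aesthetic carries no information. Separately, ggplot2 drops rows with missing x or y values, and that is what the "Removed ... rows" warnings report.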

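I was told not to remove the NAs from the data, so that we can see which Case Owner opened and closed tickets on the same day. I have tried several ways of colouring the points, including converting the date columns to integers and following these posts:

Scatter plot with ggplot2 colored by dates

Color points by date in ggplot2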
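As a side note on the same-day requirement, those tickets can also be flagged explicitly, without relying on colour. This is a minimal base R sketch on toy rows; the column names created and closed and the values are hypothetical stand-ins for the post's columns:

```r
# Toy rows in the same shape as the post's data (values are illustrative)
tickets <- data.frame(
  created = c(NA, "2014-07-04", NA, "2013-11-03"),
  closed  = c("2017-02-06", "2014-07-04", NA, "2013-11-05"),
  stringsAsFactors = FALSE
)

# TRUE only when both dates are present and equal; rows with an NA become FALSE
tickets$same_day <- !is.na(tickets$created) &
  !is.na(tickets$closed) &
  tickets$created == tickets$closed
```

The NA rows stay in the table. They are marked FALSE rather than removed.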

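The dput() output is below if you want to look at it. I would be grateful for a way to show the two columns in different colours, which would make the plot clearer. Hints would also be very welcome!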

structure(list(`Ticket Created` = c(NA, NA, NA, NA, NA, NA, NA, 
"2014-07-04", NA, NA, "2013-11-03", "2014-07-01", NA, "2013-04-04", 
"2013-10-04", NA, "2013-09-01", NA, "2014-10-07", NA, "2013-04-02", 
NA, NA, "2014-07-08", "2013-10-07", "2014-02-06", "2015-11-06", 
"2014-09-07", "2014-11-06", NA, "2015-07-07", NA, "2013-08-05", 
"2014-03-09", "2017-06-04", NA, "2014-01-05", "2014-06-01", NA, 
"2014-03-07", "2013-05-11", "2014-01-07", "2014-11-03", "2015-08-07", 
NA, NA, "2013-02-04", "2014-08-07", NA, NA, "2013-09-09", "2013-11-06", 
NA, NA, NA, "2014-08-04", "2014-10-11", "2014-12-02", "2013-03-06", 
"2013-05-02", NA, "2014-05-03", "2014-05-08", "2014-10-03", "2015-09-07", 
NA, "2013-01-04", "2014-09-01", NA, NA, NA, "2013-06-05", "2013-12-06", 
"2014-02-07", NA, NA, NA, "2013-12-08", "2014-10-01", "2014-11-08", 
"2014-12-02", NA, "2013-04-03", "2013-08-08", "2013-11-02", "2014-01-10", 
"2014-07-07", "2014-12-11", NA, NA, NA, "2014-03-04", "2014-12-09", 
"2015-02-07", NA, NA, "2013-07-08", "2013-11-12", "2014-06-05", 
"2014-10-02", "2014-12-05", "2015-01-09", "2015-09-12", "2016-09-02", 
NA, NA, "2013-01-05", "2013-12-12", "2013-12-12", "2014-12-05", 
"2015-02-09", "2016-05-05", "2016-07-06", "2016-12-04", "2016-12-10", 
"2016-12-12", NA, NA, NA, NA, NA, "2013-01-10", "2013-09-12", 
"2013-12-03", "2014-01-08", "2014-07-05", "2015-05-05", "2016-12-02", 
"2017-07-09", NA, NA, NA), `Ticket Closed` = c(NA, NA, NA, "2017-02-06", 
"2015-09-07", "2015-03-12", "2017-01-06", "2014-07-04", "2014-07-10", 
"2014-08-09", "2013-11-03", "2014-07-01", "2015-12-02", "2013-04-04", 
"2013-10-04", "2016-01-12", "2013-09-01", "2016-05-01", "2014-10-07", 
"2017-08-04", "2013-04-02", "2014-02-09", "2015-02-02", "2014-07-08", 
"2013-10-07", "2014-02-06", "2015-11-06", "2014-09-07", "2014-11-06", 
"2017-08-05", "2015-07-07", "2015-09-03", "2013-08-05", "2014-03-09", 
"2017-06-04", "2015-12-06", "2014-01-05", "2014-06-01", "2017-01-11", 
"2014-03-07", "2013-05-11", "2014-01-07", "2014-11-03", "2015-08-07", 
"2016-05-08", "2018-01-02", "2013-02-04", "2014-08-07", "2014-06-10", 
"2014-12-06", "2013-09-09", "2013-11-06", "2014-03-01", "2014-11-06", 
"2015-06-12", "2014-08-04", "2014-10-11", NA, "2013-03-06", "2013-05-02", 
"2015-10-03", "2014-05-03", "2014-05-08", "2014-10-03", "2015-09-07", 
"2013-04-04", "2013-01-04", "2014-09-01", "2014-03-06", "2014-06-12", 
"2014-08-08", "2013-06-05", NA, "2014-02-07", "2014-07-05", "2016-02-08", 
"2017-09-05", NA, NA, "2014-11-08", "2014-12-02", "2017-01-05", 
"2013-04-03", "2013-08-08", "2013-11-02", "2014-01-10", "2014-07-07", 
NA, "2013-04-09", "2016-08-01", "2017-02-05", NA, NA, "2015-02-07", 
"2014-02-04", "2015-07-04", "2013-07-08", "2013-11-12", "2014-06-05", 
NA, "2014-12-05", "2015-01-09", "2015-09-12", NA, "2014-06-03", 
"2016-04-05", "2013-01-05", "2013-12-12", NA, NA, "2015-02-09", 
NA, NA, NA, NA, NA, "2014-01-05", "2014-05-02", "2015-01-09", 
"2015-02-08", "2017-11-01", "2013-01-10", "2013-11-12", "2013-12-03", 
"2014-08-09", NA, "2015-05-05", NA, NA, "2013-01-08", "2015-03-02", 
"2017-08-12"), `Case Owner` = structure(c(1L, 3L, 2L, 2L, 2L, 
1L, 2L, 3L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 
1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 2L, 3L, 3L, 3L, 1L, 3L, 
3L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 1L, 
2L, 1L, 3L, 3L, 1L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 2L, 
2L, 1L, 3L, 1L, 3L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 1L, 3L, 3L, 3L, 
3L, 3L, 1L, 2L, 1L, 1L, 3L, 1L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 3L, 
3L, 3L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 1L, 1L, 3L, 1L, 3L, 2L, 1L, 3L, 1L, 2L, 1L, 1L, 2L), .Label = c("Animal_Services", 
"COM_Code_Enforcement", "Public_Works_Road_And_Bridges_16_60"
), class = "factor"), Frequency = c(16395L, 6090L, 4099L, 123L, 
96L, 88L, 88L, 78L, 65L, 65L, 60L, 59L, 55L, 54L, 53L, 51L, 50L, 
50L, 49L, 48L, 47L, 47L, 46L, 45L, 44L, 42L, 42L, 41L, 41L, 41L, 
40L, 40L, 39L, 39L, 39L, 39L, 38L, 37L, 37L, 36L, 35L, 35L, 35L, 
35L, 35L, 35L, 34L, 34L, 34L, 33L, 32L, 32L, 32L, 32L, 32L, 31L, 
31L, 31L, 30L, 30L, 30L, 29L, 29L, 29L, 29L, 29L, 28L, 28L, 28L, 
28L, 28L, 27L, 27L, 27L, 27L, 27L, 27L, 26L, 26L, 26L, 26L, 26L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 24L, 24L, 24L, 24L, 
24L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 22L, 22L, 
22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L)), row.names = c(NA, 
-132L), class = c("tbl_df", "tbl", "data.frame"))

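【Comments】

What is sclae_color_b supposed to be in your call to ggplot?

Hi Peter, sorry, that was a typo and it is not in the code. I meant scale_color_brewer(), but that has nothing to do with the problem at hand.

【Answer 1】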

If you reshape your data into long format, it should work:


library(tidyr)    # pivot_longer(); also re-exports %>%
library(ggplot2)  # geom_point(), theme_bw(), etc. are called without a prefix

# Stack the two date columns into one `date` column;
# `tick` records which original column each date came from
c1 <- 
  c %>% 
  pivot_longer(cols = c(`Ticket Created`, `Ticket Closed`), names_to = "tick", values_to = "date")

c1

# One layer is enough now: colour is mapped to `tick`
ggplot2::ggplot()+
  geom_point(data= c1, aes(lubridate::date(date), Frequency, colour= tick))+ 
  theme_bw()+
  scale_x_date(date_breaks = "1 month", date_labels =  "%d %b %Y") +
  ylim(0, 150)+
  scale_alpha(guide = 'none')+
  theme(plot.title = element_text(hjust = 0.5), legend.position = "top", legend.title = element_text(face = "bold.italic"),
        axis.text.x=element_text(angle=60, hjust=1))+
  facet_wrap(~`Case Owner`, ncol = 1, scales = "free_y")+
  guides(fill= "none")+
  labs(x="Day",
       y= "Freq. of Closing", 
       caption = "**distributed by month-year",
       colour = "Ticket type")+
  ggtitle("Monthly Frequency of Ticket Closing by Case Owners, per Year")


Result:

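【Comments】

Thank you, Peter. If you don't mind, may I ask how you worked out this solution? I would never have thought of going to long format.

This is the typical ggplot/tidyverse approach to grouping variables, so that aesthetics are mapped to the levels of a factor or to different instances of a variable. Once you get the hang of it, it is really useful. If you haven't already, these are well worth reading: Hadley Wickham's R for Data Science and ggplot2 books, at r4ds.had.co.nz and ggplot2-book.org. But don't neglect base R and base R graphics, because there is a lot you can do faster and more simply in base R. There is just always a lot to learn!

Thank you very much for sharing the links and the good advice; I haven't read those books yet. Take care and stay safe :)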
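To make the long-format idea concrete, here is a minimal sketch on a toy table. The column names created and closed and all values are hypothetical; only pivot_longer() and the colour mapping follow the answer. The sketch uses as.Date() rather than lubridate::date() to avoid the extra dependency.

```r
library(tidyr)
library(ggplot2)

# Toy wide table (hypothetical values): one column per date type
wide <- data.frame(
  created   = c("2014-07-04", NA, "2013-11-03"),
  closed    = c("2014-07-04", "2015-09-07", "2013-11-03"),
  Frequency = c(78, 96, 60),
  stringsAsFactors = FALSE
)

# Long format: one row per (ticket, date type); NA dates are kept by default
long <- pivot_longer(wide, cols = c(created, closed),
                     names_to = "tick", values_to = "date")

# A single layer suffices, because colour is mapped to the `tick` column
p <- ggplot(long, aes(as.Date(date), Frequency, colour = tick)) +
  geom_point()
```

A ticket created and closed on the same day produces two points at the same position, so one colour covers the other. Mapping tick to shape as well, or adding some transparency, is one way to keep both visible.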
