Bubble plot for COVID data using ggplot
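【Posted】: 2021-11-13 09:53:41
【Question】: I have a spreadsheet (file here) containing latitude and longitude information for the 14 regions of the Czech Republic. I am trying to draw a map and place a bubble on each region for its active cases. The latitude/longitude coordinates are those of each region's capital city.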
library(sf)
library(ggplot2)
library(maps)
library(rstudioapi)
library(dplyr)
library(ggmap)
library(mapproj)
library(viridis)
#----------------------------#
# Set your working directory #
#----------------------------#
setwd(dirname(rstudioapi::getActiveDocumentContext()$path)) # RStudio IDE preferred
getwd() # Path to your working directory
# Country Boundary and the 14 regions within the Czech Republic
worldmap <- map_data("world")
worldmap2 <- dplyr::filter(worldmap, region %in% data.frame(countries = "Czech Republic"))
ggplot(worldmap2) + geom_polygon(aes(long,lat, group=group), col = "black", fill = "white", size = 1) +
  labs(title = "COVID-19 in the Czech Republic", subtitle = "As of July 1, 2021", x = "Longitude", y = "Latitude",
       caption = "(Source: Ministerstvo zdravotnictví České republiky)")
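The sixth column of the spreadsheet contains the number of active cases. I am trying to show these numbers as bubbles on the map above. I tried the following, but all the points come out the same size. How can I combine plot 1 and plot 2?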
my_df <- read.csv("CZE_InitialSeedData.csv", header = T)
class(my_df)
my_sf <- st_as_sf(my_df, coords = c('Lon', 'Lat'))
my_sf <- st_set_crs(my_sf, value = 4326)
my_sf
seedPlot <- ggplot(my_sf) +
  geom_sf(aes(fill = InitialInfections))
seedPlot <- seedPlot +
  scale_fill_continuous(name = "Active Cases", low = "pink", high = "red", na.value = "grey50")
seedPlot <- seedPlot +
  theme(legend.position = "bottom", legend.text.align = 1, legend.title.align = 0.5)
seedPlot
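【Answer 1】: There is no need to convert your data to an sf object. You can simply add the data to the map with geom_point. To get bubbles, map the column with the active cases to the size aesthetic: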
library(ggplot2)
library(maps)
library(dplyr)
worldmap <- map_data("world")
worldmap2 <- dplyr::filter(worldmap, region == "Czech Republic")
base_map <- ggplot(worldmap2) +
  geom_polygon(aes(long, lat, group = group), col = "black", fill = "white", size = 1) +
  labs(
    title = "COVID-19 in the Czech Republic", subtitle = "As of July 1, 2021", x = "Longitude", y = "Latitude",
    caption = "(Source: Ministerstvo zdravotnictví České republiky)"
  )
base_map +
  geom_point(
    data = my_df,
    aes(x = Lon, y = Lat, color = InitialInfections, size = InitialInfections)
  ) +
  scale_color_continuous(name = "Active Cases", low = "pink", high = "red", na.value = "grey50") +
  scale_size_continuous(name = "Active Cases") +
  theme(legend.position = "bottom", legend.text.align = 1, legend.title.align = 0.5)
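As a side note to the size mapping above, here is a minimal sketch of one possible variation, not part of the original answer: `scale_size_area()` maps a value of 0 to a bubble of area 0, so bubble area stays proportional to the case count. The `demo_df` data frame is a hypothetical three-row subset of the values in the DATA section, used only to keep the example self-contained, and `max_size = 12` and `alpha = 0.7` are arbitrary choices.

```r
library(ggplot2)

# Hypothetical three-row subset of my_df (values copied from the DATA section)
demo_df <- data.frame(
  Location = c("Prague", "KarlovyVary", "Moravian-Silesian"),
  Lat = c(50.083333, 50.230556, 49.988449),
  Lon = c(14.416667, 12.8725, 17.464759),
  InitialInfections = c(690L, 25L, 1202L)
)

bubble_plot <- ggplot(demo_df, aes(x = Lon, y = Lat)) +
  # Map the active cases to both size and colour, as in the answer
  geom_point(aes(size = InitialInfections, color = InitialInfections), alpha = 0.7) +
  # Area-proportional sizing: a count of 0 would be drawn with area 0
  scale_size_area(name = "Active Cases", max_size = 12) +
  scale_color_gradient(name = "Active Cases", low = "pink", high = "red") +
  # Approximate map aspect ratio for plain longitude/latitude axes
  coord_quickmap()

bubble_plot
```

The same two scale calls could presumably replace `scale_size_continuous()` in the `base_map + geom_point(...)` chain if area-proportional bubbles are preferred.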
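EDIT: As far as I can tell, you can add a north arrow and a scale bar for non-sf coordinates as well. However, converting to sf objects will automatically pick the right units for the scale bar. To this end, convert both the base map and the points layer to sf objects like so: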
library(ggplot2)
library(maps)
library(dplyr)
library(ggspatial)
library(sf)
worldmap <- map_data("world")
worldmap2 <- dplyr::filter(worldmap, region == "Czech Republic") %>%
  st_as_sf(coords = c("long", "lat"), crs = 4326) %>%
  st_combine() %>%
  st_cast("POLYGON")
base_map <- ggplot(worldmap2) +
  geom_sf(col = "black", fill = "white", size = 1) +
  annotation_north_arrow() +
  annotation_scale(location = "tl") +
  labs(
    title = "COVID-19 in the Czech Republic", subtitle = "As of July 1, 2021", x = "Longitude", y = "Latitude",
    caption = "(Source: Ministerstvo zdravotnictví České republiky)"
  )
my_df <- my_df %>%
  st_as_sf(coords = c("Lon", "Lat"), crs = 4326)
base_map +
  geom_sf(data = my_df, aes(color = InitialInfections, size = InitialInfections)) +
  scale_color_continuous(name = "Active Cases", low = "pink", high = "red", na.value = "grey50") +
  scale_size_continuous(name = "Active Cases") +
  theme(legend.position = "bottom", legend.text.align = 1, legend.title.align = 0.5)
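To illustrate why the coordinate reference system matters for the scale bar, the following minimal sketch (not part of the original answer) measures one degree of longitude and one degree of latitude at about 50° N, roughly the latitude of the Czech Republic. The points and variable names are illustrative assumptions.

```r
library(sf)

# Two hypothetical points one degree of longitude apart at latitude 50 N
pts_ew <- st_sfc(st_point(c(14, 50)), st_point(c(15, 50)), crs = 4326)

# With a geographic CRS set, st_distance() returns a distance in metres
one_degree_lon <- st_distance(pts_ew[1], pts_ew[2])

# Same construction along a meridian, for comparison
pts_ns <- st_sfc(st_point(c(14, 50)), st_point(c(14, 51)), crs = 4326)
one_degree_lat <- st_distance(pts_ns[1], pts_ns[2])

one_degree_lon
one_degree_lat
```

At this latitude a degree of longitude is noticeably shorter than a degree of latitude, so a scale bar needs the CRS information carried by sf objects to report sensible ground distances.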
DATA
my_df <- structure(list(Location = c(
"Prague", "CentralBohemian", "SouthBohemian",
"Plzen", "KarlovyVary", "UstinadLabem", "Liberec", "HradecKralove",
"Pardubice", "Vysocina", "SouthMoravian", "Olomouc", "Zlin",
"Moravian-Silesian"
), Lat = c(
50.083333, 50, 49.083333, 49.7475,
50.230556, 50.658333, 50.685584, 50.209167, 49.951136, 49.6079,
49.363161, 49.593889, 49.29786, 49.988449
), Lon = c(
14.416667,
14.533333, 14.666667, 13.3775, 12.8725, 14.041667, 14.537747,
15.831944, 15.795636, 15.580728, 16.643175, 17.250833, 17.393135,
17.464759
), InitialVaccinated = c(
252944L, 159560L, 93490L, 82014L,
40129L, 104454L, 59442L, 82074L, 65060L, 66325L, 165250L, 89116L,
80125L, 159490L
), InitialExposed = c(
1380L, 1274L, 1048L, 500L,
50L, 1098L, 506L, 42L, 492L, 820L, 1406L, 1090L, 1116L, 2404L
), InitialInfections = c(
690L, 637L, 524L, 250L, 25L, 549L, 253L,
21L, 246L, 410L, 703L, 545L, 558L, 1202L
), InitialRecovered = c(
181947L,
226944L, 97405L, 95944L, 43882L, 120416L, 79029L, 102835L, 91729L,
78308L, 151627L, 90887L, 89163L, 174251L
), InitialDead = c(
2736L,
3421L, 1978L, 1912L, 1484L, 2523L, 1280L, 1811L, 1437L, 1375L,
3412L, 1709L, 1594L, 3521L
)), class = "data.frame", row.names = c(
NA,
-14L
))
【Comments】:
Why is the filter condition region %in% data.frame(countries = "Czech Republic") rather than region == "Czech Republic"? Just curious, I have not seen this formulation before.
@Peter. Haha. Me neither. I just copied and pasted from the OP without looking closely at the filter condition.
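@stefan Thank you for the guidance on creating the bubble plot. The reason I tried to convert my data to an sf object was to add a north arrow and a scale bar, but it kept giving me an error/warning. I even installed ggspatial and added + annotation_scale() + annotation_north_arrow() to the base plot, and both warned me that "True north is not meaningful without coord_sf()". Is there any way to add a north arrow and a scale bar to my_df the way you have coded it?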
Hi @Ash. I just made an edit to add a north arrow and a scale bar. Best, S.
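@stefan Thank you very much for the code. I have adapted it for the Czech Republic and another LMIC, but I am struggling to add state/province boundaries to the existing ggplot; the Czech Republic, for example, has 14 regions. Is there a simple way to add these internal level-1 administrative boundaries?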