Population Pyramid with Gender Surplus in R
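【Title】: Population Pyramid with Gender surplus in R
【Posted】: 2021-11-16 02:45:50
【Question】: On ***, there is a wonderful population pyramid that shows the gender surplus. How can I recreate it in R using ggplot2 and/or plotly?
It is essentially a double stacked bar chart rotated by 90 degrees.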
# Here is some population data
library(wpp2019)
# Male
data(popM)
# Female
data(popF)
Wikipedia: Demographics of the United States
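【Comments】:
What have you tried so far?
So far I have only handled the data; I don't know where to start with creating the plot.
【Answer 1】: This is not the tidiest approach, but it should work: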
library(ggplot2)
library(wpp2019)
#> Warning: package 'wpp2019' was built under R version 4.1.1
data(popM)
data(popF)

# Assuming structure of popM and popF is parallel
df <- data.frame(
  age    = factor(popM$age, unique(popM$age)),
  male   = popM$`2020`,
  female = popF$`2020`
)[popM$name == "World", ]

ggplot(df, aes(y = age)) +
  # Full-length bars first, in the surplus colours
  geom_col(aes(x = female, fill = "female surplus"), width = 1) +
  geom_col(aes(x = -male, fill = "male surplus"), width = 1) +
  # Then overlay the part both sexes share, so only the excess stays visible
  geom_col(aes(x = pmin(male, female), fill = "female"), width = 1) +
  geom_col(aes(x = -pmin(male, female), fill = "male"), width = 1)
Created on 2021-09-22 by the reprex package (v2.0.1)
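The overlay trick in this answer can be checked on a few numbers without drawing anything. The sketch below uses a made-up `toy` data frame (hypothetical values, not wpp2019 data) and computes how much of each bar stays visible once the `pmin(male, female)` layers are painted on top of the full-length bars.

```r
# Hypothetical toy counts; not taken from wpp2019
toy <- data.frame(
  age    = c("0-4", "40-44", "80+"),
  male   = c(10, 8, 2),
  female = c(9, 8, 5)
)

# Width painted over by the "male"/"female" layers on each side
shared <- pmin(toy$male, toy$female)

# What is left showing from the full-length, surplus-coloured bars
visible <- data.frame(
  age            = toy$age,
  shared         = shared,
  male_surplus   = toy$male - shared,
  female_surplus = toy$female - shared
)
print(visible)
```

In every row at most one of the two surplus columns is non-zero, which is why each age group shows a surplus colour on one side only.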
【Comments】:
【Answer 2】: In the code below, most of the work goes into reshaping the data, while the ggplot code is relatively simple.
library(wpp2019)
library(tidyverse)
data(popM)
data(popF)

list(Male = popM, Female = popF) %>%
  # One two-column table (age, <sex>) per sex, for the world in 2020
  imap(~ .x %>%
         filter(name == "World") %>%
         select(age, !!.y := `2020`)) %>%
  reduce(full_join) %>%
  mutate(age = factor(age, levels = unique(age)),
         `Female surplus` = pmax(Female - Male, 0),
         `Male surplus` = pmax(Male - Female, 0),
         # After this, Male and Female hold only the part both sexes share
         Male = Male - `Male surplus`,
         Female = Female - `Female surplus`) %>%
  pivot_longer(-age) %>%
  # Male series go left of zero; grepl() is case-sensitive, so "Female" is not matched
  mutate(value = case_when(grepl("Male", name) ~ -value,
                           TRUE ~ value),
         name = factor(name, levels = c("Female surplus", "Female",
                                        "Male surplus", "Male"))) %>%
  ggplot(aes(value, age, fill = name)) +
  geom_col() +
  geom_vline(xintercept = 0, colour = "white") +
  # Show magnitudes on both sides of the axis
  scale_x_continuous(labels = abs,
                     breaks = scales::pretty_breaks(6)) +
  labs(x = NULL, y = NULL, fill = NULL) +
  scale_fill_discrete(type = RColorBrewer::brewer.pal(name = "RdBu", n = 4)[c(1, 2, 4, 3)],
                      breaks = c("Male surplus", "Male", "Female", "Female surplus")) +
  theme_bw() +
  theme(legend.position = "bottom")
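The sign flip in the pipeline above relies on `grepl()` being case-sensitive: the pattern "Male" matches "Male" and "Male surplus" but not "Female", whose "m" is lower case. The sketch below shows this on the four series names. The `startsWith()` variant is only a suggestion for a stricter test, not part of the original answer, and the `value` vector is made up for illustration.

```r
series <- c("Female surplus", "Female", "Male surplus", "Male")

# Case-sensitive match, as used in the answer
is_male <- grepl("Male", series)

# With lower-case names the same idea breaks: "female" contains "male"
is_male_lower <- grepl("male", tolower(series))

# A stricter test that anchors the match at the start of the name
is_male_strict <- startsWith(tolower(series), "male")

# Flip the sign of the male series only (illustrative values)
value  <- c(3, 9, 0, 9)
signed <- ifelse(is_male, -value, value)
print(data.frame(series, is_male, is_male_lower, is_male_strict, signed))
```

If the column names in your own data differ from `Male` and `Female`, it may be worth running a check like this before trusting the plot.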
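As another option, you can place the vertical axis labels between the bars. This version also uses faceting, so the facets can easily be labelled by gender. The legend then only needs to label the surplus portions of the bars.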
# Assumes library(tidyverse), popM and popF are loaded as in the previous block
library(ggpol)
library(ggthemes)

list(Male = popM, Female = popF) %>%
  imap(~ .x %>%
         filter(name == "World") %>%
         select(age, !!.y := `2020`)) %>%
  reduce(full_join) %>%
  mutate(age = factor(age, levels = unique(age)),
         `Female surplus` = pmax(Female - Male, 0),
         `Male surplus` = pmax(Male - Female, 0),
         Male = Male - `Male surplus`,
         Female = Female - `Female surplus`) %>%
  pivot_longer(-age) %>%
  # One facet per sex; the surplus series go into the facet of their own sex
  mutate(facet = factor(ifelse(grepl("Female", name), "Female", "Male"),
                        c("Male", "Female")),
         value = case_when(grepl("Male", name) ~ -value,
                           TRUE ~ value),
         name = factor(name, levels = c("Female surplus", "Female",
                                        "Male surplus", "Male"))) %>%
  ggplot(aes(value, age, fill = name)) +
  geom_col() +
  geom_vline(xintercept = 0, colour = "white") +
  scale_x_continuous(labels = abs,
                     breaks = scales::pretty_breaks(3),
                     expand = c(0, 0)) +
  labs(x = NULL, y = NULL, fill = NULL) +
  # facet_share() comes from ggpol and puts the shared age axis between the panels
  facet_share(vars(facet), scales = "free_x") +
  # Only the surplus series appear in the legend
  scale_fill_discrete(type = RColorBrewer::brewer.pal(name = "RdBu", n = 4)[c(1, 2, 4, 3)],
                      breaks = c("Male surplus", "Female surplus")) +
  theme_clean() +
  theme(legend.position = "bottom",
        legend.background = element_blank(),
        legend.key.height = unit(4, "mm"),
        legend.margin = margin(t = 0),
        plot.background = element_blank(),
        strip.text = element_text(face = "bold", size = rel(0.9)))
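Both plots in this answer draw the male series as negative values and then hide the sign in the tick labels. A minimal sketch of such a label function is shown below. `mirror_labels` is a hypothetical name, and the thousands separator is an optional extra that the original code does not use; it could be passed to the `labels` argument of `scale_x_continuous()`.

```r
# Hypothetical helper: show magnitudes on a mirrored axis
mirror_labels <- function(x) {
  # scientific = FALSE avoids labels such as "1e+05";
  # trim = TRUE stops format() from padding labels to a common width
  format(abs(x), big.mark = ",", scientific = FALSE, trim = TRUE)
}

print(mirror_labels(c(-2500, 0, 1500)))
```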
【Comments】:
The second code behaves very strangely for the country "Oman". I also tried it on some other data, and it sometimes does odd things. The first code is very stable.