Population Pyramid with Gender Surplus in R

Posted 2023-02-16

[Title]: Population Pyramid with Gender surplus in R [Posted]: 2021-11-16 02:45:50 [Question]:

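On ***, there is a wonderful population pyramid that shows the gender surplus. How can I recreate it in R using ggplot2 and/or plotly?

It is essentially a double stacked bar chart rotated by 90 degrees.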

# Here is some population data

library(wpp2019)
# Male
data(popM)
# Female
data(popF)

Wikipedia: Demographics of the United States

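[Comments]:

What have you tried so far?

So far I have only prepared the data; I don't know where to start with building the plot itself.

[Answer 1]:

This is not the tidiest approach, but it should work: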

library(ggplot2)
library(wpp2019)
#> Warning: package 'wpp2019' was built under R version 4.1.1

data(popM)
data(popF)

# Assuming structure of popM and popF is parallel
df <- data.frame(
  age = factor(popM$age, unique(popM$age)),
  male = popM$`2020`,
  female = popF$`2020`
)[popM$name == "World",]

ggplot(df, aes(y = age)) +
  geom_col(aes(x = female, fill = "female surplus"), width = 1) +
  geom_col(aes(x = -male,  fill = "male surplus"), width = 1) +
  geom_col(aes(x =  pmin(male, female), fill = "female"), width = 1) +
  geom_col(aes(x = -pmin(male, female), fill = "male"), width = 1)

Created on 2021-09-22 by the reprex package (v2.0.1)
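To see why drawing four overlapping layers gives the surplus effect, it may help to look at the arithmetic on its own. The sketch below uses made-up counts for three age groups (not wpp2019 data): `pmin(male, female)` is the part of each bar that both sexes share, and whatever remains is the surplus. In the plot above, the full-length bars are drawn first in the surplus colour and the shared part is then drawn over them.

```r
# Made-up counts for three age groups (illustration only, not real data)
male   <- c(10, 8, 5)
female <- c(9, 8, 7)

shared         <- pmin(male, female)  # part of each bar matched by the other sex
male_surplus   <- male - shared       # same as pmax(male - female, 0)
female_surplus <- female - shared     # same as pmax(female - male, 0)

print(data.frame(male, female, shared, male_surplus, female_surplus))
```

Each age group has a surplus on at most one side, which is why only the tip of one bar per row shows the surplus colour.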


[Answer 2]:

In the code below, most of the work goes into reshaping the data, while the ggplot code itself is relatively simple.

library(wpp2019)
library(tidyverse)
data(popM)
data(popF)

list(Male=popM, Female=popF) %>% 
  imap(~.x %>% 
         filter(name=="World") %>% 
         select(age, !!.y:=`2020`)) %>% 
  reduce(full_join) %>% 
  mutate(age = factor(age, levels=unique(age)),
         `Female surplus` = pmax(Female - Male, 0),
         `Male surplus` = pmax(Male - Female, 0),
         Male = Male - `Male surplus`,
         Female = Female - `Female surplus`) %>% 
  pivot_longer(-age) %>%
  mutate(value = case_when(grepl("Male", name) ~ -value, 
                           TRUE ~ value),
         name = factor(name, levels=c("Female surplus", "Female", 
                                      "Male surplus", "Male"))) %>% 
  ggplot(aes(value, age, fill=name)) +
    geom_col() + 
    geom_vline(xintercept=0, colour="white") + 
    scale_x_continuous(labels=abs,
                       breaks=scales::pretty_breaks(6)) +
    labs(x=NULL, y=NULL, fill=NULL) +
    scale_fill_discrete(type=RColorBrewer::brewer.pal(name="RdBu", n=4)[c(1,2,4,3)],
                        breaks=c("Male surplus", "Male", "Female","Female surplus")) +
    theme_bw() +
    theme(legend.position="bottom")
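The sign flip in the pipeline above depends on `grepl("Male", name)` matching only the two male categories. A small check, using made-up values, suggests this holds because `grepl()` is case-sensitive by default, so "Female" (lowercase "m") is not matched:

```r
# The four category names used in the pipeline above
parts <- c("Female surplus", "Female", "Male surplus", "Male")

# Case-sensitive match: only names containing a capital-M "Male" are TRUE
is_male <- grepl("Male", parts)

# Made-up values, flipped to negative for the male side of the pyramid
value  <- c(2, 5, 1, 5)
signed <- ifelse(is_male, -value, value)

print(data.frame(parts, is_male, signed))
```

If the category names were ever changed to lowercase, this pattern would also match "female", so the check would need to be adjusted.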

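As another option, you can place the vertical axis labels between the bars. This version also uses faceting, so the facets can easily be labelled by sex. The legend then only needs to label the surplus portions of the bars.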

library(ggpol)
library(ggthemes)

list(Male=popM, Female=popF) %>% 
  imap(~.x %>% 
         filter(name=="World") %>% 
         select(age, !!.y:=`2020`)) %>% 
  reduce(full_join) %>% 
  mutate(age = factor(age, levels=unique(age)),
         `Female surplus` = pmax(Female - Male, 0),
         `Male surplus` = pmax(Male - Female, 0),
         Male = Male - `Male surplus`,
         Female = Female - `Female surplus`) %>% 
  pivot_longer(-age) %>%
  mutate(facet = factor(ifelse(grepl("Female", name), "Female", "Male"),
                        c("Male","Female")),
         value = case_when(grepl("Male", name) ~ -value, 
                           TRUE ~ value),
         name = factor(name, levels=c("Female surplus", "Female", 
                                      "Male surplus", "Male"))) %>% 
  ggplot(aes(value, age, fill=name)) +
    geom_col() + 
    geom_vline(xintercept=0, colour="white") + 
    scale_x_continuous(labels=abs,
                       breaks=scales::pretty_breaks(3),
                       expand=c(0,0)) +
    labs(x=NULL, y=NULL, fill=NULL) +
    facet_share(vars(facet), scales="free_x") +
    scale_fill_discrete(type=RColorBrewer::brewer.pal(name="RdBu", n=4)[c(1,2,4,3)],
                        breaks=c("Male surplus", "Female surplus")) +
    theme_clean() +
    theme(legend.position="bottom",
          legend.background=element_blank(),
          legend.key.height=unit(4,"mm"),
          legend.margin=margin(t=0), 
          plot.background=element_blank(),
          strip.text=element_text(face="bold", size=rel(0.9)))

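[Comments]:

The second piece of code behaves very strangely for the country "Oman". I also tried it on some other data, and it sometimes does odd things. The first piece of code is very stable.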
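The comment does not say what goes wrong, so the following is only one possible precaution when trying other countries or other data, not a diagnosis. The first answer assumes that `popM` and `popF` have rows in the same order; the sketch below instead matches rows on `age`. `pyramid_data` is a hypothetical helper, and the column names `name`, `age` and `"2020"` are assumed to follow the layout used in the answers. The toy data frames are made up so the example runs without wpp2019.

```r
# Hypothetical helper (not from the original answers): build the plotting data
# for one country by matching rows on age rather than relying on row order.
pyramid_data <- function(pop_m, pop_f, country, year = "2020") {
  m <- pop_m[pop_m$name == country, c("age", year)]
  f <- pop_f[pop_f$name == country, c("age", year)]
  if (nrow(m) == 0 || nrow(f) == 0) {
    stop("No rows found for country: ", country)
  }
  names(m) <- c("age", "male")
  names(f) <- c("age", "female")
  df <- merge(m, f, by = "age", sort = FALSE)
  # Keep the age groups in the order they appear in the male table
  df$age <- factor(df$age, levels = unique(as.character(m$age)))
  df[order(df$age), ]
}

# Toy stand-ins for popM / popF (made-up numbers, deliberately in different row order)
toy_m <- data.frame(name = "A", age = c("0-4", "5-9", "10-14"),
                    `2020` = c(10, 8, 5), check.names = FALSE)
toy_f <- data.frame(name = "A", age = c("10-14", "0-4", "5-9"),
                    `2020` = c(7, 9, 8), check.names = FALSE)

toy <- pyramid_data(toy_m, toy_f, "A")
print(toy)
```

The resulting data frame has the same `age`, `male` and `female` columns as `df` in the first answer, so it should be usable with that `ggplot()` call, for example via `pyramid_data(popM, popF, "Oman")`.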
