How to customize text in ggplot2 facet_grid label when using purrr?

Posted 2023-02-16

Asked: 2021-10-24 20:45:10

Question:

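I am using purrr and ggplot2 to create multiple plots at once. For each facet's name, I want to keep the group's name, but I also want to add the number of participants in each subgroup, for example "Manager (N = 200)" and "Employee (N = 3000)". However, when I try to add this labeller argument: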

    facet_grid(~.data[[group]],
               labeller = paste0(~.data[[group]], "(N = ", group_n$n, ")"))

I get this error:

Error in cbind(labels = list(), list(``, if (!is.null(.rows) || !is.null(.cols))  : 
  number of rows of matrices must match (see arg 2)

Below is a reproducible example with a simplified dataset. My goal is for each facet title to include the subgroup and its sample size.

library(purrr)
library(dplyr)
library(tidyr)   # drop_na() and expand_grid() come from tidyr
library(ggplot2)

#Data
test <- tibble(s1 = c("Agree", "Neutral", "Strongly disagree"),
               s2rl = c("Agree", "Neutral", NA),
               f1 = c("Strongly agree", NA, "Strongly disagree"),
               f2rl = c(NA, "Disagree", "Strongly disagree"),
               level = c("Manager", "Employee", "Employee"),
               location = c("USA", "USA", "AUS"))

#Get just test items for name
test_items <- test %>%
  dplyr::select(s1, s2rl, f1, f2rl)

#titles of plots for R to iterate over
titles <- c("S1 results", "Results for S2RL", "Fiscal Results for F1", "Financial Status of F2RL")


#group levels
group_name <- c("level", "location")

#Custom function to make plots

facet_plots = function(variable, group, title) {
  total_n <- test %>%
    summarize(n = sum(!is.na(.data[[variable]])))

  group_n <- test %>%
    group_by(.data[[group]], .data[[variable]]) %>%
    summarize(n = sum(!is.na(.data[[variable]])))

  plot2 <- test %>%
    count(.data[[group]], .data[[variable]]) %>%
    mutate(percent = 100*(n / group_n$n)) %>%
    drop_na() %>%
    ggplot(aes(x = .data[[variable]], y = percent, fill = .data[[variable]])) + 
    geom_bar(stat = "identity") +
    geom_text(aes(label= paste0(percent, "%"), fontface = "bold", family = "Arial", size=14), vjust= 0, hjust = -.5) +
    ylab("\nPercentage") +
    labs(
      title = title,
      subtitle = paste0("(N = ", total_n$n)) +
    coord_flip() +
    theme_minimal() +
    ylim(0, 100) +
    facet_grid(~.data[[group]],
               labeller = paste0(~.data[[group]], "(N = ", group_n$n, ")")) #issue is likely here

  output <- list(plot2)
  return(output)
}


#pmap call
my_plots <- expand_grid(tibble(item = names(test_items), title=titles),
                        group = group_name) %>%
  pmap(function(item, group, title)
    facet_plots(item, group, title))

my_plots

Edit: I also tried the solution detailed here, and I got the same error.


Answer 1:

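The code below plots, for each level of group, the percentage of observations with each value of variable, and labels each facet with both the group name and its count.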

library(tidyr)
library(dplyr)
library(ggplot2)
library(purrr)
facet_plots <- function(variable, group, title = "Title", dat) {
    # pmap() passes the column names as strings; convert them to symbols
    variable <- sym(variable)
    group <- sym(group)
    sumdat <- dat %>%
        filter(!is.na(!!variable)) %>%
        group_by(!!group) %>%
        add_count() %>%    # n = number of non-missing rows in each group
        mutate(lbl = paste0(!!group, " (N = ", n, ")")) %>%
        group_by(!!group, !!variable) %>%
        mutate(pct = 100 * n() / n) %>%
        slice(1L) %>%      # keep one row per group/variable combination
        ungroup() %>%
        select(!!variable, !!group, n, pct, lbl)

    ggplot(sumdat, aes(x = !!variable, y = pct, group = !!group)) +
        geom_bar(stat = "identity") +
        labs(
            title = title
        ) +
        facet_grid(~lbl)    # facet on the prebuilt "Group (N = n)" label
}


## Using starwars data
expand_grid(
    tibble(
        variable = c("hair_color", "skin_color", "birth_year"),
        title = c("Hair color", "Skin color", "Birth year")
    ),
    group = c("sex", "gender")) %>%
    mutate(title = paste(title, "by", group)) %>%
    pmap(facet_plots, dat = starwars)

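pmap() iterates over a data frame of combinations whose entries are stored as strings, so the arguments to facet_plots() arrive as strings. The first two lines of the function convert the strings variable and group into symbols that R can use unquoted (read more here on what that means). The "bang-bang operator" !! tells R that you want the value stored in the variable rather than the variable's own name (see help("!!")). Whenever R sees !!variable, it treats it as the name of the data frame column that is stored in the argument variable.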
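To make the sym() and !! step concrete, here is a minimal sketch. The helper count_by() and the toy data are hypothetical and not part of the answer; rlang::sym() is called with its namespace here, whereas the answer calls sym() unqualified.

```r
library(dplyr)

# Hypothetical helper: count rows per value of a column whose name
# arrives as a string (as it would from pmap()).
count_by <- function(dat, col_name) {
  col <- rlang::sym(col_name)    # "level" becomes the symbol `level`
  dat %>%
    filter(!is.na(!!col)) %>%    # !!col injects `level` into the expression
    count(!!col)
}

# Toy data, assumed for illustration only
toy <- tibble(level = c("Manager", "Employee", "Employee", NA))
res <- count_by(toy, "level")
res
```

Without !!, filter() would look for a column literally named col instead of the column that col refers to.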

Below, I show that this works with the OP's original data, not just the starwars example data.

## Using OP's data
test <- tibble(s1 = c("Agree", "Neutral", "Strongly disagree"),
               s2rl = c("Agree", "Neutral", NA),
               f1 = c("Strongly agree", NA, "Strongly disagree"),
               f2rl = c(NA, "Disagree", "Strongly disagree"),
               level = c("Manager", "Employee", "Employee"),
               location = c("USA", "USA", "AUS"))

expand_grid(
    tibble(
        variable = c("s1", "s2rl", "f1", "f2rl"),
        title = c("S1 results", "Results for S2RL", 
                  "Fiscal Results for F1", "Financial Status of F2RL")
    ),
    group = c("level", "location")
) %>%
    mutate(title = paste(title, "by", group)) %>%
    pmap(facet_plots, dat = test)

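I think your labeller is not working because you are passing it the wrong type. The labeller() function takes arguments of the form var = fxn, where var is the name of a variable in the facet grid and fxn is a function describing how to transform its labels. You passed it the data (a formula) and then the result of a paste0() call that draws on a separate vector.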
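As an alternative to building the label column beforehand, the var = fxn form can be sketched as follows. This is an illustration under assumptions rather than the answer's method: it uses only the s1 and level columns of the question's data, and it relies on the documented behaviour that labeller() also accepts a named character vector as a lookup table. The names counts, lookup and p are chosen here for illustration.

```r
library(dplyr)
library(ggplot2)

# Subset of the question's data (two columns only)
test <- tibble(
  s1    = c("Agree", "Neutral", "Strongly disagree"),
  level = c("Manager", "Employee", "Employee")
)

# Count non-missing responses per group, then build a named lookup vector:
# names are the raw facet values, values are the labels to display.
counts <- test %>%
  filter(!is.na(s1)) %>%
  count(level)
lookup <- setNames(paste0(counts$level, " (N = ", counts$n, ")"), counts$level)

# labeller() maps the faceting variable `level` to the lookup table
p <- ggplot(filter(test, !is.na(s1)), aes(x = s1)) +
  geom_bar() +
  facet_grid(~level, labeller = labeller(level = lookup))
```

Inside a function where the faceting column is only known as a string, passing as_labeller(lookup) as the labeller should serve the same purpose, since it does not need the variable name.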

Comments:

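Thanks for your response! What pmap is doing is iterating over every combination of my variables and groups (e.g., S1 by location, S1 by employee, S2rl by location, etc.); in the starwars dataset, that would be the equivalent of getting one plot of hair color by sex, one of eye color by sex, and so on. Something about the iteration confuses R, and I run into the same problem you did whenever I do more than one specific variable/group.

Ah, I see. I think I fixed it so that you can use pmap, and I included an example. I used the starwars data because I had a hard time understanding exactly what you wanted from the data. However, I changed it so that you can use whatever data you like through the dat argument of facet_plots.

Thank you, that works! It says I can't award the bounty for another 18 hours, but I will. If you don't mind, a few questions: what do {{ }} and !! mean? Also, can you explain what slice(1L) and sym are doing here? Thanks again for your help!

I expanded the explanation to try to give better context for why it works. The embrace operator is explained well here: ***.com/a/62792494/12586249. The bang-bang operator !! does the same thing, so I changed the answer to use it consistently. The slice(1L) call just takes the first observation of each variable/group combination (1L is a way of writing the integer 1), because the same n and pct values are repeated for every observation within a variable/group combination. We only need one, so we take the first. Hope that helps!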
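The slice(1L) point from the last comment can be seen on a toy data frame. The data below is made up for illustration; it mimics the repeated n and pct values that the answer's pipeline produces before slicing.

```r
library(dplyr)

# Made-up data: the summary values repeat on every row of a combination
toy <- tibble(
  grp = c("a", "a", "b"),
  var = c("x", "x", "y"),
  n   = c(2L, 2L, 1L),
  pct = c(100, 100, 100)
)

# Keep only the first row of each grp/var combination
one_per_combo <- toy %>%
  group_by(grp, var) %>%
  slice(1L) %>%
  ungroup()
one_per_combo
```

Because every row within a combination carries identical summary values, keeping the first row loses no information.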
