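Creating a multiple column facet function
[Posted]: 2020-02-28 06:16:15
[Question]:
I am trying to create a facet_multi_col() function, similar to the facet_col() function in ggforce, which allows a facet layout with a space argument (not available in facet_wrap()), but over multiple columns. As in the last plot below (created with grid.arrange()), I do not want the facets to necessarily align across rows, because the height of each facet will vary with the categorical y variable that I wish to use.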
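Having read the extension guide, I find myself well out of my depth with ggproto. I think the best approach is to pass a layout matrix that dictates where to break columns for the corresponding subsets of the data, and to build on facet_col in ggforce to include a space parameter (see the end of the question).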
A quick illustration of my unsatisfactory options
No facet
library(tidyverse)
library(gapminder)
global_tile <- ggplot(data = gapminder, mapping = aes(x = year, y = fct_rev(country), fill = lifeExp)) +
geom_tile()
global_tile
I want to break the plot down by continent. I do not want such a long figure.
facet_wrap()
global_tile +
facet_wrap(facets = "continent", scales = "free")
facet_wrap() does not have a space argument, which means the tiles are a different size in each continent, and using coord_equal() throws an error.
facet_col() in ggforce
library(ggforce)
global_tile +
facet_col(facets = "continent", scales = "free", space = "free", strip.position = "right") +
theme(strip.text.y = element_text(angle = 0))
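I like the strips on the side. The space argument sets all tiles to the same size. It is still too long to fit on a page.
grid.arrange() in gridExtra
Add a column to the data for where each continent should be placed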
d <- gapminder %>%
as_tibble() %>%
mutate(col = as.numeric(continent),
col = ifelse(test = continent == "Europe", yes = 2, no = col),
col = ifelse(test = continent == "Oceania", yes = 3, no = col))
head(d)
# # A tibble: 6 x 7
# country continent year lifeExp pop gdpPercap col
# <fct> <fct> <int> <dbl> <int> <dbl> <dbl>
# 1 Afghanistan Asia 1952 28.8 8425333 779. 3
# 2 Afghanistan Asia 1957 30.3 9240934 821. 3
# 3 Afghanistan Asia 1962 32.0 10267083 853. 3
# 4 Afghanistan Asia 1967 34.0 11537966 836. 3
# 5 Afghanistan Asia 1972 36.1 13079460 740. 3
# 6 Afghanistan Asia 1977 38.4 14880372 786. 3
tail(d)
# # A tibble: 6 x 7
# country continent year lifeExp pop gdpPercap col
# <fct> <fct> <int> <dbl> <int> <dbl> <dbl>
# 1 Zimbabwe Africa 1982 60.4 7636524 789. 1
# 2 Zimbabwe Africa 1987 62.4 9216418 706. 1
# 3 Zimbabwe Africa 1992 60.4 10704340 693. 1
# 4 Zimbabwe Africa 1997 46.8 11404948 792. 1
# 5 Zimbabwe Africa 2002 40.0 11926563 672. 1
# 6 Zimbabwe Africa 2007 43.5 12311143 470. 1
Plot each column using facet_col()
g <- list()
for (i in unique(d$col)) {
  g[[i]] <- d %>%
    filter(col == i) %>%
    ggplot(mapping = aes(x = year, y = fct_rev(country), fill = lifeExp)) +
    geom_tile() +
    facet_col(facets = "continent", scales = "free_y", space = "free", strip.position = "right") +
    theme(strip.text.y = element_text(angle = 0)) +
    # avoid legends in every column
    guides(fill = "none") +
    labs(x = "", y = "")
}
Create a legend using get_legend() in cowplot
library(cowplot)
gg <- ggplot(data = d, mapping = aes(x = year, y = country, fill = lifeExp)) +
geom_tile()
leg <- get_legend(gg)
Create a layout matrix with heights based on the number of countries in each column.
m <-
d %>%
group_by(col) %>%
summarise(row = n_distinct(country)) %>%
rowwise() %>%
mutate(row = paste(1:row, collapse = ",")) %>%
separate_rows(row) %>%
mutate(row = as.numeric(row),
col = col,
p = col) %>%
xtabs(formula = p ~ row + col) %>%
cbind(max(d$col) + 1) %>%
ifelse(. == 0, NA, .)
head(m)
# 1 2 3
# 1 1 2 3 4
# 2 1 2 3 4
# 3 1 2 3 4
# 4 1 2 3 4
# 5 1 2 3 4
# 6 1 2 3 4
tail(m)
# 1 2 3
# 50 1 2 NA 4
# 51 1 2 NA 4
# 52 1 2 NA 4
# 53 NA 2 NA 4
# 54 NA 2 NA 4
# 55 NA 2 NA 4
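The structure of this layout matrix may be easier to see in a base R sketch. The snippet below is only an illustration, not the code used above: the per-column country counts (52, 55 and 35) are assumed from the gapminder data as grouped by the col variable, and build_layout_matrix() and counts are hypothetical names.

```r
# Hypothetical helper: one matrix column per plot column, one row per country.
# Plot column j fills its first counts[j] rows with j and pads the rest with NA;
# a final column holds the index of the legend grob.
build_layout_matrix <- function(counts) {
  n_rows <- max(counts)
  cols <- sapply(seq_along(counts), function(j) {
    c(rep(j, counts[j]), rep(NA, n_rows - counts[j]))
  })
  cbind(cols, length(counts) + 1)
}

# Assumed counts: Africa (52), Americas + Europe (25 + 30), Asia + Oceania (33 + 2)
counts <- c(52, 55, 35)
m2 <- build_layout_matrix(counts)
dim(m2)
tail(m2, 3)
```

Each distinct number marks the cells that one grob passed to grid.arrange() should occupy, so a column with fewer countries simply ends in NA cells.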
Use grid.arrange() in gridExtra to bring g and leg together
library(gridExtra)
grid.arrange(g[[1]], g[[2]], g[[3]], leg, layout_matrix = m, widths=c(0.32, 0.32, 0.32, 0.06))
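This is almost what I am after, but I am not satisfied because a) the tiles in different columns have different widths, as the lengths of the longest country and continent names are not equal, and b) it is a lot of code that needs to be tweaked each time I want to make a plot like this. With other data I want to arrange the facets by regions, e.g. "Western Europe", rather than continents, or the number of countries changes (there are no Central Asian countries in the gapminder data).
Progress with creating a facet_multi_cols() function
I want to pass a layout matrix to a facet function, where the matrix refers to each facet, and the function can then work out the heights from the number of spaces in each panel. For the example above the matrix would be: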
my_layout <- matrix(c(1, NA, 2, 3, 4, 5), nrow = 2)
my_layout
# [,1] [,2] [,3]
# [1,] 1 2 4
# [2,] NA 3 5
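To make the intended reading of such a matrix concrete, here is a minimal base R sketch that turns it into a table with one row per panel and its row and column position. The helper name layout_positions() is hypothetical and is not part of ggplot2 or ggforce.

```r
# Hypothetical helper: turn a layout matrix into a table of panel positions.
layout_positions <- function(layout) {
  stopifnot(is.matrix(layout))
  pos <- data.frame(
    PANEL = c(layout),       # panel number stored in each cell
    ROW   = c(row(layout)),  # row index of each cell
    COL   = c(col(layout))   # column index of each cell
  )
  pos <- pos[!is.na(pos$PANEL), ]   # drop empty cells
  pos <- pos[order(pos$PANEL), ]    # keep panels in facet order
  rownames(pos) <- NULL
  pos
}

my_layout <- matrix(c(1, NA, 2, 3, 4, 5), nrow = 2)
pos <- layout_positions(my_layout)
pos
```

With this matrix, panels 4 and 5 end up in column 3, stacked in rows 1 and 2.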
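As mentioned above, I have been adapting the code in facet_col() to try to build a facet_multi_col() function. I have added a layout argument to supply a matrix such as my_layout above, with the idea that, for example, the fourth and fifth levels of the variable given to the facets argument are plotted in the third column.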
facet_multi_col <- function(facets, layout, scales = "fixed", space = "fixed",
                            shrink = TRUE, labeller = "label_value",
                            drop = TRUE, strip.position = "top") {
  # add space argument as in facet_col
  space <- match.arg(space, c("free", "fixed"))
  # facet_wrap() has no `col` argument; a single column is used as the starting point
  facet <- facet_wrap(facets, ncol = 1, scales = scales, shrink = shrink,
                      labeller = labeller, drop = drop, strip.position = strip.position)
  params <- facet$params
  # store the layout matrix alongside the other parameters instead of overwriting them
  params$layout <- layout
  params$space_free <- space == "free"
  ggproto(NULL, FacetMultiCols, shrink = shrink, params = params)
}

FacetMultiCols <- ggproto("FacetMultiCols", FacetWrap,
  # from FacetCols to allow for space argument to work
  draw_panels = function(self, panels, layout, x_scales, y_scales, ranges, coord, data, theme, params) {
    combined <- ggproto_parent(FacetWrap, self)$draw_panels(panels, layout, x_scales, y_scales, ranges, coord, data, theme, params)
    if (params$space_free) {
      widths <- vapply(layout$PANEL, function(i) diff(ranges[[i]]$x.range), numeric(1))
      panel_widths <- unit(widths, "null")
      combined$widths[panel_cols(combined)$l] <- panel_widths
    }
    combined
  },
  # adapt FacetWrap layout to set position on panels following the matrix given to layout in facet_multi_col().
  # compute_layout() receives only the data and the parameters
  compute_layout = function(self, data, params) {
    layout <- ggproto_parent(FacetWrap, self)$compute_layout(data, params)
    # ???
  }
)
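I think I need to write something for the compute_layout part, but I am struggling to figure out how to do this.
[Comments]:
Have you tried making a list of plots, one per continent, and aligning them with one of the packages such as cowplot or patchwork? It might be easier than building the ggproto.
@camille I kind of did that in the grid.arrange example above, unless you mean something different? I think there would be the same problem with different label lengths in each column?
I am imagining something similar to that, but those layout packages may align things better than grid.arrange. It is a really long post, so it is hard to follow everything you have tried. It is a little hacky, but you could try a monospaced (or closer to evenly spaced) font for the labels so that their lengths are more predictable. You could even pad the labels with blank spaces so that the text is closer to the same length.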
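The padding idea from the last comment could look roughly like the base R sketch below. pad_labels() is a hypothetical helper name, and the approach only gives equal rendered widths if the labels are drawn in a monospaced font, which is an assumption about the theme in use.

```r
# Hypothetical helper: right-pad labels with spaces so all have the same
# number of characters (only visually equal in a monospaced font).
pad_labels <- function(x) {
  format(as.character(x), justify = "left")
}

labels <- c("Chad", "United Kingdom", "Fiji")
padded <- pad_labels(labels)
nchar(padded)
```

The padded strings could then be used as factor levels for the y axis, keeping the original order of the countries.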
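[Answer 1]:
Disclaimer
I have never developed a facet, but I found this question interesting and challenging, so I gave it a try. It is not perfect yet, and it is far from tested against all the subtleties that could occur depending on your plot, but it is a first draft that you can build on.
Idea
facet_wrap lays out the panels in a table, and each row has a certain height which the panel fully occupies. The documentation of gtable_add_grob says:
"In the gtable model, grobs always fill up the complete table cell. If you want custom justification you might need to define the grob dimension in absolute units, or put it into another gtable that can then be added to the gtable instead of the grob."
This could be an interesting solution. However, I was not sure how to pursue it, so I took a different approach: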
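1. Create a custom layout based on the passed layout parameter.
2. Let facet_wrap render all the panels with respect to that layout.
3. Use gtable_filter to grab each panel, including its axes and strips.
4. Create a layout matrix. I tried two approaches: using the minimum number of rows and playing with height differences, or simply adding roughly as many rows as there are ticks on the y axis. Both work similarly; the latter produces cleaner code, so I would use that one.
5. Use gridExtra::arrangeGrob to arrange the panels according to the passed design and the created layout matrix.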
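The "one row per tick" variant of step 4 can be sketched in base R as follows. This is a simplified illustration rather than the code of this answer: expand_layout() is a hypothetical name, and the heights are assumed to be the number of countries per continent in gapminder, arranged like matrix(c(1, NA, 2, 3, 4, 5), nrow = 2).

```r
# Hypothetical helper: `heights` has the same shape as the facet layout and
# holds the number of rows each panel should span (NA = no panel).
expand_layout <- function(heights) {
  n_rows <- max(colSums(heights, na.rm = TRUE))
  # number the non-NA cells in column-major order; these index the grob list
  ids <- heights
  ids[!is.na(ids)] <- seq_len(sum(!is.na(ids)))
  sapply(seq_len(ncol(heights)), function(j) {
    keep <- !is.na(heights[, j])
    # repeat each panel id once per row it should occupy
    filled <- rep(ids[keep, j], times = heights[keep, j])
    # pad shorter columns with NA so all columns have the same length
    c(filled, rep(NA, n_rows - length(filled)))
  })
}

# Assumed heights: Africa 52 | Americas 25, Asia 33 | Europe 30, Oceania 2
heights <- matrix(c(52, NA, 25, 33, 30, 2), nrow = 2)
lm <- expand_layout(heights)
dim(lm)
```

A matrix of this shape is what the layout_matrix argument of gridExtra::arrangeGrob expects: every cell names the grob that occupies it, and NA cells stay empty.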
Results
The full code is a bit lengthy, but can be found below. Here are some graphs:
my_layout1 <- matrix(c(1, NA, 2, 3, 4, 5), nrow = 2)
my_layout2 <- matrix(c(1, 2, 3, 4, 5, NA), ncol = 2)
## Ex1
global_tile + facet_multi_col("continent", my_layout1, scales = "free_y",
space = "free", strip.position = "top")
## Ex 2
global_tile + facet_multi_col("continent", my_layout1, scales = "free_y",
space = "free", strip.position = "right")
## Ex 3 - shows that we need a minimum space for any plot
global_tile + facet_multi_col("continent", my_layout1, scales = "free_y",
space = "free", strip.position = "top", min_prop = 0)
## Ex 4
global_tile + facet_multi_col("continent", my_layout1, scales = "free_y",
space = "fixed", strip.position = "right")
## Ex 5
global_tile + facet_multi_col("continent", my_layout2, scales = "free_y",
space = "free")
[Figures: Ex 1, Ex 2, Ex 3, Ex 4, Ex 5]
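Limitations
The code is far from foolproof. I already see some issues:
- We (silently) assume that each column in the design starts with a non-NA value. In general, production code would need to check the passed layout carefully (do the dimensions fit? are there as many entries as panels? etc.).
- Very small panels do not render well, so I had to add a minimum value for the height, depending on the position of the strip.
- The effects of moving or adding axes or strips have not been tested yet.
Code: one row per tick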
## get strip and axis of a given panel
## Assumptions:
## - axis are adjacent to the panel, that is exactly +1/-1 positions to the t/b/l/r ...
## - ... unless there is a strip then it is +2/-2
get_whole_panel <- function(panel_name,
                            table_layout) {
  target <- table_layout$layout %>%
    dplyr::filter(name == panel_name) %>%
    dplyr::select(row = t, col = l)
  stopifnot(NROW(target) == 1)
  pos <- unlist(target)
  dirs <- list(t = c(-1, 0),
               b = c(1, 0),
               l = c(0, -1),
               r = c(0, 1))
  filter_elems <- function(dir,
                           type = c("axis", "strip")) {
    type <- match.arg(type)
    new_pos <- pos + dir
    res <- table_layout$layout %>%
      dplyr::filter(grepl(type, name),
                    l == new_pos["col"],
                    t == new_pos["row"]) %>%
      dplyr::pull(name)
    if (length(res)) res else NA
  }
  strip <- purrr::map_chr(dirs, filter_elems, type = "strip")
  strip <- strip[!is.na(strip)]
  dirs[[names(strip)]] <- 2 * dirs[[names(strip)]]
  axes <- purrr::map_chr(dirs, filter_elems, type = "axis")
  gtable::gtable_filter(table_layout, paste(c(panel_name, axes, strip), collapse = "|"))
}
facet_multi_col <- function(facets, layout, scales = "fixed", space = "fixed",
                            shrink = TRUE, labeller = "label_value",
                            drop = TRUE, strip.position = "top",
                            min_prop = ifelse(strip.position %in% c("top", "bottom"),
                                              0.12, 0.1)) {
  space <- match.arg(space, c("free", "fixed"))
  if (space == "free") {
    ## if we ask for free space we need scales everywhere, so make sure they are included
    scales <- "free"
  }
  facet <- facet_wrap(facets, ncol = 1, scales = scales, shrink = shrink,
                      labeller = labeller, drop = drop, strip.position = strip.position)
  params <- facet$params
  params$space_free <- space == "free"
  params$layout <- layout
  params$parent <- facet
  params$min_prop <- min_prop
  ggproto(NULL, FacetMultiCol, shrink = shrink, params = params)
}
render <- function(self, panels, layout,
                   x_scales, y_scales, ranges,
                   coord, data, theme, params) {
  combined <- ggproto_parent(FacetWrap, self)$draw_panels(panels, layout,
                                                          x_scales, y_scales, ranges,
                                                          coord, data, theme, params)
  if (params$space_free) {
    panel_names <- combined$layout$name
    panels <- lapply(panel_names[grepl("panel", panel_names)],
                     get_whole_panel,
                     table_layout = combined)
    ## remove zeroGrob panels
    zG <- sapply(panels, function(tg) all(sapply(tg$grobs, ggplot2:::is.zero)))
    panels <- panels[!zG]
    ## calculate height for each panel
    heights <- matrix(NA, NROW(params$layout), NCOL(params$layout))
    ## store the rounded range in the matrix cell corresponding to its position
    ## allow for a minimum space in dependence of the overall number of rows to
    ## render small panels well
    heights[as.matrix(layout[, c("ROW", "COL")])] <- vapply(ranges, function(r)
      round(diff(r$y.range), 0), numeric(1))
    ## 12% should be the minimum height used by any panel if strip is on top otherwise 10%
    ## these values are empirical and can be changed
    min_height <- round(params$min_prop * max(colSums(heights, TRUE)), 0)
    heights[heights < min_height] <- min_height
    idx <- c(heights)
    idx[!is.na(idx)] <- seq_along(idx[!is.na(idx)])
    len_out <- max(colSums(heights, TRUE))
    i <- 0
    layout_matrix <- apply(heights, 2, function(col) {
      res <- unlist(lapply(col, function(n) {
        i <<- i + 1
        mark <- idx[i]
        if (is.na(n)) {
          NA
        } else {
          rep(mark, n)
        }
      }))
      len <- length(res)
      if (len < len_out) {
        res <- c(res, rep(NA, len_out - len))
      }
      res
    })
    ## set width of left axis to maximum width to align plots
    max_width <- max(do.call(grid::unit.c, lapply(panels, function(gt) gt$widths[1])))
    panels <- lapply(panels, function(p) {
      p$widths[1] <- max_width
      p
    })
    combined <- gridExtra::arrangeGrob(grobs = panels,
                                       layout_matrix = layout_matrix,
                                       as.table = FALSE)
    ## add name, such that find_panel can find the plotting area
    combined$layout$name <- paste("panel_", layout$LAB)
  }
  combined
}
layout <- function(data, params) {
  parent_layout <- params$parent$compute_layout(data, params)
  msg <- paste0("invalid ",
                sQuote("layout"),
                ". Falling back to ",
                sQuote("facet_wrap"),
                " layout")
  if (is.null(params$layout) ||
      !is.matrix(params$layout)) {
    warning(msg)
    parent_layout
  } else {
    ## smash layout into vector and remove NAs all done by sort
    layout <- params$layout
    panel_numbers <- sort(layout)
    if (!isTRUE(all.equal(sort(as.numeric(as.character(parent_layout$PANEL))),
                          panel_numbers))) {
      warning(msg)
      parent_layout
    } else {
      ## all good
      indices <- cbind(ROW = c(row(layout)),
                       COL = c(col(layout)),
                       PANEL = c(layout))
      indices <- indices[!is.na(indices[, "PANEL"]), ]
      ## delete row and col number from parent layout
      parent_layout$ROW <- parent_layout$COL <- NULL
      new_layout <- merge(parent_layout,
                          indices,
                          by = "PANEL") %>%
        dplyr::arrange(PANEL)
      new_layout$PANEL <- factor(new_layout$PANEL)
      labs <- new_layout %>%
        dplyr::select(-PANEL,
                      -SCALE_X,
                      -SCALE_Y,
                      -ROW,
                      -COL) %>%
        dplyr::mutate(sep = "_") %>%
        do.call(paste, .)
      new_layout$LAB <- labs
      new_layout
    }
  }
}
FacetMultiCol <- ggproto("FacetMultiCol", FacetWrap,
compute_layout = layout,
draw_panels = render)
Code: rows with different heights
## get strip and axis of a given panel
## Assumptions:
## - axis are adjacent to the panel, that is exactly +1/-1 positions to the t/b/l/r ...
## - ... unless there is a strip then it is +2/-2
get_whole_panel <- function(panel_name,
                            table_layout) {
  target <- table_layout$layout %>%
    dplyr::filter(name == panel_name) %>%
    dplyr::select(row = t, col = l)
  stopifnot(NROW(target) == 1)
  pos <- unlist(target)
  dirs <- list(t = c(-1, 0),
               b = c(1, 0),
               l = c(0, -1),
               r = c(0, 1))
  filter_elems <- function(dir,
                           type = c("axis", "strip")) {
    type <- match.arg(type)
    new_pos <- pos + dir
    res <- table_layout$layout %>%
      dplyr::filter(grepl(type, name),
                    l == new_pos["col"],
                    t == new_pos["row"]) %>%
      dplyr::pull(name)
    if (length(res)) res else NA
  }
  strip <- purrr::map_chr(dirs, filter_elems, type = "strip")
  strip <- strip[!is.na(strip)]
  dirs[[names(strip)]] <- 2 * dirs[[names(strip)]]
  axes <- purrr::map_chr(dirs, filter_elems, type = "axis")
  gtable::gtable_filter(table_layout, paste(c(panel_name, axes, strip), collapse = "|"))
}
facet_multi_col <- function(facets, layout, scales = "fixed", space = "fixed",
                            shrink = TRUE, labeller = "label_value",
                            drop = TRUE, strip.position = "top") {
  space <- match.arg(space, c("free", "fixed"))
  if (space == "free") {
    ## if we ask for free space we need scales everywhere, so make sure they are included
    scales <- "free"
  }
  facet <- facet_wrap(facets, ncol = 1, scales = scales, shrink = shrink,
                      labeller = labeller, drop = drop, strip.position = strip.position)
  params <- facet$params
  params$space_free <- space == "free"
  params$layout <- layout
  params$parent <- facet
  ggproto(NULL, FacetMultiCol, shrink = shrink, params = params)
}
render <- function(self, panels, layout,
                   x_scales, y_scales, ranges,
                   coord, data, theme, params) {
  combined <- ggproto_parent(FacetWrap, self)$draw_panels(panels, layout,
                                                          x_scales, y_scales, ranges,
                                                          coord, data, theme, params)
  if (params$space_free) {
    panel_names <- combined$layout$name
    panels <- lapply(panel_names[grepl("panel", panel_names)],
                     get_whole_panel,
                     table_layout = combined)
    ## remove zeroGrob panels
    zG <- sapply(panels, function(tg) all(sapply(tg$grobs, ggplot2:::is.zero)))
    panels <- panels[!zG]
    ## calculate height for each panel
    heights <- matrix(NA, NROW(params$layout), NCOL(params$layout))
    ## need to add a minimum height as otherwise the space is too narrow
    heights[as.matrix(layout[, c("ROW", "COL")])] <- vapply(layout$PANEL, function(i)
      max(diff(ranges[[i]]$y.range), 8), numeric(1))
    heights_cum <- sort(unique(unlist(apply(heights, 2,
                                            function(col) cumsum(col[!is.na(col)])))))
    heights_units <- unit(c(heights_cum[1], diff(heights_cum)), "null")
    ## set width of left axis to maximum width to align plots
    max_width <- max(do.call(grid::unit.c, lapply(panels, function(gt) gt$widths[1])))
    panels <- lapply(panels, function(p) {
      p$widths[1] <- max_width
      p
    })
    mark <- 0
    ## create layout matrix
    layout_matrix <- apply(heights, 2, function(h) {
      idx <- match(cumsum(h),
                   cumsum(c(heights_units)))
      idx <- idx[!is.na(idx)]
      res <- unlist(purrr::imap(idx, function(len_out, pos) {
        mark <<- mark + 1
        offset <- if (pos != 1) idx[pos - 1] else 0
        rep(mark, len_out - offset)
      }))
      len_out <- length(res)
      if (len_out < length(heights_units)) {
        res <- c(res, rep(NA, length(heights_units) - len_out))
      }
      res
    })
    combined <- gridExtra::arrangeGrob(grobs = panels,
                                       layout_matrix = layout_matrix,
                                       heights = heights_units,
                                       as.table = FALSE)
    ## add name, such that find_panel can find the plotting area
    combined$layout$name <- paste("panel_", layout$LAB)
  }
  combined
}
layout <- function(data, params) {
  parent_layout <- params$parent$compute_layout(data, params)
  msg <- paste0("invalid ",
                sQuote("layout"),
                ". Falling back to ",
                sQuote("facet_wrap"),
                " layout")
  if (is.null(params$layout) ||
      !is.matrix(params$layout)) {
    warning(msg)
    parent_layout
  } else {
    ## smash layout into vector and remove NAs all done by sort
    layout <- params$layout
    panel_numbers <- sort(layout)
    if (!isTRUE(all.equal(sort(as.numeric(as.character(parent_layout$PANEL))),
                          panel_numbers))) {
      warning(msg)
      parent_layout
    } else {
      ## all good
      indices <- cbind(ROW = c(row(layout)),
                       COL = c(col(layout)),
                       PANEL = c(layout))
      indices <- indices[!is.na(indices[, "PANEL"]), ]
      ## delete row and col number from parent layout
      parent_layout$ROW <- parent_layout$COL <- NULL
      new_layout <- merge(parent_layout,
                          indices,
                          by = "PANEL") %>%
        dplyr::arrange(PANEL)
      new_layout$PANEL <- factor(new_layout$PANEL)
      labs <- new_layout %>%
        dplyr::select(-PANEL,
                      -SCALE_X,
                      -SCALE_Y,
                      -ROW,
                      -COL) %>%
        dplyr::mutate(sep = "_") %>%
        do.call(paste, .)
      new_layout$LAB <- labs
      new_layout
    }
  }
}
FacetMultiCol <- ggproto("FacetMultiCol", FacetWrap,
compute_layout = layout,
draw_panels = render)
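The limitations listed in this answer mention that production code should check the passed layout carefully. A minimal sketch of such a check in base R might look like the following; validate_layout() and its n_panels argument are hypothetical and are not part of the code above or of ggplot2.

```r
# Hypothetical helper: basic sanity checks for a facet layout matrix.
# `n_panels` is the number of facet panels the plot is expected to produce.
validate_layout <- function(layout, n_panels) {
  if (!is.matrix(layout) || !is.numeric(layout)) {
    stop("`layout` must be a numeric matrix")
  }
  ids <- sort(c(layout))  # sort() drops NA entries
  if (!identical(as.numeric(ids), as.numeric(seq_len(n_panels)))) {
    stop("`layout` must contain each panel number 1..", n_panels, " exactly once")
  }
  if (anyNA(layout[1, ])) {
    stop("every column of `layout` must start with a non-NA entry")
  }
  invisible(TRUE)
}

ok <- validate_layout(matrix(c(1, NA, 2, 3, 4, 5), nrow = 2), n_panels = 5)

# A column that starts with NA is rejected
bad <- tryCatch(validate_layout(matrix(c(NA, 1, 2, 3, 4, 5), nrow = 2), 5),
                error = function(e) e)
```

Such a check could be called at the top of facet_multi_col(), so that an unusable matrix fails early with a clear message instead of silently falling back to the facet_wrap layout.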
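[Comments]:
Thank you so much. I have tried it with some other data, regions rather than continents (which I mentioned in the question). I put the code here: gist.github.com/gjabel/3e4fb31214b5932aa0978dc6d3258dc1. It is throwing up some very strange behavior that I cannot figure out.
Could you share (a snapshot of) the data? I looked into the gist, but could not reproduce the problem for obvious reasons...
The data is in the wpp2019 package, on CRAN.
Sorry, my bad. I will give it a try.
Found the bug: basically, the layout must be sorted by PANEL, otherwise it does not work. Your example now renders properly.
[Answer 2]:
As suggested in the comments, a combination of cowplot and patchwork can get you quite far. See my solution below.
The basic idea is:
- first, compute a scaling factor based on the number of rows,
- then make a series of single-column grids, where I use empty plots to constrain the height of each plot by the computed scaling factor (and remove the legends),
- then add them to a grid and add the legend.
- At the start, I also compute the maximum for the fill scale.
library(tidyverse)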
library(gapminder)
library(patchwork)
max_life <- max(gapminder$lifeExp)
generate_plot <- function(data, title) {
  ggplot(data = data, mapping = aes(x = year, y = fct_rev(country), fill = lifeExp)) +
    geom_tile() +
    scale_fill_continuous(limits = c(0, max_life)) +
    ggtitle(title)
}
scale_plot <- function(plot, ratio) {
  plot + theme(legend.position = "none") +
    plot_spacer() +
    plot_layout(ncol = 1,
                heights = c(
                  ratio,
                  1 - ratio
                )
    )
}
df <- gapminder %>%
group_by(continent) %>%
nest() %>%
ungroup() %>%
arrange(continent) %>%
mutate(
rows = map_dbl(data, nrow),
rel_height = (rows/max(rows)),
plot = map2(
data,
continent,
generate_plot
),
spaced_plot = map2(
plot,
rel_height,
scale_plot
)
)
wrap_plots(df$spaced_plot) + cowplot::get_legend(df$plot[[1]])
Created on 2019-11-06 by the reprex package (v0.3.0)
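The scaling factor used in this answer is simply each group's row count divided by the largest row count. A base R sketch of that calculation, using a small made-up vector in place of the gapminder data, could look like this (relative_heights() is a hypothetical name):

```r
# Hypothetical helper: relative height of each group, scaled so that the
# largest group gets 1.
relative_heights <- function(groups) {
  counts <- table(groups)
  setNames(as.numeric(counts) / max(counts), names(counts))
}

# Made-up data standing in for one entry per observation
continent <- rep(c("Africa", "Oceania", "Europe"), times = c(6, 2, 3))
rel <- relative_heights(continent)
rel
```

Each ratio is then split as c(ratio, 1 - ratio) between the plot and an empty spacer, so that the largest group fills its whole column and smaller groups leave blank space below.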