Duplicated variables in one legend (not common problem as far as I can see!)

Posted 2023-02-16

【Title】: Duplicated variables in one legend (not common problem as far as I can see!) 【Posted】: 2019-05-28 10:13:43 【Question】:

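I have the following problem, which looks common but is not. I made a ggplot graph with manually set linetypes and colours; both legends have the same name and the same variable labels, and the df is in long format. A single legend is produced, but each variable is shown twice. To give you an idea of what I am trying to achieve, I need to back up a little.

I am developing a function that lets me update a data frame containing this year's monthly expenditure and then produce different graphs to follow up on my budget. My variables have two "attributes", so to speak. Each belongs to a particular item, and each is either a projection (i.e. planned) or actual spending. What I originally wanted was for each item to have one colour and two linetypes (solid for projected, dashed for actual spending). So, for example, green for saving, with projected saving as a solid line and actual saving as a dashed line. I wanted two legends, one showing only the colours (i.e. the items) and another showing only the two linetypes (solid, dashed), so that the reader puts the two together (which also reduces the total number of legend entries). If anyone has a solution for this I would be happy to hear it. However, the following is the problem I am trying to solve now:

I have since given up on that original idea and settled for a single legend with one entry per line. That is what the introduction (above) is about. Despite having the same legend name, the same variable labels and the right number of them, each variable now appears twice. I would like to know why I get these duplicate entries and how to fix them. I have tried all sorts of things over several hours, but nobody seems to have had a similar problem (my keyword searches keep turning up more "normal" problems).

Another odd thing I noticed is that the variable "Add.income" behaves differently from the others, as it appears only once.

The data frame (below) contains many NA values because those are figures that will be filled into the df and then plotted as the year progresses.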

Code:

library(ggplot2)

# colour and linetype are both mapped to the 9-level factor `variable`
ggplot(fin2019Long, aes(x = month, y = value, colour = variable)) +
  geom_line(aes(linetype = variable)) +
  geom_point() +
  labs(title = "Projected expenditure and saving", y = "Euros", x = "Month") +
  scale_x_continuous("Month", breaks = 1:12) +
  scale_colour_manual(name = "Items",
                      values = c("green", "green", "yellow", "yellow", "blue", "blue", "red", "red", "orange"),
                      labels = c(rep("Living expend.", 2), rep("Debt repay.", 2), rep("Saving", 2), rep("Furn. fund", 2), "Extra pay")) +
  scale_linetype_manual(name = "Items",
                        values = c(rep(c("solid", "twodash"), 4), "twodash"),
                        labels = c(rep("Living expend.", 2), rep("Debt repay.", 2), rep("Saving", 2), rep("Furn. fund", 2), "Extra pay"))

Data:

structure(list(month = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), variable = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("livingExpProj", 
"livingExp", "debtRepayProj", "debtRepay", "savingProj", "saving", 
"furnFundProj", "furnFund", "addIncome"), class = "factor"), 
value = c(1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 
1000, 1000, 1000, 1000, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 600, 600, 600, 600, 600, 600, 600, 600, 600, 
600, 600, 600, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 
500, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 100, 
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -108L
), class = "data.frame")

【Comments on the question】:

【Answer 1】:

Splitting the variable column into two columns makes this easier to control:

fin2019Long$type <- ifelse(grepl('Proj$', fin2019Long$variable), 'Planned', 'Spending')
fin2019Long$variable2 <- gsub('Proj$', '', fin2019Long$variable)

ggplot(fin2019Long, aes(x=month, y=value, colour=variable2)) +
    geom_line(aes(linetype=type)) + geom_point() +
    labs(title = "Projected expenditure and saving", y = "Euros", x = "Month") +
    scale_x_continuous("Month", breaks= c(1:12))
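
With the item and the type in separate columns, the two-legend layout described in the question (one legend for colours, one for linetypes) can also be given manual colours, linetypes and labels. The following is only a minimal sketch under stated assumptions: the small `demo` data frame and its spending figures are made up for illustration, and only two of the items are shown. In the original plot each of the nine factor levels gets its own legend key, so a label that is repeated for two levels is drawn twice, which would also explain why the one level with a unique label appears only once.

```r
library(ggplot2)

# Hypothetical miniature of fin2019Long after the split into item and type;
# the numbers are invented for illustration only
demo <- data.frame(
  month     = rep(1:3, times = 4),
  variable2 = rep(c("livingExp", "saving"), each = 6),
  type      = rep(rep(c("Planned", "Spending"), each = 3), times = 2),
  value     = c(1000, 1000, 1000, 950, 1020, 980,
                500, 500, 500, 450, 520, 510)
)

p <- ggplot(demo, aes(x = month, y = value, colour = variable2)) +
  geom_line(aes(linetype = type)) +
  geom_point() +
  # One colour per item: this legend lists each item once
  scale_colour_manual(
    name   = "Items",
    breaks = c("livingExp", "saving"),
    values = c(livingExp = "green", saving = "blue"),
    labels = c("Living expend.", "Saving")
  ) +
  # One linetype per type: this legend lists planned vs. actual once
  scale_linetype_manual(
    name   = "Type",
    values = c(Planned = "solid", Spending = "twodash")
  )

p
```

Because the two scales have different names and map different columns, ggplot2 draws them as two separate legends rather than merging them.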

【Comments】:

Beautiful solution, thank you very much! Not 100% sure what grepl and gsub do, but I'll look it up! Thanks again, and happy new year!
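
As for what grepl and gsub do in the answer, here is a small sketch on a hypothetical sample of the variable names (the vector `vars` is an illustration, not the full data). `grepl` tests each string against a regular expression and returns TRUE or FALSE, while `gsub` replaces every match with a replacement string. The pattern `Proj$` means the text "Proj" at the very end of the string.

```r
# Hypothetical sample of the variable names used in the question
vars <- c("livingExpProj", "livingExp", "savingProj", "saving", "addIncome")

# grepl() returns TRUE where the pattern matches, FALSE otherwise
is_projected <- grepl("Proj$", vars)

# Turn the logical vector into the two type labels used in the answer
type <- ifelse(is_projected, "Planned", "Spending")

# gsub() replaces each match with "", i.e. strips the trailing "Proj"
item <- gsub("Proj$", "", vars)

data.frame(vars, type, item)
```

After this step the projected and actual series of an item share the same `item` value and differ only in `type`, which is what lets colour and linetype be mapped separately.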
