How to include an IF statement in a ggplot2 loop that skips all-NA cases
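Hi everyone, and thanks for reading my question.
I tried to find a solution in similar threads but did not find anything suitable. That may be down to the search terms I used. My apologies if I missed something.
Here is my data (somewhat shortened, but reproducible):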
country year sector UN ETS
BG 2000 Energy 24076856.07 NA
BG 2001 Energy 27943916.88 NA
BG 2002 Energy 25263464.92 NA
BG 2003 Energy 27154117.22 NA
BG 2004 Energy 26936616.77 NA
BG 2005 Energy 27148080.12 NA
BG 2006 Energy 27444820.45 NA
BG 2007 Energy 30789683.97 31120644
BG 2008 Energy 32319694.49 30453798
BG 2009 Energy 29694118.01 27669012
BG 2010 Energy 31638282.52 29543392
BG 2011 Energy 36421966.96 34669936
BG 2012 Energy 31628708.27 30777290
BG 2013 Energy 27332059.98 27070570
BG 2014 Energy 29036437.07 28583008
BG 2015 Energy 30316871.19 29935784
BG 2016 Energy 27127914.93 26531704
BG 2017 Energy NA 27966156
CH 2000 Energy 3171899.5 NA
CH 2001 Energy 3313509.6 NA
CH 2002 Energy 3390115.69 NA
CH 2003 Energy 3387122.65 NA
CH 2004 Energy 3682404.04 NA
CH 2005 Energy 3815915.41 NA
CH 2006 Energy 4031766.36 NA
CH 2007 Energy 3718892.16 NA
CH 2008 Energy 3837098.91 NA
CH 2009 Energy 3673731.74 NA
CH 2010 Energy 3846523.62 NA
CH 2011 Energy 3598219.48 NA
CH 2012 Energy 3640743.25 NA
CH 2013 Energy 3735935.29 NA
CH 2014 Energy 3607411.44 NA
CH 2015 Energy 3292576.93 NA
CH 2016 Energy 3380402.57 NA
CY 2000 Energy 2964656.86 NA
CY 2001 Energy 2847105.45 NA
CY 2002 Energy 3008827.44 NA
CY 2003 Energy 3235739.95 NA
CY 2004 Energy 3294769.3 NA
CY 2005 Energy 3483623.91 3471844
CY 2006 Energy 3665461.17 3653380
CY 2007 Energy 3814469.11 3801667
CY 2008 Energy 3980439.76 3967293
CY 2009 Energy 4005649.27 3992467
CY 2010 Energy 3880758.22 3868001
CY 2011 Energy 3722369.39 3728038
CY 2012 Energy 3557560.24 3545929
CY 2013 Energy 2839148.88 2829732
CY 2014 Energy 2950111.64 2940320
CY 2015 Energy 3032961.55 3023003
CY 2016 Energy 3310941.55 3300001
CY 2017 Energy NA 3287834
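The code below runs smoothly and delivers what it should. However, as soon as the loop reaches a country (here CH) that has only NA values in `energy$ETS`, the loop stops. What I need is an IF statement that lets the loop ignore such a case and then either jump to the next country (instead of aborting) or plot only `energy$UN` (i.e. plot only the variable ('UN') that has data, because `energy$ETS` provides only NA values).
Important: I do not want to exclude all NA values. I just need the loop to keep running when it hits a country that has no `energy$ETS` values.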
ctry <- unique(energy$country)
# Color settings: colorblind-friendly palette
cols <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
"#F0E442", "#0072B2", "#D55E00", "#CC79A7")
for(i in (1:length(ctry))) {
  plot.df <- energy[energy$country==ctry[i],]
  ets.initial <- min(plot.df$year)
  x <- plot.df$UN[plot.df$year >= ets.initial & plot.df$year < 2017]
  y <- plot.df$ETS[plot.df$year >= ets.initial & plot.df$year < 2017]
  m1 <- round(summary(lm(y~x))$r.squared, 3)
  m2 <- round(lm(y~x-1)$coef, 3)
  p <- ggplot() +
    geom_line(data=plot.df,aes(x=plot.df$year, y=plot.df$UN, color='UN 1.A.1'), na.rm=TRUE) +
    geom_line(data=plot.df, aes(x=plot.df$year, y=plot.df$ETS, color='ETS 20')) +
    annotate(geom='text', label=paste0("R^2==", m1),
             x=2014, y=Inf, vjust=2, hjust=0, parse=TRUE, cex=3) +
    annotate(geom='text', label=paste0("beta==", m2),
             x=2014, y=Inf, vjust=4, hjust=-0.15, parse=TRUE, cex=3) +
    labs(x="Year", y="CO2 Emissions (metric tons)", z="",
         title=paste("Energy sector emissions for", ctry[i])) +
    theme(plot.margin=unit(c(.5, .5, .5, .5), "cm")) +
    scale_color_manual(values = cols) +
    scale_y_continuous(labels = scales::comma) +
    scale_x_continuous(breaks = seq(2000, 2017, by=5)) +
    labs(color="Datasets")
  p
  ggsave(p, filename=paste("H:/figures_energy/", ctry[i], ".png", sep=""),
         width=6.5, height=6)
}
Thanks a lot for any help!
Best,
Konstantin
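Answer
I implemented my comment (and did some general cleanup), and it works for me. I didn't want to create a bunch of files, so I put the plots in a list instead of saving them. Make sure your `p ggsave(...)` line has a line break between `p` and `ggsave`; having them on the same line, as in your question, is a syntax error.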
library(ggplot2)
ctry <- unique(energy$country)
# Color settings: colorblind-friendly palette
cols <- c(
"#999999",
"#E69F00",
"#56B4E9",
"#009E73",
"#F0E442",
"#0072B2",
"#D55E00",
"#CC79A7"
)
plot_list = list()
for (i in (1:length(ctry))) {
  plot.df <- energy[energy$country == ctry[i], ]
  # Go to next iteration if ETS is all NA
  if(all(is.na(plot.df$ETS))) {
    next
  }
  # clean up modeling code. It is pointless to define the minimum and then
  # subset everything above the minimum. By definition, everything is already
  # above the minimum. It's also cleaner to subset the data frame and use
  # the `data` argument of `lm`:
  mod.df = plot.df[plot.df$year < 2017, ]
  m1 <- round(summary(lm(ETS ~ UN, data = mod.df))$r.squared, 3)
  m2 <- round(lm(ETS ~ UN - 1, data = mod.df)$coef, 3)
  # Only using one data frame, so set it in the initial `ggplot()`, not
  # re-specify it in every layer. Similarly, set `aes(x = year)` once.
  p <- ggplot(data = plot.df, aes(x = year)) +
    # use bare column names in aes()
    geom_line(aes(y = UN, color = 'UN 1.A.1'), na.rm = TRUE) +
    geom_line(aes(y = ETS, color = 'ETS 20')) +
    annotate(
      geom = 'text',
      label = paste0("R^2==", m1),
      x = 2014, y = Inf,
      vjust = 2, hjust = 0,
      parse = TRUE,
      cex = 3
    ) +
    annotate(
      geom = 'text',
      label = paste0("beta==", m2),
      x = 2014, y = Inf,
      vjust = 4, hjust = -0.15,
      parse = TRUE,
      cex = 3
    ) +
    labs(
      x = "Year",
      y = "CO2 Emissions (metric tons)",
      z = "",
      title = paste("Energy sector emissions for", ctry[i])
    ) +
    theme(plot.margin = unit(c(.5, .5, .5, .5), "cm")) +
    scale_color_manual(values = cols) +
    scale_y_continuous(labels = scales::comma) +
    scale_x_continuous(breaks = seq(2000, 2017, by = 5)) +
    labs(color = "Datasets")
  plot_list[[i]] = p
}
Using this data:
energy = read.table(header = T, text = "country year sector UN ETS
BG 2000 Energy 24076856.07 NA
BG 2001 Energy 27943916.88 NA
BG 2002 Energy 25263464.92 NA
BG 2003 Energy 27154117.22 NA
BG 2004 Energy 26936616.77 NA
BG 2005 Energy 27148080.12 NA
BG 2006 Energy 27444820.45 NA
BG 2007 Energy 30789683.97 31120644
BG 2008 Energy 32319694.49 30453798
BG 2009 Energy 29694118.01 27669012
BG 2010 Energy 31638282.52 29543392
BG 2011 Energy 36421966.96 34669936
BG 2012 Energy 31628708.27 30777290
BG 2013 Energy 27332059.98 27070570
BG 2014 Energy 29036437.07 28583008
BG 2015 Energy 30316871.19 29935784
BG 2016 Energy 27127914.93 26531704
BG 2017 Energy NA 27966156
CH 2000 Energy 3171899.5 NA
CH 2001 Energy 3313509.6 NA
CH 2002 Energy 3390115.69 NA
CH 2003 Energy 3387122.65 NA
CH 2004 Energy 3682404.04 NA
CH 2005 Energy 3815915.41 NA
CH 2006 Energy 4031766.36 NA
CH 2007 Energy 3718892.16 NA
CH 2008 Energy 3837098.91 NA
CH 2009 Energy 3673731.74 NA
CH 2010 Energy 3846523.62 NA
CH 2011 Energy 3598219.48 NA
CH 2012 Energy 3640743.25 NA
CH 2013 Energy 3735935.29 NA
CH 2014 Energy 3607411.44 NA
CH 2015 Energy 3292576.93 NA
CH 2016 Energy 3380402.57 NA
CY 2000 Energy 2964656.86 NA
CY 2001 Energy 2847105.45 NA
CY 2002 Energy 3008827.44 NA
CY 2003 Energy 3235739.95 NA
CY 2004 Energy 3294769.3 NA
CY 2005 Energy 3483623.91 3471844
CY 2006 Energy 3665461.17 3653380
CY 2007 Energy 3814469.11 3801667
CY 2008 Energy 3980439.76 3967293
CY 2009 Energy 4005649.27 3992467
CY 2010 Energy 3880758.22 3868001
CY 2011 Energy 3722369.39 3728038
CY 2012 Energy 3557560.24 3545929
CY 2013 Energy 2839148.88 2829732
CY 2014 Energy 2950111.64 2940320
CY 2015 Energy 3032961.55 3023003
CY 2016 Energy 3310941.55 3300001
CY 2017 Energy NA 3287834")
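A side note on why the guard matters: in the question's loop, the likely point of failure for CH is the `lm()` call, because `lm()` drops incomplete rows by default and raises an error when no rows are left. The sketch below illustrates this in base R. The `toy` data frame (made-up values) and the `fit_r2` helper are assumptions for illustration only and are not part of the original code.

```r
# Toy stand-in for one country's rows; values are made up for illustration
toy <- data.frame(
  UN  = c(3.1, 3.3, 3.4, 3.7),
  ETS = c(NA_real_, NA_real_, NA_real_, NA_real_)
)

# Hypothetical helper: R^2 of ETS ~ UN, or NA when ETS has no data at all
fit_r2 <- function(df) {
  if (all(is.na(df$ETS))) {
    return(NA_real_)  # nothing to fit, so skip the model entirely
  }
  round(summary(lm(ETS ~ UN, data = df))$r.squared, 3)
}

# Without the guard, lm() fails because every row is dropped as incomplete
unguarded <- tryCatch(
  lm(ETS ~ UN, data = toy),
  error = function(e) "error"
)

# With the guard, the all-NA case is handled without an error
guarded <- fit_r2(toy)
```

Inside a `for` loop, the same `all(is.na(...))` test followed by `next` has the effect shown in the answer above: the iteration for that country ends before `lm()` is ever called.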
Another answer
for (i in (1:length(ctry))) {
  plot.df <- energy[energy$country == ctry[i], ]
  ets.initial <- min(plot.df$year)
  # Checks if there is any non-NA value in plot.df$ETS
  if (FALSE %in% is.na(plot.df$ETS)) {
    # (produce plots and rest of output as planned)
  }
}
would be a solution using base R.
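The question also asks about the alternative of plotting only UN when ETS has no data. One possible way to do that, sketched here under assumptions, is to build the plot in steps and add the ETS layer only when the check passes. The `toy` data (made-up values, placeholder country codes "AA" and "BB") and the `plot_country` helper are hypothetical and not taken from either answer.

```r
library(ggplot2)

# Toy stand-in for `energy`; values and country codes are made up
toy <- data.frame(
  country = rep(c("AA", "BB"), each = 3),
  year    = rep(2000:2002, times = 2),
  UN      = c(10, 12, 11, 5, 6, 7),
  ETS     = c(NA, 11, 10, NA, NA, NA)
)

# Hypothetical helper: always draws UN, adds ETS only when it has data
plot_country <- function(plot.df, name) {
  p <- ggplot(plot.df, aes(x = year)) +
    geom_line(aes(y = UN, color = "UN 1.A.1"), na.rm = TRUE) +
    labs(
      x = "Year",
      y = "CO2 Emissions (metric tons)",
      color = "Datasets",
      title = paste("Energy sector emissions for", name)
    )
  # Same check as in the answer: is there any non-NA value in ETS?
  if (FALSE %in% is.na(plot.df$ETS)) {
    p <- p + geom_line(aes(y = ETS, color = "ETS 20"), na.rm = TRUE)
  }
  p
}

# One plot per country, stored in a named list
plots <- lapply(split(toy, toy$country), function(d) {
  plot_country(d, d$country[1])
})
```

In the full loop, the `lm()` calls and the two `annotate()` layers would presumably need to go inside the same `if` block, since they also depend on ETS having data.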