How to subset tall (long-format) data for plotting

Question

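My data is in tall (long) format. I want to use ggplot to produce a line chart for each region. However, I keep getting the error "Aesthetics must be either length 1 or the same as the data".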

Hard-coded solution:

date_q <- HPF$date[1:167]
CumulativeSubset_region1 <- HPF$BaseCumulative[1:167]
ggplot(HPF[1:167, ], aes(x = date_q, y= CumulativeSubset_region1)) + 
  geom_line()

Without hard-coding:

ggplot(data = HPF, aes(x = date, y = BaseCumulative)) + geom_line(na.rm = FALSE) + theme_light()

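As you can see, the spikes appear because the date range is the same for every region, while the cumulative values differ from region to region.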

Data:

#Rows 1-3 (Region 1 Sample): 
dput(head(HPF[1:3, ]))
    structure(list(region = c(1, 1, 1), path = c(1, 1, 1), date = c(20140215, 
    20140515, 20140815), index_value = c(1, 1.033852765, 1.041697122
    ), index = 0:2, counter = 1:3, BaseQoQ = c(NA, 0.033852765, 0.00758749917354029
    ), BaseCumulative = c(100, 103.3852765, 104.1697122), StressCumulative = c(110, 
    113.3852765, 114.1697122), StressQoQ = c(NA, 0.0307752409090909, 
    0.00691832065162346)), .Names = c("region", "path", "date", "index_value", 
    "index", "counter", "BaseQoQ", "BaseCumulative", "StressCumulative", 
    "StressQoQ"), row.names = c(NA, -3L), class = c("tbl_df", "tbl", 
    "data.frame"))

#Rows 168:200 (Region 2 Sample):
dput(head(HPF[168:200, ]))
    structure(list(region = c(2, 2, 2, 2, 2, 2), path = c(1, 1, 1, 
    1, 1, 1), date = c(20140215, 20140515, 20140815, 20141115, 20150215, 
    20150515), index_value = c(1, 1.014162265, 1.01964828, 1.009372314, 
    1.007210703, 1.018695493), index = 0:5, counter = 1:6, BaseQoQ = c(NA, 
    0.014162265, 0.00540940556489744, -0.0100779515854232, -0.0021415398163972, 
    0.0114025694582001), BaseCumulative = c(100, 101.4162265, 101.964828, 
    100.9372314, 100.7210703, 101.8695493), StressCumulative = c(110, 
    111.4162265, 111.964828, 110.9372314, 110.7210703, 101.8695493
    ), StressQoQ = c(NA, 0.0128747863636363, 0.00492389230216839, 
    -0.00917785181610786, -0.00194849914020834, -0.0799443229370588
    )), .Names = c("region", "path", "date", "index_value", "index", 
    "counter", "BaseQoQ", "BaseCumulative", "StressCumulative", "StressQoQ"
    ), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
    ))

Answer 1
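
A minimal sketch of one possible approach, not a definitive fix. Each date occurs once per region, so a single geom_line() joins all regions into one zig-zag path. Mapping group (or colour) to region, or faceting by region, should draw one line per region instead. The length error itself typically arises when a vector passed to aes() has a different length from the data, which is why subsetting by value inside the data argument tends to be safer than slicing separate vectors. The HPF data frame below is only a stand-in built from the sample values in the question (three columns, three rows per region). Converting date with as.Date is an added assumption, not something the question does.

```r
library(ggplot2)

# Stand-in for HPF, built from the sample values in the question
# (the real data has more columns and 167 rows per region).
HPF <- data.frame(
  region = c(1, 1, 1, 2, 2, 2),
  date = c(20140215, 20140515, 20140815, 20140215, 20140515, 20140815),
  BaseCumulative = c(100, 103.3852765, 104.1697122,
                     100, 101.4162265, 101.964828)
)

# Assumption: dates are stored as yyyymmdd numbers; convert them to Date
# so the x axis is spaced as a real time axis.
HPF$date <- as.Date(as.character(HPF$date), format = "%Y%m%d")

# Option 1: one line per region on a shared panel.
p_grouped <- ggplot(HPF, aes(x = date, y = BaseCumulative,
                             group = region, colour = factor(region))) +
  geom_line() +
  labs(colour = "region") +
  theme_light()

# Option 2: one panel per region.
p_faceted <- ggplot(HPF, aes(x = date, y = BaseCumulative)) +
  geom_line() +
  facet_wrap(~ region) +
  theme_light()

# Option 3: a single region, selected by value rather than by row position.
p_region1 <- ggplot(subset(HPF, region == 1),
                    aes(x = date, y = BaseCumulative)) +
  geom_line() +
  theme_light()
```

Printing any of the three objects (for example print(p_faceted)) renders the plot. Option 3 replaces the HPF[1:167, ] slice with a condition on region, so it keeps working if the number of rows per region changes.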

Another answer