What strategy is recommended for splitting a data frame by month and year to get a single list of data frames in R?

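I want to subset my data frame by month and year (e.g. 01-2000) to get a single list of data frames that I can keep working with. The dates are not continuous: some months have three rows, others two or one. I tried this: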

# DF is my Original data frame
# Defining the vectors of months and years to subset
     months <- c("01","02","03","04","05","06","07","08","09","10","11","12")
     years <- as.character(seq(2000,2017,by=1))

# Obtaining a list of DF subset
     list_blank <- list()
     for (j in 1:length(years)){
     list_dirs <- list()
            for(k in 1:length(months)){
            list_dirs[[k]] <- subset(DF, format.Date(DATE,"%m")==months[k] & 
                                          format.Date(DATE, "%Y")==years[j])
            }
     list_blank[[j]] <- list_dirs[sapply(list_dirs, nrow)>0]
     }
list_blank

But in list_blank I just get a nested list of data frames, like this:

> list_blank
[[1]]
[[1]][[1]]
        DATE     TEMP      BIO               DIR
1 2000-06-29 9.019003 4.047207 /R/user/data1.txt

[[1]][[2]]
        DATE     TEMP        BIO               DIR
2 2000-07-15 13.38281 -0.9780719 /R/user/data2.txt

[[1]][[3]]
        DATE     TEMP      BIO               DIR
3 2000-09-17 14.23595 8.064344 /R/user/data3.txt

[[1]][[4]]
        DATE      TEMP       BIO               DIR
4 2000-10-03  7.022689 -9.940022 /R/user/data4.txt
5 2000-10-19 12.010593  5.031642 /R/user/data5.txt

[[1]][[5]]
        DATE      TEMP       BIO               DIR
6 2000-11-04  6.582599 -2.854467 /R/user/data6.txt
7 2000-11-20 43.911833  3.003842 /R/user/data7.txt

[[1]][[6]]
        DATE     TEMP      BIO               DIR
8 2000-12-06 41.91008 -3.97713 /R/user/data8.txt


[[2]]
[[2]][[1]]
        DATE       TEMP          BIO               DIR
9 2001-05-31 -0.9638592 -0.001780529 /R/user/data9.txt

[[2]][[2]]
         DATE     TEMP       BIO                DIR
10 2001-06-16 41.77771 -3.996572 /R/user/data10.txt

[[2]][[3]]
         DATE     TEMP      BIO                DIR
11 2001-07-03 3.527663 3.148131 /R/user/data11.txt

[[2]][[4]]
         DATE     TEMP       BIO                DIR
12 2001-08-04 36.25167 -3.972604 /R/user/data12.txt

[[2]][[5]]
         DATE     TEMP      BIO                DIR
13 2001-09-21 4.840813 1.163948 /R/user/data13.txt

[[2]][[6]]
         DATE     TEMP       BIO                DIR
14 2001-10-23 0.342628 -7.611015 /R/user/data14.txt


[[3]]
[[3]][[1]]
         DATE     TEMP     BIO                DIR
15 2002-01-27 8.990325 5.04787 /R/user/data15.txt

[[3]][[2]]
         DATE     TEMP        BIO                DIR
16 2002-03-16 8.200143 -0.9392823 /R/user/data16.txt

[[3]][[3]]
         DATE     TEMP        BIO                DIR
17 2002-05-03 14.49365 0.06034713 /R/user/data17.txt

[[3]][[4]]
         DATE     TEMP       BIO                DIR
18 2002-07-23 18.33042 -4.995815 /R/user/data18.txt

[[3]][[5]]
         DATE     TEMP      BIO                DIR
19 2002-09-09 38.57021 5.000115 /R/user/data19.txt

[[3]][[6]]
         DATE      TEMP         BIO                DIR
20 2002-10-11 -4.740848 -0.09915025 /R/user/data20.txt

[[3]][[7]]
         DATE     TEMP      BIO                DIR
21 2002-11-12 4.589545 9.151785 /R/user/data21.txt

[[3]][[8]]
         DATE     TEMP       BIO                DIR
22 2002-12-30 3.710634 -3.840208 /R/user/data22.txt

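What strategy would you recommend for subsetting the data so that I get a single list of data frames?

Any help is welcome.

My original DF is: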

DF <- as.data.frame(structure(list(structure(1:152, .Label = c("2000-06-29","2000-07-15", 
"2000-09-17", "2000-10-03", "2000-10-19", "2000-11-04", "2000-11-20", 
"2000-12-06", "2001-05-31", "2001-06-16", "2001-07-03", "2001-08-04", 
"2001-09-21", "2001-10-23", "2002-01-27", "2002-03-16", "2002-05-03", 
"2002-07-23", "2002-09-09", "2002-10-11", "2002-11-12", "2002-12-30", 
"2003-03-20", "2003-04-05", "2003-04-21", "2003-05-07", "2003-07-27", 
"2003-08-12", "2003-08-28", "2003-09-29", "2003-10-31", "2004-04-24", 
"2004-05-10", "2004-05-26", "2004-06-28", "2004-07-14", "2004-08-31", 
"2004-09-16", "2004-11-03", "2005-02-23", "2005-04-04", "2005-04-12", 
"2005-05-14", "2005-05-22", "2005-06-07", "2005-07-02", "2005-07-18", 
"2005-07-26", "2005-08-19", "2005-09-12", "2005-10-06", "2005-10-22", 
"2005-11-07", "2005-11-15", "2006-02-27", "2006-04-24", "2006-05-02", 
"2006-06-11", "2006-07-14", "2006-09-08", "2006-09-16", "2006-10-18", 
"2007-02-15", "2007-03-19", "2007-04-04", "2007-05-06", "2007-07-02", 
"2007-07-26", "2007-08-03", "2007-10-30", "2007-11-15", "2007-12-01", 
"2007-12-17", "2008-02-11", "2008-05-01", "2008-05-09", "2008-06-02", 
"2008-06-10", "2008-06-27", "2008-07-13", "2008-08-22", "2008-09-07", 
"2008-10-01", "2008-10-17", "2008-11-02", "2008-11-10", "2008-11-18", 
"2009-01-05", "2009-01-21", "2009-02-22", "2009-03-26", "2009-04-11", 
"2009-04-27", "2009-05-13", "2009-07-01", "2009-09-19", "2009-11-06", 
"2009-11-22", "2009-12-08", "2010-02-10", "2010-03-30", "2010-05-01", 
"2010-06-18", "2010-07-05", "2010-08-22", "2010-09-07", "2010-09-23", 
"2010-10-25", "2010-11-10", "2011-03-18", "2011-04-19", "2011-05-05", 
"2011-11-14", "2011-12-16", "2012-01-01", "2012-02-18", "2012-05-08", 
"2012-06-09", "2012-08-13", "2012-08-29", "2012-09-30", "2012-11-01", 
"2012-11-17", "2013-08-01", "2013-08-17", "2013-09-02", "2013-09-18", 
"2013-10-04", "2013-11-05", "2013-12-23", "2014-04-14", "2014-04-30", 
"2014-05-16", "2014-06-01", "2014-08-05", "2014-11-25", "2015-06-05", 
"2015-07-24", "2015-08-09", "2015-09-26", "2015-11-13", "2015-11-29", 
"2015-12-15", "2016-05-07", "2016-05-23", "2016-06-08", "2016-06-24", 
"2016-07-11", "2016-07-27", "2016-08-28", "2016-09-13", "2017-05-11"
), class = "factor"), c(9.019002745, 13.38280903, 14.2359526, 
7.02268875, 12.0105926, 6.582598703, 43.91183269, 41.9100792, 
-0.963859179, 41.77771314, 3.527663232, 36.25167196, 4.840813341, 
0.342627997, 8.990324917, 8.200143073, 14.49365246, 18.33041769, 
38.57020518, -4.740848277, 4.589545271, 3.710634326, 44.71131919, 
48.31530897, 40.73581856, 8.590515833, 26.80755263, 14.55646993, 
41.623884, 20.65603433, 44.99225806, 2.862221194, 9.389363964, 
44.30106364, 14.40674254, 19.60333037, 46.3459391, 5.498115095, 
12.20907822, 29.59519173, 18.86875823, 44.00330238, 48.12024135, 
9.938699338, 5.967900346, 42.56517154, 18.14282128, 1.367904531, 
13.63187106, 38.27535697, 38.48259515, 8.914401699, 47.78084236, 
7.742576151, 34.00819791, 10.24828176, 46.24698041, 4.053884184, 
28.05228078, 44.07920451, 9.556626224, 20.21050249, 3.219616433, 
14.57752231, 7.745048002, 31.34427721, 27.03156268, 24.07758642, 
41.54027966, 20.98718537, 0.484196337, 44.67014205, 17.2716998, 
43.31391339, 43.45600729, 10.59062952, 27.52240792, 22.50072766, 
25.13289498, 9.848386032, 30.2590926, 29.21841757, 21.9001874, 
38.99733619, 2.679753575, 21.95446182, 3.909738002, 18.34847693, 
10.2531444, 27.49872543, 7.074224135, 26.69889114, 4.516975978, 
5.056692651, 18.83999983, 12.29807673, 44.52083147, 44.7383122, 
38.80497128, 23.43375586, 24.53213961, 41.81034217, -1.566339796, 
9.935546695, 36.52755411, 25.32403138, 15.07663698, -2.009304568, 
20.5564721, 5.382169325, -2.75833765, 45.6762325, 27.99027954, 
5.58370937, 38.06938684, 41.52819007, 34.63169634, 28.60222438, 
26.80450595, 0.462128759, 29.35287359, 2.395094612, 24.71249393, 
32.36071934, 21.61766031, 23.46290408, 27.12918609, 4.4079775, 
14.92758589, 31.70462481, 25.55145303, 1.401294079, 38.83606373, 
4.597671191, 8.951049338, 5.607093845, 12.64761583, 25.96605947, 
32.70096539, 21.87109539, 49.06385265, 20.6286949, -3.910850422, 
34.04723625, 13.22923582, 30.44847464, 6.753090508, 18.497554, 
-4.674471581, 35.49162939, 12.87265885, 20.58361941), c(4.047206988, 
-0.978071864, 8.064343633, -9.940021903, 5.031642336, -2.854467401, 
3.003841743, -3.977130333, -0.001780529, -3.996571923, 3.14813094, 
-3.972604149, 1.163947895, -7.611014825, 5.047870144, -0.939282251, 
0.060347126, -4.995814616, 5.00011473, -0.099150252, 9.151784937, 
-3.840207915, 0.012025386, -2.996290324, 1.005600743, -9.918799422, 
6.010015094, -2.988758068, 10.02202957, -7.956647147, 8.014827596, 
-4.656225573, 8.095687882, -8.986967199, 10.04209071, -0.979651251, 
10.00020155, -2.829962457, 10.06371652, -6.992802007, 2.002127702, 
-7.991727777, 9.008144692, -6.990450799, 6.0696827, -0.980441054, 
0.044317365, -8.645938786, 3.042487752, -8.998353964, 5.014430567, 
-6.926582841, 4.01335907, -6.9554085, 6.018550364, -2.925630439, 
0.019097947, 0.004029105, 5.006380597, -7.986370947, 5.062702551, 
-6.973820337, 5.281322095, -3.946919794, 1.070867429, -8.969144341, 
5.011123776, -6.988352615, 8.004484897, -9.998045194, 8.802793362, 
-2.981626855, 1.035404161, -4.984899458, 7.00084085, -1.927044064, 
3.000359654, -8.986719518, 0.019614007, -5.905777303, 4.025102908, 
-0.966314147, 0.040501334, -0.990844513, 8.215796805, -0.956109294, 
1.124473878, -3.95104908, 8.076691104, -9.983958268, 4.040031998, 
-0.973408457, 3.040326082, -2.896775539, 4.050626175, -7.965117263, 
4.010225548, -8.982055599, 8.013724278, -0.970167339, 4.015507399, 
-3.980564803, 6.52272698, -5.943096232, 1.003802529, -4.964800446, 
7.018898455, -10.22019131, 9.034231282, -2.81837997, 2.784343809, 
-3.992337931, 10.01512081, 0.080260982, 4.013484598, -9.983292414, 
0.001433342, -0.99352202, 8.017329484, -5.082248864, 9.018689774, 
-8.617788597, 9.027182794, -8.998192527, 7.03168444, -9.963271505, 
4.026238539, -7.931625922, 4.027930203, -6.968731248, 1.0140945, 
-8.348782123, 10.02264364, 0.14204043, 3.080408833, -3.921496731, 
3.061657616, -1.969588616, 6.023729809, -7.987423129, 3.008832067, 
-1.963496132, -0.13632816, -3.998955926, 10.05317123, -2.984520602, 
0.073301289, -2.950840094, -0.171828567, 0.02800582, 1.011777695, 
-3.975580335), structure(c(1L, 65L, 76L, 87L, 98L, 109L, 120L, 
131L, 142L, 2L, 13L, 24L, 35L, 46L, 57L, 61L, 62L, 63L, 64L, 
66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 77L, 78L, 79L, 
80L, 81L, 82L, 83L, 84L, 85L, 86L, 88L, 89L, 90L, 91L, 92L, 93L, 
94L, 95L, 96L, 97L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 
106L, 107L, 108L, 110L, 111L, 112L, 113L, 114L, 115L, 116L, 117L, 
118L, 119L, 121L, 122L, 123L, 124L, 125L, 126L, 127L, 128L, 129L, 
130L, 132L, 133L, 134L, 135L, 136L, 137L, 138L, 139L, 140L, 141L, 
143L, 144L, 145L, 146L, 147L, 148L, 149L, 150L, 151L, 152L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 14L, 15L, 16L, 17L, 18L, 
19L, 20L, 21L, 22L, 23L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 
33L, 34L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 47L, 
48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 58L, 59L, 60L), .Label = c("/R/user/data1.txt", 
"/R/user/data10.txt", "/R/user/data100.txt", "/R/user/data101.txt", 
"/R/user/data102.txt", "/R/user/data103.txt", "/R/user/data104.txt", 
"/R/user/data105.txt", "/R/user/data106.txt", "/R/user/data107.txt", 
"/R/user/data108.txt", "/R/user/data109.txt", "/R/user/data11.txt", 
"/R/user/data110.txt", "/R/user/data111.txt", "/R/user/data112.txt", 
"/R/user/data113.txt", "/R/user/data114.txt", "/R/user/data115.txt", 
"/R/user/data116.txt", "/R/user/data117.txt", "/R/user/data118.txt", 
"/R/user/data119.txt", "/R/user/data12.txt", "/R/user/data120.txt", 
"/R/user/data121.txt", "/R/user/data122.txt", "/R/user/data123.txt", 
"/R/user/data124.txt", "/R/user/data125.txt", "/R/user/data126.txt", 
"/R/user/data127.txt", "/R/user/data128.txt", "/R/user/data129.txt", 
"/R/user/data13.txt", "/R/user/data130.txt", "/R/user/data131.txt", 
"/R/user/data132.txt", "/R/user/data133.txt", "/R/user/data134.txt", 
"/R/user/data135.txt", "/R/user/data136.txt", "/R/user/data137.txt", 
"/R/user/data138.txt", "/R/user/data139.txt", "/R/user/data14.txt", 
"/R/user/data140.txt", "/R/user/data141.txt", "/R/user/data142.txt", 
"/R/user/data143.txt", "/R/user/data144.txt", "/R/user/data145.txt", 
"/R/user/data146.txt", "/R/user/data147.txt", "/R/user/data148.txt", 
"/R/user/data149.txt", "/R/user/data15.txt", "/R/user/data150.txt", 
"/R/user/data151.txt", "/R/user/data152.txt", "/R/user/data16.txt", 
"/R/user/data17.txt", "/R/user/data18.txt", "/R/user/data19.txt", 
"/R/user/data2.txt", "/R/user/data20.txt", "/R/user/data21.txt", 
"/R/user/data22.txt", "/R/user/data23.txt", "/R/user/data24.txt", 
"/R/user/data25.txt", "/R/user/data26.txt", "/R/user/data27.txt", 
"/R/user/data28.txt", "/R/user/data29.txt", "/R/user/data3.txt", 
"/R/user/data30.txt", "/R/user/data31.txt", "/R/user/data32.txt", 
"/R/user/data33.txt", "/R/user/data34.txt", "/R/user/data35.txt", 
"/R/user/data36.txt", "/R/user/data37.txt", "/R/user/data38.txt", 
"/R/user/data39.txt", "/R/user/data4.txt", "/R/user/data40.txt", 
"/R/user/data41.txt", "/R/user/data42.txt", "/R/user/data43.txt", 
"/R/user/data44.txt", "/R/user/data45.txt", "/R/user/data46.txt", 
"/R/user/data47.txt", "/R/user/data48.txt", "/R/user/data49.txt", 
"/R/user/data5.txt", "/R/user/data50.txt", "/R/user/data51.txt", 
"/R/user/data52.txt", "/R/user/data53.txt", "/R/user/data54.txt", 
"/R/user/data55.txt", "/R/user/data56.txt", "/R/user/data57.txt", 
"/R/user/data58.txt", "/R/user/data59.txt", "/R/user/data6.txt", 
"/R/user/data60.txt", "/R/user/data61.txt", "/R/user/data62.txt", 
"/R/user/data63.txt", "/R/user/data64.txt", "/R/user/data65.txt", 
"/R/user/data66.txt", "/R/user/data67.txt", "/R/user/data68.txt", 
"/R/user/data69.txt", "/R/user/data7.txt", "/R/user/data70.txt", 
"/R/user/data71.txt", "/R/user/data72.txt", "/R/user/data73.txt", 
"/R/user/data74.txt", "/R/user/data75.txt", "/R/user/data76.txt", 
"/R/user/data77.txt", "/R/user/data78.txt", "/R/user/data79.txt", 
"/R/user/data8.txt", "/R/user/data80.txt", "/R/user/data81.txt", 
"/R/user/data82.txt", "/R/user/data83.txt", "/R/user/data84.txt", 
"/R/user/data85.txt", "/R/user/data86.txt", "/R/user/data87.txt", 
"/R/user/data88.txt", "/R/user/data89.txt", "/R/user/data9.txt", 
"/R/user/data90.txt", "/R/user/data91.txt", "/R/user/data92.txt", 
"/R/user/data93.txt", "/R/user/data94.txt", "/R/user/data95.txt", 
"/R/user/data96.txt", "/R/user/data97.txt", "/R/user/data98.txt", 
"/R/user/data99.txt"), class = "factor")), .Names = c("DATE", 
"TEMP", "BIO", "DIR")))
Answer

I'm not entirely sure I understand what you're trying to do, but perhaps this will get you started. I would use lubridate:

library(lubridate)
library(tidyverse)
DF %>%
    mutate(DATE = ymd(DATE)) %>%
    filter(month(DATE) %in% as.numeric(months) & year(DATE) %in% as.numeric(years))
#    DATE       TEMP          BIO                DIR
#1  2000-06-29  9.0190027  4.047206988  /R/user/data1.txt
#2  2000-07-15 13.3828090 -0.978071864  /R/user/data2.txt
#3  2000-09-17 14.2359526  8.064343633  /R/user/data3.txt
#4  2000-10-03  7.0226888 -9.940021903  /R/user/data4.txt
#5  2000-10-19 12.0105926  5.031642336  /R/user/data5.txt
#6  2000-11-04  6.5825987 -2.854467401  /R/user/data6.txt
#7  2000-11-20 43.9118327  3.003841743  /R/user/data7.txt
#8  2000-12-06 41.9100792 -3.977130333  /R/user/data8.txt
#9  2001-05-31 -0.9638592 -0.001780529  /R/user/data9.txt
#10 2001-06-16 41.7777131 -3.996571923 /R/user/data10.txt

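Explanation: we convert DATE to a Date object, then select the rows whose month matches an entry in months and whose year matches an entry in years.

Unfortunately, for the sample data you provided, there are no entries.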


Update

I'm not sure how your question relates to subsetting; it seems to me you're asking how to split DF by year and month. If that's the case, you can do:

lst <- lapply(split(DF, year(ymd(DF$DATE))), function(x) split(x, month(ymd(x$DATE))))

To collapse the nested list into a simple list, you can do:

unlist(lst, recursive = FALSE)
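
As a possible alternative to splitting twice and then flattening, one could build a single year-month key per row and call split() once, which should give a flat named list directly. The sketch below uses base R only and a small hypothetical data frame named toy (not the DF from the question) so that it runs on its own; the "%Y-%m" key format is an assumption chosen so that the list names sort chronologically.

```r
# Toy data standing in for DF (hypothetical values, not the asker's data)
toy <- data.frame(
  DATE = c("2000-06-29", "2000-10-03", "2000-10-19", "2001-05-31"),
  TEMP = c(9.02, 7.02, 12.01, -0.96),
  stringsAsFactors = TRUE  # mimic the factor DATE column in the question
)

# One grouping key per row, e.g. "2000-10"; as.Date() accepts factors
key <- format(as.Date(toy$DATE), "%Y-%m")

# split() returns a flat named list: one data frame per year-month present
flat <- split(toy, key)

names(flat)              # "2000-06" "2000-10" "2001-05"
nrow(flat[["2000-10"]])  # 2
```

Because only the year-month combinations that actually occur become groups, there are no empty data frames to filter out afterwards. Using "%m-%Y" as the format would give names such as "10-2000", as in the question, at the cost of the list being ordered by month first.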
