R: How to use case_when() to determine whether a previous value in a column is greater than the subsequent value in an ordered vector
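I am calculating growth for a coral demographic data set and need to compare Max Diameter (cm) between surveys to determine at which TimeStep a coral shrank. I tried using lag, but for some reason my new column is all NA, instead of being NA only on the rows where the data changes to a new coral ID. Does anyone know what I need to do so that my Diff column contains NA only where the transition to a new colony occurs?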
Data frame
A tibble: 20 x 22
`Taxonomic Code` ID Date Year Site_long Shelter `Module #` Side Location Settlement_Area TimeStep size_class `Cover Code` `Max Diameter (… `Max Orthogonal…
<chr> <fct> <date> <chr> <fct> <fct> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 PR H30 2018-11-27 18 Hanauma … Low 216 S D3 0.759 7 3 2 22 17
2 PR H30 2019-02-26 19 Hanauma … Low 216 S D3 0.751 8 3 1 24 19
3 PR H30 2019-05-28 19 Hanauma … Low 216 S D3 0.607 9 3 1 30 20
4 PR H30 2019-08-27 19 Hanauma … Low 216 S D3 0.615 10 1 1 8 8
5 PR H30 2019-11-26 19 Hanauma … Low 216 S D3 0.622 11 5 1 46 30
6 PR H37 2018-09-09 18 Hanauma … High 215 S C1 0.759 6 2 1 14 12
7 PR H37 2018-11-27 18 Hanauma … High 215 S C1 0.751 7 3 1 22 19
8 PR H37 2019-03-12 19 Hanauma … High 215 S C1 0.759 8 3 1 26 20
9 PR H37 2019-05-21 19 Hanauma … High 215 S C1 0.759 9 3 3 29 21
10 PR H37 2019-09-03 19 Hanauma … High 215 S C1 0.683 10 3 1 30 26
11 PR H66 2018-06-05 18 Hanauma … High 213 N A1 0.759 5 2 1 20 19
12 PR H66 2018-09-09 18 Hanauma … High 213 N A1 0.759 6 2 1 20 19
13 PR H66 2018-12-04 18 Hanauma … High 213 N A1 0.653 7 3 1 24 22
14 PR H66 2019-03-05 19 Hanauma … High 213 N A1 0.759 8 3 1 25 24
15 PR H66 2019-05-28 19 Hanauma … High 213 N A1 0.615 9 3 1 28 24
16 PR H66 2019-09-03 19 Hanauma … High 213 N A1 0.531 10 3 1 23 20
17 PR H66 2019-12-03 19 Hanauma … High 213 N A1 0.600 11 3 1 23 16
18 PR H76 2018-09-09 18 Hanauma … High 213 N A4 0.759 6 3 1 21 18
19 PR H76 2018-12-04 18 Hanauma … High 213 N A4 0.653 7 3 1 24 12
20 PR H76 2019-03-05 19 Hanauma … High 213 N A4 0.759 8 3 1 22 19
# … with 7 more variables: `Height (cm)` <dbl>, `Status Code` <chr>, area_mm_squared <dbl>, area_cm_squared <dbl>, Volume_mm_cubed <dbl>, Volume_cm_cubed <dbl>, MD <dbl>
Data frame code
data <- structure(list(`Taxonomic Code` = c("PR", "PR", "PR", "PR", "PR",
"PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR",
"PR", "PR", "PR", "PR"), ID = structure(c(35L, 35L, 35L, 35L,
35L, 38L, 38L, 38L, 38L, 38L, 55L, 55L, 55L, 55L, 55L, 55L, 55L,
61L, 61L, 61L), .Label = c("H1051", "H108", "H110", "H1101",
"H112", "H113", "H116", "H118", "H1188", "H1211", "H122", "H125",
"H1253", "H1289", "H171", "H172", "H174", "H186", "H187", "H188",
"H189", "H191", "H192", "H236", "H237", "H244", "H252", "H254",
"H258", "H274", "H277", "H288", "H292", "H293", "H30", "H332",
"H366", "H37", "H374", "H396", "H466", "H479", "H484", "H499",
"H531", "H560", "H580", "H593", "H597", "H625", "H644", "H647",
"H649", "H653", "H66", "H693", "H695", "H712", "H728", "H737",
"H76", "H760", "H774", "H854", "H926", "H96", "H963", "H98",
"H985", "H991", "H996", "W1038", "W1101", "W1152", "W1154", "W1192",
"W1208", "W1209", "W1214", "W1227", "W1243", "W1245", "W1315",
"W1345", "W1361", "W1377", "W1399", "W1438", "W1494", "W1495",
"W1537", "W1557", "W1614", "W1636", "W1655", "W1669", "W1690",
"W1697", "W1729", "W1741", "W1758", "W1782", "W1785", "W1847",
"W1919", "W2000", "W2004", "W2011", "W2036", "W2044", "W2046",
"W2131", "W2133", "W234", "W249", "W251", "W254", "W307", "W355",
"W359", "W369", "W433", "W450", "W461", "W470", "W480", "W538",
"W542", "W544", "W584", "W601", "W606", "W781", "W79", "W807",
"W872", "W874", "W887", "W890", "W891", "W923", "W952"), class = "factor"),
Date = structure(c(17862, 17953, 18044, 18135, 18226, 17783,
17862, 17967, 18037, 18142, 17687, 17783, 17869, 17960, 18044,
18142, 18233, 17783, 17869, 17960), class = "Date"), Year = c("18",
"19", "19", "19", "19", "18", "18", "19", "19", "19", "18",
"18", "18", "19", "19", "19", "19", "18", "18", "19"), Site_long = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = c("Hanauma Bay", "Waikiki"), class = "factor"),
Shelter = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("High",
"Low"), class = "factor"), `Module #` = c(216, 216, 216,
216, 216, 215, 215, 215, 215, 215, 213, 213, 213, 213, 213,
213, 213, 213, 213, 213), Side = c("S", "S", "S", "S", "S",
"S", "S", "S", "S", "S", "N", "N", "N", "N", "N", "N", "N",
"N", "N", "N"), Location = c("D3", "D3", "D3", "D3", "D3",
"C1", "C1", "C1", "C1", "C1", "A1", "A1", "A1", "A1", "A1",
"A1", "A1", "A4", "A4", "A4"), Settlement_Area = c(0.75902336,
0.751433126, 0.607218688, 0.614808922, 0.622399155, 0.75902336,
0.751433126, 0.75902336, 0.75902336, 0.683121024, 0.75902336,
0.75902336, 0.65276009, 0.75902336, 0.614808922, 0.531316352,
0.599628454, 0.75902336, 0.65276009, 0.75902336), TimeStep = c(7,
8, 9, 10, 11, 6, 7, 8, 9, 10, 5, 6, 7, 8, 9, 10, 11, 6, 7,
8), size_class = c(3, 3, 3, 1, 5, 2, 3, 3, 3, 3, 2, 2, 3,
3, 3, 3, 3, 3, 3, 3), `Cover Code` = c(2, 1, 1, 1, 1, 1,
1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), `Max Diameter (cm)` = c(22,
24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24, 25, 28, 23,
23, 21, 24, 22), `Max Orthogonal (cm)` = c(17, 19, 20, 8,
30, 12, 19, 20, 21, 26, 19, 19, 22, 24, 24, 20, 16, 18, 12,
19), `Height (cm)` = c(2, 2, 3, 1, 3, 1, 2, 1, 1, 3, 1, 1,
1, 2, 2, 2, 2, 1, 1, 1), `Status Code` = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, "B", NA, NA, "PB", NA, NA,
NA, NA), area_mm_squared = c(374, 456, 600, 64, 1380, 168,
418, 520, 609, 780, 380, 380, 528, 600, 672, 460, 368, 378,
288, 418), area_cm_squared = c(3.74, 4.56, 6, 0.64, 13.8,
1.68, 4.18, 5.2, 6.09, 7.8, 3.8, 3.8, 5.28, 6, 6.72, 4.6,
3.68, 3.78, 2.88, 4.18), Volume_mm_cubed = c(391.651884147528,
477.522083345649, 942.477796076938, 33.5103216382911, 2167.69893097696,
87.9645943005142, 437.728576400178, 272.271363311115, 318.871654339364,
1225.22113490002, 198.967534727354, 198.967534727354, 276.460153515902,
628.318530717959, 703.716754404114, 481.710873550435, 385.368698840348,
197.920337176157, 150.79644737231, 218.864288200089), Volume_cm_cubed = c(0.391651884147528,
0.477522083345649, 0.942477796076938, 0.0335103216382911,
2.16769893097696, 0.0879645943005142, 0.437728576400178,
0.272271363311115, 0.318871654339364, 1.22522113490002, 0.198967534727354,
0.198967534727354, 0.276460153515902, 0.628318530717959,
0.703716754404114, 0.481710873550435, 0.385368698840348,
0.197920337176157, 0.15079644737231, 0.218864288200089),
MD = c(22, 24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24,
25, 28, 23, 23, 21, 24, 22)), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))
Code
data_new <- data %>% group_by(ID, TimeStep) %>%
mutate(Diff = `Max Diameter (cm)` - dplyr::lag(`Max Diameter (cm)`))
Output
data_output <- structure(list(`Taxonomic Code` = c("PR", "PR", "PR", "PR", "PR",
"PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR",
"PR", "PR", "PR", "PR"), ID = structure(c(35L, 35L, 35L, 35L,
35L, 38L, 38L, 38L, 38L, 38L, 55L, 55L, 55L, 55L, 55L, 55L, 55L,
61L, 61L, 61L), .Label = c("H1051", "H108", "H110", "H1101",
"H112", "H113", "H116", "H118", "H1188", "H1211", "H122", "H125",
"H1253", "H1289", "H171", "H172", "H174", "H186", "H187", "H188",
"H189", "H191", "H192", "H236", "H237", "H244", "H252", "H254",
"H258", "H274", "H277", "H288", "H292", "H293", "H30", "H332",
"H366", "H37", "H374", "H396", "H466", "H479", "H484", "H499",
"H531", "H560", "H580", "H593", "H597", "H625", "H644", "H647",
"H649", "H653", "H66", "H693", "H695", "H712", "H728", "H737",
"H76", "H760", "H774", "H854", "H926", "H96", "H963", "H98",
"H985", "H991", "H996", "W1038", "W1101", "W1152", "W1154", "W1192",
"W1208", "W1209", "W1214", "W1227", "W1243", "W1245", "W1315",
"W1345", "W1361", "W1377", "W1399", "W1438", "W1494", "W1495",
"W1537", "W1557", "W1614", "W1636", "W1655", "W1669", "W1690",
"W1697", "W1729", "W1741", "W1758", "W1782", "W1785", "W1847",
"W1919", "W2000", "W2004", "W2011", "W2036", "W2044", "W2046",
"W2131", "W2133", "W234", "W249", "W251", "W254", "W307", "W355",
"W359", "W369", "W433", "W450", "W461", "W470", "W480", "W538",
"W542", "W544", "W584", "W601", "W606", "W781", "W79", "W807",
"W872", "W874", "W887", "W890", "W891", "W923", "W952"), class = "factor"),
Date = structure(c(17862, 17953, 18044, 18135, 18226, 17783,
17862, 17967, 18037, 18142, 17687, 17783, 17869, 17960, 18044,
18142, 18233, 17783, 17869, 17960), class = "Date"), Year = c("18",
"19", "19", "19", "19", "18", "18", "19", "19", "19", "18",
"18", "18", "19", "19", "19", "19", "18", "18", "19"), Site_long = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = c("Hanauma Bay", "Waikiki"), class = "factor"),
Shelter = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("High",
"Low"), class = "factor"), `Module #` = c(216, 216, 216,
216, 216, 215, 215, 215, 215, 215, 213, 213, 213, 213, 213,
213, 213, 213, 213, 213), Side = c("S", "S", "S", "S", "S",
"S", "S", "S", "S", "S", "N", "N", "N", "N", "N", "N", "N",
"N", "N", "N"), Location = c("D3", "D3", "D3", "D3", "D3",
"C1", "C1", "C1", "C1", "C1", "A1", "A1", "A1", "A1", "A1",
"A1", "A1", "A4", "A4", "A4"), Settlement_Area = c(0.75902336,
0.751433126, 0.607218688, 0.614808922, 0.622399155, 0.75902336,
0.751433126, 0.75902336, 0.75902336, 0.683121024, 0.75902336,
0.75902336, 0.65276009, 0.75902336, 0.614808922, 0.531316352,
0.599628454, 0.75902336, 0.65276009, 0.75902336), TimeStep = c(7,
8, 9, 10, 11, 6, 7, 8, 9, 10, 5, 6, 7, 8, 9, 10, 11, 6, 7,
8), size_class = c(3, 3, 3, 1, 5, 2, 3, 3, 3, 3, 2, 2, 3,
3, 3, 3, 3, 3, 3, 3), `Cover Code` = c(2, 1, 1, 1, 1, 1,
1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), `Max Diameter (cm)` = c(22,
24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24, 25, 28, 23,
23, 21, 24, 22), `Max Orthogonal (cm)` = c(17, 19, 20, 8,
30, 12, 19, 20, 21, 26, 19, 19, 22, 24, 24, 20, 16, 18, 12,
19), `Height (cm)` = c(2, 2, 3, 1, 3, 1, 2, 1, 1, 3, 1, 1,
1, 2, 2, 2, 2, 1, 1, 1), `Status Code` = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, "B", NA, NA, "PB", NA, NA,
NA, NA), area_mm_squared = c(374, 456, 600, 64, 1380, 168,
418, 520, 609, 780, 380, 380, 528, 600, 672, 460, 368, 378,
288, 418), area_cm_squared = c(3.74, 4.56, 6, 0.64, 13.8,
1.68, 4.18, 5.2, 6.09, 7.8, 3.8, 3.8, 5.28, 6, 6.72, 4.6,
3.68, 3.78, 2.88, 4.18), Volume_mm_cubed = c(391.651884147528,
477.522083345649, 942.477796076938, 33.5103216382911, 2167.69893097696,
87.9645943005142, 437.728576400178, 272.271363311115, 318.871654339364,
1225.22113490002, 198.967534727354, 198.967534727354, 276.460153515902,
628.318530717959, 703.716754404114, 481.710873550435, 385.368698840348,
197.920337176157, 150.79644737231, 218.864288200089), Volume_cm_cubed = c(0.391651884147528,
0.477522083345649, 0.942477796076938, 0.0335103216382911,
2.16769893097696, 0.0879645943005142, 0.437728576400178,
0.272271363311115, 0.318871654339364, 1.22522113490002, 0.198967534727354,
0.198967534727354, 0.276460153515902, 0.628318530717959,
0.703716754404114, 0.481710873550435, 0.385368698840348,
0.197920337176157, 0.15079644737231, 0.218864288200089),
MD = c(22, 24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24,
25, 28, 23, 23, 21, 24, 22), Diff = c(NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -20L), groups = structure(list(ID = structure(c(35L,
35L, 35L, 35L, 35L, 38L, 38L, 38L, 38L, 38L, 55L, 55L, 55L, 55L,
55L, 55L, 55L, 61L, 61L, 61L), .Label = c("H1051", "H108", "H110",
"H1101", "H112", "H113", "H116", "H118", "H1188", "H1211", "H122",
"H125", "H1253", "H1289", "H171", "H172", "H174", "H186", "H187",
"H188", "H189", "H191", "H192", "H236", "H237", "H244", "H252",
"H254", "H258", "H274", "H277", "H288", "H292", "H293", "H30",
"H332", "H366", "H37", "H374", "H396", "H466", "H479", "H484",
"H499", "H531", "H560", "H580", "H593", "H597", "H625", "H644",
"H647", "H649", "H653", "H66", "H693", "H695", "H712", "H728",
"H737", "H76", "H760", "H774", "H854", "H926", "H96", "H963",
"H98", "H985", "H991", "H996", "W1038", "W1101", "W1152", "W1154",
"W1192", "W1208", "W1209", "W1214", "W1227", "W1243", "W1245",
"W1315", "W1345", "W1361", "W1377", "W1399", "W1438", "W1494",
"W1495", "W1537", "W1557", "W1614", "W1636", "W1655", "W1669",
"W1690", "W1697", "W1729", "W1741", "W1758", "W1782", "W1785",
"W1847", "W1919", "W2000", "W2004", "W2011", "W2036", "W2044",
"W2046", "W2131", "W2133", "W234", "W249", "W251", "W254", "W307",
"W355", "W359", "W369", "W433", "W450", "W461", "W470", "W480",
"W538", "W542", "W544", "W584", "W601", "W606", "W781", "W79",
"W807", "W872", "W874", "W887", "W890", "W891", "W923", "W952"
), class = "factor"), TimeStep = c(7, 8, 9, 10, 11, 6, 7, 8,
9, 10, 5, 6, 7, 8, 9, 10, 11, 6, 7, 8), .rows = list(1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L)), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE))
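Answer
The problem is the grouping. When TimeStep is included in group_by(), every ID/TimeStep combination is its own group with only one row, and the lag of a single element is NA. That is why every value of Diff is NA.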
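A minimal sketch of why this happens, using a small toy tibble built from a few of the question's values (the names toy, single, multi, sizes_both and sizes_id are made up for illustration). It compares lag() on a one-element vector with lag() on a longer one, then counts the rows per group under each grouping:

```r
library(dplyr)

# lag() shifts a vector by one position. A length-one vector has no
# previous element, so the only value returned is NA.
single <- dplyr::lag(c(22))
multi  <- dplyr::lag(c(22, 24, 30))

# Toy data: a subset of the question's values (assumed for illustration).
toy <- tibble(
  ID       = c("H30", "H30", "H30", "H37", "H37"),
  TimeStep = c(7, 8, 9, 6, 7),
  MD       = c(22, 24, 30, 14, 22)
)

# Rows per group when grouping by ID and TimeStep versus ID alone.
sizes_both <- toy %>% count(ID, TimeStep)
sizes_id   <- toy %>% count(ID)

print(sizes_both)
print(sizes_id)
```

With both columns in the grouping, every group has n = 1, so lag() has nothing to look back at within any group.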
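Grouping by ID alone puts all records of a colony in one group, so only the first record of each colony gets NA:
library(dplyr)

# Group by colony ID only, so lag() looks at the previous record of the same colony
data %>%
  group_by(ID) %>%
  mutate(Diff = `Max Diameter (cm)` - dplyr::lag(`Max Diameter (cm)`))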
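Since the question title asks about case_when(), here is a hedged sketch of how the per-colony difference could be turned into a label. It is not part of the original answer: the Change column, its label strings, and the toy tibble are assumptions for illustration. It also passes order_by = TimeStep to lag(), so the comparison follows survey order even if the rows are not sorted.

```r
library(dplyr)

# Toy data: a subset of the question's values (assumed for illustration).
toy <- tibble(
  ID                  = c("H30", "H30", "H30", "H30", "H37", "H37"),
  TimeStep            = c(7, 8, 9, 10, 6, 7),
  `Max Diameter (cm)` = c(22, 24, 30, 8, 14, 22)
)

result <- toy %>%
  group_by(ID) %>%
  mutate(
    # Difference from the previous survey of the same colony.
    Diff = `Max Diameter (cm)` -
      dplyr::lag(`Max Diameter (cm)`, order_by = TimeStep),
    # Hypothetical labels; the first record of each colony has no
    # previous value, so Diff is NA there.
    Change = case_when(
      is.na(Diff) ~ "first record",
      Diff < 0    ~ "shrank",
      Diff > 0    ~ "grew",
      TRUE        ~ "no change"
    )
  ) %>%
  ungroup()

print(result)
```

The is.na() condition comes first because case_when() evaluates conditions in order, and a comparison such as Diff < 0 is itself NA when Diff is NA. Filtering on Change == "shrank" then gives the TimeStep values at which a colony shrank.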