Merging of 2 rows if the date is the same or +- 7 days, and the ID is the same

Posted 2023-02-22

Asked: 2018-03-05 11:10:47

Question:

I have been struggling with this problem and cannot figure out how to approach it.

Here is an example:

ID  Hosp. date  Discharge date
1   2006-02-02  2006-02-04
1   2006-02-04  2006-02-18
1   2006-02-22  2006-03-24
1   2008-08-09  2008-09-14
2   2004-01-03  2004-01-08
2   2004-01-13  2004-01-15
2   2004-06-08  2004-06-28

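What I want is a way to combine rows by ID when the discharge date of one row is the same as the Hosp. date of the next row, or within +-7 days of it. The result should look like this: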

ID  Hosp. date  Discharge date
1   2006-02-02  2006-03-24
1   2008-08-09  2008-09-14
2   2004-01-03  2004-01-15
2   2004-06-08  2004-06-28

Comments:

Related: Collapse rows with overlapping ranges

Answer 1:

Using the data.table package:

# load the package
library(data.table)

# convert to a 'data.table'
setDT(d)
# make sure you have the correct order
setorder(d, ID, Hosp.date)

# summarise
d[, grp := cumsum(Hosp.date > (shift(Discharge.date, fill = Discharge.date[1]) + 7))
  , by = ID
  ][, .(Hosp.date = min(Hosp.date), Discharge.date = max(Discharge.date))
    , by = .(ID,grp)]

which gives:

   ID grp  Hosp.date Discharge.date
1:  1   0 2006-02-02     2006-03-24
2:  1   1 2008-08-09     2008-09-14
3:  2   0 2004-01-03     2004-01-15
4:  2   1 2004-06-08     2004-06-28
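To make the `grp` step easier to follow, the same logic can be spelled out one step at a time in base R. This is only a sketch under assumptions: the sample data is rebuilt here with `as.Date()` so the block runs on its own, and the helper names `prev_discharge`, `new_stay`, `key`, and `merged` are illustrative and not part of the original answer.

```r
# Sample data from the question, rebuilt so this block is self-contained
d <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  Hosp.date = as.Date(c("2006-02-02", "2006-02-04", "2006-02-22", "2008-08-09",
                        "2004-01-03", "2004-01-13", "2004-06-08")),
  Discharge.date = as.Date(c("2006-02-04", "2006-02-18", "2006-03-24", "2008-09-14",
                             "2004-01-08", "2004-01-15", "2004-06-28"))
)

# Sort so that "previous row" means "previous stay of the same patient"
d <- d[order(d$ID, d$Hosp.date), ]

# Discharge date of the previous row within each ID, as a day count;
# the first row of each ID falls back to its own discharge date
prev_discharge <- ave(as.numeric(d$Discharge.date), d$ID,
                      FUN = function(x) c(x[1], head(x, -1)))

# TRUE where the admission is more than 7 days after the previous discharge,
# i.e. where a new merged stay has to begin
new_stay <- as.numeric(d$Hosp.date) > prev_discharge + 7

# A running count of those breaks per ID is the group label
d$grp <- ave(as.integer(new_stay), d$ID, FUN = cumsum)

# Latest discharge per (ID, grp); the earliest admission is simply the
# first row of each group because the rows are already sorted
key <- interaction(d$ID, d$grp, drop = TRUE)
stay_end <- ave(as.numeric(d$Discharge.date), key, FUN = max)

merged <- d[!duplicated(key), c("ID", "grp", "Hosp.date")]
merged$Discharge.date <- as.Date(stay_end[!duplicated(key)], origin = "1970-01-01")

print(merged)
```

The first row of every ID always compares against its own discharge date, so it can never start a new group and each ID begins at `grp` 0.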

The same logic with dplyr:

library(dplyr)
d %>% 
  arrange(ID, Hosp.date) %>%
  group_by(ID) %>% 
  mutate(grp = cumsum(Hosp.date > (lag(Discharge.date, default = Discharge.date[1]) + 7))) %>% 
  group_by(grp, .add = TRUE) %>%  # '.add' replaces the deprecated 'add' argument (dplyr >= 1.0.0)
  summarise(Hosp.date = min(Hosp.date), Discharge.date = max(Discharge.date))

Data used:

d <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
                    Hosp.date = structure(c(13181, 13183, 13201, 14100, 12420, 12430, 12577), class = "Date"),
                    Discharge.date = structure(c(13183, 13197, 13231, 14136, 12425, 12432, 12597), class = "Date")),
               .Names = c("ID", "Hosp.date", "Discharge.date"), class = "data.frame", row.names = c(NA, -7L))
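One caveat may be worth checking against real data: both snippets compare each admission only with the discharge date of the row immediately before it. If a short stay is nested inside a longer one, that previous discharge is not the latest discharge seen so far, and a later overlapping stay can be split off. The sketch below is not part of the original answer. It uses a hypothetical three-row data set `x` and swaps in a running maximum (`cummax`) of the discharge dates as one possible way to handle that case.

```r
# Hypothetical data: the second stay lies entirely inside the first one
x <- data.frame(
  ID = c(1L, 1L, 1L),
  Hosp.date = as.Date(c("2006-01-01", "2006-01-05", "2006-02-20")),
  Discharge.date = as.Date(c("2006-02-28", "2006-01-10", "2006-03-05"))
)
x <- x[order(x$ID, x$Hosp.date), ]

# Shift a vector down by one position; the first element keeps its own value
shift_one <- function(v) c(v[1], head(v, -1))

# Original rule: compare with the previous row's discharge date
prev_plain <- ave(as.numeric(x$Discharge.date), x$ID, FUN = shift_one)
x$grp_plain <- ave(as.integer(as.numeric(x$Hosp.date) > prev_plain + 7),
                   x$ID, FUN = cumsum)

# Variant: compare with the latest discharge date seen so far within the ID
run_max <- ave(as.numeric(x$Discharge.date), x$ID, FUN = cummax)
prev_max <- ave(run_max, x$ID, FUN = shift_one)
x$grp_cummax <- ave(as.integer(as.numeric(x$Hosp.date) > prev_max + 7),
                    x$ID, FUN = cumsum)

print(x)
```

With this data, `grp_plain` puts the third row in a new group even though it starts before the first stay ends, while `grp_cummax` keeps all three rows together. On the sample data from the question, where no stay is nested in another, both rules give the same groups.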
