Calculating the mean difference between dates based on a condition in R

I have this dataset:

Date New_Renew
 2019-01-10 22:11:16  Renewing
 2019-02-23 00:21:48  Renewing
 2019-03-05 05:26:17  Renewing
 2019-04-18 15:05:10       NEW
 2019-04-18 15:07:52       NEW
 2019-04-26 11:32:25  Renewing
 2019-05-03 14:15:25  Renewing
 2019-05-08 21:10:08       NEW
 2019-05-16 13:35:57  Renewing
 2019-05-24 13:18:23  Renewing
 2019-06-01 12:42:21  Renewing
 2019-06-17 18:08:09  Renewing
 2019-06-26 13:40:29  Renewing
 2019-12-13 17:57:43  Renewing
 2020-01-03 11:49:14  Renewing
 2020-01-11 11:46:51  Renewing
 2020-01-14 21:08:08       NEW
 2020-01-18 21:14:30       NEW
 2020-01-21 16:08:37       NEW
 2020-01-28 11:41:44  Renewing
 2020-01-30 13:34:21  Renewing
 2020-02-03 13:29:37  Renewing
 2020-02-18 17:15:52  Renewing
 2020-02-20 13:37:52  Renewing
 2020-02-24 12:55:25  Renewing
 2020-02-26 21:13:38       NEW
 2020-03-04 13:23:41  Renewing
 2020-03-09 16:48:36  Renewing

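What I want is the mean of the differences between the dates associated with NEW, that is, between the rows where the New_Renew variable equals NEW. In short: how often does a user make a new transaction?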

Answer
library(xts)        # time-indexed objects; also attaches zoo, which provides index()
library(lubridate)  # ymd_hms() for parsing the timestamps


DT <- read.table(text = 'Date, New_Renew
 2019-01-10 22:11:16,Renewing
 2019-02-23 00:21:48,Renewing
 2019-03-05 05:26:17,Renewing
 2019-04-18 15:05:10,NEW
 2019-04-18 15:07:52,NEW
 2019-04-26 11:32:25,Renewing
 2019-05-03 14:15:25,Renewing
 2019-05-08 21:10:08,NEW
 2019-05-16 13:35:57,Renewing
 2019-05-24 13:18:23,Renewing
 2019-06-01 12:42:21,Renewing
 2019-06-17 18:08:09,Renewing
 2019-06-26 13:40:29,Renewing
 2019-12-13 17:57:43,Renewing
 2020-01-03 11:49:14,Renewing
 2020-01-11 11:46:51,Renewing
 2020-01-14 21:08:08,NEW
 2020-01-18 21:14:30,NEW
 2020-01-21 16:08:37,NEW
 2020-01-28 11:41:44,Renewing
 2020-01-30 13:34:21,Renewing
 2020-02-03 13:29:37,Renewing
 2020-02-18 17:15:52,Renewing
 2020-02-20 13:37:52,Renewing
 2020-02-24 12:55:25,Renewing
 2020-02-26 21:13:38,NEW
 2020-03-04 13:23:41,Renewing
 2020-03-09 16:48:36,Renewing', 
                 sep = ',', 
                 header = TRUE)

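# Build an xts object indexed by the parsed timestamps.
# Its core data is a character matrix: column 1 = Date, column 2 = New_Renew.
df <- xts(DT, order.by = ymd_hms(DT$Date))

dif <- DT
dif$difference <- NA_real_

# Row number of the reference 'Renewing' row (0 = none seen yet)
renewal <- 0

for (i in seq_len(nrow(df))) {
  status <- coredata(df)[i, 2]

  if (status == 'Renewing' && renewal == 0) {
    # First 'Renewing' row: remember it as the reference point
    renewal <- i
  } else if (status == 'Renewing') {
    next
  } else if (renewal != 0) {
    # 'NEW' row: days elapsed since the reference 'Renewing' row
    dif[i, 'difference'] <- as.numeric(
      difftime(index(df)[i], index(df)[renewal], units = 'days'))
  }
}

# Mean number of days from the reference 'Renewing' row to each 'NEW' row
mean_diff <- mean(dif$difference, na.rm = TRUE)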
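
The loop above measures each NEW row against the first Renewing row. If the goal is the gap between consecutive NEW transactions, as the question describes, a base R sketch could look like the following. It is not the answerer's method. The names dat, new_times, gaps and mean_gap_days are illustrative, only an excerpt of the data is typed in, and UTC is assumed for the timestamps.

```r
# Assumed setup: the seven NEW rows from the question plus two Renewing rows.
# The time zone (UTC) is an assumption; the question does not state one.
dat <- data.frame(
  Date = as.POSIXct(c("2019-03-05 05:26:17", "2019-04-18 15:05:10",
                      "2019-04-18 15:07:52", "2019-04-26 11:32:25",
                      "2019-05-08 21:10:08", "2020-01-14 21:08:08",
                      "2020-01-18 21:14:30", "2020-01-21 16:08:37",
                      "2020-02-26 21:13:38"),
                    tz = "UTC"),
  New_Renew = c("Renewing", "NEW", "NEW", "Renewing", "NEW",
                "NEW", "NEW", "NEW", "NEW")
)

# Keep only the NEW timestamps, in chronological order
new_times <- sort(dat$Date[dat$New_Renew == "NEW"])

# Gaps between consecutive NEW rows, forced to days so the units are unambiguous
gaps <- as.numeric(diff(new_times), units = "days")

# Average number of days between one NEW transaction and the next
mean_gap_days <- mean(gaps)
print(mean_gap_days)
```

Under the UTC assumption this comes to a little over 52 days; a time zone with daylight saving shifts the figure slightly.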
Another answer

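Use aggregate() with diff(), converting the differences to days explicitly. diff() on date-times chooses its units automatically (seconds, minutes, hours or days), so dividing by a fixed 60*24 is only correct for a group where it happens to return minutes. Here dat is the data frame with Date stored as POSIXct.

aggregate(Date ~ New_Renew, dat, function(x) mean(as.numeric(diff(x), units = "days")))
#   New_Renew     Date
# 1       NEW 52.38292
# 2  Renewing 21.18879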
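
Why a fixed divisor such as 60*24 is fragile can be seen from how diff() picks units for date-times. The following is a minimal sketch with made-up timestamps (x and y are illustrative, UTC is assumed), showing that the unit depends on the smallest gap in each vector:

```r
# Made-up timestamps; the smallest gap in x is 2 minutes
x <- as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 00:02:00",
                  "2020-01-03 00:02:00"), tz = "UTC")

# Made-up timestamps; every gap in y is at least one day
y <- as.POSIXct(c("2020-01-01 00:00:00", "2020-01-03 00:00:00",
                  "2020-01-06 00:00:00"), tz = "UTC")

units(diff(x))  # "mins": the smallest gap is under an hour
units(diff(y))  # "days": the smallest gap is a day or more

# Asking for days explicitly gives comparable numbers for both vectors
mean_x_days <- mean(as.numeric(diff(x), units = "days"))
mean_y_days <- mean(as.numeric(diff(y), units = "days"))
```

Because the unit is chosen per call, a grouped computation can return minutes for one group and days for another, which is why the unit is worth stating explicitly.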
