R: Handling Time Columns in Data Frames and Converting Time Zones

Posted by 基督徒Isaac


Suppose a data frame cooling has a time column ProcessStartDate.

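# A number added to the result of as.Date.POSIXct() (a Date) counts in days
# A number added to the result of with_tz() (a POSIXct) counts in seconds
# US Pacific standard time is 16 hours behind China; US Eastern time is 13 hours behind, or 12 during daylight saving time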
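The two units in the notes above can be checked with base R alone. A minimal sketch, assuming a made-up timestamp rather than the cooling data:

```r
# A Date counts in days: + 1 moves to the next calendar day
d <- as.Date("2017-06-25")
d_next <- d + 1

# A POSIXct counts in seconds: + 1 moves one second forward
t <- as.POSIXct("2017-06-25 00:00:00", tz = "Asia/Shanghai")
t_next <- t + 1

print(d_next)
print(format(t_next, "%Y-%m-%d %H:%M:%S"))

# The same 16 hours written in each unit
sixteen_hours_in_days <- 16 / 24
sixteen_hours_in_secs <- 16 * 60 * 60
```

This is why the same correction appears as 16 / 24 on a Date and as 16 * 60 * 60 on a POSIXct.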



# Select the first row of the time column
cooling %>% select(ProcessStartDate) %>%
  slice(1)


ProcessStartDate
<S3: POSIXct>
2017-06-25	

[1] "2017-06-25 00:00:00 CST" 



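# unlist() drops the POSIXct class and leaves the 10-digit number of seconds since the epoch
# as.Date.POSIXct() turns that number into a Date counted in days; the date is taken in UTC, so midnight in China falls on the previous day
# with_tz() converts the Date to a date-time shown in the session's time zone, counted in seconds
# + 16 * 60 * 60 adds back the 16 hours cut off with the UTC date, giving midnight China time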
cooling %>% select(ProcessStartDate) %>%
  slice(1) %>% unlist(use.names = F) %>%
  as.Date.POSIXct() %>% with_tz() + 16 *60*60

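# as.Date.POSIXct() gives a Date counted in days, taken in UTC
# + 16 / 24 adds 16 hours, written as a fraction of a day, to reach midnight China time
# with_tz() then converts to a date-time counted in seconds; the result prints as a bare date, so add 1 (one second) afterwards if the time of day should be visible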
( cooling %>% select(ProcessStartDate) %>%
  slice(1) %>% unlist(use.names = F) %>%
  as.Date.POSIXct() + 16 /24
  ) %>% with_tz()
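The 16-hour correction is needed because the date is cut off in UTC. The sketch below uses base R and a made-up timestamp to show that naming the time zone when taking the date avoids the manual offset. It is an alternative under those assumptions, not the method used above.

```r
# Midnight in China, stored as an instant
t <- as.POSIXct("2017-06-25 00:00:00", tz = "Asia/Shanghai")
secs <- as.numeric(t)  # seconds since 1970-01-01 00:00:00 UTC

# The calendar date depends on the time zone used to read the instant
d_utc <- as.Date(t, tz = "UTC")            # date as seen in UTC
d_cn  <- as.Date(t, tz = "Asia/Shanghai")  # date as seen in China

# Rebuild the date-time from the raw seconds with an explicit time zone
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "Asia/Shanghai")

print(d_utc)
print(d_cn)
print(format(back, "%Y-%m-%d %H:%M:%S"))
```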

# Extract the date
cooling$ProcessStartDate %>% anydate()


# Group by year and month
cooling %>%
  mutate(
    year  = ProcessStartDate %>% year(),
    month = ProcessStartDate %>% month()) %>%
  group_by(year, month) %>%
  summarise(
    SumWeight = EndWeight %>% sum(),
    .groups = "keep")
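For readers without dplyr and lubridate at hand, the same year and month grouping can be sketched in base R. The toy data frame below is invented for illustration; only the column names ProcessStartDate and EndWeight come from the text.

```r
# Invented sample rows; the real cooling data is not shown in the post
toy <- data.frame(
  ProcessStartDate = as.POSIXct(
    c("2017-06-25 08:00:00", "2017-06-30 09:00:00", "2017-07-01 10:00:00"),
    tz = "Asia/Shanghai"
  ),
  EndWeight = c(10, 20, 5)
)

# Pull year and month out of the time column
toy$year  <- as.integer(format(toy$ProcessStartDate, "%Y"))
toy$month <- as.integer(format(toy$ProcessStartDate, "%m"))

# Sum EndWeight within each year and month
monthly <- aggregate(EndWeight ~ year + month, data = toy, FUN = sum)
print(monthly)
```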

Forcing a time zone

x is a vector of date-times,
for example x[1] = lubridate::ymd_hms("2021.12.31 0:0:0")

# Force the time zone to China: the clock time stays the same, the instant changes
lubridate::force_tz(x[i], tzone = 'Asia/Shanghai')
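force_tz() is easy to confuse with with_tz(). A short sketch of the difference, assuming lubridate is installed and using the sample value from the text:

```r
library(lubridate)

x <- ymd_hms("2021-12-31 00:00:00")  # parsed as UTC by default

# with_tz(): same instant, shown on a China clock
shown <- with_tz(x, "Asia/Shanghai")

# force_tz(): same clock reading, reinterpreted as China time (a different instant)
forced <- force_tz(x, "Asia/Shanghai")

# Gap between the original and the forced instant, in seconds
gap_secs <- as.numeric(x) - as.numeric(forced)

print(shown)
print(forced)
print(gap_secs)
```

force_tz() suits data whose clock values are already local but were read in as UTC; with_tz() suits data whose instants are right and only need displaying in another zone.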

Application

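# Example: split a production report into time periods and assign shifts
# The odbcConnect() arguments and the SQL query text are left out in the source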

myconn <-odbcConnect(

sqlQuery(
	channel = myconn,
  	query =) %>% 
  	filter(StartTime >= anytime::anytime("2022-1-1 07:45:00"))
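baiye <- 
  function(x = lubridate::ymd_hms("2021.12.31 0:0:0")) {
    
    # The day shift starts at 07:45, the night shift at 19:45
    origin = hms::as_hms("07:45:00")
    TimeStamp = c()
    
    for (i in seq_along(x)) {
      
      StartTime = lubridate::force_tz(x[i], tzone = 'Asia/Shanghai')
      date = anytime::anydate(StartTime)
      
      # Hours elapsed since 07:45 on the same calendar day
      time = hms::as_hms(StartTime)
      hour = as.numeric(time - origin) / 3600
      
      if (hour >= 0 & hour < 12) {
        # 07:45 to 19:44: the day shift of the same date
        stamp = hms::as_hms("07:45:00")
        TimeStamp[i] = paste(date, stamp)
        
      } else if (hour >= 12) {
        # 19:45 to midnight: the night shift of the same date
        stamp = hms::as_hms("19:45:00")
        TimeStamp[i] = paste(date, stamp)
        
      } else if (hour < 0) {
        # Midnight to 07:44: the night shift that began the day before
        stamp = hms::as_hms("19:45:00")
        TimeStamp[i] = paste(date - 1, stamp)
        
      } else {
        stamp = ""
        TimeStamp[i] = ""
      }
    }
    
    # TimeStamp is a character vector and has to be converted back to date-times
    y = lubridate::force_tz(lubridate::ymd_hms(TimeStamp),
                             tzone = 'Asia/Shanghai')
    return(y)
  }

# This function maps times from 07:45 to 19:44 and times from 19:45 to 07:44 the next day onto separate shift start times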
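The same bucketing can be written without a loop. The sketch below is a base R variant under stated assumptions: shift_start() is a hypothetical name, the input is already a POSIXct vector, and the zone has no daylight saving time (true for Asia/Shanghai), so adding seconds is safe.

```r
# shift_start() is a hypothetical helper, not part of the original post
shift_start <- function(x, tz = "Asia/Shanghai") {
  offset <- 7 * 3600 + 45 * 60  # 07:45 expressed in seconds after midnight

  # Seconds since local midnight for every element
  lt <- as.POSIXlt(x, tz = tz)
  secs <- lt$hour * 3600 + lt$min * 60 + lt$sec

  # Local midnight of the same calendar day
  day0 <- as.POSIXct(format(x, "%Y-%m-%d", tz = tz), tz = tz)

  # -1: before 07:45, 0: day shift, 1: night shift of the same date
  block <- floor((secs - offset) / (12 * 3600))

  day0 + offset + block * 12 * 3600
}

x <- as.POSIXct(
  c("2022-01-01 08:00:00", "2022-01-01 20:00:00", "2022-01-02 03:00:00"),
  tz = "Asia/Shanghai"
)
out <- shift_start(x)
print(format(out, "%Y-%m-%d %H:%M:%S"))
```

Flooring to 12-hour blocks after subtracting 07:45 replaces the three-way if/else: a time before 07:45 falls in block -1, which lands on 19:45 of the previous day.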
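panduan <- 
  function(X = lubridate::force_tz(lubridate::ymd_hms("2021.12.31 7:45:0"), 
                                   tzone = 'Asia/Shanghai')) {
    
    # Reference shift start, slot 0 of the 12-shift cycle
    a = lubridate::force_tz(lubridate::ymd_hms("2021.12.31 7:45:0"), 
                             tzone = 'Asia/Shanghai')
    b = 1/2   # one shift lasts half a day
    d = c()
    
    for (i in seq_along(X)) {
      
      x = X[i]
      # Fix the units to days: x - a alone picks secs, hours or days by itself,
      # which gives a wrong slot for gaps shorter than one day
      e = (as.numeric(difftime(x, a, units = "days")) / b) %% 12
    
      if (e %in% c(0,2,5,7)) {
        d[i] = "甲"
      } else if (e %in% c(8,10,1,3)) {
        d[i] = "乙"
      } else if (e %in% c(4,6,9,11)) {
        d[i] = "丙"
      } else {
        d[i] = ""
      }
    }
    
    return(d)
  }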

hj <- hj %>% mutate(
  banzu = TimeStamp %>% panduan()
)

baiye1 <- function(x = hms::as_hms("07:45:0")) {
  
  y = c()
  
  for (i in seq_along(x)) {
    a = hms::as_hms(x[i])
    
    if (a == hms::as_hms("07:45:0")) {
      y[i] = "白班"   # day shift
    } else if (a == hms::as_hms("19:45:0")) {
      y[i] = "夜班"   # night shift
    } else {
      y[i] = ""
    }
  }
  
  return(y)
}

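# panduan() spreads three crews over a six-day cycle, four days on and two off, in the order 甲乙甲乙丙甲丙甲乙丙乙丙 (day shifts 甲甲丙丙乙乙, night shifts 乙乙甲甲丙丙). Pretty neat, right?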
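The rotation can also be expressed as a lookup table indexed by the half-day slot. This is a sketch, not the author's code: crew_of() is a hypothetical name, the reference time and crew order are taken from the text, and the inputs are assumed to be exact shift start times. It also shows why the difference should be taken with fixed units.

```r
# Reference shift start from the text: 2021-12-31 07:45 China time
a <- as.POSIXct("2021-12-31 07:45:00", tz = "Asia/Shanghai")

# Crew for each of the 12 half-day slots in the cycle, slot 0 first
crew_cycle <- c("甲", "乙", "甲", "乙", "丙", "甲", "丙", "甲", "乙", "丙", "乙", "丙")

# crew_of() is a hypothetical helper name
crew_of <- function(x) {
  # Force the difference into days so every gap uses the same unit
  slot <- as.numeric(difftime(x, a, units = "days")) / 0.5
  crew_cycle[slot %% 12 + 1]
}

# Shift starts 0, 12, 24 and 48 hours after the reference
x <- a + c(0, 12, 24, 48) * 3600

# Units that plain subtraction picks for a 12-hour gap
auto_units <- units(x[2] - a)

crews <- crew_of(x)
print(auto_units)
print(crews)
```

Because plain subtraction reports a 12-hour gap in hours but a 24-hour gap in days, dividing the raw difference by 1/2 mixes scales; difftime() with units = "days" keeps the slot number consistent.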

Daily output

chanliang <- hj %>% 
    group_by(TimeStamp, banci) %>% 
    summarise(ExitWeight = sum(ExitWeight) / 1000, .groups = "drop_last") %>% 
    mutate(TimeStamp = TimeStamp %>% anydate())

chanliang %>% ggplot(aes(
    x = TimeStamp, y = ExitWeight,
    group = banci,
    colour = banci,
    linetype = banci,
    shape = banci
    )) +
    
    geom_line() + geom_point() + 
    theme_bw() +
    
    labs(x = "Date", y = "Output (t)") + 
    theme(legend.title = element_blank()) +
    
    geom_hline(aes(yintercept = ExitWeight %>% mean()), colour = 3)

Output by crew

tianshu <- hj %>% 
    group_by(banzu) %>% 
    summarise(days = TimeStamp %>% anydate() %>% n_distinct()) %>% 
    
    mutate(banzu = banzu %>% factor(levels = c("甲","乙","丙"))) %>% 
    arrange(banzu)

chanliang <- hj %>% 
  group_by(banzu) %>% 
  summarise(ExitWeight = sum(ExitWeight) / 1000) %>% 
  
  mutate(banzu = banzu %>% factor(levels = c("甲","乙","丙"))) %>% 
  arrange(banzu) %>% 
  
  # Output per working day; assumes tianshu lists the crews in the same order as this table
  mutate(xiaolv = ExitWeight / tianshu$days)
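Dividing by a column of tianshu pairs the two tables by row position. The sketch below, in base R with invented rows, matches them by crew name instead, so a different ordering cannot mismatch the division. Only the column names TimeStamp, banzu and ExitWeight come from the text.

```r
# Invented sample rows standing in for hj
hj_toy <- data.frame(
  TimeStamp = as.POSIXct(
    c("2022-01-01 07:45:00", "2022-01-01 19:45:00",
      "2022-01-02 07:45:00", "2022-01-02 07:45:00"),
    tz = "Asia/Shanghai"
  ),
  banzu = c("甲", "乙", "甲", "甲"),
  ExitWeight = c(1000, 2000, 3000, 500)
)

# Calendar day of each shift, read in China time
hj_toy$day <- as.Date(hj_toy$TimeStamp, tz = "Asia/Shanghai")

# Total output per crew in tonnes, and number of distinct days worked
total <- tapply(hj_toy$ExitWeight, hj_toy$banzu, sum) / 1000
days  <- tapply(hj_toy$day, hj_toy$banzu, function(d) length(unique(d)))

# Match by crew name rather than by position
xiaolv <- total / days[names(total)]
print(xiaolv)
```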

chanliang %>% ggplot(aes(
  x = banzu %>% factor(levels = c("甲","乙","丙")),
  y = xiaolv * 30,
  
  group = 1,
  colour = banzu %>% factor(levels = c("甲","乙","丙")),
  fill = banzu %>% factor(levels = c("甲","乙","丙")),
  shape = banzu %>% factor(levels = c("甲","乙","丙"))
  )) +
  
  geom_line(size = 1) + geom_point(size = 3) +
  geom_col(aes(y = ExitWeight), alpha = 0.1) +
  
  theme_bw() +
  theme(legend.position = 'none') +
  labs(x = "Crew", y = "Output (t)") + 
  
  geom_text(aes(label = xiaolv %>% round(1)), vjust = 2)

Notes

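When data is imported (say a data set named Cross whose first column holds dates), the lubridate and tidyr packages can still work with the time column even if it is stored as character, but base plot needs it converted first: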

as.Date(Cross[,1] %>% unlist)
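A minimal base R sketch of that conversion, using invented strings in place of the Cross column. The format argument is only needed when the text is not in the default year-month-day form with dashes or slashes.

```r
# Dates stored as text, as they often arrive after import
raw <- c("2021-12-30", "2021-12-31")
dates <- as.Date(raw)

# A dotted layout needs an explicit format
raw_dots <- c("2021.12.30", "2021.12.31")
dates_dots <- as.Date(raw_dots, format = "%Y.%m.%d")

print(class(raw))
print(class(dates))

# Date arithmetic works once the column is a Date
gap <- diff(dates)
print(gap)
```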

Crews can also be picked out by operator name

# Keep the rows whose 姓名 (name) column contains 张
Cross[stringr::str_detect(Cross$姓名, "张"),]
