R: handling datetime columns in a data frame and converting time zones
Posted by 基督徒Isaac
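Suppose there is a data frame cooling with a datetime column ProcessStartDate.
# Adding a number to the result of as.Date.POSIXct() adds days
# Adding a number to the result of with_tz() adds seconds
# US Pacific time is 16 hours behind China (15 during daylight saving time); US Eastern time is 13 hours behind (12 during daylight saving time)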
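library(dplyr)      # %>%, select(), slice(), mutate(), group_by(), summarise()
library(lubridate)  # with_tz(), force_tz(), year(), month()
library(anytime)    # anydate(), anytime()
# Select the first row of the datetime column
cooling %>% select(ProcessStartDate) %>%
  slice(1)
# Output:
#   ProcessStartDate
#   <S3: POSIXct>
#   2017-06-25
# [1] "2017-06-25 00:00:00 CST"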
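# unlist() drops the POSIXct class and leaves a 10-digit number of seconds since the epoch
# as.Date.POSIXct() turns that number into a Date, in units of days; the date is taken in UTC by default rather than in China time
# with_tz() converts it to the local time zone, in units of seconds
# + 16 * 60 * 60 then shifts the result by 16 hours to China time
cooling %>% select(ProcessStartDate) %>%
  slice(1) %>% unlist(use.names = FALSE) %>%
  as.Date.POSIXct() %>% with_tz() + 16 * 60 * 60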
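# as.Date.POSIXct() gives a Date, in units of days; the date is taken in UTC by default rather than in China time
# + 16 / 24 adds 16 hours, written as a fraction of a day, to reach China time
# with_tz() then converts the time zone, in units of seconds; the printed result only shows the date, so add 1 (second) afterwards if the time of day should be displayed
(cooling %>% select(ProcessStartDate) %>%
   slice(1) %>% unlist(use.names = FALSE) %>%
   as.Date.POSIXct() + 16 / 24
) %>% with_tz()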
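A fixed 16-hour shift only matches US Pacific time outside daylight saving time. As a hedged alternative, the sketch below uses base R's format() with a named time zone and lets the time zone database choose the offset. The sample timestamps and the helper name clock_in are made up for illustration; they are not taken from cooling.

```r
# Hypothetical sample instants, written as China wall-clock times
summer <- as.POSIXct("2017-06-25 00:00:00", tz = "Asia/Shanghai")
winter <- as.POSIXct("2017-12-25 00:00:00", tz = "Asia/Shanghai")

# Hypothetical helper: render the same instant on another zone's clock
clock_in <- function(x, tz) format(x, format = "%Y-%m-%d %H:%M", tz = tz)

summer_la <- clock_in(summer, "America/Los_Angeles")  # 15 hours behind (PDT)
winter_la <- clock_in(winter, "America/Los_Angeles")  # 16 hours behind (PST)
summer_ny <- clock_in(summer, "America/New_York")     # 12 hours behind (EDT)
winter_ny <- clock_in(winter, "America/New_York")     # 13 hours behind (EST)

print(c(summer_la, winter_la, summer_ny, winter_ny))
```

Naming the zone keeps the arithmetic out of the code, so the same line works for any date in the column.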
# Extract the date
cooling$ProcessStartDate %>% anydate()
# Group by year and month
cooling %>%
  mutate(
    year = ProcessStartDate %>% year(),
    month = ProcessStartDate %>% month()) %>%
  group_by(year, month) %>%
  summarise(
    SumWeight = EndWeight %>% sum(),
    .groups = "keep")
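Forcing a time zone
x is a vector of date-times,
e.g. x[1] = lubridate::ymd_hms("2021.12.31 0:0:0")
# Force the China time zone: the clock reading stays the same and only the zone changes
lubridate::force_tz(x, tzone = "Asia/Shanghai")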
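To make the difference between converting and forcing concrete, here is a small sketch that applies lubridate's with_tz() and force_tz() to the same value. The variable names shown and forced are illustrative only.

```r
library(lubridate)

# ymd_hms() parses into UTC unless a tz is given
x <- ymd_hms("2021-12-31 00:00:00")

# with_tz(): same instant, shown on the Shanghai clock (08:00 on 2021-12-31)
shown <- with_tz(x, tzone = "Asia/Shanghai")

# force_tz(): same clock reading (00:00), reinterpreted as Shanghai time,
# which is a different instant (16:00 UTC on 2021-12-30)
forced <- force_tz(x, tzone = "Asia/Shanghai")

gap_hours <- as.numeric(difftime(shown, forced, units = "hours"))
print(gap_hours)
```

force_tz() fits data whose clock readings were recorded in China time but parsed as UTC; with_tz() fits data whose instants are already correct and only need a different display zone.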
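Application
# Example: splitting a production report into time periods and assigning shifts
# odbcConnect() and sqlQuery() come from the RODBC package
myconn <- odbcConnect()   # connection arguments are not shown in the original
# Assumption: the query result is stored as hj, the data frame used further down
hj <- sqlQuery(
  channel = myconn,
  query = ) %>%           # query text is not shown in the original
  filter(StartTime >= anytime::anytime("2022-1-1 07:45:00"))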
# baiye(): map each timestamp to the start of the shift it falls in (07:45 or 19:45)
baiye <-
  function(x = lubridate::ymd_hms("2021.12.31 0:0:0")) {
    origin = hms::as_hms("07:45:00")
    TimeStamp = character(NROW(x))
    for (i in seq_len(NROW(x))) {
      StartTime = lubridate::force_tz(x[i], tzone = "Asia/Shanghai")
      date = anytime::anydate(StartTime)
      time = hms::as_hms(StartTime)
      # hours elapsed since 07:45 on the same calendar day (negative before 07:45)
      hour = as.numeric(time - origin) / 3600
      if (hour >= 0 && hour < 12) {
        # 07:45-19:44: day shift that started at 07:45 the same day
        TimeStamp[i] = paste(date, "07:45:00")
      } else if (hour >= 12) {
        # 19:45 or later: night shift that started at 19:45 the same day
        TimeStamp[i] = paste(date, "19:45:00")
      } else if (hour < 0) {
        # before 07:45: night shift that started at 19:45 the previous day
        TimeStamp[i] = paste(date - 1, "19:45:00")
      } else {
        TimeStamp[i] = ""
      }
    }
    # TimeStamp is a character vector, so convert it back to date-times
    y = lubridate::force_tz(lubridate::ymd_hms(TimeStamp),
                            tzone = "Asia/Shanghai")
    return(y)
  }
# This function handles times in 07:45-19:44 separately from times in 19:45-07:44 the next day
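The same bucketing can be expressed without a loop. The base R sketch below is not the author's method: it snaps each timestamp to the most recent 07:45 or 19:45 boundary with modular arithmetic. The function name shift_start, its utc_offset_hours parameter, and the sample times are hypothetical, and it assumes a zone with a fixed UTC offset (Asia/Shanghai is +8 with no daylight saving time).

```r
# Sketch: snap each timestamp to the start of its 12-hour shift (07:45 or 19:45).
# Assumption: the zone has a fixed UTC offset (+8 h for Asia/Shanghai, no DST).
shift_start <- function(x, utc_offset_hours = 8) {
  boundary <- 7 * 3600 + 45 * 60                 # 07:45 as seconds after midnight
  local_secs <- as.numeric(x) + utc_offset_hours * 3600
  # seconds elapsed since the most recent 07:45 / 19:45 boundary
  into_shift <- (local_secs - boundary) %% (12 * 3600)
  x - into_shift
}

# Hypothetical sample times
times <- as.POSIXct(c("2022-01-01 03:00:00", "2022-01-01 08:00:00",
                      "2022-01-01 19:44:59", "2022-01-01 19:45:00"),
                    tz = "Asia/Shanghai")
starts <- format(shift_start(times), "%Y-%m-%d %H:%M:%S")
print(starts)
```

Because the subtraction keeps the POSIXct class and its time zone, the result can be used directly as a grouping column.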
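# panduan(): assign a crew (甲, 乙 or 丙) from the shift start time
panduan <-
  function(X = lubridate::force_tz(lubridate::ymd_hms("2021.12.31 7:45:0"),
                                   tzone = "Asia/Shanghai")) {
    # reference shift start, position 0 of the rotation
    a = lubridate::force_tz(lubridate::ymd_hms("2021.12.31 7:45:0"),
                            tzone = "Asia/Shanghai")
    b = 1 / 2   # length of one shift, in days
    d = character(NROW(X))
    for (i in seq_len(NROW(X))) {
      x = X[i]
      # shifts elapsed since the reference, modulo the 12-shift cycle;
      # units are fixed to days because x - a would otherwise choose its units automatically
      e = round(as.numeric(difftime(x, a, units = "days")) / b) %% 12
      if (e %in% c(0, 2, 5, 7)) {
        d[i] = "甲"
      } else if (e %in% c(8, 10, 1, 3)) {
        d[i] = "乙"
      } else if (e %in% c(4, 6, 9, 11)) {
        d[i] = "丙"
      } else {
        d[i] = ""
      }
    }
    return(d)
  }
hj <- hj %>% mutate(
  banzu = TimeStamp %>% panduan()
)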
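# baiye1(): label a shift start time as day shift (白班) or night shift (夜班)
baiye1 <- function(x = hms::as_hms("07:45:0")) {
  y = character(NROW(x))
  for (i in seq_len(NROW(x))) {
    a = hms::as_hms(x[i])
    if (a == hms::as_hms("07:45:0")) {
      y[i] = "白班"
    } else if (a == hms::as_hms("19:45:0")) {
      y[i] = "夜班"
    } else {
      y[i] = ""
    }
  }
  return(y)
}
# panduan() splits a six-day cycle among three crews, four days on and two days off, in the order 甲乙甲乙丙甲丙甲乙丙乙丙 (day shifts 甲甲丙丙乙乙, night shifts 乙乙甲甲丙丙). Pretty neat, right?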
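The rotation can be checked by walking through one full cycle. The sketch below is a vectorized base R variant, not the original function: it looks the crew up in a 12-element vector built from the order stated above. The names cycle, crew_for, and origin are illustrative.

```r
# Crew at each position of the 12-shift cycle, in the order given in the text
cycle <- c("甲", "乙", "甲", "乙", "丙", "甲", "丙", "甲", "乙", "丙", "乙", "丙")

# Hypothetical helper: look up the crew for each shift start time
crew_for <- function(x, origin) {
  # half-days (shifts) elapsed since the reference shift start
  shifts <- round(as.numeric(difftime(x, origin, units = "days")) * 2)
  cycle[shifts %% 12 + 1]
}

origin <- as.POSIXct("2021-12-31 07:45:00", tz = "Asia/Shanghai")
starts <- origin + (0:11) * 12 * 3600      # twelve consecutive shift starts
crews  <- crew_for(starts, origin)

day_crews   <- crews[c(TRUE, FALSE)]       # shifts starting at 07:45
night_crews <- crews[c(FALSE, TRUE)]       # shifts starting at 19:45
print(rbind(day_crews, night_crews))
```

Each crew appears four times in the twelve shifts, which matches four days on and two days off over the six-day cycle.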
Daily output
library(ggplot2)
# Total output in tonnes per shift start and shift type (banci)
chanliang <- hj %>%
  group_by(TimeStamp, banci) %>%
  summarise(ExitWeight = sum(ExitWeight) / 1000, .groups = "drop_last") %>%
  mutate(TimeStamp = TimeStamp %>% anydate())
# One line per shift type, plus a horizontal line at the mean output
chanliang %>% ggplot(aes(
  x = TimeStamp, y = ExitWeight,
  group = banci,
  colour = banci,
  linetype = banci,
  shape = banci
)) +
  geom_line() + geom_point() +
  theme_bw() +
  labs(x = "Date", y = "Output (t)") +
  theme(legend.title = element_blank()) +
  geom_hline(aes(yintercept = ExitWeight %>% mean()), colour = 3)
Output by crew
# Number of distinct days worked by each crew
tianshu <- hj %>%
  group_by(banzu) %>%
  summarise(days = TimeStamp %>% anydate() %>% n_distinct()) %>%
  mutate(banzu = banzu %>% factor(levels = c("甲","乙","丙"))) %>%
  arrange(banzu)
# Total output in tonnes per crew, and output per day worked (xiaolv);
# both tables are sorted by crew, so their rows line up
chanliang <- hj %>%
  group_by(banzu) %>%
  summarise(ExitWeight = sum(ExitWeight) / 1000) %>%
  mutate(banzu = banzu %>% factor(levels = c("甲","乙","丙"))) %>%
  arrange(banzu) %>%
  mutate(xiaolv = ExitWeight / tianshu$days)
# Columns show total output; the line shows output per day scaled to 30 days
chanliang %>% ggplot(aes(
  x = banzu %>% factor(levels = c("甲","乙","丙")),
  y = xiaolv * 30,
  group = 1,
  colour = banzu %>% factor(levels = c("甲","乙","丙")),
  fill = banzu %>% factor(levels = c("甲","乙","丙")),
  shape = banzu %>% factor(levels = c("甲","乙","丙"))
)) +
  geom_line(size = 1) + geom_point(size = 3) +
  geom_col(aes(y = ExitWeight), alpha = 0.1) +
  theme_bw() +
  theme(legend.position = 'none') +
  labs(x = "Crew", y = "Output (t)") +
  geom_text(aes(label = xiaolv %>% round(1)), vjust = 2)
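Notes
When importing data (for example, a data set named Cross whose first column holds dates), the lubridate and tidyr packages can still operate on the date column even if it is stored as character, but base plot needs it converted first:
as.Date(Cross[,1] %>% unlist)
Crews can also be picked out by operator name:
# 姓名 is the operator name column; "张" matches names containing the surname Zhang
cross[stringr::str_detect(cross$姓名, "张"),]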