Using TIME_DIFF with multiple conditions in Google BigQuery
Posted: 2021-04-08 03:01:04
Question: I am trying to calculate the hours worked on specific dates in Google BigQuery (SQL).
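Day work pays $10 per hour, while night work pays $15 per hour. Day time is defined as 6 AM to 10 PM, and night time as 10 PM to 6 AM.
The employees are limousine drivers, so they can work flexible hours.
Below is a sample of my table: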
id | start_at | end_at | date |
---|---|---|---|
abc123 | 04:00:00 | 07:00:00 | 2020-01-05 |
abc123 | 09:00:00 | 15:32:00 | 2020-01-05 |
abc123 | 23:00:00 | 23:35:00 | 2020-01-05 |
abc123 | 23:40:00 | 23:59:00 | 2020-01-05 |
abc123 | 23:59:00 | 01:35:00 | 2020-01-05 |
abc123 | 02:02:00 | 04:35:00 | 2020-01-06 |
abc123 | 05:40:00 | 06:59:00 | 2020-01-06 |
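The actual time worked is the difference between start_at and end_at, but handling the day and night conditions is making my query messy.
*The date column is based on start_at. Even if you start at 11:59 PM and finish at 12:05 AM the next day, the date follows start_at, not the date of end_at.
Any ideas? Thanks in advance!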
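Because the date column follows start_at, a row whose end_at is not later than its start_at ended on the next calendar day. Here is a minimal sketch of making that explicit, assuming no shift lasts 24 hours or longer. The column names shift_start, shift_end and worked_minutes are illustrative and not part of the original table; the two rows are taken from the sample above.

```sql
-- Sketch: turn (date, start_at, end_at) into a DATETIME interval.
-- Assumption: a shift is shorter than 24 hours, so end_at <= start_at
-- means the shift ended on the following calendar day.
with shifts as (
  select 'abc123' as id, time '23:40:00' as start_at, time '23:59:00' as end_at, date '2020-01-05' as date
  union all
  select 'abc123', time '23:59:00', time '01:35:00', date '2020-01-05'
),
intervals as (
  select
    id,
    datetime(date, start_at) as shift_start,
    -- Roll the end over to the next day when the shift crosses midnight.
    datetime(if(end_at > start_at, date, date_add(date, interval 1 day)), end_at) as shift_end
  from shifts
)
select
  id,
  shift_start,
  shift_end,
  datetime_diff(shift_end, shift_start, minute) as worked_minutes
from intervals
```

With real DATETIME endpoints, the 23:59 to 01:35 row no longer yields a negative difference the way TIME_DIFF on the bare TIME values does.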
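Comments:
What is the smallest unit of time for payment? That is, if you work 72 minutes, are you paid for 1 hour, or for more than 1 hour? Are you stuck with this data model, i.e. can you add to it?
Answer 1: Consider the solution below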
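-- Splits one shift into per-day counts of 'day' and 'night' minutes.
create temp function night_day_split(start_at time, end_at time, date date) as (array(
  select as struct
    extract(date from time_point) day,
    -- Day time is 06:00-22:00, i.e. hours 6 through 21; hour 22 is already night.
    if(extract(hour from time_point) between 6 and 21, 'day', 'night') day_night,
    count(1) minutes
  from unnest(generate_timestamp_array(
    timestamp(datetime(date, start_at)),
    -- The array includes both endpoints, so stop one minute before end_at.
    -- An end_at at or before start_at means the shift ended on the next day.
    timestamp_sub(timestamp(datetime(if(start_at < end_at, date, date + 1), end_at)), interval 1 minute),
    interval 1 minute
  )) time_point
  group by 1, 2
));
-- One row per id and calendar day, with the minutes worked in each band.
select id, day,
  sum(if(day_night = 'day', minutes, null)) day_minutes,
  sum(if(day_night = 'night', minutes, null)) night_minutes
from yourtable,
unnest(night_day_split(start_at, end_at, date)) v
group by id, day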
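Applied to the sample data in your question, the query returns one row per id and day with the day_minutes and night_minutes totals.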
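The query above returns minutes rather than pay. One way to turn them into an amount is sketched below, assuming pay is prorated by the minute at the $10 day and $15 night hourly rates from the question (the question does not state the rounding rule). The minutes_per_day CTE and the pay_usd column are hypothetical names, and its single row is the 04:00-07:00 sample shift split by hand.

```sql
-- Sketch: convert day/night minutes into pay.
-- Assumption: pay is prorated per minute; no rounding to whole hours.
with minutes_per_day as (
  -- Hand-computed split of the 04:00-07:00 sample shift:
  -- 04:00-06:00 is night (120 min), 06:00-07:00 is day (60 min).
  select 'abc123' as id, date '2020-01-05' as day, 60 as day_minutes, 120 as night_minutes
)
select
  id,
  day,
  -- ifnull guards against days that have only day minutes or only night minutes.
  round(ifnull(day_minutes, 0) / 60 * 10 + ifnull(night_minutes, 0) / 60 * 15, 2) as pay_usd
from minutes_per_day
```

For this row the expression works out to 60/60 * 10 + 120/60 * 15 = 40. In practice the CTE would be replaced by the result of the day/night split query.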
Answer 2: You can try the code below:
with mytable as (
  select 'abc123' id, cast('04:00:00' as time) start_dt, cast('07:00:00' as time) end_dt, date('2020-01-05') date union all
  select 'abc123', cast('09:00:00' as time), cast('15:32:00' as time), date('2020-01-05') union all
  select 'abc123', cast('23:00:00' as time), cast('23:35:00' as time), date('2020-01-05') union all
  select 'abc123', cast('23:40:00' as time), cast('23:59:00' as time), date('2020-01-05') union all
  select 'abc123', cast('23:59:00' as time), cast('01:35:00' as time), date('2020-01-05') union all
  select 'abc123', cast('02:02:00' as time), cast('04:35:00' as time), date('2020-01-06') union all
  select 'abc123', cast('05:40:00' as time), cast('06:59:00' as time), date('2020-01-06')
)
select id, date, sum(value) as sal from (
  select id, date,
    case
      -- Entirely within day time (06:00-22:00): $10 per hour.
      when start_dt >= cast('06:00:00' as time) and end_dt <= cast('22:00:00' as time) and start_dt < end_dt then (time_diff(end_dt, start_dt, minute)/60) * 10
      -- Entirely before 06:00: $15 per hour.
      when start_dt < cast('06:00:00' as time) and end_dt <= cast('06:00:00' as time) then (time_diff(end_dt, start_dt, minute)/60) * 15
      -- Starts before 06:00 and ends in day time: split the shift at 06:00.
      when start_dt < cast('06:00:00' as time) and end_dt <= cast('22:00:00' as time) then (time_diff(cast('06:00:00' as time), start_dt, minute)/60) * 15 + (time_diff(end_dt, cast('06:00:00' as time), minute)/60) * 10
      -- Starts at or after 22:00 and ends after midnight, by 06:00: the TIME difference is negative, so add 24 hours (1440 minutes).
      when start_dt >= cast('22:00:00' as time) and end_dt <= cast('06:00:00' as time) then ((time_diff(end_dt, start_dt, minute) + 1440)/60) * 15
      -- Starts and ends between 22:00 and midnight: $15 per hour.
      when start_dt >= cast('22:00:00' as time) and end_dt >= cast('22:00:00' as time) then (time_diff(end_dt, start_dt, minute)/60) * 15
      -- Other shapes (for example a shift that starts in day time and ends after 22:00) are not handled and pay 0.
      else 0
    end as value
  from mytable) group by id, date
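The query returns one row per id and date, with the total pay in sal.
You can further group by month to get the monthly pay.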
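The monthly grouping mentioned here could look like the sketch below. The daily_pay CTE stands in for the per-day result (columns id, date, sal); its three rows are made-up illustrative amounts, not results for the sample data, and pay_month and monthly_sal are hypothetical column names.

```sql
-- Sketch: roll daily pay up to one row per id and calendar month.
with daily_pay as (
  -- Illustrative values only; in practice this is the per-day query result.
  select 'abc123' as id, date '2020-01-05' as date, 100.0 as sal union all
  select 'abc123', date '2020-01-06', 50.0 union all
  select 'abc123', date '2020-02-01', 20.0
)
select
  id,
  -- date_trunc maps every date to the first day of its month.
  date_trunc(date, month) as pay_month,
  sum(sal) as monthly_sal
from daily_pay
group by id, pay_month
```

Because the date column follows start_at, a shift that starts on the last evening of a month is counted in that month even if it ends on the 1st.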