Case query with groupings on generated column
【Posted】: 2018-08-01 23:22:45
【Question】:
Here is an example of some pseudo-SQL I'm working on.
select count(*) as "count",
       time2.iso_timestamp - time1.iso_timestamp as "time_to_active",
       case
           when ("time_to_active" >= 1day and "time_to_active" <= 5days) then '1'
           when ("time_to_active" >= 6days and "time_to_active" <= 11days) then '2'
           when ("time_to_active" >= 12days and "time_to_active" <= 20days) then '3'
           when ("time_to_active" >= 21days and "time_to_active" <= 30days) then '4'
           when ("time_to_active" >= 31days) then '5'
       end as timetoactivegroup
from t
inner join t1 on t.p_id = t1.p_id
join timestamp time1 on t.timestamp_id = t1.id
join timestamp time2 on t1.timestamp_id = t2.id
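I'm essentially trying to query groups in which a calculated column falls within a range, for example orders that took between n and y days. The main problem I'm having is generating the counts based on those groupings.
I can get the select query to display the calculated values without any problem.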
【Comments】:
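I'm not sure what you are trying to do. Could you elaborate on your problem with sample data and expected output? Are you trying to sort some durations into different groups and then count how many elements each group contains? I once had a similar problem. I put the groups into a table as range types and joined against it with JOIN ON a.range @> b.element, selecting a.id as group_id. Second step: group by group_id.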
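The range-table approach described in the comment above could look roughly like the following PostgreSQL sketch. The table names `bucket` and `durations` and their columns are hypothetical, chosen only for illustration; the bucket bounds mirror the day ranges in the question.

```sql
-- Hypothetical lookup table: one row per bucket, stored as an integer range of days.
create table bucket (
    id   int primary key,
    days int4range not null
);

insert into bucket (id, days) values
    (1, '[1,6)'),    -- 1 to 5 days
    (2, '[6,12)'),   -- 6 to 11 days
    (3, '[12,21)'),  -- 12 to 20 days
    (4, '[21,31)'),  -- 21 to 30 days
    (5, '[31,)');    -- 31 days or more (no upper bound)

-- Hypothetical sample durations, in whole days.
create table durations (time_to_active int not null);
insert into durations values (1), (3), (34);

-- Step 1: attach each duration to the bucket whose range contains it (@>).
-- Step 2: count the rows per bucket.
select b.id as group_id, count(*) as row_count
from durations d
join bucket b on b.days @> d.time_to_active
group by b.id
order by b.id;
```

With these three sample rows, buckets 1 and 5 come back with counts of 2 and 1. Because this is an inner join, a duration that matches no range (zero or negative, for instance) is silently dropped from the counts.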
【Answer 1】:
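Standard SQL does not let you group by a select-list alias (PostgreSQL accepts a bare output-column name in GROUP BY, but not every database does, and an alias cannot be referenced inside another expression), so the portable approach is to repeat the grouping expression in the GROUP BY clause.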
GROUP BY case
    when ("time_to_active" >= 1day and "time_to_active" <= 5days) then '1'
    when ("time_to_active" >= 6days and "time_to_active" <= 11days) then '2'
    when ("time_to_active" >= 12days and "time_to_active" <= 20days) then '3'
    when ("time_to_active" >= 21days and "time_to_active" <= 30days) then '4'
    when ("time_to_active" >= 31days) then '5'
end
Or you can group by column position:
GROUP BY 3
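As a runnable illustration of grouping by position, here is a minimal PostgreSQL sketch. The table `activation` and its columns are hypothetical stand-ins for the joined result in the question, and the day difference is computed by casting to `date`, which is an assumption about how the durations should be measured (whole calendar days).

```sql
-- Hypothetical table standing in for the joined result in the question.
create table activation (started_at timestamp, activated_at timestamp);
insert into activation values
    ('2018-11-10', '2018-11-11'),
    ('2018-11-08', '2018-11-11'),
    ('2018-10-08', '2018-11-11');

-- In PostgreSQL, date - date yields an integer number of days.
select case
           when (activated_at::date - started_at::date) between 1 and 5   then '1'
           when (activated_at::date - started_at::date) between 6 and 11  then '2'
           when (activated_at::date - started_at::date) between 12 and 20 then '3'
           when (activated_at::date - started_at::date) between 21 and 30 then '4'
           when (activated_at::date - started_at::date) >= 31             then '5'
       end as timetoactivegroup,
       count(*) as row_count
from activation
group by 1   -- position of the CASE expression in this select list
order by 1;
```

Here the CASE expression is the first select item, so the position is 1; in the question's query it is the third item, hence `GROUP BY 3`. Note that the raw difference is deliberately not selected, since any non-aggregated column in the select list would also have to appear in GROUP BY.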
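【Comments】:
In Snowflake you can group by a column alias. But since two databases are named, this is confusing.
Why is this question tagged postgresql?
【Answer 2】:
Ignoring the pseudo-SQL (the time literals), and also ignoring the table joins, which reference an undefined table t2:
If you have rows with two timestamps, where timestamp_a is earlier than timestamp_b, then the error I suspect you are hitting comes from selecting the difference as its own column, time2.iso_timestamp - time1.iso_timestamp as "time_to_active". That gives you two columns you need to group by, but you don't actually want time_to_active in your result; otherwise the CASE block used to aggregate the answer doesn't make much sense.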
So suppose I have a table with a few rows (this just represents what your joined tables would look like):
create or replace table t (timestamp_a timestamp_ntz, timestamp_b timestamp);
insert into t values ('2018-11-10','2018-11-11')
,('2018-11-08','2018-11-11')
,('2018-10-08','2018-11-11');
select datediff('day', timestamp_a, timestamp_b) as time_to_active from t;
This gives 1, 3, 34. Wrapping that in a sub-select (it could equally be written as a CTE):
select case when (time_to_active >= 1 and time_to_active < 6) then '1'
when (time_to_active >= 6 and time_to_active < 12) then '2'
when (time_to_active >= 12 and time_to_active < 21) then '3'
when (time_to_active >= 21 and time_to_active < 31) then '4'
when (time_to_active >= 31) then '5'
end as time_to_active_group
,count(*) as count
from (
select datediff('day', timestamp_a, timestamp_b) as time_to_active from t
) as A
group by time_to_active_group;
gives:
1, 2
5, 1
because 2 rows fall in the 1-5 bucket and 1 row falls in the >= 31 bucket.
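For comparison, the same query can be written with a CTE instead of a sub-select. This is a sketch in Snowflake SQL using the sample table from this answer; the CTE name `durations` and the alias `row_count` are hypothetical names introduced here.

```sql
-- Same sample data as in the answer above.
create or replace table t (timestamp_a timestamp_ntz, timestamp_b timestamp_ntz);
insert into t values ('2018-11-10', '2018-11-11')
                    ,('2018-11-08', '2018-11-11')
                    ,('2018-10-08', '2018-11-11');

-- The CTE computes the duration once; the outer query buckets and counts it.
with durations as (
    select datediff('day', timestamp_a, timestamp_b) as time_to_active
    from t
)
select case
           when (time_to_active >= 1  and time_to_active < 6)  then '1'
           when (time_to_active >= 6  and time_to_active < 12) then '2'
           when (time_to_active >= 12 and time_to_active < 21) then '3'
           when (time_to_active >= 21 and time_to_active < 31) then '4'
           when (time_to_active >= 31)                         then '5'
       end as time_to_active_group
      ,count(*) as row_count
from durations
group by time_to_active_group
order by time_to_active_group;
```

The result is the same as with the sub-select; the CTE form mainly helps readability once more steps are layered on top of the computed duration.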
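One other issue: you are not handling timestamps that fall on the same day, or cases where the end time is earlier than the start time, i.e. time_to_active <= 0. Those rows match none of the CASE branches, so they end up in a NULL group.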
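One possible way to make those rows visible is to give them an explicit bucket. The sketch below (Snowflake SQL) adds a `'0'` bucket for non-positive durations; the label `'0'` and the two extra sample rows are assumptions made for illustration, not part of the original question.

```sql
create or replace table t (timestamp_a timestamp_ntz, timestamp_b timestamp_ntz);
insert into t values ('2018-11-10', '2018-11-11')
                    ,('2018-11-08', '2018-11-11')
                    ,('2018-10-08', '2018-11-11')
                    ,('2018-11-11', '2018-11-11')   -- same day: 0 days
                    ,('2018-11-12', '2018-11-11');  -- end before start: -1 day

select case
           when (time_to_active <= 0)                          then '0'  -- hypothetical catch-all label
           when (time_to_active >= 1  and time_to_active < 6)  then '1'
           when (time_to_active >= 6  and time_to_active < 12) then '2'
           when (time_to_active >= 12 and time_to_active < 21) then '3'
           when (time_to_active >= 21 and time_to_active < 31) then '4'
           when (time_to_active >= 31)                         then '5'
       end as time_to_active_group
      ,count(*) as row_count
from (
    select datediff('day', timestamp_a, timestamp_b) as time_to_active from t
) as A
group by time_to_active_group
order by time_to_active_group;
```

With this sample data the two added rows are counted under `'0'` rather than under NULL. Whether such rows should be bucketed, filtered out with a WHERE clause, or treated as data errors depends on what a non-positive duration means in your data.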