To pivot a table based on a specific event value using Query
【Posted】: 2020-08-05 08:18:54
【Question】:
I want to make Table A look like Table B.
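I want to see which events a user triggered before a purchase event.
I used row_number() over (partition by client_id, event_type order by time) and then pivoted the result, but that gives only a plain pivot. How should I write the logic?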
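Table A
+-----------+------------+-------+-------------+
| client_id | event_type | count | time        |
+-----------+------------+-------+-------------+
| A         | cart       | 1     | AM 12:00:00 |
| A         | view       | 4     | AM 12:01:00 |
| A         | purchase   | 2     | AM 12:05:00 |
| A         | view       | 2     | AM 12:10:00 |
| B         | view       | 3     | AM 12:03:00 |
| B         | purchase   | 1     | AM 12:05:00 |
| B         | view       | 2     | AM 12:10:00 |
+-----------+------------+-------+-------------+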
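Table B
+-----------+------+------+----------+
| client_id | view | cart | purchase |
+-----------+------+------+----------+
| A         | 4    | 1    | 2        |
| A         | 2    | 0    | 0        |
| B         | 3    | 0    | 1        |
| B         | 2    | 0    | 0        |
+-----------+------+------+----------+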
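【Comments】:
Why are there duplicate client IDs in Table A?
Client A has 4 views at AM 12:01:00 because they viewed the product 4 times before purchasing it at 12:05. Client A has another view row later (AM 12:10:00) because they viewed another product twice (2 views) at 12:10. @AminShojaei
Amazon Redshift is based on PostgreSQL, so I wrote that. @a_horse_with_no_name
What about Table B? Why are there duplicate IDs in Table B? I suggest you rename your tables and describe them briefly, or at least write down the result you want.
As I said: although Redshift is based on a (very, very) old version of Postgres, the two are very different (e.g. for Postgres, my answer would include a filter (..) operator to do that).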
【Answer 1】:
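Here is one way to do it. In the grp_split block, I treat the set of events leading up to a purchase as a single "session/activity": the first row of each session (a client's first row, or the row right after a purchase) gets its row number as grp, and every other row gets null.
Then, in the block x, I complete the grouping with max(grp) over(partition by client_id order by time1) as grp2, which replaces each null with the previous non-null value.
After that, it is just a matter of pivoting the rows into the view, cart, and purchase columns.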
with data
as (
    -- sample rows from Table A
    select 'A' as client_id, 'cart' as event_type, 1 as count1, cast('AM 12:00:00' as time) as time1 union all
    select 'A' as client_id, 'view' as event_type, 4 as count1, cast('AM 12:01:00' as time) as time1 union all
    select 'A' as client_id, 'purchase' as event_type, 2 as count1, cast('AM 12:05:00' as time) as time1 union all
    select 'A' as client_id, 'view' as event_type, 2 as count1, cast('AM 12:10:00' as time) as time1 union all
    select 'B' as client_id, 'view' as event_type, 3 as count1, cast('AM 12:03:00' as time) as time1 union all
    select 'B' as client_id, 'purchase' as event_type, 1 as count1, cast('AM 12:05:00' as time) as time1 union all
    select 'B' as client_id, 'view' as event_type, 2 as count1, cast('AM 12:10:00' as time) as time1
)
, grp_split
as (
    -- a group starts on a client's first row or on the row right after a purchase;
    -- every other row gets grp = null
    select case when lag(event_type) over(partition by client_id order by time1) = 'purchase'
                  or lag(event_type) over(partition by client_id order by time1) is null
                then row_number() over(partition by client_id order by time1)
           end as grp
         , *
    from data
)
select x.client_id
     , max(case when event_type = 'view' then count1 else 0 end) as view
     , max(case when event_type = 'cart' then count1 else 0 end) as cart
     , max(case when event_type = 'purchase' then count1 else 0 end) as purchase
from (
    -- the running max carries the last non-null grp forward to every row of its group
    select *
         , max(grp) over(partition by client_id order by time1) as grp2
    from grp_split
) x
group by client_id
       , grp2
order by client_id
       , grp2  -- keep each client's groups in time order
Output
+-----------+------+------+----------+
| client_id | view | cart | purchase |
+-----------+------+------+----------+
| A | 4 | 1 | 2 |
| A | 2 | 0 | 0 |
| B | 3 | 0 | 1 |
| B | 2 | 0 | 0 |
+-----------+------+------+----------+
Working example
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=aeeb0878b9094e061c469bb0efb7a024
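The grouping step can also be expressed as a running count of earlier purchases, which avoids the null-filling pass. The following is a minimal sketch of that variant, not part of the original answer. It makes several assumptions: SQLite (through Python's sqlite3, version 3.25 or later for window functions) stands in for Redshift/Postgres, so the frame clause may need checking on the real target; the table name events, the function pivot_sessions, the session_id column, and the *_count aliases are all hypothetical; and the times are rewritten as 24-hour text, reading "AM 12" as hour 00.

```python
import sqlite3

# Rows copied from Table A. Times are stored as 24-hour text (assumption:
# "AM 12:xx" means 00:xx) so that plain string ordering matches time ordering.
ROWS = [
    ("A", "cart", 1, "00:00:00"),
    ("A", "view", 4, "00:01:00"),
    ("A", "purchase", 2, "00:05:00"),
    ("A", "view", 2, "00:10:00"),
    ("B", "view", 3, "00:03:00"),
    ("B", "purchase", 1, "00:05:00"),
    ("B", "view", 2, "00:10:00"),
]

QUERY = """
with sessions as (
    select client_id, event_type, count1, time1,
           -- session_id = number of purchases strictly before this row;
           -- the frame is empty on a client's first row, hence the coalesce
           coalesce(
               sum(case when event_type = 'purchase' then 1 else 0 end)
                   over (partition by client_id order by time1
                         rows between unbounded preceding and 1 preceding),
               0) as session_id
    from events
)
select client_id,
       max(case when event_type = 'view' then count1 else 0 end) as view_count,
       max(case when event_type = 'cart' then count1 else 0 end) as cart_count,
       max(case when event_type = 'purchase' then count1 else 0 end) as purchase_count
from sessions
group by client_id, session_id
order by client_id, session_id
"""


def pivot_sessions(rows):
    """Load rows into an in-memory table and return one pivoted row per session."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table events "
            "(client_id text, event_type text, count1 integer, time1 text)"
        )
        conn.executemany("insert into events values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pivot_sessions(ROWS):
        print(row)
```

Because a purchase row counts only the purchases before it, the purchase stays in the same session as the events that led up to it, and the next row starts a new session. Like the query above, this uses max rather than sum, so a session with several rows of the same event type reports the largest count, not the total.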
【Comments】:
Glad to help :-)