Pivoting a table based on a specific event value using a query

[Title]: To pivot a table based on a specific event value using Query [Posted]: 2020-08-05 08:18:54 [Question]:

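I want to turn Table A into Table B: for each client, I want to see which events the user triggered before each purchase event. I tried row_number() over (partition by client_id, event_type order by time), but that only gives me a plain pivot. What logic do I need?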

Table A

client_id   event_type  count      time 
    A         cart        1     AM 12:00:00 
    A         view        4     AM 12:01:00
    A         purchase    2     AM 12:05:00
    A         view        2     AM 12:10:00 
    B         view        3     AM 12:03:00
    B         purchase    1     AM 12:05:00
    B         view        2     AM 12:10:00 

Table B

client_id     view     cart   purchase 
    A           4        1        2     
    A           2        0        0
    B           3        0        1
    B           2        0        0

[Comments]:

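Why are there duplicate client IDs in Table A?

Client A has 4 views at AM 12:01:00 because they viewed the product 4 times before purchasing it at 12:05. Client A then viewed another product twice at 12:10 (2 views), which is why client A has another view row later (AM 12:10:00).

@AminShojaei Amazon Redshift is based on PostgreSQL, which is why I wrote that.

@a_horse_with_no_name What about Table B? Why are there duplicate IDs in Table B? I suggest you rename your tables and give them a short description, or at least write down the result you want.

As I said: although Redshift is based on a (very, very) old version of Postgres, the two are quite different (for example, for Postgres my answer would include a filter (..) operator to do this).

[Answer 1]: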

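Here is one approach. In the grp_split block, I mark the set of events leading up to a purchase (the purchase included) as belonging to a single "session/activity".

Then, in block x, I complete the grouping by using max(grp) over(partition by client_id order by time1) as grp2, which replaces each null with the previous non-null value.

After that, it is just a matter of pivoting the rows into the view, cart and purchase columns.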

-- Sample rows from Table A (count and time are renamed to count1 and time1)
with data
  as (
    select 'A' as client_id,'cart'     as event_type   , 1  as count1, cast('AM 12:00:00' as time) as time1 union all
    select 'A' as client_id,'view'     as event_type   , 4  as count1, cast('AM 12:01:00' as time) as time1 union all
    select 'A' as client_id,'purchase' as event_type   , 2  as count1, cast('AM 12:05:00' as time) as time1 union all
    select 'A' as client_id,'view'     as event_type   , 2  as count1, cast('AM 12:10:00' as time) as time1 union all
    select 'B' as client_id,'view'     as event_type   , 3  as count1, cast('AM 12:03:00' as time) as time1 union all
    select 'B' as client_id,'purchase' as event_type   , 1  as count1, cast('AM 12:05:00' as time) as time1 union all
    select 'B' as client_id,'view'     as event_type   , 2  as count1, cast('AM 12:10:00' as time) as time1
     )
   -- grp is set only on the first row of each session: the client's first event,
   -- or the event that directly follows a purchase. Every other row gets null.
   ,grp_split
   as(
select case when lag(event_type) over(partition by client_id order by time1)='purchase'
              or lag(event_type) over(partition by client_id order by time1) is null
             then
                 row_number() over(partition by client_id order by time1)
        end as grp
      ,*
  from data
      )
 -- One output row per (client_id, grp2); event types that did not occur show 0.
 select x.client_id
       ,max(case when event_type='view' then count1 else 0 end) as view
       ,max(case when event_type='cart' then count1 else 0 end) as cart
       ,max(case when event_type='purchase' then count1 else 0 end) as purchase
  from (
  -- grp2 carries the most recent non-null grp forward (a running max works
  -- because row_number only increases within a client).
  select *
        ,max(grp) over(partition by client_id order by time1) as grp2
    from grp_split
       )x
  group by client_id
           ,grp2
  -- grp2 is included so that sessions of the same client come out in time order
  order by client_id
           ,grp2

Output

+-----------+------+------+----------+
| client_id | view | cart | purchase |
+-----------+------+------+----------+
| A         |    4 |    1 |        2 |
| A         |    2 |    0 |        0 |
| B         |    3 |    0 |        1 |
| B         |    2 |    0 |        0 |
+-----------+------+------+----------+
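
To make the fill-forward step easier to follow, here is a minimal sketch that prints grp and grp2 for client A only. It is written in PostgreSQL syntax; the inline VALUES list, the 24-hour time literals (AM 12:00:00 read as 00:00:00) and the explicit "rows between" frame are my own assumptions for illustration, not part of the original answer.

```sql
-- Illustrative only: intermediate grp / grp2 values for client A.
with data(client_id, event_type, count1, time1) as (
    values ('A', 'cart',     1, time '00:00:00'),
           ('A', 'view',     4, time '00:01:00'),
           ('A', 'purchase', 2, time '00:05:00'),
           ('A', 'view',     2, time '00:10:00')
)
, grp_split as (
    select d.*
          -- non-null only on the first event of a session
          ,case when lag(event_type) over (partition by client_id order by time1) = 'purchase'
                  or lag(event_type) over (partition by client_id order by time1) is null
                then row_number() over (partition by client_id order by time1)
           end as grp
      from data d
)
select client_id
      ,event_type
      ,time1
      ,grp
      -- running max = last non-null grp seen so far
      ,max(grp) over (partition by client_id order by time1
                      rows between unbounded preceding and current row) as grp2
  from grp_split
 order by client_id, time1;

-- Rows returned (grp, grp2):
--   cart      00:00:00   1     1
--   view      00:01:00   null  1
--   purchase  00:05:00   null  1
--   view      00:10:00   4     4
```

The first three rows share grp2 = 1 and the view after the purchase gets grp2 = 4, which is why grouping by client_id and grp2 yields two rows for client A.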

Working example

https://dbfiddle.uk/?rdbms=postgres_12&fiddle=aeeb0878b9094e061c469bb0efb7a024
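
A possible variation, sketched under my own assumptions rather than taken from the answer above: the session number can also be derived as a running count of the purchases that occurred before the current row, which avoids the separate fill-forward step. The CTE name sessions and the column session_no are hypothetical, and the query is written in PostgreSQL syntax with inline sample data. The window frame is spelled out explicitly; as far as I know, Redshift requires an explicit frame clause on aggregate window functions that have an ORDER BY, so this form may port more easily once the inline data is replaced by the real table.

```sql
-- Illustrative only: session number = number of purchases before this row.
with data(client_id, event_type, count1, time1) as (
    values ('A', 'cart',     1, time '00:00:00'),
           ('A', 'view',     4, time '00:01:00'),
           ('A', 'purchase', 2, time '00:05:00'),
           ('A', 'view',     2, time '00:10:00'),
           ('B', 'view',     3, time '00:03:00'),
           ('B', 'purchase', 1, time '00:05:00'),
           ('B', 'view',     2, time '00:10:00')
)
, sessions as (
    select d.*
          -- the frame stops one row before the current one, so a purchase
          -- still belongs to the session it closes; the first row has an
          -- empty frame (sum is null), hence the coalesce to 0
          ,coalesce(
               sum(case when event_type = 'purchase' then 1 else 0 end)
                   over (partition by client_id order by time1
                         rows between unbounded preceding and 1 preceding)
              ,0) as session_no
      from data d
)
select client_id
      ,max(case when event_type = 'view'     then count1 else 0 end) as view
      ,max(case when event_type = 'cart'     then count1 else 0 end) as cart
      ,max(case when event_type = 'purchase' then count1 else 0 end) as purchase
  from sessions
 group by client_id, session_no
 order by client_id, session_no;
```

On the sample data this should give the same four rows as the output above. On PostgreSQL 9.4 or later, each CASE expression could instead use the aggregate FILTER clause mentioned in the comments, for example coalesce(max(count1) filter (where event_type = 'view'), 0); the coalesce is needed because a filtered aggregate with no matching rows returns null rather than 0.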

[Discussion]:

Glad to help :-)
