Redshift count with variable

Posted 2023-03-31


【Title】: Redshift count with variable 【Posted】: 2016-11-21 13:29:30 【Question】:

Suppose I have a table on Redshift with a structure like the one below. Product_Bill_ID is the table's primary key.

| Store_ID | Product_Bill_ID | Payment_Date        |
| 1        | 1               | 01/10/2016 11:49:33 |
| 1        | 2               | 01/10/2016 12:38:56 |
| 1        | 3               | 01/10/2016 12:55:02 |
| 2        | 4               | 01/10/2016 16:25:05 |
| 2        | 5               | 02/10/2016 08:02:28 |
| 3        | 6               | 03/10/2016 02:32:09 |

For each store, how can I count the Product_Bill_IDs sold within the first hour after that store sold its first Product_Bill_ID?

For this example, the result should be:

| Store_ID | First_Payment_Date  | Sold_First_Hour |
| 1        | 01/10/2016 11:49:33 | 2               |
| 2        | 01/10/2016 16:25:05 | 1               |
| 3        | 03/10/2016 02:32:09 | 1               |

【Comments】:

【Answer 1】:

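First you need each store's first payment time, which marks the start of its first hour. That is easy with a window function: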

  select s.*,
         min(payment_date) over (partition by store_id) as first_payment_date
  from sales s

Then filter on the date and aggregate:

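select store_id,
       first_payment_date,  -- included so the output matches the requested columns
       count(*) as sold_first_hour
from (select s.*,
             min(payment_date) over (partition by store_id) as first_payment_date
      from sales s
     ) s
where payment_date <= first_payment_date + interval '1 hour'
group by store_id, first_payment_date;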
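
To sanity-check this approach against the sample rows from the question without a Redshift cluster, here is a minimal sketch using Python's built-in `sqlite3` module. It rests on several assumptions: SQLite only stands in for Redshift, the dates are rewritten as ISO-8601 text (reading the question's values as DD/MM/YYYY), and SQLite's `datetime(..., '+1 hour')` replaces Redshift's `+ interval '1 hour'`. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question, dates rewritten as ISO-8601 text
# (assumption: the original values are DD/MM/YYYY).
rows = [
    (1, 1, "2016-10-01 11:49:33"),
    (1, 2, "2016-10-01 12:38:56"),
    (1, 3, "2016-10-01 12:55:02"),
    (2, 4, "2016-10-01 16:25:05"),
    (2, 5, "2016-10-02 08:02:28"),
    (3, 6, "2016-10-03 02:32:09"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales ("
    "store_id integer, product_bill_id integer primary key, payment_date text)"
)
conn.executemany("insert into sales values (?, ?, ?)", rows)

# Same shape as the query above; only the date arithmetic is SQLite-specific.
query = """
select store_id, first_payment_date, count(*) as sold_first_hour
from (select s.*,
             min(payment_date) over (partition by store_id) as first_payment_date
      from sales s
     ) s
where payment_date <= datetime(first_payment_date, '+1 hour')
group by store_id, first_payment_date
order by store_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the sample data this should give 2, 1 and 1 sales for stores 1, 2 and 3, matching the table in the question.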

【Discussion】:

【Answer 2】:

SELECT
    store_id,
    first_payment_date,
    SUM(
        CASE WHEN payment_date < DATEADD(hour, 1, first_payment_date) THEN 1 END
    )   AS sold_first_hour
FROM
(
    SELECT
        *,
        MIN(payment_date) OVER (PARTITION BY store_id)   AS first_payment_date
    FROM
        yourtable
)
    parsed_table
GROUP BY
    store_id,
    first_payment_date
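
One detail worth noting: this query compares with `<`, while the first answer compares with `<=`, so the two differ only when a payment lands exactly one hour after the first one. The sketch below illustrates that boundary case. It is an illustration under assumptions, not Redshift code: it uses Python's `sqlite3` (SQLite 3.25+ for window functions), SQLite's `datetime(..., '+1 hour')` in place of `DATEADD`, and two hypothetical rows that are not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table yourtable (store_id integer, payment_date text)")
# Hypothetical rows: the second payment is exactly one hour after the first.
conn.executemany(
    "insert into yourtable values (?, ?)",
    [(1, "2016-10-01 10:00:00"), (1, "2016-10-01 11:00:00")],
)

query = """
SELECT
    store_id,
    SUM(CASE WHEN payment_date <  datetime(first_payment_date, '+1 hour')
             THEN 1 ELSE 0 END) AS strict_count,     -- excludes the boundary
    SUM(CASE WHEN payment_date <= datetime(first_payment_date, '+1 hour')
             THEN 1 ELSE 0 END) AS inclusive_count   -- includes the boundary
FROM (
    SELECT *, MIN(payment_date) OVER (PARTITION BY store_id) AS first_payment_date
    FROM yourtable
) parsed_table
GROUP BY store_id
"""
boundary = conn.execute(query).fetchone()
print(boundary)  # (store_id, strict_count, inclusive_count)
```

Which comparison is right depends on whether the first hour is meant to be a closed or half-open interval; for the sample data in the question both give the same counts.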

【Discussion】:
