Sum over range interval within case sql statement
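Asked: 2018-05-18 21:17:08
Question: I'm trying to get the average spend per purchase date after each customer's start date (this is for a recency-frequency-monetary analysis). Below is the monetary-value element: I want the sum of all transactions after the customer's start date, divided by the number of days on which they purchased. I'm using Oracle 12c.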
I have the following, which works but includes the full date range.
RFM AS (
SELECT SRC_USER_ID,
COUNT(distinct PICKUP_DATE) -1 as frequency,
(MAX(PICKUP_DATE) - MIN(PICKUP_DATE)) as recency,
(TO_DATE ('2018/05/12', 'yyyy/mm/dd') - MIN(PICKUP_DATE)) as T,
(CASE WHEN COUNT(distinct PICKUP_DATE)-1=0 THEN 0 ELSE
SUM(PRICE_TOTAL)/COUNT(distinct PICKUP_DATE) END) AS monetary_value
FROM TRANSACTIONS
group by SRC_USER_ID
I think I need a windowed aggregate function (https://ss64.com/ora/syntax-analytic-aggregate.html). However, when I try the following, it doesn't work.
RFM AS (
SELECT SRC_USER_ID,
COUNT(distinct PICKUP_DATE) -1 as frequency,
(MAX(PICKUP_DATE) - MIN(PICKUP_DATE)) as recency,
(TO_DATE ('2018/05/12', 'yyyy/mm/dd') - MIN(PICKUP_DATE)) as T,
(CASE WHEN COUNT(distinct PICKUP_DATE)-1=0 THEN 0 ELSE
SUM(PRICE_TOTAL) OVER (ORDER BY PICKUP_DATE) RANGE INTERVAL '1' DAY FOLLOWING UNBOUNDED/COUNT(distinct PICKUP_DATE) END) AS monetary_value
FROM TRANSACTIONS
group by SRC_USER_ID
Any help would be greatly appreciated.
Comments:
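It would help if you provided sample input data and the expected result for that data, ideally on a site such as sqlfiddle.com or dbfiddle.uk/?rdbms=oracle_11.2. From the queries alone it is hard to guess what is going on, and I'm not sure this function is the best solution.
Answer 1:
While you are learning analytic functions, it is probably a good idea to look at the examples in the documentation and on oracle-base. Here is a small test table with 3 columns whose names resemble the ones in your query. (Note: the dates and prices are random values.)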
create table transactions
as
select
mod( level, 3 ) + 1 as srcuserid
, to_date( trunc( dbms_random.value( 2451925, 2458258 ) ), 'J' ) pickupdate
, round( dbms_random.value() * 10000, 2 ) pricetotal
from dual
connect by level <= 12 ;
select * from transactions order by srcuserid, pickupdate ;
SRCUSERID PICKUPDATE PRICETOTAL
1 27-JUL-03 9447.05
1 04-APR-05 9595.6
1 28-SEP-07 408.09
1 16-AUG-13 5643.33
2 20-JAN-01 6253.87
2 26-OCT-05 5981.7
2 16-DEC-08 8138.03
2 20-JUL-17 49.67
3 08-AUG-03 7411.74
3 29-OCT-06 2218.95
3 11-FEB-10 111.07
3 26-JUL-17 600.15
12 rows selected.
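To develop your query, try using analytic functions to compute all the column values you need. Avoid GROUP BY for this, because in this case it raises a "not a GROUP BY expression" error (ORA-00979). You will also find that the result set contains one row for every row of the original table. DISTINCT can be used here because every selected column is either the partition key or an aggregate over the whole partition, so all rows of a partition are identical.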
select distinct -- without "distinct", you'll get multiple identical rows "per window"
srcuserid
, count( pickupdate ) over ( partition by srcuserid ) as frequency
, max( pickupdate ) over ( partition by srcuserid ) as max_date
, min( pickupdate ) over ( partition by srcuserid ) as min_date
, sum( pricetotal ) over ( partition by srcuserid ) as sum_pricetotal
from transactions
-- group by srcuserid -- ORA-00979: not a GROUP BY expression
;
SRCUSERID FREQUENCY MAX_DATE MIN_DATE SUM_PRICETOTAL
2 4 20-JUL-17 20-JAN-01 20423.27
3 4 26-JUL-17 08-AUG-03 10341.91
1 4 16-AUG-13 27-JUL-03 25094.07
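The windows above have no ORDER BY, so each function sees the whole partition. As a side note on how windows behave, adding ORDER BY switches the default frame to "start of the partition up to the current row", which turns sum() into a running total. The following is a minimal sketch of that difference, assuming the four sample rows of srcuserid 1 shown earlier; sample_tx is a made-up CTE name used only so the query runs on its own.

```sql
-- sample_tx is a hypothetical stand-in for the test table (rows of srcuserid 1)
with sample_tx ( srcuserid, pickupdate, pricetotal ) as (
  select 1, date '2003-07-27', 9447.05 from dual union all
  select 1, date '2005-04-04', 9595.60 from dual union all
  select 1, date '2007-09-28',  408.09 from dual union all
  select 1, date '2013-08-16', 5643.33 from dual
)
select
  srcuserid
, pickupdate
, pricetotal
  -- no ORDER BY: the frame is the whole partition, same value on every row
, sum( pricetotal ) over ( partition by srcuserid ) as partition_total
  -- with ORDER BY: the default frame ends at the current row, ie a running total
, sum( pricetotal ) over ( partition by srcuserid order by pickupdate ) as running_total
from sample_tx
order by srcuserid, pickupdate ;
```

On the latest row running_total equals partition_total (25094.07 for this data). DISTINCT collapses a partition to a single row only when every selected expression is constant within that partition, which running_total is not.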
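Once this (sort of) works, use the query as an inline view and add some finishing touches in the outer SELECT. Note that the final query also uses first_value(), which may be one way for you to find the first entry of a "window".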
select
srcuserid
, count_ - 1 as frequency
, max_date - min_date as recency
, trunc( sysdate - min_date ) as T
, case
when count_ - 1 = 0 then 0
else round( ( sum_pricetotal - firstpricetotal ) / ( count_ - 1 ), 2 )
end as monetary_value
from (
select distinct
srcuserid
, count( pickupdate ) over ( partition by srcuserid ) as count_
, max( pickupdate ) over ( partition by srcuserid ) as max_date
, min( pickupdate ) over ( partition by srcuserid ) as min_date
, sum( pricetotal ) over ( partition by srcuserid ) as sum_pricetotal
-- first_value(): find the first ie oldest "pricetotal" for each client
, first_value( pricetotal ) over (
partition by srcuserid order by pickupdate ) as firstpricetotal
from transactions
)
;
-- result
SRCUSERID FREQUENCY RECENCY T MONETARY_VALUE
2 3 6025 6328 4723.13
3 3 5101 5398 976.72
1 3 3673 5410 5215.67
See also: dbfiddle here.
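For comparison, the RANGE ... FOLLOWING idea from the question can be written as a frame placed inside the OVER parentheses, using BETWEEN ... AND. This is only a sketch of the frame syntax, not the method used in the answer; sample_tx is again a made-up CTE holding the sample rows of srcuserid 1 from the test table.

```sql
-- sample_tx is a hypothetical stand-in for the test table (rows of srcuserid 1)
with sample_tx ( srcuserid, pickupdate, pricetotal ) as (
  select 1, date '2003-07-27', 9447.05 from dual union all
  select 1, date '2005-04-04', 9595.60 from dual union all
  select 1, date '2007-09-28',  408.09 from dual union all
  select 1, date '2013-08-16', 5643.33 from dual
)
select
  srcuserid
, pickupdate
, pricetotal
  -- frame: rows dated at least one day after the current row's pickupdate
, sum( pricetotal ) over (
    partition by srcuserid
    order by pickupdate
    range between interval '1' day following and unbounded following
  ) as later_total
from sample_tx
order by srcuserid, pickupdate ;
```

For the earliest row, later_total is the sum of everything after the start date (15647.02 here, ie 25094.07 - 9447.05). For the latest row the frame is empty and the result is NULL. An analytic function like this still cannot be mixed freely with GROUP BY srcuserid, which is why the answer collects the values in an inline view first.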
Comments:
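Brilliant, thank you! I had ended up with 4 CTEs, but this is better and faster. One detail remains: what you have doesn't quite match what I'm after. Several transactions can occur on the same day, so I want them aggregated per day (e.g. firstpricetotal should cover the whole first day). Is that a small change to include?
Replacing transactions with (SELECT srcuserid, sum(pricetotal) AS pricetotal, pickupdate FROM transactions GROUP BY srcuserid, pickupdate) did the trick. Thanks again, much appreciated. It means I can now put this into production without needing python.
Great! You have figured out how to "make the window smaller". (If time permits, you could also try (partition by srcuserid, pickupdate) ...) Thanks for the feedback. Good luck!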
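The change described in the comments can be sketched as follows, assuming the transactions test table from the answer. The only difference from the answer's final query is the innermost inline view, which collapses each client's transactions to one row per day; the alias daily_tx is made up for this illustration.

```sql
select
  srcuserid
, count_ - 1 as frequency
, max_date - min_date as recency
, trunc( sysdate - min_date ) as T
, case
    when count_ - 1 = 0 then 0
    else round( ( sum_pricetotal - firstpricetotal ) / ( count_ - 1 ), 2 )
  end as monetary_value
from (
  select distinct
    srcuserid
  , count( pickupdate ) over ( partition by srcuserid ) as count_
  , max( pickupdate ) over ( partition by srcuserid ) as max_date
  , min( pickupdate ) over ( partition by srcuserid ) as min_date
  , sum( pricetotal ) over ( partition by srcuserid ) as sum_pricetotal
    -- first_value() now returns the total of the whole first day
  , first_value( pricetotal ) over (
      partition by srcuserid order by pickupdate ) as firstpricetotal
  from (
    -- one row per client and day; group by trunc( pickupdate ) instead
    -- if the column carries a time-of-day component (assumption)
    select srcuserid, pickupdate, sum( pricetotal ) as pricetotal
    from transactions
    group by srcuserid, pickupdate
  ) daily_tx
) ;
```

Because count_ is now computed over daily rows, it counts purchase days and not individual transactions, which matches the COUNT(distinct PICKUP_DATE) used in the question.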