Using a (Recursive?) CTE + Window Functions to zero out sales orders?
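Posted: 2017-03-11 08:30:23
Question:
I'm trying to use a recursive CTE + window functions to find the final outcome of a series of buy/return orders.
First, some terminology:
field_id is the store's ID.
field_number is the order number, but it can be reused by the same person.
field_date is the date of the initial order.
field_inserted is when this specific transaction occurred.
field_sale is whether we bought it or returned it.
Unfortunately, because of how the system works, I don't get a cost on a return, so working out the final outcome of an order (did we ultimately sell anything?) is complicated. I need to match each buy with its return, which normally works well. However, it fails in cases like the ones below, and I'm trying to find a way to do it in one pass, possibly with a recursive CTE.
Here's some code.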
DECLARE @tablea TABLE (field_id int, field_number CHAR(3), field_date datetime, field_inserted DATETIME, field_sale varchar(4))
INSERT INTO @tablea
VALUES
(1, 100, '20170311','20170311 01:00:00', 'Buy'),
(1, 100, '20170311','20170311 01:01:00', 'Retu'),
(1, 100, '20170311','20170311 01:02:00', 'Buy'),
(1, 100, '20170311','20170311 01:03:00', 'Retu'),
(1, 100, '20170311','20170311 01:02:01', 'buy'),
(2, 100, '20170311','20170311 01:03:00', 'REtu'),
(1, 110, '20170311','20170311 01:03:00', 'Buy');
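Now eliminate the buys that were subsequently returned. The ISNULL is there because I'm negating the conditions with NOT, which would otherwise drop every row whose _lead/_lag value is NULL.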
WITH cte AS
(SELECT
ROW_NUMBER() OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS row_num,
field_id,
field_number,
field_date,
field_sale,
lead(field_sale) OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS field_sale_lead,
lag(field_sale) OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS field_sale_lag
FROM @tablea
)
SELECT * FROM cte
WHERE NOT (cte.field_sale = 'Buy' AND ISNULL(field_sale_lead,'') = 'Retu')--AND field_sale_lead IS NOT null)
AND NOT (cte.field_sale = 'Retu' AND ISNULL(field_sale_lag,'') = 'buy' )--AND field_sale_lag IS NOT NULL)
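As a small illustration of why the ISNULL wrapper matters: under SQL's three-valued logic, NOT applied to a comparison against NULL is unknown rather than true, and WHERE discards rows whose predicate is unknown. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the snippet is self-contained), with COALESCE in place of T-SQL's ISNULL.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption for this sketch);
# COALESCE plays the role of T-SQL's ISNULL.
conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT
        NOT (NULL = 'Retu'),               -- bare comparison against NULL
        NOT (COALESCE(NULL, '') = 'Retu')  -- NULL replaced by '' first
    """
).fetchone()
conn.close()

bare, wrapped = row
print(bare)     # None: the predicate is unknown, so WHERE would discard the row
print(wrapped)  # 1: the predicate is true, so WHERE would keep the row
```

The first or last row of each partition has a NULL lag or lead, so without the wrapper those rows would be filtered out even though they are not part of a Buy/Retu pair.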
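I was feeling pretty smug, thinking I had it. But that is the simple case: Buy, Return, Buy, Return. Let's try another case, Buy Buy Return Return, which is still valid but should obviously net out to 0.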
DECLARE @tablea TABLE (field_id int, field_number CHAR(3), field_date datetime, field_inserted DATETIME, field_sale varchar(4))
INSERT INTO @tablea
VALUES
(1, 100, '20170311','20170311 01:00:00', 'Buy'),
(1, 100, '20170311','20170311 01:01:00', 'Buy'),
(1, 100, '20170311','20170311 01:02:00', 'Retu'),
(1, 100, '20170311','20170311 01:03:00', 'Retu'),
(2, 100, '20170311','20170311 01:03:00', 'Buy'),
(1, 110, '20170311','20170311 01:03:00', 'Buy');
WITH cte AS
(SELECT
ROW_NUMBER() OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS row_num,
field_id,
field_number,
field_date,
field_sale,
lead(field_sale) OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS field_sale_lead,
lag(field_sale) OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS field_sale_lag
FROM @tablea
)
SELECT * FROM cte
WHERE NOT (cte.field_sale = 'Buy' AND ISNULL(field_sale_lead,'') = 'Retu')--AND field_sale_lead IS NOT null)
AND NOT (cte.field_sale = 'Retu' AND ISNULL(field_sale_lag,'') = 'buy' )--AND field_sale_lag IS NOT NULL)
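However, when you do that, you realize it finds the immediately adjacent match, but a Buy/Return pair still remains, and I want to cancel that one out too.
At this point I'm stuck. I've written recursive CTEs before, but for whatever reason I can't figure out how to recurse and make it cancel out 1/1/100 and 4/1/100. All I've managed to do is make it choke on the recursion.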
DECLARE @tablea TABLE (field_id int, field_number CHAR(3), field_date datetime, field_inserted DATETIME, field_sale varchar(4))
INSERT INTO @tablea
VALUES
(1, 100, '20170311','20170311 01:00:00', 'Buy'),
(1, 100, '20170311','20170311 01:01:00', 'Buy'),
(1, 100, '20170311','20170311 01:02:00', 'Retu'),
(1, 100, '20170311','20170311 01:03:00', 'Retu'),
(2, 100, '20170311','20170311 01:03:00', 'Buy'),
(1, 110, '20170311','20170311 01:03:00', 'Buy');
WITH cte AS
(SELECT
ROW_NUMBER() OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS row_num,
field_id,
field_number,
field_date,
field_sale,
field_inserted,
lead(field_sale) OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS field_sale_lead,
lag(field_sale) OVER (PARTITION BY field_id, field_number, field_date ORDER BY field_inserted) AS field_sale_lag
FROM @tablea
--)
--SELECT * FROM cte
--WHERE NOT (cte.field_sale = 'Buy' AND ISNULL(field_sale_lead,'') = 'Retu')--AND field_sale_lead IS NOT null)
--AND NOT (cte.field_sale = 'Retu' AND ISNULL(field_sale_lag,'') = 'buy' )--AND field_sale_lag IS NOT NULL)
UNION ALL
SELECT
ROW_NUMBER() OVER (PARTITION BY cte.field_id, cte.field_number, cte.field_date ORDER BY cte.field_inserted) AS row_num,
cte.field_id,
cte.field_number,
cte.field_date,
cte.field_sale,
cte.field_inserted,
lead(cte.field_sale) OVER (PARTITION BY cte.field_id, cte.field_number, cte.field_date ORDER BY cte.field_inserted) AS field_sale_lead,
lag(cte.field_sale) OVER (PARTITION BY cte.field_id, cte.field_number, cte.field_date ORDER BY cte.field_inserted) AS field_sale_lag
FROM @tablea INNER JOIN cte ON cte.field_date = [@tablea].field_date AND cte.field_id = [@tablea].field_id AND cte.field_number = [@tablea].field_number
)
SELECT * FROM cte
WHERE NOT (cte.field_sale = 'Buy' AND ISNULL(field_sale_lead,'') = 'Retu')--AND field_sale_lead IS NOT null)
AND NOT (cte.field_sale = 'Retu' AND ISNULL(field_sale_lag,'') = 'buy' )--AND field_sale_lag IS NOT NULL)
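For reference, the result being asked for can be stated procedurally, which gives something to check any SQL solution against. The following is a minimal Python sketch, not T-SQL, and it rests on assumptions that are not spelled out in the question: each return cancels the most recent still-open buy within the same field_id/field_number/field_date group, a return with no open buy is ignored, and field_sale is compared case-insensitively. The function name net_open_buys is hypothetical.

```python
from collections import defaultdict

def net_open_buys(rows):
    """Return the buys that are not cancelled by a later return.

    rows: iterable of (field_id, field_number, field_date, field_inserted, field_sale).
    Assumption: each return cancels the most recent still-open buy in the same
    (field_id, field_number, field_date) group; a return with no open buy is ignored.
    """
    open_buys = defaultdict(list)  # one stack of open buys per group
    for row in sorted(rows, key=lambda r: r[3]):  # process in field_inserted order
        key = row[:3]
        kind = row[4].lower()  # the sample data mixes 'Buy', 'buy', 'REtu'
        if kind == "buy":
            open_buys[key].append(row)
        elif kind == "retu" and open_buys[key]:
            open_buys[key].pop()
    return sorted(r for stack in open_buys.values() for r in stack)

# The Buy Buy Return Return data set from the question.
rows = [
    (1, 100, "20170311", "20170311 01:00:00", "Buy"),
    (1, 100, "20170311", "20170311 01:01:00", "Buy"),
    (1, 100, "20170311", "20170311 01:02:00", "Retu"),
    (1, 100, "20170311", "20170311 01:03:00", "Retu"),
    (2, 100, "20170311", "20170311 01:03:00", "Buy"),
    (1, 110, "20170311", "20170311 01:03:00", "Buy"),
]
result = net_open_buys(rows)
for r in result:
    print(r)
```

For this data set, both buys for store 1, order 100 are cancelled, and only the buys for 1/110 and 2/100 remain.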
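Comments:
If the sequence is (Buy Buy Buy Return Return), which Buys must be removed?
@serg Good question. I think it would be the last two.
Answer 1:
We can solve this without a loop or recursion by using a common table expression and row_number(), as follows:
If I understand your question correctly, you want to remove sales that have been returned, and for each 'retu' it should remove the most recent 'buy'.
First, we use row_number() to add an id to our rowset so that we can uniquely identify each row.
Next, we add br_rn (short for Buy/Return RowNumber), partitioned by field_id, field_number, field_date, and also by field_sale; we order it by field_inserted desc.
This lets us match each 'retu' with the most recent 'buy', and once we can do that, we can eliminate all of the pairs with not exists():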
;with cte as (
select
id = row_number() over (
order by field_id, field_number, field_date, field_inserted asc
)
, field_id
, field_number
, field_date
, field_inserted
, field_sale
, br_rn = row_number() over (
partition by field_id, field_number, field_date, field_sale
order by field_inserted desc
)
from @tablea
)
select
id
, field_id
, field_number
, field_date
, field_inserted
, field_sale
from cte
where not exists (
select 1
from cte as i
where i.field_id = cte.field_id
and i.field_number = cte.field_number
and i.field_date = cte.field_date
and i.br_rn = cte.br_rn
and i.id <> cte.id
)
order by id
rextester demo: http://rextester.com/TKXOC61533
For this input:
(1, 100, '20170311','20170311 01:00:00', 'Buy')
, (1, 100, '20170311','20170311 01:01:00', 'Buy')
, (1, 100, '20170311','20170311 01:02:00', 'Retu')
, (1, 100, '20170311','20170311 01:03:00', 'Retu')
, (2, 100, '20170311','20170311 01:03:00', 'Buy')
, (1, 110, '20170311','20170311 01:03:00', 'Buy');
returns:
+----+----------+--------------+------------+---------------------+------------+
| id | field_id | field_number | field_date | field_inserted | field_sale |
+----+----------+--------------+------------+---------------------+------------+
| 5 | 1 | 110 | 2017-03-11 | 2017-03-11 01:03:00 | Buy |
| 6 | 2 | 100 | 2017-03-11 | 2017-03-11 01:03:00 | Buy |
+----+----------+--------------+------------+---------------------+------------+
For this input:
(1, 100, '20170311','20170311 01:01:00', 'Buy')
, (1, 100, '20170311','20170311 01:02:00', 'Buy')
, (1, 100, '20170311','20170311 01:03:00', 'Buy')
, (1, 100, '20170311','20170311 01:04:00', 'Retu')
, (1, 100, '20170311','20170311 01:05:00', 'Buy')
, (1, 100, '20170311','20170311 01:06:00', 'Retu')
, (1, 100, '20170311','20170311 01:07:00', 'Retu')
, (2, 100, '20170311','20170311 01:03:00', 'Buy')
, (1, 110, '20170311','20170311 01:03:00', 'Buy');
returns:
+----+----------+--------------+------------+---------------------+------------+
| id | field_id | field_number | field_date | field_inserted | field_sale |
+----+----------+--------------+------------+---------------------+------------+
| 1 | 1 | 100 | 2017-03-11 | 2017-03-11 01:01:00 | Buy |
| 8 | 1 | 110 | 2017-03-11 | 2017-03-11 01:03:00 | Buy |
| 9 | 2 | 100 | 2017-03-11 | 2017-03-11 01:03:00 | Buy |
+----+----------+--------------+------------+---------------------+------------+
For this input:
(1, 100, '20170311','20170311 01:01:00', 'Buy')
, (1, 100, '20170311','20170311 01:02:00', 'Buy')
, (1, 100, '20170311','20170311 01:04:00', 'Retu')
, (1, 100, '20170311','20170311 01:05:00', 'Retu')
, (1, 100, '20170312','20170311 01:06:00', 'Buy')
, (1, 100, '20170312','20170311 01:07:00', 'Buy')
, (2, 100, '20170311','20170311 01:03:00', 'Buy')
, (1, 110, '20170311','20170311 01:03:00', 'Buy')
returns:
+----+----------+--------------+------------+---------------------+------------+
| id | field_id | field_number | field_date | field_inserted | field_sale |
+----+----------+--------------+------------+---------------------+------------+
| 5 | 1 | 100 | 2017-03-12 | 2017-03-11 01:06:00 | Buy |
| 6 | 1 | 100 | 2017-03-12 | 2017-03-11 01:07:00 | Buy |
| 7 | 1 | 110 | 2017-03-11 | 2017-03-11 01:03:00 | Buy |
| 8 | 2 | 100 | 2017-03-11 | 2017-03-11 01:03:00 | Buy |
+----+----------+--------------+------------+---------------------+------------+
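It may help to look at what the cte returns before we eliminate any pairs.
Here is only the set that needs filtering (field_id 1, field_number 100, from the second input), before the filter is applied: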
+----+----------+--------------+------------+---------------------+------------+-------+
| id | field_id | field_number | field_date | field_inserted | field_sale | br_rn |
+----+----------+--------------+------------+---------------------+------------+-------+
| 1 | 1 | 100 | 2017-03-11 | 2017-03-11 01:01:00 | Buy | 4 |
| 2 | 1 | 100 | 2017-03-11 | 2017-03-11 01:02:00 | Buy | 3 |
| 3 | 1 | 100 | 2017-03-11 | 2017-03-11 01:03:00 | Buy | 2 |
| 4 | 1 | 100 | 2017-03-11 | 2017-03-11 01:04:00 | Retu | 3 |
| 5 | 1 | 100 | 2017-03-11 | 2017-03-11 01:05:00 | Buy | 1 |
| 6 | 1 | 100 | 2017-03-11 | 2017-03-11 01:06:00 | Retu | 2 |
| 7 | 1 | 100 | 2017-03-11 | 2017-03-11 01:07:00 | Retu | 1 |
+----+----------+--------------+------------+---------------------+------------+-------+
Looking at it this way, we can easily see that the 'buy' with id 1 has a br_rn of 4 and has no associated 'retu'.
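To make the pairing rule concrete outside of SQL, here is a rough Python sketch that mimics the two row_number() columns and the not exists() filter. It is only an illustration of the logic described in this answer, not the answer's own code; the function name unpaired_rows is hypothetical, and lowercasing field_sale is an assumption standing in for a case-insensitive collation.

```python
from collections import defaultdict

def unpaired_rows(rows):
    """Mimic the br_rn pairing.

    rows: (field_id, field_number, field_date, field_inserted, field_sale).
    Returns (id, br_rn, field_sale) for every row that survives the filter.
    """
    # id: position after ordering by field_id, field_number, field_date, field_inserted
    ordered = sorted(rows, key=lambda r: r[:4])
    numbered = [(i, *r) for i, r in enumerate(ordered, start=1)]

    # br_rn: rank within (field_id, field_number, field_date, field_sale), newest first
    counters = defaultdict(int)
    br_rn = {}
    for rec in sorted(numbered, key=lambda rec: rec[4], reverse=True):
        part = (rec[1], rec[2], rec[3], rec[5].lower())
        counters[part] += 1
        br_rn[rec[0]] = counters[part]

    # not exists(): keep a row only if no other row in its group shares its br_rn
    kept = []
    for rec in numbered:
        group = rec[1:4]
        clash = any(
            other[1:4] == group and br_rn[other[0]] == br_rn[rec[0]] and other[0] != rec[0]
            for other in numbered
        )
        if not clash:
            kept.append((rec[0], br_rn[rec[0]], rec[5]))
    return kept

# Second input from the answer.
rows = [
    (1, 100, "20170311", "20170311 01:01:00", "Buy"),
    (1, 100, "20170311", "20170311 01:02:00", "Buy"),
    (1, 100, "20170311", "20170311 01:03:00", "Buy"),
    (1, 100, "20170311", "20170311 01:04:00", "Retu"),
    (1, 100, "20170311", "20170311 01:05:00", "Buy"),
    (1, 100, "20170311", "20170311 01:06:00", "Retu"),
    (1, 100, "20170311", "20170311 01:07:00", "Retu"),
    (2, 100, "20170311", "20170311 01:03:00", "Buy"),
    (1, 110, "20170311", "20170311 01:03:00", "Buy"),
]
kept = unpaired_rows(rows)
print(kept)
```

The surviving ids are 1, 8 and 9, which agrees with the result table shown for the second input. Note that this rule pairs the k-th newest buy with the k-th newest return and does not check that the buy actually precedes the return.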
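Comments:
I found an odd case where it doesn't work. Writing up the details now.
Not sure why this set doesn't work properly. I would expect it to give me the buys for 1/100 on 20170312, but only the last two show. Rereading the explanation now. (1, 100, '20170311','20170311 01:01:00', 'Buy'), (1, 100, '20170311','20170311 01:02:00', 'Buy'), (1, 100, '20170311','20170311 01:04:00', 'Retu'), (1, 100, '20170311','20170311 01:05:00', 'Retu'), (1, 100, '20170312','20170311 01:06:00', 'Buy'), (1, 100, '20170312','20170311 01:07:00', 'Buy'), (2, 100, '20170311','20170311 01:03:00', 'Buy'), (1, 110, '20170311','20170311 01:03:00', 'Buy')
This solution does not take field_inserted into account when pairing buys with returns. My guess is that, going by field_inserted, a return must be paired with an earlier buy, because you can't return something you haven't bought yet.
@mbourgon I have updated the answer to correct my oversight. I originally failed to include and i.field_date = cte.field_date in the not exists(), even though it was in the partition by. I also updated the rextester demo with your additional data set and included the results in the answer.
Makes sense, and that fixes it! If you ever go to NTSSUG or DFW SQLSat, let me know, and I'll buy you a drink or something after the next meeting!
Answer 2:
I can suggest deleting pairs of consecutive buy/return rows for as long as any such pair exists. Try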
DECLARE @tablea TABLE (field_id int, field_number CHAR(3), field_date datetime, field_inserted DATETIME, field_sale varchar(4))
INSERT INTO @tablea
VALUES
(1, 100, '20170311','20170311 01:01:00', 'Buy'),
(1, 100, '20170311','20170311 01:02:00', 'Buy'),
(1, 100, '20170311','20170311 01:03:00', 'Buy'),
(1, 100, '20170311','20170311 01:04:00', 'Retu'),
(1, 100, '20170311','20170311 01:05:00', 'Buy'),
(1, 100, '20170311','20170311 01:06:00', 'Retu'),
(1, 100, '20170311','20170311 01:07:00', 'Retu'),
(2, 100, '20170311','20170311 01:03:00', 'Buy'),
(1, 110, '20170311','20170311 01:03:00', 'Buy');
select * from @tablea
order by field_id,
field_number,
field_inserted
declare @eoj int =1;
while @eoj > 0
begin
WITH cte AS
(
SELECT
case field_sale when 'Buy' then
lead (field_sale) OVER (PARTITION BY field_id, field_number ORDER BY field_inserted)
when 'Retu' then
lag (field_sale) OVER (PARTITION BY field_id, field_number ORDER BY field_inserted)
end nbr_type,
field_id,
field_number,
field_date,
field_sale,
field_inserted
FROM @tablea
)
delete
from cte
where nbr_type is not null and nbr_type <> field_sale;
set @eoj = @@rowcount;
-- check it
select * from @tablea
order by field_id,
field_number,
field_inserted;
end;
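The loop runs N+1 times, where N is the length of the longest return sequence. In the example above, N=2, so it runs 3 times (the last pass deletes nothing and ends the loop).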
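The pass count can be checked with a small simulation. This is a Python sketch of the loop's logic for a single field_id/field_number partition, not a translation of the T-SQL; the function name cancel_adjacent_pairs is hypothetical, and it assumes field_sale values are already normalized to 'Buy' and 'Retu'.

```python
def cancel_adjacent_pairs(sales):
    """Repeatedly drop each 'Buy' immediately followed by a 'Retu' (and that 'Retu').

    sales: field_sale values of one field_id/field_number partition, in
    field_inserted order. Returns (remaining values, number of passes made).
    """
    passes = 0
    while True:
        passes += 1
        doomed = set()
        for i in range(len(sales) - 1):
            if sales[i] == "Buy" and sales[i + 1] == "Retu":
                doomed.update((i, i + 1))  # lead/lag see the opposite type
        if not doomed:  # corresponds to @@rowcount = 0 ending the loop
            return sales, passes
        sales = [s for i, s in enumerate(sales) if i not in doomed]

# field_id 1, field_number 100 from the example above.
remaining, passes = cancel_adjacent_pairs(
    ["Buy", "Buy", "Buy", "Retu", "Buy", "Retu", "Retu"]
)
print(remaining, passes)
```

The first pass removes two adjacent pairs, the second removes the pair that became adjacent, and the third finds nothing to delete, leaving a single 'Buy' after 3 passes.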
Comments:
Ah! I hadn't thought about doing it this way, with a delete. Thanks for taking a look.