SQL update based on previous row

【Title】: Sql update based on previous row 【Posted】: 2020-03-31 11:07:00 【Question】:

I have a table.

SELECT * INTO #tmp
FROM (
SELECT 1 AS [ID], 20200312 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 480.00 AS [ValueC], 4906 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200313 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1440.00 AS [ValueC], 3466 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200314 AS [date], 0 AS [ValueA], 1000.00 AS [ValueB], 0.00 AS [ValueC], 4466 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200318 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1056.00 AS [ValueC], 3410 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200319 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 864.00 AS [ValueC], 2546 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200320 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1296.00 AS [ValueC], 1250 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200321 AS [date], 0 AS [ValueA], 4000.00 AS [ValueB], 624.00 AS [ValueC], 4626 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200324 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1152.00 AS [ValueC], 3474 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200325 AS [date], 3474 AS [ValueA], 0.00 AS [ValueB], 2718.00 AS [ValueC], 756 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200330 AS [date], 0 AS [ValueA], 6000.00 AS [ValueB], 1080.00 AS [ValueC], 5676 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200401 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1920.00 AS [ValueC], 2756 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200403 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1920.00 AS [ValueC], 836 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200407 AS [date], 0 AS [ValueA], 3000.00 AS [ValueB], 0.00 AS [ValueC], 3836 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200408 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 2448.00 AS [ValueC], 1388 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200413 AS [date], 0 AS [ValueA], 4000.00 AS [ValueB], 0.00 AS [ValueC], 5388 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200415 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1920.00 AS [ValueC], 3468 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200417 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1920.00 AS [ValueC], 1548 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200420 AS [date], 0 AS [ValueA], 1000.00 AS [ValueB], 1920.00 AS [ValueC], 628 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200426 AS [date], 0 AS [ValueA], 4000.00 AS [ValueB], 0.00 AS [ValueC], 4628 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200515 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 3840.00 AS [ValueC], 788 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200525 AS [date], 0 AS [ValueA], 3000.00 AS [ValueB], 1920.00 AS [ValueC], 1868 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200601 AS [date], 0 AS [ValueA], 2000.00 AS [ValueB], 1080.00 AS [ValueC], 2788 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200608 AS [date], 0 AS [ValueA], 1000.00 AS [ValueB], 1920.00 AS [ValueC], 1868 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200615 AS [date], 0 AS [ValueA], 2000.00 AS [ValueB], 0.00 AS [ValueC], 3868 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200622 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1920.00 AS [ValueC], 1948 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200706 AS [date], 0 AS [ValueA], 2000.00 AS [ValueB], 1920.00 AS [ValueC], 2028 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200713 AS [date], 0 AS [ValueA], 2000.00 AS [ValueB], 0.00 AS [ValueC], 4028 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200720 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 3000.00 AS [ValueC], 1028 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200727 AS [date], 0 AS [ValueA], 3000.00 AS [ValueB], 0.00 AS [ValueC], 4028 AS [NewColumn] UNION ALL
SELECT 1 AS [ID], 20200803 AS [date], 0 AS [ValueA], 0.00 AS [ValueB], 1920.00 AS [ValueC], 2108 AS [NewColumn]  
) t;

[NewColumn] is the desired output.

SELECT [ID], [date], [ValueA], [ValueB], [ValueC], [NewColumn]
FROM #tmp
order by [date]

From the values in columns A, B and C, the only value I can calculate directly is the one for date 20200325, using the following formula:

update #tmp
set [NewColumn] = ValueA+ValueB-ValueC
where date = 20200325

So the value is 756.

All other rows are calculated from the previous row, for example:

[NewColumn](for date 20200330) = [NewColumn](for date 20200325)+ValueA+ValueB-ValueC
X = 756 + 0 + 6000 - 1080
X = 5676

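And so on...

Is there a way to achieve this in SQL with an update statement?

PS. I need to update the rows before 20200325 as well as the rows after that date.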

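【Comments】:

Which DBMS product are you using? "SQL" is just a query language, not the name of a specific database product (and those dreadful square brackets are invalid in standard SQL). Please add a tag for the database product you are using. Why should I tag my DBMS

【Answer 1】:

Assuming you are running SQL Server, as the syntax suggests, you can use an updatable CTE and a window function: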

with cte as (
    select 
        NewColumn, 
        sum(ValueA + ValueB - ValueC) over(partition by id order by date) NewVal
    from #tmp
    where date >= 20200325 
)
update cte set NewColumn = NewVal

Demo on DB Fiddle

ID | date | ValueA | ValueB | ValueC | NewColumn
-: | --------: | -----: | :-------- | :-------- | --------:
1 | 20200312 | 0 | 0.00 | 480.00 | 4906
1 | 20200313 | 0 | 0.00 | 1440.00 | 3466
1 | 20200314 | 0 | 1000.00 | 0.00 | 4466
1 | 20200318 | 0 | 0.00 | 1056.00 | 3410
1 | 20200319 | 0 | 0.00 | 864.00 | 2546
1 | 20200320 | 0 | 0.00 | 1296.00 | 1250
1 | 20200321 | 0 | 4000.00 | 624.00 | 4626
1 | 20200324 | 0 | 0.00 | 1152.00 | 3474
1 | 20200325 | 3474 | 0.00 | 2718.00 | 756
1 | 20200330 | 0 | 6000.00 | 1080.00 | 5676
1 | 20200401 | 0 | 0.00 | 1920.00 | 3756
1 | 20200403 | 0 | 0.00 | 1920.00 | 1836
1 | 20200407 | 0 | 3000.00 | 0.00 | 4836
1 | 20200408 | 0 | 0.00 | 2448.00 | 2388
1 | 20200413 | 0 | 4000.00 | 0.00 | 6388
1 | 20200415 | 0 | 0.00 | 1920.00 | 4468
1 | 20200417 | 0 | 0.00 | 1920.00 | 2548
1 | 20200420 | 0 | 1000.00 | 1920.00 | 1628
1 | 20200426 | 0 | 4000.00 | 0.00 | 5628
1 | 20200515 | 0 | 0.00 | 3840.00 | 1788
1 | 20200525 | 0 | 3000.00 | 1920.00 | 2868
1 | 20200601 | 0 | 2000.00 | 1080.00 | 3788
1 | 20200608 | 0 | 1000.00 | 1920.00 | 2868
1 | 20200615 | 0 | 2000.00 | 0.00 | 4868
1 | 20200622 | 0 | 0.00 | 1920.00 | 2948
1 | 20200706 | 0 | 2000.00 | 1920.00 | 3028
1 | 20200713 | 0 | 2000.00 | 0.00 | 5028
1 | 20200720 | 0 | 0.00 | 3000.00 | 2028
1 | 20200727 | 0 | 3000.00 | 0.00 | 5028
1 | 20200803 | 0 | 0.00 | 1920.00 | 3108
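
The running sum works because unrolling "previous row + ValueA + ValueB - ValueC" from the anchor row makes each row's value the sum of ValueA + ValueB - ValueC over every row from 20200325 up to and including it. A minimal sketch for previewing those values before running the update, assuming the #tmp table from the question; the aliases Delta and RunningTotal are illustrative names, not part of the original answer:

```sql
-- Preview only: this SELECT does not modify #tmp.
SELECT
    [ID],
    [date],
    ValueA + ValueB - ValueC AS Delta,            -- net change contributed by this row
    SUM(ValueA + ValueB - ValueC) OVER (
        PARTITION BY [ID]
        ORDER BY [date]
        -- explicit ROWS frame: rows sharing a date are accumulated one by one
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS RunningTotal,
    [NewColumn]                                   -- currently stored value, for comparison
FROM #tmp
WHERE [date] >= 20200325                          -- start at the anchor row
ORDER BY [ID], [date];
```

For the sample data the first two running totals are 756 and 5676, matching the worked example in the question. The explicit ROWS frame is a defensive choice: with only ORDER BY, the default frame is RANGE-based, so rows with the same [ID] and [date] would all receive the same total.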

【Comments】:

【Answer 2】:

What you describe is:

select t.*, sum(valuea + valueb - valuec) over (partition by id order by date)
from tmp t
where date >= '20200325'
order by id, date;

However, this does not return the values you specified. It returns:

ID      date    ValueA  ValueB  ValueC  NewColumn   (No column name)
1   20200325    3474       0.00 2718.00  756     756.00
1   20200330       0    6000.00 1080.00 5676    5676.00
1   20200401       0       0.00 1920.00 2756    3756.00
1   20200403       0       0.00 1920.00  836    1836.00
1   20200407       0    3000.00 0.00    3836    4836.00
1   20200408       0       0.00 2448.00 1388    2388.00
1   20200413       0    4000.00 0.00    5388    6388.00
1   20200415       0       0.00 1920.00 3468    4468.00
1   20200417       0       0.00 1920.00 1548    2548.00
1   20200420       0    1000.00 1920.00  628    1628.00

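Here is a dbfiddle.

I suspect that you don't want to hard-code 20200325, but rather want to start from any row where valueA is non-zero. In that case, you can use window functions to assign groups: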

select t.*, sum(valuea + valueb - valuec) over (partition by id, grp order by date)
from (select t.*,
             sum(case when valueA > 0 then 1 else 0 end) over (partition by id order by date) as grp
      from tmp t
     ) t
where grp > 0
order by id, date;

If you like, you can incorporate this into an update:

with toupdate as (
       select t.*,
              sum(valuea + valueb - valuec) over (partition by id order by date) as new_newcolumn
       from tmp t
       where date >= '20200325'
      )
update toupdate
    set newcolumn = new_newcolumn
    where newcolumn <> new_newcolumn;
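
To see how the grp expression in the grouped query of this answer assigns groups, here is a small self-contained sketch. It uses three (date, ValueA) pairs copied from the question's data; the inline VALUES table and its alias d are assumptions made purely for illustration:

```sql
-- grp counts how many rows with a non-zero ValueA have been seen so far.
-- Rows before the first such row get grp = 0, which "where grp > 0" filters out.
SELECT
    d.[date],
    d.ValueA,
    SUM(CASE WHEN d.ValueA > 0 THEN 1 ELSE 0 END) OVER (
        ORDER BY d.[date]
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS grp
FROM (VALUES
        (20200324, 0),      -- before the anchor row: grp = 0
        (20200325, 3474),   -- non-zero ValueA starts group 1
        (20200330, 0)       -- still in group 1
     ) AS d([date], ValueA)
ORDER BY d.[date];
```

Each later row with a non-zero ValueA would start a new group, so partitioning the running sum by id and grp restarts the accumulation at every such row.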

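【Comments】:

You are right about the final output; I will edit the question. This only updates the values for dates >= 20200325. I also need to update the earlier values. If I remove the where clause, I don't get the desired output.

@DavePa . . . Your question does not specify what to do with those rows. I suggest that you ask a new question with a complete specification. This question already has two answers.

It is not stated directly, but I have the desired output for the whole table, not only for the rows after a certain date. That was just an example of the calculation. There are 2 answers, but neither is complete at the moment.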
