Oracle SQL 分析查询 - 类似电子表格的递归运行总计

Posted

技术标签:

【中文标题】Oracle SQL 分析查询 - 类似电子表格的递归运行总计【英文标题】:Oracle SQL Analytic query - recursive spreadsheet-like running total 【发布时间】:2012-08-27 22:23:17 【问题描述】:

我有以下数据,由A 值组成,按MM(月)排序。

B 列以类似电子表格的方式计算为 GREATEST(current value of A + previous value of B, 0)

如何使用 SQL 查询计算 B

我尝试使用分析函数,但未能成功。 我知道有Model Clause;我找到了a similar example,但我不知道从哪里开始。

我使用的是 Oracle 10g,因此无法使用递归查询。


这是我的测试数据:

MM         | A      | B
-----------+--------+------
2012-01-01 |    800 |  800
2012-02-01 |   1900 | 2700
2012-03-01 |   1750 | 4450
2012-04-01 | -20000 |    0
2012-05-01 |    900 |  900
2012-06-01 |   3900 | 4800
2012-07-01 |  -2600 | 2200
2012-08-01 |  -2600 |    0
2012-09-01 |   2100 | 2100
2012-10-01 |  -2400 |    0
2012-11-01 |   1100 | 1100
2012-12-01 |   1300 | 2400

这里是“表定义”:

select t.* from (
  select date'2012-01-01' as mm, 800 as a from dual union all
  select date'2012-02-01' as mm, 1900 as a from dual union all
  select date'2012-03-01' as mm, 1750 as a from dual union all
  select date'2012-04-01' as mm, -20000 as a from dual union all
  select date'2012-05-01' as mm, 900 as a from dual union all
  select date'2012-06-01' as mm, 3900 as a from dual union all
  select date'2012-07-01' as mm, -2600 as a from dual union all
  select date'2012-08-01' as mm, -2600 as a from dual union all
  select date'2012-09-01' as mm, 2100 as a from dual union all
  select date'2012-10-01' as mm, -2400 as a from dual union all
  select date'2012-11-01' as mm, 1100 as a from dual union all
  select date'2012-12-01' as mm, 1300 as a from dual
) t;

【问题讨论】:

我认为你的最大意思是最大 是的,很抱歉。我还在说电子表格语言而不是 SQL。 【参考方案1】:

很抱歉,如果这是题外话,鉴于问题的 Oracle 版本,但我们现在可以使用 SQL:2016 MATCH_RECOGNIZE 子句:

select * from t
match_recognize(
  order by mm
  measures case classifier() when 'POS' then sum(a) else 0 end as b
  all rows per match
  pattern (pos* neg0,1)
  define pos as sum(a) > 0
);

【讨论】:

【参考方案2】:

我想出了一个用户定义的聚合函数

create or replace type tsum1 as object
  (
  total number,

  static function ODCIAggregateInitialize(nctx IN OUT tsum1 )
       return number,

  member function ODCIAggregateIterate(self IN OUT tsum1 ,
                                       value IN number )
       return number,

  member function ODCIAggregateTerminate(self IN tsum1,
                              retVal OUT  number,
                              flags IN number)
       return number,

  member function ODCIAggregateMerge(self IN OUT tsum1,
                          ctx2 IN tsum1)
       return number
)
/

create or replace type body tsum1
 is

 static function ODCIAggregateInitialize(nctx IN OUT tsum1)
 return number
 is
 begin
   nctx := tsum1(0);
   return ODCIConst.Success;
 end;

 member function ODCIAggregateIterate(self IN OUT tsum1,
                                    value IN number )
 return number
 is
 begin
   self.total := self.total + value;
     if (self.total < 0) then
       self.total := 0;
     end if;
   return ODCIConst.Success;
 end;

 member function ODCIAggregateTerminate(self IN tsum1,
                                        retVal OUT number,
                                        flags IN number)
 return number
 is
 begin
   retVal := self.total;
   return ODCIConst.Success;
 end;

 member function ODCIAggregateMerge(self IN OUT tsum1,
                                    ctx2 IN tsum1)
 return number
 is
 begin
   self.total := self.total + ctx2.total;
   return ODCIConst.Success;
 end;
 end;
 /

CREATE OR REPLACE FUNCTION sum1(input number)
RETURN number
PARALLEL_ENABLE AGGREGATE USING tsum1;
/

这里是查询

with T1 as(
   select date'2012-01-01' as mm, 800 as a from dual union all
   select date'2012-02-01' as mm, 1900 as a from dual union all
   select date'2012-03-01' as mm, 1750 as a from dual union all
   select date'2012-04-01' as mm, -20000 as a from dual union all
   select date'2012-05-01' as mm, 900 as a from dual union all
   select date'2012-06-01' as mm, 3900 as a from dual union all
   select date'2012-07-01' as mm, -2600 as a from dual union all
   select date'2012-08-01' as mm, -2600 as a from dual union all
   select date'2012-09-01' as mm, 2100 as a from dual union all
   select date'2012-10-01' as mm, -2400 as a from dual union all
   select date'2012-11-01' as mm, 1100 as a from dual union all
   select date'2012-12-01' as mm, 1300 as a from dual
 )

 select mm
      , a
      , sum1(a) over(order by mm) as b
  from t1

    Mm         a      b
 ----------------------------  
 01.01.2012    800    800  
 01.02.2012    1900   2700  
 01.03.2012    1750   4450  
 01.04.2012   -20000  0  
 01.05.2012    900    900  
 01.06.2012    3900   4800  
 01.07.2012   -2600   2200  
 01.08.2012   -2600   0  
 01.09.2012    2100   2100  
 01.10.2012   -2400   0  
 01.11.2012    1100   1100  
 01.12.2012    1300   2400

【讨论】:

谢谢,它成功了。它也是自定义聚合函数的一个有用示例。 很好,我没有想到这一点。我想知道如果您不将它用作(有序)窗口函数会发生什么。可能是相当随意的结果?【参考方案3】:

因此,让我们在这个问题上释放MODEL 子句(一个只有其力量才能超越神秘的设备):

with data as (
  select date'2012-01-01' as mm,    800 as a from dual union all
  select date'2012-02-01' as mm,   1900 as a from dual union all
  select date'2012-03-01' as mm,   1750 as a from dual union all
  select date'2012-04-01' as mm, -20000 as a from dual union all
  select date'2012-05-01' as mm,    900 as a from dual union all
  select date'2012-06-01' as mm,   3900 as a from dual union all
  select date'2012-07-01' as mm,  -2600 as a from dual union all
  select date'2012-08-01' as mm,  -2600 as a from dual union all
  select date'2012-09-01' as mm,   2100 as a from dual union all
  select date'2012-10-01' as mm,  -2400 as a from dual union all
  select date'2012-11-01' as mm,   1100 as a from dual union all
  select date'2012-12-01' as mm,   1300 as a from dual
)
select mm, a, b
from (
  -- Add a dummy value for b, making it available to the MODEL clause
  select mm, a, 0 b
  from data
)
      -- Generate a ROW_NUMBER() dimension, in order to access rows by RN
model dimension by (row_number() over (order by mm) rn)
      -- Spreadsheet values / measures involved in calculations are mm, a, b
      measures (mm, a, b)
      -- A single rule will do. Any value of B should be calculated according to
      -- GREATEST([previous value of B] + [current value of A], 0)
      rules (
        b[any] = greatest(nvl(b[cv(rn) - 1], 0) + a[cv(rn)], 0)
      )

以上产出:

MM              A     B
01.01.2012    800   800
01.02.2012   1900  2700
01.03.2012   1750  4450
01.04.2012 -20000     0
01.05.2012    900   900
01.06.2012   3900  4800
01.07.2012  -2600  2200
01.08.2012  -2600     0
01.09.2012   2100  2100
01.10.2012  -2400     0
01.11.2012   1100  1100
01.12.2012   1300  2400

【讨论】:

尖括号[]有时在SQL中有含义?我才意识到我对此一无所知。 @masterxilo:是的。在 Oracle MODEL 子句中,或者在 SQL 标准 ARRAYMULTISET 表示法中(例如,as implemented by PostgreSQL)。也就是说,您几乎不需要它们,因为 MODELARRAYs 都不会经常使用【参考方案4】:
with sample_data as (
  select date'2012-01-01' as mm, 800 as a from dual union all
  select date'2012-02-01' as mm, 1900 as a from dual union all
  select date'2012-03-01' as mm, 1750 as a from dual union all
  select date'2012-04-01' as mm, -20000 as a from dual union all
  select date'2012-05-01' as mm, 900 as a from dual union all
  select date'2012-06-01' as mm, 3900 as a from dual union all
  select date'2012-07-01' as mm, -2600 as a from dual union all
  select date'2012-08-01' as mm, -2600 as a from dual union all
  select date'2012-09-01' as mm, 2100 as a from dual union all
  select date'2012-10-01' as mm, -2400 as a from dual union all
  select date'2012-11-01' as mm, 1100 as a from dual union all
  select date'2012-12-01' as mm, 1300 as a from dual
) 
select mm, 
       a, 
       greatest(nvl(a,0) + lag(a,1,0) over (order by mm), 0) as b
from sample_data;

但它确实产生这一行:

2012-05-01 | 900 | 900

因为它在该行中计算 900 - 20000,并且零大于该结果。如果您使用 abs 函数来消除计算中的负值,您可以“修复”该问题。

【讨论】:

这样计算bgreatest(nvl(a,0) + greatest(lag(a,1,0) over (order by mm), 0), 0) as b @LukasEder:我之前尝试过类似的方法,但它不会为2012-07-012200 产生正确的结果。我确信这不能仅使用分析函数来完成。

以上是关于Oracle SQL 分析查询 - 类似电子表格的递归运行总计的主要内容,如果未能解决你的问题,请参考以下文章

Oracle SQL 查询对连续记录进行分组

将SQL查询分析器查询的结果用SQL语句导出到Excel表格的语句怎么写?

在oracle sql中使用if创建过程

如何从命令行对 OpenOffice/LibreOffice 电子表格运行 sql 查询?

Oracle SQL 新手需要帮助查找类似记录

在SQL中,如何查询某一字段中最大值的数据