How to `sum( DISTINCT <column> ) OVER ()` using window function?

Posted 2023-02-16


【Title】: How to `sum( DISTINCT <column> ) OVER ()` using window function? 【Posted】: 2021-07-12 10:23:35 【Question】:

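I have the following data:

Here I have already calculated the total for each conf_id. But I also need to calculate the total for the whole partition. For example: calculate the total suma per agreement from its orders (not from the order items, whose rounding differs slightly).

How can I sum 737.38 and 1238.3, i.e. take only one number from each group?

(I cannot use sum(item_suma), because it returns 1975.67. Note that conf_suma is rounded as an intermediate step.)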
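The one-cent gap described above comes from where the rounding happens: rounding each group's subtotal and then adding is not the same as adding every item and rounding once. A minimal Python sketch of that arithmetic follows. The amounts are illustrative, not the question's actual rows, and `to_cents` is a hypothetical helper that imitates a cast to numeric(10, 2).

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(value: Decimal) -> Decimal:
    """Round to two decimal places, halves away from zero."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


# Illustrative item amounts for two groups (assumed values).
groups = {
    "group_1": [Decimal("1.0049"), Decimal("1.0049"), Decimal("1.0049")],
    "group_2": [Decimal("1.0049"), Decimal("1.0057")],
}

# Round each group's subtotal first, then add the rounded subtotals.
sum_of_rounded = sum(
    (to_cents(sum(items, Decimal(0))) for items in groups.values()),
    Decimal(0),
)

# Add every item first, then round once at the end.
rounded_sum = to_cents(
    sum((x for items in groups.values() for x in items), Decimal(0))
)

print(sum_of_rounded)  # 5.02
print(rounded_sum)     # 5.03
```

The question asks for the first of these two numbers, which is why a plain window sum over the item column is not enough.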

UPD: the full query. Here I want to calculate a rounded suma for each group, and then I need to calculate the total of those groups.

SELECT app_period( '2021-02-01', '2021-03-01' );


WITH
target_date AS ( SELECT '2021-02-01'::timestamptz ),
target_order as (
  SELECT
    tstzrange( '2021-01-01', '2021-02-01') as bill_range,
    o.*
  FROM ( SELECT * FROM "order_bt" WHERE sys_period @> sys_time() ) o
  WHERE FALSE
    OR o.agreement_id = 3385 and o.period_id = 10
),
USAGE AS ( SELECT
  ocd.*,


  o.agreement_id                  as agreement_id,
  o.id                            AS order_id,
  
  (dense_rank() over (PARTITION BY o.agreement_id       ORDER BY o.id                     )) as zzzz_id,
  (dense_rank() over (PARTITION BY o.agreement_id, o.id ORDER BY (ocd.ic).consumed_period )) as conf_id,

  
   sum( ocd.item_suma     ) OVER( PARTITION BY (ocd.o).agreement_id                 ) AS agreement_suma2,

 
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_suma,
  (sum( ocd.item_cost )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_cost,
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_suma,
  (sum( ocd.item_cost )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_cost,
  max((ocd.ic).consumed) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )                   AS consumed,
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id                           )) AS order_suma2
FROM target_order o
LEFT JOIN order_cost_details( o.bill_range ) ocd
  ON (ocd.o).id = o.id  AND  (ocd.ic).consumed_period && o.app_period
)

SELECT 
  *,
  (conf_suma/6) ::numeric( 10, 2 ) as group_nds,
  (SELECT sum(x) from (SELECT  sum( DISTINCT conf_suma )                       AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_suma,
  (SELECT sum(x) from (SELECT (sum( DISTINCT conf_suma ) /6)::numeric( 10, 2 ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_nds
FROM USAGE
WINDOW w AS ( PARTITION BY usage.agreement_id ROWS CURRENT ROW EXCLUDE TIES)
ORDER BY
  order_id,
  conf_id

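My old question

【Comments】:

Can you share the output you expect for this sample?
@Mureinik: One additional column, with the value 1975,68.
For all rows? A single row? I'm not sure I follow the logic here.
@Mureinik: I updated the question.
@eshirvana: Added.

【Answer 1】: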

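A better approach (dbfiddle):

Number the rows within each order with row_number:

row_number() over (partition by agreement_id, order_id ) as nrow

Then sum only the rows that pass the filter nrow = 1, so that each order is counted once: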

with data as (
  select * from (values 
      ( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
      ( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057), 
      ( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
 ) t (id, agreement_id, order_id, suma)
),
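intermediate as (select 
 *,
 -- number the rows inside each order; any row can be first, since all rows of an order share one order_suma
 row_number() over (partition by agreement_id, order_id ) as nrow,
 -- sum the raw values per order first, then round the subtotal
 (sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma
from data)

select 
  *,
  -- count each order's rounded subtotal once per agreement
  sum( order_suma ) filter (where nrow = 1) over (partition by agreement_id) as total_suma
from intermediate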
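To try the row_number idea without a PostgreSQL server, the sketch below runs a close equivalent through Python's built-in sqlite3 module. Several things here are assumptions rather than part of the answer: SQLite 3.25 or newer (for window functions), `round(..., 2)` standing in for the cast to numeric(10, 2), a CASE expression standing in for the FILTER clause, and the column alias `total_suma`. SQLite works in floating point, so the totals are only approximately equal to the exact decimal values.

```python
import sqlite3

QUERY = """
WITH data(id, agreement_id, order_id, suma) AS (
  VALUES (1, 1, 1, 1.0049), (2, 1, 1, 1.0049), (3, 1, 1, 1.0049),
         (4, 1, 2, 1.0049), (5, 1, 2, 1.0057),
         (6, 2, 1, 1.53),   (7, 2, 1, 2.18),   (8, 2, 2, 3.48)
),
intermediate AS (
  SELECT *,
         -- number the rows inside each order
         row_number() OVER (PARTITION BY agreement_id, order_id) AS nrow,
         -- rounded subtotal of the order, repeated on each of its rows
         round(sum(suma) OVER (PARTITION BY agreement_id, order_id), 2) AS order_suma
  FROM data
)
SELECT id, agreement_id, order_id, order_suma,
       -- only the first row of every order contributes to the total
       sum(CASE WHEN nrow = 1 THEN order_suma END)
         OVER (PARTITION BY agreement_id) AS total_suma
FROM intermediate
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
rows = conn.execute(QUERY).fetchall()
conn.close()

for row in rows:
    print(row)
```

Every row of agreement 1 should carry a total close to 5.02 (3.01 + 2.01), and every row of agreement 2 a total close to 7.19 (3.71 + 3.48).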

【Comments】:

【Answer 2】:

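I found a solution. See dbfiddle.

To run a window function over distinct values, I need to take the first value from each peer group. To do that I:

aggregate the ids of the rows in each peer group (array_agg)

lag that aggregate by one row

mark a row as _distinct when its id is not in the lagged aggregate, which is true only for the first row of a peer group

Voilà. You get a sum over DISTINCT values within the target PARTITION, which PostgreSQL has not implemented yet.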

with data as (
  select * from (values 
      ( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
      ( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057), 
      ( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
 ) t (id, agreement_id, order_id, suma)
),
intermediate as (select 
 *,
 sum( suma ) over ( partition by agreement_id, order_id ) as fract_order_suma,
 sum( suma ) over ( partition by agreement_id           ) as fract_agreement_total,
 (sum( suma::numeric(10,2) ) over ( partition by agreement_id, order_id )) as wrong_order_suma,
 (sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma,
 (sum( suma ) over ( partition by agreement_id           ))::numeric( 10, 2) as wrong_agreement_total,
 id as xid,
 array_agg( id ) over ( partition by agreement_id, order_id ) as agg
from data),

distinc as (select *,
  lag( agg ) over ( partition by agreement_id ) as prev, 
  id = any (lag( agg ) over ()) is not true as _distinct, -- allow to match first ID from next peer
  order_suma as xorder_suma, -- repeat column to easily visually compare with _distinct
  (SELECT sum(x) from (SELECT  sum( DISTINCT order_suma ) AS x FROM intermediate sub_q WHERE sub_q.agreement_id = intermediate.agreement_id GROUP BY agreement_id, order_id) t) as correct_total_suma
from intermediate
)
select 
*,
sum( order_suma ) filter ( where _distinct ) over ( partition by agreement_id ) as also_correct_total_suma
from distinc
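The flagging logic in that query can be traced by hand. The Python sketch below imitates it on the same sample rows, under one loudly stated assumption: the rows reach lag() sorted by agreement_id and order_id. The query itself has no ORDER BY inside `over ()`, so that order is not guaranteed there. The variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP
from itertools import groupby

# (id, agreement_id, order_id, suma), assumed sorted by agreement and order.
rows = [
    (1, 1, 1, Decimal("1.0049")), (2, 1, 1, Decimal("1.0049")),
    (3, 1, 1, Decimal("1.0049")), (4, 1, 2, Decimal("1.0049")),
    (5, 1, 2, Decimal("1.0057")), (6, 2, 1, Decimal("1.53")),
    (7, 2, 1, Decimal("2.18")), (8, 2, 2, Decimal("3.48")),
]

# Imitate array_agg(id) and the rounded sum(suma) over each peer group.
annotated = []
for _, peer_group in groupby(rows, key=lambda r: (r[1], r[2])):
    peer_group = list(peer_group)
    ids = [r[0] for r in peer_group]
    order_suma = sum((r[3] for r in peer_group), Decimal(0)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    annotated.extend((r[0], r[1], ids, order_suma) for r in peer_group)

# Imitate lag(agg): a row is "distinct" when its id is missing from the
# previous row's id list, which happens only on the first row of a group.
totals = {}
flagged_ids = []
prev_ids = None  # lag() yields NULL for the very first row
for row_id, agreement_id, ids, order_suma in annotated:
    if prev_ids is None or row_id not in prev_ids:
        flagged_ids.append(row_id)
        totals[agreement_id] = totals.get(agreement_id, Decimal(0)) + order_suma
    prev_ids = ids

print(flagged_ids)  # [1, 4, 6, 8]
print(totals)       # {1: Decimal('5.02'), 2: Decimal('7.19')}
```

Under the sorted-input assumption, exactly one row per order is flagged, so each rounded order total is added once per agreement.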

【Comments】:
