How to `sum( DISTINCT <column> ) OVER ()` using window function?

Posted: 2021-07-12 10:23:35

Question:

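I have the following data:

Here I have already calculated the total for each conf_id, but I also want to calculate the total for the whole partition. For example: calculate the total suma per agreement from the totals of its orders (not from the individual order items, which round slightly differently).

How do I sum 737.38 and 1238.3? That is, take only one number from each group.

(I can't use sum(item_suma), because it returns 1975.67. Note that conf_suma is rounded as an intermediate step.)

UPD: the full query. Here I want to calculate a rounded suma for each group, and then I need to calculate the total of those group values.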

SELECT app_period( '2021-02-01', '2021-03-01' );


WITH
target_date AS ( SELECT '2021-02-01'::timestamptz ),
target_order as (
  SELECT
    tstzrange( '2021-01-01', '2021-02-01') as bill_range,
    o.*
  FROM ( SELECT * FROM "order_bt" WHERE sys_period @> sys_time() ) o
  WHERE FALSE
    OR o.agreement_id = 3385 and o.period_id = 10
),
USAGE AS ( SELECT
  ocd.*,


  o.agreement_id                  as agreement_id,
  o.id                            AS order_id,
  
  (dense_rank() over (PARTITION BY o.agreement_id       ORDER BY o.id                     )) as zzzz_id,
  (dense_rank() over (PARTITION BY o.agreement_id, o.id ORDER BY (ocd.ic).consumed_period )) as conf_id,

  
   sum( ocd.item_suma     ) OVER( PARTITION BY (ocd.o).agreement_id                 ) AS agreement_suma2,

 
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_suma,
  (sum( ocd.item_cost )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_cost,
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_suma,
  (sum( ocd.item_cost )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_cost,
  max((ocd.ic).consumed) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )                   AS consumed,
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id                           )) AS order_suma2
FROM target_order o
LEFT JOIN order_cost_details( o.bill_range ) ocd
  ON (ocd.o).id = o.id  AND  (ocd.ic).consumed_period && o.app_period
)

SELECT 
  *,
  (conf_suma/6) ::numeric( 10, 2 ) as group_nds,
  (SELECT sum(x) from (SELECT  sum( DISTINCT conf_suma )                       AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_suma,
  (SELECT sum(x) from (SELECT (sum( DISTINCT conf_suma ) /6)::numeric( 10, 2 ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_nds
FROM USAGE
WINDOW w AS ( PARTITION BY usage.agreement_id ROWS CURRENT ROW EXCLUDE TIES)
ORDER BY
  order_id,
  conf_id

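My old question

Comments:

Can you share the output you expect for this sample?
@Mureinik: An additional column with the value 1975,68
For all rows? A single row? I'm not sure I follow the logic here.
@Mureinik: I updated the question
@eshirvana: Added

Answer 1: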

A better approach (dbfiddle):

    Assign a row_number within each order: row_number() over (partition by agreement_id, order_id ) as nrow
    Take only the first suma of each order: filter nrow = 1
with data as (
  select * from (values 
      ( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
      ( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057), 
      ( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
 ) t (id, agreement_id, order_id, suma)
),
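intermediate as (select 
 *,
 -- number the rows inside each order so that exactly one row per order can be picked later
 row_number() over (partition by agreement_id, order_id ) as nrow,
 -- order total, rounded once per order
 (sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma
from data)

select 
  *,
  -- add up one rounded order total per order within the agreement
  sum( order_suma ) filter (where nrow = 1) over (partition by agreement_id) as total_suma
from intermediate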
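The same "one value per order" idea can also be written without row_number, by grouping to one row per order first and then running the window sum over the grouped rows. This is only a sketch of an alternative, not the answer author's method; it reuses the sample rows from the answer, and the names per_order and agreement_total are made up for illustration.

```sql
-- Sketch: aggregate to one row per order, then total the rounded order sums per agreement.
with data as (
  select * from (values
      (1, 1, 1, 1.0049), (2, 1, 1, 1.0049), (3, 1, 1, 1.0049),
      (4, 1, 2, 1.0049), (5, 1, 2, 1.0057),
      (6, 2, 1, 1.53),   (7, 2, 1, 2.18),   (8, 2, 2, 3.48)
  ) t (id, agreement_id, order_id, suma)
),
per_order as (  -- hypothetical helper CTE: exactly one row per (agreement_id, order_id)
  select
    agreement_id,
    order_id,
    sum(suma)::numeric(10, 2) as order_suma,
    -- window function applied on top of the grouped result:
    -- sums the rounded order totals of the whole agreement
    sum(sum(suma)::numeric(10, 2)) over (partition by agreement_id) as agreement_total
  from data
  group by agreement_id, order_id
)
select d.*, p.order_suma, p.agreement_total
from data d
join per_order p using (agreement_id, order_id)
order by d.id;
```

With this sample, agreement 1 has order totals 3.01 and 2.01 and an agreement_total of 5.02; agreement 2 has 3.71 and 3.48 and an agreement_total of 7.19. The join back to data is only needed if every detail row must carry the total, as in the question.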

Comments:

Answer 2:

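I found a solution. See dbfiddle.

To run a window function over distinct values, I need to take the first value from each peer group. To accomplish this I:

    aggregate the row ids of the peer group
    lag this aggregate by one row
    mark the rows that are not in the lagged aggregate (each such row is the first row of its peer group) as _distinct
    sum() FILTER (WHERE _distinct) over (...)

Voila. You get a sum over the DISTINCT values in the target PARTITION, which PostgreSQL has not implemented yet.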

with data as (
  select * from (values 
      ( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
      ( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057), 
      ( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
 ) t (id, agreement_id, order_id, suma)
),
intermediate as (select 
 *,
 sum( suma ) over ( partition by agreement_id, order_id ) as fract_order_suma,
 sum( suma ) over ( partition by agreement_id           ) as fract_agreement_total,
 (sum( suma::numeric(10,2) ) over ( partition by agreement_id, order_id )) as wrong_order_suma,
 (sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma,
 (sum( suma ) over ( partition by agreement_id           ))::numeric( 10, 2) as wrong_agreement_total,
 id as xid,
 array_agg( id ) over ( partition by agreement_id, order_id ) as agg
from data),

distinc as (select *,
  lag( agg ) over ( partition by agreement_id ) as prev, 
  id = any (lag( agg ) over ()) is not true as _distinct, -- allow to match first ID from next peer
  order_suma as xorder_suma, -- repeat column to easily visually compare with _distinct
  (SELECT sum(x) from (SELECT  sum( DISTINCT order_suma ) AS x FROM intermediate sub_q WHERE sub_q.agreement_id = intermediate.agreement_id GROUP BY agreement_id, order_id) t) as correct_total_suma
from intermediate
)
select 
*,
sum( order_suma ) filter ( where _distinct ) over ( partition by agreement_id ) as also_correct_total_suma
from distinc
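The reason the wrong_* columns differ from the correct totals is the order of rounding and summing. The following sketch, which is not part of the original answer, reuses the same sample rows to compare the two orders of operation side by side; the names per_order, sum_of_rounded and rounded_sum are chosen here for illustration.

```sql
-- Sketch: "round each order, then add" versus "add everything, then round once".
with data as (
  select * from (values
      (1, 1, 1, 1.0049), (2, 1, 1, 1.0049), (3, 1, 1, 1.0049),
      (4, 1, 2, 1.0049), (5, 1, 2, 1.0057),
      (6, 2, 1, 1.53),   (7, 2, 1, 2.18),   (8, 2, 2, 3.48)
  ) t (id, agreement_id, order_id, suma)
),
per_order as (  -- hypothetical helper CTE: one row per order
  select
    agreement_id,
    order_id,
    sum(suma)                 as raw_order_suma,  -- unrounded order total
    sum(suma)::numeric(10, 2) as order_suma       -- order total rounded to 2 decimals
  from data
  group by agreement_id, order_id
)
select
  agreement_id,
  sum(order_suma)                     as sum_of_rounded,  -- round each order, then add
  sum(raw_order_suma)::numeric(10, 2) as rounded_sum      -- add everything, then round once
from per_order
group by agreement_id
order by agreement_id;
```

For agreement 1 this returns 5.02 versus 5.03 (3.01 + 2.01 against 5.0253 rounded), the same kind of one-cent gap as 1975.68 versus 1975.67 in the question. For agreement 2 both columns are 7.19.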

Comments:
