Unable to reset partition groups (Window functions and PostgreSQL specifically)

【Posted】: 2019-10-17 18:54:57

【Question】:

I have a simple dataset like this:

SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, generate_series(8, 1, -1) AS order_time;

+---------------+------------+
| customer_name | order_time |
+---------------+------------+
| "A"           | 8          |
| "A"           | 7          |
| "A"           | 6          |
| "B"           | 5          |
| "B"           | 4          |
| "A"           | 3          |
| "C"           | 2          |
| "B"           | 1          |
+---------------+------------+

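I am looking for a single row:

+---------------+------------+
| customer_name | order_time |
+---------------+------------+
| "A"           | 6          |
+---------------+------------+

That is, I want the first (earliest) order_time within the most recent consecutive run of rows for the same customer_name. With the SQL below, I only get 3 as the order_time for customer A. I can't seem to "reset" the partition.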

SELECT customer_name, LAST_VALUE(order_time) OVER W
FROM
(
  SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, generate_series(8, 1, -1) AS order_time
) X
WINDOW W AS (PARTITION BY customer_name ORDER BY order_time DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY order_time DESC
LIMIT 1;

Using PostgreSQL 11.5.
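
A note on why this returns 3: PARTITION BY customer_name places all four A rows (order_time 8, 7, 6 and 3) in one partition whether or not they are adjacent. With the frame extended to UNBOUNDED FOLLOWING and a descending sort, LAST_VALUE is then simply the smallest order_time per customer. A minimal sketch that drops the LIMIT so each row's window result is visible (the alias last_in_partition is illustrative only):

```sql
-- Same window as in the question, but every row is shown.
SELECT customer_name,
       order_time,
       LAST_VALUE(order_time) OVER W AS last_in_partition
FROM
(
  SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
         generate_series(8, 1, -1) AS order_time
) X
WINDOW W AS (PARTITION BY customer_name ORDER BY order_time DESC
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY order_time DESC;
```

Every A row should show 3, every B row 1, and the C row 2, because the partition is defined by customer alone and has no notion of consecutive runs.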


【Answer 1】:

You can use the difference between

ROW_NUMBER() OVER (ORDER BY order_time DESC)

and

ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC)

This gaps-and-islands construct provides the grouping:

SELECT XX.customer_name, LAST_VALUE(order_time) OVER W FROM
(
  -- rn stays constant within each consecutive run of the same customer_name
  SELECT X.*, ROW_NUMBER() OVER (ORDER BY order_time DESC) -
              ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC)
              AS rn
  FROM
  (
    SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
           generate_series(8, 1, -1) AS order_time
  ) X
) XX
-- the frame covers the whole rn partition, so LAST_VALUE is its smallest order_time
WINDOW W AS (PARTITION BY rn ORDER BY order_time DESC
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
LIMIT 1; -- no ORDER BY here, so which row comes back is not guaranteed
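
To see how the difference forms the groups, here is a sketch that lists both row numbers and their difference for each row of the sample data (global_rn, per_customer_rn and grp are illustrative aliases, not names from the answer):

```sql
SELECT customer_name,
       order_time,
       ROW_NUMBER() OVER (ORDER BY order_time DESC) AS global_rn,
       ROW_NUMBER() OVER (PARTITION BY customer_name
                          ORDER BY order_time DESC) AS per_customer_rn,
       ROW_NUMBER() OVER (ORDER BY order_time DESC)
         - ROW_NUMBER() OVER (PARTITION BY customer_name
                              ORDER BY order_time DESC) AS grp
FROM
(
  SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
         generate_series(8, 1, -1) AS order_time
) X
ORDER BY order_time DESC;

-- customer_name | order_time | global_rn | per_customer_rn | grp
-- A             | 8          | 1         | 1               | 0
-- A             | 7          | 2         | 2               | 0
-- A             | 6          | 3         | 3               | 0
-- B             | 5          | 4         | 1               | 3
-- B             | 4          | 5         | 2               | 3
-- A             | 3          | 6         | 4               | 2
-- C             | 2          | 7         | 1               | 6
-- B             | 1          | 8         | 3               | 5
```

Within an unbroken run both counters advance together, so the difference stays constant. As soon as another customer interleaves, only the global counter advances and the difference jumps, which is what separates A at 8, 7, 6 (grp 0) from A at 3 (grp 2).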


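【Comments】:

Thanks! This definitely seems to do the trick. I was hoping for something less complicated / more built-in, so to speak, but it seems I was on the right track with the alternative frame specification.

By the way, I shouldn't need the "final" ORDER BY, right? The one before LIMIT 1.

Do we really need to add customer_name to PARTITION BY rn? Isn't that overkill?

Yes @Jeff, good point, we already accounted for that in the previous step.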
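
A hedged note on the two questions raised in these comments. Without an ORDER BY, SQL does not guarantee which row LIMIT 1 returns, so relying on the sort performed for the window is an implementation detail. The row-number difference is also only unique per customer, not across customers: for the sequence A, B, A, B (newest first) the differences are 0, 1, 1, 2, so the second and third rows share a value although they belong to different customers. That collision cannot affect the newest run (its difference is always 0, and no later row can reach 0), but it matters as soon as the per-group value is read for other rows. A more defensive sketch on the same sample data (the aliases grp and first_order_time are illustrative):

```sql
SELECT XX.customer_name,
       LAST_VALUE(order_time) OVER W AS first_order_time
FROM
(
  SELECT X.*,
         ROW_NUMBER() OVER (ORDER BY order_time DESC)
           - ROW_NUMBER() OVER (PARTITION BY customer_name
                                ORDER BY order_time DESC) AS grp
  FROM
  (
    SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
           generate_series(8, 1, -1) AS order_time
  ) X
) XX
-- customer_name plus grp identifies one consecutive run unambiguously
WINDOW W AS (PARTITION BY customer_name, grp ORDER BY order_time DESC
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
-- explicit ordering makes LIMIT 1 pick the newest row deterministically
ORDER BY order_time DESC
LIMIT 1;
```

For the sample data this should return the single row A, 6. Removing the LIMIT gives the earliest order_time of every run rather than only the latest one.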
