PostgreSQL 11 goes for parallel seq scan on partitioned table where index should be enough

Posted 2023-02-15


【Asked】: 2019-07-18 07:06:28 【Question】:

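The problem is that I keep getting a seq scan on a rather simple query against a very trivial setup. What am I doing wrong?

Postgres 11 on Windows Server 2016.
Config changes made: constraint_exclusion = partition.
A single table partitioned into 200 subtables, with tens of millions of records per partition.
An index on the field in question (assuming the index is partitioned as well).

Here are the create statements: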

CREATE TABLE A (
    K int NOT NULL,
    X bigint NOT NULL,
    Date timestamp NOT NULL,
    fy smallint NOT NULL,
    fz decimal(18, 8) NOT NULL,
    fw decimal(18, 8) NOT NULL,
    fv decimal(18, 8) NULL,
    PRIMARY KEY (K, X)
) PARTITION BY LIST (K);

CREATE TABLE A_1 PARTITION OF A FOR VALUES IN (1);
CREATE TABLE A_2 PARTITION OF A FOR VALUES IN (2);
...
CREATE TABLE A_200 PARTITION OF A FOR VALUES IN (200);
CREATE TABLE A_Default PARTITION OF A DEFAULT;

CREATE INDEX IX_A_Date ON A (Date);

The query in question:

SELECT K, MIN(Date), MAX(Date)
FROM A 
GROUP BY K

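This always results in a seq scan that takes several minutes, while it is clearly evident that no table data is needed at all: the Date field is indexed, and I am only asking for the first and last leaf of its B-tree.

Originally the index was on (K, Date), and it quickly showed me that Postgres would not honor it in any query where I presumed it would be used. An index on (Date) did the trick for other queries, and it seems Postgres does partition indexes automatically, as it claims. However, this specific simple query always goes for a seq scan.

Any thoughts appreciated!

UPDATE

The query plan (analyze, buffers) is as follows: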

Finalize GroupAggregate  (cost=4058360.99..4058412.66 rows=200 width=20) (actual time=148448.183..148448.189 rows=5 loops=1)
  Group Key: a_16.k
  Buffers: shared hit=5970 read=548034 dirtied=4851 written=1446
  ->  Gather Merge  (cost=4058360.99..4058407.66 rows=400 width=20) (actual time=148448.166..148463.953 rows=8 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    Buffers: shared hit=5998 read=1919356 dirtied=4865 written=1454
    ->  Sort  (cost=4057360.97..4057361.47 rows=200 width=20) (actual time=148302.271..148302.285 rows=3 loops=3)
        Sort Key: a_16.k
        Sort Method: quicksort  Memory: 25kB
        Worker 0:  Sort Method: quicksort  Memory: 25kB
        Worker 1:  Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=5998 read=1919356 dirtied=4865 written=1454
        ->  Partial HashAggregate  (cost=4057351.32..4057353.32 rows=200 width=20) (actual time=148302.199..148302.203 rows=3 loops=3)
            Group Key: a_16.k
            Buffers: shared hit=5984 read=1919356 dirtied=4865 written=1454
            ->  Parallel Append  (cost=0.00..3347409.96 rows=94658849 width=12) (actual time=1.678..116664.051 rows=75662243 loops=3)
                Buffers: shared hit=5984 read=1919356 dirtied=4865 written=1454
                ->  Parallel Seq Scan on a_16  (cost=0.00..1302601.32 rows=42870432 width=12) (actual time=0.320..41625.766 rows=34283419 loops=3)
                    Buffers: shared hit=14 read=873883 dirtied=14 written=8
                ->  Parallel Seq Scan on a_19  (cost=0.00..794121.94 rows=26070794 width=12) (actual time=0.603..54017.937 rows=31276617 loops=2)
                    Buffers: shared read=533414
                ->  Parallel Seq Scan on a_20  (cost=0.00..447025.50 rows=14900850 width=12) (actual time=0.347..52866.404 rows=35762000 loops=1)
                    Buffers: shared hit=5964 read=292053 dirtied=4850 written=1446
                ->  Parallel Seq Scan on a_18  (cost=0.00..198330.23 rows=6450422 width=12) (actual time=4.504..27197.706 rows=15481014 loops=1)
                    Buffers: shared read=133826
                ->  Parallel Seq Scan on a_17  (cost=0.00..129272.31 rows=4308631 width=12) (actual time=3.014..18423.307 rows=10340224 loops=1)
                    Buffers: shared hit=6 read=86180 dirtied=1
                ...
                ->  Parallel Seq Scan on a_197  (cost=0.00..14.18 rows=418 width=12) (actual time=0.000..0.000 rows=0 loops=1)
                ->  Parallel Seq Scan on a_198  (cost=0.00..14.18 rows=418 width=12) (actual time=0.001..0.002 rows=0 loops=1)
                ->  Parallel Seq Scan on a_199  (cost=0.00..14.18 rows=418 width=12) (actual time=0.001..0.001 rows=0 loops=1)
                ->  Parallel Seq Scan on a_default  (cost=0.00..14.18 rows=418 width=12) (actual time=0.001..0.002 rows=0 loops=1)
Planning Time: 16.893 ms
Execution Time: 148466.519 ms

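UPDATE 2

Just to avoid future "you should index on (K, Date)" suggestions: the query plan with both indexes in place is exactly the same, with the same analyze numbers, and even the buffer hits/reads are almost the same.

【Comments】:

Your query requests all rows from all partitions, so an index is most likely not helpful. In addition, your index only contains the date column and not the K column, so Postgres would need to look up the K value for each date value using random I/O, which is most probably slower than a seq scan. You could try an index on k, date instead. What is the value of random_page_cost? If you are sure that random I/O would be faster, lowering it might convince the planner to favor an index scan.

Restoring the index on (K, Date) was the first thing I tried, and it did not help.

"what am I doing wrong?" You are using Windows? You are using Date as an identifier (for a timestamp...)?

X (bigint) is the identifier, and I use Date for a date because I need a date here. And Windows... does it matter at all?

The timing does look quite slow. Reading 15 million rows from shared memory in 27 seconds is not right. But reading from disk seems slow as well: 292053 blocks or 2GB in 52 seconds - this may well be due to Windows, as NTFS is not the fastest file system out there. One reason for slow I/O performance could be a virus scanner. But I have no idea what would make accessing blocks from the cache so slow. How many CPUs does that server have? Maybe you can mitigate the problem by increasing max_parallel_workers_per_gather and max_parallel_workers.

【Answer 1】: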

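You can enable aggregate push-down into the parallel plan by setting enable_partitionwise_aggregate to on.

That will probably speed up your query to some extent, since PostgreSQL does not have to pass so much data between parallel workers.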
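As a minimal sketch of trying this out: the parameter can be switched on for the current session only and the plan compared against the one in the question. The parameter was introduced in PostgreSQL 11 and is off by default; whether the planner actually picks a partitionwise plan here is an assumption that depends on costs and statistics.

```sql
-- Session-level change only; nothing is written to postgresql.conf.
SET enable_partitionwise_aggregate = on;

-- Re-run the query from the question and look for per-partition
-- aggregate nodes underneath the Append instead of one aggregate on top.
EXPLAIN (ANALYZE, BUFFERS)
SELECT K, MIN(Date), MAX(Date)
FROM A
GROUP BY K;

-- Return to the server default afterwards.
RESET enable_partitionwise_aggregate;
```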

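But it looks like PostgreSQL is not smart enough to figure out that it can use the indexes to speed up min and max for each partition, although it is smart enough to do that with a non-partitioned table.

There is no great way to work around this; you could resort to querying each partition: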

SELECT k, min(min_date), max(max_date)
FROM (
   SELECT 1 AS k, MIN(date) AS min_date, MAX(date) AS max_date FROM a_1
   UNION ALL
   SELECT 2, MIN(date), MAX(date) FROM a_2
   UNION ALL
   ...
   SELECT 200, MIN(date), MAX(date) FROM a_200
   UNION ALL
   -- k is not a constant in the default partition, so it needs its own GROUP BY
   SELECT k, MIN(date), MAX(date) FROM a_default GROUP BY k
) AS all_a
GROUP BY k;
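Writing 200 branches by hand is tedious, so the query text can be generated instead. The following is only a sketch: it assumes the partitions really are named a_1 through a_200 and that each holds exactly the K value in its name, as in the CREATE statements from the question. The default partition is left out and would need an extra branch of its own.

```sql
-- Build the per-partition UNION ALL query as a single text value.
-- Assumption: partitions a_1 .. a_200 exist, one K value per partition.
SELECT 'SELECT k, min(min_date), max(max_date) FROM ('
       || E'\n'
       || string_agg(
              format('SELECT %s AS k, MIN(date) AS min_date, MAX(date) AS max_date FROM a_%s',
                     n, n),
              E'\nUNION ALL\n' ORDER BY n)
       || E'\n) AS all_a GROUP BY k;' AS generated_query
FROM generate_series(1, 200) AS n;
```

The generated text can be pasted back into a session. In psql, ending the generating query with \gexec instead of the semicolon should run the generated statement directly. Note one difference from the original query: an aggregate without GROUP BY returns one row even for an empty partition, so empty partitions show up with NULL dates unless the outer query filters them out.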

Yuck! There is clearly room for improvement here.

I dug into the code and found the reason in src/backend/optimizer/plan/planagg.c:

/*
 * preprocess_minmax_aggregates - preprocess MIN/MAX aggregates
 *
 * Check to see whether the query contains MIN/MAX aggregate functions that
 * might be optimizable via indexscans.  If it does, and all the aggregates
 * are potentially optimizable, then create a MinMaxAggPath and add it to
 * the (UPPERREL_GROUP_AGG, NULL) upperrel.
[...]
 */
void
preprocess_minmax_aggregates(PlannerInfo *root, List *tlist)

[...]                                                                                
    /*
     * Reject unoptimizable cases.
     *
     * We don't handle GROUP BY or windowing, because our current
     * implementations of grouping require looking at all the rows anyway, and
     * so there's not much point in optimizing MIN/MAX.
     */
    if (parse->groupClause || list_length(parse->groupingSets) > 1 ||
        parse->hasWindowFuncs)
        return;

Basically, PostgreSQL gives up on this optimization as soon as it sees a GROUP BY clause.
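One way to observe the effect of that check is to compare plans for a single partition with and without GROUP BY. This is a sketch using partition a_16 from the plan in the question; the exact plan shapes depend on statistics and settings, so no output is shown here.

```sql
-- No GROUP BY: the query is a candidate for the MIN/MAX index
-- optimization handled in planagg.c.
EXPLAIN
SELECT MIN(date), MAX(date) FROM a_16;

-- With GROUP BY: preprocess_minmax_aggregates() returns early, so the
-- planner falls back to ordinary aggregation over the scanned rows.
EXPLAIN
SELECT k, MIN(date), MAX(date) FROM a_16 GROUP BY k;
```

If the first plan shows a Result node with InitPlan subqueries that each do a limited index scan on the Date index, the optimization has been applied; the second plan would be expected to aggregate over a scan of the partition instead.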

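【Comments】:

"Yuck! There is clearly room for improvement here." The topic starter can use dynamic SQL, which PostgreSQL supports, to generate the inner SQL.

I also noticed just now: you are mixing non-aggregated columns with aggregated columns, which the SQL standard does not allow, so PostgreSQL will most likely raise an error on your query.

Thanks, that was an omission. Fixed.

No problem. I should have said it was in the outer SQL; the inner SQL was correct, because the columns there are constants, so it is allowed.

Thanks! That explains it completely. I will keep this nuance in mind when working with Postgres.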
