小表上的仅索引扫描非常慢

Posted 2023-04-14

技术标签:

【中文标题】小表上的仅索引扫描非常慢【英文标题】：Index-only scan very slow on small table 【发布时间】：2021-04-21 14:20:20 【问题描述】：

我正在尝试优化以下 PostgreSQL 查询：

SELECT DISTINCT route.id,
                route.ref_origine,
                route.platform
FROM route
         INNER JOIN level ON (route.id IN (
    SELECT level_route.route_id
    FROM level_route
    WHERE level.id = level_route.level_id))
         CROSS JOIN site
WHERE level.ref_site = site.id
  AND site.ref_client = 1
  AND site.visible = TRUE;

为了清楚起见，我重命名了表和字段，但它是由 QueryDSL 生成的。

在分析这个查询时，我可以看到：

预计的计划时间总是在 2 秒左右实际执行时间始终在 1.4 秒左右

我想提高这个特定查询的性能，因为我们有很多这样的查询要执行，这使我们的工作运行时间过长。

当输入我的一个解释分析结果时，PostgreSQL Explain Visualizer 会为我提供以下信息：

执行时间（秒）：2.16 计划时间（毫秒）：0.22 最慢节点：2.16 最大节点（行）：3030 最昂贵的节点：581,444.75

再次运行 EXPLAIN ANALYZE（PEV 没有 JSON 格式，所以它有点不同但类似的计划）给了我以下输出：

HashAggregate  (cost=582383.52..582444.12 rows=6060 width=17) (actual time=2054.753..2054.984 rows=1088 loops=1)
"  Group Key: route.id, route.ref_origine, route.platform"
  ->  Nested Loop  (cost=3.99..581516.26 rows=115635 width=17) (actual time=95.786..2053.788 rows=1230 loops=1)
        Join Filter: (SubPlan 1)
        Rows Removed by Join Filter: 268440
        ->  Seq Scan on route  (cost=0.00..50.30 rows=3030 width=17) (actual time=0.006..1.104 rows=3030 loops=1)
        ->  Materialize  (cost=3.99..9.15 rows=76 width=8) (actual time=0.000..0.007 rows=89 loops=3030)
              ->  Hash Join  (cost=3.99..8.77 rows=76 width=8) (actual time=0.029..0.070 rows=89 loops=1)
                    Hash Cond: (level.ref_site = site.id)
                    ->  Seq Scan on level  (cost=0.00..3.47 rows=147 width=16) (actual time=0.002..0.018 rows=148 loops=1)
                    ->  Hash  (cost=3.65..3.65 rows=27 width=8) (actual time=0.021..0.021 rows=28 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 10kB
                          ->  Seq Scan on site  (cost=0.00..3.65 rows=27 width=8) (actual time=0.005..0.017 rows=28 loops=1)
                                Filter: (visible AND (ref_client = 1))
                                Rows Removed by Filter: 25
        SubPlan 1
          ->  Index Only Scan using level_route_pkey on level_route  (cost=0.28..4.68 rows=23 width=8) (actual time=0.002..0.005 rows=14 loops=269670)
                Index Cond: (level_id = level.id)
                Heap Fetches: 0
Planning time: 0.221 ms
Execution time: 2055.094 ms

据我所知，问题似乎出在此复制/粘贴计划的最后一部分：表 level_route 的索引仅扫描。

确实，level_route是一个3030行的表，只包含两个字段：level_id和route_id。常规关联表。主键 level_route_pkey 是 (level_id, route_id) 所以我们在这个字段组合上有一个索引，它用于仅索引扫描。虽然这看起来合乎逻辑且聪明（避免了无用的堆提取），但为什么对包含两个基本字段的 3030 行表的仅索引扫描需要超过一秒的时间才能执行？

这里有一些我不明白的地方。寻找任何建议/建议以改进此查询执行时间和性能。

这里是表格和相关的索引定义：

路由表

id          bigint                not null,
ref_origine bigint,
platform    boolean default false not null,
constraint route_pkey
    primary key (id),
constraint fk6o50xseq5lpmpw8mo8de5nylu
    foreign key (ref_origine) references point

等级表

id          bigint                      not null,
altitude    integer           default 0 not null,
elevation   integer                     not null,
name        varchar(255)                not null,
svg         text                        not null,
ref_site    bigint,
description varchar(10485760) default ''::character varying,
type        varchar(50)       default NULL::character varying,
constraint level_pkey
    primary key (id),
constraint unique_elevation_for_level_in_site
    unique (ref_site, elevation),
constraint fkl4mlang0ahx72vnrp1slop579
    foreign key (ref_site) references site

level_route 表

level_id bigint not null,
route_id bigint not null,
constraint level_route_pkey
    primary key (level_id, route_id),
constraint fkd0kfn8u2jfqcg46li5aw2mp5n
    foreign key (level_id) references level,
constraint fkm1to01mwy537dxb7vu8qk9ut1
    foreign key (route_id) references route

create index if not exists level_route_level_id_index
    on level_route (level_id);

网站表格

id          bigint                                          not null,
description varchar(255),
name        varchar(255)                                    not null,
naoid       bigint,
naotoken    varchar(255),
version     integer      default 0                          not null,
ref_client  bigint,
uuid        varchar(255) default ''::character varying      not null,
deployement varchar(50)  default 'Never'::character varying not null,
deployedat  timestamp,
visible     boolean      default false                      not null,
constraint site_pkey
    primary key (id),
constraint fkhqoxjhhr0ud7i5cvu9nhwc62k
    foreign key (ref_client) references client

非常感谢愿意花时间在这里帮助我的人。

【问题讨论】：

CROSS JOIN site 似乎很奇怪，然后有 WHERE 条件涉及它。这通常意味着使用带有 ON 条件的常规 JOIN。你说得很好。然而，即使我切换此 CROSS JOIN 并使用 JOIN ON 代替 level.ref_site = site.id 上的条件，它也不会更改查询计划中最慢的部分，也不会影响此查询的整体性能：/ 【参考方案1】：

按照您编写查询的方式，PostgreSQL 必须为连接中的每一对行执行子查询。仅索引扫描执行得非常快，但过于频繁。

以合理的方式重写查询：

SELECT DISTINCT
       route.id,
       route.ref_origine,
       route.platform
FROM route
   INNER JOIN FROM level_route
      ON route.id = level_route.route_id
   INNER JOIN level
      ON level.id = level_route.level_id
   INNER JOIN site
      ON level.ref_site = site.id
WHERE site.ref_client = 1
  AND site.visible;

【讨论】：

哇，非常感谢，我读了很多遍，没有注意到这个巨大的循环。您的查询版本非常快，我在本机查询中用这个版本替换了我们使用的 QueryDSL 实现，现在一切都运行得非常快。

以上是关于小表上的仅索引扫描非常慢的主要内容，如果未能解决你的问题，请参考以下文章