postgresql doesn't use index for primary key = foreign key

[Posted]: 2017-12-15 09:18:59

[Question]:

I have 3 main tables:

ts_entity(id,short_name,name,type_id)
ts_entry_entity(id,entity_id,entry_id)
ts_entry(id, ... other columns ...)

All id columns are UUIDs and have a B-tree index.

ts_entry_entity.entity_id has a foreign key to ts_entity.id and also has a B-tree index.

ts_entry_entity.entry_id is also a foreign key and also has a B-tree index.
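For readers who want to reproduce the setup, the description above could correspond to DDL along these lines. This is only a sketch: the index names and every column type are assumptions. In particular, the `::bpchar` cast that appears in the execution plan suggests the real id columns may be char(n) columns holding UUID strings rather than the native uuid type used here.

```sql
-- Hypothetical DDL; types and index names are assumptions, not the asker's schema.
CREATE TABLE ts_entity (
    id         uuid PRIMARY KEY,   -- PRIMARY KEY creates a unique B-tree index
    short_name text,
    name       text,
    type_id    uuid
);

CREATE TABLE ts_entry (
    id uuid PRIMARY KEY
    -- ... other columns ...
);

CREATE TABLE ts_entry_entity (
    id        uuid PRIMARY KEY,
    entity_id uuid REFERENCES ts_entity (id),
    entry_id  uuid REFERENCES ts_entry (id)
);

-- PostgreSQL does not index the referencing column of a foreign key
-- automatically, so these B-tree indexes are created explicitly.
CREATE INDEX ts_entry_entity_entity_id_idx ON ts_entry_entity (entity_id);
CREATE INDEX ts_entry_entity_entry_id_idx  ON ts_entry_entity (entry_id);
```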

I have a SQL query like this:

select ts_entity.id,ts_entity.short_name,ts_entity.name,ts_entry.id, ... ts_entry.otherColumns ... 
from ts_entity,ts_entry_entity,ts_entry 
where ts_entity.id=ts_entry_entity.entity_id 
and ts_entry_entity.entry_id=ts_entry.id 
and ... ts_entry.otherColumns='xxx' ... 
order by ts_entity.short_name 
limit 100 offset 0

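Here comes the strange part: "ts_entry_entity.entity_id=ts_entity.id" doesn't use any index, and it takes about 50 seconds.

There is no where condition on ts_entity.

My questions: why doesn't ts_entry_entity.entity_id=ts_entity.id use an index? Why does it take so much time? How can I optimize the SQL?

Below is the explain analyze result.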

Limit  (cost=235455.31..235455.41 rows=1 width=1808) (actual time=54590.304..54590.781 rows=100 loops=1)
   ->  Unique  (cost=235455.31..235455.41 rows=1 width=1808) (actual time=54590.301..54590.666 rows=100 loops=1)
         ->  Sort  (cost=235455.31..235455.32 rows=1 width=1808) (actual time=54590.297..54590.410 rows=100 loops=1)
               Sort Key: ts_entity.short_name, ts_entity.id, ts_entity.name, ts_entry_version.display_date, ts_entry.id, (formatdate(totimestamp(ts_entry_version.display_date, '-5'::character varying), 'MM/DD/YYYY'::character varying)), ts_entry_version.submitted_date, (formatdate(totimestamp(ts_entry_version.submitted_date, '-5'::character varying), 'MM/DD/YYYY'::character varying)), ts_entry_type.name, (get_priority((ts_entry_version.priority)::integer)), ts_entry_version.priority, (get_sentiment((ts_entry_version.sentiment)::integer)), ts_entry_version.sentiment, (getdisplayvalue((ts_entry_version.source_id)::character varying, 0, ', '::character varying)), ts_entry_version.source_id, (NULLIF((ts_entry_version.title)::text, ''::text)), ts_entry.submitted_date, (formatdate(totimestamp(ts_entry.submitted_date, '-5'::character varying), 'MM/DD/YYYY'::character varying)), (getdisplayvalue((ts_entry_version.submitter_id)::character varying, 0, ', '::character varying)), ts_entry_version.submitter_id, entryadhoc_o9e2c9f871634dd3aeafe9bdced2e34f.owner_id, (getdisplayvalue(toentityid((entryadhoc_o9e2c9f871634dd3aeafe9bdced2e34f.value)::character varying, '23f03fe70a16aed0d7e210357164e401'::character varying), 0, ', '::character varying)), (toentityid((entryadhoc_o9e2c9f871634dd3aeafe9bdced2e34f.value)::character varying, '23f03fe70a16aed0d7e210357164e401'::character varying)), entryadhoc_td66ad96a9ab472db3cf1279b65baa69.owner_id, (totimestamp((entryadhoc_td66ad96a9ab472db3cf1279b65baa69.value)::character varying, '-5'::character varying)), (formatdate(totimestamp((entryadhoc_td66ad96a9ab472db3cf1279b65baa69.value)::character varying, '-5'::character varying), 'MM/DD/YYYY'::character varying)), entryadhoc_z3757638d8d64373ad835c3523a6a70b.owner_id, (totimestamp((entryadhoc_z3757638d8d64373ad835c3523a6a70b.value)::character varying, '-5'::character varying)), (formatdate(totimestamp((entryadhoc_z3757638d8d64373ad835c3523a6a70b.value)::character varying, '-5'::character varying), 'MM/DD/YYYY'::character varying)), entryadhoc_i0f819c1244b427794a83767eaa68e73.owner_id, (totimestamp((entryadhoc_i0f819c1244b427794a83767eaa68e73.value)::character varying, '-5'::character varying)), (formatdate(totimestamp((entryadhoc_i0f819c1244b427794a83767eaa68e73.value)::character varying, '-5'::character varying), 'MM/DD/YYYY'::character varying)), entryadhoc_i7f5d5035cac421daa9879c1e21ec63f.owner_id, (getdisplayvalue(toentityid((entryadhoc_i7f5d5035cac421daa9879c1e21ec63f.value)::character varying, '23f03fe70a16aed0d7e210357164e401'::character varying), 0, ', '::character varying)), (toentityid((entryadhoc_i7f5d5035cac421daa9879c1e21ec63f.value)::character varying, '23f03fe70a16aed0d7e210357164e401'::character varying)), entryadhoc_v7f9c1146ee24742a73b83526dc66df7.owner_id, (NULLIF(entryadhoc_v7f9c1146ee24742a73b83526dc66df7.value, ''::text))
               Sort Method: external merge  Disk: 3360kB
               ->  Nested Loop  (cost=22979.01..235455.30 rows=1 width=1808) (actual time=94.889..54532.919 rows=2846 loops=1)
                     Join Filter: (ts_entry_entity.entity_id = ts_entity.id)
                     Rows Removed by Join Filter: 34363583
                     ->  Nested Loop  (cost=22979.01..234676.15 rows=1 width=987) (actual time=78.801..2914.864 rows=2846 loops=1)
                           ->  Nested Loop Anti Join  (cost=22978.59..234675.43 rows=1 width=987) (actual time=78.776..2867.254 rows=2846 loops=1)
                                 ->  Hash Join  (cost=22978.17..63457.52 rows=258 width=987) (actual time=78.614..2573.586 rows=2846 loops=1)
                                       Hash Cond: (ts_entry.current_version_id = ts_entry_version.id)
                                       ->  Hash Left Join  (cost=19831.38..59727.56 rows=154823 width=383) (actual time=47.558..2391.088 rows=155061 loops=1)
                                             Hash Cond: (ts_entry.id = entryadhoc_v7f9c1146ee24742a73b83526dc66df7.owner_id)
                                             ->  Hash Left Join  (cost=16526.15..54467.69 rows=154823 width=337) (actual time=38.534..2138.354 rows=155061 loops=1)
                                                   Hash Cond: (ts_entry.id = entryadhoc_i7f5d5035cac421daa9879c1e21ec63f.owner_id)
                                                   ->  Hash Left Join  (cost=13220.92..49207.82 rows=154823 width=291) (actual time=30.462..1888.735 rows=155061 loops=1)
                                                         Hash Cond: (ts_entry.id = entryadhoc_i0f819c1244b427794a83767eaa68e73.owner_id)
                                                         ->  Hash Left Join  (cost=9915.69..43947.95 rows=154823 width=245) (actual time=22.268..1640.688 rows=155061 loops=1)
                                                               Hash Cond: (ts_entry.id = entryadhoc_z3757638d8d64373ad835c3523a6a70b.owner_id)
                                                               ->  Hash Left Join  (cost=6610.46..38688.08 rows=154823 width=199) (actual time=19.612..1409.457 rows=155061 loops=1)
                                                                     Hash Cond: (ts_entry.id = entryadhoc_td66ad96a9ab472db3cf1279b65baa69.owner_id)
                                                                     ->  Hash Left Join  (cost=3305.23..33428.21 rows=154823 width=153) (actual time=12.431..1161.689 rows=155061 loops=1)
                                                                           Hash Cond: (ts_entry.id = entryadhoc_o9e2c9f871634dd3aeafe9bdced2e34f.owner_id)
                                                                           ->  Seq Scan on ts_entry  (cost=0.00..28168.34 rows=154823 width=107) (actual time=0.101..898.818 rows=155061 loops=1)
                                                                                 Filter: ((NOT is_draft) AND (class <> 2))
                                                                                 Rows Removed by Filter: 236596
                                                                           ->  Hash  (cost=3292.29..3292.29 rows=1035 width=46) (actual time=12.304..12.304 rows=2846 loops=1)
                                                                                 Buckets: 4096 (originally 2048)  Batches: 1 (originally 1)  Memory Usage: 305kB
                                                                                 ->  Bitmap Heap Scan on ts_attribute entryadhoc_o9e2c9f871634dd3aeafe9bdced2e34f  (cost=40.45..3292.29 rows=1035 width=46) (actual time=1.191..9.030 rows=2846 loops=1)
                                                                                       Recheck Cond: (def_id = 'b4e9878722eb409c9fdfff3fdba582a3'::bpchar)
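
A rough reading of the top-level Nested Loop, using only numbers printed in the plan above: the join condition shows up as a Join Filter rather than an Index Cond, so every one of the 2846 outer rows is compared against the whole inner side. The arithmetic below is a sketch. The inner side of that join is not visible in the posted part of the plan, so the per-row figure is an estimate, and the column aliases are made up for illustration.

```sql
SELECT
    34363583 + 2846                    AS rows_compared,                   -- removed by the filter + rows that passed
    round((34363583 + 2846) / 2846.0)  AS approx_inner_rows_per_outer_row, -- comparisons per outer row
    round(54532.919 - 2914.864)        AS ms_in_top_join;                  -- top join time minus its outer child
```

That works out to roughly 12,000 comparisons per outer row and about 51.6 of the 54.6 seconds, which is consistent with the planner having estimated rows=1 for the outer side while 2846 rows actually arrived.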

More details about the tables:

ts_entity(id,short_name,name,type_id)
ts_entry_entity(id,entity_id,entry_id)
ts_entry(id,version_id)
ts_entry_version(id,entry_id,submitted_date,title,submitter)
ts_attribute(id,attribute_definition_id,entry_id,value)
ts_attribute_definition(id,name)

As you can see, ts_entry_version holds all versions of an entry. ts_attribute is used for the extensible columns of an entry.

More details about the SQL

We have several filters on ts_entry_version columns and on ts_attribute.value. ts_attribute.value is a varchar, but its content may be a time in milliseconds, a plain string value, or one or more id values. The structure of the SQL is as follows:

select ts_entity.short_name, ts_entry_version.title, ts_attribute.value
from ts_entity, ts_entry_entity, ts_entry_version,
     ts_entry left join ts_attribute
       on ts_entry.id=ts_attribute.entry_id
      and ts_attribute.attribute_definition_id='xxx'
where ts_entity.id=ts_entry_entity.entity_id
and ts_entry_entity.entry_id=ts_entry.id
and ts_entry.version_id=ts_entry_version.id
and ts_entry_version.title like '%xxx%'
order by ts_entity.short_name asc
limit 100 offset 0

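[Comments]:

Please post your execution plan as formatted text, no screen shots. You can also upload it to explain.depesz.com

Just added the formatted text of the execution plan. Sorry for the screenshot.

You have to post the whole plan, not only a part of it, and format it properly so that people can read it. Obviously, the nested loop join was chosen because of an underestimate.

This appears to be an EAV model with ts_entry_entity as a bridge table. I would put two composite indexes on it, (entry_id, entity_id) and (entity_id, entry_id), lose the id surrogate, and run VACUUM ANALYZE on all three tables. There is also a lot of date-to-string conversion going on; I hope that is not part of the join fields.

And the order by ts_entity.short_name is orthogonal to the join fields. Combined with LIMIT 100, this could mean that all results have to be retrieved before sorting and limiting. An index on short_name could help. And VACUUM ANALYZE.

[Answer 1]: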
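
The index and statistics suggestions from the comments can be sketched as follows. The index names are hypothetical; only the table names, the column pairs, and VACUUM ANALYZE come from the discussion. Whether the planner actually picks these indexes depends on the data and on the row estimates, so this is something to try rather than a guaranteed fix.

```sql
-- Composite indexes on the bridge table, one per join direction.
-- Index names are made up for illustration.
CREATE INDEX ts_entry_entity_entry_entity_idx ON ts_entry_entity (entry_id, entity_id);
CREATE INDEX ts_entry_entity_entity_entry_idx ON ts_entry_entity (entity_id, entry_id);

-- Index on the ORDER BY column, as suggested for the LIMIT 100 case.
CREATE INDEX ts_entity_short_name_idx ON ts_entity (short_name);

-- Refresh planner statistics on the three tables.
-- VACUUM cannot run inside a transaction block.
VACUUM ANALYZE ts_entity;
VACUUM ANALYZE ts_entry_entity;
VACUUM ANALYZE ts_entry;
```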

I found a clue in the official PostgreSQL documentation: https://www.postgresql.org/docs/current/static/runtime-config-query.html

After changing the configuration, the query optimizer prefers to use the index.
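
The answer does not say which settings were changed. As a sketch of how such an experiment could look, the following tries two parameters from that documentation page, enable_nestloop and random_page_cost, for a single transaction only. The choice of parameters and the value 1.1 are assumptions, and the query is the simplified skeleton from the question rather than the full statement.

```sql
BEGIN;

-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL enable_nestloop = off;    -- discourage nested loop joins
SET LOCAL random_page_cost = 1.1;   -- assumed value; the default is 4.0

-- Note: EXPLAIN ANALYZE really executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ts_entity.id, ts_entity.short_name, ts_entity.name, ts_entry.id
FROM ts_entity, ts_entry_entity, ts_entry
WHERE ts_entity.id = ts_entry_entity.entity_id
  AND ts_entry_entity.entry_id = ts_entry.id
ORDER BY ts_entity.short_name
LIMIT 100 OFFSET 0;

ROLLBACK;
```

Comparing this plan with the original one shows whether the join on entity_id switches to a hash join or an index scan. If it does, the setting can then be applied more permanently, for example per role or in postgresql.conf, after checking its effect on other queries.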
