How to optimize a complex query with 17 joined tables while limiting the rows per join using LIMIT syntax
【Posted】: 2021-06-16 20:05:19
【Question】: Below is an example of a complex query generated by our system. In this example we join data from 17 other tables. For each joined table I use the LIMIT keyword to cap the number of items returned from that table. The goal is to retrieve at most 50 items per joined table. For queries with far fewer joins (7-10), this seems to work fine.
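However, with a limit of 50 in this query, I get the error: ERROR: temporary file size exceeds temp_file_limit (1025563kB).
If I change the limit from 50 to 5, the query runs in 36 seconds. With a limit of 3 it runs in 3 seconds, and with a limit of 2 it runs in 260 ms.
My question: is there a more efficient way to run a complex query like this so that each join can return up to 50 items? Or is this simply too much for Postgres to handle in a single query?
What I find odd is how steeply the runtime falls, down to 260 ms, as the number of child items returned drops from 5 to 2.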
SELECT Count (*),
array_to_json((Array_agg(t))[0:500]) AS array
FROM (
SELECT tbl_338.id,
custom.fullname AS "CustomID",
tbl_338.field_7,
tbl_338.field_6,
tbl_338.field_5,
tbl_338.field_1,
tbl_338.field_2,
tbl_338.field_18,
tbl_338.field_17,
tbl_338.field_3,
tbl_338.field_32,
tbl_338.addedon,
tbl_338.updatedon,
tbl_338.field_16,
tbl_338.id,
tbl_338.addedby,
tbl_338.updatedby ,
jsonb_agg(DISTINCT jsonb_build_object('id',tbl_340_field_15.id,'data',tbl_340_field_15.fullname)) AS field_15,
jsonb_agg(DISTINCT jsonb_build_object('id',tbl_408_field_30.id,'data',tbl_408_field_30.fullname)) AS field_30,
jsonb_agg(DISTINCT jsonb_build_object('id',tbl_342_field_19.id,'data',tbl_342_field_19.fullname)) AS field_19 ,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_34.optionid,'data',field_34.OPTION,'attributes',field_34.attributes)) AS field_34,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_23.optionid,'data',field_23.OPTION,'attributes',field_23.attributes)) AS field_23,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_24.optionid,'data',field_24.OPTION,'attributes',field_24.attributes)) AS field_24,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_22.optionid,'data',field_22.OPTION,'attributes',field_22.attributes)) AS field_22,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_33.optionid,'data',field_33.OPTION,'attributes',field_33.attributes)) AS field_33,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_37.optionid,'data',field_37.OPTION,'attributes',field_37.attributes)) AS field_37,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_36.optionid,'data',field_36.OPTION,'attributes',field_36.attributes)) AS field_36,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_21.optionid,'data',field_21.OPTION,'attributes',field_21.attributes)) AS field_21,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_38.optionid,'data',field_38.OPTION,'attributes',field_38.attributes)) AS field_38,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_14.optionid,'data',field_14.OPTION,'attributes',field_14.attributes)) AS field_14,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_31.optionid,'data',field_31.OPTION,'attributes',field_31.attributes)) AS field_31,
jsonb_agg(DISTINCT jsonb_build_object('optionid',field_8.optionid,'data',field_8.OPTION,'attributes',field_8.attributes)) AS field_8 ,
jsonb_agg(DISTINCT jsonb_build_object('messageid',msg.messageid,'message',msg.message,'schedule',msg.schedule,'tablerowid',msg.tablerowid,'addedon',msg.addedon)) AS field_4
FROM schema_131.tbl_338 tbl_338
LEFT JOIN schema_131.tbl_338_customid custom
ON custom.id=tbl_338.id
LEFT JOIN lateral
(
SELECT DISTINCT field_15.*
FROM schema_131.tbl_338_to_tbl_340_field_15 field_15
WHERE field_15.tbl_338_field_15_id=tbl_338.id limit 50) field_15
ON true
LEFT JOIN lateral
(
SELECT DISTINCT tbl_340_field_15.*
FROM schema_131.tbl_340_customid tbl_340_field_15
WHERE tbl_340_field_15.id = field_15.tbl_340_field_5_id limit 50 ) tbl_340_field_15
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_30.*
FROM schema_131.tbl_408_to_tbl_338_field_4 field_30
WHERE field_30.tbl_338_field_30_id=tbl_338.id limit 50) field_30
ON true
LEFT JOIN lateral
(
SELECT DISTINCT tbl_408_field_30.*
FROM schema_131.tbl_408_customid tbl_408_field_30
WHERE tbl_408_field_30.id = field_30.tbl_408_field_4_id limit 50 ) tbl_408_field_30
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_19.*
FROM schema_131.tbl_338_to_tbl_342_field_19 field_19
WHERE field_19.tbl_338_field_19_id=tbl_338.id limit 50) field_19
ON true
LEFT JOIN lateral
(
SELECT DISTINCT tbl_342_field_19.*
FROM schema_131.tbl_342_customid tbl_342_field_19
WHERE tbl_342_field_19.id = field_19.tbl_342_field_5_id limit 50 ) tbl_342_field_19
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_34_join.*
FROM schema_131.tbl_338_field_34_join field_34_join
WHERE field_34_join.id=tbl_338.id limit 50) field_34_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_34.*
FROM schema_131.tbl_338_field_34 field_34
WHERE field_34.optionid = field_34_join.optionid
ORDER BY field_34.rank limit 5 ) field_34
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_23_join.*
FROM schema_131.tbl_338_field_23_join field_23_join
WHERE field_23_join.id=tbl_338.id limit 50) field_23_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_23.*
FROM schema_131.tbl_338_field_23 field_23
WHERE field_23.optionid = field_23_join.optionid
ORDER BY field_23.rank limit 5 ) field_23
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_24_join.*
FROM schema_131.tbl_338_field_24_join field_24_join
WHERE field_24_join.id=tbl_338.id limit 50) field_24_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_24.*
FROM schema_131.tbl_338_field_24 field_24
WHERE field_24.optionid = field_24_join.optionid
ORDER BY field_24.rank limit 5 ) field_24
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_22_join.*
FROM schema_131.tbl_338_field_22_join field_22_join
WHERE field_22_join.id=tbl_338.id limit 50) field_22_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_22.*
FROM schema_131.tbl_338_field_22 field_22
WHERE field_22.optionid = field_22_join.optionid
ORDER BY field_22.rank limit 5 ) field_22
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_33_join.*
FROM schema_131.tbl_338_field_33_join field_33_join
WHERE field_33_join.id=tbl_338.id limit 50) field_33_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_33.*
FROM schema_131.tbl_338_field_33 field_33
WHERE field_33.optionid = field_33_join.optionid
ORDER BY field_33.rank limit 5 ) field_33
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_37_join.*
FROM schema_131.tbl_338_field_37_join field_37_join
WHERE field_37_join.id=tbl_338.id limit 50) field_37_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_37.*
FROM schema_131.tbl_338_field_37 field_37
WHERE field_37.optionid = field_37_join.optionid
ORDER BY field_37.rank limit 5 ) field_37
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_36_join.*
FROM schema_131.tbl_338_field_36_join field_36_join
WHERE field_36_join.id=tbl_338.id limit 50) field_36_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_36.*
FROM schema_131.tbl_338_field_36 field_36
WHERE field_36.optionid = field_36_join.optionid
ORDER BY field_36.rank limit 5 ) field_36
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_21_join.*
FROM schema_131.tbl_338_field_21_join field_21_join
WHERE field_21_join.id=tbl_338.id limit 50) field_21_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_21.*
FROM schema_131.tbl_338_field_21 field_21
WHERE field_21.optionid = field_21_join.optionid
ORDER BY field_21.rank limit 5 ) field_21
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_38_join.*
FROM schema_131.tbl_338_field_38_join field_38_join
WHERE field_38_join.id=tbl_338.id limit 50) field_38_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_38.*
FROM schema_131.tbl_338_field_38 field_38
WHERE field_38.optionid = field_38_join.optionid
ORDER BY field_38.rank limit 5 ) field_38
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_14_join.*
FROM schema_131.tbl_338_field_14_join field_14_join
WHERE field_14_join.id=tbl_338.id limit 50) field_14_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_14.*
FROM schema_131.tbl_338_field_14 field_14
WHERE field_14.optionid = field_14_join.optionid
ORDER BY field_14.rank limit 5 ) field_14
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_31_join.*
FROM schema_131.tbl_338_field_31_join field_31_join
WHERE field_31_join.id=tbl_338.id limit 50) field_31_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_31.*
FROM schema_131.tbl_338_field_31 field_31
WHERE field_31.optionid = field_31_join.optionid
ORDER BY field_31.rank limit 5 ) field_31
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_8_join.*
FROM schema_131.tbl_338_field_8_join field_8_join
WHERE field_8_join.id=tbl_338.id limit 50) field_8_join
ON true
LEFT JOIN lateral
(
SELECT DISTINCT field_8.*
FROM schema_131.tbl_338_field_8 field_8
WHERE field_8.optionid = field_8_join.optionid
ORDER BY field_8.rank limit 5 ) field_8
ON true
LEFT JOIN lateral
(
SELECT DISTINCT msg.*
FROM schema_131.messages msg
WHERE msg.graceblockssms=tbl_338.smsnumber
AND msg.recipientsms=tbl_338.field_3
ORDER BY msg.addedon DESC limit 1) msg
ON true
GROUP BY tbl_338.id,
custom.fullname,
tbl_338.field_7,
tbl_338.field_6,
tbl_338.field_5,
tbl_338.field_1,
tbl_338.field_2,
tbl_338.field_18,
tbl_338.field_17,
tbl_338.field_3,
tbl_338.field_32,
tbl_338.addedon,
tbl_338.updatedon,
tbl_338.field_16,
tbl_338.id,
tbl_338.addedby,
tbl_338.updatedby
ORDER BY tbl_338.id ASC ) t;
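The timings above are consistent with row multiplication: every lateral subquery that returns several rows for one tbl_338 row multiplies the intermediate result, and 15 of the lateral pairs here are capped at 50, so in the worst case the GROUP BY and the jsonb_agg(DISTINCT ...) calls have to work through the product of all those counts before collapsing it again. A common way to avoid this is to aggregate inside each lateral subquery so that it yields exactly one row per parent row. The following is a minimal sketch for two of the fields only, not the original system's query; the aliases link, c, j, o, x, f15 and f34 are made up for illustration, and it assumes the join columns mean what the original query suggests.

```sql
-- Sketch: each LATERAL subquery aggregates its own children, so the outer
-- query stays at one row per tbl_338 row and needs no GROUP BY or DISTINCT.
SELECT tbl_338.id,
       custom.fullname AS "CustomID",
       f15.field_15,
       f34.field_34
FROM schema_131.tbl_338 tbl_338
LEFT JOIN schema_131.tbl_338_customid custom
       ON custom.id = tbl_338.id
LEFT JOIN LATERAL (
    -- Linked records: at most 50 per parent row, aggregated in place.
    SELECT jsonb_agg(jsonb_build_object('id', x.id, 'data', x.fullname)) AS field_15
    FROM (
        SELECT c.id, c.fullname
        FROM schema_131.tbl_338_to_tbl_340_field_15 link
        JOIN schema_131.tbl_340_customid c
          ON c.id = link.tbl_340_field_5_id
        WHERE link.tbl_338_field_15_id = tbl_338.id
        ORDER BY c.id
        LIMIT 50
    ) x
) f15 ON true
LEFT JOIN LATERAL (
    -- Option values: at most 50 per parent row, ordered by rank.
    SELECT jsonb_agg(jsonb_build_object('optionid', x.optionid,
                                        'data', x.option,
                                        'attributes', x.attributes)) AS field_34
    FROM (
        SELECT o.optionid, o.option, o.attributes
        FROM schema_131.tbl_338_field_34_join j
        JOIN schema_131.tbl_338_field_34 o
          ON o.optionid = j.optionid
        WHERE j.id = tbl_338.id
        ORDER BY o.rank
        LIMIT 50
    ) x
) f34 ON true
ORDER BY tbl_338.id;
```

Two behavioural differences are worth checking before adopting this shape: a parent row with no children gets NULL here rather than an array holding one object of nulls, and duplicates are not removed unless a DISTINCT is added to the inner SELECT.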
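【Answer 1】: First of all, in my view PostgreSQL was not designed for queries this complex, and you should consider another RDBMS that copes with this level of complexity.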
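PostgreSQL limits exhaustive join optimization through a parameter called "geqo_threshold" (default value 12): queries with at least that many FROM items are planned by the genetic query optimizer (GEQO) instead.
In PG, the time needed to search for an optimal execution plan grows factorially with the number of JOINs, because of the algorithm the optimizer uses.
If you raise geqo_threshold, the time spent computing an optimized plan can grow so much that it exceeds the time it would take to run the query with a simpler plan.
If you leave geqo_threshold at its current value, the plan will probably be computed faster, but it is likely to be a worse plan.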
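So you face a dilemma:
Do you want a worse execution plan?
Or do you want a good execution plan that takes too long to compute?
The PG developers' own discussion of GEQO points to a dead end: https://www.postgresql.org/docs/7.1/geqo-pg-intro.html#GEQO-FUTURE
So, what should you do?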
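FIRST: try increasing geqo_threshold and run some tests. Do this with the data volume you realistically expect to have in 3 to 5 years, so that the result does not come back to hurt your project later.
SECOND: if the first step shows that the situation is unacceptable, move your database to an RDBMS that does not have this problem. In my opinion Microsoft SQL Server is currently the best choice for this (a better optimizer than Oracle's, at a lower cost), and SQL Server is available on Linux.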
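The first step can be tried per session without touching the server configuration. The sketch below uses illustrative values, not recommendations. join_collapse_limit and from_collapse_limit are not mentioned in the answer; they are included on the assumption that they matter too, since they also bound how many FROM items the planner reorders (both default to 8).

```sql
-- Session-level experiment; every value here is an example only.
SHOW geqo_threshold;            -- 12 unless it has been changed

SET geqo_threshold = 20;        -- GEQO now applies only to queries with 20 or more FROM items
SET join_collapse_limit = 20;   -- let the planner reorder larger explicit JOIN lists
SET from_collapse_limit = 20;   -- let the planner merge larger subquery FROM lists

-- Replace this stand-in statement with the real query, then compare
-- planning time, execution time and temp file usage in the output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM schema_131.tbl_338;

-- Restore the defaults for the rest of the session.
RESET geqo_threshold;
RESET join_collapse_limit;
RESET from_collapse_limit;
```

EXPLAIN (ANALYZE, ...) actually executes the statement, so the temp_file_limit error from the question can still occur while testing the limit-50 variant.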
For my take on PostgreSQL's limitations and performance problems, see my articles:
http://mssqlserver.fr/postgresql-vs-sql-server-mssql-part-3-very-extremely-detailed-comparison/
http://mssqlserver.fr/postgresql-vs-microsoft-sql-server-comparison-part-2-count-performances/
http://mssqlserver.fr/postgresql-vs-microsoft-part-1-dba-queries-performances/