Strange query performance result: different number of expressions in the 'IN clause' in Greenplum 5.0
【Posted】: 2019-10-31 06:45:17
【Question】: I noticed a strange result when using an 'IN clause' in Greenplum 5.0.
When the 'IN clause' contains more than 25 expressions, the query is noticeably faster than when it contains 25 or fewer. Why would that happen?
I ran EXPLAIN on the queries with both the new and the legacy optimizer, and the output was the same. Here are the queries and their EXPLAIN results.
Query 1: 26 expressions
sql:
select * from table1
where column1 in ('1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26')
Query time: 0.8s ~ 0.9s
Explain:
Gather Motion 8:1 (slice1; segments: 8) (cost=0.00..481.59 rows=2021 width=1069)
-> Table Scan on table1 (cost=0.00..475.60 rows=253 width=1069)
Filter: column1 = ANY ('1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25'::text[])
Settings: optimizer=on
Optimizer status: PQO version 2.42.0
Explain analyze:
Gather Motion 8:1 (slice1; segments: 8) (cost=0.00..481.53 rows=2003 width=1064)
Rows out: 0 rows at destination with 52 ms to end, start offset by 0.477 ms.
-> Table Scan on table1 (cost=0.00..475.63 rows=251 width=1064)
Filter: column1 = ANY ('1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26'::text[])
Rows out: 0 rows (seg0) with 51 ms to end, start offset by -358627 ms.
Slice statistics:
(slice0) Executor memory: 437K bytes.
(slice1) Executor memory: 259K bytes avg x 8 workers, 281K bytes max (seg7).
Statement statistics:
Memory used: 262144K bytes
Settings: optimizer=on
Optimizer status: PQO version 2.42.0
Total runtime: 53.107 ms
Query 2: 25 expressions
sql:
select * from table1
where column1 in ('1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25')
Query time: 1.2s ~ 1.5s
Explain:
Gather Motion 8:1 (slice1; segments: 8) (cost=0.00..481.59 rows=2021 width=1069)
-> Table Scan on table1 (cost=0.00..475.60 rows=253 width=1069)
Filter: column1 = ANY ('1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25'::text[])
Settings: optimizer=on
Optimizer status: PQO version 2.42.0
Explain analyze:
Gather Motion 8:1 (slice1; segments: 8) (cost=0.00..481.53 rows=2003 width=1064)
Rows out: 0 rows at destination with 60 ms to end, start offset by 0.517 ms.
-> Table Scan on table1 (cost=0.00..475.63 rows=251 width=1064)
Filter: column1 = ANY ('1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25'::text[])
Rows out: 0 rows (seg0) with 59 ms to end, start offset by -155783 ms.
Slice statistics:
(slice0) Executor memory: 437K bytes.
(slice1) Executor memory: 191K bytes avg x 8 workers, 191K bytes max (seg0).
Statement statistics:
Memory used: 262144K bytes
Settings: optimizer=on
Optimizer status: PQO version 2.42.0
Total runtime: 60.584 ms
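Greenplum runs on 3 VMs: 1 master and 2 segment hosts, each segment host with 4 data directories.
table1 has 500,000 rows and 50 columns. The primary key and distribution key is a different column, a UUID. column1 is neither the distribution key nor the primary key; it is just one of the natural keys.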
【Comments】:
【Answer 1】: You can run EXPLAIN ANALYZE to see exactly where the plan spends its time. Share the output here.
【Comments】:
The runtime is about 60 ms. I don't think there is any difference at the database level.
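The EXPLAIN ANALYZE totals above (about 53 ms and 60 ms) are far smaller than the reported query times (0.8s to 1.5s), so the difference may come from somewhere other than plan execution; the thread does not say where. One way to check whether the gap between the two queries is stable is to time each query repeatedly from the client and compare medians rather than single runs. The following is a minimal Python sketch under stated assumptions: `run_query` is a hypothetical callable supplied by the caller (for example, a thin wrapper around a database cursor), and the helper names are illustrative, not part of Greenplum or any driver.

```python
import statistics
import time
from typing import Callable, Dict, List


def build_in_query(n: int, table: str = "table1", column: str = "column1") -> str:
    """Build the question's query with the values '1'..'n' in the IN list."""
    values = ",".join(f"'{i}'" for i in range(1, n + 1))
    return f"select * from {table} where {column} in ({values})"


def time_query(run_query: Callable[[str], object], sql: str, repeats: int = 20) -> List[float]:
    """Return the wall-clock seconds of each of `repeats` runs of `sql`.

    `run_query` is a hypothetical, caller-supplied function that executes
    the SQL and fetches the result; it is an assumption of this sketch.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(sql)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Median, min and max: less sensitive to one slow run than a single timing."""
    return {
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in for a real database call, so the sketch runs without a cluster.
    def fake_run_query(sql: str) -> None:
        time.sleep(0.001)

    for n in (25, 26):
        stats = summarize(time_query(fake_run_query, build_in_query(n), repeats=5))
        print(n, stats)
```

With a real connection, alternating the 25-value and 26-value queries across runs (instead of running one batch after the other) would help keep caching effects from favoring whichever query runs second.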