PostgreSQL date function scans every row in every partition and performs very slowly
【Posted】: 2018-06-01 05:46:14
【Question】:
I have a large partitioned table, tbl_VehicleEntry.
I created a function F_GetSysDate() (to keep my functions compatible with Oracle).
CREATE TABLE tbl_vehicleentry (
vehicleentry_code numeric(12,0) NOT NULL,
shift_date timestamp without time zone NOT NULL,
shift_code numeric(1,0) NOT NULL,
booth_code numeric(2,0) NOT NULL,
.
.
N number of columns);
with partitions like this...
CREATE TABLE tbl_vehicleentry_2016 (
CONSTRAINT tbl_vehicleentry_2016_shift_date_check CHECK (((shift_date >= '2016-01-01'::date) AND (shift_date < '2017-01-01'::date)))
) INHERITS (tbl_vehicleentry);
ALTER TABLE tbl_vehicleentry_2016 OWNER TO tms;
CREATE TABLE tbl_vehicleentry_201701 (
CONSTRAINT tbl_vehicleentry_201701_shift_date_check CHECK (((shift_date >= '2017-01-01'::date) AND (shift_date < '2017-02-01'::date)))
)
INHERITS (tbl_vehicleentry);
ALTER TABLE tbl_vehicleentry_201701 OWNER TO tms;
CREATE TABLE tbl_vehicleentry_201702 (
CONSTRAINT tbl_vehicleentry_201702_shift_date_check CHECK (((shift_date >= '2017-02-01'::date) AND (shift_date < '2017-03-01'::date)))
)
INHERITS (tbl_vehicleentry);
ALTER TABLE tbl_vehicleentry_201702 OWNER TO tms;
CREATE TABLE tbl_vehicleentry_201703 (
CONSTRAINT tbl_vehicleentry_201703_shift_date_check CHECK (((shift_date >= '2017-03-01'::date) AND (shift_date < '2017-04-01'::date)))
)
INHERITS (tbl_vehicleentry);
ALTER TABLE tbl_vehicleentry_201703 OWNER TO tms;
.....and so on, with monthly partitions from 2017 onwards
-- FUNCTION: public.f_getsysdate()
-- DROP FUNCTION public.f_getsysdate();
CREATE OR REPLACE FUNCTION public.f_getsysdate(
)
RETURNS timestamp without time zone
LANGUAGE 'plpgsql'
COST 100
STABLE SECURITY DEFINER
AS $BODY$
DECLARE
V_ReturnName VARCHAR2 ;
BEGIN
RETURN current_timestamp::timestamp(0);
END
$BODY$;
ALTER FUNCTION public.f_getsysdate()
OWNER TO tms;
Now, when I run a query like...
Explain analyze
SELECT MAX(Vehicleentry_Code) FROM tbl_VehicleEntry
WHERE Shift_Date >= f_getsysdate() - 30
or
Explain analyze
SELECT MAX(Vehicleentry_Code) FROM tbl_VehicleEntry
WHERE Shift_Date >= f_getsysdate() - interval '30' day
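I observed that it scans every row of every partition of the table, which makes it very slow. Below is the EXPLAIN output:
Aggregate (cost=324.08..324.09 rows=1 width=32)
-> Append (cost=0.68..323.88 rows=79 width=16)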
-> Index Scan using isd_tbl_vehicleentry on tbl_vehicleentry (cost=0.68..4.70 rows=1 width=8)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_2015 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_2015 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_2016 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_2016 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201701 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201701 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201702 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201702 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201703 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201703 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201704 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201704 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201705 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201705 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201706 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201706 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201707 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201707 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201708 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201708 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201709 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201709 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201710 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201710 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201711 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201711 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201712 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201712 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201801 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201801 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201802 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201802 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201803 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201803 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201804 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201804 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201805 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201805 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201806 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201806 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201807 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201807 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201808 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201808 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201809 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201809 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201810 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201810 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201811 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201811 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Heap Scan on tbl_vehicleentry_201812 (cost=4.41..12.28 rows=3 width=16)
Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
-> Bitmap Index Scan on isd_tbl_vehicleentry_201812 (cost=0.00..4.41 rows=3 width=0)
Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))
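**As you can see, it scans every row of every partition of my PostgreSQL table, which makes it very slow. What is the problem?
The problem is surely in the function.
Is there any other way to make it faster? Please help.**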
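【Comments on the question】:
If you want any response, please post the EXPLAIN output as is, not completely mangled.
I have posted the EXPLAIN ANALYZE output. Please see the bottom of the question.
The important part of my comment was "not completely mangled". What you posted is unreadable.
Please check now, I have re-edited it.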
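【Answer 1】:
That is because the value of the function is not known when the query is planned, so the optimizer cannot tell whether any partitions can be excluded.
You should query f_getsysdate() first, then construct an SQL statement based on the result and execute it. That way the limit is a constant, and the PostgreSQL optimizer can make use of it.
PostgreSQL could do better if the function were IMMUTABLE, but judging by its name I don't think that is an option.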
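The two-step approach described in this answer (evaluate the function first, then run a statement that contains the result as a constant) can also be kept on the server side with dynamic SQL. The following is only a sketch under assumptions: the wrapper name f_max_recent_vehicleentry and its parameter p_days are hypothetical and not part of the original post; the table, the columns, and f_getsysdate() are taken from the question.

```sql
-- Hypothetical wrapper (name and parameter are illustrative, not from the post).
CREATE OR REPLACE FUNCTION f_max_recent_vehicleentry(p_days integer DEFAULT 30)
RETURNS numeric
LANGUAGE plpgsql
AS $$
DECLARE
    v_cutoff timestamp without time zone;
    v_result numeric;
BEGIN
    -- Step 1: evaluate the function once and keep the result in a variable.
    v_cutoff := f_getsysdate() - make_interval(days => p_days);

    -- Step 2: embed the value as a quoted literal (%L) so that the planner
    -- sees a constant when it plans this statement.
    EXECUTE format(
        'SELECT max(vehicleentry_code)
           FROM tbl_vehicleentry
          WHERE shift_date >= %L::timestamp',
        v_cutoff)
    INTO v_result;

    RETURN v_result;
END
$$;

-- Example call: newest entry code within the last 30 days.
SELECT f_max_recent_vehicleentry(30);
```

Because EXECUTE plans the statement each time it runs, the planner compares a literal cutoff with the CHECK constraints of the child tables, which is what constraint exclusion needs. This assumes constraint_exclusion is left at its default of partition, or set to on.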
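【Comments】:
I did that. In my front end, I put the result of f_getsysdate() into a variable and passed it to the query, and it works fine. But the same query runs very fast on Oracle without converting the result into a variable as described above. How does Oracle handle it, and why does Postgres scan every row of every partition?
I cannot answer any questions about Oracle, but unless the function is IMMUTABLE, PostgreSQL cannot know the result before it actually executes the function (because the result could be different at execution time than at planning time). It has to plan the query before executing it. Adding a feature that eliminates partitions at execution time might be interesting, but nobody has written it yet.
So your question is answered.
Yes, as I explained. So don't use that function.
Note that if you use PL/pgSQL functions or prepared statements, PostgreSQL might now cache the result of f_getsysdate() for the lifetime of the database session.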
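Regarding the earlier comment that nobody had yet written partition elimination at execution time: as far as I know, PostgreSQL 11 and later can do this for tables that use declarative partitioning (PARTITION BY), though not for inheritance with CHECK constraints as used in the question. A minimal sketch, assuming a hypothetical table tbl_vehicleentry_p that carries only two of the question's columns and two example partitions, and assuming f_getsysdate() from the question exists and is STABLE:

```sql
-- Hypothetical declaratively partitioned variant of the question's table.
CREATE TABLE tbl_vehicleentry_p (
    vehicleentry_code numeric(12,0) NOT NULL,
    shift_date        timestamp without time zone NOT NULL
) PARTITION BY RANGE (shift_date);

-- Two example monthly partitions; the upper bound is exclusive.
CREATE TABLE tbl_vehicleentry_p_201805 PARTITION OF tbl_vehicleentry_p
    FOR VALUES FROM ('2018-05-01') TO ('2018-06-01');
CREATE TABLE tbl_vehicleentry_p_201806 PARTITION OF tbl_vehicleentry_p
    FOR VALUES FROM ('2018-06-01') TO ('2018-07-01');

-- The comparison value comes from a STABLE function, so it is not known
-- at plan time; pruning would have to happen when execution starts.
EXPLAIN (COSTS OFF)
SELECT max(vehicleentry_code)
  FROM tbl_vehicleentry_p
 WHERE shift_date >= f_getsysdate() - interval '30 days';
```

If pruning applies, the Append node in the plan should report removed subplans (a line such as "Subplans Removed: N") instead of listing every partition. It is worth verifying this on your own version before relying on it.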