PostgreSQL date function scans every row in every partition and performs very slowly

【Title】: Postgresql date function scanning every row in every partition and is very slow performing 【Posted】: 2018-06-01 05:46:14 【Question】:

I have a large partitioned table, tbl_VehicleEntry. I created a function F_GetSysDate() (to keep my functions compatible with Oracle).

CREATE TABLE tbl_vehicleentry (

vehicleentry_code numeric(12,0) NOT NULL,
shift_date timestamp without time zone NOT NULL,
shift_code numeric(1,0) NOT NULL,
booth_code numeric(2,0) NOT NULL,
.
.
N number of columns);

with partitions like this...

CREATE TABLE tbl_vehicleentry_2016 (
    CONSTRAINT tbl_vehicleentry_2016_shift_date_check CHECK (((shift_date >= '2016-01-01'::date) AND (shift_date < '2017-01-01'::date)))
) INHERITS (tbl_vehicleentry);

ALTER TABLE tbl_vehicleentry_2016 OWNER TO tms;

CREATE TABLE tbl_vehicleentry_201701 (
    CONSTRAINT tbl_vehicleentry_201701_shift_date_check CHECK (((shift_date >= '2017-01-01'::date) AND (shift_date < '2017-02-01'::date)))
) INHERITS (tbl_vehicleentry);

ALTER TABLE tbl_vehicleentry_201701 OWNER TO tms;

CREATE TABLE tbl_vehicleentry_201702 (
    CONSTRAINT tbl_vehicleentry_201702_shift_date_check CHECK (((shift_date >= '2017-02-01'::date) AND (shift_date < '2017-03-01'::date)))
) INHERITS (tbl_vehicleentry);

ALTER TABLE tbl_vehicleentry_201702 OWNER TO tms;

CREATE TABLE tbl_vehicleentry_201703 (
    CONSTRAINT tbl_vehicleentry_201703_shift_date_check CHECK (((shift_date >= '2017-03-01'::date) AND (shift_date < '2017-04-01'::date)))
) INHERITS (tbl_vehicleentry);

ALTER TABLE tbl_vehicleentry_201703 OWNER TO tms;

..... and so on, with monthly partitions from 2017 onward

-- FUNCTION: public.f_getsysdate()
-- DROP FUNCTION public.f_getsysdate();

CREATE OR REPLACE FUNCTION public.f_getsysdate(
    )
    RETURNS timestamp without time zone
    LANGUAGE 'plpgsql'

    COST 100
    STABLE SECURITY DEFINER 
AS $BODY$

DECLARE
    V_ReturnName   VARCHAR2 ;
BEGIN  
    RETURN current_timestamp::timestamp(0);
END

$BODY$;

ALTER FUNCTION public.f_getsysdate()
    OWNER TO tms;

Now, when I run a query like...

Explain analyze
SELECT MAX(Vehicleentry_Code) FROM tbl_VehicleEntry
WHERE Shift_Date >= f_getsysdate() - 30 

or

Explain analyze
SELECT MAX(Vehicleentry_Code) FROM tbl_VehicleEntry
WHERE Shift_Date >= f_getsysdate() - interval '30' day

I observe that it scans every row of every partition of the table, which makes it very slow. The EXPLAIN output is below.

Aggregate  (cost=324.08..324.09 rows=1 width=32)
  ->  Append  (cost=0.68..323.88 rows=79 width=16)

    ->  Index Scan using isd_tbl_vehicleentry on tbl_vehicleentry  (cost=0.68..4.70 rows=1 width=8)
          Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))


    ->  Bitmap Heap Scan on tbl_vehicleentry_2015  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))


          ->  Bitmap Index Scan on isd_tbl_vehicleentry_2015  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_2016  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_2016  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201701  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201701  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201702  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201702  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201703  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201703  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201704  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201704  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201705  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201705  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201706  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201706  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201707  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201707  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201708  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201708  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201709  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201709  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201710  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201710  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201711  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201711  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201712  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201712  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201801  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201801  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201802  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201802  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201803  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201803  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201804  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201804  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201805  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201805  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201806  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201806  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201807  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201807  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201808  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201808  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201809  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201809  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201810  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201810  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201811  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201811  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

    ->  Bitmap Heap Scan on tbl_vehicleentry_201812  (cost=4.41..12.28 rows=3 width=16)
          Recheck Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

          ->  Bitmap Index Scan on isd_tbl_vehicleentry_201812  (cost=0.00..4.41 rows=3 width=0)
                Index Cond: (shift_date >= (f_getsysdate() - '30 days'::interval day))

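**As you can see, it scans every row of every partition of my PostgreSQL table, which makes the query very slow. What is wrong?

The problem is definitely with the function.

Is there any other way to make it faster? Please help.**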

【Comments】:

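If you want any replies, post the EXPLAIN output as is, not completely mangled.

I have posted the EXPLAIN ANALYZE output. Please see the bottom of the question.

The important part of my comment was "not completely mangled". What you posted is unreadable.

Please check now, I have re-edited it.

【Answer 1】: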

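That is because the value of the function is not known when the query is planned, so the optimizer cannot tell whether any partitions can be excluded.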
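To make this concrete, here is a minimal Python sketch of the exclusion logic. It is a simplified model and not PostgreSQL's actual planner code: the names `Partition` and `partitions_to_scan` are hypothetical, and the bounds simply mirror the CHECK constraints from the question. A value that is unknown at planning time is modelled as `None`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    name: str
    lower: datetime  # inclusive bound taken from the CHECK constraint
    upper: datetime  # exclusive bound taken from the CHECK constraint


def partitions_to_scan(parts: List[Partition], cutoff: Optional[datetime]) -> List[str]:
    """Model of pruning for the predicate shift_date >= cutoff.

    cutoff=None stands for a value that is not known at planning time,
    such as the result of a STABLE function.
    """
    if cutoff is None:
        # Nothing can be proven at planning time, so every partition stays.
        return [p.name for p in parts]
    # A partition can hold matching rows only if its upper bound lies after the cutoff.
    return [p.name for p in parts if p.upper > cutoff]


# Hypothetical sample mirroring the first partitions in the question.
PARTS = [
    Partition("tbl_vehicleentry_2016", datetime(2016, 1, 1), datetime(2017, 1, 1)),
    Partition("tbl_vehicleentry_201701", datetime(2017, 1, 1), datetime(2017, 2, 1)),
    Partition("tbl_vehicleentry_201702", datetime(2017, 2, 1), datetime(2017, 3, 1)),
    Partition("tbl_vehicleentry_201703", datetime(2017, 3, 1), datetime(2017, 4, 1)),
]

if __name__ == "__main__":
    print(partitions_to_scan(PARTS, None))                    # unknown value: all partitions
    print(partitions_to_scan(PARTS, datetime(2017, 2, 15)))   # constant: only 201702 and 201703
```

In this model the unknown cutoff keeps all four partitions in the plan, while the constant lets the two oldest ones be dropped.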

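You should query f_getsysdate() first, then construct an SQL statement from the result and execute that. That way the limit is a constant, which the PostgreSQL optimizer can use.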
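A minimal client-side sketch of this two-step approach, in Python. It assumes a DB-API 2.0 cursor (for example one obtained from psycopg2) that is already connected to the database and returns the timestamp as a `datetime`; the helper names `build_max_query` and `max_code_since` are hypothetical and not part of the question. The cutoff is written into the statement as a literal so that the planner sees a constant.

```python
from datetime import datetime, timedelta


def build_max_query(now: datetime, days: int = 30) -> str:
    """Return the MAX() query with the cutoff written as a timestamp literal."""
    cutoff = now - timedelta(days=days)
    # The value comes from the database itself and is formatted by strftime,
    # so no untrusted text is spliced into the statement.
    literal = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return (
        "SELECT MAX(Vehicleentry_Code) FROM tbl_VehicleEntry "
        f"WHERE Shift_Date >= TIMESTAMP '{literal}'"
    )


def max_code_since(cursor, days: int = 30):
    """Step 1: fetch the current time. Step 2: run the query with a constant."""
    cursor.execute("SELECT f_getsysdate()")
    (now,) = cursor.fetchone()
    cursor.execute(build_max_query(now, days))
    (max_code,) = cursor.fetchone()
    return max_code


if __name__ == "__main__":
    # Show the statement that would be sent for a sample "now" value.
    print(build_max_query(datetime(2018, 6, 1, 5, 46, 14)))
```

Running `EXPLAIN` on the generated statement should show whether the older partitions have dropped out of the plan. This relies on `constraint_exclusion` being left at its default of `partition` or set to `on`.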

PostgreSQL could do better if the function were IMMUTABLE, but judging by its name I don't think that is an option.

【Comments】:

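I did that. In my front end I put the result of f_getsysdate() into a variable and passed it to the query, and it works fine. But the same query runs very fast on Oracle without first putting the result into a variable as described above. How does Oracle handle it, and why does Postgres scan every row of every partition?

I can't answer any questions about Oracle, but unless the function is IMMUTABLE, PostgreSQL cannot know the result before it actually executes the function (because the result may differ between planning time and execution time). It has to plan the query before executing it. A feature that eliminates partitions at execution time might be interesting, but nobody has written it yet.

So your question has been answered.

Yes, as I explained. So don't use the function.

Note that if you use PL/pgSQL functions or prepared statements, PostgreSQL may now cache the result of f_getsysdate() for the lifetime of the database session.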
