Filtering set returning function results

[Title]: Filtering set returning function results [Posted]: 2017-12-07 22:42:31

[Question]:

I'm hoping to clarify my understanding of how set returning functions behave under the hood in PostgreSQL.

Let's say I have a set returning function called "a_at_date" which returns:

 SELECT * FROM a WHERE date = a_date

where a_date is the function parameter.

If I use it like this:

SELECT *
FROM a_at_date(a_date) 
WHERE other_field = 123

would it, for example, be able to use an index on [date, other_field] in the same way this query could:

SELECT *
FROM a
WHERE date = a_date AND other_field = 123

In other words, does a set returning function run in isolation from any outer query, and therefore limit the indexing options?

[Comments]:

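You can check the execution plan. Without knowing much about Postgres optimization, I would expect a set-returning function to be a "barrier" for optimization. That is, I would be surprised if outer predicates were passed into the function for optimization.

Be surprised by PostgreSQL :^)

[Answer 1]: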

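In principle, the optimizer has no idea what a function does: the function body is a string that is handled by the call handler of the function's procedural language.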

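One exception is functions written in LANGUAGE sql. If they are simple enough, and inlining them can be proven not to change the semantics of the SQL statement, the planner will inline them during query preprocessing.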
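As a contrast to the inlinable case, here is a minimal sketch of the same lookup written in PL/pgSQL, whose body stays opaque to the planner. The function name a_at_date_plpgsql and the table definition are assumptions made for illustration; the table mirrors the one in the question.

```sql
-- Assumed table layout, matching the question's table "a".
CREATE TABLE IF NOT EXISTS a (
   "date" date NOT NULL,
   other_field text NOT NULL
);

-- Hypothetical PL/pgSQL variant (the name is illustrative, not from the question).
-- Columns are qualified with "a." to avoid clashing with the output column names.
CREATE OR REPLACE FUNCTION a_at_date_plpgsql(a_date date)
   RETURNS TABLE ("date" date, other_field text)
   LANGUAGE plpgsql STABLE
AS $$
BEGIN
   RETURN QUERY
      SELECT a."date", a.other_field FROM a WHERE a."date" = a_date;
END;
$$;

-- The plan should contain a Function Scan node with the other_field
-- filter applied on top of it, because the body cannot be inlined.
EXPLAIN (COSTS off)
SELECT *
FROM a_at_date_plpgsql(current_date)
WHERE other_field = 'value';
```

In that situation the outer predicate can only filter rows after the function has produced them, so an index on other_field is not available to the outer query.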

See the following comment in backend/optimizer/prep/prepjointree.c:

/*
 * inline_set_returning_functions
 *              Attempt to "inline" set-returning functions in the FROM clause.
 *
 * If an RTE_FUNCTION rtable entry invokes a set-returning function that
 * contains just a simple SELECT, we can convert the rtable entry to an
 * RTE_SUBQUERY entry exposing the SELECT directly.  This is especially
 * useful if the subquery can then be "pulled up" for further optimization,
 * but we do it even if not, to reduce executor overhead.
 *
 * This has to be done before we have started to do any optimization of
 * subqueries, else any such steps wouldn't get applied to subqueries
 * obtained via inlining.  However, we do it after pull_up_sublinks
 * so that we can inline any functions used in SubLink subselects.
 *
 * Like most of the planner, this feels free to scribble on its input data
 * structure.
 */

There are also two instructive comments in inline_set_returning_function in backend/optimizer/util/clauses.c:

/*
 * Forget it if the function is not SQL-language or has other showstopper
 * properties.  In particular it mustn't be declared STRICT, since we
 * couldn't enforce that.  It also mustn't be VOLATILE, because that is
 * supposed to cause it to be executed with its own snapshot, rather than
 * sharing the snapshot of the calling query.  (Rechecking proretset is
 * just paranoia.)
 */

/*
 * Make sure the function (still) returns what it's declared to.  This
 * will raise an error if wrong, but that's okay since the function would
 * fail at runtime anyway.  Note that check_sql_fn_retval will also insert
 * RelabelType(s) and/or NULL columns if needed to make the tlist
 * expression(s) match the declared type of the function.
 *
 * If the function returns a composite type, don't inline unless the check
 * shows it's returning a whole tuple result; otherwise what it's
 * returning is a single composite column which is not what we need. (Like
 * check_sql_fn_retval, we deliberately exclude domains over composite
 * here.)
 */
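The first comment names STRICT and VOLATILE as showstoppers. A small sketch of how that could be observed, assuming the table from the question; the function name a_at_date_strict is hypothetical and differs from the inlinable version only in being declared STRICT:

```sql
-- Assumed table layout, matching the question's table "a".
CREATE TABLE IF NOT EXISTS a (
   "date" date NOT NULL,
   other_field text NOT NULL
);

-- Hypothetical variant: same body, but declared STRICT,
-- which the source comment lists as a reason not to inline.
CREATE OR REPLACE FUNCTION a_at_date_strict(date)
   RETURNS TABLE ("date" date, other_field text)
   LANGUAGE sql STABLE STRICT
   AS 'SELECT "date", other_field FROM a WHERE "date" = $1';

-- The plan is expected to show a Function Scan on a_at_date_strict
-- rather than a scan on table a.
EXPLAIN (COSTS off)
SELECT *
FROM a_at_date_strict(current_date)
WHERE other_field = 'value';
```

Declaring the function VOLATILE instead of STABLE should have the same effect, per the quoted comment.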

Use EXPLAIN to see if your function is inlined.

An example:

CREATE TABLE a (
   "date" date NOT NULL,
   other_field text NOT NULL
);

CREATE OR REPLACE FUNCTION a_at_date(date)
   RETURNS TABLE ("date" date, other_field text)
   LANGUAGE sql STABLE CALLED ON NULL INPUT
   AS 'SELECT "date", other_field FROM a WHERE "date" = $1';

EXPLAIN (VERBOSE, COSTS off)
SELECT *
FROM a_at_date(current_date)
WHERE other_field = 'value';

                               QUERY PLAN                                
-------------------------------------------------------------------------
 Seq Scan on laurenz.a
   Output: a.date, a.other_field
   Filter: ((a.other_field = 'value'::text) AND (a.date = CURRENT_DATE))
(3 rows)
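Because both conditions end up in a single Filter on table a, the planner can consider an index covering them. A rough sketch of how to check this, assuming the table and function defined above; the index name a_date_other_field_idx is made up, and enable_seqscan is switched off only because the planner would otherwise usually prefer a sequential scan on a tiny or empty table:

```sql
-- Hypothetical index on the two columns from the question.
CREATE INDEX IF NOT EXISTS a_date_other_field_idx
   ON a ("date", other_field);

-- Discourage sequential scans for this session, for demonstration only.
SET enable_seqscan = off;

-- With the function inlined, the plan should use some form of index scan
-- on a_date_other_field_idx with both columns in the index condition.
EXPLAIN (COSTS off)
SELECT *
FROM a_at_date(current_date)
WHERE other_field = 'value';

-- Restore the default planner setting.
RESET enable_seqscan;
```

On a realistically sized table with fresh statistics, the enable_seqscan toggle should not be needed.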
