Filtering set-returning function results
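[Title]: Filtering set returning function results
[Posted]: 2017-12-07 22:42:31
[Question]: I'm hoping to clarify my understanding of how set-returning functions behave behind the scenes in PostgreSQL.
Let's say I have a set-returning function called "a_at_date" that returns: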
SELECT * FROM a WHERE date = a_date
where a_date is the function parameter.
If I use it like this:
SELECT *
FROM a_at_date(a_date)
WHERE other_field = 123
then would this be able to use an index on [date, other_field], the way the following query could:
SELECT *
FROM a
WHERE date = a_date AND other_field = 123
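In other words, does a set-returning function run independently of any outer query, and does that limit the indexing options?
[Comments]:
You can look at the execution plan. I don't know much about Postgres optimization, but I would expect a set-returning function to be a "barrier" for optimization. That is, I would be surprised if outer predicates were passed into the function for optimization.
Be surprised by PostgreSQL :^)
[Answer 1]: In principle, the optimizer has no idea what a function does. The function body is a string that is handled by the call handler of the function's procedural language.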
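One exception is functions written in LANGUAGE sql. If they are simple enough, and it can be proven that inlining them does not change the semantics of the SQL statement, the planner will inline them.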
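For contrast, here is a minimal sketch of the case where the function body stays opaque. It assumes the table a("date" date, other_field text) from the example further down already exists; the function name a_at_date_plpgsql is hypothetical and only serves the comparison.

```sql
-- Hypothetical PL/pgSQL twin of a_at_date, for comparison only.
-- Assumes table a("date" date, other_field text) already exists.
CREATE OR REPLACE FUNCTION a_at_date_plpgsql(a_date date)
   RETURNS TABLE ("date" date, other_field text)
   LANGUAGE plpgsql STABLE
AS $$
BEGIN
   -- Columns are qualified with the table name because the
   -- RETURNS TABLE columns are also variables inside the function.
   RETURN QUERY
      SELECT a."date", a.other_field FROM a WHERE a."date" = a_date;
END;
$$;

-- The outer WHERE condition cannot be pushed into the function body.
EXPLAIN (COSTS off)
SELECT *
FROM a_at_date_plpgsql(current_date)
WHERE other_field = 'value';
```

Because the body is not LANGUAGE sql, the plan should show a Function Scan node with the other_field condition applied as a filter on the function's output, rather than a scan on a itself.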
See the following comment in backend/optimizer/prep/prepjointree.c:
/*
* inline_set_returning_functions
* Attempt to "inline" set-returning functions in the FROM clause.
*
* If an RTE_FUNCTION rtable entry invokes a set-returning function that
* contains just a simple SELECT, we can convert the rtable entry to an
* RTE_SUBQUERY entry exposing the SELECT directly. This is especially
* useful if the subquery can then be "pulled up" for further optimization,
* but we do it even if not, to reduce executor overhead.
*
* This has to be done before we have started to do any optimization of
* subqueries, else any such steps wouldn't get applied to subqueries
* obtained via inlining. However, we do it after pull_up_sublinks
* so that we can inline any functions used in SubLink subselects.
*
* Like most of the planner, this feels free to scribble on its input data
* structure.
*/
inline_set_returning_function in backend/optimizer/util/clauses.c also has two instructive comments:
/*
* Forget it if the function is not SQL-language or has other showstopper
* properties. In particular it mustn't be declared STRICT, since we
* couldn't enforce that. It also mustn't be VOLATILE, because that is
* supposed to cause it to be executed with its own snapshot, rather than
* sharing the snapshot of the calling query. (Rechecking proretset is
* just paranoia.)
*/
and
/*
* Make sure the function (still) returns what it's declared to. This
* will raise an error if wrong, but that's okay since the function would
* fail at runtime anyway. Note that check_sql_fn_retval will also insert
* RelabelType(s) and/or NULL columns if needed to make the tlist
* expression(s) match the declared type of the function.
*
* If the function returns a composite type, don't inline unless the check
* shows it's returning a whole tuple result; otherwise what it's
* returning is a single composite column which is not what we need. (Like
* check_sql_fn_retval, we deliberately exclude domains over composite
* here.)
*/
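The properties named in these comments can be read from the system catalogs. The query below is a sketch that assumes the function is called a_at_date, as in the question; it only reports the properties and does not by itself prove that inlining will happen.

```sql
-- Inspect the properties that the comments above call out.
-- lanname should be 'sql', proretset true, proisstrict false,
-- and provolatile should not be 'v' (i = immutable, s = stable, v = volatile).
SELECT p.proname,
       l.lanname,
       p.proretset,
       p.proisstrict,
       p.provolatile
FROM pg_catalog.pg_proc AS p
   JOIN pg_catalog.pg_language AS l ON l.oid = p.prolang
WHERE p.proname = 'a_at_date';
```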
Use EXPLAIN to see whether your function is inlined.
An example:
CREATE TABLE a (
"date" date NOT NULL,
other_field text NOT NULL
);
CREATE OR REPLACE FUNCTION a_at_date(date)
RETURNS TABLE ("date" date, other_field text)
LANGUAGE sql STABLE CALLED ON NULL INPUT
AS 'SELECT "date", other_field FROM a WHERE "date" = $1';
EXPLAIN (VERBOSE, COSTS off)
SELECT *
FROM a_at_date(current_date)
WHERE other_field = 'value';
QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on laurenz.a
Output: a.date, a.other_field
Filter: ((a.other_field = 'value'::text) AND (a.date = CURRENT_DATE))
(3 rows)
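To tie this back to the index in the question, the following sketch adds an index on ("date", other_field) and repeats the EXPLAIN. The index name is hypothetical. On an empty or tiny table the planner may still prefer a sequential scan, so the sketch disables enable_seqscan for the session purely as a demonstration aid.

```sql
-- Hypothetical index name; the column list matches the question.
CREATE INDEX a_date_other_field_idx ON a ("date", other_field);

-- For demonstration only: discourage sequential scans in this session
-- so the plan shows whether the index is usable at all.
SET enable_seqscan = off;

EXPLAIN (COSTS off)
SELECT *
FROM a_at_date(current_date)
WHERE other_field = 'value';

-- Restore the default planner setting.
RESET enable_seqscan;
```

If the function is inlined as in the plan above, both conditions are applied to a directly, so the planner is free to use them as index conditions.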