Which query is more efficient? Inner join vs. subquery? SUM(CASE) vs. HAVING
【Posted】:2015-04-05 09:20:07 【Question】:I am using PostgreSQL 9 and have a simple question. Which of these queries is more efficient?
SELECT users_sessions.session_id, users_sessions.series
FROM users_sessions
WHERE users_sessions.user_id = 8
AND users_sessions.session_id IN (
SELECT session_id
FROM sessions_history
GROUP BY sessions_history.session_id
HAVING SUM(CASE WHEN sessions_history.action = 2 THEN 1 END) = 0
)
VS.
SELECT US8.session_id,Us8.series
FROM
( SELECT us.session_id as S_ID, us.series
FROM users_sessions as US
WHERE US.user_id = 8 ) AS US8
INNER JOIN
(SELECT SH.session_id as SH_ID
FROM session_history as SH
WHERE SH.action <> 2) AS SH2
ON US8.session_id = SH2.session_id
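【Comments】:
Look at the output of explain (analyze, verbose).
Well, have you tried it? The second one doesn't even seem to be valid. Test them, write down all the results, and then ask a good question.
Your Postgres version? "9" is not a valid version. "9.2" or "9.3" is...
I don't think the second query is any good.
@KamilJ - Purely commenting on Erwin's point; the correct syntax is HAVING SUM(1) = 0 (repeating the logic from the SELECT clause in the HAVING clause) or, if you don't want to repeat yourself; SELECT * FROM (<your query, without a HAVING clause>) AS aggregate WHERE s = 0.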
【Answer 1】:
Another solution uses NOT EXISTS, which is a direct translation of what you seem to be asking:
Give me the sessions for which no action 2 exists:
SELECT session_id, series
FROM users_sessions as us
WHERE us.user_id = 8
AND NOT EXISTS
( SELECT *
FROM sessions_history as sh
WHERE action = 2
AND us.session_id = sh.session_id
)
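To compare the NOT EXISTS form with the HAVING form from the question, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table contents are made up for illustration. It also shows a detail of the question's query: SUM over a CASE without ELSE yields NULL (not 0) for a group with no matching rows, so the condition = 0 is not met by any group in this data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_sessions (session_id INTEGER, user_id INTEGER, series TEXT);
    CREATE TABLE sessions_history (session_id INTEGER, action INTEGER);
    -- Hypothetical data: session 1 contains action 2, session 2 does not.
    INSERT INTO users_sessions VALUES (1, 8, 'a'), (2, 8, 'b');
    INSERT INTO sessions_history VALUES (1, 1), (1, 2), (2, 1);
""")

# The question's first query: IN + GROUP BY + HAVING SUM(CASE ...) = 0.
having_rows = conn.execute("""
    SELECT us.session_id, us.series
    FROM users_sessions AS us
    WHERE us.user_id = 8
      AND us.session_id IN (
          SELECT session_id
          FROM sessions_history
          GROUP BY session_id
          HAVING SUM(CASE WHEN action = 2 THEN 1 END) = 0
      )
""").fetchall()

# The NOT EXISTS form from this answer.
not_exists_rows = conn.execute("""
    SELECT us.session_id, us.series
    FROM users_sessions AS us
    WHERE us.user_id = 8
      AND NOT EXISTS (
          SELECT 1
          FROM sessions_history AS sh
          WHERE sh.action = 2
            AND sh.session_id = us.session_id
      )
""").fetchall()

print(having_rows)      # []  (the SUM is 1 for session 1 and NULL for session 2)
print(not_exists_rows)  # [(2, 'b')]
```

If the HAVING form is kept, adding ELSE 0 to the CASE expression makes the sum 0 instead of NULL for sessions without action 2.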
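【Comments】:
【Answer 2】:Your second query is syntactically invalid. You cannot reference column aliases from the SELECT list ("output columns") in the HAVING clause.
Either way, neither query is good. If you actually want to find (user_id, action) combinations that do not exist, try: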
SELECT t.*, 0 AS s
FROM (SELECT 8 AS user_id, 2 AS action) t
LEFT JOIN (
users_sessions us
JOIN session_history sh USING (session_id)
) USING (user_id, action)
WHERE sh.session_id IS NULL;
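I introduced a single-row derived table in the subquery t and LEFT JOIN it to the combination of the two base tables. A row is returned only if the combination does not exist.
This assumes the column names user_id and action each appear only once across the two tables. Otherwise, use a more explicit condition for the second join:
ON t.user_id = us.user_id AND t.action = sh.action
Details:
Select rows which are not present in other table
An alternative that may be faster, though: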
SELECT t.*, 0 AS s
FROM (SELECT 8 AS user_id, 2 AS action) t
LEFT JOIN users_sessions us USING (user_id)
LEFT JOIN session_history sh USING (action, session_id)
WHERE sh.session_id IS NULL;
USING is safe either way in this example.
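For comparison, the chained LEFT JOIN variant can also be written with explicit ON conditions instead of USING. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, the sample rows are hypothetical, and us.session_id is added to the SELECT list (it is not in the answer's query) purely to show which session produced the row.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_sessions (session_id INTEGER, user_id INTEGER, series TEXT);
    CREATE TABLE session_history (session_id INTEGER, action INTEGER);
    -- Hypothetical data: session 1 contains action 2, session 2 does not.
    INSERT INTO users_sessions VALUES (1, 8, 'a'), (2, 8, 'b');
    INSERT INTO session_history VALUES (1, 1), (1, 2), (2, 1);
""")

# Same shape as the chained LEFT JOIN query, with ON instead of USING.
rows = conn.execute("""
    SELECT t.user_id, t.action, us.session_id, 0 AS s
    FROM (SELECT 8 AS user_id, 2 AS action) AS t
    LEFT JOIN users_sessions AS us
           ON us.user_id = t.user_id
    LEFT JOIN session_history AS sh
           ON sh.session_id = us.session_id
          AND sh.action = t.action
    WHERE sh.session_id IS NULL
""").fetchall()

print(rows)  # [(8, 2, 2, 0)]
```

With this data, the anti-join keeps the row for session 2, the only session of user 8 that has no history row with action 2.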
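【Comments】:
Are you really a fan of USING? Doesn't it get very untidy when your queries also include range lookups, etc.?
@MatBailie: I have a soft spot for concise code. For persisted queries, the explicit form is safer, because it is less vulnerable to later schema changes in the underlying tables. The first USING clause in the example is safe, though.
【Answer 3】:
My tuppence worth;
I can read your intent in two different ways, with a different solution for each (my preferred solution comes first in each case);
Sessions that have at least one history record where action <> 2 (regardless of how many history records have action = 2)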
SELECT
us.session_id,
us.series
FROM
users_sessions AS us
INNER JOIN
session_history AS sh
ON sh.session_id = us.session_id
AND sh.action <> 2
WHERE
us.user_id = 8
GROUP BY
us.session_id,
us.series
Or
SELECT
us.session_id,
us.series
FROM
users_sessions AS us
WHERE
us.user_id = 8
AND EXISTS (SELECT *
FROM session_history AS sh
WHERE sh.session_id = us.session_id
AND sh.action <> 2
)
Or
SELECT
us.session_id,
us.series
FROM
users_sessions AS us
INNER JOIN
(
SELECT
session_id
FROM
session_history
WHERE
action <> 2
GROUP BY
session_id
)
AS sh
ON sh.session_id = us.session_id
WHERE
us.user_id = 8
Sessions that have no history record where action = 2 (but that may have other history records)
SELECT
us.session_id,
us.series
FROM
users_sessions AS us
LEFT JOIN
session_history AS sh
ON sh.session_id = us.session_id
AND sh.action = 2
WHERE
us.user_id = 8
AND sh.session_id IS NULL
-- No GROUP BY needed this time
Or
SELECT
us.session_id,
us.series
FROM
users_sessions AS us
WHERE
us.user_id = 8
AND NOT EXISTS (SELECT *
FROM session_history AS sh
WHERE sh.session_id = us.session_id
AND sh.action = 2
)
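The two readings above return different sessions. Here is a small sketch of the difference, run through Python's built-in sqlite3 module as a stand-in for PostgreSQL. The four sample sessions are hypothetical and were chosen so that each case is covered.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_sessions (session_id INTEGER, user_id INTEGER, series TEXT);
    CREATE TABLE session_history (session_id INTEGER, action INTEGER);
    -- Hypothetical data for user 8:
    --   session 1: actions 1 and 2    session 2: action 1 only
    --   session 3: action 2 only      session 4: no history at all
    INSERT INTO users_sessions VALUES (1, 8, 'a'), (2, 8, 'b'), (3, 8, 'c'), (4, 8, 'd');
    INSERT INTO session_history VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")

# Template covering both the EXISTS and the NOT EXISTS variant.
QUERY = """
    SELECT us.session_id
    FROM users_sessions AS us
    WHERE us.user_id = 8
      AND {predicate} (SELECT 1
                       FROM session_history AS sh
                       WHERE sh.session_id = us.session_id
                         AND sh.action {op} 2)
    ORDER BY us.session_id
"""

# Reading 1: at least one history record with action <> 2.
has_other_action = [r[0] for r in conn.execute(QUERY.format(predicate="EXISTS", op="<>"))]
# Reading 2: no history record with action = 2.
lacks_action_2 = [r[0] for r in conn.execute(QUERY.format(predicate="NOT EXISTS", op="="))]

print(has_other_action)  # [1, 2]
print(lacks_action_2)    # [2, 4]
```

Session 1 matches only the first reading, session 4 (no history) matches only the second, and session 3 matches neither, so the choice of query depends on which requirement is actually meant.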
【Comments】:
【Answer 4】:What do you think about this?
SELECT US8.S_ID, US8.series
FROM
( SELECT us.session_id as S_ID, us.series
FROM users_sessions as US
WHERE US.user_id = 8 ) AS US8
INNER JOIN
(SELECT SH.session_id as SH_ID
FROM session_history as SH
WHERE SH.action <> 2) AS SH2
ON US8.S_ID = SH2.SH_ID
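【Comments】:
Is it possible for a session_id to have zero corresponding rows in session_history? If so, you need a LEFT OUTER JOIN or a NOT EXISTS type of lookup, as in the other answers here.
You first need to define exactly (in the question) what the query should do. There are a lot of good points here.
Is it possible for a session_id to have multiple corresponding rows in session_history? If so, your SH2 subquery should have a DISTINCT or GROUP BY to prevent duplicates in the result.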
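The duplication issue raised in the last comment can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, the sample rows are hypothetical, and the join refers to the S_ID and SH_ID aliases that the derived tables actually expose.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_sessions (session_id INTEGER, user_id INTEGER, series TEXT);
    CREATE TABLE session_history (session_id INTEGER, action INTEGER);
    -- Hypothetical data: one session with two history rows where action <> 2.
    INSERT INTO users_sessions VALUES (1, 8, 'a');
    INSERT INTO session_history VALUES (1, 1), (1, 3), (1, 2);
""")

# The join of two derived tables, with an optional DISTINCT in SH2.
QUERY = """
    SELECT us8.s_id, us8.series
    FROM (SELECT us.session_id AS s_id, us.series
          FROM users_sessions AS us
          WHERE us.user_id = 8) AS us8
    INNER JOIN
         (SELECT {distinct} sh.session_id AS sh_id
          FROM session_history AS sh
          WHERE sh.action <> 2) AS sh2
      ON us8.s_id = sh2.sh_id
"""

plain_rows = conn.execute(QUERY.format(distinct="")).fetchall()
distinct_rows = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(plain_rows)     # [(1, 'a'), (1, 'a')]
print(distinct_rows)  # [(1, 'a')]
```

Without DISTINCT (or GROUP BY) in SH2, the session appears once per matching history row.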