Create a pgsql view that sums two columns from two different tables
[Posted]: 2021-04-23 10:07:58
[Question]:
I have 2 views:
prognosis_scores:
user_id | isGoodPrognosis | pts_won | good_gap | good_score |
---|---|---|---|---|
1 | true | 1 | 0 | 0 |
1 | true | 1 | 0 | 0 |
2 | true | 3 | 1 | 1 |
2 | false | 0 | 0 | 0 |
question_scores:
user_id | isGoodPrognosis | pts_won |
---|---|---|
1 | false | 0 |
2 | true | 10 |
I want to create a third view that computes each user's total score (I also need other data from the users table):
score_calculations:
user_id | pts_won | good_gap | good_score | good_winner | company_id | team_id | name |
---|---|---|---|---|---|---|---|
1 | 2 | 0 | 0 | 2 | 1 | 1 | John |
2 | 13 | 1 | 1 | 1 | 1 | 1 | Sam |
To do that, I wrote this:
CREATE VIEW score_calculations
AS SELECT
users.id as user_id,
users.name as name,
users.company_id as company_id,
users.team_id as team_id,
users.email_verified AS email_verified,
users.banned AS banned,
-- users.email as email,
SUM(COALESCE(prognosis_scores."pts_won", 0) + COALESCE (question_scores."pts_won", 0) ) as pts_won,
SUM(prognosis_scores."good_gap") as good_gap,
SUM(prognosis_scores."good_score") as good_score,
SUM(prognosis_scores."isGoodPrognosis"::INT) as good_winner
FROM users
LEFT JOIN prognosis_scores
ON prognosis_scores.user_id=users.id
LEFT JOIN question_scores
ON question_scores.user_id=users.id
GROUP BY users.id , users.name, users.company_id,team_id,email_verified,banned;
However, SUM(COALESCE(prognosis_scores."pts_won", 0) + COALESCE (question_scores."pts_won", 0) ) as pts_won
does not give correct results (see: SUM with multiple LEFT JOINS with VIEWS).
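The likely cause is join fan-out: both LEFT JOINs match on user_id only, so each prognosis_scores row is paired with every question_scores row of the same user, and the question points are added once per prognosis row. A minimal sketch that reproduces this with the sample rows above as inline VALUES lists (the CTE names merely shadow the real views, and the joined_rows column is added here for illustration):

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH prognosis_scores(user_id, pts_won) AS (
    VALUES (1, 1), (1, 1), (2, 3), (2, 0)
),
question_scores(user_id, pts_won) AS (
    VALUES (1, 0), (2, 10)
)
SELECT ps.user_id,
       COUNT(*)                     AS joined_rows,  -- rows produced by the join
       SUM(ps.pts_won + qs.pts_won) AS pts_won       -- question points repeat per prognosis row
FROM prognosis_scores ps
LEFT JOIN question_scores qs ON qs.user_id = ps.user_id
GROUP BY ps.user_id
ORDER BY ps.user_id;
```

For user 2, the single 10-point question row is paired with both prognosis rows, so the sum comes out as 23 rather than the expected 13.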
So I ended up with this:
CREATE VIEW score_calculations
AS SELECT u.id as user_id, u.name, u.company_id, u.team_id, u.email_verified,u.banned,
-- users.email as email,
COALESCE(ps.pts_won, 0) + COALESCE (qs.pts_won, 0) as pts_won,
ps.good_gap, ps.good_score, ps.good_winner
FROM users u LEFT JOIN LATERAL
(SELECT SUM(ps."pts_won") as pts_won,
SUM(ps.good_gap) as good_gap,
SUM(ps.good_score) as good_score,
SUM(ps."isGoodPrognosis"::INT) as good_winner
FROM prognosis_scores ps
WHERE ps.user_id = u.id
) ps
ON 1=1 LEFT JOIN LATERAL
(SELECT SUM(qs."pts_won") as pts_won
FROM question_scores qs
WHERE qs.user_id = u.id
) qs
ON 1=1;
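The problem is that the second block of code is slow: when I run SELECT * FROM score_calculations,
it takes about 16 s to execute, whereas the first block is fast and takes about 400 ms.
For this test, I have about 1,000 users and about 30,000 prognosis scores.
The question is: how can I optimize or change the second block of code (the score_calculations view)?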
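[Comments]:
To diagnose performance problems, use EXPLAIN ANALYZE. Include the output of the analyzed query in your question to get the best answers.
[Answer 1]:
This answer tries to optimize the query: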
SELECT * FROM score_calculations
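Using the LATERAL keyword can improve performance when you need data for only a few users, but when you need all users, the version without LATERAL usually performs better.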
CREATE VIEW score_calculations
AS
SELECT
u.id as user_id, u.name, u.company_id, u.team_id, u.email_verified,u.banned,
-- users.email as email,
COALESCE(ps.pts_won, 0) + COALESCE (qs.pts_won, 0) as pts_won,
ps.good_gap, ps.good_score, ps.good_winner
FROM users u
LEFT JOIN (
SELECT
ps.user_id,
SUM(ps.pts_won) as pts_won,
SUM(ps.good_gap) as good_gap,
SUM(ps.good_score) as good_score,
SUM(ps."isGoodPrognosis"::INT) as good_winner
FROM
prognosis_scores ps
GROUP BY
ps.user_id
) ps ON (ps.user_id = u.id)
LEFT JOIN (
SELECT
qs.user_id,
SUM(qs.pts_won) as pts_won
FROM
question_scores qs
GROUP BY
qs.user_id
) qs ON (qs.user_id = u.id);
If you filter with WHERE conditions on attributes of the users table, your original view is fine. You should check whether you have indexes on prognosis_scores.user_id and question_scores.user_id.
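As a rough illustration of the first point, a filter on a users column limits how many outer rows the lateral subqueries have to run for. The literal values below are made up, and the plan you get depends on your data and PostgreSQL version:

```sql
-- Only users of one company: the lateral subqueries typically run
-- once per matching user instead of once per user in the table.
SELECT *
FROM score_calculations
WHERE company_id = 1;   -- hypothetical company id

-- Show the actual plan and timings for a single user.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM score_calculations
WHERE user_id = 1;      -- hypothetical user id
```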
[Answer 2]:
For the lateral joins, you want the following indexes:
prognosis_scores(user_id)
question_scores(user_id)
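The indexes are needed to look up a specific user's rows in these tables. Without them, the database has to resort to other mechanisms, such as nested loops.
I suspect that with the right indexes this version could actually be faster than the first one, because it avoids the outer aggregation (no promises, though).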
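A sketch of the corresponding DDL follows. The index names are made up, and it assumes prognosis_scores and question_scores are tables or materialized views. The question describes them as views, and plain views cannot be indexed, so in that case the indexes would belong on the user_id columns of the underlying base tables instead.

```sql
-- Hypothetical index names; assumes these relations can be indexed
-- (tables or materialized views, not plain views).
CREATE INDEX IF NOT EXISTS prognosis_scores_user_id_idx
    ON prognosis_scores (user_id);

CREATE INDEX IF NOT EXISTS question_scores_user_id_idx
    ON question_scores (user_id);

-- Update planner statistics after creating the indexes.
ANALYZE prognosis_scores;
ANALYZE question_scores;
```

Re-running EXPLAIN ANALYZE on SELECT * FROM score_calculations afterwards should show whether the planner actually uses the new indexes.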
[Answer 3]:
Try it this way (you can add a JOIN with users):
SELECT
ps.user_id,
SUM(COALESCE(ps."pts_won", 0) + COALESCE (qs."pts_won", 0) ) as pts_won,
SUM(ps."good_gap") as good_gap,
SUM(ps."good_score") as good_score,
SUM(ps.good_winner) as good_winner
FROM (SELECT ps.user_id,
SUM(ps."pts_won") AS pts_won,
SUM(ps."good_gap") as good_gap,
SUM(ps."good_score") as good_score,
SUM(ps."isGoodPrognosis"::INT) as good_winner
FROM prognosis_scores ps
GROUP BY ps.user_id) AS ps
LEFT JOIN question_scores qs ON qs.user_id = ps.user_id
GROUP BY ps.user_id
ORDER BY ps.user_id;