How can I optimize this SQL query for counting responses?


【Title】: How can I optimize this SQL query for counting responses? 【Posted】: 2019-12-21 10:17:41 【Question】:

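I have a question-response table. When I run the query below to count the responses per question for a chart, it takes 65 seconds to load.

Please advise how I can optimize this query: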

SELECT
vr.question_id,
(SELECT COUNT(response) FROM visitors_response  WHERE question_id = vr.question_id AND response = 5 ) AS one_star,
(SELECT COUNT(response) FROM visitors_response  WHERE question_id = vr.question_id AND response = 4 ) AS two_star,
(SELECT COUNT(response) FROM visitors_response  WHERE question_id = vr.question_id AND response = 3 ) AS three_star,
(SELECT COUNT(response) FROM visitors_response  WHERE question_id = vr.question_id AND response = 2 ) AS four_star,
(SELECT COUNT(response) FROM visitors_response  WHERE question_id = vr.question_id AND response = 1 ) AS five_star,
(SELECT AVG(response)   FROM visitors_response  WHERE question_id = vr.question_id ) AS average 
FROM visitors_response vr
JOIN questions q ON q.id = vr.question_id 
JOIN survey s ON s.id = q.survey_id
WHERE s.user_id = 101 AND s.status = 'active' 
GROUP BY vr.question_id

【Comments】:

【Answer 1】:

Try conditional aggregation:

SELECT
vr.question_id,
COUNT(CASE WHEN response = 5 THEN response END) AS one_star,
COUNT(CASE WHEN response = 4 THEN response END) AS two_star,
COUNT(CASE WHEN response = 3 THEN response END) AS three_star,
COUNT(CASE WHEN response = 2 THEN response END) AS four_star,
COUNT(CASE WHEN response = 1 THEN response END) AS five_star,
AVG(response) AS average 
FROM visitors_response vr
JOIN questions q ON q.id = vr.question_id 
JOIN survey s ON s.id = q.survey_id
WHERE s.user_id = 101 AND s.status = 'active' 
GROUP BY vr.question_id
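
To see why this works, here is a minimal sketch on a throwaway table. The table `demo_response` and its rows are made up for illustration and are not part of the original schema. A `CASE` without `ELSE` yields NULL for non-matching rows, and `COUNT(expr)` skips NULLs, so each column counts only its matching rows in a single pass over the data:

```sql
-- Hypothetical demo data, not the asker's real table.
DROP TEMPORARY TABLE IF EXISTS demo_response;
CREATE TEMPORARY TABLE demo_response (question_id INT, response INT);
INSERT INTO demo_response VALUES (1, 5), (1, 5), (1, 3), (2, 1), (2, NULL);

SELECT
  question_id,
  COUNT(CASE WHEN response = 5 THEN response END) AS count_of_5, -- rows where response = 5
  COUNT(response) AS answered,                                    -- all non-NULL responses
  COUNT(*) AS all_rows                                            -- every row, NULL or not
FROM demo_response
GROUP BY question_id
ORDER BY question_id;
-- question_id 1: count_of_5 = 2, answered = 3, all_rows = 3
-- question_id 2: count_of_5 = 0, answered = 1, all_rows = 2
```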

Or use SUM instead of COUNT (this relies on MySQL evaluating a comparison to 1 or 0):

SELECT
vr.question_id,
SUM(response = 5) AS one_star,
SUM(response = 4) AS two_star,
SUM(response = 3) AS three_star,
SUM(response = 2) AS four_star,
SUM(response = 1) AS five_star,
AVG(response) AS average 
FROM visitors_response vr
JOIN questions q ON q.id = vr.question_id 
JOIN survey s ON s.id = q.survey_id
WHERE s.user_id = 101 AND s.status = 'active' 
GROUP BY vr.question_id
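
A small sketch of how the `SUM` form behaves, using a made-up table (`demo_sum` and its rows are assumptions for illustration only). One difference that may matter for a chart: `SUM` returns NULL for a group whose responses are all NULL, while the `CASE ... ELSE 0` form returns 0 and also runs on databases that do not treat comparisons as integers:

```sql
-- Hypothetical demo data, not the asker's real table.
DROP TEMPORARY TABLE IF EXISTS demo_sum;
CREATE TEMPORARY TABLE demo_sum (question_id INT, response INT);
INSERT INTO demo_sum VALUES (1, 5), (1, 5), (1, 3), (2, NULL);

SELECT
  question_id,
  SUM(response = 5) AS mysql_form,                               -- comparison yields 1, 0, or NULL
  SUM(CASE WHEN response = 5 THEN 1 ELSE 0 END) AS portable_form -- never NULL for a non-empty group
FROM demo_sum
GROUP BY question_id
ORDER BY question_id;
-- question_id 1: mysql_form = 2,    portable_form = 2
-- question_id 2: mysql_form = NULL, portable_form = 0
```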

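【Comments】:

Check the results. A JOIN can multiply the number of rows, and the counting then happens on the joined rows before GROUP BY collapses them.

【Answer 2】:

You can use the IF() function inside any aggregate function. The trick is that COUNT() counts only non-NULL values, so put NULL in the "else" part. Like this: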

SELECT
  vr.question_id,
  COUNT(IF(response=1,1,NULL)) AS one_star,
  COUNT(IF(response=2,1,NULL)) AS two_star,
  COUNT(IF(response=3,1,NULL)) AS three_star,
  COUNT(IF(response=4,1,NULL)) AS four_star,
  COUNT(IF(response=5,1,NULL)) AS five_star,
  AVG(response) AS average
FROM visitors_response vr
JOIN questions q ON q.id = vr.question_id 
JOIN survey s ON s.id = q.survey_id
WHERE s.user_id = 101 AND s.status = 'active' 
GROUP BY vr.question_id

Or you can do the same thing with the OR operator:

  COUNT(response=1 OR NULL) AS one_star,

For me, this is the shortest and easiest option to understand.

【Comments】:

【Answer 3】:
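
The OR trick rests on MySQL's three-valued logic: a true comparison OR NULL is 1, while a false (or NULL) comparison OR NULL is NULL, which `COUNT` then skips. A minimal sketch, with the table `demo_or` and its rows invented purely for illustration:

```sql
-- How OR behaves with NULL in MySQL.
SELECT (1 OR NULL) AS match_case, (0 OR NULL) AS no_match_case;
-- match_case = 1, no_match_case = NULL

-- Hypothetical demo data, not the asker's real table.
DROP TEMPORARY TABLE IF EXISTS demo_or;
CREATE TEMPORARY TABLE demo_or (response INT);
INSERT INTO demo_or VALUES (1), (1), (2), (NULL);

SELECT
  COUNT(IF(response = 1, 1, NULL)) AS with_if, -- explicit NULL in the else branch
  COUNT(response = 1 OR NULL) AS with_or       -- same result through OR NULL
FROM demo_or;
-- with_if = 2, with_or = 2
```

Both forms are MySQL-specific; the `CASE WHEN` style is the more portable way to write the same count.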

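Note that for any query optimization question, you should provide the SHOW CREATE TABLE tablename statement for every table involved in the query.

That said, if the following indexes do not already exist, add them to your tables: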

survey: (user_id,status)
questions: (survey_id)
visitors_response: (question_id,response)

The indexes above assume that id on the survey and questions tables is the primary key of each table.
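
As a sketch, the indexes listed above could be created like this. The index names are placeholders chosen for this example, and the table name `visitors_response` is taken from the query in the question; check the existing indexes first so you do not create duplicates:

```sql
-- Inspect what already exists before adding anything.
SHOW INDEX FROM survey;
SHOW INDEX FROM questions;
SHOW INDEX FROM visitors_response;

-- Index names below are hypothetical; rename to fit your conventions.
CREATE INDEX idx_survey_user_status ON survey (user_id, status);
CREATE INDEX idx_questions_survey ON questions (survey_id);
CREATE INDEX idx_vr_question_response ON visitors_response (question_id, response);
```

The column order follows the query: the survey index starts with the two filtered columns, and the last index lets the per-question counts and average be read from the index without touching the table rows.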

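Report how much performance improves, and add the updated SHOW CREATE TABLE tablename output for each table so that we can help make sure you do not have any redundant indexes.

If performance is still not under 1 second, or whatever other threshold you want to beat, also include the current EXPLAIN plan.

【Comments】: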
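
The requested diagnostics can be gathered roughly as follows. This is a sketch that assumes the table names used in the question's query; the `EXPLAIN` is shown against a shortened form of the conditional aggregation query, with a neutral alias, since the plan depends on the joins and filters rather than on how many count columns are selected:

```sql
-- Table definitions, including current indexes.
SHOW CREATE TABLE survey;
SHOW CREATE TABLE questions;
SHOW CREATE TABLE visitors_response;

-- Execution plan for the rewritten query (remaining count columns omitted for brevity).
EXPLAIN
SELECT
  vr.question_id,
  COUNT(CASE WHEN vr.response = 5 THEN 1 END) AS count_of_5,
  AVG(vr.response) AS average
FROM visitors_response vr
JOIN questions q ON q.id = vr.question_id
JOIN survey s ON s.id = q.survey_id
WHERE s.user_id = 101 AND s.status = 'active'
GROUP BY vr.question_id;
```

In the plan, the `key` column shows which index, if any, is used for each table, and `rows` gives the optimizer's estimate of how many rows it expects to examine.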
