Subselect or joins? Is there a better way to write this MySQL query?

【Title】: SubSelect or Joins? Is there a better way to write this mysql query? 【Posted】: 2012-10-11 19:30:50 【Question】:

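I know there is always a better way to do something, but I'm not sure how to go about it here. What is the best way to optimize this query? Should I use joins, separate queries, or something else? I know this isn't a complex query; I'm just trying to expand my knowledge.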

Any suggestions would be greatly appreciated!

SELECT
  community_threads.id AS thread_id,
  community_threads.title AS thread_title,
  community_threads.date AS thread_date,
  community_threads.author_id AS author_id,
  `user`.display_name AS author_name,
  `user`.organization AS author_organization,
  (SELECT date FROM community_replies replies WHERE replies.thread_id = community_threads.id ORDER BY date DESC LIMIT 1) AS reply_date,
  (SELECT   count(id) FROM community_replies replies WHERE replies.thread_id = community_threads.id ORDER BY date DESC LIMIT 1) AS total_replies
FROM
  community_threads
INNER JOIN `user` ON community_threads.author_id = `user`.id
WHERE
  category_id = '1'
ORDER BY
  reply_date DESC
LIMIT 0, 5


【Answer 1】:

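This can be improved by JOINing against subselects that compute the aggregate COUNT() and the aggregate MAX(date) per thread_id. Instead of evaluating a subselect for every row, each derived table is evaluated once for the whole query and then joined against the remaining rows from community_threads.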

SELECT
  community_threads.id AS thread_id,
  community_threads.title AS thread_title,
  community_threads.date AS thread_date,
  community_threads.author_id AS author_id,
  `user`.display_name AS author_name,
  `user`.organization AS author_organization,
  /* From the joined subqueries */
  maxdate.date AS reply_date,
  threadcount.num AS total_replies
FROM
  community_threads
  INNER JOIN `user` ON community_threads.author_id = `user`.id
  /* JOIN against subqueries to return MAX(date) (same as order by date DESC limit 1) and COUNT(*) from replies */
  /* number of replies per thread_id */
  INNER JOIN  (
    SELECT thread_id, COUNT(*) AS num FROM community_replies GROUP BY thread_id
  ) threadcount ON community_threads.id = threadcount.thread_id
  /* Most recent date per thread_id */
  INNER JOIN (
    SELECT thread_id, MAX(date) AS date FROM community_replies GROUP BY thread_id
  ) maxdate ON community_threads.id = maxdate.thread_id
WHERE
  category_id = '1'
ORDER BY
  reply_date DESC
LIMIT 0, 5
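
To make the difference between the per-row subselects and the derived-table join concrete, here is a minimal sketch that runs both shapes against a tiny in-memory database. It uses Python's sqlite3 module as a stand-in for MySQL, the rows are invented sample data, and the two derived tables are merged into one (the `agg` alias is my own, not part of the answer). Treat it as an illustration under those assumptions rather than a benchmark.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE community_threads (
  id INTEGER PRIMARY KEY, title TEXT, date TEXT, category_id INTEGER);
CREATE TABLE community_replies (
  id INTEGER PRIMARY KEY, thread_id INTEGER, date TEXT);
-- Invented sample rows: thread 3 has no replies.
INSERT INTO community_threads VALUES
  (1, 'first', '2012-10-01', 1),
  (2, 'second', '2012-10-02', 1),
  (3, 'no replies yet', '2012-10-03', 1);
INSERT INTO community_replies VALUES
  (1, 1, '2012-10-05'),
  (2, 1, '2012-10-07'),
  (3, 2, '2012-10-06');
""")

# Shape 1: correlated subselects, evaluated per thread row.
correlated_rows = conn.execute("""
SELECT t.id,
  (SELECT MAX(r.date) FROM community_replies r
    WHERE r.thread_id = t.id) AS reply_date,
  (SELECT COUNT(*) FROM community_replies r
    WHERE r.thread_id = t.id) AS total_replies
FROM community_threads t
WHERE t.category_id = 1
ORDER BY reply_date DESC
""").fetchall()

# Shape 2: one derived table aggregated per thread_id, then joined.
joined_rows = conn.execute("""
SELECT t.id, agg.reply_date, agg.total_replies
FROM community_threads t
INNER JOIN (
  SELECT thread_id, MAX(date) AS reply_date, COUNT(*) AS total_replies
  FROM community_replies
  GROUP BY thread_id
) agg ON agg.thread_id = t.id
WHERE t.category_id = 1
ORDER BY agg.reply_date DESC
""").fetchall()

print(correlated_rows)
print(joined_rows)
```

One behavioral difference is worth knowing: the subselect version still lists thread 3 with a NULL reply_date and a count of 0, while the INNER JOIN version omits threads that have no replies. If those threads should stay in the result, a LEFT JOIN against the derived table keeps them.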

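You might get better performance by putting LIMIT 0, 5 inside the reply_date subquery. That pulls only the 5 most recently replied threads out of the subquery, and the INNER JOIN then discards every row from community_threads that doesn't match. Note that the limit is applied before the category_id filter, so this can return fewer than 5 rows whenever some of the 5 most recently replied threads belong to other categories.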

/* I *think* this will work...*/
SELECT
  community_threads.id AS thread_id,
  community_threads.title AS thread_title,
  community_threads.date AS thread_date,
  community_threads.author_id AS author_id,
  `user`.display_name AS author_name,
  `user`.organization AS author_organization,
  /* From the joined subqueries */
  maxdate.date AS reply_date,
  threadcount.num AS total_replies
FROM
  community_threads
  INNER JOIN `user` ON community_threads.author_id = `user`.id
  INNER JOIN  (
    SELECT thread_id, COUNT(*) AS num FROM community_replies GROUP BY thread_id
  ) threadcount ON community_threads.id = threadcount.thread_id
  /* LIMIT in this subquery */
  INNER JOIN (
    SELECT thread_id, MAX(date) AS date FROM community_replies GROUP BY thread_id ORDER BY date DESC LIMIT 0, 5
  ) maxdate ON community_threads.id = maxdate.thread_id
WHERE
  category_id = '1'
ORDER BY
  reply_date DESC
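
Because the LIMIT inside the subquery runs before the category filter, it is worth checking what it does when categories are mixed. The sketch below again uses Python's sqlite3 as a stand-in for MySQL, with invented sample rows and a limit of 2 instead of 5 to keep the data small; the `maxdate` alias follows the answer, everything else is assumed for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE community_threads (
  id INTEGER PRIMARY KEY, title TEXT, category_id INTEGER);
CREATE TABLE community_replies (
  id INTEGER PRIMARY KEY, thread_id INTEGER, date TEXT);
-- Invented sample rows: thread 2 is in another category
-- and has the most recent reply overall.
INSERT INTO community_threads VALUES
  (1, 'cat 1, older reply', 1),
  (2, 'cat 2, newest reply', 2),
  (3, 'cat 1, newer reply', 1);
INSERT INTO community_replies VALUES
  (1, 1, '2012-10-05'),
  (2, 2, '2012-10-09'),
  (3, 3, '2012-10-08');
""")

# LIMIT inside the derived table: the top 2 threads are picked
# across all categories, then filtered to category 1.
inner_limited = conn.execute("""
SELECT t.id, maxdate.reply_date
FROM community_threads t
INNER JOIN (
  SELECT thread_id, MAX(date) AS reply_date
  FROM community_replies
  GROUP BY thread_id
  ORDER BY reply_date DESC
  LIMIT 2
) maxdate ON maxdate.thread_id = t.id
WHERE t.category_id = 1
ORDER BY maxdate.reply_date DESC
""").fetchall()

# LIMIT on the outer query: filter to category 1 first, then take 2.
outer_limited = conn.execute("""
SELECT t.id, maxdate.reply_date
FROM community_threads t
INNER JOIN (
  SELECT thread_id, MAX(date) AS reply_date
  FROM community_replies
  GROUP BY thread_id
) maxdate ON maxdate.thread_id = t.id
WHERE t.category_id = 1
ORDER BY maxdate.reply_date DESC
LIMIT 2
""").fetchall()

print(inner_limited)
print(outer_limited)
```

With this data the inner-limit query returns only thread 3, because one of its two slots was spent on the category 2 thread, while the outer-limit query returns threads 3 and 1. The inner LIMIT is therefore only safe when the subquery is already restricted to the same category, or when every thread is in it.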

【Comments】:

Interesting, thanks. I'd never thought of using MIN/MAX before. I'll upvote once I have the reputation.

【Answer 2】:

As far as I can tell, this looks like a good opportunity for a GROUP BY.

SELECT 
    community_threads.id AS thread_id,
    community_threads.title AS thread_title,
    community_threads.date AS thread_date,
    community_threads.author_id AS author_id,
    `user`.display_name AS author_name,
    `user`.organization AS author_organization,
    MAX(replies.date) AS reply_date,
    COUNT(replies.id) AS total_replies
FROM community_threads
INNER JOIN `user` ON community_threads.author_id = `user`.id
INNER JOIN community_replies AS replies ON replies.thread_id = community_threads.id
WHERE category_id = 1
GROUP BY thread_id, thread_title, thread_date, author_id, author_name, author_organization
ORDER BY reply_date DESC
LIMIT 0, 5

Hope this helps.
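
As a rough check of the GROUP BY approach, the sketch below runs the single-join aggregation on invented sample data, once with INNER JOIN as in the answer and once with LEFT JOIN. It uses Python's sqlite3 as a stand-in for MySQL, and the `run_grouped` helper and the short `t`/`r` aliases are hypothetical names introduced only for this demo.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE community_threads (
  id INTEGER PRIMARY KEY, title TEXT, category_id INTEGER);
CREATE TABLE community_replies (
  id INTEGER PRIMARY KEY, thread_id INTEGER, date TEXT);
-- Invented sample rows: thread 3 has no replies.
INSERT INTO community_threads VALUES
  (1, 'first', 1), (2, 'second', 1), (3, 'no replies yet', 1);
INSERT INTO community_replies VALUES
  (1, 1, '2012-10-05'),
  (2, 1, '2012-10-07'),
  (3, 2, '2012-10-06');
""")

def run_grouped(join_type):
    """Hypothetical helper: run the grouped query with the given join type."""
    # join_type is a fixed literal chosen below, never user input.
    sql = f"""
    SELECT t.id,
           MAX(r.date) AS reply_date,
           COUNT(r.id) AS total_replies
    FROM community_threads t
    {join_type} JOIN community_replies r ON r.thread_id = t.id
    WHERE t.category_id = 1
    GROUP BY t.id
    ORDER BY reply_date DESC
    """
    return conn.execute(sql).fetchall()

inner_rows = run_grouped("INNER")  # threads with at least one reply
left_rows = run_grouped("LEFT")    # also keeps threads without replies

print(inner_rows)
print(left_rows)
```

COUNT(r.id) counts only non-NULL values, so under the LEFT JOIN a thread without replies gets a count of 0 and a NULL reply_date instead of disappearing. That matches what the subselect version in the question returns for such threads.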
