Get most recent result from a LEFT JOIN column


【Posted】: 2021-06-26 12:06:25

【Question】:

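I'm building a custom forum from scratch and am trying to use some LEFT JOIN queries to get information such as total posts, total threads, and the most recent thread. I've managed to get the data, but recent thread keeps returning a random value rather than the most recent thread.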

CREATE TABLE forum_categories
    (`name` varchar(18), `label` varchar(52), `id` int)
;
    
INSERT INTO forum_categories
    (`name`, `label`, `id`)
VALUES
    ('General Discussion', 'Talk about anything and everything Digimon!', 1),
    ('Deck Discussion', 'Talk about Digimon TCG Decks and Strategies!', 2),
    ('Card Discussion', 'Talk about Digimon TCG Cards!', 3),
    ('Website Feedback', 'A place to discuss and offer feedback on the website', 4)
;

CREATE TABLE forum_topics
    (`name` varchar(18), `id` int, `parent_id` int, `author_id` int, date date)
;
    
INSERT INTO forum_topics
    (`name`, `id`, `parent_id`, `author_id`, `date`)
VALUES
    ('My First Topic', 1, 1, 16, '2021-03-29'),
    ('My Second Topic', 2, 1, 16, '2021-03-30')
;

CREATE TABLE forum_topics_content
    (`id` int, `topic_id` int, `author_id` int, date datetime, `content` varchar(300))
;
    
INSERT INTO forum_topics_content
    (`id`, `topic_id`, `author_id`, `date`, `content`)
VALUES
    (1, 1, 16, '2021-03-29 15:46:55', 'Hey guys! This is my first post!'),
    (2, 1, 16, '2021-03-30 08:05:13', 'This is my first topic reply!')
;

My query:

SELECT forum_categories.name, label, forum_categories.id, COUNT(DISTINCT(forum_topics.id)) as 'topics', COUNT(DISTINCT(forum_topics_content.id)) as 'posts', SUBSTRING(forum_topics.name,1, 32) as 'thread'
FROM forum_categories 
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id
ORDER BY forum_categories.id, forum_topics.date DESC

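I thought that having forum_topics.date DESC in the ORDER BY would work for me and output the latest thread, "My Second Topic", but it doesn't. I'm a bit stumped; I've tried different variations of ORDER BY, to no avail. thread keeps returning a random result from the two possible results.

A complete data example is available on this fiddle: https://www.db-fiddle.com/f/auDzUABaEpYzLKDkRqE7ok/0

The desired result is for 'thread' to always be the most recent thread, which in this example is "My Second Topic". However, it always seems to pick randomly between "My First Topic" and "My Second Topic".

The output for the first row should always be: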

'General Discussion', 'Talk about anything and everything Digimon!', 1, 2, 2, 'My Second Topic'

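【Comments】:

Given the sample data set, please edit your question to provide the corresponding desired result.

Why are you (trying to) insert a datetime value into the table forum_topics_content, which only contains a date field?

@Luuk That was just a typo. It's actually a datetime field. I've updated and fixed it.

Sounds like a "groupwise-max" problem. See the tag I added.

【Answer 1】: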

thread keeps returning a random result from the two possible results.

The provided query is simply nondeterministic, and is equivalent to:

SELECT forum_categories.name, 
  forum_categories.label, 
  forum_categories.id,
  COUNT(DISTINCT(forum_topics.id)) as 'topics',
  COUNT(DISTINCT(forum_topics_content.id)) as 'posts',
  SUBSTRING(ANY_VALUE(forum_topics.name),1, 32) as 'thread'
FROM forum_categories 
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id,forum_categories.name,forum_categories.label
ORDER BY forum_categories.id, ANY_VALUE(forum_topics.date) DESC;

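Assuming forum_categories.id is the PRIMARY KEY, name and label are functionally dependent on it, but the remaining columns are just ANY_VALUE.

If a column in the SELECT list is neither functionally dependent on the GROUP BY columns nor wrapped in an aggregate function, the query is incorrect. On MySQL 8.0 with its default settings, or on any version with ONLY_FULL_GROUP_BY enabled, the result is an error.

Related: Group by clause in mySQL and postgreSQL, why the error in postgreSQL?

There are different ways to achieve the desired result (correlated subquery, window functions, LIMIT, and so on).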

Here is one using GROUP_CONCAT:

SELECT forum_categories.name, 
  forum_categories.label, 
  forum_categories.id,
  COUNT(DISTINCT(forum_topics.id)) as `topics`,
  COUNT(DISTINCT(forum_topics_content.id)) as `posts`,
  SUBSTRING_INDEX(GROUP_CONCAT(SUBSTRING(forum_topics.name,1,32)
                 ORDER BY forum_topics.`date` DESC 
                 SEPARATOR '~'),
                 '~',1) AS `thread`
FROM forum_categories 
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id,forum_categories.name,forum_categories.label
ORDER BY forum_categories.id;

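How it works:

GROUP_CONCAT is an aggregate function that concatenates strings in the order given by its ORDER BY clause. For the first category it yields: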

My Second Topic~My First Topic~My First Topic

SUBSTRING_INDEX then returns the part of the string before the first occurrence of the delimiter ~.
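To see the second step in isolation, here is a minimal sketch that applies SUBSTRING_INDEX to the concatenated string shown above, written as a literal. It assumes that no topic name contains the separator character ~; a name that did would be cut short.

```sql
-- Take everything before the first '~' in the concatenated list.
SELECT SUBSTRING_INDEX('My Second Topic~My First Topic~My First Topic', '~', 1) AS thread;
-- thread: My Second Topic
```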

db<>fiddle demo
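As an alternative to GROUP_CONCAT, the correlated-subquery approach mentioned earlier could be sketched as follows. This is an illustration against the question's tables, not part of the original answer. The aliases c, t, p and t2 are introduced here, and the extra tie-breaker on id is an assumption about how topics sharing the same date should be ordered.

```sql
SELECT c.name,
  c.label,
  c.id,
  COUNT(DISTINCT t.id) AS `topics`,
  COUNT(DISTINCT p.id) AS `posts`,
  -- Scalar subquery: the newest topic name for this category, or NULL if it has none.
  (SELECT SUBSTRING(t2.name, 1, 32)
   FROM forum_topics AS t2
   WHERE t2.parent_id = c.id
   ORDER BY t2.`date` DESC, t2.id DESC  -- id as tie-breaker is an assumption
   LIMIT 1) AS `thread`
FROM forum_categories AS c
LEFT JOIN forum_topics AS t ON c.id = t.parent_id
LEFT JOIN forum_topics_content AS p ON t.id = p.topic_id
GROUP BY c.id, c.name, c.label
ORDER BY c.id;
```

Because the subquery depends only on c.id, which is a GROUP BY column, this form should also be accepted when ONLY_FULL_GROUP_BY is enabled.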


【Answer 2】:

In your fiddle you have:

SET SESSION sql_mode = '';

You should change it to:

SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

You will then get an error like this:

Query Error: Error: ER_WRONG_FIELD_WITH_GROUP: Expression #1 of 
SELECT list is not in GROUP BY clause and contains nonaggregated 
column 'test.forum_categories.name' which is not functionally 
dependent on columns in GROUP BY clause; this is incompatible with
 sql_mode=only_full_group_by

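The documentation states:

ONLY_FULL_GROUP_BY

Reject queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on (uniquely determined by) GROUP BY columns.

As of MySQL 5.7.5, the default SQL mode includes ONLY_FULL_GROUP_BY. (Before 5.7.5, MySQL does not detect functional dependency and ONLY_FULL_GROUP_BY is not enabled by default. For a description of pre-5.7.5 behavior, see the MySQL 5.6 Reference Manual.)

They did this for good reason. See: why should not disable only full group by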
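Before changing the mode, it may help to check which modes are currently active. A minimal sketch using the standard sql_mode system variable:

```sql
-- SQL modes in effect for the current session.
SELECT @@SESSION.sql_mode;

-- Server-wide value that new sessions start with.
SELECT @@GLOBAL.sql_mode;
```

If ONLY_FULL_GROUP_BY does not appear in the session value, queries like the one in the question run without an error and return an arbitrary row's value for thread.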
