LIMIT by GROUP without omitting the duplicates?
[Posted]: 2018-05-09 15:35:03
[Question]:
I have this somewhat complex query (stripped-down version):
SELECT a.item_id,
a.title,
a.series,
c.title AS manufacturer_title
FROM catalog_items AS a
JOIN catalog_franchises AS c ON a.manufacturer_id = c.franchise_id
JOIN catalog_franchises AS e ON a.game_id = e.franchise_id
WHERE e.franchise_id = 6
AND a.valid = TRUE
AND c.valid = TRUE
AND e.valid = TRUE
ORDER BY c.title, a.series, a.title
The important fields I am trying to focus on are c.title and a.series. So, for brevity, this query returns:
item_id series manufacturer_title
46 2 fantasy flight games
63 1 gaming heads
64 1 gaming heads
33 2 reaper miniatures
124 1 triforce
125 1 triforce
45 1 triforce
43 1 usaopoly
What I am trying to do is add a LIMIT based on the unique combinations of these fields. If I add GROUP BY c.title, a.series, it gives me the unique groups:
item_id series manufacturer_title
46 2 fantasy flight games
63 1 gaming heads
33 2 reaper miniatures
124 1 triforce
43 1 usaopoly
But I want all of the rows, limited by these groups. So if the limit is 3, I want every item in the first 3 groups:
item_id series manufacturer_title
46 2 fantasy flight games
63 1 gaming heads
64 1 gaming heads
33 2 reaper miniatures
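I hope that makes sense. How can I modify my query to achieve this?
[Comments]:
What are the primary and unique keys?
@PaulSpiegel Every field ending in _id is a primary/unique key.
This is an abuse of GROUP BY; it is intended for aggregation, not for a "semi-distinct".
@Uueerdo If you have a non-abusive solution, I am open to it.
Just pointing out that this use of GROUP BY results in an effectively random choice of item_id for each group (from among that group's item_id values), and it is only allowed in MySQL (whose newer versions are configured by default to disallow it).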
[Answer 1]:
You can select and limit the groups in a subquery first, then join that back to the tables to get the final result.
SELECT DISTINCT a2.series, c2.title
FROM catalog_items AS a2
JOIN catalog_franchises AS c2 ON a2.manufacturer_id = c2.franchise_id
JOIN catalog_franchises AS e2 ON a2.game_id = e2.franchise_id
WHERE e2.franchise_id = 6
AND a2.valid = TRUE
AND c2.valid = TRUE
AND e2.valid = TRUE
ORDER BY c2.title, a2.series
LIMIT 3
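Then just add it to the WHERE clause of the original query:
AND (a.series, c.title) IN (query above)
Note: yes, this does mean the original query is nearly repeated inside itself, but that is usually how this kind of query ends up.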
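One caveat when applying the IN condition above on MySQL: a LIMIT placed directly inside an IN (...) subquery is rejected with error 1235 ("This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'"). A common workaround, sketched here under the assumption that the tables are exactly those from the question, is to wrap the limited query in one more derived table. The alias top_groups is an illustrative name, not part of the original answer.

```sql
SELECT a.item_id,
       a.title,
       a.series,
       c.title AS manufacturer_title
FROM catalog_items AS a
JOIN catalog_franchises AS c ON a.manufacturer_id = c.franchise_id
JOIN catalog_franchises AS e ON a.game_id = e.franchise_id
WHERE e.franchise_id = 6
  AND a.valid = TRUE
  AND c.valid = TRUE
  AND e.valid = TRUE
  AND (a.series, c.title) IN (
        -- Extra derived-table layer (hypothetical alias top_groups):
        -- the LIMIT now sits one level below the IN subquery itself.
        SELECT top_groups.series, top_groups.title
        FROM (
            SELECT DISTINCT a2.series, c2.title
            FROM catalog_items AS a2
            JOIN catalog_franchises AS c2 ON a2.manufacturer_id = c2.franchise_id
            JOIN catalog_franchises AS e2 ON a2.game_id = e2.franchise_id
            WHERE e2.franchise_id = 6
              AND a2.valid = TRUE
              AND c2.valid = TRUE
              AND e2.valid = TRUE
            ORDER BY c2.title, a2.series
            LIMIT 3
        ) AS top_groups
      )
ORDER BY c.title, a.series, a.title;
```

The groups are matched on the manufacturer title rather than its id, so two manufacturers sharing a title would be treated as one group; matching on c.franchise_id instead would avoid that if it matters for your data.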
[Comments]:
[Answer 2]:
Modify your original query to get the first 3 groups:
SELECT DISTINCT c.title, a.series
FROM catalog_items AS a
JOIN catalog_franchises AS c ON a.manufacturer_id = c.franchise_id
JOIN catalog_franchises AS e ON a.game_id = e.franchise_id
WHERE e.franchise_id = 6
AND a.valid = TRUE
AND c.valid = TRUE
AND e.valid = TRUE
ORDER BY c.title, a.series
LIMIT 3
Use it as a subquery in the FROM clause (a derived table) to limit the result to those 3 groups:
SELECT a.item_id,
a.title,
a.series,
c.title AS manufacturer_title
FROM catalog_items AS a
JOIN catalog_franchises AS c ON a.manufacturer_id = c.franchise_id
JOIN catalog_franchises AS e ON a.game_id = e.franchise_id
-- begin injected code
JOIN (
SELECT DISTINCT c.title, a.series
FROM catalog_items AS a
JOIN catalog_franchises AS c ON a.manufacturer_id = c.franchise_id
JOIN catalog_franchises AS e ON a.game_id = e.franchise_id
WHERE e.franchise_id = 6
AND a.valid = TRUE
AND c.valid = TRUE
AND e.valid = TRUE
ORDER BY c.title, a.series
LIMIT 3
) x ON a.series = x.series AND c.title = x.title
-- end injected code
WHERE e.franchise_id = 6
AND a.valid = TRUE
AND c.valid = TRUE
AND e.valid = TRUE
ORDER BY c.title, a.series, a.title
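This is basically the same as what Uueerdo suggested, but it uses a JOIN instead of a WHERE .. IN .. condition.
[Comments]:
I am trying to wrap my head around this, but it seems to return duplicate results / put items with manufacturers they do not belong to...
"Trying to wrap my head around this" - me too ;-)
@mistermartin I tried to remove some of the JOINs, but I guess you need all three tables in both the inner and the outer query.
[Answer 3]:
Would getting the results in a single column help? If so, you can use group_concat():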
SELECT GROUP_CONCAT(a.item_id) as items,
a.series, c.title AS manufacturer_title
FROM catalog_items a JOIN
catalog_franchises c
ON a.manufacturer_id = c.franchise_id JOIN
catalog_franchises e
ON a.game_id = e.franchise_id
WHERE e.franchise_id = 6 AND
a.valid = TRUE AND
c.valid = TRUE AND
e.valid = TRUE
GROUP BY a.series, c.title
ORDER BY c.title, a.series;
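Although you can get the results in separate rows, that is harder than it should be in MySQL (before v8). If this meets your needs, it is a relatively simple solution.
Edit:
One way to get what you want is to use variables: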
SELECT ist.*
FROM (SELECT ist.*,
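-- series and manufacturer_title come from the derived table ist; alias a is not visible at this level
(@rn := if(@st = concat_ws(':', ist.series, ist.manufacturer_title), @rn,
if(@st := concat_ws(':', ist.series, ist.manufacturer_title), @rn + 1, @rn + 1)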
)
) as rn
FROM (SELECT a.item_id, a.series, c.title AS manufacturer_title
FROM catalog_items a JOIN
catalog_franchises c
ON a.manufacturer_id = c.franchise_id JOIN
catalog_franchises e
ON a.game_id = e.franchise_id
WHERE e.franchise_id = 6 AND
a.valid = TRUE AND
c.valid = TRUE AND
e.valid = TRUE
ORDER BY c.title, a.series
) ist CROSS JOIN
(SELECT @rn := 0, @st := '') params
) ist
WHERE rn <= 3;
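On MySQL 8.0 or later, window functions can number the groups directly, which makes the variable trick unnecessary. The following is a minimal sketch, assuming MySQL 8.0+ and the same tables as in the question; the column alias grp_no and the derived-table alias ranked are illustrative names, not from the original answer.

```sql
SELECT ranked.item_id,
       ranked.title,
       ranked.series,
       ranked.manufacturer_title
FROM (
    SELECT a.item_id,
           a.title,
           a.series,
           c.title AS manufacturer_title,
           -- Rows sharing the same (c.title, a.series) get the same number,
           -- and DENSE_RANK leaves no gaps between consecutive groups.
           DENSE_RANK() OVER (ORDER BY c.title, a.series) AS grp_no
    FROM catalog_items AS a
    JOIN catalog_franchises AS c ON a.manufacturer_id = c.franchise_id
    JOIN catalog_franchises AS e ON a.game_id = e.franchise_id
    WHERE e.franchise_id = 6
      AND a.valid = TRUE
      AND c.valid = TRUE
      AND e.valid = TRUE
) AS ranked
WHERE ranked.grp_no <= 3   -- keep every row of the first 3 groups
ORDER BY ranked.manufacturer_title, ranked.series, ranked.title;
```

The filter has to live in the outer query because a window function cannot be referenced in the WHERE clause of the same SELECT that computes it. Changing the 3 adjusts how many groups are kept, without touching the rest of the query.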
[Comments]:
This is an interesting solution, but I believe separate rows would be better for simplifying how I process the data, if that is possible.