Slow MySQL query when using ORDER BY id
Posted: 2020-11-17 07:58:34
I have a very slow query. The first part is generated by a gem (https://github.com/CanCanCommunity/cancancan, which creates the SELECT and the inner query), and I add ORDER BY and LIMIT to it for cursor-based pagination.
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.48 sec)
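For readers unfamiliar with cursor-based pagination, the following is a minimal sketch of how a later page is usually requested: the client remembers the largest id it has seen and asks for rows after it. The variable @last_seen_id and its value are assumptions for illustration, and the vendor filter from the query above is left out to keep the sketch short.

```sql
-- Hypothetical cursor value: the largest id returned on the previous page.
SET @last_seen_id = 1000;

SELECT p.*
FROM spree_products p
WHERE p.id > @last_seen_id   -- resume right after the previous page
ORDER BY p.id ASC            -- a stable order is what makes the cursor meaningful
LIMIT 50;
```

Because the cursor depends on a deterministic order, the ORDER BY on id cannot simply be dropped, which is why its cost matters here.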
Here are the tables:
CREATE TABLE `spree_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`available_on` datetime DEFAULT NULL,
`permalink` varchar(255) DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`count_on_hand` int(11) DEFAULT NULL,
`vendor_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_spree_products_on_vendor_id` (`vendor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=37209248 DEFAULT CHARSET=utf8mb4
CREATE TABLE `spree_vendors` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`active` tinyint(1) DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4413 DEFAULT CHARSET=utf8mb4
(I removed unnecessary columns to keep things tidy.)
EXPLAIN for the query above returns:
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | spree_vendors | NULL | ALL | PRIMARY | NULL | NULL | NULL | 3465 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | spree_products | NULL | ref | PRIMARY,index_spree_products_on_vendor_id | index_spree_products_on_vendor_id | 5 | _hubert_test.spree_vendors.id | 8613 | 100.00 | Using index |
| 1 | SIMPLE | spree_products | NULL | eq_ref | PRIMARY | PRIMARY | 4 | _hubert_test.spree_products.id | 1 | 100.00 | NULL |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
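To reproduce output like the table above, the statement is prefixed with EXPLAIN. This is only a sketch of the invocation; the plan you get may differ with the MySQL version and the table statistics. One common reading of "Using temporary; Using filesort" on the first row is that the joined result is materialized and sorted before LIMIT is applied, which would explain why the LIMIT does not help.

```sql
-- Plain EXPLAIN shows the estimated plan without running the query.
-- EXPLAIN FORMAT=JSON gives the same plan with more detail.
EXPLAIN
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
    (SELECT `spree_products`.`id`
     FROM `spree_products`
     LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
     WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
```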
When I remove the ORDER BY, the query is fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
LIMIT 50;
=> 50 rows in set (0.00 sec)
When I keep the ORDER BY in the outer query but remove the WHERE from the subquery, the query is also fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
I tried adding a composite index on spree_vendors.id / spree_vendors.active, but that did not help.
Any ideas on how to optimize this query?
Update 1:
The JOIN variant is also slow. The DISTINCT is added by the gem to prevent duplicate records in case you do not select all columns:
SELECT DISTINCT `spree_products`.*
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 43.13 sec)
Without DISTINCT, the query is fast.
Update 2:
It was pointed out that using LEFT OUTER JOIN in the subquery returns the whole table. But the query is still slow when using INNER JOIN:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
INNER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.98 sec)
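Comments:
The problem is in WHERE IN, not in ORDER BY. Rewrite it as an INNER JOIN, or at least as WHERE EXISTS. If your framework cannot do that, use raw SQL. PS. Going by the logic of the query text, your WHERE IN and the whole subquery should be removed entirely: all it does is check that spree_products.id is not NULL.
...same with 'inner join'
Note that LIMIT without ORDER BY is fairly meaningless.
I have updated the question and added the JOIN version of it. @Akina what do you mean by "it simply checks..."? The subquery selects all spree_products that have a spree_vendor with active = TRUE, or am I missing something?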
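The WHERE EXISTS rewrite suggested in the first comment could look like the sketch below. The aliases p and v are assumptions for readability. Since spree_vendors.id is the primary key, each product matches at most one vendor, so this form cannot produce duplicate product rows and does not need DISTINCT. Whether the optimizer chooses a better plan for it depends on the MySQL version and the table statistics, so it is worth checking with EXPLAIN rather than assuming.

```sql
SELECT p.*
FROM spree_products p
WHERE EXISTS (
    -- True when the product's vendor exists and is active.
    SELECT 1
    FROM spree_vendors v
    WHERE v.id = p.vendor_id
      AND v.active = TRUE
)
ORDER BY p.id ASC
LIMIT 50;
```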
Answer 1:
Given that id must be the PRIMARY key, your query must be functionally identical to this:
SELECT [DISTINCT] p.*
FROM spree_products p
JOIN spree_vendors v
ON v.id = p.vendor_id
WHERE v.active = 1
ORDER
BY p.id ASC
LIMIT 50;
This would benefit from an index on p.vendor_id, and perhaps one on v.active.
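As a sketch of the indexing suggestion: the schema in the question already has an index on spree_products.vendor_id, so only the one on active would be new. The index name below is hypothetical. A boolean column such as active is usually not very selective, so the optimizer may ignore this index; rerunning EXPLAIN afterwards shows whether it is used.

```sql
-- Already present in the question's schema:
--   KEY index_spree_products_on_vendor_id (vendor_id)

-- Hypothetical index name; adds the suggested index on v.active.
CREATE INDEX index_spree_vendors_on_active
    ON spree_vendors (active);
```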
Comments:
Close; the framework adds SELECT DISTINCT. Without DISTINCT the query is fast, with it the query is slow.
Do you have indexes?
Running EXPLAIN on the query with DISTINCT added returns this:
| 1 | SIMPLE | p | NULL | index | index_spree_products_on_vendor_id | PRIMARY | 4 | NULL | 499 | 100.00 | Using where; Using temporary |
and this:
| 1 | SIMPLE | v | NULL | eq_ref | PRIMARY,index_spree_vendors_on_id_and_active | PRIMARY | 4 | _my_table.p.vendor_id | 1 | 10.00 | Using where; Distinct |
There is an index on p.vendor_id, and I tried adding one on v.active; it made no difference, and the query is still slow.
Then I'm stumped. This is a tiny data set; the results should be instant, with or without ORDER BY.