Optimize MySQL nested select with arithmetic operation
【Posted】: 2012-12-17 18:44:36
【Question】: I am running this SQL query on a denormalized table in MySQL 5.1. It returns the results I want, but it can be slow. I added an index on the day column, but the query still needs to be faster. Any suggestions on how to speed it up? (Perhaps with a join instead?)
SELECT DISTINCT(bucket) AS b,
(possible_free_slots -
(SELECT COUNT(availability)
FROM ip_bucket_list
WHERE bucket = b
AND availability = 'used'
AND tday = 'evening'
AND day LIKE '2012-12-14%'
AND network = '10_83_mh1_bucket')) AS free_slots
FROM ip_bucket_list
ORDER BY free_slots DESC;
Each query is fast on its own:
SELECT DISTINCT(bucket) FROM ip_bucket_list;
1024 rows in set (0.05 sec)
SELECT COUNT(availability) from ip_bucket_list WHERE bucket = 0 AND availability = 'used' AND tday = 'evening' AND day LIKE '2012-12-14%' AND network = '10_83_mh1_bucket';
1 row in set (0.00 sec)
The table:
mysql> describe ip_bucket_list;
+---------------------+--------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+-------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| ip | varchar(50) | YES | | NULL | |
| bucket | int(11) | NO | MUL | NULL | |
| availability | varchar(20) | YES | | NULL | |
| network | varchar(100) | NO | MUL | NULL | |
| possible_free_slots | int(11) | NO | | NULL | |
| tday | varchar(20) | YES | | NULL | |
| day | timestamp | NO | MUL | CURRENT_TIMESTAMP | |
+---------------------+--------------+------+-----+-------------------+----------------+
And the DESC (EXPLAIN) output:
DESC SELECT DISTINCT(bucket) as b,(possible_free_slots - (SELECT COUNT(availability) from ip_bucket_list WHERE bucket = b AND availability = 'used' AND tday = 'evening' AND day LIKE '2012-12-14%' AND network = '10_83_mh1_bucket')) as free_slots FROM ip_bucket_list ORDER BY free_slots DESC;
+----+--------------------+----------------+------+-----------------------------------------+--------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------------+------+-----------------------------------------+--------+---------+------+--------+---------------------------------+
| 1 | PRIMARY | ip_bucket_list | ALL | NULL | NULL | NULL | NULL | 328354 | Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | ip_bucket_list | ref | bucket,network,ip_bucket_list_day_index | bucket | 4 | func | 161 | Using where |
+----+--------------------+----------------+------+-----------------------------------------+--------+---------+------+--------+---------------------------------+
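【Comments】:
Could you post some sample rows and the expected output? That might help.
【Answer 1】: I would use a join to move the correlated subquery from the SELECT clause into the FROM clause: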
SELECT DISTINCT ipbl.bucket AS b,
       (ipbl.possible_free_slots - COALESCE(a.avail, 0)) AS free_slots
FROM ip_bucket_list ipbl LEFT OUTER JOIN
     (SELECT bucket, COUNT(availability) AS avail
      FROM ip_bucket_list
      WHERE availability = 'used' AND tday = 'evening' AND
            day LIKE '2012-12-14%' AND network = '10_83_mh1_bucket'
      GROUP BY bucket  -- one count per bucket
     ) a
     ON ipbl.bucket = a.bucket
ORDER BY free_slots DESC;
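The version in the SELECT clause is probably being re-run for every row (even before the DISTINCT is applied). With the subquery in the FROM clause, the counts are computed in a single scan of the ip_bucket_list table instead of once per row.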
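Also, if you want each bucket to appear only once, I would suggest using GROUP BY rather than DISTINCT. It makes the intent of the query clearer. You can then eliminate the second reference to the table entirely, for example: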
SELECT bucket AS b,
       -- assumes possible_free_slots is the same for every row of a bucket
       (MAX(possible_free_slots) -
        SUM(CASE WHEN availability = 'used' AND tday = 'evening' AND
                      day LIKE '2012-12-14%' AND network = '10_83_mh1_bucket'
                 THEN 1 ELSE 0
            END)
       ) AS free_slots
FROM ip_bucket_list
GROUP BY bucket
ORDER BY free_slots DESC;
To speed up your own version of the query, you need an index on bucket, because that column is used by the correlated subquery.
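The index advice can be taken a step further. The following is only a sketch under assumptions: the index name idx_bucket_filter is hypothetical, and the column order (equality columns first, the day range last) is a guess that should be checked with EXPLAIN against real data. It also rewrites the day filter as a range, since LIKE on a timestamp column compares it as a string and generally prevents MySQL from using an index on day.

```sql
-- Hypothetical composite index: equality columns first, range column (day) last.
ALTER TABLE ip_bucket_list
  ADD INDEX idx_bucket_filter (bucket, network, tday, availability, day);

-- Same count as the subquery in the question for one bucket,
-- with the LIKE on the timestamp replaced by a half-open range.
SELECT COUNT(*)
FROM ip_bucket_list
WHERE bucket = 0
  AND network = '10_83_mh1_bucket'
  AND tday = 'evening'
  AND availability = 'used'
  AND day >= '2012-12-14 00:00:00'
  AND day <  '2012-12-15 00:00:00';
```

For a version that computes all counts in one pass rather than per bucket, leading the index with the filter columns (network, tday, availability, day) and putting bucket last may suit better. Either way, compare the EXPLAIN output before and after.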
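【Comments】:
Thanks for the reply. The only catch is that I need to filter the second SELECT so that it only includes results for the current bucket.
In the end I simply replaced DISTINCT with a GROUP BY clause, which sped up the query.
【Answer 2】: Try moving the subquery into the main query, like this: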
SELECT b.bucket AS b,
       b.possible_free_slots - COUNT(l.availability) AS free_slots
FROM (SELECT DISTINCT bucket, possible_free_slots
      FROM ip_bucket_list) b  -- deduplicate first so the self-join does not inflate the count
LEFT JOIN ip_bucket_list l
       ON l.bucket = b.bucket
      AND l.availability = 'used'
      AND l.tday = 'evening'
      AND l.day LIKE '2012-12-14%'
      AND l.network = '10_83_mh1_bucket'
GROUP BY b.bucket, b.possible_free_slots
ORDER BY 2 DESC;