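Foreword
You are not required to understand exactly how MySQL evaluates a multi-table query (join / left join / inner join and so on);
you are not required to know which production tables are large and which are small, now or in the future;
but do run EXPLAIN often and read the execution plan. It is a virtue!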
Join query basics
The two queries below differ only in their ORDER BY clause, yet they behave very differently.
First query:
EXPLAIN extended SELECT ads.id FROM ads, city WHERE city.city_id = 8005 AND ads.status = 'online' AND city.ads_id = ads.id ORDER BY ads.id DESC
Its execution plan:
id  select_type  table  type    possible_keys   key      key_len  ref          rows  filtered  Extra
--  -----------  -----  ------  --------------  -------  -------  -----------  ----  --------  -------------------------------
1   SIMPLE       city   ref     ads_id,city_id  city_id  4        const        2838  100.00    Using temporary; Using filesort
1   SIMPLE       ads    eq_ref  PRIMARY         PRIMARY  4        city.ads_id  1     100.00    Using where
Second query:
EXPLAIN extended SELECT ads.id FROM ads, city WHERE city.city_id = 8005 AND ads.status = 'online' AND city.ads_id = ads.id ORDER BY city.ads_id DESC
This time Using temporary is gone from the execution plan:
id  select_type  table  type    possible_keys   key      key_len  ref          rows  filtered  Extra
--  -----------  -----  ------  --------------  -------  -------  -----------  ----  --------  ---------------------------
1   SIMPLE       city   ref     ads_id,city_id  city_id  4        const        2838  100.00    Using where; Using filesort
1   SIMPLE       ads    eq_ref  PRIMARY         PRIMARY  4        city.ads_id  1     100.00    Using where
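Why does the first query use a temporary table while the second does not? In both plans city is the driving table. The first query sorts by ads.id, a column of the non-driving table, so MySQL has to collect the joined rows in a temporary table and sort that. The second sorts by city.ads_id, a column of the driving table, so the driving table's rows can be sorted directly before the join.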
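One rough way to picture the two cases is to model the join as a nested loop over in-memory rows. This is only a sketch: the rows, the dictionary standing in for the primary key, and the function names are all invented for illustration, and MySQL's real execution is more involved.

```python
# Toy model of the two ORDER BY cases. All data here is made up.
city = [(8005, 3), (8005, 1), (8005, 2)]        # driving rows: (city_id, ads_id)
ads = {1: "online", 2: "offline", 3: "online"}  # ads.id -> status, acts as the primary key


def order_by_driving_column():
    """ORDER BY city.ads_id DESC: sort the driving rows first, then join."""
    ordered = sorted(city, key=lambda row: row[1], reverse=True)  # the filesort step
    result = []
    for _, ads_id in ordered:            # one primary-key lookup per driving row
        if ads.get(ads_id) == "online":
            result.append(ads_id)
    return result                        # rows come out already in the requested order


def order_by_non_driving_column():
    """ORDER BY ads.id DESC: join first, buffer the rows, then sort the buffer."""
    temporary = []                       # stands in for the temporary table
    for _, ads_id in city:
        if ads.get(ads_id) == "online":
            temporary.append(ads_id)     # this value belongs to the ads side of the join
    temporary.sort(reverse=True)         # the sort runs over the joined result
    return temporary
```

Both functions return the same rows; the difference is only where the sort happens and whether an intermediate buffer is needed.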
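Definition of the driving table
1) When a join condition is specified, the table with fewer rows satisfying the query conditions is the driving table;
2) When no join condition is specified, the table with fewer rows is the driving table (Important!).
Advice: if you cannot tell which table should drive and which should be joined to which, let MySQL decide at run time.
Let the small result set drive the large one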
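Why is the smaller side the better driver? Here is a minimal sketch, assuming the join runs as a nested loop with one index probe per driving row. The sizes and names are invented and are not the tables from this post.

```python
def nested_loop_join(driving, lookup_index):
    """Scan `driving` and probe `lookup_index` once per driving row.

    Returns (joined_rows, probes), where probes counts the index lookups.
    """
    joined, probes = [], 0
    for key, payload in driving:
        probes += 1
        match = lookup_index.get(key)
        if match is not None:
            joined.append((key, payload, match))
    return joined, probes


# Hypothetical tables: 10 rows on the small side, 1000 on the large side.
small = [(i, f"s{i}") for i in range(0, 1000, 100)]
large = [(i, f"l{i}") for i in range(1000)]

rows_a, probes_a = nested_loop_join(small, dict(large))  # small drives large
rows_b, probes_b = nested_loop_join(large, dict(small))  # large drives small
```

Both directions find the same 10 matches, but driving from the small side costs 10 probes and driving from the large side costs 1000.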
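A worked example
Some background first: table mb holds on the order of ten million rows, and table mbei holds far fewer. Here is the slow query: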
explain SELECT mb.id, …… FROM mb LEFT JOIN mbei ON mb.id=mbei.mb_id INNER JOIN u ON mb.uid=u.uid WHERE 1=1 ORDER BY mbei.apply_time DESC limit 0,10
id  select_type  table  type    possible_keys  key      key_len  ref     rows     Extra
--  -----------  -----  ------  -------------  -------  -------  ------  -------  --------------------------------------------
1   SIMPLE       mb     index   userid         userid   4        (NULL)  6060455  Using index; Using temporary; Using filesort
1   SIMPLE       mbei   eq_ref  mb_id          mb_id    4        mb.id   1
1   SIMPLE       u      eq_ref  PRIMARY        PRIMARY  4        mb.uid  1        Using index
Because the query uses LEFT JOIN, the engineer has already dictated the driving table, even though that table's result set runs to millions of rows!
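To see why LEFT JOIN pins the driving table, consider this small sketch. It assumes a simplified model where a join is a loop plus a dictionary lookup; the ids and dates are invented.

```python
def left_join(left_keys, right_index):
    """Every left row is emitted, matched or not, so the left side must be the outer loop."""
    return [(key, right_index.get(key)) for key in left_keys]


def inner_join(left_keys, right_index):
    """Only matching pairs are emitted, so either side could be the outer loop."""
    return [(key, right_index[key]) for key in left_keys if key in right_index]


mb_ids = [1, 2, 3, 4, 5]                            # hypothetical mb.id values
mbei_by_mb_id = {2: "2020-01-01", 4: "2020-02-01"}  # hypothetical mbei rows keyed by mb_id

left_rows = left_join(mb_ids, mbei_by_mb_id)    # 5 rows, unmatched ones carry None
inner_rows = inner_join(mb_ids, mbei_by_mb_id)  # 2 rows

# The same inner join, driven from the small side instead.
known_ids = set(mb_ids)
inner_from_small = [(k, v) for k, v in mbei_by_mb_id.items() if k in known_ids]
```

The inner join gives the same rows whichever side drives the loop, which is what leaves the optimizer free to pick the smaller table. The left join also returns more rows, so the two are interchangeable only when the unmatched rows are not needed.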
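How can this be optimized?
Optimization step 1: change LEFT JOIN to JOIN
Why use LEFT JOIN at all? If the business logic does not need the mb rows that have no matching mbei row, use a plain JOIN!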
explain SELECT mb.id…… FROM mb JOIN mbei ON mb.id=mbei.mb_id INNER JOIN u ON mb.uid=u.uid WHERE 1=1 ORDER BY mbei.apply_time DESC limit 0,10
The effect is immediate: the driving table becomes the small table mbei, Using temporary disappears, and far fewer rows are examined:
id  select_type  table  type    possible_keys   key      key_len  ref         rows   Extra
--  -----------  -----  ------  --------------  -------  -------  ----------  -----  --------------
1   SIMPLE       mbei   ALL     mb_id           (NULL)   (NULL)   (NULL)      13383  Using filesort
1   SIMPLE       mb     eq_ref  PRIMARY,userid  PRIMARY  4        mbei.mb_id  1
1   SIMPLE       u      eq_ref  PRIMARY         PRIMARY  4        mb.uid      1      Using index
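Optimization step 1, branch 1: avoid sorting by a column of a non-driving table
Keep the LEFT JOIN this time. Why sort by a column of a non-driving table in the first place? The rule is: rows of the driving table can be sorted directly, whereas sorting by a column of a non-driving table means sorting the merged result of the nested-loop lookups (a temporary table).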
explain SELECT mb.id…… FROM mb LEFT JOIN mbei ON mb.id=mbei.mb_id INNER JOIN u ON mb.uid=u.uid WHERE 1=1 ORDER BY mb.id DESC limit 0,10
This also meets the business requirement, and rows drops to its minimum:
id  select_type  table  type    possible_keys  key      key_len  ref     rows  Extra
--  -----------  -----  ------  -------------  -------  -------  ------  ----  -----------
1   SIMPLE       mb     index   userid         PRIMARY  4        (NULL)  10
1   SIMPLE       mbei   eq_ref  mb_id          mb_id    4        mb.id   1     Using index
1   SIMPLE       u      eq_ref  PRIMARY        PRIMARY  4        mb.uid  1     Using index
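A rows value of 10 on a table with millions of rows is plausible when the driving table is read in the order the query asks for. The sketch below assumes that the primary key can be walked in descending order and that the scan stops once the LIMIT is satisfied. The data and names are invented.

```python
def top_n_by_driving_key(mb_rows_desc, mbei_by_mb_id, user_ids, limit):
    """Walk the driving rows in index order and stop once `limit` rows are produced.

    Returns (result_rows, driving_rows_examined).
    """
    result, examined = [], 0
    for mb_id, uid in mb_rows_desc:            # already ordered by mb.id DESC
        examined += 1
        if uid not in user_ids:                # INNER JOIN u ON mb.uid = u.uid
            continue
        apply_time = mbei_by_mb_id.get(mb_id)  # LEFT JOIN mbei: None when there is no match
        result.append((mb_id, apply_time))
        if len(result) == limit:               # LIMIT reached, no sort was needed
            break
    return result, examined


# Invented data: 100,000 mb rows, every uid present in u, sparse mbei rows.
mb_rows_desc = [(mb_id, mb_id % 7) for mb_id in range(100_000, 0, -1)]
user_ids = set(range(7))
mbei_by_mb_id = {mb_id: f"t{mb_id}" for mb_id in range(50, 100_001, 50)}

rows, examined = top_n_by_driving_key(mb_rows_desc, mbei_by_mb_id, user_ids, limit=10)
```

With this data the loop touches only 10 of the 100,000 driving rows, because the ordering comes for free from the scan order and nothing has to be buffered and sorted.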
Optimization step 2: drop all the JOIN keywords and let MySQL decide!
Is writing a dense wall of left join / inner join clauses really that much fun?
explain SELECT mb.id…… FROM mb,mbei,u WHERE mb.id=mbei.mb_id and mb.uid=u.uid order by mbei.apply_time desc limit 0,10
Again the effect is immediate, and the driving table is the small table mbei just as before:
id  select_type  table  type    possible_keys   key      key_len  ref         rows   Extra
--  -----------  -----  ------  --------------  -------  -------  ----------  -----  --------------
1   SIMPLE       mbei   ALL     mb_id           (NULL)   (NULL)   (NULL)      13388  Using filesort
1   SIMPLE       mb     eq_ref  PRIMARY,userid  PRIMARY  4        mbei.mb_id  1
1   SIMPLE       u      eq_ref  PRIMARY         PRIMARY  4        mb.uid      1      Using index
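Summary
Be on the alert when an execution plan shows any of the following:
- Using temporary appears;
- rows is very large, or close to the table's total row count;
- key is (NULL);
- possible_keys lists too many candidate indexes.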
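The checklist above can be turned into a small helper that scans one row of EXPLAIN output. This is only a sketch under stated assumptions: the row is passed in as a plain dictionary, the 90% threshold for "close to the full table" and the limit of three candidate indexes are arbitrary choices, and the function name is hypothetical.

```python
def warning_signs(plan_row, table_rows=None, max_possible_keys=3):
    """Return the checklist items that one EXPLAIN row triggers.

    plan_row: dict with the keys "possible_keys", "key", "rows", "Extra".
    table_rows: total row count of the table, if known (assumed input).
    """
    signs = []
    if "Using temporary" in (plan_row.get("Extra") or ""):
        signs.append("Using temporary")
    rows = plan_row.get("rows") or 0
    if table_rows and rows >= 0.9 * table_rows:   # 90% is an arbitrary threshold
        signs.append("rows close to the table size")
    if plan_row.get("key") is None:
        signs.append("key is NULL")
    candidates = [k for k in (plan_row.get("possible_keys") or "").split(",") if k]
    if len(candidates) > max_possible_keys:       # the limit of 3 is an assumption
        signs.append("too many possible_keys")
    return signs


# Rows copied from the plans in this post; table_rows for mb is assumed for the demo.
slow_mb = {"possible_keys": "userid", "key": "userid", "rows": 6060455,
           "Extra": "Using index; Using temporary; Using filesort"}
scan_mbei = {"possible_keys": "mb_id", "key": None, "rows": 13383,
             "Extra": "Using filesort"}

slow_flags = warning_signs(slow_mb, table_rows=6_060_455)
scan_flags = warning_signs(scan_mbei)
```

A flagged row is a prompt to look closer, not a verdict: whether a given plan is acceptable still depends on the table sizes and the workload.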