Mysql Join大表小表到底谁驱动谁

Posted valleyup

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Mysql Join大表小表到底谁驱动谁相关的知识,希望对你有一定的参考价值。

1、准备

mysql> create table dept(
    id int unsigned auto_increment not null primary key,
    name varchar(20) default ‘‘ not null,
    key(name)
)engine=innodb default charset=utf8mb4;
CREATE TABLE `userinfo` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `passwd` varchar(255) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `phone` varchar(255) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `dept` int NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  KEY `name` (`name`),
  KEY `dept` (`dept`)
) ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci 

用 excel 伪造数据,并导入

mysql> load data infile /var/lib/mysql-files/new5.txt into table dept fields terminated by   lines terminated by 
;
Query OK, 1000 rows affected (0.03 sec)
Records: 1000  Deleted: 0  Skipped: 0  Warnings: 0

mysql> load data infile /var/lib/mysql-files/new.txt into table userinfo fields terminated by   lines terminated by 
;
Query OK, 120000 rows affected (0.02 sec)
Records: 120000  Deleted: 0  Skipped: 0  Warnings: 0

dept 1000 条, userinfo 120000 条

一、查找 userinfo 名字是 a25 的 id,name,及 dept name

由于有 userinfo = ‘a25‘ 的限制条件,不管谁 join 谁,或者 left join,结果都是一样的

mysql> select count(1) from userinfo left join dept using(dept_id) where userinfo.name = a25;
+----------+
| count(1) |
+----------+
|      187 |
+----------+
1 row in set

mysql> select count(1) from dept left join userinfo using(dept_id) where userinfo.name = a25;
+----------+
| count(1) |
+----------+
|      187 |
+----------+
1 row in set

mysql> select count(1) from dept join userinfo using(dept_id) where userinfo.name = a25;
+----------+
| count(1) |
+----------+
|      187 |
+----------+
1 row in set

mysql> select count(1) from userinfo join dept using(dept_id) where userinfo.name = a25;
+----------+
| count(1) |
+----------+
|      187 |
+----------+
1 row in set

那就比较四种方式的效率:

开启慢查询:

set long_query_time = 0.0001;
set global slow_query_log = ON;
mysql> show variables like "slow%";
+---------------------+--------------------------------------+
| Variable_name       | Value                                |
+---------------------+--------------------------------------+
| slow_launch_time    | 2                                    |
| slow_query_log      | ON                                   |
| slow_query_log_file | /var/lib/mysql/muqing-web-2-slow.log |
+---------------------+--------------------------------------+
tail -f /var/lib/mysql/muqing-web-2-slow.log // 实时查看耗时

1、userinfo left join dept

mysql> explain select userinfo.user_id, userinfo.name, dept.name from userinfo left join dept using(dept_id) where userinfo.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ref    | name,name_2   | name    | 1022    | const                 |  187 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.userinfo.dept_id |    1 |      100 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+------------+
| Variable_name   | Value      |
+-----------------+------------+
| Last_query_cost | 130.899000 |
+-----------------+------------+
1 row in set

mysql> explain select straight_join userinfo.user_id, userinfo.name, dept.name from userinfo left join dept using(dept_id) where userinfo.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ref    | name,name_2   | name    | 1022    | const                 |  187 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.userinfo.dept_id |    1 |      100 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+------------+
| Variable_name   | Value      |
+-----------------+------------+
| Last_query_cost | 130.899000 |
+-----------------+------------+
1 row in set

可以看到优化器选择的是顺序是先 userinfo 表,warnings 信息:

mysql> explain select straight_join userinfo.user_id, userinfo.name, dept.name from userinfo left join dept using(dept_id) where userinfo.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ref    | name,name_2   | name    | 1022    | const                 |  187 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.userinfo.dept_id |    1 |      100 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
2 rows in set

mysql> show warnings;
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                            |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select straight_join `test`.`userinfo`.`user_id` AS `user_id`,`test`.`userinfo`.`name` AS `name`,`test`.`dept`.`name` AS `name` from `test`.`userinfo` left join `test`.`dept` on((`test`.`userinfo`.`dept_id` = `test`.`dept`.`dept_id`)) where (`test`.`userinfo`.`name` = a25) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set

2、userinfo join dept

mysql> explain select userinfo.user_id, userinfo.name, dept.name from userinfo join dept using(dept_id) where userinfo.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ref    | name,name_2   | name    | 1022    | const                 |  187 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.userinfo.dept_id |    1 |      100 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+------------+
| Variable_name   | Value      |
+-----------------+------------+
| Last_query_cost | 130.899000 |
+-----------------+------------+
1 row in set

mysql> explain select straight_join userinfo.user_id, userinfo.name, dept.name from userinfo join dept using(dept_id) where userinfo.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ref    | name,name_2   | name    | 1022    | const                 |  187 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.userinfo.dept_id |    1 |      100 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
2 rows in set

mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                         |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select straight_join `test`.`userinfo`.`user_id` AS `user_id`,`test`.`userinfo`.`name` AS `name`,`test`.`dept`.`name` AS `name` from `test`.`userinfo` join `test`.`dept` where ((`test`.`userinfo`.`name` = a25) and (`test`.`userinfo`.`dept_id` = `test`.`dept`.`dept_id`)) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
# Time: 2020-04-10T17:00:38.452393Z
# User@Host: root[root] @ localhost []  Id:    11
# Query_time: 0.000741  Lock_time: 0.000137 Rows_sent: 187  Rows_examined: 374
SET timestamp=1586538038;
select sql_no_cache userinfo.user_id, userinfo.name, dept.name from userinfo left join dept using(dept_id) where userinfo.name = a25;
# Time: 2020-04-10T17:01:49.946904Z
# User@Host: root[root] @ localhost []  Id:    11
# Query_time: 0.000766  Lock_time: 0.000149 Rows_sent: 187  Rows_examined: 374
SET timestamp=1586538109;
select straight_join userinfo.user_id, userinfo.name, dept.name from userinfo join dept using(dept_id) where userinfo.name = a25;

可以看出两个时间差不多

3、dept left join userinfo

mysql> explain select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept left join userinfo using(dept_id) where userinfo.name = a25;

+----+-------------+----------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | dept     | NULL       | index | PRIMARY       | name | 82      | NULL  | 1000 |      100 | Using index |
|  1 | SIMPLE      | userinfo | NULL       | ref   | name,name_2   | name | 1022    | const |  187 |       10 | Using where |
+----+-------------+----------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+---------------+
| Variable_name   | Value         |
+-----------------+---------------+
| Last_query_cost | 106671.417848 |
+-----------------+---------------+
1 row in set
# Time: 2020-04-10T17:32:02.890446Z
# User@Host: root[root] @  [10.0.0.105]  Id:    13
# Query_time: 0.134147  Lock_time: 0.000128 Rows_sent: 187  Rows_examined: 188000
SET timestamp=1586539922;
select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept left join userinfo using(dept_id) where userinfo.name = a25;

明显比上面的要慢

4、dept join userinfo

mysql> explain select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept join userinfo using(dept_id) where userinfo.name = a25;
+----+-------------+----------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | dept     | NULL       | index | PRIMARY       | name | 82      | NULL  | 1000 |      100 | Using index |
|  1 | SIMPLE      | userinfo | NULL       | ref   | name,name_2   | name | 1022    | const |  187 |       10 | Using where |
+----+-------------+----------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+---------------+
| Variable_name   | Value         |
+-----------------+---------------+
| Last_query_cost | 106671.417848 |
+-----------------+---------------+
1 row in set
# Time: 2020-04-10T17:33:35.458204Z
# User@Host: root[root] @  [10.0.0.105]  Id:    13
# Query_time: 0.133475  Lock_time: 0.000164 Rows_sent: 187  Rows_examined: 188000
SET timestamp=1586540015;
select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept join userinfo using(dept_id) where userinfo.name = a25;

速度和 3 一样,也远大于前2 

所以是大表驱动小表。而且 show status like ‘last_query_cost‘ 还挺准的

二、查找 dept 名字是 a25 的 id,name,及 dept name

 四种查询等效

mysql> select count(1) from userinfo left join dept using(dept_id) where dept.name = a25;
+----------+
| count(1) |
+----------+
|      360 |
+----------+
1 row in set

mysql> select count(1) from userinfo join dept using(dept_id) where dept.name = a25;
+----------+
| count(1) |
+----------+
|      360 |
+----------+
1 row in set

mysql> select count(1) from dept join userinfo using(dept_id) where dept.name = a25;
+----------+
| count(1) |
+----------+
|      360 |
+----------+
1 row in set

mysql> select count(1) from dept left join userinfo using(dept_id) where dept.name = a25;
+----------+
| count(1) |
+----------+
|      360 |
+----------+
1 row in set

1、userinfo left join dept

mysql> explain select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from userinfo left join dept using(dept_id) where dept.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+--------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows   | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+--------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ALL    | NULL          | NULL    | NULL    | NULL                  | 120234 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY,name  | PRIMARY | 4       | test.userinfo.dept_id |      1 |        5 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+--------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+--------------+
| Variable_name   | Value        |
+-----------------+--------------+
| Last_query_cost | 54209.549000 |
+-----------------+--------------+
1 row in set
# Time: 2020-04-10T17:40:53.462276Z
# User@Host: root[root] @  [10.0.0.105]  Id:    13
# Query_time: 0.110679  Lock_time: 0.000145 Rows_sent: 360  Rows_examined: 240000
SET timestamp=1586540453;
select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from userinfo left join dept using(dept_id) where dept.name = a25;

2、userinfo join dept

mysql> explain select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from userinfo join dept using(dept_id) where dept.name = a25;
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+--------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys | key     | key_len | ref                   | rows   | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+--------+----------+-------------+
|  1 | SIMPLE      | userinfo | NULL       | ALL    | NULL          | NULL    | NULL    | NULL                  | 120234 |      100 | NULL        |
|  1 | SIMPLE      | dept     | NULL       | eq_ref | PRIMARY,name  | PRIMARY | 4       | test.userinfo.dept_id |      1 |        5 | Using where |
+----+-------------+----------+------------+--------+---------------+---------+---------+-----------------------+--------+----------+-------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+--------------+
| Variable_name   | Value        |
+-----------------+--------------+
| Last_query_cost | 54209.549000 |
+-----------------+--------------+
# Time: 2020-04-10T17:44:21.701356Z
# User@Host: root[root] @  [10.0.0.105]  Id:    13
# Query_time: 0.105262  Lock_time: 0.000137 Rows_sent: 360  Rows_examined: 240000
SET timestamp=1586540661;
select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from userinfo join dept using(dept_id) where dept.name = a25;

速度比上面的好一点点

3、dept left join userinfo

mysql> explain select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept left join userinfo using(dept_id) where dept.name = a25;
+----+-------------+----------+------------+------+---------------+------+---------+-------+--------+----------+----------------------------------------------------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref   | rows   | filtered | Extra                                              |
+----+-------------+----------+------------+------+---------------+------+---------+-------+--------+----------+----------------------------------------------------+
|  1 | SIMPLE      | dept     | NULL       | ref  | name          | name | 82      | const |      1 |      100 | Using index                                        |
|  1 | SIMPLE      | userinfo | NULL       | ALL  | NULL          | NULL | NULL    | NULL  | 120234 |      100 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+----------+------------+------+---------------+------+---------+-------+--------+----------+----------------------------------------------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+--------------+
| Variable_name   | Value        |
+-----------------+--------------+
| Last_query_cost | 12128.032803 |
+-----------------+--------------+
# Time: 2020-04-10T17:47:06.147081Z
# User@Host: root[root] @  [10.0.0.105]  Id:    13
# Query_time: 0.031848  Lock_time: 0.000137 Rows_sent: 360  Rows_examined: 120001
SET timestamp=1586540826;
select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept left join userinfo using(dept_id) where dept.name = a25;

速度竟然比上面的块,不是大表驱动小表吗,继续看

4、dept join userinfo

mysql> explain select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept join userinfo using(dept_id) where dept.name = a25;
+----+-------------+----------+------------+------+---------------+------+---------+-------+--------+----------+----------------------------------------------------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref   | rows   | filtered | Extra                                              |
+----+-------------+----------+------------+------+---------------+------+---------+-------+--------+----------+----------------------------------------------------+
|  1 | SIMPLE      | dept     | NULL       | ref  | PRIMARY,name  | name | 82      | const |      1 |      100 | Using index                                        |
|  1 | SIMPLE      | userinfo | NULL       | ALL  | NULL          | NULL | NULL    | NULL  | 120234 |       10 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+----------+------------+------+---------------+------+---------+-------+--------+----------+----------------------------------------------------+
2 rows in set

mysql> show status like last_query_cost;
+-----------------+--------------+
| Variable_name   | Value        |
+-----------------+--------------+
| Last_query_cost | 12128.032803 |
+-----------------+--------------+
# Time: 2020-04-10T17:48:44.738339Z
# User@Host: root[root] @  [10.0.0.105]  Id:    13
# Query_time: 0.032005  Lock_time: 0.000143 Rows_sent: 360  Rows_examined: 120001
SET timestamp=1586540924;
select sql_no_cache straight_join userinfo.user_id, userinfo.name, dept.name from dept join userinfo using(dept_id) where dept.name = a25;

速度也比 1,2快

left join 和 join 相比 按理说会慢一点,因为:join 的伪代码为

outer_iter = iterator over tb1 where col1 in(5, 6)

outer_row = outer_iter.next

while outer_row

  inner_iter = iterator over tb2 where col3 = outer_row.col3

  inner_row = inner_iter.next

  while inner_row

    out_put [ outer_row.col1, inner_row.col2 ]
    inner_row = inner_iter.next
  end

  outer_row = outer_iter.next

end

而 left join 会多一次判断,多一次返回 null 结果集。

猜想是 where 条件 获取的过滤多,就用谁来做外表,所以讲不定 一定是大表驱动小表

以上是关于Mysql Join大表小表到底谁驱动谁的主要内容,如果未能解决你的问题,请参考以下文章

mysql join 谁是驱动表

Spark的Join连接

Mysql如何Join一条语句

1025关于explain的补充1

关于hive中Map join 时大表left join小表的问题

mysql在连表查询时是小表驱动大表吗