Why is this MySQL query using a join buffer?

【Posted】: 2013-10-19 12:43:46

【Question】:

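The following query is using a join buffer, and I was wondering if someone could explain to me why that is. I am just trying to further my understanding of MySQL and indexes.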

mysql> EXPLAIN SELECT events.event_topic_id, event_topic_name, event_topic_image, event_type_name,city_name FROM events
    ->             JOIN event_topic ON event_topic.event_topic_id=events.event_topic_id
    ->             JOIN event_type ON event_type.event_type_id = event_topic.event_type_id
    ->             JOIN locations ON locations.location_id=events.location_id
    ->             JOIN city ON city.city_id=locations.city_id
    ->             WHERE event_date > NOW()
    ->             GROUP BY events.event_topic_id, city.city_id;
+----+-------------+-------------+--------+---------------------------------------+-----------------+---------+--------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table       | type   | possible_keys                         | key             | key_len | ref                                  | rows | filtered | Extra                                        |
+----+-------------+-------------+--------+---------------------------------------+-----------------+---------+--------------------------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | city        | index  | PRIMARY                               | city_name       | 52      | NULL                                 |    6 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | locations   | ref    | PRIMARY,city_id                       | city_id         | 1       | PremiumCONNECT.city.city_id          |    1 |   100.00 | Using index                                  |
|  1 | SIMPLE      | events      | ref    | location_id,event_topic_id,event_date | location_id     | 2       | PremiumCONNECT.locations.location_id |    3 |   100.00 | Using where                                  |
|  1 | SIMPLE      | event_type  | index  | PRIMARY                               | event_type_name | 52      | NULL                                 |    2 |   100.00 | Using index; Using join buffer               |
|  1 | SIMPLE      | event_topic | eq_ref | PRIMARY,event_type_id                 | PRIMARY         | 1       | PremiumCONNECT.events.event_topic_id |    1 |   100.00 | Using where                                  |
+----+-------------+-------------+--------+---------------------------------------+-----------------+---------+--------------------------------------+------+----------+----------------------------------------------+

The events table:

CREATE TABLE `events` (
  `event_id` smallint(8) unsigned NOT NULL AUTO_INCREMENT,
  `location_id` smallint(3) unsigned NOT NULL,
  `event_date` datetime NOT NULL,
  `event_topic_id` tinyint(3) unsigned NOT NULL,
  PRIMARY KEY (`event_id`),
  KEY `location_id` (`location_id`),
  KEY `event_topic_id` (`event_topic_id`),
  KEY `event_date` (`event_date`),
  CONSTRAINT `events_ibfk_2` FOREIGN KEY (`event_topic_id`) REFERENCES `event_topic` (`event_topic_id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `events_ibfk_3` FOREIGN KEY (`location_id`) REFERENCES `locations` (`location_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=91 DEFAULT CHARSET=latin1

The event_topic table:

CREATE TABLE `event_topic` (
  `event_topic_id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
  `event_topic_name` varchar(100) DEFAULT NULL,
  `event_topic_description` text NOT NULL,
  `event_topic_cost` decimal(7,2) DEFAULT NULL,
  `event_type_id` tinyint(3) unsigned NOT NULL,
  `event_topic_clickthrough` tinytext,
  `event_topic_length` varchar(6) NOT NULL,
  `event_topic_image` varchar(41) DEFAULT NULL,
  `event_topic_image_md5` char(32) NOT NULL,
  PRIMARY KEY (`event_topic_id`),
  KEY `event_type_id` (`event_type_id`),
  KEY `topic_image_sha1` (`event_topic_image_md5`),
  CONSTRAINT `event_topic_ibfk_1` FOREIGN KEY (`event_type_id`) REFERENCES `event_type` (`event_type_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=latin1

The event_type table:

CREATE TABLE `event_type` (
  `event_type_id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
  `event_type_name` varchar(50) NOT NULL,
  `conf_email` text,
  PRIMARY KEY (`event_type_id`),
  KEY `event_type_name` (`event_type_name`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1

The locations table:

CREATE TABLE `locations` (
  `location_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `location_name` varchar(50) NOT NULL,
  `location_address` tinytext NOT NULL,
  `location_capacity` smallint(6) NOT NULL,
  `city_id` tinyint(3) unsigned NOT NULL,
  `gps_coords` varchar(30) DEFAULT NULL,
  PRIMARY KEY (`location_id`),
  KEY `city_id` (`city_id`),
  CONSTRAINT `locations_ibfk_1` FOREIGN KEY (`city_id`) REFERENCES `city` (`city_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=latin1

The city table:

CREATE TABLE `city` (
  `city_id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
  `city_name` varchar(50) NOT NULL,
  PRIMARY KEY (`city_id`),
  UNIQUE KEY `city_name` (`city_name`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1 

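【Comments】:

s.petrunia.net/blog/?p=18 Using a join buffer is usually a good thing (but it also suggests that your query could probably be tuned).

This execution plan looks silly. But given the number of rows in the tables, the engine can probably perform a cross join between all the tables faster than it can find a good execution plan.

【Answer 1】: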

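As it says in 'http://dev.mysql.com/doc/refman/5.1/en/explain-output.html': "Tables from earlier joins are read in portions into the join buffer, and then their rows are used from the buffer to perform the join with the current table."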

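So in your case, the rows produced by the tables read earlier in the plan (city, locations and events) are held in the join buffer, and the optimizer joins them against event_type from there.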
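In the plan above, "Using join buffer" appears on the event_type row, which is read with type = index and ref = NULL, so no earlier table supplies a lookup key for it. As a rough sketch of how one might inspect the buffer size and see whether the server has been counting joins that do not use indexes, the statements below read the `join_buffer_size` variable and the `Select_full_join` and `Select_range_check` status counters. The value passed to `SET SESSION` is purely illustrative, not a recommendation, and what counts as a worrying counter value is left as an assumption for the reader.

```sql
-- Size, in bytes, of each join buffer for the current session.
SHOW VARIABLES LIKE 'join_buffer_size';

-- Cumulative counters since server start:
--   Select_full_join   : joins that performed table scans because they used no index
--   Select_range_check : joins without keys that check for key usage after each row
SHOW GLOBAL STATUS LIKE 'Select_full_join';
SHOW GLOBAL STATUS LIKE 'Select_range_check';

-- Change the buffer size for the current session only.
-- 262144 is an arbitrary illustrative value (an assumption, not a tuned setting).
SET SESSION join_buffer_size = 262144;
```

Because the counters are server-wide totals, running the query once and comparing the values before and after is only meaningful on an otherwise idle server.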

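Using the buffer is a good thing; you probably noticed the less welcome "Using temporary; Using filesort" in the first row of the EXPLAIN output, which most likely comes from the GROUP BY and may be unavoidable in this case.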
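If the order in which the groups come back does not matter, one documented option on MySQL versions before 8.0 is to append `ORDER BY NULL`, which asks the server to skip the implicit sort that GROUP BY otherwise performs. The sketch below reuses the query from the question unchanged apart from that clause. The temporary table may well remain, so this would only be expected to affect the "Using filesort" part, and the outcome should be confirmed with EXPLAIN rather than assumed.

```sql
-- Same query as in the question, with ORDER BY NULL appended
-- to suppress the implicit GROUP BY sort (MySQL versions before 8.0).
EXPLAIN SELECT events.event_topic_id, event_topic_name, event_topic_image,
       event_type_name, city_name
FROM events
JOIN event_topic ON event_topic.event_topic_id = events.event_topic_id
JOIN event_type  ON event_type.event_type_id = event_topic.event_type_id
JOIN locations   ON locations.location_id = events.location_id
JOIN city        ON city.city_id = locations.city_id
WHERE event_date > NOW()
GROUP BY events.event_topic_id, city.city_id
ORDER BY NULL;
```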

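By the way, are you going to run into trouble with the UNIQUE constraint on city_name? I am thinking of Springfield (there are two in New Jersey), Washington, Greenville, etc.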

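【Comments】:

OK, that makes sense. The reason I looked into it is that mysqltuner complains that there are "joins performed without indexes" and that I should increase join_buffer_size or "always use indexes with joins". Is there a way to know how much information this query needs to buffer? As for the cities being unique: events in this system will only be held in the major cities of South Africa, so there will not be any cities with the same name. Thanks for pointing it out!

Fair enough; I was assuming you were writing a system for the US, so I was wrong.

【Answer 2】: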

Try using STRAIGHT_JOIN and FORCE INDEX:

-- STRAIGHT_JOIN reads the tables in the order written; FORCE INDEX (PRIMARY) forces primary-key lookups.
EXPLAIN SELECT events.event_topic_id, event_topic_name, event_topic_image, event_type_name, city_name
FROM events
    STRAIGHT_JOIN event_topic FORCE INDEX (PRIMARY) ON event_topic.event_topic_id = events.event_topic_id
    STRAIGHT_JOIN event_type FORCE INDEX (PRIMARY) ON event_type.event_type_id = event_topic.event_type_id
    STRAIGHT_JOIN locations FORCE INDEX (PRIMARY) ON locations.location_id = events.location_id
    STRAIGHT_JOIN city FORCE INDEX (PRIMARY) ON city.city_id = locations.city_id
WHERE event_date > NOW()
GROUP BY events.event_topic_id, city.city_id;

By the way, using a join buffer is not a good thing. It means you need to improve your indexes or make the query use the right ones.
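A narrower variant, offered only as a sketch: rather than fixing the whole join order and forcing every index, STRAIGHT_JOIN can be applied just between event_topic and event_type, so that event_type is read after event_topic and can be looked up through its primary key. Whether the optimizer then drops "Using join buffer" is an assumption to verify with EXPLAIN. With only about two rows in event_type, the difference is likely to be small either way.

```sql
-- Only the event_topic -> event_type step is pinned with STRAIGHT_JOIN;
-- the optimizer remains free to order the other tables.
EXPLAIN SELECT events.event_topic_id, event_topic_name, event_topic_image,
       event_type_name, city_name
FROM events
JOIN event_topic ON event_topic.event_topic_id = events.event_topic_id
STRAIGHT_JOIN event_type ON event_type.event_type_id = event_topic.event_type_id
JOIN locations ON locations.location_id = events.location_id
JOIN city ON city.city_id = locations.city_id
WHERE event_date > NOW()
GROUP BY events.event_topic_id, city.city_id;
```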
