20180509 MySQL 5.7 New Features: Using Virtual Columns
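Abstract
MySQL 5.7 supports two kinds of generated column: the Virtual Generated Column and the Stored Generated Column. The former keeps only the column definition in the data dictionary (the table's metadata) and does not persist the column's values to disk; the latter persists the values to disk instead of computing them on every read. A stored column therefore holds data that could be derived from existing data and needs more disk space, so for most uses it offers little advantage over a virtual column. Accordingly, when the type of a generated column is not specified, MySQL 5.7 defaults to a Virtual Generated Column.
- If you think you need a Stored Generated Column, creating an index on a Virtual Generated Column is probably the better fit.
Syntax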
<type> [ GENERATED ALWAYS ] AS ( <expression> ) [ VIRTUAL|STORED ] [ UNIQUE [ KEY ] ] [ NOT NULL ] [ COMMENT <text> ]
In practice
1. Table structure
mysql> show create table fen_simpic \G
*************************** 1. row ***************************
Table: fen_simpic
Create Table: CREATE TABLE `fen_simpic` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`group` int(11) NOT NULL COMMENT 'post ID of the video the screenshot came from',
`item` int(2) NOT NULL COMMENT 'sequence number of the screenshot',
`mh` char(144) DEFAULT NULL COMMENT 'Hamming hash of the screenshot',
`dct` bigint(20) unsigned DEFAULT NULL COMMENT 'dct hash of the screenshot',
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record creation time',
PRIMARY KEY (`id`),
KEY `created_at` (`created_at`),
KEY `group` (`group`,`item`)
) ENGINE=InnoDB AUTO_INCREMENT=2599837 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql>
2. The slow SQL and its execution plan
mysql> explain select `group`, `item` , dct , mh, bit_count(dct^17228540329887592107) as dist from fen_simpic force index(created_at) where created_at<"2018-05-08 21:44:09" and created_at>"2018-04-09 10:15:50.463238" and `group` not in (120381696,120381705,120381709,120381714,120381718,120381736,120381747,120381753,120381763,120381776,120381787,120381808,120381820,120381837,120381857,120381861,120382022,120381776) and (`item`>=3 and `item`<=5) having dist<=26 order by dist limit 5000;
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+----------------------------------------------------+
| 1 | SIMPLE | fen_simpic | NULL | range | created_at | created_at | 4 | NULL | 1071840 | 5.55 | Using index condition; Using where; Using filesort |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+----------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql>
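The expression bit_count(dct^17228540329887592107) in the query above counts the bits that differ between a row's 64-bit dct hash and the target hash, which is the Hamming distance between the two. A minimal Python sketch of the same computation follows; the function name hamming_distance is a hypothetical helper for illustration, not part of the original setup.

```python
# Target hash taken from the query above.
TARGET = 17228540329887592107

def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two unsigned 64-bit integers.

    Mirrors MySQL's BIT_COUNT(a ^ b) for BIGINT UNSIGNED values.
    """
    mask = (1 << 64) - 1          # keep the result within 64 bits
    return bin((a ^ b) & mask).count("1")

# Flipping two bits of the target gives a distance of 2.
near_duplicate = TARGET ^ 0b101
distance = hamming_distance(TARGET, near_duplicate)
```

Because the inputs are 64-bit values, the result always lies between 0 and 64, so a small integer type is enough to hold it.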
3. Query timing
mysql> show profile for query 52;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.008504 |
| checking permissions | 0.000009 |
| Opening tables | 0.000028 |
| init | 0.000049 |
| System lock | 0.000012 |
| optimizing | 0.000017 |
| statistics | 0.000107 |
| preparing | 0.000025 |
| Sorting result | 0.000006 |
| executing | 0.000003 |
| Sending data | 0.000010 |
| Creating sort index | 1.088568 |
| end | 0.000011 |
| query end | 0.000013 |
| closing tables | 0.000010 |
| freeing items | 0.000270 |
| logging slow query | 0.000060 |
| cleaning up | 0.000018 |
+----------------------+----------+
18 rows in set, 1 warning (0.00 sec)
4. Create the virtual column
mysql> alter table fen_simpic add column dist tinyint(1) generated always as (bit_count(dct^17228540329887592107)) virtual;
mysql> alter table fen_simpic add index idx_dist(dist);
mysql> show create table fen_simpic \G
*************************** 1. row ***************************
Table: fen_simpic
Create Table: CREATE TABLE `fen_simpic` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`group` int(11) NOT NULL COMMENT 'post ID of the video the screenshot came from',
`item` int(2) NOT NULL COMMENT 'sequence number of the screenshot',
`mh` char(144) DEFAULT NULL COMMENT 'Hamming hash of the screenshot',
`dct` bigint(20) unsigned DEFAULT NULL COMMENT 'dct hash of the screenshot',
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record creation time',
`dist` tinyint(1) GENERATED ALWAYS AS (bit_count((`dct` ^ 17228540329887592107))) VIRTUAL,
PRIMARY KEY (`id`),
KEY `created_at` (`created_at`),
KEY `group` (`group`,`item`),
KEY `idx_dist` (`dist`)
) ENGINE=InnoDB AUTO_INCREMENT=2599837 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql>
5. Run the SQL
mysql> explain select `group`, `item` , dct , mh, dist from fen_simpic force index(idx_dist) where created_at<"2018-05-08 21:44:09" and created_at>"2018-04-09 10:15:50.463238" and `group` not in (120381696,120381705,120381709,120381714,120381718,120381736,120381747,120381753,120381763,120381776,120381787,120381808,120381820,120381837,120381857,120381861,120382022,120381776) and (`item`>=3 and `item`<=5) having dist<=26 order by dist limit 5000;
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | fen_simpic | NULL | index | NULL | idx_dist | 2 | NULL | 2502423 | 0.62 | Using where |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql>
6. Query timing
mysql> show profile for query 57;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000133 |
| checking permissions | 0.000009 |
| Opening tables | 0.000029 |
| init | 0.000049 |
| System lock | 0.000012 |
| optimizing | 0.000016 |
| statistics | 0.000029 |
| preparing | 0.000023 |
| Sorting result | 0.000006 |
| executing | 0.000003 |
| Sending data | 0.212587 |
| end | 0.000013 |
| query end | 0.000012 |
| closing tables | 0.000012 |
| freeing items | 0.000279 |
| cleaning up | 0.000018 |
+----------------------+----------+
16 rows in set, 1 warning (0.00 sec)
7. A further-improved SQL
mysql> explain select t1.`group`, t1.`item` , t1.dct , t1.dist from fen_simpic t1 inner join (select id,dist from fen_simpic force index(idx_dist) where created_at<"2018-05-08 21:44:09" and created_at>"2018-04-09 10:15:50.463238" and `group` not in (120381696,120381705,120381709,120381714,120381718,120381736,120381747,120381753,120381763,120381776,120381787,120381808,120381820,120381837,120381857,120381861,120382022,
+----+-------------+------------+------------+--------+---------------+----------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+---------------+----------+---------+-------+---------+----------+-------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 5000 | 100.00 | NULL |
| 1 | PRIMARY | t1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | t2.id | 1 | 100.00 | NULL |
| 2 | DERIVED | fen_simpic | NULL | index | NULL | idx_dist | 2 | NULL | 2502423 | 0.62 | Using where |
+----+-------------+------------+------------+--------+---------------+----------+---------+-------+---------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
mysql>
8. Timing of the further-improved SQL
mysql> show profile for query 58;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.005367 |
| checking permissions | 0.000007 |
| checking permissions | 0.000005 |
| Opening tables | 0.000032 |
| init | 0.000081 |
| System lock | 0.000013 |
| optimizing | 0.000015 |
| optimizing | 0.000015 |
| statistics | 0.000031 |
| preparing | 0.000023 |
| Sorting result | 0.000010 |
| statistics | 0.000026 |
| preparing | 0.000011 |
| executing | 0.000009 |
| Sending data | 0.000009 |
| executing | 0.000002 |
| Sending data | 0.201685 |
| end | 0.000012 |
| query end | 0.000013 |
| closing tables | 0.000005 |
| removing tmp table | 0.000008 |
| closing tables | 0.000009 |
| freeing items | 0.000340 |
| cleaning up | 0.000028 |
+----------------------+----------+
24 rows in set, 1 warning (0.00 sec)
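Conclusions
- The original SQL used force index(created_at) because, once all the filters are applied, the matching rows often exceed roughly 30% of the table, at which point the optimizer tends to choose a full table scan instead of the index. That is why the index is forced. When picking an index, prefer the one with the higher selectivity.
- The profile shows that almost all of the time goes to Creating sort index. The sort key is dist, which is not a real column in the original table, so MySQL has to compute it for every row and then sort the results (the Using filesort in the plan).
- Virtual columns help a great deal when sorting and filtering on a computed value like this one.
- The further rewrite after the optimization is really meant to reduce the amount of data returned.
References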
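The query in step 7 is truncated, but its plan (a 5000-row derived table joined back to the base table through the primary key) suggests the common "deferred join" pattern: pick the ids from the indexed column first, then fetch the wide columns only for those ids. The sketch below shows that pattern under loud assumptions: it uses SQLite through Python's sqlite3 module rather than MySQL, a toy table named pics, and a plain dist column instead of a generated one.

```python
import sqlite3

# Toy stand-in for fen_simpic: a wide payload column plus an indexed dist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pics (id INTEGER PRIMARY KEY, payload TEXT, dist INTEGER)"
)
conn.execute("CREATE INDEX idx_dist ON pics (dist)")
rows = [(i, "row-%d" % i, (i * 7) % 65) for i in range(1, 1001)]
conn.executemany("INSERT INTO pics VALUES (?, ?, ?)", rows)

# Inner query: narrow (id, dist) only, filtered, ordered and limited.
# Outer query: join back on the primary key to fetch the wide columns.
deferred_join = """
SELECT t1.id, t1.payload, t1.dist
FROM pics AS t1
JOIN (
    SELECT id, dist FROM pics
    WHERE dist <= 26
    ORDER BY dist
    LIMIT 50
) AS t2 ON t1.id = t2.id
ORDER BY t1.dist
"""
result = conn.execute(deferred_join).fetchall()
```

The inner query only touches narrow columns, so the expensive row lookups happen for at most LIMIT rows. Whether this wins in MySQL depends on the data and the plan, as the timings in step 8 show only a modest gain over step 6.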
http://www.cnblogs.com/raichen/p/5227449.html