Primary key index with a DATETIME as first part of the compound key is never used
Posted: 2011-12-30 13:28:38

Question:
I have a problem indexing a DATETIME (or even a DATE) as the first part of my primary key.
I use MySQL 5.5.
Here are my two tables:
-- This is my standard table with dateDim as a dateTime
CREATE TABLE `stats` (
`dateDim` datetime NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- Here is a copy with dateDim as an integer
CREATE TABLE `stats_todays` (
`dateDim` int(11) unsigned NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
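I fill both tables with exactly the same data (close to 10,000,000 rows).
But:
- the stats table uses a DATETIME for dateDim
- stats_todays uses an INTEGER with TO_DAYS() for dateDim

My question is: why does MySQL NOT use the primary key when the first part of the index is a DATETIME? This is very strange, because with the same data stored as an INTEGER via TO_DAYS(dateDim), the same query is extremely fast.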
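For reference, the integer copy could be populated from the DATETIME table roughly as follows. The question does not show the actual load statement, so this INSERT ... SELECT is an assumption; it only avoids duplicate-key errors if dateDim in `stats` holds at most one distinct time per day.

```sql
-- Assumed load step (not shown in the question): copy every row of `stats`
-- into `stats_todays`, converting the DATETIME key to a day number.
INSERT INTO `stats_todays`
    (`dateDim`, `accountDim`, `execCodeDim`, `operationTypeDim`,
     `junkDim`, `ipCountryDim`, `count`, `amount`)
SELECT
    TO_DAYS(`dateDim`), `accountDim`, `execCodeDim`, `operationTypeDim`,
    `junkDim`, `ipCountryDim`, `count`, `amount`
FROM `stats`;
```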
Example on the stats table (with DATETIME):
SELECT *
FROM `stats`
WHERE
dateDim = '2014-04-03 00:00:00'
AND accountDim = 4
AND execCodeDim = 9
AND operationTypeDim = 1
AND junkDim = 5
AND ipCountryDim = 3
=> 1 result (4.5sec)
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE stats ALL NULL NULL NULL NULL 8832329 Using where
The same request on the other table, stats_todays (using INTEGER and TO_DAYS()):
EXPLAIN SELECT *
FROM `stats_todays`
WHERE
dateDim = TO_DAYS('2014-04-03 00:00:00')
AND accountDim = 4
AND execCodeDim = 9
AND operationTypeDim = 1
AND junkDim = 5
AND ipCountryDim = 3
=> Result 1 row (0.0003 sec)
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE stats_todays const PRIMARY PRIMARY 13 const,const,const,const,const,const 1
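If you read the whole post, you will see that this is not a low-cardinality problem, because the request works with exactly the same cardinality when dateDim is an INTEGER field.
Here are some more details: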
SELECT COUNT( DISTINCT dateDim )
FROM stats_todays
UNION ALL
SELECT COUNT( DISTINCT dateDim )
FROM stats;
Result:
COUNT(DISTINCT dateDim)
2192
2192
Here is the index description:
SHOW INDEXES FROM `stats`
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
stats 0 PRIMARY 1 dateDim A 6921 NULL NULL BTREE
stats 0 PRIMARY 2 accountDim A 883232 NULL NULL BTREE
stats 0 PRIMARY 3 execCodeDim A 8832329 NULL NULL BTREE
stats 0 PRIMARY 4 operationTypeDim A 8832329 NULL NULL BTREE
stats 0 PRIMARY 5 junkDim A 8832329 NULL NULL BTREE
stats 0 PRIMARY 6 ipCountryDim A 8832329 NULL NULL BTREE
SHOW INDEXES FROM `stats_todays`
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
stats_todays 0 PRIMARY 1 dateDim A 7518 NULL NULL BTREE
stats_todays 0 PRIMARY 2 accountDim A 4022582 NULL NULL BTREE
stats_todays 0 PRIMARY 3 execCodeDim A 8045164 NULL NULL BTREE
stats_todays 0 PRIMARY 4 operationTypeDim A 8045164 NULL NULL BTREE
stats_todays 0 PRIMARY 5 junkDim A 8045164 NULL NULL BTREE
stats_todays 0 PRIMARY 6 ipCountryDim A 8045164 NULL NULL BTREE
SELECT dateDim, COUNT(*) FROM stats GROUP BY dateDim WITH ROLLUP
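This shows that there are 2192 distinct dates and that the distribution is even (about 3000 - 4000 rows per date).
- There are 8,831,990 rows in the table
- Same for the other table
- I tried a COVERING INDEX (replacing * with all PK columns) => nothing changed
- I tried FORCE|USE INDEX => nothing changed
- Same with a DATE field instead of DATETIME
- Same with an INDEX or UNIQUE key instead of a primary key

Comments:
This is really strange. Does the same thing happen if you use date instead of datetime?
Yes, exactly the same.
What if you run WHERE dateDim = DATE('2014-04-03 00:00:00')?
It works fine if I reorder the PK. But in practice I want to run requests with only dateDim and accountDim in the WHERE clause. I used all the PK fields for the case study...
WHERE dateDim = DATE('2014-04-03 00:00:00') => nothing changed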
Answer 1:
This is a bug in 5.5.x. See here
It suggests that your query should be
SELECT *
FROM `stats`
WHERE
dateDim = CAST('2014-04-03 00:00:00' as datetime)
AND accountDim = 4
AND execCodeDim = 9
AND operationTypeDim = 1
AND junkDim = 5
AND ipCountryDim = 3
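Since the comments say the real workload filters only on dateDim and accountDim, the same CAST workaround can be checked against that leftmost-prefix query with EXPLAIN. This is a sketch only: the resulting plan has not been verified here and may depend on the exact 5.5.x release.

```sql
-- Check whether the optimizer picks PRIMARY once the literal is cast
-- explicitly to DATETIME. Only the two leading key columns are filtered,
-- which is the access pattern described in the comments.
EXPLAIN
SELECT *
FROM `stats`
WHERE dateDim = CAST('2014-04-03 00:00:00' AS DATETIME)
  AND accountDim = 4;
```

If the workaround applies, the key column of the EXPLAIN output should show PRIMARY instead of NULL.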
Answer 2:
Since the int version of the table
CREATE TABLE `stats_todays` (
`dateDim` int(11) unsigned NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
works fine for querying, you should make dateDim contain the UNIX_TIMESTAMP() of the datetime string. Your query would then look more like this:
SELECT *
FROM `stats`
WHERE
dateDim = UNIX_TIMESTAMP('2014-04-03 00:00:00')
AND accountDim = 4
AND execCodeDim = 9
AND operationTypeDim = 1
AND junkDim = 5
AND ipCountryDim = 3
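One possible way to try this without touching the existing tables is sketched below. The table name `stats_unix` is hypothetical, and the load statement is an assumption rather than something taken from the answer. Note that UNIX_TIMESTAMP() interprets its argument in the session time zone, so the load and the later queries should run with the same time_zone setting.

```sql
-- Hypothetical table `stats_unix`: same layout as `stats_todays`,
-- but dateDim stores seconds since the Unix epoch instead of a day number.
CREATE TABLE `stats_unix` LIKE `stats_todays`;

-- Copy the data, converting the DATETIME key to an integer timestamp.
INSERT INTO `stats_unix`
    (`dateDim`, `accountDim`, `execCodeDim`, `operationTypeDim`,
     `junkDim`, `ipCountryDim`, `count`, `amount`)
SELECT
    UNIX_TIMESTAMP(`dateDim`), `accountDim`, `execCodeDim`, `operationTypeDim`,
    `junkDim`, `ipCountryDim`, `count`, `amount`
FROM `stats`;

-- Inspect the plan for a lookup on the two leading key columns.
EXPLAIN
SELECT *
FROM `stats_unix`
WHERE dateDim = UNIX_TIMESTAMP('2014-04-03 00:00:00')
  AND accountDim = 4;
```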