MySQL + big tables = slow queries?

Posted 2023-04-14


【Title】MySQL + big tables = slow queries? 【Posted】2012-05-07 15:35:39 【Question】:

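I have some performance problems with a big table on MySQL: the table has 38 million rows and is 3 GB in size. I want to select rows by filtering on two columns. I have tried many indexes (one index per column, and one index covering both columns), but my query is still slow. As shown below, fetching 1644 rows takes more than 4 seconds: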

SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` WHERE (`twstats_twwordstrend`.`word_id` = 1001 AND `twstats_twwordstrend`.`created` > '2011-11-07 14:01:34' );
...
...
...
1644 rows in set (4.66 sec)

EXPLAIN SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` WHERE (`twstats_twwordstrend`.`word_id` = 1001 AND `twstats_twwordstrend`.`created` > '2011-11-07 14:01:34' );
+----+-------------+----------------------+-------+-----------------------------------------------------+-----------------------+---------+------+------+-------------+
| id | select_type | table                | type  | possible_keys                                       | key                   | key_len | ref  | rows | Extra       |
+----+-------------+----------------------+-------+-----------------------------------------------------+-----------------------+---------+------+------+-------------+
|  1 | SIMPLE      | twstats_twwordstrend | range | twstats_twwordstrend_4b95d890,word_id_created_index | word_id_created_index | 12      | NULL | 1643 | Using where |
+----+-------------+----------------------+-------+-----------------------------------------------------+-----------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> describe twstats_twwordstrend;
+---------+----------+------+-----+---------+----------------+
| Field   | Type     | Null | Key | Default | Extra          |
+---------+----------+------+-----+---------+----------------+
| id      | int(11)  | NO   | PRI | NULL    | auto_increment |
| created | datetime | NO   |     | NULL    |                |
| freq    | double   | NO   |     | NULL    |                |
| word_id | int(11)  | NO   | MUL | NULL    |                |
+---------+----------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> show index from twstats_twwordstrend;
+----------------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                | Non_unique | Key_name                      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| twstats_twwordstrend |          0 | PRIMARY                       |            1 | id          | A         |    38676897 |     NULL | NULL   |      | BTREE      |         |               |
| twstats_twwordstrend |          1 | twstats_twwordstrend_4b95d890 |            1 | word_id     | A         |      655540 |     NULL | NULL   |      | BTREE      |         |               |
| twstats_twwordstrend |          1 | word_id_created_index         |            1 | word_id     | A         |      257845 |     NULL | NULL   |      | BTREE      |         |               |
| twstats_twwordstrend |          1 | word_id_created_index         |            2 | created     | A         |    38676897 |     NULL | NULL   |      | BTREE      |         |               |
+----------------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.03 sec)

I also noticed that fetching just a single row far into the table is very slow:

mysql> SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` limit 10000000,1;
+----------+---------------------+--------------------+---------+
| id       | created             | freq               | word_id |
+----------+---------------------+--------------------+---------+
| 10000001 | 2011-09-09 15:59:18 | 0.0013398539559188 |   41295 |
+----------+---------------------+--------------------+---------+
1 row in set (1.73 sec)

...while it is not slow at the start of the table:

mysql> SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` limit 1,1;
+----+---------------------+---------------------+---------+
| id | created             | freq                | word_id |
+----+---------------------+---------------------+---------+
|  2 | 2011-06-16 10:59:06 | 0.00237777777777778 |       2 |
+----+---------------------+---------------------+---------+
1 row in set (0.00 sec)

The table uses the InnoDB engine. How can I speed up queries on big tables?

【Comments】:

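Basically, you could use MyISAM instead of InnoDB if you do not need InnoDB's functionality.

Your "far away" argument is invalid, because selecting LIMIT 100000,1 actually selects 100000 rows and then sends the next one. Try selecting ID 2 or ID 10000001: both will come back immediately.

How do the records stored in this table usually change? Are there many deletes/updates, or mainly inserts?

@korenak: As far as I can tell, it selects all the rows in both cases, but only returns the one at the selected offset.

【Answer 1】: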

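The main thing you can do is add indexes.

Whenever you use a column in a WHERE clause, make sure it has an index. There is none on your created column.

The multi-column index that includes created is, in essence, not an index on created, because created is not the first column in that index.

When using multi-column indexes, you should almost always put the column with the higher cardinality first. So setting up the indexes as (created, word_id) and (word_id) would give you a significant boost.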

【Comments】:

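+1 Correct. There should be an index on (word_id, created) for the query to get the most benefit.

It is not in my example, but I have already tested with an index on the created column and also with the multi-column index (created, word_id)...

@Eric: I have a similar problem to yours. How did you solve it? My table also has around 10 million rows. The query takes about 1.2 seconds, which is not acceptable; it is expected to finish within 200 ms. I also used a multi-column index in the same column order as the query, and the result is still bad.

【Answer 2】: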

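A query with LIMIT 10000000,1 will always be slow, because it has to fetch more than 10 million rows (and it discards all of them except the last one). If your application needs this kind of query on a regular basis, consider a redesign.

Tables have no "start" and "end"; they are not inherently ordered.

It looks to me like you need an index on (word_id, created).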
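As a rough illustration of the (word_id, created) suggestion, the sketch below builds that index and asks the engine how it would run the query from the question. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; that substitution is an assumption of this example, and MySQL's optimizer and EXPLAIN output differ. The table and index names are taken from the question, while the column types are simplified.

```python
import sqlite3

# SQLite stands in for MySQL here so the sketch is runnable without a server.
# The schema mirrors the table in the question, with simplified column types.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE twstats_twwordstrend (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        created TEXT NOT NULL,
        freq REAL NOT NULL,
        word_id INTEGER NOT NULL
    )
    """
)
# Equality column (word_id) first, range column (created) second.
conn.execute(
    "CREATE INDEX word_id_created_index "
    "ON twstats_twwordstrend (word_id, created)"
)

query = (
    "SELECT id, created, freq, word_id FROM twstats_twwordstrend "
    "WHERE word_id = ? AND created > ?"
)
params = (1001, "2011-11-07 14:01:34")

# Each plan row has four columns; the last one is the readable description.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)]
for detail in plan:
    print(detail)
```

The printed plan should describe a search that uses word_id_created_index rather than a full table scan. On the real MySQL table, the equivalent check is the EXPLAIN statement already shown in the question.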

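You should definitely performance-test this on a non-production server with production-grade hardware.

Incidentally, a 3 GB database is not very big these days; it will fit in RAM on all but the smallest servers (you are running a 64-bit OS, right, and have tuned innodb_buffer_pool appropriately? Or your sysadmin has?).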

【Comments】:

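About your limit/offset remark: when displaying a list, say from row 10000000 to row 10000030, what would a fast query look like?

LIMIT causes the server to read, but discard, rows. LIMIT 10 million, anything, is going to be slow. If your application needs these kinds of queries, then it is badly designed and needs fixing.

I would like to know how I can make this kind of query faster. Say I paginate with 24 items per page and have 1 million rows: how can I do this more efficiently? Thanks.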
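One common way to approach the pagination question is keyset (or "seek") pagination, which nobody in the thread spells out: instead of an offset, each request passes the last id it has already seen, so the server can seek through the primary key rather than read and discard rows. The sketch below is a minimal illustration under stated assumptions: it uses Python's sqlite3 module as a stand-in for MySQL, the sample rows are invented, and fetch_page is a hypothetical helper name.

```python
import sqlite3

PAGE_SIZE = 24  # page size mentioned in the comment above

# SQLite stands in for MySQL; the sample data below is invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE twstats_twwordstrend ("
    "id INTEGER PRIMARY KEY, created TEXT NOT NULL, "
    "freq REAL NOT NULL, word_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO twstats_twwordstrend (id, created, freq, word_id) "
    "VALUES (?, ?, ?, ?)",
    [(i, "2011-06-16 10:59:06", 0.001 * i, i % 7) for i in range(1, 101)],
)


def fetch_page(conn, after_id=0, page_size=PAGE_SIZE):
    """Hypothetical helper: return the next rows whose id is above after_id."""
    return conn.execute(
        "SELECT id, created, freq, word_id FROM twstats_twwordstrend "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


first_page = fetch_page(conn)
# The last id of one page becomes the cursor for the next page.
second_page = fetch_page(conn, after_id=first_page[-1][0])
print(first_page[0][0], first_page[-1][0])
print(second_page[0][0], second_page[-1][0])
```

The trade-off is that this only supports "next page" style navigation; jumping straight to an arbitrary page number still needs an offset or some precomputed mapping from page numbers to ids.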
