How can I improve this JOIN operation between a key table and another table?

【Posted】: 2013-10-07 17:32:55

【Question】:

My problem is that I have two different queries, each of which performs well in a different situation.

Schema

  messages 
      message_id, entity_id, message, timestamp

   subscription
      user_id, entity_id

   users
      user_id

   entities
      entity_id

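Situation 1: many message entries, with at least one related subscription entry

Situation 2: few message entries and/or few or zero related subscription entries

My two queries are: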

 SELECT messages.*
   FROM messages
   STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

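This query performs well in situation 1 (0.000x seconds): many message entries and at least one related subscription entry. In situation 2, it takes more than 1.7 seconds.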

 SELECT messages.*
   FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

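This query performs well in situation 2 (0.000x seconds): few message entries and/or few or zero related subscription entries. In situation 1, it takes more than 1.3 seconds.

Is there a single query that gives me the best of both worlds? If not, what is the best way to handle this case?

Indexes: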

( subscription.user_id, subscription.entity_id )
( subscription.entity_id )
( messages.entity_id, messages.timestamp )
( messages.timestamp )

EXPLAIN output

With LIMIT 50

| id | select_type | table             | type   | possible_keys                           | key           | key_len | ref                                    | rows | Extra       |
|  1 | SIMPLE      | messages          | index  | idx_timestamp                           | idx_timestamp | 4       | NULL                                   |   50 |             |
|  1 | SIMPLE      | subscription      | eq_ref | PRIMARY,entity_id,user_id               | PRIMARY       | 16      | const, messages.entity_id              |    1 | Using index |

Without LIMIT

| id | select_type | table             | type   | possible_keys                           | key           | key_len | ref                                    |   rows   | Extra         |
|  1 | SIMPLE      | messages          | ALL    | entity_id_2,entity_id                   | NULL          | NULL    | NULL                                   |   255069 | Using filesort|
|  1 | SIMPLE      | subscription      | eq_ref | PRIMARY,entity_id,user_id               | PRIMARY       | 16      | const, messages.entity_id              |        1 | Using index   |
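
Plans of this shape come from prefixing a query with EXPLAIN. The following is a minimal sketch, assuming the two plans above were taken from the same query with and without the LIMIT clause; the post does not say which join variant was used, so INNER JOIN is an assumption here.

```sql
-- Plan with the row limit in place
EXPLAIN
SELECT messages.*
FROM messages
INNER JOIN subscription ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC
LIMIT 50;

-- Plan for the same query without the limit
EXPLAIN
SELECT messages.*
FROM messages
INNER JOIN subscription ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC;
```

In the output, the `key` column shows which index was chosen for each table, and `Using filesort` in `Extra` means the rows are sorted in a separate step rather than read in index order.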

CREATE TABLE statements:

About 5,000 rows

subscription | CREATE TABLE `subscription` (
  `user_id`   bigint(20) unsigned NOT NULL,
  `entity_id` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`user_id`,`entity_id`),
  KEY `entity_id` (`entity_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

About 255,000 rows

messages | CREATE TABLE `messages` (
  `message_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `entity_id` bigint(20) unsigned NOT NULL,
  `message` varchar(255) NOT NULL DEFAULT '',
  `timestamp` int(10) unsigned NOT NULL,
  PRIMARY KEY (`message_id`),
  KEY `entity_id` (`entity_id`,`timestamp`),
  KEY `idx_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 

【Question comments】:

【Answer 1】:

Try changing your WHERE to AND:

SELECT messages.*
   FROM messages
   STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
        AND subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

SELECT messages.*
   FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
           AND subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

Or perhaps this:

SELECT messages.*
FROM subscription 
STRAIGHT_JOIN messages ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC 
LIMIT 50
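
For an inner join, a condition placed in ON and the same condition placed in WHERE select the same rows, so the first two rewrites differ mainly in spelling. The third one changes which table STRAIGHT_JOIN reads first. One way to see whether any of them changes the plan is to run each under EXPLAIN. This is a minimal sketch, assuming the schema from the question; only the EXPLAIN keyword is added to the queries above.

```sql
-- Condition moved into ON: same result set as the WHERE form for an inner join
EXPLAIN
SELECT messages.*
FROM messages
INNER JOIN subscription
        ON subscription.entity_id = messages.entity_id
       AND subscription.user_id = 1
ORDER BY messages.timestamp DESC
LIMIT 50;

-- subscription listed first: STRAIGHT_JOIN reads the left table before the right one
EXPLAIN
SELECT messages.*
FROM subscription
STRAIGHT_JOIN messages ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC
LIMIT 50;
```

If the `key` and `Extra` columns match those of the original queries, the rewrite has probably not changed how the query is executed.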

【Discussion】:

Thanks for the suggestions, but they don't seem to solve the problem.
