Query runs very slowly when using NOT IN ( SELECT ... )

【Posted】: 2016-11-14 09:34:42 【Question】:

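I have a working MySQL query, but after I added some conditions using NOT IN it became very slow. Is there anything I can use in place of NOT IN that still returns the same result? Thanks for your help!

My query: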

SELECT 
    CustomerId 
FROM 
    mktg_account ma  
    LEFT JOIN mktg_account_customer mac
        on mac.AccountId = ma.AccountId   
WHERE
    IsPurge = 1 
    AND mac.CustomerId NOT IN (SELECT mac.CustomerId 
                               FROM mktg_account ma  
                               LEFT JOIN mktg_account_customer mac
                                   ON mac.AccountId = ma.AccountId 
                               where IsPurge =0)  
    AND mac.CustomerId NOT IN (SELECT CustomerId 
                               FROM mktg_unit_booking 
                               WHERE DeadlineDate > Now()
                               AND IsDeleted <> 1 
                               AND IsApproved=1)  
    AND mac.CustomerId NOT IN (SELECT mr.CustomerId 
                               FROM mktg_reservation mr 
                               LEFT JOIN mktg_reservation_customer mrc 
                                   on mr.ReservationId = mrc.ReservationId 
                               WHERE IsDeleted <> 1 
                               AND IsApproved=1  
                               AND DeadlineDate > Now())  
    AND mac.CustomerId NOT IN (SELECT CustomerId 
                               FROM mktg_customer 
                               WHERE IsDeleted = 1 
                               OR IsApproved <> 1 )  
    AND IsDeleted <> 1 
    AND IsApproved = 1
GROUP BY 
    ma.TreeId, mac.CustomerId

Related tables:

CREATE TABLE IF NOT EXISTS mktg_account (
  AccountId int(10) unsigned NOT NULL AUTO_INCREMENT,
  AccountNo varchar(30) NOT NULL DEFAULT '',
  AccountStatus varchar(1) DEFAULT NULL,
  TreeId int(10) unsigned NOT NULL DEFAULT '0',
  SalesDate datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PurchasePrice double NOT NULL DEFAULT '0',
  SalesPersonId int(10) unsigned DEFAULT NULL,
  FinancialTypeId int(10) unsigned DEFAULT NULL,
  BillingCustomerId int(10) unsigned DEFAULT NULL,
  EventId int(11) NOT NULL DEFAULT '0',
  CategoryId int(11) NOT NULL DEFAULT '0',
  RealEstateAgentId int(11) NOT NULL DEFAULT '0',
  BusinessSourceId int(10) unsigned DEFAULT NULL,
  BusinessSourceOthers varchar(300) DEFAULT NULL,
  SalesPromotionId int(10) unsigned DEFAULT NULL,
  Remarks text,
  AgentName varchar(100) NOT NULL DEFAULT '',
  AgentCompany varchar(100) NOT NULL DEFAULT '',
  AgentContact varchar(100) DEFAULT NULL,
  AgentRemarks text,
  CreatedDateTime datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  CreatedBy int(10) unsigned NOT NULL DEFAULT '0',
  PurgedDateTime datetime DEFAULT NULL,
  PurgedBy int(10) unsigned DEFAULT NULL,
  PurgedIP varchar(20) DEFAULT NULL,
  IsDeleted tinyint(3) unsigned NOT NULL DEFAULT '0',
  IsApproved tinyint(3) unsigned NOT NULL DEFAULT '1',
  IsPurge tinyint(3) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (AccountId)
)

CREATE TABLE IF NOT EXISTS `mktg_account_customer` (
  `AccountId` int(10) unsigned NOT NULL DEFAULT '0',
  `CustomerId` int(10) unsigned NOT NULL DEFAULT '0',
  `IsNominee` varchar(1) NOT NULL DEFAULT 'N',
  `SortIdx` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`AccountId`,`CustomerId`),
  KEY `FK_mktg_agreement_customer_1` (`CustomerId`),
  KEY `AccountId` (`AccountId`)
)

CREATE TABLE IF NOT EXISTS `mktg_unit_booking` (
  `BookingId` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `UnitId` int(10) unsigned NOT NULL DEFAULT '0',
  `ProjectLevelId` int(10) unsigned NOT NULL DEFAULT '0',
  `ProductType` varchar(20) NOT NULL DEFAULT '',
  `SalesPersonId` int(10) unsigned DEFAULT NULL,
  `ReserveDate` datetime DEFAULT NULL,
  `DeadlineDate` datetime DEFAULT NULL,
  `CustomerId` int(10) unsigned NOT NULL DEFAULT '0',
  `Remark` varchar(250) DEFAULT NULL,
  `CreatedDateTime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `CreatedBy` int(10) unsigned NOT NULL DEFAULT '0',
  `CreatedIP` varchar(20) NOT NULL DEFAULT '',
  `IsDeleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `IsApproved` tinyint(3) unsigned NOT NULL DEFAULT '1',
  PRIMARY KEY (`BookingId`),
  KEY `UnitId` (`UnitId`),
  KEY `IsDeleted` (`IsDeleted`),
  KEY `IsApproved` (`IsApproved`),
  KEY `CustomerId` (`CustomerId`),
  KEY `DeadlineDate` (`DeadlineDate`)
)

CREATE TABLE IF NOT EXISTS `mktg_reservation` (
  `ReservationId` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `ProjectLevelId` int(10) unsigned NOT NULL DEFAULT '0',
  `SalesPersonId` int(10) unsigned NOT NULL DEFAULT '0',
  `ReserveDate` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `DeadlineDate` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `CustomerId` int(10) unsigned NOT NULL DEFAULT '0',
  `Remark` varchar(500) NOT NULL DEFAULT '',
  `SolicitorId` int(10) unsigned DEFAULT NULL,
  `CreatedDateTime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `CreatedBy` int(10) unsigned NOT NULL DEFAULT '0',
  `CreatedIP` varchar(20) NOT NULL DEFAULT '',
  `IsDeleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `IsApproved` tinyint(3) unsigned NOT NULL DEFAULT '1',
  PRIMARY KEY (`ReservationId`),
  KEY `CustomerId` (`CustomerId`)
)

CREATE TABLE IF NOT EXISTS `mktg_reservation_customer` (
  `ReservationCustomerId` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `ReservationId` int(10) unsigned NOT NULL DEFAULT '0',
  `CustomerId` int(10) unsigned NOT NULL DEFAULT '0',
  `IsNominee` varchar(1) NOT NULL DEFAULT 'N',
  PRIMARY KEY (`ReservationCustomerId`),
  KEY `ReservationId` (`ReservationId`),
  KEY `CustomerId` (`CustomerId`)
)

CREATE TABLE IF NOT EXISTS `mktg_customer` (
  `CustomerId` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `CustomerCode` varchar(20) NOT NULL DEFAULT '',
  `CustomerSeriesNo` varchar(20) NOT NULL DEFAULT '',
  `CustomerFirstName` varchar(45) DEFAULT NULL,
  `CustomerSurname` varchar(45) DEFAULT NULL,
  `ChristianName` varchar(45) DEFAULT NULL,
  `CustomerName` varchar(100) NOT NULL DEFAULT '',
  `BusinessSourceId` int(10) unsigned DEFAULT NULL,
  `ContactMode` varchar(45) DEFAULT NULL,
  `MobilePhone` varchar(45) DEFAULT NULL,
  `Email` varchar(300) DEFAULT NULL,
  `ICNo` varchar(20) DEFAULT NULL,
  `Salutation` varchar(50) NOT NULL DEFAULT '',
  `DateOfBirth` datetime DEFAULT NULL,
  `Gender` varchar(2) NOT NULL DEFAULT '',
  `Occupation` varchar(100) DEFAULT NULL,
  `CorrespondenceTypeId` int(10) unsigned DEFAULT NULL,
  `FinanceSourceId` int(10) unsigned DEFAULT NULL,
  `CustomerGroupId` int(11) DEFAULT NULL,
  `CustomerCategoryId` int(10) unsigned DEFAULT NULL,
  `MailingAddress` varchar(200) NOT NULL DEFAULT '',
  `MailingPostCode` varchar(10) DEFAULT NULL,
  `MailingState` varchar(100) DEFAULT NULL,
  `MailingCountry` varchar(100) DEFAULT NULL,
  `ReceiveMail` varchar(1) DEFAULT NULL,
  `CreatedDateTime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `CreatedBy` int(10) unsigned NOT NULL DEFAULT '0',
  `CreatedIP` varchar(20) NOT NULL DEFAULT '',
  `IsDeleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `IsApproved` tinyint(3) unsigned NOT NULL DEFAULT '1',
  PRIMARY KEY (`CustomerId`)
)

My EXPLAIN output: (screenshot not preserved)

Updated query:

SELECT CustomerId FROM mktg_account ma  
left join mktg_account_customer mac  on mac.AccountId = ma.AccountId   
where IsPurge =1 and not exists (SELECT mac.CustomerId FROM mktg_account ma  
left join mktg_account_customer mac  on mac.AccountId = ma.AccountId 
where IsPurge =0)  
and not exists (SELECT CustomerId FROM mktg_unit_booking where DeadlineDate > Now()  and IsDeleted <> 1 and IsApproved=1)  
and not exists (SELECT mr.CustomerId FROM mktg_reservation mr 
left join mktg_reservation_customer mrc on mr.ReservationId = mrc.ReservationId where IsDeleted <> 1 
and IsApproved=1  and DeadlineDate > Now())  
and not exists (SELECT CustomerId FROM mktg_customer where IsDeleted = 1 or IsApproved <> 1 )  
and IsDeleted <> 1 and IsApproved = 1
GROUP BY ma.TreeId ,mac.CustomerId

【Comments】:

【Answer 1】:

Maybe rethink your query. Since AccountId is the PRIMARY KEY of the mktg_account table, there should be no duplicate AccountId values, which makes the initial subquery pointless:

SELECT CustomerId FROM mktg_account ma  
left join mktg_account_customer mac  on mac.AccountId = ma.AccountId   
where IsPurge =1

This query is all you need.

This also seems pointless:

and mac.CustomerId not in (SELECT CustomerId FROM mktg_customer where IsDeleted = 1 or IsApproved <> 1 )

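since your next lines do the same thing, although against a different table.

Without seeing the data, I would try replacing the subqueries that are actually needed with joins.

【Comments】:

【Answer 2】:

There are two ways to improve the query; I don't know which is better. Both replace x NOT IN ( SELECT ... ).

Plan A: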

NOT EXISTS ( SELECT * ... )

Plan B:

LEFT JOIN ... WHERE .. IS NULL
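
As a rough illustration of Plan B, here is a sketch for just one of the conditions in the question (the mktg_unit_booking check). It assumes an inner JOIN to mktg_account_customer is acceptable and that the unqualified IsPurge, IsDeleted and IsApproved columns in the outer query belong to mktg_account; the alias ub is made up for the example. The filters on the booking table have to sit in the ON clause; moving them to WHERE would discard the unmatched rows that the IS NULL test is looking for.

```sql
-- Anti-join sketch: keep customers that have NO active, approved booking.
SELECT mac.CustomerId
FROM mktg_account AS ma
JOIN mktg_account_customer AS mac
       ON mac.AccountId = ma.AccountId
LEFT JOIN mktg_unit_booking AS ub            -- alias "ub" is an assumption
       ON ub.CustomerId = mac.CustomerId
      AND ub.DeadlineDate > NOW()
      AND ub.IsDeleted <> 1
      AND ub.IsApproved = 1
WHERE ma.IsPurge = 1
  AND ma.IsDeleted <> 1
  AND ma.IsApproved = 1
  AND ub.BookingId IS NULL                   -- no matching booking row was found
GROUP BY ma.TreeId, mac.CustomerId;
```

BookingId is the primary key of mktg_unit_booking, so it can only be NULL here when the LEFT JOIN found no match. The other NOT IN conditions would each need their own LEFT JOIN ... IS NULL pair in the same style.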

On other topics...

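In mktg_account_customer, KEY AccountId (AccountId) is redundant and can be dropped.

Otherwise, the tables are well designed (assuming ENGINE=InnoDB).

Indexes on flags (for example, KEY IsApproved (IsApproved)) are rarely useful. A composite index that starts with a flag may be useful.

It may be worth removing purged, deleted, unapproved, and past-deadline rows from the tables.

When JOINing, please qualify columns with aliases. Example: it is unclear which table the column in where IsPurge =0 comes from.

Don't say LEFT when a plain JOIN works the same. LEFT means that the "right" table may be missing rows and that you want those rows anyway, filled with NULLs. I am fairly sure the subquery with where IsPurge =0 would behave the same with a plain JOIN: if the row is missing, IsPurge will be NULL, and therefore not 0.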
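
The two index suggestions above could look roughly like the following. This is only a sketch: the index name idx_approved_customer_deadline and the column order are assumptions chosen to match the booking subquery in the question (equality on IsApproved and CustomerId, then a range on DeadlineDate), and whether the optimizer actually uses it should be checked with EXPLAIN on real data.

```sql
-- The PRIMARY KEY (AccountId, CustomerId) already starts with AccountId,
-- so the separate single-column key adds nothing.
ALTER TABLE mktg_account_customer
  DROP INDEX AccountId;

-- A composite index that starts with a flag (hypothetical name and column order).
ALTER TABLE mktg_unit_booking
  ADD INDEX idx_approved_customer_deadline (IsApproved, CustomerId, DeadlineDate);
```

If the composite index turns out to be used, the single-column KEY IsApproved (IsApproved) on the same table is probably no longer worth keeping.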

Edit: Perhaps this is the EXISTS approach:

SELECT  CustomerId
    FROM  mktg_account ma
    left join  mktg_account_customer mac  ON mac.AccountId = ma.AccountId
    where  ma.IsPurge =1
      and  ma.IsDeleted <> 1
      and  ma.IsApproved = 1
      and  not exists 
        ( SELECT  *
            FROM       mktg_account ma2
            left join  mktg_account_customer mac2  ON mac2.AccountId = ma2.AccountId
            where  IsPurge =0
              AND  mac.CustomerId = mac2.CustomerId
        )
      and  not exists 
        ( SELECT  *
            FROM  mktg_unit_booking
            where  DeadlineDate > Now()
              and  IsDeleted <> 1
              and  IsApproved=1
              AND  CustomerId = mac.CustomerId
        )
      and  not exists 
        ( SELECT  *
            FROM       mktg_reservation mr
            left join  mktg_reservation_customer mrc
                   ON mr.ReservationId = mrc.ReservationId
            where  IsDeleted <> 1
              and  IsApproved=1
              and  DeadlineDate > Now()
              AND  mr.CustomerId= mac.CustomerId
        )
      and  not exists 
        ( SELECT  *
            FROM  mktg_customer
            where ( IsDeleted = 1  or  IsApproved <> 1 )
              AND  CustomerId = mac.CustomerId
        )
    GROUP BY  ma.TreeId, mac.CustomerId
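
The detail that matters in the query above is the correlation predicate inside each subquery (for example CustomerId = mac.CustomerId). Without it, NOT EXISTS only asks whether the subquery returns any row at all, so it gives the same answer for every outer row. The reduced sketch below, using just one table from the question and a made-up alias ub, shows the difference:

```sql
-- Uncorrelated: the subquery ignores the outer row, so this returns either
-- every customer or none, depending on whether any approved booking exists.
SELECT mac.CustomerId
FROM mktg_account_customer AS mac
WHERE NOT EXISTS (SELECT *
                  FROM mktg_unit_booking AS ub
                  WHERE ub.IsApproved = 1);

-- Correlated: the subquery is evaluated per outer row, so this returns only
-- the customers that have no approved booking of their own.
SELECT mac.CustomerId
FROM mktg_account_customer AS mac
WHERE NOT EXISTS (SELECT *
                  FROM mktg_unit_booking AS ub
                  WHERE ub.IsApproved = 1
                    AND ub.CustomerId = mac.CustomerId);
```

A query that swaps NOT IN for NOT EXISTS without adding such a predicate has the first form, which is why it would not return the same rows as the original.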

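【Comments】:

I tried replacing NOT IN with NOT EXISTS, but it throws an error: (#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'EXISTS (SELECT mac.CustomerId FROM mktg_account ma left join mktg_account_cust' at line 3)

I need some context... it should look something like AND NOT EXISTS(SELECT * FROM ...)

Hi. I have just updated the query in my question. Could you check whether it is the correct way to use NOT EXISTS? Thanks!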
