Convert a NOT IN query for better performance

【Title】: Convert NOT IN query to better performance 【Posted】: 2014-01-31 09:35:12 【Question】:

I am using MySQL 5.0 and need to fine-tune this query. Can anyone tell me what I could tune here?

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details 
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header 
WHERE sara_master_id IN 
(SELECT alert_sara_master_id FROM alert_sara_lines 
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;

【Comments】:

Sorry, I don't know how to format code here, and I need this urgently.

【Answer 1】:

The first thing I would do is rewrite the subqueries as joins:

SELECT      h.alert_master_id

FROM        alert_appln_header h

       JOIN schedule_config c
         ON c.schedule_name = 'Purging_Config'

  LEFT JOIN alert_details d
         ON d.alert_master_id = h.alert_master_id
        AND d.end_date IS NULL
        AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY

  LEFT JOIN (
              alert_sara_header s
         JOIN alert_sara_lines  l
           ON l.alert_sara_master_id = s.sara_master_id
            )
         ON s.alert_master_id = h.alert_master_id
        AND l.end_date IS NULL
        AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY

WHERE       h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
        AND d.alert_master_id IS NULL
        AND s.alert_master_id IS NULL

GROUP BY    h.alert_master_id

LIMIT       5000
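
Each LEFT JOIN ... IS NULL pair in the query above is an anti-join: a header row survives only when no matching row exists on the right-hand side. A minimal sketch on toy data, assuming hypothetical temporary tables demo_header and demo_details that are not part of the original schema:

```sql
-- Hypothetical toy tables, used only to illustrate the anti-join pattern.
CREATE TEMPORARY TABLE demo_header  (alert_master_id INT);
CREATE TEMPORARY TABLE demo_details (alert_master_id INT);

INSERT INTO demo_header  VALUES (1), (2), (3);
INSERT INTO demo_details VALUES (2);

-- Keep only the header rows that have no match in demo_details.
SELECT    h.alert_master_id
FROM      demo_header h
LEFT JOIN demo_details d
       ON d.alert_master_id = h.alert_master_id
WHERE     d.alert_master_id IS NULL
ORDER BY  h.alert_master_id;
-- returns the rows 1 and 3
```

One difference worth checking before switching: if the subquery of a NOT IN can return NULL, the NOT IN condition is never true and no rows come back, whereas the anti-join simply never matches the NULL.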

If the rewritten query is still slow, re-examine your indexing strategy. I would suggest indexes on:

alert_appln_header(alert_master_id, created_date)
schedule_config(schedule_name)
alert_details(alert_master_id, end_date, created_date)
alert_sara_header(sara_master_id, alert_master_id, created_date)
alert_sara_lines(alert_sara_master_id, end_date)
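
As a rough sketch, the suggested indexes could be created as shown here. The index names (idx_...) are hypothetical, and the statements assume that none of these indexes already exists and that end_date lives in alert_sara_lines, as in the question's query:

```sql
-- Index names are illustrative assumptions; adjust them to your naming scheme.
CREATE INDEX idx_appln_header_master_created
    ON alert_appln_header (alert_master_id, created_date);

CREATE INDEX idx_schedule_config_name
    ON schedule_config (schedule_name);

CREATE INDEX idx_alert_details_master_end_created
    ON alert_details (alert_master_id, end_date, created_date);

-- end_date is left out here on the assumption that it belongs to
-- alert_sara_lines, which is where the question's query filters on it.
CREATE INDEX idx_sara_header_master_created
    ON alert_sara_header (sara_master_id, alert_master_id, created_date);

CREATE INDEX idx_sara_lines_master_end
    ON alert_sara_lines (alert_sara_master_id, end_date);

-- Prefix a query with EXPLAIN to see whether an index is picked up.
EXPLAIN SELECT alert_master_id
FROM   alert_details
WHERE  end_date IS NULL;
```

Running EXPLAIN on the full query before and after adding the indexes should show whether the optimizer actually uses them.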

【Comments】:

【Answer 2】:

OK, this may just be a shot in the dark, but I don't think you need that many DISTINCTs:

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
     -- removed distinct here --
    SELECT alert_master_id FROM alert_details 
    WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
    UNION
     -- removed distinct here --
    SELECT alert_master_id FROM alert_sara_header 
    WHERE sara_master_id IN 
        (SELECT alert_sara_master_id FROM alert_sara_lines 
        WHERE end_date IS NULL) 
    AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;

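Since DISTINCT can be expensive, try to avoid it where it is not needed. In the first WHERE clause you are checking for ids that are NOT IN some result, so it does not matter whether an id appears more than once in that result.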
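
A small illustration of that point, using hypothetical literal values rather than the real tables: a NOT IN test gives the same answer whether or not the list contains duplicates.

```sql
-- Literal values are made up for illustration only.
SELECT 3 NOT IN (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 2) AS with_duplicates,
       3 NOT IN (SELECT 1 UNION SELECT 2)                        AS without_duplicates;
-- both columns are 1 (true)
```

Note also that a plain UNION already removes duplicate rows from the combined result, so a DISTINCT inside each branch adds nothing to the final list.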

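【Comments】:

Thank you, sir. The first DISTINCT was my mistake, but I added the other two to reduce the size of the subquery results and make the IN operator faster. I am not sure whether that was correct.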
