Convert NOT IN query to better performance
【Title】: Convert NOT IN query to better performance 【Posted】: 2014-01-31 09:35:12 【Question】: I am using MySQL 5.0 and I need to fine-tune this query. Can anyone tell me what I could adjust here?
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
【Question comments】:
Sorry, I don't know how to format code here, and I need this urgently. 【Answer 1】: The first thing I would do is rewrite the subqueries as joins:
SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
ON d.alert_master_id = h.alert_master_id
AND d.end_date IS NULL
AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LEFT JOIN (
alert_sara_header s
JOIN alert_sara_lines l
ON l.alert_sara_master_id = s.sara_master_id
)
ON s.alert_master_id = h.alert_master_id
AND l.end_date IS NULL -- the question filters end_date on alert_sara_lines, not on alert_sara_header
AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
AND d.alert_master_id IS NULL
AND s.alert_master_id IS NULL
GROUP BY h.alert_master_id
LIMIT 5000
If it is still slow after that, re-examine your indexing strategy. I would suggest the following indexes:
alert_appln_header(alert_master_id,created_date)
schedule_config(schedule_name)
alert_details(alert_master_id,end_date,created_date)
alert_sara_header(sara_master_id,alert_master_id,created_date)
alert_sara_lines(alert_sara_master_id,end_date)
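The join rewrite above relies on the anti-join pattern (LEFT JOIN ... WHERE right-hand key IS NULL) as a substitute for NOT IN. The following is only a minimal sketch of that pattern, run on SQLite through Python's sqlite3 module rather than on MySQL 5.0. The two toy tables `header` and `details` and their rows are hypothetical stand-ins for alert_appln_header and alert_details. It also shows one difference worth checking before swapping the two forms: if the subquery can return a NULL id, NOT IN returns no rows at all, while the anti-join is unaffected.

```python
import sqlite3

# Hypothetical toy data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE header (alert_master_id INTEGER);
    CREATE TABLE details (alert_master_id INTEGER, end_date TEXT);
    INSERT INTO header VALUES (1), (2), (3);
    INSERT INTO details VALUES (2, NULL), (3, '2014-01-01');
""")

# Original style: exclude ids that have an open (end_date IS NULL) detail row.
NOT_IN = """
    SELECT h.alert_master_id FROM header h
    WHERE h.alert_master_id NOT IN (
        SELECT d.alert_master_id FROM details d WHERE d.end_date IS NULL)
    ORDER BY h.alert_master_id
"""

# Anti-join style: keep only header rows for which no matching detail row exists.
ANTI_JOIN = """
    SELECT h.alert_master_id FROM header h
    LEFT JOIN details d
           ON d.alert_master_id = h.alert_master_id
          AND d.end_date IS NULL
    WHERE d.alert_master_id IS NULL
    ORDER BY h.alert_master_id
"""

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

not_in_before = ids(NOT_IN)
anti_join_before = ids(ANTI_JOIN)

# Add a detail row whose id is NULL: every NOT IN comparison now evaluates
# to unknown, so the NOT IN query stops returning rows.
conn.execute("INSERT INTO details VALUES (NULL, NULL)")
not_in_after = ids(NOT_IN)
anti_join_after = ids(ANTI_JOIN)

print(not_in_before, anti_join_before, not_in_after, anti_join_after)
```

If alert_master_id is declared NOT NULL in the subquery tables, the two forms should select the same ids; otherwise adding `AND alert_master_id IS NOT NULL` inside the NOT IN subquery is one way to keep them aligned.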
【Comments】:
【Answer 2】: OK, this may just be a shot in the dark, but I don't think you need that many DISTINCTs.
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
-- removed distinct here --
SELECT alert_master_id FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
-- removed distinct here --
SELECT alert_master_id FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL)
AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
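Since DISTINCT can be expensive, try to avoid it where it is not needed. In the first WHERE clause you are checking for ids that are NOT IN some result, so it does not matter whether an id appears more than once in that result.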
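To illustrate the point about duplicates, here is a small sketch on SQLite via Python's sqlite3 module (not MySQL 5.0). The tables `header`, `details` and `sara` and their rows are hypothetical. It runs the same NOT IN query with and without DISTINCT inside the subquery and compares the results. It also shows that UNION (as opposed to UNION ALL) already removes duplicates on its own.

```python
import sqlite3

# Hypothetical toy data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE header (alert_master_id INTEGER);
    CREATE TABLE details (alert_master_id INTEGER);
    CREATE TABLE sara (alert_master_id INTEGER);
    INSERT INTO header VALUES (1), (2), (3), (4);
    INSERT INTO details VALUES (2), (2), (2);
    INSERT INTO sara VALUES (2), (3), (3);
""")

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# {kw} is either "DISTINCT" or empty, so both variants share one template.
TEMPLATE = """
    SELECT alert_master_id FROM header
    WHERE alert_master_id NOT IN (
        SELECT {kw} alert_master_id FROM details
        UNION
        SELECT {kw} alert_master_id FROM sara)
    ORDER BY alert_master_id
"""

with_distinct = ids(TEMPLATE.format(kw="DISTINCT"))
without_distinct = ids(TEMPLATE.format(kw=""))

# UNION by itself deduplicates the combined rows.
union_rows = ids("""
    SELECT alert_master_id FROM details
    UNION
    SELECT alert_master_id FROM sara
    ORDER BY alert_master_id
""")

print(with_distinct, without_distinct, union_rows)
```

Whether dropping the inner DISTINCT actually makes the real query faster depends on the optimizer and the data, so it is worth comparing EXPLAIN output for both variants.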
【Comments】:
Thank you, sir. The first DISTINCT was my mistake, but I added the other two to reduce the size of the subquery result and make the IN operator faster. I am not sure whether that was right.