SQL NOT IN query is taking time with similar table

Posted 2023-03-25


【Posted】: 2017-02-23 05:09:40

【Question】:

Suppose I have two tables, table_1 and table_2. table_1 has 56 columns and 1.2 million records.

table_1 looks like this:

RollNumber | Subject | G         | Part | Status  
------------------------------------------------  
1          | 1       | 1         | 1    |  1  
1          | 1       | 1         | 2    |  1  
1          | 2       | 1         | 1    |  1  
1          | 2       | 1         | 2    |  5  
1          | 3       | 1         | 1    |  0  
1          | 3       | 1         | 2    |  1  
2          | 1       | 2         | 1    |  1  
2          | 1       | 2         | 2    |  1  
2          | 2       | 2         | 1    |  1  
2          | 2       | 2         | 2    |  1  
2          | 3       | 2         | 1    |  1  
2          | 3       | 2         | 2    |  1 
3          | 1       | 2         | 1    |  1  
3          | 1       | 2         | 2    |  1  
3          | 2       | 2         | 1    |  1  
3          | 2       | 2         | 2    |  1  
3          | 3       | 2         | 1    |  0  
3          | 3       | 2         | 2    |  1

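I want all RollNumbers from table_1 (grouped by the 2nd and 3rd columns) where any status is 0, but I do not want students who also have a status of 5 (or any status other than 1).

I tried this: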

select * from table_1 as t1  
inner join table_2 as t2  
on  t1.column2 = t2.column2 and t1.column3 = t2.column3 and t1.column4 = t2.column4  
where t1.column1 not in  
     (select column1 from table_1 where status = 5)

This is the innermost query of my whole query. I also tried the EXCEPT clause. Both queries take a long time to execute.

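【Discussion】:

Why are you using a subquery to filter status? Please show the execution plan of the query. Have you added indexes on column1, column2, column3, and status?
@kennytm I have edited the question, please check now, thanks.

【Solution 1】: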

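Starting with SQL Server 2008, you can use count() over() to count how many rows in a given group have a specific value.

In this case, you count the rows with status <> 1 in each group and select only the rows that belong to groups where that count is 0.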

select * from (
    select * , 
      count(case when status <> 1 then 1 end) over(partition by RollNumber, G) c
    from table_1
) t where c = 0
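
For engines without window functions (for example MySQL before 8.0), the same filter as the query above can be sketched with a GROUP BY ... HAVING subquery joined back to the table. This is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in engine, a cut-down table_1 holding just the five sample columns from the question, and a hypothetical helper name `rows_in_clean_groups`.

```python
import sqlite3

# Sample rows from the question: (RollNumber, Subject, G, Part, Status)
ROWS = [
    (1, 1, 1, 1, 1), (1, 1, 1, 2, 1), (1, 2, 1, 1, 1),
    (1, 2, 1, 2, 5), (1, 3, 1, 1, 0), (1, 3, 1, 2, 1),
    (2, 1, 2, 1, 1), (2, 1, 2, 2, 1), (2, 2, 2, 1, 1),
    (2, 2, 2, 2, 1), (2, 3, 2, 1, 1), (2, 3, 2, 2, 1),
    (3, 1, 2, 1, 1), (3, 1, 2, 2, 1), (3, 2, 2, 1, 1),
    (3, 2, 2, 2, 1), (3, 3, 2, 1, 0), (3, 3, 2, 2, 1),
]

# Keep only (RollNumber, G) groups in which no row has Status <> 1,
# then join back to return every row of those groups.
QUERY = """
    SELECT t.*
    FROM table_1 AS t
    JOIN (
        SELECT RollNumber, G
        FROM table_1
        GROUP BY RollNumber, G
        HAVING SUM(CASE WHEN Status <> 1 THEN 1 ELSE 0 END) = 0
    ) AS ok
      ON ok.RollNumber = t.RollNumber AND ok.G = t.G
    ORDER BY t.RollNumber, t.Subject, t.Part
"""


def rows_in_clean_groups(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_1 "
        "(RollNumber INT, Subject INT, G INT, Part INT, Status INT)"
    )
    conn.executemany("INSERT INTO table_1 VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_in_clean_groups(ROWS):
        print(row)
```

On the sample data only RollNumber 2 survives, because RollNumber 1 has statuses 5 and 0 and RollNumber 3 has a status 0. The SQL inside QUERY uses only standard GROUP BY and HAVING, so it should carry over to other engines with little or no change.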

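【Discussion】:

This works, but the problem is that I have to convert it to LINQ in C#. Thanks anyway.
If you need help converting the query to LINQ, I suggest you open a new question :)
I would like this query to work in MySQL as well.

【Solution 2】: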

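You can use NOT EXISTS instead of NOT IN. It is often faster, because the correlated subquery can stop at the first matching row, and it also behaves predictably when the compared column contains NULLs.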

select * from table_1 as t1  
inner join table_2 as t2  
on t1.column1 = t2.column1 and t1.column2 = t2.column2 and t1.column3 = t2.column3  
where not exists  
     -- correlate on column1, the column the original NOT IN compared
     (select 1 from table_1 as x where x.status = 5 and x.column1 = t1.column1)
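
One practical difference between the two forms is how they treat NULL: if the NOT IN subquery returns even one NULL, no outer row can pass the filter, while NOT EXISTS is unaffected. The snippet below is a small sketch of that behaviour, assuming Python's built-in sqlite3 module as a stand-in engine and a two-column toy version of table_1; the function name is hypothetical.

```python
import sqlite3


def compare_not_in_and_not_exists():
    """Hypothetical demo: run both filters on a toy table with a NULL key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_1 (column1 INT, status INT)")
    conn.executemany(
        "INSERT INTO table_1 VALUES (?, ?)",
        # The last row has a NULL key and status 5, so the
        # "status = 5" subquery returns the values 1 and NULL.
        [(1, 1), (1, 5), (2, 1), (3, 1), (None, 5)],
    )

    not_in = conn.execute("""
        SELECT DISTINCT t1.column1
        FROM table_1 AS t1
        WHERE t1.column1 NOT IN
              (SELECT column1 FROM table_1 WHERE status = 5)
    """).fetchall()

    not_exists = conn.execute("""
        SELECT DISTINCT t1.column1
        FROM table_1 AS t1
        WHERE NOT EXISTS
              (SELECT 1 FROM table_1 AS x
               WHERE x.status = 5 AND x.column1 = t1.column1)
    """).fetchall()

    conn.close()
    return not_in, not_exists


if __name__ == "__main__":
    not_in, not_exists = compare_not_in_and_not_exists()
    print("NOT IN:    ", not_in)
    print("NOT EXISTS:", not_exists)
```

Here NOT IN returns no rows at all, whereas NOT EXISTS still returns the keys 2 and 3 (plus the NULL-keyed row itself, since NULL never equals anything). If column1 is declared NOT NULL the two forms select the same rows, and any speed difference then depends on the engine's plan.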

【Discussion】:

【Solution 3】:

Try using NOT EXISTS instead of NOT IN:

SELECT * 
FROM table_1 AS t1  
INNER JOIN table_2 AS t2  
ON t1.column1 = t2.column1 AND t1.column2 = t2.column2 AND t1.column3 = t2.column3  
WHERE NOT EXISTS (
    SELECT 1
    FROM table_1 AS x
    -- correlate on column1, the column the original NOT IN compared
    WHERE x.status = 5 AND x.column1 = t1.column1
)

【Discussion】:
