Filter out rows if a column value is associated with only one value in another column

A   B   C   D   E
1981    a   b   CY3 2
1981    c   l   CY3 1
1981    f   r   CY3 5
1255    ee  ee  CY3 1
1255    ff  ff  CY3 1
1387    g   g   CY5 2
1387    h   h   CY5 10
1387    P   h   CY5 C7

I get this table from the following query (Oracle PL/SQL):

SELECT A,B,C,D,COUNT(*) AS E
FROM TAB1  t1 INNER JOIN TAB2 t2 ON t1.A = t2.B
             INNER JOIN TAB3 t3 ON t1.A = t3.C
GROUP BY A,B,C,D
ORDER BY D ASC, A DESC;

By editing the query above, I would like to get the following result:

A   B   C   D   E
1981    a   b   CY3 2
1981    c   l   CY3 1
1981    f   r   CY3 5
1255    ee  ee  CY3 1
1255    ff  ff  CY3 1

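I want to filter out the last three rows because column D has a value (CY5) that is associated with only a single value in column A (1387), whereas CY3 is associated with two different values (1981 and 1255), so I want to keep those rows.

Can anyone help me or point me to a similar question?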

Answer

Most DBMSs support windowed aggregates:

select *
from
 (
    SELECT A,B,C,D,COUNT(*) AS E,
       MIN(A) OVER (PARTITION BY D) AS minA,-- minimum A for all rows with the same D
       MAX(A) OVER (PARTITION BY D) AS maxA -- maximum A for all rows with the same D
    FROM TAB1  t1 INNER JOIN TAB2 t2 ON t1.A = t2.B
                 INNER JOIN TAB3 t3 ON t1.A = t3.C
    GROUP BY A,B,C,D
 ) as dt
where minA <> maxA -- there must be at least 2 different values (usually cheaper than COUNT(DISTINCT))
ORDER BY D ASC, A DESC;
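
The MIN/MAX comparison works because a group whose smallest and largest A differ must contain at least two distinct A values. As a minimal sketch (not the answerer's code), the snippet below checks that equivalence on the (A, D) pairs from the question. It assumes SQLite through Python's standard `sqlite3` module as a stand-in for Oracle, and the table name `t` and the helper `groups_with_several_a` are made up for illustration. It uses plain GROUP BY/HAVING rather than window functions so that it runs on older SQLite builds too.

```python
import sqlite3

# (A, D) pairs mirroring the rows of the question's result set.
ROWS = [
    (1981, "CY3"), (1981, "CY3"), (1981, "CY3"),
    (1255, "CY3"), (1255, "CY3"),
    (1387, "CY5"), (1387, "CY5"), (1387, "CY5"),
]


def groups_with_several_a(having_clause):
    """Return the D values whose group satisfies the given HAVING condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (A INTEGER, D TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    sql = f"SELECT D FROM t GROUP BY D HAVING {having_clause} ORDER BY D"
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


# Both conditions should select the same groups.
by_min_max = groups_with_several_a("MIN(A) <> MAX(A)")
by_distinct = groups_with_several_a("COUNT(DISTINCT A) > 1")
print(by_min_max, by_distinct)
```

Both conditions keep only CY3 here. Note that the two tests would differ if A could be NULL only in the sense that both ignore NULLs, so a group with one non-NULL value plus NULLs is dropped by either.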

Edit:

For Oracle, you want to return the number of distinct values:

select *
from
 (
    SELECT A,B,C,D,COUNT(*) AS E,
       COUNT(DISTINCT A) OVER (PARTITION BY D) AS countA
    FROM TAB1  t1 INNER JOIN TAB2 t2 ON t1.A = t2.B
                 INNER JOIN TAB3 t3 ON t1.A = t3.C
    GROUP BY A,B,C,D
 ) dt -- you don't need the alias in Oracle, but Standard SQL requires it
where countA > 1
ORDER BY D ASC, A DESC;

Another answer

Use a correlated subquery with EXISTS:

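SELECT A,B,C,D,COUNT(*) AS E
FROM TABLESS t1
-- COUNT(DISTINCT ...) counts different A values; a plain COUNT(A) would count rows and keep CY5
where exists (select 1 from TABLESS t2 where t1.D=t2.D having count(distinct t2.A)>1)
GROUP BY A,B,C,D
ORDER BY D ASC, A DESC;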
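
One detail worth checking in an EXISTS filter like this is whether the HAVING condition counts rows or distinct values. The sketch below is an illustration under assumptions, not part of the original answer: it uses SQLite via Python's `sqlite3` module instead of Oracle, a made-up table `s` holding the question's rows, and a hypothetical helper `kept_d_values`. A `GROUP BY t2.D` is added inside the subquery only because older SQLite versions reject HAVING without GROUP BY.

```python
import sqlite3

# The rows from the question (A, B, C, D); E is omitted because it plays
# no part in the filter.
ROWS = [
    (1981, "a", "b", "CY3"),
    (1981, "c", "l", "CY3"),
    (1981, "f", "r", "CY3"),
    (1255, "ee", "ee", "CY3"),
    (1255, "ff", "ff", "CY3"),
    (1387, "g", "g", "CY5"),
    (1387, "h", "h", "CY5"),
    (1387, "P", "h", "CY5"),
]


def kept_d_values(count_expr):
    """Return the D values that survive an EXISTS filter using count_expr > 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE s (A INTEGER, B TEXT, C TEXT, D TEXT)")
    conn.executemany("INSERT INTO s VALUES (?, ?, ?, ?)", ROWS)
    sql = f"""
        SELECT DISTINCT t1.D
        FROM s t1
        WHERE EXISTS (SELECT 1
                      FROM s t2
                      WHERE t2.D = t1.D
                      GROUP BY t2.D
                      HAVING {count_expr} > 1)
        ORDER BY t1.D
    """
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


rows_counted = kept_d_values("COUNT(t2.A)")               # counts rows per D
values_counted = kept_d_values("COUNT(DISTINCT t2.A)")    # counts distinct A per D
print(rows_counted, values_counted)
```

Counting rows keeps CY5 because it has three rows, even though all of them share A = 1387. Counting distinct values keeps only CY3, which is the result the question asks for.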

Another answer

Take a look! The example uses a SQL Server table variable for the test data:

DECLARE @TEST AS TABLE
(A VARCHAR(100), B VARCHAR(100), C VARCHAR(100), D VARCHAR(100))

INSERT INTO @TEST VALUES
('1981','A','B','CY3'),
('1981','A','B','CY3'),
('1981','C','L','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1255','EE','EE','CY3'),
('1255','FF','FF','CY3'),
('1387','G','G','CY5'),
('1387','G','G','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5')

SELECT DATA.*
FROM (
    -- D values that are associated with more than one distinct A
    SELECT T.D, COUNT(T.A) AS DISTINCT_RECORD
    FROM (SELECT DISTINCT D, A FROM @TEST) T
    GROUP BY T.D
    HAVING COUNT(T.A) > 1
) CRITERIA
LEFT JOIN (
    -- the original aggregated rows
    SELECT A, B, C, D, COUNT(*) AS E
    FROM @TEST
    GROUP BY A, B, C, D
) DATA ON CRITERIA.D = DATA.D

Another answer

with s (a, b, c, d, e) as (
select 1981, 'a'  , 'b' , 'CY3', 2  from dual union all
select 1981, 'c'  , 'l' , 'CY3', 1  from dual union all
select 1981, 'f'  , 'r' , 'CY3', 5  from dual union all
select 1255, 'ee' , 'ee', 'CY3', 1  from dual union all
select 1255, 'ff' , 'ff', 'CY3', 1  from dual union all
select 1387, 'g'  , 'g' , 'CY5', 2  from dual union all
select 1387, 'h'  , 'h' , 'CY5', 10 from dual union all
select 1387, 'P'  , 'h' , 'CY5', 17 from dual)
select a, b, c, d, e
from 
    (select s.*, count(distinct a) over (partition by d) cnt_dict
     from s
    )
where cnt_dict > 1;

         A B  C  D            E
---------- -- -- --- ----------
      1255 ee ee CY3          1
      1255 ff ff CY3          1
      1981 f  r  CY3          5
      1981 c  l  CY3          1
      1981 a  b  CY3          2
