Oracle开发者中级第6课(并集差集和交集)实验

Posted dingdingfish

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Oracle开发者中级第6课(并集差集和交集)实验相关的知识,希望对你有一定的参考价值。

概述

本实验参考DevGym中的实验指南

创建环境

创建2张表:

create table my_brick_collection (
  colour varchar2(10),
  shape  varchar2(10),
  weight integer
);

create table your_brick_collection (
  height integer,
  width  integer,
  depth  integer,
  colour varchar2(10),
  shape  varchar2(10)
);

insert into my_brick_collection values ( 'red', 'cube', 10 );
insert into my_brick_collection values ( 'blue', 'cuboid', 8 );
insert into my_brick_collection values ( 'green', 'pyramid', 20 );
insert into my_brick_collection values ( 'green', 'pyramid', 20 );
insert into my_brick_collection values ( null, 'cuboid', 20 );

insert into your_brick_collection values ( 2, 2, 2, 'red', 'cube' );
insert into your_brick_collection values ( 2, 2, 2, 'blue', 'cube' );
insert into your_brick_collection values ( 2, 2, 8, null, 'cuboid' );

commit;

数据如下,这两个表的列数量不同:

SQL> select * from my_brick_collection;

   COLOUR      SHAPE    WEIGHT
_________ __________ _________
red       cube              10
blue      cuboid             8
green     pyramid           20
green     pyramid           20
          cuboid            20

SQL> select * from your_brick_collection;
   HEIGHT    WIDTH    DEPTH    COLOUR     SHAPE
_________ ________ ________ _________ _________
        2        2        2 red       cube
        2        2        2 blue      cube
        2        2        8           cuboid

Union

union 运算符将两个或多个表合并为一个结果集。 要使用它,表必须在每个位置具有相同数量的具有匹配数据类型的列。
所以以下的SQL报错:

SQL>
select * from my_brick_collection
union 
select * from your_brick_collection;

Error at Command Line : 1 Column : 1
Error report -
SQL Error: ORA-01789: query block has incorrect number of result columns
01789. 00000 -  "query block has incorrect number of result columns"
*Cause:
*Action:

以下SQL正确:

SQL>
select colour, shape from my_brick_collection
union 
select colour, shape from your_brick_collection;

   COLOUR      SHAPE
_________ __________
blue      cube
blue      cuboid
green     pyramid
red       cube
          cuboid

Union All

my_brick_collection表中有2个重复的colour, shape,因此以下SQL的结果小于my_brick_collection表的行数:

SQL> select distinct * from my_brick_collection;

   COLOUR      SHAPE    WEIGHT
_________ __________ _________
red       cube              10
blue      cuboid             8
green     pyramid           20
          cuboid            20

默认的union会去掉重复行,要保留重复行,需要使用union all:

select colour, shape from my_brick_collection
union all 
select colour, shape from your_brick_collection;

   COLOUR      SHAPE
_________ __________
red       cube
blue      cuboid
green     pyramid
green     pyramid
          cuboid
red       cube
blue      cube
          cuboid

8 rows selected.

实际上,默认的union等价于以下SQL:

select distinct * from (
  select colour, shape from my_brick_collection
  union all 
  select colour, shape from your_brick_collection
);

两个练习的答案如下,说明union和union all的优先级比order by高,也就是order by会对union的最终结果排序:

select colour from my_brick_collection
union
select colour from your_brick_collection
order by colour;


select shape from my_brick_collection
union all
select shape from your_brick_collection order by shape;

如果希望子查询排序,可以使用inline view,但CTE不行,例如:

select shape from my_brick_collection
union all
select * from (select shape from your_brick_collection order by shape) ;

Set Difference

差集操作返回一个表中存在而不在另一个表中的所有行,这可以用not exists实现。

但列多了写起来就比较麻烦,而且由于null处理不当,很容易出错。例如:

SQL>
select colour, shape from your_brick_collection ybc
where  not exists (
  select null from my_brick_collection mbc
  where  ybc.colour = mbc.colour
  and    ybc.shape = mbc.shape
);

   COLOUR     SHAPE
_________ _________
blue      cube
          cuboid

在以上结果中,第2行,也就是包含null的行不应该在结果集中,因为null和null是不能比较的。

正确的写法如下:

select colour, shape from your_brick_collection ybc
where  not exists (
  select null from my_brick_collection mbc
  where  ( ybc.colour = mbc.colour or 
           ( ybc.colour is null and mbc.colour is null ) 
         )
  and    ( ybc.shape = mbc.shape or
           ( ybc.shape is null and mbc.shape is null ) 
         )
);

   COLOUR    SHAPE
_________ ________
blue      cube

Minus (差集)

用集合操作就简单多了,而且不容易出错:

select colour, shape from my_brick_collection
minus
select colour, shape from your_brick_collection

   COLOUR      SHAPE
_________ __________
blue      cuboid
green     pyramid

但还有另一个微妙的区别。 与 union 一样,minus对输出进行了distinct操作。如果你希望看到所有不同的行,还是要用not exists:

select colour, shape from my_brick_collection mbc
where  not exists (
  select null from your_brick_collection ybc
  where  ( ybc.colour = mbc.colour or ( ybc.colour is null and mbc.colour is null ) )
  and    ybc.shape = mbc.shape
);

   COLOUR      SHAPE
_________ __________
blue      cuboid
green     pyramid
green     pyramid

Intersect(交集)

用not exists也可以实现交集,但注意null的处理:

select colour, shape from your_brick_collection ybc
where  exists (
  select null from my_brick_collection mbc
  where  ( ybc.colour = mbc.colour or ( ybc.colour is null and mbc.colour is null ) )
  and    ybc.shape = mbc.shape
);

   COLOUR     SHAPE
_________ _________
red       cube
          cuboid

intersect集合操作简单多了:

select colour, shape from your_brick_collection
intersect
select colour, shape from my_brick_collection;

   COLOUR     SHAPE
_________ _________
red       cube
          cuboid

Finding the Difference Between Two Tables (Symmetric Difference)

查找两个表之间的差异,也就是对称差异,需要做以下的操作:

  1. 使用minus查找在表一中而不是表二中的行
  2. 使用minus查找在表二中而不是表一中的行
  3. 将这两个操作的输出用union(all)结合起来

例如:

select * from ( 
  select colour, shape from your_brick_collection
  minus
  select colour, shape from my_brick_collection
) union all (
  select colour, shape from my_brick_collection
  minus
  select colour, shape from your_brick_collection
);

   COLOUR      SHAPE
_________ __________
blue      cube
blue      cuboid
green     pyramid

如果还想看的更清楚,可以把表名附上:

select * from ( 
  select colour, shape, 'your_brick' from your_brick_collection
  minus
  select colour, shape, 'your_brick' from my_brick_collection
) union all (
  select colour, shape, 'my_brick' from my_brick_collection
  minus
  select colour, shape, 'my_brick' from your_brick_collection
);

COLOUR     SHAPE      'YOUR_BRIC
---------- ---------- ----------
blue       cube       your_brick
blue       cuboid     my_brick  
green      pyramid    my_brick  

利用集合操作,以下是另一种方法:

  1. 将两个表中的所有行与 union (all) 合并
  2. 使用intersect找到两个表中都存在的值
  3. 从第一个查询minus第二个查询

SQL如下:

select * from ( 
  select colour, shape from your_brick_collection
  union all
  select colour, shape from my_brick_collection
) minus (
  select colour, shape from my_brick_collection
  intersect
  select colour, shape from your_brick_collection
);

   COLOUR      SHAPE
_________ __________
blue      cube
blue      cuboid
green     pyramid

Symmetric Difference with Group By

前两种方法都有一些缺点,😦。 首先,您必须两次读取两个表中的所有行。 其次minus和intersect都去除了重复值(应用了distinct)。 所以你只能在一张表中看到这些值。 不是所有的行。

为了演示此问题,我们在your_brick_collection表中插入1条重复行,即红色立方体:

insert into your_brick_collection values ( 4, 4, 4, 'red', 'cube' );

红色立方体在your_brick_collection中有两行,在my_brick_collection中有1行,因此应该出现在对称差异结果集中。但是却没有:

select * from ( 
  select colour, shape from your_brick_collection
  minus
  select colour, shape from my_brick_collection
) union all (
  select colour, shape from my_brick_collection
  minus
  select colour, shape from your_brick_collection
);

   COLOUR      SHAPE
_________ __________
blue      cube
blue      cuboid
green     pyramid

正确的写法如下,而且只需要每个表只需读一次:

select colour, shape, sum ( your_bricks ), sum ( my_bricks ) 
from (
  select colour, shape, 1 your_bricks, 0 my_bricks
  from   your_brick_collection
  union all
  select colour, shape, 0 your_bricks, 1 my_bricks 
  from   my_brick_collection
)
group  by colour, shape
having sum ( your_bricks ) <> sum ( my_bricks );

   COLOUR      SHAPE    SUM(YOUR_BRICKS)    SUM(MY_BRICKS)
_________ __________ ___________________ _________________
red       cube                         2                 1
blue      cube                         1                 0
green     pyramid                      0                 2
blue      cuboid                       0                 1

下面这个SQL可以看到更详细的对称差异信息:

select colour, shape, 
       case
         when sum ( your_bricks ) < sum ( my_bricks ) then 'ME'
         when sum ( your_bricks ) > sum ( my_bricks ) then 'YOU'
         else 'EQUAL'
       end who_has_extra,
       abs ( sum ( your_bricks ) - sum ( my_bricks ) ) how_many
from (
  select colour, shape, 1 your_bricks, 0 my_bricks
  from   your_brick_collection
  union all
  select colour, shape, 0 your_bricks, 1 my_bricks 
  from   my_brick_collection
)
group  by colour, shape;

   COLOUR      SHAPE    WHO_HAS_EXTRA    HOW_MANY
_________ __________ ________________ ___________
red       cube       YOU                        1
blue      cube       YOU                        1
          cuboid     EQUAL                      0
green     pyramid    ME                         2
blue      cuboid     ME                         1

环境清理

drop table my_brick_collection;
drop table your_brick_collection;

以上是关于Oracle开发者中级第6课(并集差集和交集)实验的主要内容,如果未能解决你的问题,请参考以下文章

oracle中取交集差集和并集的语法

Hive 差集交集并集

SQL集合运算:差集、交集、并集

java找到两个list的交集并集差集

Excel 怎样求差集,并集和交集还有容错

用java编写程序,求集合的并集、交集和差集