Full Join on Multiple Tables and Union All on Multiple Tables

Posted by 浊酒南街


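1. Problem description

In Hive (and possibly other SQL engines such as PostgreSQL), a full join across three or more tables can produce duplicate keys in the result, even though the primary key is unique within each individual table.

2. Reproducing the problem

2.1. Sample data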

with   temp1 as (
select  '1' as id ,'张三' as name
union all 
select  '2' as id ,'李四' as name
union all 
select  '3' as id ,'王五' as name
),
temp2 as (
select  '1' as id ,'深圳' as city
union all 
select  '3' as id ,'北京' as city
union all 
select  '4' as id ,'上海' as city
),
temp3 as (
select  '1' as id ,'5000' as salary
union all 
select  '4' as id ,'10000' as salary
union all 
select  '5' as id ,'12000' as salary
)

2.2. The query and the problem

select
    coalesce(a.id, b.id, c.id) as id
    , a.name
    , b.city
    , c.salary
from temp1 as a
 
full join temp2 as b
on a.id = b.id
 
full join temp3 as c
on a.id = c.id

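Running the query above shows that id = 4 appears twice in the result: one row carries only the city 上海 from temp2 (b), and another carries only the salary 10000 from temp3 (c).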
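One way to confirm the duplication without reading through the full result is to count rows per merged id. The following is a minimal sketch that reuses the ids from section 2.1 (the name, city and salary columns are left out because they do not affect the join). The alias joined and the column row_count are names introduced here for illustration only.

```sql
-- Same ids as the sample data in 2.1; payload columns omitted for brevity.
with temp1 as (
    select '1' as id union all select '2' as id union all select '3' as id
),
temp2 as (
    select '1' as id union all select '3' as id union all select '4' as id
),
temp3 as (
    select '1' as id union all select '4' as id union all select '5' as id
)
select
    joined.id
    , count(*) as row_count   -- hypothetical column name
from (
    -- the problematic join from 2.2: both joins are keyed on a.id
    select coalesce(a.id, b.id, c.id) as id
    from temp1 as a
    full join temp2 as b
    on a.id = b.id
    full join temp3 as c
    on a.id = c.id
) joined
group by joined.id
having count(*) > 1
```

With this sample data the check should report a single offending key, id 4, with a row count of 2. An empty result would mean the merged key is unique.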

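3. Cause of the problem

The problem arises because both joins are keyed on table a, which only contains ids 1, 2 and 3, while id 4 exists in both b and c. After the first full join, the row that comes from b's id 4 has a.id set to NULL. In the second join, the condition a.id = c.id compares that NULL with c's id 4, and a comparison against NULL never evaluates to true, so c's row for id 4 finds no match and is emitted as a separate row. The joined result therefore contains two rows with id 4.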
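To see where the NULL comes from, it can help to look at the intermediate result of the first join on its own. This is a small sketch using only the ids of temp1 and temp2 from section 2.1; the column aliases a_id and b_id are introduced here for illustration.

```sql
with temp1 as (
    select '1' as id union all select '2' as id union all select '3' as id
),
temp2 as (
    select '1' as id union all select '3' as id union all select '4' as id
)
select
    a.id as a_id   -- NULL for rows that exist only in temp2
    , b.id as b_id -- NULL for rows that exist only in temp1
from temp1 as a
full join temp2 as b
on a.id = b.id
```

The row with b_id = '4' has a NULL a_id. Any later join condition written as a.id = c.id has nothing to match against on that row, which is why c's id 4 ends up on a row of its own.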

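4. Solution

For each subsequent full join, do not join only on the first table's key: because this is a full join, there will always be rows where the first table is NULL while other tables have values. Instead, use coalesce over the keys of all tables that have already been joined as the join condition, as in the following code:
Method 1: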

select
    coalesce(a.id, b.id, c.id) as id
    , a.name
    , b.city
    , c.salary
from temp1 as a
 
full join temp2 as b
on a.id = b.id
 
full join temp3 as c
on coalesce(a.id, b.id) = c.id

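The result now contains exactly one row per id (1 through 5), and the id = 4 row carries both the city 上海 and the salary 10000.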
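The same pattern extends to further tables: each new join condition coalesces the keys of every table joined so far. The sketch below adds a fourth table to illustrate this. temp4, its dept column and its rows are hypothetical and do not appear in the original example.

```sql
with temp1 as (
    select '1' as id, '张三' as name
    union all
    select '2' as id, '李四' as name
    union all
    select '3' as id, '王五' as name
),
temp2 as (
    select '1' as id, '深圳' as city
    union all
    select '3' as id, '北京' as city
    union all
    select '4' as id, '上海' as city
),
temp3 as (
    select '1' as id, '5000' as salary
    union all
    select '4' as id, '10000' as salary
    union all
    select '5' as id, '12000' as salary
),
-- hypothetical fourth table, added only for this illustration
temp4 as (
    select '4' as id, 'sales' as dept
    union all
    select '6' as id, 'hr' as dept
)
select
    coalesce(a.id, b.id, c.id, d.id) as id
    , a.name
    , b.city
    , c.salary
    , d.dept
from temp1 as a

full join temp2 as b
on a.id = b.id

full join temp3 as c
on coalesce(a.id, b.id) = c.id

-- the key list grows by one table at every step
full join temp4 as d
on coalesce(a.id, b.id, c.id) = d.id
```

The join conditions get longer with every table, which is one reason the subquery form (Method 2) or the id-list form (Method 3) may be easier to maintain when many tables are involved.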

Method 2:

select 
    temp.id
    , temp.name
    , temp.city
    , c.salary
from  
(select
    coalesce(a.id, b.id) as id
    , a.name 
    , b.city 
from temp1 as a
 
full join temp2 as b
on a.id = b.id) temp
full join 
temp3 as c
on temp.id = c.id


Method 3:

select
    temp.id
    , temp1.name
    , temp2.city
    , temp3.salary
from (
    -- collect every id from all three tables, then deduplicate
    select id
    from (
        select id from temp1
        union all
        select id from temp2
        union all
        select id from temp3
    ) tt
    group by id
) temp
left join temp1
on temp.id = temp1.id
left join temp2
on temp.id = temp2.id
left join temp3
on temp.id = temp3.id
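A variation on Method 3 avoids joins altogether by stacking the three tables with union all and collapsing them per id. This is not one of the methods described above, only a sketch of a related technique. It assumes that each id occurs at most once per table, so that max() simply picks the single non-NULL value, and it uses cast(null as string), which is Hive syntax and may need a different type name in other engines.

```sql
with temp1 as (
    select '1' as id, '张三' as name
    union all
    select '2' as id, '李四' as name
    union all
    select '3' as id, '王五' as name
),
temp2 as (
    select '1' as id, '深圳' as city
    union all
    select '3' as id, '北京' as city
    union all
    select '4' as id, '上海' as city
),
temp3 as (
    select '1' as id, '5000' as salary
    union all
    select '4' as id, '10000' as salary
    union all
    select '5' as id, '12000' as salary
)
select
    id
    -- max() ignores NULLs, so each column keeps the one value present for this id
    , max(name) as name
    , max(city) as city
    , max(salary) as salary
from (
    -- pad each table to the same column list so the rows can be stacked
    select id, name, cast(null as string) as city, cast(null as string) as salary
    from temp1
    union all
    select id, cast(null as string) as name, city, cast(null as string) as salary
    from temp2
    union all
    select id, cast(null as string) as name, cast(null as string) as city, salary
    from temp3
) tt
group by id
```

With the sample data this produces one row per id from 1 to 5. If a table could contain the same id more than once, max() would silently keep only one of the values, so the uniqueness assumption matters here.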
