FULL OUTER JOIN to merge tables with PostgreSQL

【Posted】: 2017-11-17 18:44:00

【Question】:

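Following this post, I applied the answer given by @Vao Tsun to a larger dataset and I still have a problem, this time with 4 tables instead of the 2 tables in the related post mentioned above.

Here is my dataset: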

-- Table 'brcht' (empty)

insee  | annee  | nb
-------+--------+-----


-- Table 'cana'

insee  | annee  | nb
-------+--------+-----
036223 |   2017 |   1
086001 |   2016 |   2


-- Table 'font' (empty)

insee  | annee  | nb
-------+--------+-----


-- Table 'nr'

insee  | annee  | nb
-------+--------+-----
036223 |   2013 |   1
036223 |   2014 |   1
086001 |   2013 |   1
086001 |   2014 |   2
086001 |   2015 |   4
086001 |   2016 |   2

Here is the query:

SELECT
 COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
 COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
 COALESCE(brcht.nb,0) AS brcht,  
 COALESCE(cana.nb,0) AS cana,
 COALESCE(font.nb,0) AS font,
 COALESCE(nr.nb,0) AS nr,
 COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total

FROM public.brcht
  FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
  FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
  FULL OUTER JOIN public.nr   ON font.insee = nr.insee AND font.annee = nr.annee

ORDER BY COALESCE(brcht.insee, cana.insee, font.insee, nr.insee), COALESCE(brcht.annee, cana.annee, font.annee, nr.annee);

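In the result, I still get two rows instead of one for insee='086001' and annee=2016. I need a single row for each insee and annee pair: in this example, the two 2 values should be on the same row, with the total column showing 4.

Thanks again for your help!


Here is a SQL script to easily create the tables above: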

CREATE TABLE public.brcht (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.cana (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.font (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.nr (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);

INSERT INTO public.cana (insee, annee, nb) VALUES ('036223', 2017, 1), ('086001', 2016, 2);
INSERT INTO public.nr(insee, annee, nb) VALUES ('036223', 2013, 1), ('036223', 2014, 1), ('086001', 2013, 1), ('086001', 2014, 2), ('086001', 2015, 4), ('086001', 2016, 2);

【Comments】:

【Answer 1】:

Inspired by the other answers, but perhaps better organized:

SELECT *, 
       brcht + cana + font + nr AS total 
FROM   (SELECT insee, 
               annee, 
               SUM(Coalesce(brcht.nb, 0)) brcht, 
               SUM(Coalesce(cana.nb, 0))  cana, 
               SUM(Coalesce(font.nb, 0))  font, 
               SUM(Coalesce(nr.nb, 0))    nr 
        FROM   brcht 
               full outer join cana USING (insee, annee) 
               full outer join font USING (insee, annee) 
               full outer join nr USING (insee, annee) 
        GROUP  BY insee, 
                  annee) t 
ORDER  BY insee, 
          annee; 

Which gives:

 insee  | annee | brcht | cana | font | nr | total 
--------+-------+-------+------+------+----+-------
 036223 |  2013 |     0 |    0 |    0 |  1 |     1
 036223 |  2014 |     0 |    0 |    0 |  1 |     1
 036223 |  2017 |     0 |    1 |    0 |  0 |     1
 086001 |  2013 |     0 |    0 |    0 |  1 |     1
 086001 |  2014 |     0 |    0 |    0 |  2 |     2
 086001 |  2015 |     0 |    0 |    0 |  4 |     4
 086001 |  2016 |     0 |    2 |    0 |  2 |     4
(7 rows)
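
A minimal sketch of why USING helps here, assuming each (insee, annee) pair appears at most once per table (true for the sample data, but not enforced by the table definitions). With FULL JOIN ... USING, PostgreSQL outputs a single merged insee and annee column, and each later USING compares against that merged column rather than against one particular table, so the aggregation can be dropped under that assumption:

```sql
-- Assumes at most one row per (insee, annee) in each table.
SELECT insee,
       annee,
       COALESCE(brcht.nb, 0) AS brcht,
       COALESCE(cana.nb, 0)  AS cana,
       COALESCE(font.nb, 0)  AS font,
       COALESCE(nr.nb, 0)    AS nr,
       COALESCE(brcht.nb, 0) + COALESCE(cana.nb, 0)
         + COALESCE(font.nb, 0) + COALESCE(nr.nb, 0) AS total
FROM   public.brcht
       FULL OUTER JOIN public.cana USING (insee, annee)  -- merged key columns
       FULL OUTER JOIN public.font USING (insee, annee)  -- compared to merged keys
       FULL OUTER JOIN public.nr   USING (insee, annee)
ORDER  BY insee, annee;
```

If a table can hold several rows for the same pair, the SUM() and GROUP BY of the query above are still needed.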

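【Comments】:

Very clear, thank you! I didn't know about the USING clause for joins.

【Answer 2】:

You need to wrap the query you are using now in a GROUP BY and apply SUM() to the integer columns.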

select
    insee, annee
    , sum(brcht) brcht
    , sum(cana) cana
    , sum(font) font
    , sum(nr) nr
    , sum(total) total
from (
    SELECT
     COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
     COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
     COALESCE(brcht.nb,0) AS brcht,  
     COALESCE(cana.nb,0) AS cana,
     COALESCE(font.nb,0) AS font,
     COALESCE(nr.nb,0) AS nr,
     COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total

    FROM public.brcht
      FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
      FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
      FULL OUTER JOIN public.nr   ON font.insee = nr.insee AND font.annee = nr.annee
      ) d
group by
    insee, annee
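
As an alternative sketch (not part of this answer), the same GROUP BY and SUM() idea can be applied without any join, by stacking the four tables with UNION ALL and aggregating per source table. The src label and the alias u are names introduced here for illustration, and the FILTER clause assumes PostgreSQL 9.4 or later:

```sql
SELECT insee,
       annee,
       -- one conditional sum per source table; 0 when the table has no row
       COALESCE(SUM(nb) FILTER (WHERE src = 'brcht'), 0) AS brcht,
       COALESCE(SUM(nb) FILTER (WHERE src = 'cana'),  0) AS cana,
       COALESCE(SUM(nb) FILTER (WHERE src = 'font'),  0) AS font,
       COALESCE(SUM(nb) FILTER (WHERE src = 'nr'),    0) AS nr,
       COALESCE(SUM(nb), 0)                              AS total
FROM (
    SELECT 'brcht' AS src, insee, annee, nb FROM public.brcht
    UNION ALL
    SELECT 'cana', insee, annee, nb FROM public.cana
    UNION ALL
    SELECT 'font', insee, annee, nb FROM public.font
    UNION ALL
    SELECT 'nr', insee, annee, nb FROM public.nr
) AS u
GROUP BY insee, annee
ORDER BY insee, annee;
```

Because no join condition is involved, the result does not depend on which tables happen to be empty.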

【Comments】:

【Answer 3】:

Try:

t=# SELECT
 COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
 COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
 COALESCE(brcht.nb,0) AS brcht,
 COALESCE(cana.nb,0) AS cana,
 COALESCE(font.nb,0) AS font,
 COALESCE(nr.nb,0) AS nr,
 COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
FROM public.brcht
  FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
  FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
  FULL OUTER JOIN public.nr   ON cana.insee = nr.insee AND cana.annee = nr.annee
ORDER BY COALESCE(brcht.insee, cana.insee, font.insee, nr.insee), COALESCE(brcht.annee, cana.annee, font.annee, nr.annee);
 insee  | annee | brcht | cana | font | nr | total
--------+-------+-------+------+------+----+-------
 036223 |  2013 |     0 |    0 |    0 |  1 |     1
 036223 |  2014 |     0 |    0 |    0 |  1 |     1
 036223 |  2017 |     0 |    1 |    0 |  0 |     1
 086001 |  2013 |     0 |    0 |    0 |  1 |     1
 086001 |  2014 |     0 |    0 |    0 |  2 |     2
 086001 |  2015 |     0 |    0 |    0 |  4 |     4
 086001 |  2016 |     0 |    2 |    0 |  2 |     4
(7 rows)

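In your example you join nr against font, whereas you probably want to join it against cana?

Also see here: https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-JOIN

In the absence of parentheses, JOIN clauses nest left-to-right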
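
To make the quoted rule concrete, this sketch adds the parentheses that PostgreSQL implies for the query in the question. It should be the same statement, only with the nesting spelled out:

```sql
SELECT *
FROM (
       (public.brcht
          FULL OUTER JOIN public.cana
            ON brcht.insee = cana.insee AND brcht.annee = cana.annee)
       FULL OUTER JOIN public.font
         ON cana.insee = font.insee AND cana.annee = font.annee
     )
     -- the left-hand side here is the whole result of the first two joins,
     -- but only its font columns are compared with nr
     FULL OUTER JOIN public.nr
       ON font.insee = nr.insee AND font.annee = nr.annee;
```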

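Update

To explain the logic: try select * from public.brcht, then add the other tables one by one. The columns of each added table appear further to the right, so when you run all four joins you get: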

t=# select * 
FROM public.brcht 
FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
FULL OUTER JOIN public.nr   ON font.insee = nr.insee AND font.annee = nr.annee
t-# ;
 insee | annee | nb | insee  | annee | nb | insee | annee | nb | insee  | annee | nb
-------+-------+----+--------+-------+----+-------+-------+----+--------+-------+----
       |       |    | 036223 |  2017 |  1 |       |       |    |        |       |
       |       |    | 086001 |  2016 |  2 |       |       |    |        |       |
       |       |    |        |       |    |       |       |    | 036223 |  2013 |  1
       |       |    |        |       |    |       |       |    | 036223 |  2014 |  1
       |       |    |        |       |    |       |       |    | 086001 |  2013 |  1
       |       |    |        |       |    |       |       |    | 086001 |  2014 |  2
       |       |    |        |       |    |       |       |    | 086001 |  2015 |  4
       |       |    |        |       |    |       |       |    | 086001 |  2016 |  2
(8 rows)

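So columns 7 and 8 are font.insee and font.annee (note that they are NULL everywhere). You join them against nr.insee and nr.annee, nothing matches, so the full join keeps the 2 rows produced by joining the first three tables plus all 6 rows of nr: you get 8 rows.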
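
One possible way to keep the ON syntax and still match rows whatever the table contents, sketched here as an assumption rather than as part of the original answer: compare each new table against the COALESCE of the key columns of all tables joined so far, instead of against a single table that may be empty.

```sql
SELECT
 COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
 COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
 COALESCE(brcht.nb,0) AS brcht,
 COALESCE(cana.nb,0) AS cana,
 COALESCE(font.nb,0) AS font,
 COALESCE(nr.nb,0) AS nr,
 COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
FROM public.brcht
  FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
  -- from here on, join against the keys merged so far
  FULL OUTER JOIN public.font ON COALESCE(brcht.insee, cana.insee) = font.insee
                             AND COALESCE(brcht.annee, cana.annee) = font.annee
  FULL OUTER JOIN public.nr   ON COALESCE(brcht.insee, cana.insee, font.insee) = nr.insee
                             AND COALESCE(brcht.annee, cana.annee, font.annee) = nr.annee
ORDER BY 1, 2;
```

With the sample data, the 086001 / 2016 row of nr should now match the cana row through the merged keys instead of producing a separate row.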

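【Comments】:

Why do you join nr against cana? I don't understand how joining 4 tables works... In my example I first join brcht to cana, then cana to font, then font to nr. Proceeding this way seemed logical to me. Is there a logical way to join the tables together?

@wiltomap I tried to explain. Note that if you don't use (), joins happen left to right, so the last join matches the whole preceding set on columns that are NULL: you get everything from (brcht, cana, font) and everything from nr (everything, because they have no values in common in the columns used for the join). Hope that makes sense; explaining is not my best skill.

OK, I understand, thanks! The problem is that the contents of the 4 tables change regularly, so I can't keep adjusting the joins accordingly... I need a way to join the tables that works whatever their contents.

Then use parentheses to nest the joins, so that each next join is made on the "merged" values.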
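
The last suggestion can be read in several ways. One hedged interpretation is to nest the joins pairwise in parenthesized subqueries, each exposing merged key columns, and then join the two intermediate results. The aliases b, c, f, n, bc and fn are introduced here for illustration only:

```sql
SELECT COALESCE(bc.insee, fn.insee) AS insee,
       COALESCE(bc.annee, fn.annee) AS annee,
       COALESCE(bc.brcht, 0) AS brcht,
       COALESCE(bc.cana, 0)  AS cana,
       COALESCE(fn.font, 0)  AS font,
       COALESCE(fn.nr, 0)    AS nr,
       COALESCE(bc.brcht, 0) + COALESCE(bc.cana, 0)
         + COALESCE(fn.font, 0) + COALESCE(fn.nr, 0) AS total
FROM (
    -- brcht and cana merged on their keys
    SELECT COALESCE(b.insee, c.insee) AS insee,
           COALESCE(b.annee, c.annee) AS annee,
           COALESCE(b.nb, 0) AS brcht,
           COALESCE(c.nb, 0) AS cana
    FROM public.brcht b
    FULL OUTER JOIN public.cana c ON b.insee = c.insee AND b.annee = c.annee
) AS bc
FULL OUTER JOIN (
    -- font and nr merged on their keys
    SELECT COALESCE(f.insee, n.insee) AS insee,
           COALESCE(f.annee, n.annee) AS annee,
           COALESCE(f.nb, 0) AS font,
           COALESCE(n.nb, 0) AS nr
    FROM public.font f
    FULL OUTER JOIN public.nr n ON f.insee = n.insee AND f.annee = n.annee
) AS fn
  ON bc.insee = fn.insee AND bc.annee = fn.annee
ORDER BY 1, 2;
```

Each outer join condition then only sees merged, non-NULL keys, so an empty table in either pair no longer blocks the match.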
