Identifying unique columns between multiple tables in Snowflake SQL

【Posted】: 2021-06-23 14:58:35

【Question】:

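I need to identify the unique columns across multiple tables, that is, the columns they all have in common, when doing a "UNION ALL".

For example: Employee_1, Employee_2, Employee_3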

Table1: Employee_1

|Emp_Id|Joined_Mnth|Store_NO | Marked_YR | 
+------+-----------+---------+-----------+
| 1    |March      |100020   | 2018      |
| 2    |April      |120004   | 2018      |
| 3    |January    |100032   | 2019      |
| 4    |October    |231009   | 2019      |

Table2: Employee_2

|Emp_Id|Store_NO | Marked_YR | 
+------+---------+-----------+
| 1    |100020   | 2018      |
| 2    |120004   | 2018      |

Table3: Employee_3

|Emp_Id|Joined_Mnth|Store_NO | Closed_YR | 
+------+-----------+---------+-----------+
| 1    |March      |100020   | 2020      |
| 2    |April      |120004   | 2018      |
| 7    |January    |100032   | 2021      |
| 8    |October    |231009   | 2019      |

Output in the view:

CREATE OR REPLACE VIEW Employee AS
SELECT Emp_Id,Store_NO FROM Employee_1
UNION ALL
SELECT Emp_Id,Store_NO FROM Employee_2
UNION ALL
SELECT Emp_Id,Store_NO FROM Employee_3

|Emp_Id|Store_NO |  ==> Common columns between Employee_1,Employee_2,Employee_3
+------+---------+
| 1    |100020   |
| 2    |120004   |
| 3    |100032   |
| 4    |231009   |
| 1    |100020   |
| 2    |120004   |
| 1    |100020   |
| 2    |120004   |
| 7    |100032   |
| 8    |231009   |

How can I identify the columns common to all of the tables above?

【Answer 1】:

You can use information_schema.columns:

-- Snowflake stores unquoted identifiers in uppercase, so compare against uppercase names.
select column_name
from information_schema.columns
where upper(table_name) in ('EMPLOYEE_1', 'EMPLOYEE_2', 'EMPLOYEE_3')
group by column_name
having count(*) = 3;
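
The grouping logic above can also be mirrored on the client side to generate the view automatically. The following is only a minimal sketch: the `ROWS` list is hard-coded, hypothetical metadata shaped like the (TABLE_NAME, COLUMN_NAME) pairs that a query against information_schema.columns might return for the question's tables, and `common_columns` and `build_view_sql` are illustrative helper names, not part of any Snowflake API.

```python
from collections import defaultdict

# Hypothetical metadata rows, shaped like (TABLE_NAME, COLUMN_NAME) results
# from information_schema.columns for the three tables in the question.
ROWS = [
    ("EMPLOYEE_1", "EMP_ID"), ("EMPLOYEE_1", "JOINED_MNTH"),
    ("EMPLOYEE_1", "STORE_NO"), ("EMPLOYEE_1", "MARKED_YR"),
    ("EMPLOYEE_2", "EMP_ID"), ("EMPLOYEE_2", "STORE_NO"),
    ("EMPLOYEE_2", "MARKED_YR"),
    ("EMPLOYEE_3", "EMP_ID"), ("EMPLOYEE_3", "JOINED_MNTH"),
    ("EMPLOYEE_3", "STORE_NO"), ("EMPLOYEE_3", "CLOSED_YR"),
]


def common_columns(rows):
    """Return the columns present in every table, in first-seen order."""
    tables_by_column = defaultdict(set)
    all_tables = set()
    for table, column in rows:
        tables_by_column[column].add(table)
        all_tables.add(table)
    # Same test as HAVING COUNT(*) = <number of tables>.
    return [c for c, tables in tables_by_column.items() if tables == all_tables]


def build_view_sql(view_name, rows):
    """Generate a CREATE VIEW statement that UNION ALLs the common columns."""
    columns = common_columns(rows)
    if not columns:
        raise ValueError("the tables share no columns")
    # De-duplicate table names while keeping their first-seen order.
    tables = list(dict.fromkeys(table for table, _ in rows))
    column_list = ", ".join(columns)
    selects = [f"SELECT {column_list} FROM {table}" for table in tables]
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        + "\nUNION ALL\n".join(selects)
        + ";"
    )


if __name__ == "__main__":
    print(common_columns(ROWS))
    print(build_view_sql("EMPLOYEE", ROWS))
```

Generating the statement from metadata means the view definition does not have to be edited by hand when another Employee_N table is added, although the generated text should still be reviewed before it is executed.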

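【Answer 2】:

You can use INFORMATION_SCHEMA.COLUMNS to get the list of columns. ILIKE makes the name comparison case-insensitive, which matters because Snowflake stores unquoted identifiers in uppercase: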

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME ILIKE ANY ('Employee_1', 'Employee_2', 'Employee_3')
GROUP BY COLUMN_NAME
HAVING COUNT(*) = 3;

Edit:

Multiple tables can be combined with NATURAL FULL JOIN (credit to Lukas Eder, related post):

CREATE OR REPLACE VIEW Employee AS
SELECT *
FROM (SELECT *, 'Employee_1' AS tab FROM Employee_1) sub1
NATURAL FULL JOIN (SELECT *, 'Employee_2' AS tab FROM Employee_2) sub2
NATURAL FULL JOIN (SELECT *, 'Employee_3' AS tab FROM Employee_3) sub3;

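This is one way to implement OUTER UNION CORRESPONDING. Because the tab value differs in every subquery and NATURAL JOIN compares all same-named columns, no rows ever match, so each source row comes through once, with NULL in the columns its table lacks.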
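
A similar result shape can be written as a plain UNION ALL in which every SELECT lists the union of all columns and pads the ones its table lacks with NULL. This is a different technique from the NATURAL FULL JOIN above, shown only as a rough sketch: `COLUMNS_BY_TABLE` is hypothetical hard-coded metadata for the question's tables, `outer_union_sql` is an illustrative helper name, and depending on the column types an explicit cast on the NULL placeholders may be needed.

```python
# Hypothetical column lists for the three tables in the question.
COLUMNS_BY_TABLE = {
    "EMPLOYEE_1": ["EMP_ID", "JOINED_MNTH", "STORE_NO", "MARKED_YR"],
    "EMPLOYEE_2": ["EMP_ID", "STORE_NO", "MARKED_YR"],
    "EMPLOYEE_3": ["EMP_ID", "JOINED_MNTH", "STORE_NO", "CLOSED_YR"],
}


def outer_union_sql(columns_by_table):
    """Build a UNION ALL over every column, padding missing ones with NULL."""
    # Union of all column names, kept in first-seen order.
    all_columns = list(dict.fromkeys(
        col for cols in columns_by_table.values() for col in cols
    ))
    selects = []
    for table, cols in columns_by_table.items():
        present = set(cols)
        items = [c if c in present else f"NULL AS {c}" for c in all_columns]
        # Tag each row with its source table, like the tab column above.
        items.append(f"'{table}' AS TAB")
        selects.append(f"SELECT {', '.join(items)} FROM {table}")
    return "\nUNION ALL\n".join(selects) + ";"


if __name__ == "__main__":
    print(outer_union_sql(COLUMNS_BY_TABLE))
```

Listing the columns explicitly keeps the column order under the author's control, at the cost of regenerating the statement whenever a table's columns change.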

【Comments】:

Is there a way to avoid the non-matching columns?

@K.Tom With NATURAL FULL JOIN, no.
