How to compare two columns' values after partitioning the table in SQL Redshift

Published: 2020-05-27 17:09:00

Question:

I have a table with 4 columns (Person_ID, Account_Name, Account_ID, Account_IDs). For each Person_ID, I want to find every Account_ID that does not appear in that person's Account_IDs column. Here is a sample table:

Person_ID  Account_Name  Account_ID       Account_IDs
---------  ------------  ---------------  ----------------------------------------------------------------------
123        Name3         ,000000ihi4MAQ,  ,000000TF5MAHZ,000000TF5MAQQ,000000grVA6AM,000000ihi4MAQ,000016ILMhAO,
123        Name2         ,000016ILMhAO,   ,000000TF5MAHZ,000000TF5MAQQ,000000grVA6AM,000000ihi4MAQ,000016ILMhAO,
123        Name1         ,000000grVA6AM,  ,000000TF5MAHZ,000000TF5MAQQ,000000grVA6AM,000000ihi4MAQ,000016ILMhAO,
123        Name4         ,000000TF5MAQQ,  ,000000TF5MAHZ,000000TF5MAQQ,000000grVA6AM,000000ihi4MAQ,000016ILMhAO,
123        Name5         ,000000TF5MAHZ,  ,000000TF5MAHZ,000000TF5MAQQ,000000grVA6AM,000000ihi4MAQ,000016ILMhAO,
124        Name2         ,000016ILMhAO,   ,000000frVA6AM,000016ILMhAO,
124        Name7         ,000024ILMhAO,   ,000000frVA6AM,000016ILMhAO,
124        Name8         ,000000frVA7XZ,  ,000000frVA6AM,000016ILMhAO,
124        Name5         ,000000TF5MAHZ,  ,000000frVA6AM,000016ILMhAO,
124        Name6         ,000000frVA6AM,  ,000000frVA6AM,000016ILMhAO,
125        Name11        ,000000frXC6A,   ,000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,
125        Name9         ,000000frVA6BC,  ,000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,
125        Name10        ,000024ILMhJZ,   ,000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,
125        Name12        ,000024YTMhA,    ,000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,
125        Name13        ,000024IXThJY,   ,000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,

For this sample, the expected result is:

Person_ID  Account_ID     Account_Name
---------  -------------  ------------
124        000000TF5MAHZ  Name5
124        000024ILMhAO   Name7
124        000000frVA7XZ  Name8
125        000024IXThJY   Name13

I don't understand how to compare the values in the two columns after partitioning the table.

Thanks in advance for your help.

Comments:

What are the data types of your columns? Are you really storing lists as strings? If so, fix your data model!

Answer 1:

Assuming Account_IDs is a string of comma-separated values, use:

WHERE Account_IDs NOT LIKE '%' || Account_ID || '%'
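To see the filter in a complete query, here is a minimal sketch that runs it against a subset of the sample rows. It uses Python's built-in sqlite3 module as a local stand-in for Redshift, which is an assumption made only so the example can be run anywhere. The table name `accounts` and the function name `find_missing_accounts` are hypothetical. SQLite's `trim(x, ',')` strips the surrounding commas for display; on Redshift, `BTRIM` would likely play that role.

```python
import sqlite3

# Subset of the sample rows from the question.
# The table name "accounts" is hypothetical; the question does not name the table.
ROWS = [
    (123, "Name3", ",000000ihi4MAQ,",
     ",000000TF5MAHZ,000000TF5MAQQ,000000grVA6AM,000000ihi4MAQ,000016ILMhAO,"),
    (124, "Name2", ",000016ILMhAO,", ",000000frVA6AM,000016ILMhAO,"),
    (124, "Name7", ",000024ILMhAO,", ",000000frVA6AM,000016ILMhAO,"),
    (124, "Name8", ",000000frVA7XZ,", ",000000frVA6AM,000016ILMhAO,"),
    (124, "Name5", ",000000TF5MAHZ,", ",000000frVA6AM,000016ILMhAO,"),
    (124, "Name6", ",000000frVA6AM,", ",000000frVA6AM,000016ILMhAO,"),
    (125, "Name11", ",000000frXC6A,",
     ",000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,"),
    (125, "Name13", ",000024IXThJY,",
     ",000000frVA6BC,000024ILMhJZ,000000frXC6A,000024YTMhA,"),
]

# The WHERE clause is the one from the answer. No partitioning is needed,
# because both values being compared sit on the same row.
# Note: SQLite's LIKE ignores ASCII case, while Redshift's LIKE is
# case-sensitive; the sample IDs are not affected by that difference.
QUERY = """
SELECT Person_ID,
       trim(Account_ID, ',') AS Account_ID,
       Account_Name
FROM accounts
WHERE Account_IDs NOT LIKE '%' || Account_ID || '%'
ORDER BY Person_ID, Account_Name
"""


def find_missing_accounts(rows):
    """Return (Person_ID, Account_ID, Account_Name) for rows whose
    Account_ID does not occur in the same row's Account_IDs list."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE accounts ("
            "Person_ID INTEGER, Account_Name TEXT, "
            "Account_ID TEXT, Account_IDs TEXT)"
        )
        conn.executemany("INSERT INTO accounts VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in find_missing_accounts(ROWS):
        print(row)
```

The match is safe against partial hits only because each Account_ID in the sample already carries a leading and trailing comma, so `,000016ILMhAO,` cannot match inside a longer ID. If the stored values lacked those commas, the pattern would need the delimiters added on both sides before comparing.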
