Bigquery coalesce join when key is null
Posted: 2016-07-09 18:38:06
Question:
Table 1
+---------+-----------+--------+
| user_id | email     | action |
+---------+-----------+--------+
| 1       | aa@aa.com | open   |
| 2       | null      | click  |
| 3       | ac@ac.com | click  |
| 4       | ad@ad.com | open   |
+---------+-----------+--------+
Table 2
+---------+-----------+--------+
| user_id | email     | event  |
+---------+-----------+--------+
| 1       | aa@aa.com | sent   |
| null    | ac@ac.com | none   |
| 2       | ab@ab.com | sent   |
| 4       | ad@ad.com | sent   |
+---------+-----------+--------+
I want to join on t1.user_id = t2.user_id, but when the key is null, join on t1.email = t2.email instead.
I tried several ways of writing the join in BigQuery:
1.) ON COALESCE(t1.user_id, t1.email) = COALESCE(t2.user_id, t2.email)
2.) ON CASE WHEN t2.user_id IS NOT NULL THEN t1.user_id = t2.user_id ELSE t1.email = t2.email END
Neither works. How can I do this?
Comments:
SQL has "IS DISTINCT" and "IS NOT DISTINCT", but I don't see them in BigQuery.
Answer 1:
I would split a join like this into two separate ones.
First, join by user_id:
SELECT *
FROM table1 AS t1
JOIN table2 AS t2
ON t1.user_id = t2.user_id
Second, join by email the rows whose ids were missed by the first join:
SELECT *
FROM (
  SELECT * FROM table1
  WHERE user_id NOT IN (
    SELECT t1.user_id
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.user_id = t2.user_id
  )
) t1
JOIN (
  SELECT * FROM table2
  WHERE user_id NOT IN (
    SELECT t1.user_id
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.user_id = t2.user_id
  )
) t2
ON t1.email = t2.email
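The two steps can also be combined into a single result set. The following is only a sketch, assuming BigQuery standard SQL (not the legacy dialect) and the table1/table2 columns shown in the question; the CTE names matched and by_email and the column aliases are made up for illustration. Because NOT IN evaluates to NULL for a NULL user_id in standard SQL, the sketch adds explicit IS NULL checks so that rows with a missing key survive into the email step.

```sql
-- Sketch only: assumes BigQuery standard SQL and the sample tables from the question.
WITH matched AS (
  -- Step 1: rows that pair up on user_id (NULL keys never match here).
  SELECT t1.user_id, t1.email AS t1_email, t1.action,
         t2.email AS t2_email, t2.event
  FROM table1 AS t1
  JOIN table2 AS t2
    ON t1.user_id = t2.user_id
),
by_email AS (
  -- Step 2: rows not consumed by step 1, paired on email instead.
  SELECT t1.user_id, t1.email AS t1_email, t1.action,
         t2.email AS t2_email, t2.event
  FROM table1 AS t1
  JOIN table2 AS t2
    ON t1.email = t2.email
  WHERE (t1.user_id IS NULL OR t1.user_id NOT IN (SELECT user_id FROM matched))
    AND (t2.user_id IS NULL OR t2.user_id NOT IN (SELECT user_id FROM matched))
)
SELECT * FROM matched
UNION ALL
SELECT * FROM by_email
```

On the sample data, step 1 should pair user_id 1, 2 and 4, and step 2 should pair table1's user_id 3 with the table2 row whose user_id is null, via ac@ac.com.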
Comments:
Thanks. I was really hoping it would work with COALESCE. According to the link, it seems to be possible, but I guess BigQuery doesn't support it? ***.com/questions/5304184/…
Per cloud.google.com/bigquery/query-reference#query-grammar :
join_predicate: field_from_one_side_of_the_join = field_from_the_other_side_of_the_join [ AND ...]
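Even in a dialect that accepts expressions in the join predicate, the COALESCE form from the question would probably not give the intended result. COALESCE returns user_id whenever it is non-null on a given side, so table1's row with user_id 3 would be compared against 'ac@ac.com' from table2 and never match; and if user_id is an integer (an assumption, the question does not give column types), mixing it with the string email would also need a cast. A hedged alternative, assuming BigQuery standard SQL, spells the fallback out with OR:

```sql
-- Sketch only: assumes BigQuery standard SQL; column aliases are illustrative.
SELECT t1.user_id AS t1_user_id, t2.user_id AS t2_user_id,
       t1.email   AS t1_email,   t2.email   AS t2_email,
       t1.action, t2.event
FROM table1 AS t1
JOIN table2 AS t2
  -- Primary key match.
  ON t1.user_id = t2.user_id
  -- Fallback: only when a key is missing on either side, match on email.
  OR ((t1.user_id IS NULL OR t2.user_id IS NULL) AND t1.email = t2.email)
```

A predicate with OR is not a plain equality, so such a join may be considerably slower on large tables than the two-step approach in the answer.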