BigQuery MERGE 查询不适用于空的 REPEATED 字段
Posted
技术标签:
【中文标题】BigQuery MERGE 查询不适用于空的 REPEATED 字段【英文标题】:BigQuery MERGE query not working with empty REPEATED field 【发布时间】:2021-02-04 06:53:20 【问题描述】:我正在尝试对具有以下架构的表运行 MERGE
查询:
[
"mode":"REQUIRED",
"name":"container_ref",
"type":"STRING"
,
"mode":"REQUIRED",
"name":"receive_date",
"type":"DATE"
,
"mode":"REQUIRED",
"name":"vendor_id",
"type":"INTEGER"
,
"fields":[
"mode":"NULLABLE",
"name":"code",
"type":"STRING"
,
"mode":"NULLABLE",
"name":"quantity",
"type":"INTEGER"
],
"mode":"REPEATED",
"name":"products",
"type":"RECORD"
,
"mode":"REQUIRED",
"name":"max_fill_ratio",
"type":"FLOAT"
]
我正在尝试运行以下查询:
MERGE `project.dataset.table` T
USING (
WITH container_data AS (
SELECT
'yongkang_210222' AS container_ref
, DATE('2021-02-22') AS receive_date
, 51695 AS vendor_id
, 0.5 AS max_fill_ratio
, [] AS products
)
SELECT * FROM container_data
) S
ON T.container_ref = S.container_ref
WHEN MATCHED THEN
UPDATE SET T.products = S.products, T.max_fill_ratio = S.max_fill_ratio
WHEN NOT MATCHED THEN
INSERT (container_ref, receive_date, vendor_id, products, max_fill_ratio)
VALUES(container_ref, receive_date, vendor_id, products, max_fill_ratio)
WHEN NOT MATCHED BY SOURCE AND receive_date = '2021-02-22' THEN
DELETE
如您所见,在源数据中,products
是一个空数组。我从查询中收到以下错误:
ARRAY 类型的值不能分配给 T.products,它有 type ARRAY
>
如果在 products
未设置为空数组的情况下运行相同的查询,则它可以正常工作。例如:
MERGE `project.dataset.table` T
USING (
WITH container_data AS (
SELECT
'yongkang_210222' AS container_ref
, DATE('2021-02-22') AS receive_date
, 51695 AS vendor_id
, 0.5 AS max_fill_ratio
, [STRUCT('01757' AS code, 10 AS quantity),STRUCT('03831' AS code, 20 AS quantity)] AS products
)
SELECT * FROM container_data
) S
ON T.container_ref = S.container_ref
WHEN MATCHED THEN
UPDATE SET T.products = S.products, T.max_fill_ratio = S.max_fill_ratio
WHEN NOT MATCHED THEN
INSERT (container_ref, receive_date, vendor_id, products, max_fill_ratio)
VALUES(container_ref, receive_date, vendor_id, products, max_fill_ratio)
WHEN NOT MATCHED BY SOURCE AND receive_date = '2021-02-22' THEN
DELETE
我无法弄清楚为什么MERGE
查询中的空数组会导致此错误。我已经尝试使用产品的空数组插入和更新表格,它工作正常。例如,这个查询可以正常工作:
INSERT INTO `project.dataset.table`
(container_ref, receive_date, vendor_id, max_fill_ratio, products)
VALUES (
'yongkang_210222'
, DATE('2021-02-22')
, 51695
, 0.5
, []
)
这开始感觉像是一个 BigQuery 错误。有人知道为什么会这样吗?
【问题讨论】:
【参考方案1】:我不知道为什么,但是在数组之前添加ARRAY<STRUCT<code STRING, quantity INT64>>
已经解决了这个问题:
MERGE `project.dataset.table` T
USING (
WITH container_data AS (
SELECT
'yongkang_210222' AS container_ref
, DATE('2021-02-22') AS receive_date
, 51695 AS vendor_id
, 0.5 AS max_fill_ratio
, ARRAY<STRUCT<code STRING, quantity INT64>> [] AS products
)
SELECT * FROM container_data
) S
ON T.container_ref = S.container_ref
WHEN MATCHED THEN
UPDATE SET T.products = S.products, T.max_fill_ratio = S.max_fill_ratio
WHEN NOT MATCHED THEN
INSERT (container_ref, receive_date, vendor_id, products, max_fill_ratio)
VALUES(container_ref, receive_date, vendor_id, products, max_fill_ratio)
WHEN NOT MATCHED BY SOURCE AND receive_date = '2021-02-22' THEN
DELETE
现在它对于空数组和非空数组都可以正常工作。如果有人可以提供解释,我很想了解为什么这是必要的。
【讨论】:
以上是关于BigQuery MERGE 查询不适用于空的 REPEATED 字段的主要内容,如果未能解决你的问题,请参考以下文章
BigQuery - 时间戳偏移不适用于 insertAll / Streaming?
JSON_EXTRACT 不适用于 BigQuery 中的布尔值