UNNEST/Flatten in Snowflake

Posted: 2022-01-21 20:26:50

【Question】:

I have a MySQL query that I want to convert to Snowflake.

MySQL query:

WITH t AS (
    select id,
        date,
        copt,
        split(copt, '|')[1] as "abc",
        split(copt, '|')[2] as "def",
        split(copt, '|')[3] as "xyz"
    from tablename
    where id in (
            123,
            456,
            789
        )
        and date >= dateadd('day', -6, to_date('2021-12-17'))
        and date <= date '2021-12-17'
        and copt like '%|%|%|%|%|%|%|%|%|%|%'
)
SELECT t.id,
    t.date,
    catId,
    REPLACE(productId, '_', '') as productId
FROM t
    CROSS JOIN UNNEST(
        split(t."abc", '_'),
        split(t."def", '_'),
        split(t."xyz", '_')
    ) as x(catId, productId, quantity)
where productid != ''
order by id

I tried replacing UNNEST() with FLATTEN(), but it did not work.

Can anyone help me convert this query from MySQL to Snowflake?


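【Answer 1】:

MySQL does not appear to support UNNEST, so I could not find a manual page describing how it behaves.

You have not provided any sample input or the output you expect, but assuming abc, def and xyz are independently sized arrays and you want every combination of their elements, the following SQL should work: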

WITH cte_t(id,date, abc, def, xyz, productid) AS (
    SELECT * FROM VALUES
    (1, '2022-01-23'::date, 'a_b_c_d', 'd_e_f', 'x_y_w', 'not empty'),
    (2, '2022-01-23'::date, 'a_d', 'd_f', 'y_w', 'not empty'),
    (3, '2022-01-23'::date, 'a_c_d', 'f', 'y1_y2_w', 'not empty')
)
SELECT
    t.id,
    t.date,
    a.value AS catid,
    b.value AS productid,
    c.value AS quantity
FROM cte_t AS t
    ,LATERAL SPLIT_TO_TABLE(t.abc, '_') a
    ,LATERAL SPLIT_TO_TABLE(t.def, '_') b 
    ,LATERAL SPLIT_TO_TABLE(t.xyz, '_') c
WHERE productid != ''

This gives:

ID  DATE        CATID  PRODUCTID  QUANTITY
1   2022-01-23  a      d          x
1   2022-01-23  b      d          x
1   2022-01-23  c      d          x
1   2022-01-23  d      d          x
1   2022-01-23  a      e          x
1   2022-01-23  b      e          x
1   2022-01-23  c      e          x
1   2022-01-23  d      e          x
1   2022-01-23  a      f          x
1   2022-01-23  b      f          x
1   2022-01-23  c      f          x
1   2022-01-23  d      f          x
1   2022-01-23  a      d          y
1   2022-01-23  b      d          y
1   2022-01-23  c      d          y
1   2022-01-23  d      d          y
1   2022-01-23  a      e          y
1   2022-01-23  b      e          y
1   2022-01-23  c      e          y
1   2022-01-23  d      e          y
1   2022-01-23  a      f          y
1   2022-01-23  b      f          y
1   2022-01-23  c      f          y
1   2022-01-23  d      f          y
1   2022-01-23  a      d          w
1   2022-01-23  b      d          w
1   2022-01-23  c      d          w
1   2022-01-23  d      d          w
1   2022-01-23  a      e          w
1   2022-01-23  b      e          w
1   2022-01-23  c      e          w
1   2022-01-23  d      e          w
1   2022-01-23  a      f          w
1   2022-01-23  b      f          w
1   2022-01-23  c      f          w
1   2022-01-23  d      f          w
2   2022-01-23  a      d          y
2   2022-01-23  d      d          y
2   2022-01-23  a      f          y
2   2022-01-23  d      f          y
2   2022-01-23  a      d          w
2   2022-01-23  d      d          w
2   2022-01-23  a      f          w
2   2022-01-23  d      f          w
3   2022-01-23  a      f          y1
3   2022-01-23  c      f          y1
3   2022-01-23  d      f          y1
3   2022-01-23  a      f          y2
3   2022-01-23  c      f          y2
3   2022-01-23  d      f          y2
3   2022-01-23  a      f          w
3   2022-01-23  c      f          w
3   2022-01-23  d      f          w

OK, that is more data than I would have expected, but there you go.
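
The row count grows quickly because three independent lateral splits yield every combination of elements, so a row whose strings have 4, 3 and 3 parts expands to 4 × 3 × 3 = 36 rows. Here is a minimal Python sketch of that behaviour on the sample data above. It is not Snowflake code, and `cross_expand` is a hypothetical helper name used only for illustration:

```python
from itertools import product


def cross_expand(abc: str, def_: str, xyz: str, sep: str = "_"):
    """Model three chained lateral splits: every combination of the parts."""
    a_parts = abc.split(sep)
    b_parts = def_.split(sep)
    c_parts = xyz.split(sep)
    # One output tuple per (a, b, c) combination; ordering here is incidental.
    return list(product(a_parts, b_parts, c_parts))


# Same three sample rows as in the VALUES clause above (id -> abc, def, xyz).
rows = {
    1: ("a_b_c_d", "d_e_f", "x_y_w"),
    2: ("a_d", "d_f", "y_w"),
    3: ("a_c_d", "f", "y1_y2_w"),
}

counts = {row_id: len(cross_expand(*cols)) for row_id, cols in rows.items()}
print(counts)  # {1: 36, 2: 8, 3: 9}
```

The counts match the 36, 8 and 9 rows per id in the output above. If the intent was to pair elements by position instead, this is not the right expansion.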

If the three arrays are all the same size, you can instead use one SPLIT_TO_TABLE and two SPLIT_PART calls:

WITH cte_t(id,date, abc, def, xyz, productid) AS (
    SELECT * FROM VALUES
    (1, '2022-01-23'::date, 'a_b_c', 'd_e_f', 'x_y_w', 'not empty')
)
SELECT
    t.id,
    t.date,
    a.value AS catid,
    split_part(t.def, '_', a.index) AS productid,
    split_part(t.xyz, '_', a.index) AS quantity
FROM cte_t AS t
    ,LATERAL SPLIT_TO_TABLE(t.abc, '_') a
WHERE productid != ''

This gives:

ID  DATE        CATID  PRODUCTID  QUANTITY
1   2022-01-23  a      d          x
1   2022-01-23  b      e          y
1   2022-01-23  c      f          w
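
The same-size variant pairs elements by position rather than combining them. The rough Python model below is again not Snowflake code, and `zip_expand` and the local `split_part` are made-up helpers. It assumes, as I understand the Snowflake functions, that the split index is 1-based and that an out-of-range part comes back as an empty string. Only positive indexes are modelled.

```python
def split_part(text: str, sep: str, index: int) -> str:
    """Return the 1-based part of text, or '' when the index is out of range."""
    parts = text.split(sep)
    return parts[index - 1] if 1 <= index <= len(parts) else ""


def zip_expand(abc: str, def_: str, xyz: str, sep: str = "_"):
    """One output row per element of abc, paired by position with def and xyz."""
    out = []
    for index, cat_id in enumerate(abc.split(sep), start=1):
        product_id = split_part(def_, sep, index)
        quantity = split_part(xyz, sep, index)
        out.append((cat_id, product_id, quantity))
    return out


print(zip_expand("a_b_c", "d_e_f", "x_y_w"))
# [('a', 'd', 'x'), ('b', 'e', 'y'), ('c', 'f', 'w')]

# If def is shorter than abc, the missing position shows up as ''.
print(zip_expand("a_b_c", "d_e", "x_y_w"))
# [('a', 'd', 'x'), ('b', 'e', 'y'), ('c', '', 'w')]
```

Because the row count is driven by the first string only, this approach silently drops any extra elements in the other two strings, which is why it suits the case where all three have the same number of parts.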
