Lateral flatten array with multiple JSON objects in Snowflake

Posted: 2019-08-10 15:15:43

Question:

I have an array containing multiple JSON objects. The maximum number of elements in any JSON array in the table is 8.

Here is an example of the array's raw value:

                              variants
----------------------------------------------------------------

[
      {
        "id": 12388362846279,
        "inventory_quantity": 10,
        "sku": "sku1"
      },
      {
        "id": 12388391387207,
        "inventory_quantity": 31,
        "sku": "sku2"
      },
      {
        "id": 12394420142151,
        "inventory_quantity": 12,
        "sku": "sku3"
      },
      {
        "id": 12394426007623,
        "inventory_quantity": 4,
        "sku": "sku4"
      },
      {
        "id": 12394429022279,
        "inventory_quantity": 9,
        "sku": "sku5"
      },
      {
        "id": 12394431414343,
        "inventory_quantity": 15,
        "sku": "sku6"
      },
      {
        "id": 12394455597127,
        "inventory_quantity": 22,
        "sku": "sku7"
      },
      {
        "id": 12394459856967,
        "inventory_quantity": 0,
        "sku": "sku8"
      }
    ]

My query tries to flatten and parse the array so that it returns one row per object:

select 
      variants[0]:sku,
      variants[0]:inventory_quantity,
      variants[1]:sku,
      variants[1]:inventory_quantity,
      variants[2]:sku,
      variants[2]:inventory_quantity,
      variants[3]:sku,
      variants[3]:inventory_quantity,
      variants[4]:sku,
      variants[4]:inventory_quantity,
      variants[5]:sku,
      variants[5]:inventory_quantity,
      variants[6]:sku,
      variants[6]:inventory_quantity,
      variants[7]:sku,
      variants[7]:inventory_quantity
from table
, lateral flatten(input => variants)

However, my output returns the same values repeated on every row:

+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+
| sku1 | 10 | sku2 | 31 | sku3 | 12 | sku4 | 4 | sku5 | 9 | sku6 | 15 | sku7 | 22 | sku8 | 0 |
+------+----+------+----+------+----+------+---+------+---+------+----+------+----+------+---+

I would like my output to look something like this:

+------+----+
| sku1 | 10 |
+------+----+
| sku2 | 31 |
+------+----+
| sku3 | 12 |
+------+----+
| sku4 | 4  |
+------+----+
| sku5 | 9  |
+------+----+
| sku6 | 15 |
+------+----+
| sku7 | 22 |
+------+----+
| sku8 | 0  |
+------+----+

Answer 1:

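With LATERAL FLATTEN there is no need to reference array positions explicitly. Each member of the array becomes its own row, and the element itself is available through the VALUE column of the flattened output. So, to get the result you want above, just use: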

select v.value:sku::varchar, 
       v.value:inventory_quantity 
from table, 
lateral flatten(input => table.variants) v
;
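
For anyone who wants to try this without the real table, here is a minimal self-contained sketch of the same query. The CTE name `products`, the trimmed three-element array, and the output aliases are assumptions made up for illustration; only `parse_json`, `lateral flatten`, and the `VALUE` / `INDEX` output columns come from Snowflake itself.

```sql
-- Hypothetical stand-in for the real table: one row holding a 3-element array.
with products as (
    select parse_json('[
        {"id": 12388362846279, "inventory_quantity": 10, "sku": "sku1"},
        {"id": 12388391387207, "inventory_quantity": 31, "sku": "sku2"},
        {"id": 12394420142151, "inventory_quantity": 12, "sku": "sku3"}
    ]') as variants
)
select v.index                            as array_index,        -- position of the element in the array
       v.value:sku::varchar               as sku,                -- read from the flattened element
       v.value:inventory_quantity::number as inventory_quantity
from products p,
     lateral flatten(input => p.variants) v
order by v.index;
```

This also shows why the query in the question repeats itself: FLATTEN already produces one row per element, but the SELECT there reads `variants[0]` through `variants[7]` from the base row instead of `v.value`, so every one of the eight rows prints the whole array.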

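If the table has columns outside the array that you want to reference in every row, simply include them in the SELECT. In essence, the flattened rows from the array are implicitly "joined" to the table's non-nested columns...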
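
A short sketch of that implicit join, assuming a hypothetical non-nested column `product_id` next to `variants` (the column name, the ids 101 and 102, and the sample SKUs are invented for the example):

```sql
-- Hypothetical table with one scalar column and one array column.
with products as (
    select 101 as product_id,
           parse_json('[{"sku": "sku1", "inventory_quantity": 10},
                        {"sku": "sku2", "inventory_quantity": 31}]') as variants
    union all
    select 102,
           parse_json('[{"sku": "sku9", "inventory_quantity": 7}]')
)
select p.product_id,                                   -- repeated for each element of that row's array
       v.value:sku::varchar               as sku,
       v.value:inventory_quantity::number as inventory_quantity
from products p,
     lateral flatten(input => p.variants) v;
```

Each base row is paired only with the elements of its own array, so product 101 should yield two rows and product 102 one. Rows whose array is empty or NULL drop out of the result by default; passing `outer => true` to FLATTEN keeps them with NULL in the flattened columns, if that is what you need.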

Comments:

What Stuart said. If you still want 8 columns, possibly partially empty, you can use select <what you had> ... from table

@FaisalAl-Khalidi, if Stuart's answer helped you, please accept and upvote his answer :)
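
A rough sketch of the wide layout mentioned in the first comment, reading positions directly and leaving FLATTEN out. The CTE `products`, the two-element sample array, and the `sku_n` / `qty_n` aliases are assumptions for illustration, and only three of the eight positions are written out:

```sql
-- Hypothetical one-row table whose array has only 2 elements.
with products as (
    select parse_json('[{"sku": "sku1", "inventory_quantity": 10},
                        {"sku": "sku2", "inventory_quantity": 31}]') as variants
)
select variants[0]:sku::varchar               as sku_1,
       variants[0]:inventory_quantity::number as qty_1,
       variants[1]:sku::varchar               as sku_2,
       variants[1]:inventory_quantity::number as qty_2,
       variants[2]:sku::varchar               as sku_3,  -- no third element, so this comes back NULL
       variants[2]:inventory_quantity::number as qty_3
from products;   -- no lateral flatten: one output row per table row
```

Without the FLATTEN in the FROM clause there is nothing to multiply the rows, so each table row yields exactly one wide row, and positions past the end of the array are simply NULL.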
