How to flatten nested json data in Azure stream analytics
[Posted]: 2020-12-24 13:37:54
[Question]: I am having trouble writing a query that extracts a table from the arrays in a JSON file. I want to flatten three arrays, namely case_Time, details, and others, and put them all into one ordinary SQL table.
Sample JSON data:
{
  "case_Time": [
    {
      "v1": "1",
      "v2": "0",
      "v3": "0",
      "date": "30 January ",
      "dateymd": "2020-01-30",
      "v4": "1",
      "v5": "0",
      "v6": "0"
    },
    {
      "v1": "1",
      "v2": "0",
      "v3": "0",
      "date": "31 January ",
      "dateymd": "2020-01-31",
      "v4": "1",
      "v5": "0",
      "v6": "0"
    }
  ],
  "details": [
    {
      "d1": "281844",
      "d2": "10124024",
      "d3": "146791",
      "d4": "0",
      "d5": "0",
      "d6": "0",
      "lastupdatedtime": "24/12/2020 09:12:24",
      "d7": "2746",
      "d8": "9692643",
      "d9": "Total",
      "notes": "some text"
    },
    {
      "d1": "281944",
      "d2": "1012",
      "d3": "1791",
      "d4": "0",
      "d5": "0",
      "d6": "0",
      "lastupdatedtime": "25/12/2020 09:12:24",
      "d7": "2746",
      "d8": "96643",
      "d9": "Total",
      "notes": "some text"
    }
  ],
  "others": [
    {
      "p1": "",
      "p2": "75.64",
      "p3": "",
      "p4": "",
      "p5": "",
      "p6": "",
      "date": "13/03/2020",
      "p7": "",
      "p8": "1.20%",
      "p9": "",
      "p10": "83.33",
      "p11": "5",
      "p12": "5900",
      "p13": "78"
    },
    {
      "p1": "",
      "p2": "75.64",
      "p3": "",
      "p4": "",
      "p5": "",
      "p6": "",
      "date": "14/03/2020",
      "p7": "",
      "p8": "1.20%",
      "p9": "",
      "p10": "81.33",
      "p11": "5",
      "p12": "500",
      "p13": "78"
    }
  ]
}
I tried the following query, but it only returns data from the first array. How do I flatten the remaining arrays?
WITH Cases AS
(
    SELECT
        arrayElement.ArrayIndex,
        arrayElement.ArrayValue as av
    FROM input as event
    CROSS APPLY GetArrayElements(event.case_Time) AS arrayElement
)
SELECT av.v1, av.v2, av.v3, av.date, av.dateymd, av.v4, av.v5, av.v6
INTO powerbi
FROM Cases
Any help is appreciated :)
[Answer 1]: You can CROSS APPLY all of your arrays. Try something like this:
WITH Cases AS
(
    SELECT
        arrayElement.ArrayIndex as ai,
        arrayElement.ArrayValue as av,
        y.ArrayIndex as yi,
        y.ArrayValue as dt,
        z.ArrayIndex as zi,
        z.ArrayValue as ot
    FROM input as event
    -- one CROSS APPLY per array; each adds an index/value pair per element
    CROSS APPLY GetArrayElements(event.case_Time) AS arrayElement
    CROSS APPLY GetArrayElements(event.details) AS y
    CROSS APPLY GetArrayElements(event.others) AS z
)
SELECT
    av.v1, av.v2, av.v3, av.date, av.dateymd, av.v4, av.v5, av.v6,
    dt.d1, dt.d2, dt.d3, dt.d4, dt.d5, dt.d6, dt.lastupdatedtime, dt.d7, dt.d8, dt.d9, dt.notes,
    ot.p1, ot.p2, ot.p3, ot.p4, ot.p5, ot.p6, ot.p7, ot.p8, ot.p9, ot.p10, ot.p11, ot.p12, ot.p13,
    ot.date as tdate
-- the INTO clause goes before FROM
INTO powerbi
FROM Cases
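This query produces a full cross product, so with the sample data you get 8 rows (2 x 2 x 2). If you want only 2 rows (one per matching array index), add WHERE ai = yi AND yi = zi to the final SELECT.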
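As a rough sketch of the index-matched variant described above, the query below adds the WHERE filter to the final SELECT so that only elements sharing the same array index are combined. It assumes the same input and output names as the question (input, powerbi). The aliases c, d, and o and the shortened column list are only illustrative choices, not part of the original answer, and the query has not been run against a live job.

```sql
WITH Cases AS
(
    SELECT
        c.ArrayIndex AS ai, c.ArrayValue AS av,
        d.ArrayIndex AS yi, d.ArrayValue AS dt,
        o.ArrayIndex AS zi, o.ArrayValue AS ot
    FROM input AS event
    CROSS APPLY GetArrayElements(event.case_Time) AS c
    CROSS APPLY GetArrayElements(event.details) AS d
    CROSS APPLY GetArrayElements(event.others) AS o
)
-- Column list shortened for illustration; add the remaining fields as needed.
SELECT
    av.dateymd, av.v1, av.v4,
    dt.d1, dt.lastupdatedtime,
    ot.p2, ot.date AS tdate
INTO powerbi
FROM Cases
-- Keep only rows where all three arrays are at the same position.
WHERE ai = yi AND yi = zi
```

With the two-element sample arrays this should reduce the 8 cross-product rows to 2. Note that matching by position only makes sense if the three arrays are meant to line up element by element; if one array is shorter, its missing positions simply produce no rows.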
[Comments]:
Thanks, I was able to unpack the arrays.