How to parse multiple nested JSON arrays with Hive
{
  "base": {
    "code": "xm",
    "name": "project"
  },
  "list": [
    {
      "ACode": "cp1",
      "AName": "Product1",
      "BList": [
        {
          "BCode": "gn1",
          "BName": "Feature1"
        },
        {
          "BCode": "gn2",
          "BName": "Feature2"
        }
      ]
    },
    {
      "ACode": "cp2",
      "AName": "Product2",
      "BList": [
        {
          "BCode": "gn1",
          "BName": "Feature1"
        }
      ]
    }
  ]
}
Given JSON like this, I want to get the following result:
| code | name    | ACode | AName    | BCode | BName    |
| ---- | ------- | ----- | -------- | ----- | -------- |
| xm | project | cp1 | Product1 | gn1 | Feature1 |
| xm | project | cp1 | Product1 | gn2 | Feature2 |
| xm | project | cp2 | Product2 | gn1 | Feature1 |
I tried this:
SELECT
    code
  , name
  , get_json_object(t.list, '$.[*].ACode') AS ACode
  , get_json_object(t.list, '$.[*].AName') AS AName
  , get_json_object(t.list, '$.[*].BList[*].BCode') AS BCode
  , get_json_object(t.list, '$.[*].BList[*].BName') AS BName
FROM
(
    SELECT
        get_json_object(t.value, '$.base.code') AS code
      , get_json_object(t.value, '$.base.name') AS name
      , get_json_object(t.value, '$.list') AS list
    FROM
    (
        SELECT
            '{"base":{"code":"xm","name":"project"},"list":[{"ACode":"cp1","AName":"Product1","BList":[{"BCode":"gn1","BName":"Feature1"},{"BCode":"gn2","BName":"Feature2"}]},{"ACode":"cp2","AName":"Product2","BList":[{"BCode":"gn1","BName":"Feature1"}]}]}' AS value
    )
    t
)
t
;
It returns this:
xm project ["cp1","cp2"] ["Product1","Product2"] ["gn1","gn2","gn1"] ["Feature1","Feature2","Feature1"]
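However, I found that this would end up generating six rows, which looks like a Cartesian product. I also tried split(string, "\},\{"), but that splits the inner arrays at the same time as the outer one. Any help would be appreciated.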
Answer
I solved it!
SELECT
    code
  , name
  , ai.ACode
  , ai.AName
  , p.BCode
  , p.BName
FROM
(
    SELECT
        get_json_object(t.value, '$.base.code') AS code
      , get_json_object(t.value, '$.base.name') AS name
      , get_json_object(t.value, '$.list') AS list
    FROM
    (
        SELECT
            '{"base":{"code":"xm","name":"project"},"list":[{"ACode":"cp1","AName":"Product1","BList":[{"BCode":"gn1","BName":"Feature1"},{"BCode":"gn2","BName":"Feature2"}]},{"ACode":"cp2","AName":"Product2","BList":[{"BCode":"gn1","BName":"Feature1"}]}]}' AS value
    )
    t
)
t
-- Split the outer array only where one element ends with "}]}" and the next begins with "{"
lateral view explode(split(regexp_replace(regexp_extract(list,'^\\[(.+)\\]$',1),'\\}\\]\\}\\,\\{', '\\}\\]\\}\\|\\|\\{'),'\\|\\|')) list as a
lateral view json_tuple(a,'ACode','AName','BList') ai as ACode
  , AName
  , BList
-- Split each inner BList array on "},{"
lateral view explode(split(regexp_replace(regexp_extract(BList,'^\\[(.+)\\]$',1),'\\}\\,\\{', '\\}\\|\\|\\{'),'\\|\\|')) BList as b
lateral view json_tuple(b,'BCode','BName') p as BCode
  , BName
;
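The answer gives no explanation, so here is a minimal sketch of the one step it uses twice: turning a JSON array string into one row per element. It is an illustration under stated assumptions, not the author's exact code. The literal is the BList value of the first list element from the question, and the aliases blist, e and b are hypothetical names chosen for this example. It also assumes that "||" never occurs in the data and that "},{" appears only at element boundaries.

```sql
-- Sketch: explode one JSON array string into one row per element.
-- 1. regexp_extract strips the enclosing [ and ].
-- 2. regexp_replace rewrites each "},{" element boundary as "}||{".
-- 3. split cuts on "||", and explode emits one row per piece.
SELECT
    b
  , get_json_object(b, '$.BCode') AS BCode
  , get_json_object(b, '$.BName') AS BName
FROM
(
    -- Hypothetical inline input; in the full query this is the BList column.
    SELECT '[{"BCode":"gn1","BName":"Feature1"},{"BCode":"gn2","BName":"Feature2"}]' AS blist
) t
LATERAL VIEW explode(
    split(
        regexp_replace(
            regexp_extract(blist, '^\\[(.+)\\]$', 1),
            '\\}\\,\\{', '\\}\\|\\|\\{'
        ),
        '\\|\\|'
    )
) e AS b
;
```

This should yield one row per object in the array. The outer array needs the longer boundary pattern "}]},{" because a plain "},{" would also match inside each BList. That in turn appears to rely on BList being the last key of every list element, so it is worth checking against real data before reusing the query.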