如何将字符串转换为hive中的struct数组并进行爆炸?
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了如何将字符串转换为hive中的struct数组并进行爆炸?相关的知识,希望对你有一定的参考价值。
我在hive中有以下格式的数据。表test(seq string, result string);
|seq | result |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|0001 | [{"offerId":"Default_XYZ","businessName":"Apple","businessGroup":"Default","businessIssue":"Default","interactionId":"-4930126168287369915","campaignID":"P-1","rank":"1"},{"offerId":"Default_NAV","businessName":"Orange","businessGroup":"Default","businessIssue":"Default","interactionId":"-7830126168223452134","campaignID":"P-1","rank":"2"}] |
输出应该是这样的
|seq | offerId | businessName | businsesGroup| businessIssue | interactionId | campaignId | rank |
----------------------------------------------------------------------------------------------------------------
|0001 | Default_XYZ | Apple | Default | Default | -4930126168287369915 | P-1 | 1 |
|0001 | Default_NAV | Orange | Default | Default | -7830126168223452134 | P-1 | 2 |
我试图将字符串转换为结构数组,但它不适用于直接CAST。
有什么帮助吗?
[编辑 - 尝试以下查询]
select sequenceNumber, offerId, businessName, rank from (
select sequenceNumber,
collect_list(oid['offerId']) as offerid_list
, collect_list(oid['businessName']) as businessName_list
,collect_list(oid['rank']) as rank_list
from (
select sequenceNumber,
str_to_map(translate(offer_Id,'{}','')) as oid
from test
lateral view explode (split(translate(result, '[]"',''),"\},")) oid as offer_id
) x
group by sequenceNumber
) y lateral view explode(offerid_list) olist as offerId
lateral view explode(businessName_list) olist as businessName
lateral view explode(rank_list) rlist as rank
答案
找到我的问题的一个解决方案:
select
seq,
split(split(results,",")[0],':')[1] as offerId,
split(split(results,",")[1],':')[1] as businessName,
split(split(results,",")[2],':')[1] as businessGroup,
split(split(results,",")[3],':')[1] as businessIssue,
split(split(results,",")[4],':')[1] as interactionId,
split(split(results,",")[5],':')[1] as campignId
regexp_replace(split(split(results,",")[6],":")[1], "[\]|}]", "") as rank
from
(
select seq,
split(translate(result), '"\[|]|""',''), "},") as r
from test
) t1
LATERAL VIEW explode(r) rr AS results
另一答案
你可以尝试get_json_object
功能。
select seq, get_json_object(result,'$[0].offerId') as offerId,
get_json_object(result,'$[0].businessName') as businessName,
get_json_object(result,'$[0].businsesGroup') as businsesGroup,
get_json_object(result,'$[0].businessIssue') as businessIssue,
get_json_object(result,'$[0].interactionId') as interactionId,
get_json_object(result,'$[0].campaignId') as campaignId,
get_json_object(result,'$[0].rank') as rank
from t
UNION ALL
select seq, get_json_object(result,'$[1].offerId') as offerId,
get_json_object(result,'$[1].businessName') as businessName,
get_json_object(result,'$[1].businsesGroup') as businsesGroup,
get_json_object(result,'$[1].businessIssue') as businessIssue,
get_json_object(result,'$[1].interactionId') as interactionId,
get_json_object(result,'$[1].campaignId') as campaignId,
get_json_object(result,'$[1].rank') as rank
from t
以上是关于如何将字符串转换为hive中的struct数组并进行爆炸?的主要内容,如果未能解决你的问题,请参考以下文章