Google BigQuery SQL: Parse multi level JSON (list +json+list+json) to Columns

【Posted】: 2021-03-15 11:42:22 【Question】:

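I have a JSON string with two levels of nesting: a list of JSON objects, each containing a nested list of JSON objects (list[json[list[json]]]):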

UserID  Json_List
100     [{"application_charge":{"id":13409353813,"name":"Starter","api_client_id":2485321},"usage_charges":[{"id":48216805,"description":"Extras","price":"60.70"}]}]
200     [{"application_charge":{"id":13409353814,"name":"Starter","api_client_id":2485322},"usage_charges":[{"id":48216890,"description":"Extras","price":"80.79"}]}]

I need tabular output like this:

UserID  application_charge.id   name    api_client_id   usage_charges.id    description   price
100      13409353813          Starter    2485321        48216805           Extras         60.7
200      13409353814          Starter    2485322        48216890           Extras         80.79

I managed to extract the first level, "application_charge", but I don't understand how to get to the next level, "usage_charges":

select  
  UserID,
  json_extract_scalar(json, '$.application_charge.id')  as id,
  json_extract_scalar(json, '$.application_charge.name')  as name,
  json_extract_scalar(json, '$.application_charge.api_client_id')  as api_client_id
from `be-prod-data.retailers_billing_data_production.shopify_application_charges`,
unnest(json_extract_array( json_list, '$')) json

How can I extract the data from the second level?

【Answer 1】:

Add one more unnest(json_extract_array(... for the nested array:

-- Sample data; the braces of the JSON objects are restored so the strings parse as valid JSON
with mytable as (
  select 100 as userid, '[{"application_charge":{"id":13409353813,"name":"Starter","api_client_id":2485321},"usage_charges":[{"id":48216805,"description":"Extras","price":"60.70"}]}]' as json_list union all
  select 200, '[{"application_charge":{"id":13409353814,"name":"Starter","api_client_id":2485322},"usage_charges":[{"id":48216890,"description":"Extras","price":"80.79"}]}]'
)
select
  UserID,
  json_extract_scalar(json, '$.application_charge.id')  as id,
  json_extract_scalar(json, '$.application_charge.name')  as name,
  json_extract_scalar(json, '$.application_charge.api_client_id')  as api_client_id,
  json_extract_scalar(nested, '$.id')  as usage_charges_id
from mytable,
-- first unnest: one row per element of the outer list
unnest(json_extract_array(json_list, '$')) json,
-- second unnest: one row per element of that element's usage_charges list
unnest(json_extract_array(json, '$.usage_charges')) as nested
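
The query above returns only the id of each usage charge. As a minimal sketch, the same double-unnest pattern can be extended to cover every column of the requested output. It assumes that usage_charges sits next to application_charge inside each list element, which is what the '$.usage_charges' path implies. The aliases charge and usage_charge and the cast of price to numeric are illustrative choices, not part of the original answer.

```sql
-- Sketch only: sample rows are inlined so the query is self-contained.
with mytable as (
  select 100 as userid, '[{"application_charge":{"id":13409353813,"name":"Starter","api_client_id":2485321},"usage_charges":[{"id":48216805,"description":"Extras","price":"60.70"}]}]' as json_list union all
  select 200, '[{"application_charge":{"id":13409353814,"name":"Starter","api_client_id":2485322},"usage_charges":[{"id":48216890,"description":"Extras","price":"80.79"}]}]'
)
select
  userid,
  -- first level: fields of the application_charge object
  json_extract_scalar(charge, '$.application_charge.id') as application_charge_id,
  json_extract_scalar(charge, '$.application_charge.name') as name,
  json_extract_scalar(charge, '$.application_charge.api_client_id') as api_client_id,
  -- second level: fields of each element of usage_charges
  json_extract_scalar(usage_charge, '$.id') as usage_charges_id,
  json_extract_scalar(usage_charge, '$.description') as description,
  -- price is stored as a JSON string; the cast to numeric is an assumption about the desired type
  cast(json_extract_scalar(usage_charge, '$.price') as numeric) as price
from mytable,
  unnest(json_extract_array(json_list, '$')) as charge,
  unnest(json_extract_array(charge, '$.usage_charges')) as usage_charge
```

Note that the comma (cross) join drops any row whose usage_charges array is missing or empty. If such rows should be kept with NULLs in the usage columns, replacing the last comma with left join unnest(...) is one option worth trying.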
