Compare JSON Values and identify the differences - Snowflake SQL
【Posted】: 2020-12-03 18:16:53
【Question】: I am trying to compare two sets of JSON values for each transaction and extract specific values from them. The values I want to extract are cCode, dCode, hcps and mod. Could you guide me on the Snowflake SQL syntax for this? The first JSON is entered by the coder and the second JSON by the auditor.
1st json
[
  {
    "cCode": "7832",
    "Date": "08/26/2020",
    "ID": "511",
    "description": "holos",
    "dgoses": [
      {
        "description": "disease",
        "dCode": "Y564",
        "CodeAllId": "8921",
        "messages": [""],
        "sequenceNumber": 1
      },
      {
        "description": "acute pain",
        "dCode": "U3321",
        "CodeAllId": "33213",
        "messages": [""],
        "sequenceNumber": 2
      },
      {
        "description": "height",
        "dCode": "U1111",
        "CodeAllId": "33278",
        "messages": [""],
        "sequenceNumber": 3
      },
      {
        "description": "PIDEMIA ",
        "dCode": "H8811",
        "CodeAllId": "90000",
        "messages": [""],
        "sequenceNumber": 4
      }
    ],
    "familyPlan": "",
    "hcpc": 5,
    "id": "",
    "isEPS": false,
    "mod": "67",
    "originalUnitAmount": "8888",
    "type": "CHARGE",
    "unitAmount": "9000",
    "vId": "90001"
  },
  {
    "cCode": "900114",
    "Date": "08/26/2020",
    "ID": "523",
    "description": "heart valve",
    "dgoses": [
      {
        "description": "Fever",
        "dCode": "J8923",
        "CodeAllId": "892138",
        "messages": [""],
        "sequenceNumber": 1
      }
    ],
    "familyPlan": "",
    "hcpc": 1,
    "id": "",
    "mod": "26",
    "originalUnitAmount": "19039",
    "type": "CHARGE",
    "unitAmount": "1039",
    "vId": "5113"
  }
]
2nd JSON
[
  {
    "cCode": "78832",
    "Date": "08/26/2020",
    "ID": "511",
    "description": "holos",
    "dgoses": [
      {
        "description": "disease",
        "dCode": "Y564",
        "CodeAllId": "8921",
        "messages": [""],
        "sequenceNumber": 1
      },
      {
        "description": "acute pain",
        "dCode": "U3321",
        "CodeAllId": "33213",
        "messages": [""],
        "sequenceNumber": 2
      },
      {
        "description": "height",
        "dCode": "U41111",
        "CodeAllId": "33278",
        "messages": [""],
        "sequenceNumber": 3
      },
      {
        "description": "PIDEMIA ",
        "dCode": "H8811",
        "CodeAllId": "90000",
        "messages": [""],
        "sequenceNumber": 4
      }
    ],
    "familyPlan": "",
    "hcpc": 8,
    "id": "",
    "isEPS": false,
    "mod": "67",
    "originalUnitAmount": "8888",
    "type": "CHARGE",
    "unitAmount": "9000",
    "vId": "90001"
  },
  {
    "cCode": "900114",
    "Date": "08/26/2020",
    "ID": "523",
    "description": "heart valve",
    "dgoses": [
      {
        "description": "Fever",
        "dCode": "J8923",
        "CodeAllId": "892138",
        "messages": [""],
        "sequenceNumber": 1
      }
    ],
    "familyPlan": "",
    "hcpc": 1,
    "id": "",
    "mod": "126",
    "originalUnitAmount": "19039",
    "type": "CHARGE",
    "unitAmount": "1039",
    "vId": "5113"
  }
]
I am looking for a result like the following (one output row, shown here as column/value pairs because the row is very wide):
Column          Value
Billid          7111
ctextid         89321
cCode-Coder     7832,900114
cCode-Auditor   78832,900114
deletedccode    7832
added-ccode     78832
dCode-Coder     Y564,U3321,U1111,H8811,J8923
dCode-Auditor   Y564,U3321,U41111,H8811,J8923
deleted-dcode   U1111
added-dcode     U41111
hcpc-coder      5,1
hcpc-auditor    8,1
deletedhcpc     5
addedhcpc       8
mod-coder       67,26
mod-auditor     67,126
deletedmod      26
addedmod        126
Can anyone help me?
SQL I tried:
with cte4 as
(
select info:dCode as dtcode
from cte3, lateral flatten( input => saveinfo:dgoses )
)
select dCode from cte4, lateral flatten( input => dtcode )
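This fails immediately with a usage error:
I have already written a SQL Server version of the code, but I need to know how to map its JSON functions to their Snowflake SQL equivalents. Could you please help here?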
with I as
(
select *,
dense_rank() over (order by Billid, Ctextid) as tid,
dense_rank() over (partition by Billid, Ctextid order by Created) as n
from ##input1
),
D as
(
select I.*, mk.[key] as mk, m.*, dk.[key] as dk, d.*
from I
cross apply openjson(info) mk
cross apply openjson(mk.value) with
(
cCode nvarchar(max) '$.cCode',
dgoses nvarchar(max) '$.dgoses' as json
) m
cross apply openjson(dgoses) dk
cross apply openjson(dk.value) with
(
dCode nvarchar(max) '$.dCode'
) d
),
C as
(
select * from D where n = 1
),
A as
(
select * from D where n = 2
)
select
Billid,
Codedby,
Ctextid,
(
select string_agg(cCode, ',') within group (order by mk)
from
(
select distinct cCode, mk
from C
where tid = t.tid
) d
) as cCodeCoder,
(
select string_agg(cCode, ',') within group (order by mk)
from
(
select distinct cCode, mk
from A
where tid = t.tid
) d
) as cCodeAuditor,
(
select string_agg(cCode, ',')
from
(
select cCode
from C
where tid = t.tid
except
select cCode
from A
where tid = t.tid
) d
) as deletedcCode,
(
select string_agg(cCode, ',')
from
(
select cCode
from A
where tid = t.tid
except
select cCode
from C
where tid = t.tid
) d
) as addedcCode,
(
select string_agg(dCode, ',') within group (order by mk, dk)
from
(
select distinct dCode, mk, dk
from C
where tid = t.tid
) d
) as dCodeCoder,
(
select string_agg(dCode, ',') within group (order by mk, dk)
from
(
select distinct dCode, mk, dk
from A
where tid = t.tid
) d
) as dCodeAuditor,
(
select string_agg(dCode, ',')
from
(
select dCode
from C
where tid = t.tid
except
select dCode
from A
where tid = t.tid
) d
) as deleteddCode,
(
select string_agg(dCode, ',')
from
(
select dCode
from A
where tid = t.tid
except
select dCode
from C
where tid = t.tid
) d
) as addeddCode
from I as t
where n = 1
Thanks, Arun
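【Comments】:
What is your table structure? Are the two JSON values in the same row in two different columns? Your second JSON has extra double quotes. That will not parse as a JSON variant, so is that an artifact of copying?
@GregPavlik: Sorry, there are no double quotes in the second JSON. I have edited the question accordingly. The JSON values are in two different rows. The table structure is Billid (int), Ctextid (int), info (JSON), created (timestamp), createdby (varchar). The first JSON is created by the coder at the earlier created time, and the second JSON is created by the auditor at the later created time. The second JSON (by the auditor) is the correct and final one of the two. Every Billid and Ctextid pair has 2 JSON values. The JSON is stored in the info column.
Any input here would be helpful!
【Answer 1】:
I'm not entirely sure how you need the data, but you are trying to get "cCode, dCode, hcps and mod" (assuming hcps is actually hcpc). The catch is that cCode, hcpc, and mod are all at the same level of the JSON, and dCode is not. It is nested one level below the other properties and has a one-to-many relationship with them. The data can be flattened either into two tables with a 1:MANY relationship, or into a single table that repeats the cCode, hcpc, and mod values. This example shows the second option: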
-- I created a table named FOO and added your JSON as a variant
create temp table foo(v variant);
with
JSON(C_CODE, DGOSES, HCPC, "MOD") as
(
select "VALUE":cCode::int as C_CODE
,"VALUE":dgoses as DGOSES
,"VALUE":hcpc::int as HCPC
,"VALUE":mod::int as "MOD"
from foo, lateral flatten(v)
)
select C_CODE, HCPC, "MOD", "VALUE":dCode::string as D_CODE
from JSON, lateral flatten(DGOSES);
This produces a table like this:
C_CODE  HCPC  MOD  D_CODE
7832    5     67   Y564
7832    5     67   U3321
7832    5     67   U1111
7832    5     67   H8811
900114  1     26   J8923
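The same FLATTEN approach can be extended toward the coder-versus-auditor comparison asked for in the question. What follows is a minimal, untested sketch rather than a definitive solution. It assumes a table named bill_info (a hypothetical name) with the columns described in the comments (BILLID, CTEXTID, INFO stored as a VARIANT, CREATED), and exactly two rows per BILLID/CTEXTID pair, the earlier one from the coder and the later one from the auditor. Only cCode is handled here; hcpc and mod would follow the same pattern, and dCode would need a second FLATTEN over dgoses.

```sql
-- Assumption: bill_info(billid, ctextid, info variant, created) is a hypothetical table
-- that mirrors the structure described in the comments.
with ranked as (
    -- n = 1 is the coder's row (earlier), n = 2 the auditor's row (later)
    select billid, ctextid, info,
           row_number() over (partition by billid, ctextid order by created) as n
    from bill_info
),
codes as (
    -- one row per element of the top-level JSON array
    select r.billid, r.ctextid, r.n,
           f.index as pos,
           f.value:cCode::string as c_code
    from ranked r, lateral flatten(input => r.info) f
),
coder   as (select * from codes where n = 1),
auditor as (select * from codes where n = 2),
coder_list as (
    select billid, ctextid,
           listagg(c_code, ',') within group (order by pos) as ccode_coder
    from coder
    group by billid, ctextid
),
auditor_list as (
    select billid, ctextid,
           listagg(c_code, ',') within group (order by pos) as ccode_auditor
    from auditor
    group by billid, ctextid
),
deleted as (
    -- codes the coder entered that the auditor no longer has (anti-join)
    select c.billid, c.ctextid,
           listagg(c.c_code, ',') within group (order by c.pos) as deleted_ccode
    from coder c
    left join auditor a
      on a.billid = c.billid and a.ctextid = c.ctextid and a.c_code = c.c_code
    where a.c_code is null
    group by c.billid, c.ctextid
),
added as (
    -- codes the auditor has that the coder did not enter
    select a.billid, a.ctextid,
           listagg(a.c_code, ',') within group (order by a.pos) as added_ccode
    from auditor a
    left join coder c
      on c.billid = a.billid and c.ctextid = a.ctextid and c.c_code = a.c_code
    where c.c_code is null
    group by a.billid, a.ctextid
)
select cl.billid, cl.ctextid,
       cl.ccode_coder, al.ccode_auditor,
       d.deleted_ccode, ad.added_ccode
from coder_list cl
join auditor_list al
  on al.billid = cl.billid and al.ctextid = cl.ctextid
left join deleted d
  on d.billid = cl.billid and d.ctextid = cl.ctextid
left join added ad
  on ad.billid = cl.billid and ad.ctextid = cl.ctextid;
```

In this sketch LATERAL FLATTEN plays the role of CROSS APPLY OPENJSON from the SQL Server version, LISTAGG replaces STRING_AGG, and the anti-joins stand in for the EXCEPT subqueries. Pairs with no deleted or added codes get NULL in those columns because of the left joins.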
【Comments】:
Thanks a lot, Greg! This is very helpful. I was just wondering whether we could get a result like 7832,7832,7832,7832,900114 5,5,5,5,1 (67,67,67,67,26).
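One way to collapse the flattened rows into comma-separated lists like the ones in the comment above is LISTAGG. This is a minimal, untested sketch that assumes the foo(v variant) table from the answer; the aliases e, d, elem_pos, dgose_pos, and mod_code are illustrative names, not part of the original answer.

```sql
-- Assumption: foo(v variant) holds the JSON array, as in the answer above.
with flat as (
    select e.index as elem_pos,              -- position in the top-level array
           d.index as dgose_pos,             -- position inside dgoses
           e.value:cCode::string as c_code,
           e.value:hcpc::string  as hcpc,
           e.value:mod::string   as mod_code,
           d.value:dCode::string as d_code
    from foo,
         lateral flatten(input => v) e,
         lateral flatten(input => e.value:dgoses) d
)
select listagg(c_code,   ',') within group (order by elem_pos, dgose_pos) as c_codes,
       listagg(hcpc,     ',') within group (order by elem_pos, dgose_pos) as hcpcs,
       listagg(mod_code, ',') within group (order by elem_pos, dgose_pos) as mods,
       listagg(d_code,   ',') within group (order by elem_pos, dgose_pos) as d_codes
from flat;
```

Ordering by the two FLATTEN index columns keeps each list in the same order as the source JSON, so the repeated cCode, hcpc, and mod values line up with their dCode entries.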