Postgresql 10 jsonb to table with multiple rows


【Title】: Postgresql 10 jsonb to table with multiple rows 【Posted】: 2017-11-07 11:25:23 【Question】:

I have JSON arrays that contain some "users" structures. These JSON arrays are stored in a Postgresql database, and I would like to get all the rows (all rows of all arrays).

Data sample:

My database instance:

docker run --name postgresql-10 -e POSTGRES_PASSWORD=mysecretpassword -d postgres:10-alpine 

docker run -it --rm --link postgresql-10:postgres postgres:10-alpine psql -h postgres -U postgres

My table:

CREATE TABLE "Reports" (
    "name" TEXT NOT NULL,
    "report" JSONB NOT NULL,
    "timestamp" TIMESTAMP NOT NULL,
    PRIMARY KEY ("name", "timestamp")
)
;

Some data:

insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-05'::timestamp,
          'appA',
          '{
         "dateComputed": "2017-11-06 10:06:29 UTC",
         "name": "appA",
         "users": [
           {
             "DATE": "2017-11-03",
             "EMPLID": "415",
             "NAME": "Smith"
           },
           {
             "DATE": "2017-11-03",
             "EMPLID": "4",
             "NAME": "Jane"
           },
           {
             "DATE": "2017-11-03",
             "EMPLID": "31",
             "NAME": "Doe"
           }
         ]
      }'::jsonb
         ) ;

insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-04'::timestamp,
          'appA',
          '{
         "dateComputed": "2017-11-04 11:34:13 UTC",
         "name": "appA",
         "users": [
           {
             "DATE": "2017-11-03",
             "EMPLID": "4",
             "NAME": "Jane"
           },
           {
             "DATE": "2017-11-03",
             "EMPLID": "31",
             "NAME": "Doe"
           }
         ]
      }'::jsonb
         ) ;

insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-01'::timestamp,
          'appA',
          '{
         "dateComputed": "2017-11-01 02:32:49 UTC",
         "name": "appA",
         "users": [
           {
             "DATE": "2017-11-01",
             "EMPLID": "415",
             "NAME": "Smith"
           },
           {
             "DATE": "2017-11-01",
             "EMPLID": "31",
             "NAME": "Doe"
           }
         ]
      }'::jsonb
         ) ;


insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-03'::timestamp, 'appB', '[{"other": "useless"}]'::jsonb) ;

What I would like is the following table, listing all users from the "Reports" rows whose "name" is "appA":

+------------+-------+--------+
| DATE       | NAME  | EMPLID |
+------------+-------+--------+
| 2017-11-03 | Smith | 415    |
+------------+-------+--------+
| 2017-11-03 | Jane  | 4      |
+------------+-------+--------+
| 2017-11-03 | Doe   | 31     |
+------------+-------+--------+
| 2017-11-03 | Jane  | 4      |
+------------+-------+--------+
| 2017-11-03 | Doe   | 31     |
+------------+-------+--------+
| 2017-11-01 | Smith | 415    |
+------------+-------+--------+
| 2017-11-01 | Doe   | 31     |
+------------+-------+--------+

Or, with the "timestamp" of each report included:

+------------+------------+-------+--------+
| timestamp  | DATE       | NAME  | EMPLID |
+------------+------------+-------+--------+
| 2017-11-05 | 2017-11-03 | Smith | 415    |
+------------+------------+-------+--------+
| 2017-11-05 | 2017-11-03 | Jane  | 4      |
+------------+------------+-------+--------+
| 2017-11-05 | 2017-11-03 | Doe   | 31     |
+------------+------------+-------+--------+
| 2017-11-04 | 2017-11-03 | Jane  | 4      |
+------------+------------+-------+--------+
| 2017-11-04 | 2017-11-03 | Doe   | 31     |
+------------+------------+-------+--------+
| 2017-11-01 | 2017-11-01 | Smith | 415    |
+------------+------------+-------+--------+
| 2017-11-01 | 2017-11-01 | Doe   | 31     |
+------------+------------+-------+--------+

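When only one row matches, I can use jsonb_to_recordset to get all the JSON rows for that row. For example, when filtering on the latest "timestamp" column by creating views: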

CREATE INDEX "ReportsGIN" on "Reports" USING gin ("report") ;

CREATE VIEW "Reports_Latest_timestamp"
AS
SELECT  "name"
       , max("Reports"."timestamp") AS "timestamp_latest"
FROM "Reports"
GROUP BY "name"
;

CREATE VIEW "Reports_Latest"
AS
SELECT   "Reports"."name"
       , "Reports"."report"
       , "Reports"."timestamp"
FROM "Reports"
WHERE ("Reports"."timestamp" = (SELECT "Reports_Latest_timestamp"."timestamp_latest" FROM "Reports_Latest_timestamp" WHERE "Reports_Latest_timestamp"."name" = "Reports"."name"))
;

select *
from
jsonb_to_recordset
(
 (select report#>'{users}'
  from "Reports_Latest" 
  where "name" = 'appA' 
) 
) as x(
        "EMPLID" integer
      , "NAME" text
      , "DATE" timestamp with time zone

)
;

 EMPLID | NAME  |          DATE
--------+-------+------------------------
    415 | Smith | 2017-11-03 00:00:00+00
      4 | Jane  | 2017-11-03 00:00:00+00
     31 | Doe   | 2017-11-03 00:00:00+00
(3 rows)

jsonb_to_recordset works as expected.

How can I use jsonb_to_recordset to list the user rows for all "Reports" rows?
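To make the problem concrete, here is a minimal sketch of what happens when the same scalar-subquery form is pointed at "Reports" instead of "Reports_Latest". It assumes the sample data above, where three rows have "name" = 'appA'; the alias x is the same as in the earlier query.

```sql
-- Same query shape as before, but reading from "Reports" directly.
-- A subquery used as an expression may return at most one row,
-- so this fails as soon as several reports match.
select *
from jsonb_to_recordset(
       (select report#>'{users}'
        from "Reports"
        where "name" = 'appA')
     ) as x("EMPLID" integer, "NAME" text, "DATE" timestamp with time zone);
-- ERROR:  more than one row returned by a subquery used as an expression
```

So the function has to be invoked once per "Reports" row rather than once on a single subquery result.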

An answer that shows the "timestamp" on "Reports_Latest" (but still no clue for all the "Reports" rows):

select  t."timestamp"
      , r."EMPLID"
      , r."NAME"
      , r."DATE"
from
(
  select "timestamp", report#>'{users}'
  from "Reports_Latest" 
  where "name" = 'appA' 
) as t
, (
  select *
  from
    jsonb_to_recordset
    (
     (select report#>'{users}'
      from "Reports_Latest" 
      where "name" = 'appA' 
    ) 
    ) as x(
            "EMPLID" integer
          , "NAME" text
          , "DATE" timestamp with time zone

    )
  ) as r
;


      timestamp      | EMPLID | NAME  |          DATE
---------------------+--------+-------+------------------------
 2017-11-05 00:00:00 |    415 | Smith | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |      4 | Jane  | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |     31 | Doe   | 2017-11-03 00:00:00+00
(3 rows)

SQL Fiddle on Postgresql 9.6 for a quick test


The solution provided by Breathe is:

select r."timestamp", x.*
from "Reports" as r
cross join lateral jsonb_to_recordset (r.report#>'{users}')
 as x(
        "EMPLID" integer
      , "NAME" text
      , "DATE" timestamp with time zone

)
where r."name" = 'appA' 
;

      timestamp      | EMPLID | NAME  |          DATE
---------------------+--------+-------+------------------------
 2017-11-05 00:00:00 |    415 | Smith | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |      4 | Jane  | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |     31 | Doe   | 2017-11-03 00:00:00+00
 2017-11-04 00:00:00 |      4 | Jane  | 2017-11-03 00:00:00+00
 2017-11-04 00:00:00 |     31 | Doe   | 2017-11-03 00:00:00+00
 2017-11-01 00:00:00 |    415 | Smith | 2017-11-01 00:00:00+00
 2017-11-01 00:00:00 |     31 | Doe   | 2017-11-01 00:00:00+00
(7 rows)

http://sqlfiddle.com/#!17/cd4df/9/0
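For comparison, a sketch of the same result written with jsonb_array_elements, which returns each array element as a jsonb value so the fields can be extracted and cast one by one. The alias u is an assumption made for this example; the table, keys and casts are the ones used above.

```sql
-- One output row per element of the "users" array of each appA report.
select r."timestamp",
       (u->>'EMPLID')::integer               as "EMPLID",
       u->>'NAME'                            as "NAME",
       (u->>'DATE')::timestamp with time zone as "DATE"
from "Reports" as r
cross join lateral jsonb_array_elements(r.report->'users') as u
where r."name" = 'appA';
```

This form can be convenient when only some keys are needed or when each key needs its own conversion; jsonb_to_recordset is shorter when the column list maps directly onto the JSON keys.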

【Question comments】:

【Answer 1】:

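Essentially, what you want to do is generate as many records as there are users within each single row of "Reports". In your desired table structure, the first two columns come straight from "Reports". So your query needs: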

select a.timestamp, a."name"
FROM "Reports" a

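Then you want to create a "subset" for each record. This can be done by applying a function that generates the subset to every row. That subset is generated by the function jsonb_to_recordset(), so: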

SELECT a.timestamp, a."name", b.*
FROM "Reports" a
CROSS JOIN lateral jsonb_to_recordset(a.report->'users')
as b("EMPLID" integer
      , "NAME" text
      , "DATE" timestamp with time zone)

Edit: I added the cross join lateral
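As a side note, here is a sketch of the same query with a plain comma in the FROM list. In PostgreSQL a function call in FROM may refer to columns of the FROM items that precede it, and the LATERAL keyword is optional for functions, so this should behave like the CROSS JOIN LATERAL form. The filter on 'appA' is taken from the question.

```sql
-- Implicit lateral: the function call references a.report from the
-- preceding FROM item, so it is evaluated once per "Reports" row.
SELECT a."timestamp", a."name", b.*
FROM "Reports" a,
     jsonb_to_recordset(a.report->'users')
       AS b("EMPLID" integer, "NAME" text, "DATE" timestamp with time zone)
WHERE a."name" = 'appA';
```

Writing CROSS JOIN LATERAL explicitly is arguably clearer, since it tells the reader that the right-hand side depends on the current row of the left-hand side.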

【Comments】:

I would like to get all rows from "Reports", not only from "Reports_Latest".

While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.

I removed ", b. *". The error shown is: ERROR: syntax error at or near ";" LINE 7: ; ^

You can try: SELECT * from Reports" a LEFT JOIN jsonb_to_recordset(a.report->'Users') as x("EMPLID" varchar, "NAME" varchar, "DATE" varchar)

postgres=# SELECT * from "Reports" a LEFT JOIN jsonb_to_recordset(a.report->'Users') as x("EMPLID" varchar, "NAME" varchar, "DATE" varchar) postgres-# ; ERROR: syntax error at or near ";" LINE 2: ; ^
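The syntax error reported for the LEFT JOIN attempt is consistent with the missing join condition: unlike CROSS JOIN, a LEFT JOIN needs an ON or USING clause. A minimal sketch of a form that should parse, assuming the lowercase key 'users' from the sample data rather than 'Users':

```sql
-- LEFT JOIN LATERAL ... ON true keeps every "Reports" row.
-- Reports without a "users" key (such as the appB row) yield no
-- records from the function and appear once with NULL user columns.
SELECT a."name", a."timestamp", x.*
FROM "Reports" a
LEFT JOIN LATERAL jsonb_to_recordset(a.report->'users')
       AS x("EMPLID" varchar, "NAME" varchar, "DATE" varchar)
       ON true;
```

With CROSS JOIN LATERAL those reports simply disappear from the result, so the choice depends on whether reports without users should still be listed.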
