Ordering DISTINCT with string_agg using another column on PostgreSQL 13?
【Posted】: 2021-02-26 09:33:40
【Question】:
I have an emails table:
CREATE TABLE public.emails (
    id bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
        (MAXVALUE 9223372036854775807),
    name text NOT NULL
);
I also have a contacts table:
CREATE TABLE public.contacts (
    id bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
        (MAXVALUE 9223372036854775807),
    email_id bigint NOT NULL,
    full_name text NOT NULL,
    ordering int NOT NULL
);
The tables contain the following rows:
insert into emails (name) VALUES ('dennis1');
insert into emails (name) VALUES ('dennis2');
insert into contacts (id, email_id, full_name, ordering) VALUES (5, 1, 'dennis1', 9);
insert into contacts (id, email_id, full_name, ordering) VALUES (6, 2, 'dennis1', 5);
insert into contacts (id, email_id, full_name, ordering) VALUES (7, 2, 'dennis5', 1);
insert into contacts (id, email_id, full_name, ordering) VALUES (8, 1, 'john', 2);
My query fetches the data as follows:
SELECT
    "emails"."name",
    STRING_AGG(DISTINCT CAST("contacts"."id" AS TEXT), ','
        ORDER BY CAST("contacts"."id" AS TEXT)) AS "contact_ids"
FROM "emails"
INNER JOIN "contacts"
    ON ("contacts"."email_id" = "emails"."id")
WHERE "emails"."id" > 0
GROUP BY "emails"."name"
ORDER BY "emails"."name" DESC
LIMIT 50
Actual result
name     contact_ids
dennis2  6,7
dennis1  5,8
Expected result
name     contact_ids
dennis2  7,6
dennis1  8,5
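I want to order contact_ids by the ordering column, DESC, but I do not want to select the ordering column itself. I only want to use it to order the contacts' ids.
How can I sort each id in contact_ids by the ordering column?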
Demo: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=4d6851ec67b579608427bb399eae5891
【Comments】:
If there is an ordering column, why do you want to omit it?
Unrelated, but: ORDER BY CAST("contacts"."id" AS TEXT) is a bad idea, because it sorts 10 before 2.
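The point in the last comment can be checked with a small query. This is a minimal sketch on literal values rather than the question's tables; the alias t(v) and the sample numbers are chosen here for illustration only.

```sql
-- Aggregate the same three numbers twice:
-- once sorted as text, once sorted as numbers.
SELECT
    string_agg(v::text, ',' ORDER BY v::text) AS text_order,    -- 1,10,2
    string_agg(v::text, ',' ORDER BY v)       AS numeric_order  -- 1,2,10
FROM (VALUES (2), (10), (1)) AS t(v);
```

Sorting by the numeric column and casting only the aggregated value avoids the problem.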
【Answer 1】:
If you aggregate before joining, you do not need DISTINCT, and you can order the ids however you like:
select em.name,
       c.contact_ids
from emails em
  join (
    select email_id, string_agg(id::text, ',' order by ordering desc) as contact_ids
    from contacts
    group by email_id
  ) c on c.email_id = em.id
order by em.name desc
limit 50;
Online example
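One detail worth checking against the sample rows: the expected result in the question (7,6 and 8,5) corresponds to ascending ordering values, even though the question text asks for DESC. The sketch below is the same aggregate-before-join approach with the sort direction switched to ASC. It inlines the question's rows as CTEs so it runs on its own, and it assumes the identity column assigned ids 1 and 2 to dennis1 and dennis2.

```sql
-- Sample rows from the question, inlined as CTEs (email ids 1 and 2 are assumed).
WITH emails (id, name) AS (
    VALUES (1, 'dennis1'), (2, 'dennis2')
),
contacts (id, email_id, full_name, ordering) AS (
    VALUES (5, 1, 'dennis1', 9),
           (6, 2, 'dennis1', 5),
           (7, 2, 'dennis5', 1),
           (8, 1, 'john',    2)
)
SELECT em.name, c.contact_ids
FROM emails em
JOIN (
    -- One row per email, ids concatenated by ascending "ordering".
    SELECT email_id,
           string_agg(id::text, ',' ORDER BY ordering ASC) AS contact_ids
    FROM contacts
    GROUP BY email_id
) c ON c.email_id = em.id
ORDER BY em.name DESC
LIMIT 50;
-- dennis2 | 7,6
-- dennis1 | 8,5
```

With ORDER BY ordering DESC, as in the answer, the same rows come out as 6,7 and 5,8.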
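【Comments】:
Thanks. One more thing: what if I want to merge another column into a comma-separated list as well? Say I want to merge a user_id field/column. In that case, because of the GROUP BY, your result gives me duplicate user_ids. See here: dbfiddle.uk/…
@Dennis: well, obviously you need to include that extra column in the derived table, otherwise you cannot use it for the join.
【Answer 2】:
demo: db<>fiddle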
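I guess the ordering column was left out because of the underlying problem: you cannot order by it inside an aggregate that uses DISTINCT.
So perhaps you can apply the DISTINCT before aggregating: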
SELECT
    "name",
    STRING_AGG(CAST("id" AS TEXT), ','
        ORDER BY "ordering") AS "contact_ids"
FROM (
    SELECT DISTINCT ON ("contacts"."id")
        "emails"."name",
        "contacts"."id",
        "contacts"."ordering"
    FROM "emails"
    INNER JOIN "contacts"
        ON ("contacts"."email_id" = "emails"."id")
    WHERE "emails"."id" > 0
    ORDER BY "contacts"."id"
) s
GROUP BY "name"
ORDER BY "name" DESC LIMIT 50
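For reference, the restriction this answer works around can be seen directly: PostgreSQL rejects an aggregate that combines DISTINCT with an ORDER BY expression that is not one of its arguments. The sketch below shows the rejected form as a comment, then a variant of the de-duplicate-first idea that uses plain SELECT DISTINCT instead of the answer's DISTINCT ON. The joined CTE and its duplicated row are hypothetical, standing in for a join that fans out.

```sql
-- Rejected by PostgreSQL, because "ordering" is not the DISTINCT argument:
--   SELECT string_agg(DISTINCT id::text, ',' ORDER BY ordering) FROM contacts;
--   ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list

-- Hypothetical join output in which contact 6 appears twice.
WITH joined (name, id, ordering) AS (
    VALUES ('dennis2', 6, 5),
           ('dennis2', 6, 5),
           ('dennis2', 7, 1),
           ('dennis1', 5, 9),
           ('dennis1', 8, 2)
)
SELECT name,
       string_agg(id::text, ',' ORDER BY ordering) AS contact_ids
FROM (
    -- Remove duplicate rows first, so the aggregate needs no DISTINCT.
    SELECT DISTINCT name, id, ordering FROM joined
) s
GROUP BY name
ORDER BY name DESC;
-- dennis2 | 7,6
-- dennis1 | 8,5
```

Plain DISTINCT only works here because every selected column is identical across the duplicates. DISTINCT ON, as in the answer, keeps one row per id even when other columns differ.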