Postgresql 根据单列或几列分组去重row_number() over() partition by

Posted 程序媛一枚~

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Postgresql 根据单列或几列分组去重row_number() over() partition by相关的知识,希望对你有一定的参考价值。

Postgresql 根据单列或几列分组去重row_number() over() partition by

一般用于单列或者几列需要去重后进行计算值的

count(distinct(eid)) 可以

比如有个例子,需要根据名称,城市去筛选覆盖的道路长度,以月因为建立了唯一索引是ok的,年时可能会有重复的,如何去重呢?用窗口函数:row_number() over() partition by
count(distinct(length)) 不行,因为很多道路数据本就有相同的长度

1. 效果图

可以看到 distinctCnt > Cnt说明有重复,点开string_agg的结果发现确实是有重复;, 这样计算其所对应的length值肯定偏大。

去重后效果图如下: 把所有的聚合条件都写在partition by后边。
可以看到后边的里程和也正常了不少。
试验发现pname有无差别不大,可能是因为构造的数据集小;,但其实是需要的;

以第一条数据去验证:

2. 源码

2.1 建表,构建数据

drop table if exists t_pa_cover;
create table if not exists t_pa_cover(
	pname text COLLATE pg_catalog."default" NOT NULL,
    upload_date varchar(12),
    city_code varchar(20) default '',
    link_pid varchar(20),
    link_length numeric default 0,
    create_time timestamp with time zone NOT NULL DEFAULT now(),
    constraint t_pa_cover_unique_key unique (pname,upload_date,city_code,link_pid)
);
COMMENT ON TABLE t_pa_cover IS '覆盖率中间表';
COMMENT ON COLUMN t_pa_cover.pname IS '名称';
COMMENT ON COLUMN t_pa_cover.upload_date IS '日期';
COMMENT ON COLUMN t_pa_cover.city_code IS '城市行政编码';
COMMENT ON COLUMN t_pa_cover.link_pid IS 'linkpid';
COMMENT ON COLUMN t_pa_cover.link_length IS 'linkpid长度m';
COMMENT ON COLUMN t_pa_cover.create_time IS '创建时间';

create index if not exists t_pa_cover_citycode on t_pa_cover(city_code);
create index if not exists t_pa_cover_pname on t_pa_cover(pname);
create index if not exists t_pa_cover_uploaddate on t_pa_cover(upload_date);

INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202201', '1101', 4721472607, 99.88);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202201', '1201', 4731620766, 64.96);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202201', '1301', 4725763511, 82.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202202', '1101', 4732413545, 23.63);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202202', '1201', 4733766774, 17.97);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202202', '1301', 4725763511, 82.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202203', '1101', 4732413545, 23.63);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202203', '1201', 4721472607, 99.88);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202203', '1301', 4733766774, 17.97);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202204', '1101', 4721472607, 99.88);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202204', '1201', 4738504835, 37.94);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202204', '1301', 4727435973, 39.05);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202205', '1101', 4737641033, 1.41);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202205', '1201', 4725763511, 82.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202205', '1301', 4727435973, 39.05);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202206', '1101', 4725763511, 82.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202206', '1201', 4737641033, 1.41);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202206', '1301', 4733766774, 17.97);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202207', '1101', 4725763511, 82.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202207', '1201', 4740662897, 86.96);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202207', '1301', 4719251580, 43.12);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202208', '1101', 4719251580, 43.12);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202208', '1201', 4727435973, 39.05);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202208', '1301', 4725763511, 82.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202209', '1101', 4741477663, 35.39);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202209', '1201', 4738504835, 37.94);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202209', '1301', 4740789027, 5.36);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202210', '1101', 4721472607, 99.88);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202210', '1201', 4733766774, 17.97);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202210', '1301', 4732413545, 23.63);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202211', '1101', 4719251580, 43.12);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202211', '1201', 4740789027, 5.36);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202211', '1301', 4719251580, 43.12);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202212', '1101', 4740789027, 5.36);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202212', '1201', 4740662897, 86.96);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('aa', '202212', '1301', 4721472607, 99.88);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202201', '1101', 4738492963, 10.75);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202201', '1201', 4736532327, 44.78);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202201', '1301', 4740856924, 39.60);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202202', '1101', 4739710021, 85.77);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202202', '1201', 4736532327, 44.78);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202202', '1301', 4712358476, 44.06);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202203', '1101', 4734479408, 25.51);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202203', '1201', 4738273045, 99.60);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202203', '1301', 4740856924, 39.60);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202204', '1101', 4735500946, 49.98);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202204', '1201', 4738273045, 99.60);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202204', '1301', 4736169127, 58.38);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202205', '1101', 4736797286, 26.90);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202205', '1201', 4716723755, 89.29);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202205', '1301', 4740856924, 39.60);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202206', '1101', 4738492963, 10.75);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202206', '1201', 4735500946, 49.98);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202206', '1301', 4712358476, 44.06);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202207', '1101', 4716723755, 89.29);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202207', '1201', 4740108020, 77.72);
INSERT INTO public.t_pa_cover(pname, upload_date, city_code, link_pid, link_length) VALUES ('bb', '202207', '1301', 4730167080, 0.11);
INSERT 转转转--oracle 去重并按时间排序取第一条

Sqlserver中ROW_NUMBER分组去重

Sqlserver中ROW_NUMBER分组去重

去重 ROW_NUMBER() OVER(PARTITION BY 分组字段 ORDER BY 排序字段) RN

Oracle 单列去重 显示单行所有列数据

根据多个条件将一列拆分为几列并分组