How do you remove duplicates in SQL?

Posted 2023-05-03


After a select that picks out the duplicated rows, I get:
01 小明
02 小明
03 小明
04 小业
05 小业

How can I change the result to:
01 小明
02 小明2
03 小明3
04 小业
05 小业1

1. First create a temporary table to demonstrate DISTINCT, the deduplication keyword in SQL Server syntax. This article uses a SQL Server database as the example.

IF OBJECT_ID('tempdb..#tmp1') IS NOT NULL DROP TABLE #tmp1;

CREATE TABLE #tmp1(
    Col1 varchar(50),
    Col2 int
);

2. Insert a few rows of test data into the temporary table. The row ('Code10', 10) is inserted twice on purpose.

insert into #tmp1(Col1, Col2) values('Code10', 10);
insert into #tmp1(Col1, Col2) values('Code20', 20);
insert into #tmp1(Col1, Col2) values('Code10', 10);
insert into #tmp1(Col1, Col2) values('Code5', 20);

3. Query all the test data in the temporary table:

select * from #tmp1;

4. Use DISTINCT to return only the rows whose combination of all column values is unique:

select distinct * from #tmp1;

5. Besides removing whole-row duplicates, DISTINCT can deduplicate on specific columns. Separate multiple columns with commas:

select distinct Col1 from #tmp1;

select distinct Col1, Col2 from #tmp1;

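6. How do you return the number of distinct values in Col1 of the temporary table? The statement below does not give the expected result: COUNT runs first and returns a single row (4, the total number of rows), so DISTINCT has nothing left to remove.

select distinct count(Col1) from #tmp1;

7. Swap the positions of DISTINCT and COUNT. Now the duplicates are removed before counting, and the number of distinct Col1 values (3) is returned correctly.

select count(distinct Col1) from #tmp1;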
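
The difference between the two statements in steps 6 and 7 can be checked with a small script. This is a minimal sketch, not part of the original answer: it assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for SQL Server, and the ordinary table tmp1 replaces the temporary table #tmp1.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; tmp1 mirrors #tmp1 from the steps above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp1 (Col1 TEXT, Col2 INTEGER)")
conn.executemany(
    "INSERT INTO tmp1 (Col1, Col2) VALUES (?, ?)",
    [("Code10", 10), ("Code20", 20), ("Code10", 10), ("Code5", 20)],
)

# COUNT collapses the table to one value first; DISTINCT then sees a single row.
distinct_of_count = conn.execute(
    "SELECT DISTINCT COUNT(Col1) FROM tmp1"
).fetchone()[0]

# DISTINCT inside COUNT removes duplicate Col1 values before counting.
count_of_distinct = conn.execute(
    "SELECT COUNT(DISTINCT Col1) FROM tmp1"
).fetchone()[0]

print(distinct_of_count, count_of_distinct)  # prints: 4 3
```

Both statements are standard SQL, so the same two numbers are expected from SQL Server for this data.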

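Reference A

Apply GROUP BY to the column you want to deduplicate.

For example: select tel from test group by tel;

(Writing select * together with group by tel is accepted only by engines such as MySQL with ONLY_FULL_GROUP_BY disabled; SQL Server rejects it.)

This is the simplest case, and the DISTINCT keyword alone removes the duplicates.

Example: select distinct * from <table name> where <condition>

To remove the duplicates from the table itself, copy the distinct rows through a temporary table:

-- 临时表 = temporary table, 正式表 = the real table (表名)
CREATE TABLE 临时表 AS (select distinct * from 表名);
-- empty the real table instead of dropping it, so the insert below still has a target
delete from 正式表;
insert into 正式表 (select * from 临时表);
drop table 临时表;

Reference B

Test environment: SQL Server 2008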

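1. SQL:

-- TRIM is not available in SQL Server 2008, so LTRIM(RTRIM(...)) is used instead
with base as
(
    select *, ROW_NUMBER() over(partition by name order by id) as rowIndex from users
)
select id, ltrim(rtrim(name)) + CAST(rowIndex as varchar(10)) as name from base;

This appends the row number to every name, including the first occurrence (小明1, 小明2, 小明3).

2. Result: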
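
To try a query of this kind against the sample rows from the question, here is a minimal sketch. It is a variant, not the answerer's exact statement: it assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module, uses `||` where T-SQL uses `+`, and adds a CASE expression so that the first row of each name stays unchanged. The table layout of users (id, name) is inferred from the question.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for SQL Server 2008; users(id, name) is inferred.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [("01", "小明"), ("02", "小明"), ("03", "小明"), ("04", "小业"), ("05", "小业")],
)

rows = conn.execute(
    """
    WITH base AS (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rowIndex
        FROM users
    )
    SELECT id,
           CASE WHEN rowIndex = 1 THEN name            -- first occurrence keeps its name
                ELSE name || CAST(rowIndex AS TEXT)    -- later ones get the row number
           END AS name
    FROM base
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

With this numbering the second 小业 becomes 小业2; the question's sample shows 小业1 there, so adjust the suffix expression if that numbering is really what is wanted.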

This answer was accepted by the asker.

Reference C

The answer above works, but I think there is a simpler way:
select aid, count(distinct uid) from 表名 group by aid
This is the SQL Server syntax.

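Removing duplicates with SQL statements

Reference A

Removing duplicate records in single-table and multi-table SQL queries:
Single table: distinct
Multiple tables: group by
group by must come before order by and limit, otherwise the statement raises an error.
************************************************************************************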
1. Find the redundant duplicate records in a table, where duplicates are determined by a single field (peopleId):
select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
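2. Delete the redundant duplicate records in a table, where duplicates are determined by a single field (peopleId), keeping only the record with the smallest rowid (rowid here is the Oracle pseudo-column):
delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId) > 1)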
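
The keep-the-smallest-rowid delete can be tried out locally. A minimal sketch, assuming SQLite (which also exposes a rowid for ordinary tables) through Python's sqlite3 module rather than Oracle; the name column and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite's implicit rowid plays the role of Oracle's rowid pseudo-column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people (peopleId, name) VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f")],
)

# Keep the row with the smallest rowid in every peopleId group, delete the rest.
conn.execute(
    """
    DELETE FROM people
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM people GROUP BY peopleId)
    """
)

remaining = conn.execute(
    "SELECT peopleId, name FROM people ORDER BY peopleId"
).fetchall()
print(remaining)  # prints: [(1, 'a'), (2, 'c'), (3, 'd')]
```

The sketch drops the `having count(...) > 1` filters: without them the subquery also returns the rowid of every non-duplicated row, so a single `not in` condition is enough to protect both the unique rows and the first row of each duplicate group.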
3. Find the redundant duplicate records in a table (multiple fields):
select * from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
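
The row-value form `(a.peopleId, a.seq) in (...)` is not accepted by every engine; SQL Server, for one, rejects it. A join against the grouped subquery expresses the same lookup in more portable SQL. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the note column and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the target database; note is an illustrative column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae (peopleId, seq, note) VALUES (?, ?, ?)",
    [(1, 1, "x"), (1, 1, "y"), (1, 2, "z"), (2, 1, "w")],
)

# d lists every (peopleId, seq) pair that occurs more than once;
# the join returns all rows of vitae that belong to such a pair.
rows = conn.execute(
    """
    SELECT a.peopleId, a.seq, a.note
    FROM vitae AS a
    JOIN (
        SELECT peopleId, seq
        FROM vitae
        GROUP BY peopleId, seq
        HAVING COUNT(*) > 1
    ) AS d ON d.peopleId = a.peopleId AND d.seq = a.seq
    ORDER BY a.note
    """
).fetchall()
print(rows)  # prints: [(1, 1, 'x'), (1, 1, 'y')]
```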
4. Delete the redundant duplicate records in a table (multiple fields), keeping only the record with the smallest rowid:
delete from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId, seq having count(*) > 1)
5. Find the redundant duplicate records in a table (multiple fields), excluding the record with the smallest rowid:
select * from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId, seq having count(*) > 1)
(2)
Suppose table A has a field "name", and different records may share the same "name" value. The task is to find the "name" values that occur in more than one record:
Select Name, Count(*) From A Group By Name Having Count(*) > 1
If the sex must match as well, use:
Select Name, sex, Count(*) From A Group By Name, sex Having Count(*) > 1
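
How adding a column to GROUP BY narrows the duplicate groups is easiest to see on a few rows. A minimal sketch, assuming SQLite through Python's sqlite3 module; the sample data is invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the target database; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (Name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO A (Name, sex) VALUES (?, ?)",
    [("Ann", "F"), ("Ann", "F"), ("Ann", "M"), ("Bob", "M"), ("Bob", "M"), ("Cid", "M")],
)

# Names that occur more than once, regardless of sex.
by_name = conn.execute(
    "SELECT Name, COUNT(*) FROM A GROUP BY Name HAVING COUNT(*) > 1 ORDER BY Name"
).fetchall()

# Name and sex pairs that occur more than once.
by_name_sex = conn.execute(
    """
    SELECT Name, sex, COUNT(*) FROM A
    GROUP BY Name, sex HAVING COUNT(*) > 1
    ORDER BY Name, sex
    """
).fetchall()

print(by_name)      # prints: [('Ann', 3), ('Bob', 2)]
print(by_name_sex)  # prints: [('Ann', 'F', 2), ('Bob', 'M', 2)]
```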
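(3)
Method 1
declare @max integer, @id integer
-- one cursor row per duplicated key: the key value and how many times it occurs
declare cur_rows cursor local for
    select 主字段, count(*) from 表名 group by 主字段 having count(*) > 1
open cur_rows
fetch cur_rows into @id, @max
while @@fetch_status = 0
begin
    -- delete all but one of the rows that share this key
    select @max = @max - 1
    set rowcount @max
    delete from 表名 where 主字段 = @id
    fetch cur_rows into @id, @max
end
close cur_rows
deallocate cur_rows
-- restore the default (no row limit)
set rowcount 0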
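Method 2
"Duplicate records" can mean two things. The first is fully duplicated records, where every field is the same. The second is records duplicated on some key fields only, for example the Name field, while the other fields may or may not match, or their differences can be ignored.
1. The first kind is fairly easy to handle. Use
select distinct * from tableName
to get a result set without duplicates.
If the duplicates must be deleted from the table itself (keeping one copy of each record), proceed as follows:
select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp
Duplicates of this kind come from a flaw in the table design; adding a unique index on the relevant columns prevents them.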
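
The rebuild through a temporary table, followed by a unique index that stops the duplicates from coming back, can be sketched as follows. This assumes SQLite through Python's sqlite3 module instead of SQL Server, so `CREATE TEMP TABLE ... AS SELECT` replaces `select ... into #Tmp`, and the table is emptied rather than dropped and recreated. The columns Name and Address and the index name ux_name_address are illustrative choices.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; column and index names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO tableName (Name, Address) VALUES (?, ?)",
    [("Ann", "Oslo"), ("Ann", "Oslo"), ("Bob", "Rome")],
)

# Copy the distinct rows aside, empty the table, and copy them back.
conn.execute("CREATE TEMP TABLE Tmp AS SELECT DISTINCT * FROM tableName")
conn.execute("DELETE FROM tableName")
conn.execute("INSERT INTO tableName SELECT * FROM Tmp")
conn.execute("DROP TABLE Tmp")

# With the duplicates gone, a unique index keeps new ones out.
conn.execute("CREATE UNIQUE INDEX ux_name_address ON tableName (Name, Address)")

try:
    conn.execute("INSERT INTO tableName (Name, Address) VALUES ('Ann', 'Oslo')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the index refused the duplicate row

rows = conn.execute("SELECT Name, Address FROM tableName ORDER BY Name").fetchall()
print(rows, rejected)  # prints: [('Ann', 'Oslo'), ('Bob', 'Rome')] True
```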
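2. The second kind of duplicate usually requires keeping the first record of each duplicate group. Proceed as follows.
Assume the duplicated fields are Name and Address, and a result set that is unique on these two fields is required:
select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name, Address
select * from #Tmp where autoID in (select autoID from #Tmp2)
The last select returns the result set without duplicates on Name and Address. It carries an extra autoID column; in practice, list the wanted columns in the select clause to leave it out.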
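
The keep-the-first-row idea can also be written as a single query once the table has an auto-incrementing key. A minimal sketch, assuming SQLite through Python's sqlite3 module, where an INTEGER PRIMARY KEY column takes the place of the identity(int,1,1) column; the Phone column and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: autoID as INTEGER PRIMARY KEY mimics identity(int,1,1); Phone is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableName (autoID INTEGER PRIMARY KEY, Name TEXT, Address TEXT, Phone TEXT)"
)
conn.executemany(
    "INSERT INTO tableName (Name, Address, Phone) VALUES (?, ?, ?)",
    [("Ann", "Oslo", "111"), ("Ann", "Oslo", "222"), ("Bob", "Rome", "333"), ("Ann", "Paris", "444")],
)

# For each (Name, Address) group keep only the row with the smallest autoID.
# Listing the columns explicitly leaves autoID out of the result.
rows = conn.execute(
    """
    SELECT Name, Address, Phone
    FROM tableName
    WHERE autoID IN (SELECT MIN(autoID) FROM tableName GROUP BY Name, Address)
    ORDER BY autoID
    """
).fetchall()
print(rows)
# prints: [('Ann', 'Oslo', '111'), ('Bob', 'Rome', '333'), ('Ann', 'Paris', '444')]
```

Grouping by Name and Address (and not by autoID) is what makes each group collapse to one row; grouping by autoID would put every row in its own group and remove nothing.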
(4)
Find duplicates:
select * from tablename
where id in (select id from tablename group by id having count(id) > 1)
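3. Find the redundant duplicate records in a table (multiple fields):
select * from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
Running this causes a problem: the where (a.peopleId, a.seq) syntax is rejected!!! (SQL Server does not support this row-value form of IN; Oracle, MySQL, and PostgreSQL accept it.)

Reference B

SQL removes duplicates with the DISTINCT keyword, which returns only the unique values. DISTINCT is used together with a SELECT statement; the syntax is SELECT DISTINCT column_name FROM table_name. If SELECT DISTINCT is specified, every item in the ORDER BY clause must also appear in the select list, otherwise an error occurs.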

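Further reading:
The distinct keyword filters out redundant duplicate records and keeps only one of each. In practice it is often used only to return the number of distinct records rather than all of their values. The reason is cost: to find duplicates the engine has to compare rows with one another, typically by sorting or hashing the result, and on a site with a very large amount of data this can noticeably hurt performance.
distinct must come first in the select list, directly after select. It applies to every column listed, so rows are deduplicated on the combination of all selected columns; a column cannot be returned without taking part in the deduplication.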
