SQL Statement Deduplication
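Reference answer A: Removing duplicate records in single-table and multi-table SQL queries
Single table: DISTINCT
Multiple tables: GROUP BY
GROUP BY must come before ORDER BY and LIMIT; otherwise the statement fails with an error.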
************************************************************************************
1. Find the redundant duplicate records in a table, where duplicates are identified by a single field (peopleId):
select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
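As a quick check of the query above, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in engine here, and the `name` column and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database; the table layout and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (1, "Ann again"), (3, "Cid"), (2, "Bob again")],
)

# Every row whose peopleId occurs more than once (all copies are returned).
duplicate_rows = conn.execute(
    """
    SELECT peopleId, name
    FROM people
    WHERE peopleId IN (
        SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1
    )
    ORDER BY peopleId, name
    """
).fetchall()
print(duplicate_rows)
```

Note that the query returns every copy of a duplicated peopleId, including the one that would normally be kept.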
2. Delete the redundant duplicate records in a table, where duplicates are identified by a single field (peopleId), keeping only the record with the smallest rowid:
delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId) > 1)
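The delete above can be tried out on a throwaway table. This is a minimal sketch, assuming SQLite (via Python's sqlite3 module) as the engine because it also exposes a `rowid`; the `name` column and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "first"), (1, "second"), (2, "only"), (1, "third"), (3, "first"), (3, "second")],
)

# Remove every copy of a duplicated peopleId except the one with the smallest rowid.
conn.execute(
    """
    DELETE FROM people
    WHERE peopleId IN (
        SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1
    )
    AND rowid NOT IN (
        SELECT MIN(rowid) FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1
    )
    """
)

remaining = conn.execute(
    "SELECT peopleId, name FROM people ORDER BY peopleId"
).fetchall()
print(remaining)
```

Because the rows were inserted in order, the smallest rowid in each group is the earliest-inserted copy, which is the one that survives.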
3. Find the redundant duplicate records in a table (duplicates identified by multiple fields):
select * from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
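Not every engine accepts the row-value form `(a.peopleId, a.seq) in (...)`. The sketch below, assuming SQLite 3.15 or later through Python's sqlite3 module and an invented `note` column, runs the row-value query next to an equivalent join against the grouped subquery, which is one common workaround where row values are unavailable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d"), (2, 1, "e"), (3, 1, "f")],
)

# Row-value comparison, as in the query above (needs SQLite 3.15+).
row_value_form = conn.execute(
    """
    SELECT a.peopleId, a.seq, a.note
    FROM vitae a
    WHERE (a.peopleId, a.seq) IN (
        SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1
    )
    ORDER BY a.peopleId, a.seq, a.note
    """
).fetchall()

# Same result expressed as a join to the grouped subquery.
join_form = conn.execute(
    """
    SELECT a.peopleId, a.seq, a.note
    FROM vitae a
    JOIN (
        SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1
    ) d ON d.peopleId = a.peopleId AND d.seq = a.seq
    ORDER BY a.peopleId, a.seq, a.note
    """
).fetchall()
print(row_value_form)
```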
4. Delete the redundant duplicate records in a table (duplicates identified by multiple fields), keeping only the record with the smallest rowid:
delete from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId, seq having count(*) > 1)
5. Find the redundant duplicate records in a table (duplicates identified by multiple fields), excluding the record with the smallest rowid:
select * from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId, seq having count(*) > 1)
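Items 4 and 5 share the same predicate, so the select in item 5 works as a preview of what the delete in item 4 will remove. A minimal sketch of that workflow, assuming SQLite via Python's sqlite3 module; the `note` column, the sample rows, and the `EXTRA_COPIES` name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [
        (1, 1, "keep"), (1, 1, "extra 1"), (1, 1, "extra 2"),
        (2, 5, "unique"), (3, 2, "keep"), (3, 2, "extra 3"),
    ],
)

# Shared predicate: rows in a duplicated (peopleId, seq) group
# that are not the smallest-rowid member of that group.
EXTRA_COPIES = """
    (peopleId, seq) IN (
        SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1
    )
    AND rowid NOT IN (
        SELECT MIN(rowid) FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1
    )
"""

# Preview first (item 5), then delete the same rows (item 4).
extras = conn.execute(
    f"SELECT note FROM vitae WHERE {EXTRA_COPIES} ORDER BY rowid"
).fetchall()
conn.execute(f"DELETE FROM vitae WHERE {EXTRA_COPIES}")
survivors = conn.execute(
    "SELECT peopleId, seq, note FROM vitae ORDER BY rowid"
).fetchall()
print(extras, survivors)
```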
(II)
Suppose table A has a field "name", and different records may share the same "name" value.
The goal is to list the "name" values that occur in more than one record of the table:
Select Name, Count(*) From A Group By Name Having Count(*) > 1
To also require that the sex matches, use:
Select Name, sex, Count(*) From A Group By Name, sex Having Count(*) > 1
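The effect of adding a column to GROUP BY is easiest to see side by side. The following is a small sketch with made-up rows, run through Python's sqlite3 module as an assumed stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("Li", "F"), ("Li", "F"), ("Li", "M"), ("Wang", "M"),
     ("Zhao", "F"), ("Zhao", "F"), ("Zhao", "F")],
)

# Names that appear more than once, regardless of sex.
by_name = conn.execute(
    "SELECT name, COUNT(*) FROM A GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

# (name, sex) pairs that appear more than once.
by_name_sex = conn.execute(
    "SELECT name, sex, COUNT(*) FROM A GROUP BY name, sex "
    "HAVING COUNT(*) > 1 ORDER BY name, sex"
).fetchall()
print(by_name, by_name_sex)
```

Grouping by both columns drops the ("Li", "M") row from the "Li" group, so the count for "Li" falls from 3 to 2.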
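(III)
Method 1
-- For each duplicated key, delete all but one of its rows (SQL Server).
declare @max integer, @id integer
declare cur_rows cursor local for
    select key_column, count(*) from table_name group by key_column having count(*) > 1
open cur_rows
fetch cur_rows into @id, @max
while @@fetch_status = 0
begin
    -- delete (count - 1) rows for this key, leaving one
    select @max = @max - 1
    set rowcount @max
    delete from table_name where key_column = @id
    fetch cur_rows into @id, @max
end
close cur_rows
deallocate cur_rows
-- restore the default (no row limit)
set rowcount 0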
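The cursor loop boils down to: for each duplicated key, delete `count - 1` of its rows. Below is a rough sketch of the same idea in Python over SQLite. A `LIMIT` subquery is swapped in for `set rowcount`, and the table `t`, its columns, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (key_col INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (3, "e"), (3, "f")],
)

# Collect the duplicated keys and their counts up front (the cursor's role).
dup_groups = conn.execute(
    "SELECT key_col, COUNT(*) FROM t GROUP BY key_col HAVING COUNT(*) > 1"
).fetchall()

# For each duplicated key, delete (count - 1) rows, leaving exactly one.
# Which row survives is not specified, just as with SET ROWCOUNT.
for key, count in dup_groups:
    conn.execute(
        "DELETE FROM t WHERE rowid IN "
        "(SELECT rowid FROM t WHERE key_col = ? LIMIT ?)",
        (key, count - 1),
    )

counts = conn.execute(
    "SELECT key_col, COUNT(*) FROM t GROUP BY key_col ORDER BY key_col"
).fetchall()
print(counts)
```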
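Method 2
"Duplicate records" can mean two things: fully duplicated records, where every field is identical, and records duplicated on some key fields only, for example the Name field, where the other fields may or may not match or where their differences can be ignored.
1. The first kind is easy to handle:
select distinct * from tableName
returns a result set with no duplicate records.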
If the duplicates must be deleted from the table itself (keeping one copy of each), proceed as follows:
select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp
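The same rebuild-through-a-temporary-table idea can be sketched outside SQL Server. The version below assumes SQLite via Python's sqlite3 module and invented columns. It also differs in one respect: it empties and refills the table instead of dropping it, so that any indexes or constraints on the original table would survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [("Ann", "1 Main St"), ("Ann", "1 Main St"), ("Bob", "2 Oak Ave"),
     ("Ann", "9 Elm Rd"), ("Bob", "2 Oak Ave")],
)

# 1. Copy the distinct rows aside.
conn.execute("CREATE TEMP TABLE tmp AS SELECT DISTINCT * FROM tableName")
# 2. Empty the original table (variation: the text above drops and recreates it).
conn.execute("DELETE FROM tableName")
# 3. Put the distinct rows back and discard the temporary copy.
conn.execute("INSERT INTO tableName SELECT * FROM tmp")
conn.execute("DROP TABLE tmp")
conn.commit()

deduplicated = conn.execute(
    "SELECT name, address FROM tableName ORDER BY name, address"
).fetchall()
print(deduplicated)
```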
This kind of duplication comes from an oversight in the table design; adding a column with a unique index prevents it.
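2. The second kind usually requires keeping the first record of each group of duplicates. Proceed as follows.
Suppose the duplicated fields are Name and Address, and the goal is a result set that is unique on these two fields:
select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name, Address
select * from #Tmp where autoID in (select autoID from #Tmp2)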
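The last select returns the result set that is unique on Name and Address. It carries an extra autoID column; in practice, list the wanted columns explicitly in the select clause to leave it out.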
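The "number the rows, then keep the smallest number per group" technique can be checked with a small sketch. It assumes SQLite via Python's sqlite3 module, with an `INTEGER PRIMARY KEY` column standing in for `identity(int,1,1)`; the `Phone` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?)",
    [("Ann", "1 Main St", "111"), ("Ann", "1 Main St", "222"),
     ("Bob", "2 Oak Ave", "333"), ("Ann", "9 Elm Rd", "444"),
     ("Bob", "2 Oak Ave", "555")],
)

# Copy the rows into a temp table that numbers them (autoID plays the identity role).
conn.execute(
    "CREATE TEMP TABLE tmp "
    "(autoID INTEGER PRIMARY KEY, Name TEXT, Address TEXT, Phone TEXT)"
)
conn.execute(
    "INSERT INTO tmp (Name, Address, Phone) "
    "SELECT Name, Address, Phone FROM tableName ORDER BY rowid"
)

# Keep only the lowest-numbered row of each (Name, Address) group,
# and list the columns explicitly so autoID is left out of the result.
first_rows = conn.execute(
    """
    SELECT Name, Address, Phone
    FROM tmp
    WHERE autoID IN (SELECT MIN(autoID) FROM tmp GROUP BY Name, Address)
    ORDER BY autoID
    """
).fetchall()
print(first_rows)
```

Grouping must be by the duplicated fields (Name, Address); grouping by autoID as well would put every row in its own group and remove nothing.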
(IV)
Finding duplicates
select * from tablename
where id in (select id from tablename group by id having count(id) > 1)
3. Find the redundant duplicate records in a table (duplicates identified by multiple fields):
select * from vitae a
where (a.peopleId, a.seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
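Running this can fail: the form where (a.peopleId, a.seq) in (...) is not accepted by every engine. SQL Server, for example, rejects it; there, join to the grouped subquery or use a correlated EXISTS instead.
Reference answer B: SQL removes duplicates with the DISTINCT keyword, which returns only distinct values. DISTINCT is used together with a SELECT statement; the syntax is SELECT DISTINCT column_name FROM table_name. If SELECT DISTINCT is specified, the items in the ORDER BY clause must also appear in the select list; otherwise an error is raised.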
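Further notes:
The DISTINCT keyword filters out redundant duplicate records and keeps one of each. In practice it is often used only to count the distinct records rather than to return all of their values. The reason is that removing duplicates means comparing rows against one another, which can noticeably hurt performance on a site with a very large amount of data.
DISTINCT must come directly after SELECT, and it applies to the whole select list rather than to a single column: two rows count as duplicates only when every selected column matches, so adding other columns to the select list changes what is treated as a duplicate.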
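A short sketch of how the select list changes what DISTINCT removes, and of using DISTINCT only for counting. It assumes SQLite via Python's sqlite3 module; the `orders` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", "Paris"), ("Ann", "Paris"), ("Ann", "Rome"), ("Bob", "Paris")],
)

# One column in the select list: duplicates are judged on customer alone.
distinct_customers = conn.execute(
    "SELECT DISTINCT customer FROM orders ORDER BY customer"
).fetchall()

# Two columns: a row is a duplicate only if both customer and city match.
distinct_pairs = conn.execute(
    "SELECT DISTINCT customer, city FROM orders ORDER BY customer, city"
).fetchall()

# Counting distinct values without returning them.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT customer) FROM orders"
).fetchone()[0]
print(distinct_customers, distinct_pairs, distinct_count)
```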
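Example SQL statements for grouped COUNT with conditions and deduplication (MySQL)
The usual grouped count looks like this:
select count(1) from table_name where condition group by column_name;
Sometimes, however, the counts need different conditions as well as deduplication, and the statement above is not enough.
The solutions are:
1. Counting with a condition:
COUNT(CASE WHEN condition THEN 1 ELSE NULL END) xxx GROUP BY group_column
2. Counting with a condition and deduplication:
COUNT(DISTINCT CASE WHEN condition THEN dedup_column END) xxx GROUP BY group_column
Combined example: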
SELECT dc.user_sources AS sources,
       COUNT(CASE WHEN dc.`count_type` IN (1,4) THEN 1 ELSE NULL END) AS djNum1,
       COUNT(CASE WHEN dc.`count_type` IN (2,5) THEN 1 ELSE NULL END) AS djNum2,
       COUNT(CASE WHEN dc.`count_type` IN (3,6) THEN 1 ELSE NULL END) AS djNum3,
       COUNT(DISTINCT CASE WHEN dc.`count_type` IN (1,4) THEN dc.`user_id` END) AS fwNum1,
       COUNT(DISTINCT CASE WHEN dc.`count_type` IN (2,5) THEN dc.`user_id` END) AS fwNum2,
       COUNT(DISTINCT CASE WHEN dc.`count_type` IN (3,6) THEN dc.`user_id` END) AS fwNum3,
       COUNT(DISTINCT CASE WHEN dc.`count_type` IN (2,5) THEN dc.`user_id` END) AS fwNumc4,
       COUNT(DISTINCT CASE WHEN dc.`count_type` IN (3,6) THEN dc.`user_id` END) AS fwNumc5
FROM `credit_dc_project_count` dc
WHERE 1=1
  AND dc.user_sources IN ('wodong', 'qq', 'ydb_dkw', 'chh_12d', '12d', 'jd_dkw', 'hds_dkw', 'ksd_12d', 'ttym_dkw', 'ios', 'dkwaaa', 'gzh', 'chaomi', 'mmd_12d', 'ydb_12d', 'hjsd_dkw', 'papadai', 'chd_dkw')
GROUP BY dc.user_sources
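To see the difference between the plain conditional count and the deduplicated one, here is a reduced sketch of the same pattern. It assumes SQLite via Python's sqlite3 module rather than MySQL, and the `events` table, the column aliases, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (source TEXT, count_type INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("qq", 1, 10), ("qq", 4, 10), ("qq", 1, 11), ("qq", 2, 10),
     ("ios", 1, 20), ("ios", 2, 20), ("ios", 5, 20), ("ios", 2, 21)],
)

# CASE yields NULL when the condition fails, and COUNT ignores NULLs,
# so each COUNT only sees the rows matching its own condition.
stats = conn.execute(
    """
    SELECT source,
           COUNT(CASE WHEN count_type IN (1, 4) THEN 1 END)                AS clicks_a,
           COUNT(DISTINCT CASE WHEN count_type IN (1, 4) THEN user_id END) AS users_a,
           COUNT(CASE WHEN count_type IN (2, 5) THEN 1 END)                AS clicks_b,
           COUNT(DISTINCT CASE WHEN count_type IN (2, 5) THEN user_id END) AS users_b
    FROM events
    GROUP BY source
    ORDER BY source
    """
).fetchall()
print(stats)
```

For source "qq", three rows match types 1 and 4 but they come from only two users, which is the gap between the plain count and the COUNT(DISTINCT ...) column.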