sql query distinct with Row_Number
【Posted】2013-08-09 21:02:51
【Question】I am struggling with the DISTINCT keyword in SQL. I just want to display row numbers for all the unique (DISTINCT) values in one column, so I tried:
SELECT DISTINCT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
However, the code below does give me the distinct values:
SELECT distinct id FROM table WHERE fid = 64
But when I try it together with Row_Number, it does not work.
【Answer 1】Try this:
SELECT distinct id
FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64) t
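Or use RANK() instead of ROW_NUMBER() and select the records with a DISTINCT rank. Another option is to number the rows within each id and keep only the first row of each partition: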
SELECT id
FROM (SELECT id, ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum
FROM table
WHERE fid = 64) t
WHERE t.RowNum=1
This also returns the distinct ids.
【Answer 2】Use this:
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS RowNum FROM
(SELECT DISTINCT id FROM table WHERE fid = 64) Base
That is, feed the "output" of one query in as the "input" of another.
Using a CTE:
; WITH Base AS (
SELECT DISTINCT id FROM table WHERE fid = 64
)
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS RowNum FROM Base
The two queries should be equivalent.
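To check the claim that the derived-table form and the CTE form are equivalent, here is a minimal sketch that runs both against an in-memory SQLite database. The table name `items` (standing in for the reserved word `table`) and the sample rows are assumptions made for illustration, and SQLite 3.25 or later is assumed for window function support.

```python
import sqlite3

# Hypothetical sample data: the table name `items` and its rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, fid INTEGER)")
conn.executemany(
    "INSERT INTO items (id, fid) VALUES (?, ?)",
    [(3, 64), (1, 64), (3, 64), (2, 64), (1, 64), (9, 7)],
)

# Form 1: DISTINCT inside a derived table, numbering applied outside.
derived_sql = """
    SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
    FROM (SELECT DISTINCT id FROM items WHERE fid = 64) AS Base
    ORDER BY id
"""

# Form 2: the same logic written with a CTE.
cte_sql = """
    WITH Base AS (
        SELECT DISTINCT id FROM items WHERE fid = 64
    )
    SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
    FROM Base
    ORDER BY id
"""

derived_rows = conn.execute(derived_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()

print(derived_rows)              # [(1, 1), (2, 2), (3, 3)]
print(derived_rows == cte_rows)  # True
```

In both forms the duplicates are removed first and the numbering is applied afterwards, which is why each id gets exactly one consecutive number.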
Technically you could write:
SELECT DISTINCT id, ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
But if you increase the number of DISTINCT fields, you have to put all of those fields in the PARTITION BY, for example:
SELECT DISTINCT id, description,
ROW_NUMBER() OVER (PARTITION BY id, description ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
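I do hope you understand that you are going against standard naming conventions here: id should probably be a primary key, and therefore unique by definition, so DISTINCT would be useless on it unless you combine the query with some JOINs/UNION ALL...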
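It is worth checking what the PARTITION BY variant above returns when an id really does repeat. The sketch below is a quick experiment, not part of the original answer; the table name `items` and the rows are assumptions, and SQLite 3.25 or later is assumed. Because ROW_NUMBER() restarts at 1 inside each partition, repeated ids still yield different (id, RowNum) pairs, so DISTINCT does not collapse them and the ids do not get a running number.

```python
import sqlite3

# Hypothetical sample data: id 1 appears twice, id 2 once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, fid INTEGER)")
conn.executemany(
    "INSERT INTO items (id, fid) VALUES (?, ?)",
    [(1, 64), (1, 64), (2, 64)],
)

partitioned_rows = conn.execute("""
    SELECT DISTINCT id,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum
    FROM items
    WHERE fid = 64
    ORDER BY id, RowNum
""").fetchall()

# The numbering restarts for every id, so the duplicate of id 1 survives.
print(partitioned_rows)  # [(1, 1), (1, 2), (2, 1)]
```

If the goal is one consecutive number per distinct id, the derived-table or CTE forms earlier in this answer appear to be the safer choice.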
【Answer 3】Try this:
;WITH CTE AS (
SELECT DISTINCT id FROM table WHERE fid = 64
)
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM CTE
-- No second WHERE fid = 64 here: the CTE already filters on fid and exposes only id.
【Answer 4】How about:
;WITH DistinctVals AS (
SELECT distinct id
FROM table
where fid = 64
)
SELECT id,
ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM DistinctVals
You could also try:
SELECT distinct id, DENSE_RANK() OVER (ORDER BY id) AS RowNum
FROM @mytable
where fid = 64
【Answer 5】This can be done very simply; you were already pretty close:
SELECT distinct id, DENSE_RANK() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
【Comments】:
This one is much better than the selected answer.
【Answer 6】This article covers an interesting relationship between ROW_NUMBER() and DENSE_RANK() (the RANK() function is not treated specifically). When you need a generated ROW_NUMBER() on a SELECT DISTINCT statement, ROW_NUMBER() will produce distinct values before they are removed by the DISTINCT keyword. E.g. this query:
SELECT DISTINCT
v,
ROW_NUMBER() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number
... might produce this result (DISTINCT has no effect):
+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |
+---+------------+
Whereas this query:
SELECT DISTINCT
v,
DENSE_RANK() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number
... produces what you probably want in this case:
+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a |          1 |
| b |          2 |
| c |          3 |
| d |          4 |
| e |          5 |
+---+------------+
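The two result tables above can be reproduced with a short script. This is a sketch using an in-memory SQLite database (3.25 or later assumed for window functions); the table `t` and column `v` follow the answer, while the aliases `rn` and `rnk` are assumptions chosen to avoid clashing with function names.

```python
import sqlite3

# Same values as in the tables above: a, a, a, b, c, c, d, e.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v TEXT)")
conn.executemany("INSERT INTO t (v) VALUES (?)", [(c,) for c in "aaabccde"])

# ROW_NUMBER() makes every row unique before DISTINCT runs,
# so DISTINCT has nothing left to remove.
rows_row_number = conn.execute("""
    SELECT DISTINCT v, ROW_NUMBER() OVER (ORDER BY v) AS rn
    FROM t
    ORDER BY v, rn
""").fetchall()

# DENSE_RANK() gives equal values the same number,
# so DISTINCT can collapse the duplicates.
rows_dense_rank = conn.execute("""
    SELECT DISTINCT v, DENSE_RANK() OVER (ORDER BY v) AS rnk
    FROM t
    ORDER BY v, rnk
""").fetchall()

print(len(rows_row_number))  # 8
print(rows_dense_rank)       # [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
```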
Note that the ORDER BY clause of the DENSE_RANK() function will need all the other columns from the SELECT DISTINCT clause to work properly.
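To illustrate why the window's ORDER BY needs every column of the SELECT DISTINCT list, here is a small sketch with two columns. The table `t2`, its columns `v` and `w`, and the rows are assumptions made for this example; SQLite 3.25 or later is assumed.

```python
import sqlite3

# Hypothetical two-column data with one exact duplicate row ('a', 'x').
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (v TEXT, w TEXT)")
conn.executemany(
    "INSERT INTO t2 (v, w) VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")],
)

# All DISTINCT columns appear in the window's ORDER BY:
# each distinct (v, w) pair gets its own consecutive number.
full_order = conn.execute("""
    SELECT DISTINCT v, w, DENSE_RANK() OVER (ORDER BY v, w) AS rnk
    FROM t2
    ORDER BY v, w, rnk
""").fetchall()

# Only v in the window's ORDER BY: distinct rows that share v
# also share a number, so it no longer works as a row number.
partial_order = conn.execute("""
    SELECT DISTINCT v, w, DENSE_RANK() OVER (ORDER BY v) AS rnk
    FROM t2
    ORDER BY v, w, rnk
""").fetchall()

print(full_order)     # [('a', 'x', 1), ('a', 'y', 2), ('b', 'x', 3)]
print(partial_order)  # [('a', 'x', 1), ('a', 'y', 1), ('b', 'x', 2)]
```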
Comparing all three functions
Using PostgreSQL / Sybase / SQL standard syntax (WINDOW clause):
SELECT
v,
ROW_NUMBER() OVER (w) row_number,
RANK() OVER (w) rank,
DENSE_RANK() OVER (w) dense_rank
FROM t
WINDOW w AS (ORDER BY v)
ORDER BY v
... you'll get:
+---+------------+------+------------+
| V | ROW_NUMBER | RANK | DENSE_RANK |
+---+------------+------+------------+
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |
+---+------------+------+------------+
【Answer 7】Using DISTINCT causes problems as you add fields, and it can also mask problems in your select. Use GROUP BY as an alternative, like this:
SELECT id
,ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
where fid = 64
group by id
You can then add other interesting information to your select, like this:
,count(*) as thecount
or
,max(description) as description
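Putting the pieces of this answer together, a sketch of the full GROUP BY query with both extra columns might look as follows. The table name `items`, the `description` values, and the rows are assumptions for illustration; SQLite 3.25 or later is assumed. The window function is evaluated after grouping, so it numbers the groups rather than the underlying rows.

```python
import sqlite3

# Hypothetical sample data with repeated ids and a row for another fid.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, fid INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO items (id, fid, description) VALUES (?, ?, ?)",
    [
        (1, 64, "apple"),
        (1, 64, "avocado"),
        (2, 64, "banana"),
        (3, 64, "cherry"),
        (3, 64, "cherry"),
        (4, 7, "ignored"),
    ],
)

grouped_rows = conn.execute("""
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY id) AS RowNum,
           COUNT(*) AS thecount,
           MAX(description) AS description
    FROM items
    WHERE fid = 64
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in grouped_rows:
    print(row)
# (1, 1, 2, 'avocado')
# (2, 2, 1, 'banana')
# (3, 3, 2, 'cherry')
```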
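【Comments】:
+1 for using GROUP BY, but I don't think PARTITION BY is needed here.
@P5Coder, you are of course right. I have fixed it. I don't know what I was thinking when I put it there.
【Answer 8】The question is quite old and my answer may not add much, but here are my two cents on making the query a bit more useful: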
;WITH DistinctRecords AS (
SELECT DISTINCT [col1,col2,col3,..]
FROM tableName
where [my condition]
),
serialize AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY [colNameAsNeeded] ORDER BY [colNameNeeded]) AS Sr,*
FROM DistinctRecords
)
SELECT * FROM serialize
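The benefit of using two CTEs is that you can now use the serialized records much more easily in your query, and do count(*) etc. very easily.
DistinctRecords selects all the distinct records, and serialize applies serial numbers to those distinct records. After that you can use the final serialized result for your purposes without clutter.
PARTITION BY may not be needed in most cases.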
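As a concrete instance of the two-CTE template above, the sketch below fills in the bracketed placeholders with assumed names: a hypothetical `orders` table with `customer`, `product`, and `fid` columns. It omits PARTITION BY, as the answer suggests is usually fine, and reuses the same CTE text for both a paged read and a count. SQLite 3.25 or later is assumed.

```python
import sqlite3

# Hypothetical table and rows standing in for tableName / [col1,col2,...].
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT, fid INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, product, fid) VALUES (?, ?, ?)",
    [
        ("ann", "pen", 64),
        ("ann", "pen", 64),
        ("ann", "ink", 64),
        ("bob", "pen", 64),
        ("cat", "pen", 7),
    ],
)

# Shared CTE prefix: deduplicate first, then attach a serial number.
ctes = """
    WITH DistinctRecords AS (
        SELECT DISTINCT customer, product
        FROM orders
        WHERE fid = 64
    ),
    serialize AS (
        SELECT ROW_NUMBER() OVER (ORDER BY customer, product) AS Sr, *
        FROM DistinctRecords
    )
"""

# Use the serial number to read one "page" of the distinct records.
page_rows = conn.execute(
    ctes + "SELECT Sr, customer, product FROM serialize "
           "WHERE Sr BETWEEN 2 AND 3 ORDER BY Sr"
).fetchall()

# Count the distinct records with the same CTEs.
total = conn.execute(ctes + "SELECT COUNT(*) FROM serialize").fetchone()[0]

print(page_rows)  # [(2, 'ann', 'pen'), (3, 'bob', 'pen')]
print(total)      # 3
```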