T-SQL: Reset Row Number on Field Change

【Title】: T-SQL Reset Row Number on Field Change
【Posted】: 2012-11-04 12:14:14
【Question】:

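This is similar to, but not quite the same as, my recent post "t-sql sequential duration": I want to reset the row number whenever the value in column x changes (in my case, the column "who").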

Here is the first query, which returns a small sample of the raw(ish) data:

SELECT      DISTINCT chr.custno, 
            CAST(LEFT(CONVERT( VARCHAR(20),chr.moddate,112),10)+ ' ' + chr.modtime AS DATETIME)as  moddate, 
            chr.who     
FROM        <TABLE> chr 
WHERE       chr.custno = 581827
            AND LEFT(chr.who, 5) = 'EMSZC'
            AND chr.[description] NOT LIKE 'Recalled and viewed this customer'
ORDER BY    chr.custno

Results:

custno      moddate             who
581827      2012-11-08 08:38:00.000     EMSZC14
581827      2012-11-08 08:41:10.000     EMSZC14
581827      2012-11-08 08:53:46.000     EMSZC14
581827      2012-11-08 08:57:04.000     EMSZC14
581827      2012-11-08 08:58:35.000     EMSZC14
581827      2012-11-08 08:59:13.000     EMSZC14
581827      2012-11-08 09:00:06.000     EMSZC14
581827      2012-11-08 09:04:39.000     EMSZC49 Reset row number to 1
581827      2012-11-08 09:05:04.000     EMSZC49
581827      2012-11-08 09:06:32.000     EMSZC49
581827      2012-11-08 09:12:03.000     EMSZC49
581827      2012-11-08 09:12:38.000     EMSZC49
581827      2012-11-08 09:14:18.000     EMSZC49
581827      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1

The second step is to add a row number (I didn't do this in the first query because of the DISTINCT keyword); so...

WITH c1 AS (
        SELECT      DISTINCT chr.custno,
                    CAST(LEFT(CONVERT( VARCHAR(20),chr.moddate,112),10)+ ' ' + chr.modtime AS DATETIME)as moddate,
                    chr.who
        FROM        <TABLE> chr 
        WHERE       chr.custno = 581827
                    AND LEFT(chr.who, 5) = 'EMSZC'
                    AND chr.[description] NOT LIKE 'Recalled and viewed this customer'
        )

SELECT  ROW_NUMBER() OVER (PARTITION BY custno ORDER BY custno, moddate, who) AS RowID, custno, moddate, who
FROM    c1

Results:

RowID   custno      moddate                      who
1       581827      2012-11-08 08:38:00.000     EMSZC14
2       581827      2012-11-08 08:41:10.000     EMSZC14
3       581827      2012-11-08 08:53:46.000     EMSZC14
4       581827      2012-11-08 08:57:04.000     EMSZC14
5       581827      2012-11-08 08:58:35.000     EMSZC14
6       581827      2012-11-08 08:59:13.000     EMSZC14
7       581827      2012-11-08 09:00:06.000     EMSZC14
8       581827      2012-11-08 09:04:39.000     EMSZC49 Reset row number to 1
9       581827      2012-11-08 09:05:04.000     EMSZC49
10      581827      2012-11-08 09:06:32.000     EMSZC49
11      581827      2012-11-08 09:12:03.000     EMSZC49
12      581827      2012-11-08 09:12:38.000     EMSZC49
13      581827      2012-11-08 09:14:18.000     EMSZC49
14      581827      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1

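The next step is where I'm stuck: the goal is to reset RowID to 1 every time the value in the "who" column changes. The code below gets an "almost there" result (note that I stole/borrowed this code from somewhere, but I can no longer find the site):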

WITH c1 AS (
        SELECT      DISTINCT chr.custno,
                    CAST(LEFT(CONVERT( VARCHAR(20),chr.moddate,112),10)+ ' ' + chr.modtime AS DATETIME)as moddate,
                    chr.who
        FROM        <TABLE> chr 
        WHERE       chr.custno = 581827
                    AND LEFT(chr.who, 5) = 'EMSZC'
                    AND chr.[description] NOT LIKE 'Recalled and viewed this customer'
        )
, c1a AS    (
            SELECT  ROW_NUMBER() OVER (PARTITION BY custno ORDER BY custno, moddate, who) AS RowID, custno, moddate, who
            FROM    c1
            )

SELECT  x.RowID - y.MinID + 1 AS Row,
        x.custno, x.moddate, x.who
FROM    (
            SELECT  custno, who, MIN(RowID) AS MinID
            FROM    c1a
            GROUP BY custno, who
        ) AS y
        INNER JOIN c1a x ON x.custno = y.custno AND x.who = y.who

Results:

Row custno      moddate                    who
1   581827      2012-11-08 08:38:00.000     EMSZC14
2   581827      2012-11-08 08:41:10.000     EMSZC14
3   581827      2012-11-08 08:53:46.000     EMSZC14
4   581827      2012-11-08 08:57:04.000     EMSZC14
5   581827      2012-11-08 08:58:35.000     EMSZC14
6   581827      2012-11-08 08:59:13.000     EMSZC14
7   581827      2012-11-08 09:00:06.000     EMSZC14
1   581827      2012-11-08 09:04:39.000     EMSZC49 Reset row number to 1 (Hooray! It worked!)
2   581827      2012-11-08 09:05:04.000     EMSZC49
3   581827      2012-11-08 09:06:32.000     EMSZC49
4   581827      2012-11-08 09:12:03.000     EMSZC49
5   581827      2012-11-08 09:12:38.000     EMSZC49
6   581827      2012-11-08 09:14:18.000     EMSZC49
14  581827      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1 (Crappies.)

Desired results:

Row custno      moddate                     who
1   581827      2012-11-08 08:38:00.000     EMSZC14
2   581827      2012-11-08 08:41:10.000     EMSZC14
3   581827      2012-11-08 08:53:46.000     EMSZC14
4   581827      2012-11-08 08:57:04.000     EMSZC14
5   581827      2012-11-08 08:58:35.000     EMSZC14
6   581827      2012-11-08 08:59:13.000     EMSZC14
7   581827      2012-11-08 09:00:06.000     EMSZC14
1   581827      2012-11-08 09:04:39.000     EMSZC49 Reset row number to 1 
2   581827      2012-11-08 09:05:04.000     EMSZC49
3   581827      2012-11-08 09:06:32.000     EMSZC49
4   581827      2012-11-08 09:12:03.000     EMSZC49
5   581827      2012-11-08 09:12:38.000     EMSZC49
6   581827      2012-11-08 09:14:18.000     EMSZC49
1   581827      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1

Any help is appreciated.

【Comments】:

【Answer 1】:

If you are on SQL Server 2012, you can use LAG to compare each row's value with the previous row's, and SUM with OVER to keep a running count of the changes.

with C1 as
(
  select custno,
         moddate,
         who,
         lag(who) over(order by moddate) as lag_who
  from chr
),
C2 as
(
  select custno,
         moddate,
         who,
         sum(case when who = lag_who then 0 else 1 end) 
            over(order by moddate rows unbounded preceding) as change 
  from C1
)
select row_number() over(partition by change order by moddate) as RowID,
       custno,
       moddate,
       who
from C2


Update:

A version for SQL Server 2005. It uses a recursive CTE, with a temp table as intermediate storage for the data you need to iterate over.

create table #tmp
(
  id int primary key,
  custno int not null,
  moddate datetime not null,
  who varchar(10) not null
);

insert into #tmp(id, custno, moddate, who)
select row_number() over(order by moddate),
       custno,
       moddate,
       who
from chr;

with C as
(
  select 1 as rowid,
         T.id,
         T.custno,
         T.moddate,
         T.who,
         cast(null as varchar(10)) as lag_who
  from #tmp as T
  where T.id = 1
  union all
  select case when T.who = C.who then C.rowid + 1 else 1 end,
         T.id,
         T.custno,
         T.moddate,
         T.who,
         C.who
  from #tmp as T
    inner join C
      on T.id = C.id + 1
)
select rowid,
       custno,
       moddate,
       who
from C
option (maxrecursion 0);

drop table #tmp;


【Comments】:

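Unfortunately, we're still running 2005, with an upgrade to 2008 R2 planned for next month. The LAG function would be a clean way to solve this. Thanks for bringing it to my attention... just one more argument for moving to SQL Server 2012.

Very cool solution. My only concern (which didn't actually materialize) is the use of maxrecursion 0. That could lead to an infinite loop, but given the structure of my data it is probably not an issue. Thanks!!!

This LAG function is exactly what I needed! Thanks!

【Answer 2】: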

I solved this successfully by using RANK():

SELECT RANK() OVER (PARTITION BY who ORDER BY custno, moddate) AS RANK

This returned the results you want. I actually found this post while trying to solve the same problem.
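
One caveat worth checking before relying on this: PARTITION BY who places every row for a given who value in the same partition, so a value that reappears later continues its earlier numbering instead of restarting. The sketch below illustrates this with five rows taken from the question. The table variable @chr is a hypothetical stand-in for the real table, and the RowID alias is chosen here for illustration.

```sql
-- @chr is a hypothetical stand-in for <TABLE>; the rows are copied from the question.
DECLARE @chr TABLE (custno INT, moddate DATETIME, who VARCHAR(10));

INSERT INTO @chr (custno, moddate, who)
SELECT 581827, '20121108 08:59:13', 'EMSZC14' UNION ALL
SELECT 581827, '20121108 09:00:06', 'EMSZC14' UNION ALL
SELECT 581827, '20121108 09:04:39', 'EMSZC49' UNION ALL
SELECT 581827, '20121108 09:05:04', 'EMSZC49' UNION ALL
SELECT 581827, '20121108 09:17:35', 'EMSZC14';  -- EMSZC14 comes back after EMSZC49

-- Both EMSZC14 runs share one partition, so the ranking carries on across the gap.
SELECT RANK() OVER (PARTITION BY who ORDER BY custno, moddate) AS RowID,
       custno, moddate, who
FROM @chr
ORDER BY custno, moddate;
```

With these five rows the final EMSZC14 row is ranked 3 rather than 1, so this query matches the desired output only when no who value recurs after a change.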

【Comments】:

【Answer 3】:

Instead of:

PARTITION BY custno ORDER BY custno, moddate, who)

Try:

PARTITION BY custno, who ORDER BY custno, moddate)
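
Partitioning by custno, who still puts both EMSZC14 runs into one partition, so on its own it does not restart the count when a value recurs. A common set-based workaround, not taken from any answer in this thread, is the "difference of two row numbers" trick: number the rows once per custno and once per custno, who, and subtract. Within an unbroken run both counters advance together, so the difference stays constant and, combined with who, identifies the run. The sketch below should run on SQL Server 2005 and later. It assumes moddate is unique within a custno, and @chr, numbered, and grp are hypothetical names.

```sql
-- @chr is a hypothetical stand-in for <TABLE>; the rows are copied from the question.
DECLARE @chr TABLE (custno INT, moddate DATETIME, who VARCHAR(10));

INSERT INTO @chr (custno, moddate, who)
SELECT 581827, '20121108 08:59:13', 'EMSZC14' UNION ALL
SELECT 581827, '20121108 09:00:06', 'EMSZC14' UNION ALL
SELECT 581827, '20121108 09:04:39', 'EMSZC49' UNION ALL
SELECT 581827, '20121108 09:05:04', 'EMSZC49' UNION ALL
SELECT 581827, '20121108 09:17:35', 'EMSZC14';

WITH numbered AS (
    SELECT custno, moddate, who,
           -- Position among all rows of the customer minus position among
           -- rows with the same who: constant inside one unbroken run.
           ROW_NUMBER() OVER (PARTITION BY custno ORDER BY moddate)
         - ROW_NUMBER() OVER (PARTITION BY custno, who ORDER BY moddate) AS grp
    FROM @chr
)
-- grp alone can collide between different who values, so partition by both.
SELECT ROW_NUMBER() OVER (PARTITION BY custno, who, grp ORDER BY moddate) AS RowID,
       custno, moddate, who
FROM numbered
ORDER BY custno, moddate;
```

For the five sample rows this yields RowID values 1, 2, 1, 2, 1 in moddate order, which is the restart behaviour the question asks for.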

【Comments】:

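Already tried that (it seems like it should work, doesn't it?), but it still returns the same result set, with the last row numbered 14. Keep throwing ideas at me, though. I'm sure I'm just missing some simple step.

【Answer 4】: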

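The only solution I can think of is to use a cursor (ugh) and go through the process RBAR (row by agonizing row). It's not an elegant solution, since the cursor would have to read more than 1 million rows. Bummer.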
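
For reference, the row-by-row approach described here might look like the following sketch. It is only an illustration of the idea, not code from the thread: @chr stands in for the real table, @out is a hypothetical table variable that collects the numbered rows, and all variable names are invented for the example.

```sql
-- @chr is a hypothetical stand-in for <TABLE>; the rows are copied from the question.
DECLARE @chr TABLE (custno INT, moddate DATETIME, who VARCHAR(10));
DECLARE @out TABLE (RowID INT, custno INT, moddate DATETIME, who VARCHAR(10));

INSERT INTO @chr (custno, moddate, who)
SELECT 581827, '20121108 08:59:13', 'EMSZC14' UNION ALL
SELECT 581827, '20121108 09:00:06', 'EMSZC14' UNION ALL
SELECT 581827, '20121108 09:04:39', 'EMSZC49' UNION ALL
SELECT 581827, '20121108 09:05:04', 'EMSZC49' UNION ALL
SELECT 581827, '20121108 09:17:35', 'EMSZC14';

DECLARE @custno INT, @moddate DATETIME, @who VARCHAR(10);
DECLARE @prev_custno INT, @prev_who VARCHAR(10), @rowid INT;

DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT custno, moddate, who FROM @chr ORDER BY custno, moddate;

OPEN cur;
FETCH NEXT FROM cur INTO @custno, @moddate, @who;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Continue counting while custno and who repeat; otherwise restart at 1.
    -- On the first row the previous values are NULL, so the ELSE branch applies.
    SET @rowid = CASE WHEN @custno = @prev_custno AND @who = @prev_who
                      THEN @rowid + 1 ELSE 1 END;

    INSERT INTO @out (RowID, custno, moddate, who)
    VALUES (@rowid, @custno, @moddate, @who);

    SET @prev_custno = @custno;
    SET @prev_who = @who;
    FETCH NEXT FROM cur INTO @custno, @moddate, @who;
END

CLOSE cur;
DEALLOCATE cur;

SELECT RowID, custno, moddate, who FROM @out ORDER BY custno, moddate;
```

As the answer notes, this touches every row one at a time, so a set-based query is likely to be preferable on a table with more than a million rows.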

【Comments】:
