How to update several tables with a result query?
Asked: 2021-10-18 07:57:47
I am using SQL Server 2017, and I need to clean up duplicate rows and update every row in the other tables that references the affected field.
I have a table containing my users:
UserID | UserName
C79784F1-7254-4195-AF7F-66E651F3C995 | Robert
3C51AD27-21F1-4751-9931-7C66263B4708 | Robert
0D67A3E3-E7CF-4D95-935D-E077F4A6D315 | Bob
70A9552A-028B-4EA0-A309-4E93EEAB92E8 | William
1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78 | William
411BCC56-A4C9-4D9B-9D49-FA9255ECA968 | William
F0223C57-E3B2-4F94-9820-2D9A62A515D6 | Cathy
CREATE TABLE [dbo].[Users]
(
[UserID] [uniqueidentifier] NOT NULL,
[UserName] [nvarchar](260) NULL
);
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995','Robert');
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708','Robert');
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315','Bob');
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8','William');
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78','William');
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968','William');
INSERT INTO [dbo].[Users] (userid, username)
VALUES ('F0223C57-E3B2-4F94-9820-2D9A62A515D6','Cathy');
Then I have 7 tables that contain a userid column, and 1 table in which that column has a different name (CreatedById):
CreatedById | CreationDate | Folders
C79784F1-7254-4195-AF7F-66E651F3C995 | 2018-02-24 | Folder1
3C51AD27-21F1-4751-9931-7C66263B4708 | 2019-10-12 | PAD
0D67A3E3-E7CF-4D95-935D-E077F4A6D315 | 2021-05-12 | IEF
70A9552A-028B-4EA0-A309-4E93EEAB92E8 | 2021-01-27 | WIP
1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78 | 2021-06-29 | OLD_ONE
411BCC56-A4C9-4D9B-9D49-FA9255ECA968 | 2021-01-21 | ToTest
CREATE TABLE [dbo].[catalog]
(
[CreatedById] [uniqueidentifier] NOT NULL,
[CreationDate] DATE NOT NULL,
[Folders] [nvarchar](425)
);
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders)
VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995','2018-02-24','Folder1');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders)
VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708','2019-10-12','PAD');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders)
VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315','2021-05-12','IEF');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders)
VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8','2021-01-27','WIP');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders)
VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78','2021-06-29','OLD_ONE');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders)
VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968','2021-01-21','ToTest');
My other tables:
CREATE TABLE table3 ([USERID] [uniqueidentifier] NOT NULL);
CREATE TABLE table4 ([USERID] [uniqueidentifier] NOT NULL);
CREATE TABLE table5 ([USERID] [uniqueidentifier] NOT NULL);
CREATE TABLE table6 ([USERID] [uniqueidentifier] NOT NULL);
INSERT INTO table3 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table3 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table3 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table3 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table3 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table3 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');
INSERT INTO table4 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table4 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table4 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table4 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table4 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table4 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');
INSERT INTO table5 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table5 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table5 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table5 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table5 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table5 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');
INSERT INTO table6 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table6 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table6 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table6 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table6 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table6 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');
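I want to clean up the duplicates and keep only one record per user name in the database.
First, I wrote a query that returns only the duplicated user names, keeping a single record for each.
With that record, I will update table3, table4, table5 and table6: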
WITH singleUser AS
(
SELECT
a.UserName,
a.UserID
FROM
(SELECT
userid,
Username,
ROW_NUMBER() OVER (PARTITION BY username ORDER BY username ASC) AS rowNo,
COUNT(*) OVER (PARTITION BY username) AS c
FROM
dbo.users
WHERE
1 = 1
GROUP BY
userid, Username) a
WHERE
1 = 1
AND rowNo > 1
AND c = rowNo
)
Then I wrote a query that returns all the tables containing my user ID column.
This query returns: table3, table4, table5, table6
WITH tableToUpdate AS
(
SELECT
TABLE_CATALOG AS 'Bdd',
TABLE_SCHEMA AS 'Schema',
TABLE_NAME AS 'TableName',
COLUMN_NAME AS 'ColumnName'
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
1 = 1
AND CASE
WHEN COLUMN_NAME = 'CreatedByID' THEN 1
WHEN COLUMN_NAME = 'UserID' THEN 1
ELSE 0
END = 1
)
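INFORMATION_SCHEMA.COLUMNS lists views as well as tables and matches on the column name alone, so dbo.Users itself is returned too. A hedged sketch of a narrower variant, assuming only base tables with uniqueidentifier columns should be updated and that dbo.Users is to be left out (both are assumptions about the intent, not something the question states):

```sql
-- Sketch: candidate (schema, table, column) triples to repoint.
-- Assumptions: only base tables, only uniqueidentifier columns, skip dbo.Users.
SELECT
    c.TABLE_SCHEMA,
    c.TABLE_NAME,
    c.COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS AS c
JOIN INFORMATION_SCHEMA.TABLES AS t
    ON  t.TABLE_SCHEMA = c.TABLE_SCHEMA
    AND t.TABLE_NAME   = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'               -- excludes views
  AND c.DATA_TYPE  = 'uniqueidentifier'         -- same type as Users.UserID
  AND c.COLUMN_NAME IN ('CreatedByID', 'UserID')
  AND NOT (c.TABLE_SCHEMA = 'dbo' AND c.TABLE_NAME = 'Users')
ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME;
```

Whether the name comparison is case-sensitive depends on the database collation.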
Finally, I wrote my MERGE query:
MERGE INTO dbo.catalog c
USING (SELECT
u.UserID AS UserIDUsers,
su.UserID AS UserIDSingleUser
FROM
dbo.Users u
JOIN
singleUser su ON su.Username = u.username
WHERE
1 = 1) S ON c.CreatedByID = s.UserIDUsers
WHEN MATCHED THEN
UPDATE
SET c.CreatedByID = S.UserIDSingleUser;
Result of my merge:
CreatedById | CreationDate | Folders
C79784F1-7254-4195-AF7F-66E651F3C995 | 2018-02-24 | Folder1
C79784F1-7254-4195-AF7F-66E651F3C995 | 2019-10-12 | PAD
0D67A3E3-E7CF-4D95-935D-E077F4A6D315 | 2021-05-12 | IEF
70A9552A-028B-4EA0-A309-4E93EEAB92E8 | 2021-01-27 | WIP
70A9552A-028B-4EA0-A309-4E93EEAB92E8 | 2021-06-29 | OLD_ONE
70A9552A-028B-4EA0-A309-4E93EEAB92E8 | 2021-01-21 | ToTest
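It works fine, but is there a way to automate it?
I have actually written 8 queries, and only the MERGE part changes between them.
Also, once all the fields have been updated, how do I delete the duplicate rows from the dbo.users table?
Thanks for your help.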
Comments:
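It seems that what you really need to do is stop inserting duplicates in the first place; once the data problem is fixed and prevented from recurring, you don't need an automated process. As a side note, the query you are using to find the duplicates has a "code smell": you have a GROUP BY but no non-windowed aggregates, and both of your queries contain WHERE 1=1; when does 1 not equal 1?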
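Unfortunately I have no choice; the data is already there. Duplicates can no longer be inserted, but the history still has to be dealt with... As for my famous WHERE 1=1, it is there for convenience: you can append extra conditions after it with AND..., and it has no impact on execution time. @Larnu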
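It is there now, yes, but once you delete it, it is gone. If it comes back, the problem is that you allow duplicate data. Remove the duplicates, then fix the design; then (again) you don't need an automated process, because duplicates can no longer occur. You don't need an automated process for a one-off task; the point of a one-off task is that it is (meant to be) done once. If it isn't a one-off, then for something like this the problem lies in the design, and that is what needs fixing. Presumably, here, you need a UNIQUE CONSTRAINT on Username on dbo.users.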
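The design problem, we know about. Before we can delete the duplicates, we need to update all the tables that reference the UserID field, otherwise we will violate constraints; and I can't add the UNIQUE CONSTRAINT as long as I still have duplicate rows. I am working in a development environment, so I could do everything by hand, but when I deploy "this patch" to production it has to be automated. For example, I don't want to run 8 scripts (1 per table); I would like to know whether there is another way, such as a stored procedure.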
Answer 1:
I'm back to answer my own question. After a few days, I finally got it working.
Beforehand, I created a table (dbo.singleUser) from my CTE query (singleUser).
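The answer does not show how the singleUser table mentioned above was built. One way it might be done is sketched here. It assumes the first UserID in SQL Server's uniqueidentifier sort order is the row to keep; that rule is an assumption, chosen only so the result is the same on every run, since ordering by username inside a partition on username leaves the choice undefined.

```sql
-- Sketch: persist one "kept" row per duplicated user name.
-- Assumption: the first UserID in sort order is the one to keep.
DROP TABLE IF EXISTS dbo.singleUser;

WITH ranked AS
(
    SELECT
        UserID,
        UserName,
        ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY UserID) AS rowNo,
        COUNT(*)     OVER (PARTITION BY UserName)                 AS c
    FROM dbo.Users
)
SELECT UserName, UserID
INTO dbo.singleUser
FROM ranked
WHERE rowNo = 1   -- the row to keep
  AND c > 1;      -- only names that occur more than once
```

The procedure below reads from this table.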
CREATE OR ALTER PROCEDURE dbo.mergeUserID
AS
BEGIN
    SET NOCOUNT ON;

    -- sysname is the type SQL Server itself uses for object names
    DECLARE @schemaName sysname;
    DECLARE @tableName  sysname;
    DECLARE @columnName sysname;
    DECLARE @sql        nvarchar(max);

    -- Every column that stores a user ID. dbo.Users and dbo.singleUser are
    -- skipped: they drive the merge, and rewriting Users.UserID part-way
    -- through would break the join for the tables processed afterwards.
    DECLARE cursor_db CURSOR LOCAL FAST_FORWARD FOR
        SELECT
            TABLE_SCHEMA
            ,TABLE_NAME
            ,COLUMN_NAME
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE COLUMN_NAME IN ('CreatedByID', 'ModifiedByID', 'OwnerID', 'UserID')
          AND NOT (TABLE_SCHEMA = 'dbo' AND TABLE_NAME IN ('Users', 'singleUser'));

    OPEN cursor_db;
    FETCH NEXT FROM cursor_db INTO @schemaName, @tableName, @columnName;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- QUOTENAME keeps the dynamic SQL valid for unusual identifiers
        SET @sql = N'MERGE INTO ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N' t
USING (
    SELECT
        u.UserID AS UserIDUsers
        ,su.UserID AS UserIDSingleUser
    FROM dbo.Users u
    JOIN dbo.singleUser su ON su.UserName = u.UserName
) S ON t.' + QUOTENAME(@columnName) + N' = S.UserIDUsers
WHEN MATCHED THEN
    UPDATE
    SET t.' + QUOTENAME(@columnName) + N' = S.UserIDSingleUser;';

        PRINT @sql;                 -- log the statement before running it
        EXEC sp_executesql @sql;

        FETCH NEXT FROM cursor_db INTO @schemaName, @tableName, @columnName;
    END;

    CLOSE cursor_db;
    DEALLOCATE cursor_db;
END;
GO
------------------------------------------------
DECLARE @RC int;  -- a procedure's return status is an integer
EXECUTE @RC = [dbo].[mergeUserID];
PRINT @RC;
GO
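Once the procedure has repointed every reference, the remaining step from the question, removing the duplicate rows from dbo.Users, might be sketched as follows. This assumes dbo.singleUser still holds the kept UserID for each duplicated name; the constraint name UQ_Users_UserName is hypothetical.

```sql
-- Sketch: delete the now-unreferenced duplicates, then block new ones.
SET XACT_ABORT ON;   -- roll the whole transaction back on any error
BEGIN TRANSACTION;

-- Remove every row whose name is duplicated but whose ID is not the kept one.
DELETE u
FROM dbo.Users AS u
JOIN dbo.singleUser AS su
    ON su.UserName = u.UserName
WHERE u.UserID <> su.UserID;

-- A UNIQUE constraint in SQL Server allows at most one NULL UserName,
-- so this step fails if several rows have no name.
ALTER TABLE dbo.Users
    ADD CONSTRAINT UQ_Users_UserName UNIQUE (UserName);

COMMIT TRANSACTION;
```

Adding the constraint in the same transaction means the cleanup only commits if the table really is free of duplicates afterwards.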
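I don't know whether this procedure is well written, since this is my first time doing this.
For example, on some forums I see people put a ; after FETCH / CLOSE / DEALLOCATE, while others don't:
with semicolon
Microsoft without semicolon
So who is right and who is wrong? I don't know.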
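On the semicolon question: both forms currently parse, but Microsoft's Transact-SQL syntax conventions list omitting the statement terminator as deprecated, so terminating every statement, cursor statements included, is the safer habit. A minimal self-contained sketch (the cursor name and the query are hypothetical, chosen only for illustration):

```sql
-- Sketch: a cursor loop with every statement terminated by a semicolon.
DECLARE @name sysname;

DECLARE demo_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.tables;

OPEN demo_cursor;
FETCH NEXT FROM demo_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;
    FETCH NEXT FROM demo_cursor INTO @name;
END;

CLOSE demo_cursor;
DEALLOCATE demo_cursor;
```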