Find matching records based on dynamic columns
【Posted】: 2018-08-15 03:48:43
【Question】: I have a list of pets:
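I need to find the correct owner for each pet from the Owner table.
To match each pet to an owner correctly, I need to use a special matching table, which looks like this:
So, for the pet with PetID=2, I need to find the matching owner based on three fields: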
Pet.Zip = Owner.Zip
and Pet.OwnerName = Owner.Name
and Pet.Document = Owner.Document
In our example it would work like this:
select top 1 OwnerID from owners
where Zip = 23456
and Name = 'Alex'
and Document = 'a.csv'
If no OwnerID is found, I need to match on 2 fields (leaving out the field with the highest Priority value).
In our example:
select top 1 OwnerID from owners where
Name = 'Alex'
and Document = 'a.csv'
Since no record is found, we need to match on fewer fields. In our example:
select top 1 OwnerID from owners where Document = 'a.csv'
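Now we have found the owner with OwnerID = 6.
We then need to update the pet with OwnerID = 6, and after that we can move on to the next pet.
The only way I can do this right now is with a loop or cursor plus dynamic SQL.
Can this be achieved without a loop and dynamic SQL? Maybe with STUFF + PIVOT somehow?
SQL Fiddle: http://sqlfiddle.com/#!18/10982/1/0
Sample data: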
create table temp_builder
(
PetID int not null,
Field varchar(30) not null,
MatchTo varchar(30) not null,
Priority int not null
)
insert into temp_builder values
(1,'Address', 'Addr',4),
(1,'Zip', 'Zip', 3),
(1,'Country', 'Country', 2),
(1,'OwnerName', 'Name',1),
(2,'Zip', 'Zip',3),
(2,'OwnerName','Name', 2),
(2,'Document', 'Document', 1),
(3,'Country', 'Country', 1)
create table temp_pets
(
PetID int null,
Address varchar(100) null,
Zip int null,
Country varchar(100) null,
Document varchar(100) null,
OwnerName varchar(100) null,
OwnerID int null,
Field1 bit null,
Field2 bit null
)
insert into temp_pets values
(1, '123 5th st', 12345, 'US', 'test.csv', 'John', NULL, NULL, NULL),
(2, '234 6th st', 23456, 'US', 'a.csv', 'Alex', NULL, NULL, NULL),
(3, '345 7th st', 34567, 'US', 'b.csv', 'Mike', NULL, NULL, NULL)
create table temp_owners
(
OwnerID int null,
Addr varchar(100) null,
Zip int null,
Country varchar(100) null,
Document varchar(100) null,
Name varchar(100) null,
OtherField bit null,
OtherField2 bit null,
)
insert into temp_owners values
(1, '456 8th st', 45678, 'US', 'c.csv', 'Mike', NULL, NULL),
(2, '678 9th st', 45678, 'US', 'b.csv', 'John', NULL, NULL),
(3, '890 10th st', 45678, 'US', 'b.csv', 'Alex', NULL, NULL),
(4, '901 11th st', 23456, 'US', 'b.csv', 'Alex', NULL, NULL),
(5, '234 5th st', 12345, 'US', 'b.csv', 'John', NULL, NULL),
(6, '123 5th st', 45678, 'US', 'a.csv', 'John', NULL, NULL)
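To make the fallback rule concrete, here is a minimal sketch of it in Python (used instead of T-SQL only so that it runs without a database). It works on the sample data above. The function name `find_owner` is hypothetical, and breaking ties by the lowest OwnerID is an assumption, since the question only says `top 1` without an ordering.

```python
# Sample data from the question, reduced to the columns used for matching.
PET_COLS = ("Address", "Zip", "Country", "Document", "OwnerName")
OWNER_COLS = ("Addr", "Zip", "Country", "Document", "Name")
BUILDER = [  # (PetID, Field, MatchTo, Priority)
    (1, "Address", "Addr", 4), (1, "Zip", "Zip", 3),
    (1, "Country", "Country", 2), (1, "OwnerName", "Name", 1),
    (2, "Zip", "Zip", 3), (2, "OwnerName", "Name", 2),
    (2, "Document", "Document", 1), (3, "Country", "Country", 1),
]
PETS = {
    1: dict(zip(PET_COLS, ("123 5th st", 12345, "US", "test.csv", "John"))),
    2: dict(zip(PET_COLS, ("234 6th st", 23456, "US", "a.csv", "Alex"))),
    3: dict(zip(PET_COLS, ("345 7th st", 34567, "US", "b.csv", "Mike"))),
}
OWNERS = {
    1: dict(zip(OWNER_COLS, ("456 8th st", 45678, "US", "c.csv", "Mike"))),
    2: dict(zip(OWNER_COLS, ("678 9th st", 45678, "US", "b.csv", "John"))),
    3: dict(zip(OWNER_COLS, ("890 10th st", 45678, "US", "b.csv", "Alex"))),
    4: dict(zip(OWNER_COLS, ("901 11th st", 23456, "US", "b.csv", "Alex"))),
    5: dict(zip(OWNER_COLS, ("234 5th st", 12345, "US", "b.csv", "John"))),
    6: dict(zip(OWNER_COLS, ("123 5th st", 45678, "US", "a.csv", "John"))),
}


def find_owner(pet_id):
    """Return (owner_id, number_of_fields_matched), or (None, 0) if nothing matches."""
    pet = PETS[pet_id]
    # Priority 1 is kept longest, so sort ascending and drop rules from the end.
    rules = sorted((r for r in BUILDER if r[0] == pet_id), key=lambda r: r[3])
    while rules:
        for owner_id in sorted(OWNERS):  # assumption: lowest OwnerID wins ties
            owner = OWNERS[owner_id]
            if all(pet[field] == owner[match_to] for _, field, match_to, _ in rules):
                return owner_id, len(rules)
        rules.pop()  # give up the field with the highest Priority value
    return None, 0


for pet_id in PETS:
    print(pet_id, find_owner(pet_id))
```

On the sample data this assigns owner 5 to pet 1 (matched on three fields), owner 6 to pet 2 (matched on Document only) and owner 1 to pet 3 (every owner matches on Country, so the tie-break decides).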
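Edit: I am overwhelmed by the many great suggestions and responses. I have tested them, and many of them work well for me. Unfortunately, I can only reward one solution.
【Comments】:
I don't understand your priority rules. Why does country have a higher priority than zip code?
@TimBiegeleisen, for pet #1 I need to try matching on Address, Zip, Country, OwnerName. If there is no match, then on Zip, Country, OwnerName; if no match, then on Country, OwnerName; if no match, then on OwnerName. So we go from more specific to less specific. I came up with the column names in this example only to keep things simple.
There has to be dynamic SQL; otherwise, how would you use column names that are stored in another column...? If dynamic SQL is not an option, then your question has no answer, IMO.
@MichałTurczyn I can use dynamic SQL, but I would like to make it more efficient (my table has over 1 million records). I want to avoid loops if possible.
I have edited your question and copied the sample data from your fiddle link into the body of the question. I also added the [sql-server] and [tsql] tags to reach a wider audience.
【Answer 1】: The use of cursors, loops and dynamic SQL can be avoided by treating the fields used for comparison as entries in a bit set for each pet. A bit set (the FieldSetRank column) is calculated for each priority from the individual bit entries (the FieldRank column).
The Pets and Owners tables have to be unpivoted so that the fields and their associated values can be compared. Each field and value that has been matched is assigned the corresponding FieldRank. A new bit set is then calculated from the matched values (MatchSetRank). Only records where the matched set (MatchSetRank) equals the desired set (FieldSetRank) are returned.
The query performs a final ranking to return the records with the highest MatchSetRank (the records that match the largest number of columns while still respecting the priority criteria). The T-SQL below demonstrates the concept.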
;WITH CTE_Builder
AS
(
SELECT [PetID]
,[Field]
,[Priority]
,[MatchTo]
,POWER(2, [Priority] - 1) AS [FieldRank] -- Define the field ranking as bit set numbered item.
,SUM(POWER(2, [Priority] - 1)) OVER (PARTITION BY [PetID] ORDER BY [Priority] ROWS UNBOUNDED PRECEDING) FieldSetRank -- Sum all the bit set IDs to define what constitutes a completed field set ordered by priority.
FROM temp_builder
),
CTE_PetsUnpivoted
AS
( -- Unpivot pets table and assign Field Rank and Field Set Rank.
SELECT [PetsUnPivot].[PetID]
,[PetsUnPivot].[Field]
,[Builder].[MatchTo]
,[PetsUnPivot].[FieldValue]
,[Builder].[Priority]
,[Builder].[FieldRank]
,[Builder].[FieldSetRank]
FROM
(
SELECT [PetID], [Address], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [OwnerName]
FROM temp_pets
) [Pets]
UNPIVOT
(FieldValue FOR Field IN
([Address], [Zip], [Country], [Document], [OwnerName])
) AS [PetsUnPivot]
INNER JOIN [CTE_Builder] [Builder] ON [PetsUnPivot].PetID = [Builder].PetID AND [PetsUnPivot].Field = [Builder].Field
),
CTE_Owners
AS
(
-- Unpivot Owners table and join with unpivoted Pets table on field name and field value.
-- Next assign the Pets field rank, then calculate the field set rank (MatchSetRank) based on actual matches made.
SELECT [OwnersUnPivot].[OwnerID]
,[Pets].[PetID]
,[OwnersUnPivot].[Field]
,[Pets].Field AS [PetField]
,[Pets].FieldValue as PetFieldValue
,[OwnersUnPivot].[FieldValue]
,[Pets].[Priority]
,[Pets].[FieldRank]
,[Pets].[FieldSetRank]
,SUM([FieldRank]) OVER (PARTITION BY [Pets].[PetID], [OwnersUnPivot].[OwnerID] ORDER BY [Pets].[Priority] ROWS UNBOUNDED PRECEDING) MatchSetRank
FROM
(
SELECT [OwnerID], [Addr], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [Name]
FROM temp_owners
) [Owners]
UNPIVOT
(FieldValue FOR Field IN
([Addr], [Zip], [Country], [Document], [Name])
) AS [OwnersUnPivot]
INNER JOIN [CTE_PetsUnpivoted] [Pets] ON [OwnersUnPivot].[Field] = [Pets].[MatchTo] AND [OwnersUnPivot].[FieldValue] = [Pets].[FieldValue]
),
CTE_FinalRanking
AS
(
SELECT [PetID]
,[OwnerID]
-- Calculate final rank; if multiple matches have the same rank then multiple rows will be returned per pet.
-- Change the "RANK()" function to "ROW_NUMBER()" to only return one result per pet.
,RANK() OVER (PARTITION BY [PetID] ORDER BY [MatchSetRank] DESC) AS [FinalRank]
FROM CTE_Owners
WHERE [FieldSetRank] = [MatchSetRank] -- Only return records where the field sets calculated based on
-- actual matches is equal to desired field set ranks. This will
-- eliminate matches where the number of fields that meets the
-- criteria is the same but does not meet priority requirements.
)
SELECT [PetID]
,[OwnerID]
FROM CTE_FinalRanking
WHERE [FinalRank] = 1
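The bit-set bookkeeping in the query above can be followed more easily on plain data. The sketch below is a Python re-statement of the same idea, not the author's code: `field_rank`, `field_set_rank` and `match_set_rank` mirror the FieldRank, FieldSetRank and MatchSetRank columns, while `best_matches` and `owner_rank` are hypothetical names. Like `RANK()`, it keeps every owner that ties for the best rank.

```python
PET_COLS = ("Address", "Zip", "Country", "Document", "OwnerName")
OWNER_COLS = ("Addr", "Zip", "Country", "Document", "Name")
BUILDER = [  # (PetID, Field, MatchTo, Priority)
    (1, "Address", "Addr", 4), (1, "Zip", "Zip", 3),
    (1, "Country", "Country", 2), (1, "OwnerName", "Name", 1),
    (2, "Zip", "Zip", 3), (2, "OwnerName", "Name", 2),
    (2, "Document", "Document", 1), (3, "Country", "Country", 1),
]
PETS = {
    1: dict(zip(PET_COLS, ("123 5th st", 12345, "US", "test.csv", "John"))),
    2: dict(zip(PET_COLS, ("234 6th st", 23456, "US", "a.csv", "Alex"))),
    3: dict(zip(PET_COLS, ("345 7th st", 34567, "US", "b.csv", "Mike"))),
}
OWNERS = {
    1: dict(zip(OWNER_COLS, ("456 8th st", 45678, "US", "c.csv", "Mike"))),
    2: dict(zip(OWNER_COLS, ("678 9th st", 45678, "US", "b.csv", "John"))),
    3: dict(zip(OWNER_COLS, ("890 10th st", 45678, "US", "b.csv", "Alex"))),
    4: dict(zip(OWNER_COLS, ("901 11th st", 23456, "US", "b.csv", "Alex"))),
    5: dict(zip(OWNER_COLS, ("234 5th st", 12345, "US", "b.csv", "John"))),
    6: dict(zip(OWNER_COLS, ("123 5th st", 45678, "US", "a.csv", "John"))),
}


def best_matches():
    """Map each PetID to (best rank, list of OwnerIDs that reach it)."""
    best = {}
    for pet_id, pet in PETS.items():
        rules = sorted((r for r in BUILDER if r[0] == pet_id), key=lambda r: r[3])
        top_rank, top_owners = 0, []
        for owner_id, owner in OWNERS.items():
            field_set_rank = 0  # running sum over every rule up to this priority
            match_set_rank = 0  # running sum over the rules that actually matched
            owner_rank = 0      # highest rank at which the two sums agree
            for _, field, match_to, priority in rules:
                field_rank = 2 ** (priority - 1)
                field_set_rank += field_rank
                if pet[field] == owner[match_to]:
                    match_set_rank += field_rank
                    if match_set_rank == field_set_rank:
                        owner_rank = match_set_rank
            if owner_rank > top_rank:
                top_rank, top_owners = owner_rank, [owner_id]
            elif owner_rank == top_rank and owner_rank > 0:
                top_owners.append(owner_id)
        best[pet_id] = (top_rank, top_owners)
    return best


print(best_matches())
```

The two sums agree only when every priority from 1 up to the current one has matched, which is why an owner that matches priorities 2 and 3 but not 1 never qualifies.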
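【Comments】:
I like this approach; I came up with a similar one. For more speed, you could store the UNPIVOTed owner field data in a temp table and index it on field name/field value.
Similar to the approach I posted below. The key point is that the matching columns are actually static. Only the priorities are dynamic.
Thanks for your answer! It looks good, but unfortunately static columns won't work for me. They can change.
【Answer 2】: I'll say this right away to save you time:
My solution uses dynamic SQL. Michał Turczyn correctly pointed out that you cannot avoid it when the names of the compared columns are stored in the database.
My solution uses loops. And I strongly believe you will not solve this problem with a pure SQL query that runs fast enough on the data size you stated (tables with > 1M records). The logic you described inherently implies iteration, from larger sets of matched fields to smaller ones. SQL as a query language was not designed to cover such tricky scenarios. You can try to solve your problem with a pure SQL query, but even if you manage to build such a query, it will be very tricky, complex and unclear. I am not a fan of such solutions, which is why I did not even dig in that direction.
On the other hand, my solution does not require creating temp tables, which is an advantage.
Given this, my approach is fairly simple:
There is an outer loop that iterates from the largest matcher set (all matching fields) to the smallest (one field). On the first iteration, when we do not yet know how many matchers are stored in the database for the pet, we read and use all of them. On each following iteration we reduce the number of matchers used by 1 (dropping the matcher with the highest Priority value).
An inner loop iterates over the current matcher set and builds the WHERE clause that compares fields between the Pets and Owners tables.
The current query is executed, and if some owner matches the given criteria, we break out of the outer loop.
Here is the code that implements this logic: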
DECLARE @PetId INT = 2;
DECLARE @MatchersLimit INT;
DECLARE @OwnerID INT;
WHILE (@MatchersLimit IS NULL OR @MatchersLimit > 0) AND @OwnerID IS NULL
BEGIN
DECLARE @CurrMatchFilter VARCHAR(max) = ''
DECLARE @Field VARCHAR(30)
DECLARE @MatchTo VARCHAR(30)
DECLARE @CurrMatchersNumber INT = 0;
DECLARE @GetMatchers CURSOR;
IF @MatchersLimit IS NULL
SET @GetMatchers = CURSOR FOR SELECT Field, MatchTo FROM temp_builder WHERE PetID = @PetId ORDER BY Priority ASC;
ELSE
SET @GetMatchers = CURSOR FOR SELECT TOP (@MatchersLimit) Field, MatchTo FROM temp_builder WHERE PetID = @PetId ORDER BY Priority ASC;
OPEN @GetMatchers;
FETCH NEXT FROM @GetMatchers INTO @Field, @MatchTo;
WHILE @@FETCH_STATUS = 0
BEGIN
IF @CurrMatchFilter <> '' SET @CurrMatchFilter = @CurrMatchFilter + ' AND ';
SET @CurrMatchFilter = @CurrMatchFilter + ('temp_pets.' + @Field + ' = ' + 'temp_owners.' + @MatchTo);
FETCH NEXT FROM @GetMatchers INTO @Field, @MatchTo;
SET @CurrMatchersNumber = @CurrMatchersNumber + 1;
END
CLOSE @GetMatchers;
DEALLOCATE @GetMatchers;
IF @CurrMatchersNumber = 0 BREAK;
DECLARE @CurrQuery nvarchar(max) = N'SELECT @id = temp_owners.OwnerID FROM temp_owners INNER JOIN temp_pets ON (' + CAST(@CurrMatchFilter AS NVARCHAR(MAX)) + N') WHERE temp_pets.PetID = ' + CAST(@PetId AS NVARCHAR(MAX));
EXECUTE sp_executesql @CurrQuery, N'@id int OUTPUT', @id=@OwnerID OUTPUT;
IF @MatchersLimit IS NULL
SET @MatchersLimit = @CurrMatchersNumber - 1;
ELSE
SET @MatchersLimit = @MatchersLimit - 1;
END
SELECT @OwnerID AS OwnerID, @MatchersLimit + 1 AS Matched;
Performance considerations
Essentially two queries are executed in this approach:
SELECT Field, MatchTo FROM temp_builder WHERE PetID = @PetId;
You should add an index on the PetID column of the temp_builder table, and this query will then execute very quickly.
SELECT @id = temp_owners.OwnerID FROM temp_owners INNER JOIN temp_pets ON (temp_pets.Document = temp_owners.Document AND temp_pets.OwnerName = temp_owners.Name AND temp_pets.Zip = temp_owners.Zip AND ...) WHERE temp_pets.PetID = @PetId;
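This query looks scary because it joins two large tables, temp_owners and temp_pets. However, the temp_pets table is filtered by the PetID column, which should yield just one record. So, if you have an index on the temp_pets.PetID column (and you should, since this column looks like a primary key), the query will result in a scan of the temp_owners table. Such a scan should not take long even for a table with over 1M rows. If the query is still too slow, you could consider adding indexes on the temp_owners columns that are used in matchers (Addr, Zip, etc.). Adding indexes has downsides, such as a bigger database and slower insert/update operations, so before adding indexes on temp_owners columns, check the query speed on the table without them.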
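【Comments】:
【Answer 3】: I am not sure the end result is correct, but I suggest using a couple of common table expressions to generate a batch of update statements with dynamic SQL (I am afraid it cannot be done without dynamic SQL), and then executing them using Exec(sql).
The benefit of this approach is that it involves no loops or cursors.
Each update statement I generate uses an inner join between the pets and owners tables, updating the pets table's owner ID with the owners table's owner ID, using the mappings from the builder table as the basis for the on clause.
The first CTE is responsible for generating the on clauses from the builder table, and the second CTE is responsible for generating the update statements.
Finally, I select all the SQL statements from the second CTE into a single nvarchar(max) variable and execute it.
The way I address the priority problem is by generating one update statement per group of priorities, starting with a statement that includes all priorities and excluding one more value from each subsequent statement, highest Priority value first, until I am left with an on clause that maps only one pair of columns.
So, the first thing is to declare a variable to hold the generated update statements: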
DECLARE @Sql nvarchar(max) = ''
Now, the first CTE uses cross apply with stuff and for xml to generate the on clause for each pair of petId and Priority:
;WITH OnClauseCTE AS
(
SELECT DISTINCT PetId, Priority, OnClause
FROM temp_builder t0
CROSS APPLY
(
SELECT STUFF (
(
SELECT ' AND p.'+ Field +' = o.'+ MatchTo
FROM temp_builder t1
WHERE PetID = t0.PetId
AND Priority <= t0.Priority
FOR XML PATH('')
)
, 1, 5, '') As OnClause
) onClauseGenerator
)
The second CTE generates an UPDATE statement for each combination of petId and Priority:
, UpdateStatementCTE AS
(
SELECT PetId,
Priority,
'UPDATE p
SET OwnerID = o.OwnerID
FROM temp_pets p
INNER JOIN temp_owners o ON ' + OnClause + '
WHERE p.PetId = '+ CAST(PetId as varchar(10)) +'
AND p.OwnerID IS NULL; -- THIS IS CRITICAL!
' AS SQL
FROM OnClauseCTE
)
Finally, generate one batch of update statements from UpdateStatementCTE:
SELECT @Sql = @Sql + SQL
FROM UpdateStatementCTE
ORDER BY PetId, Priority DESC -- ORDER BY Priority is CRITICAL!
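The order by PetId part is only there for readability when you print out the contents of @Sql. However, the Priority DESC part of the order by clause is critical, because we want the statement for the highest Priority value to execute first and the one for the lowest to execute last.
Now, @Sql contains the following (shortened):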
UPDATE p
SET OwnerID = o.OwnerID
FROM temp_pets p
INNER JOIN temp_owners o ON p.Address = o.Addr AND p.Zip = o.Zip AND p.Country = o.Country AND p.OwnerName = o.Name
WHERE p.PetId = 1
AND p.OwnerID IS NULL;
...
UPDATE p
SET OwnerID = o.OwnerID
FROM temp_pets p
INNER JOIN temp_owners o ON p.OwnerName = o.Name
WHERE p.PetId = 1
AND p.OwnerID IS NULL;
...
UPDATE p
SET OwnerID = o.OwnerID
FROM temp_pets p
INNER JOIN temp_owners o ON p.OwnerName = o.Name AND p.Document = o.Document
WHERE p.PetId = 2
AND p.OwnerID IS NULL;
...
UPDATE p
SET OwnerID = o.OwnerID
FROM temp_pets p
INNER JOIN temp_owners o ON p.Country = o.Country
WHERE p.PetId = 3
AND p.OwnerID IS NULL;
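As you can see, each update statement corresponds to a row in the builder table, and it changes the owner ID only if a previous update statement has not already changed it, because of the AND p.OwnerID IS NULL part of the where clause.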
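As a cross-check on the generated batch, the sketch below reproduces in Python what the first CTE computes: one ON clause per (PetId, Priority) pair, built from every builder row of that pet whose Priority is less than or equal to the current one, and listed in Priority DESC order. The function name `on_clauses` is hypothetical. The conditions are joined in the order of the builder rows, which is an assumption on the Python side; the T-SQL subquery has no ORDER BY of its own.

```python
BUILDER = [  # (PetID, Field, MatchTo, Priority), as in temp_builder
    (1, "Address", "Addr", 4), (1, "Zip", "Zip", 3),
    (1, "Country", "Country", 2), (1, "OwnerName", "Name", 1),
    (2, "Zip", "Zip", 3), (2, "OwnerName", "Name", 2),
    (2, "Document", "Document", 1), (3, "Country", "Country", 1),
]


def on_clauses(builder):
    """Return (PetId, Priority, ON clause) rows, most specific clause first per pet."""
    keys = sorted({(pet_id, prio) for pet_id, _, _, prio in builder},
                  key=lambda k: (k[0], -k[1]))
    rows = []
    for pet_id, priority in keys:
        conditions = [f"p.{field} = o.{match_to}"
                      for pid, field, match_to, prio in builder
                      if pid == pet_id and prio <= priority]
        rows.append((pet_id, priority, " AND ".join(conditions)))
    return rows


for row in on_clauses(BUILDER):
    print(row)
```

Each printed clause corresponds to one UPDATE statement in the batch, eight in total for the sample data.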
After running this batch of update statements, your temp_pets table looks like this:
PetID  Address     Zip    Country  Document  OwnerName  OwnerID  Field1  Field2
1      123 5th st  12345  US       test.csv  John       5        NULL    NULL
2      234 6th st  23456  US       a.csv     Alex       6        NULL    NULL
3      345 7th st  34567  US       b.csv     Mike       1        NULL    NULL
You can see a live demo on rextester.
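Note, however, that the fewer conditions you have, the more records the join may return, which makes the update more likely to be inaccurate.
For example, for PetId 3 I got OwnerId 1 because the only thing I had to match records on was the Country column, which means it could actually be any OwnerId in this sample data, since every owner has the same value, US, in the Country column.
Under the given rules, there is not much I can do about that.
【Comments】:
【Answer 4】: The following approach is based on the fact that the number of distinct combinations of selecting and ordering the columns to match is limited, and probably far smaller than the number of records. For 5 columns, the total number of combinations is 325, but since it is unlikely that all possible combinations are used, the actual number is probably less than 100. Compared to the number of records (OP mentions >1M), it is worthwhile trying to combine pets that have the same column combination.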
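The figure of 325 can be checked directly: a rule set is an ordered selection of k of the 5 columns, for k from 1 to 5, so the total is the sum of the permutation counts P(5, k). A short check in Python (`math.perm` needs Python 3.8 or later; the variable names are only illustrative):

```python
from math import perm

COLUMNS = 5
# Number of ordered selections of k columns out of 5, for each k.
orderings = {k: perm(COLUMNS, k) for k in range(1, COLUMNS + 1)}
total = sum(orderings.values())
print(orderings, total)
```

That is 5 + 20 + 60 + 120 + 120 = 325.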
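Characteristics of the following SQL script:
No dynamic SQL.
Loops, but no cursors; the number of iterations is limited and does not grow proportionally with the number of records.
Creates two (indexed) helper tables. (Feel free to make them temp tables or table variables.) This greatly speeds up the matching process (INNER JOINs), but does bring some overhead when populating the tables.
Only simple SQL constructs (no pivots, no STUFF with FOR XML, not even CTEs).
Only relies on indexes on the key columns (PetID, OwnerID), the Priority column, and the columns in the helper tables. No indexes are needed on Address, Zip, Country, Document, Name.
At first glance, the query seems like complete overkill (47 SQL statements executed on the small sample data provided by OP), but for larger tables the advantage should become apparent. Worst-case time complexity should be O(n log n), which is much better than many alternatives. But of course it still needs to prove itself in practice; I have not tested it with large data sets yet.
Fiddle: http://sqlfiddle.com/#!18/53320/1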
-- Adding indexes to OP's tables to optimize the queries that follow.
CREATE INDEX IX_PetID ON temp_builder (PetID)
CREATE INDEX IX_Priority ON temp_builder (Priority)
CREATE INDEX IX_PetID ON temp_pets (PetID)
CREATE INDEX IX_OwnerID ON temp_owners (OwnerID)
-- Helper table for pets. Each column has its own index.
CREATE TABLE PetKey (
PetID int NOT NULL PRIMARY KEY CLUSTERED,
KeyNames varchar(200) NOT NULL INDEX IX_KeyNames NONCLUSTERED,
KeyValues varchar(900) NOT NULL INDEX IX_KeyValues NONCLUSTERED
)
-- Helper table for owners. Each column has its own index.
CREATE TABLE OwnerKey (
OwnerID int NOT NULL PRIMARY KEY CLUSTERED,
KeyValues varchar(900) NULL INDEX IX_KeyValues NONCLUSTERED
)
-- For every pet, create a record in table PetKey.
-- (Unless the pet already belongs to someone.)
INSERT INTO PetKey (PetID, KeyNames, KeyValues)
SELECT PetID, '', ''
FROM temp_pets
WHERE OwnerID IS NULL
-- For every owner, create a record in table OwnerKey.
INSERT INTO OwnerKey (OwnerID, KeyValues)
SELECT OwnerID, ''
FROM temp_owners
-- Populate columns KeyNames and KeyValues in table PetKey.
-- Lowest priority (i.e. highest number in column Priority) comes first.
-- We use CHAR(1) as a separator character; anything will do as long as it does not occur in any column values.
-- Example: when a pet has address as prio 1, zip as prio 2, then:
-- KeyNames = 'Zip' + CHAR(1) + 'Address' + CHAR(1)
-- KeyValues = '12345' + CHAR(1) + '123 5th st' + CHAR(1)
-- NULL is replaced by CHAR(2); can be any value as long as it does not match any owner's value.
DECLARE @priority int = 1
WHILE EXISTS (SELECT * FROM temp_builder WHERE Priority = @priority)
BEGIN
UPDATE pk
SET KeyNames = b.Field + CHAR(1) + KeyNames,
KeyValues = ISNULL(CASE b.Field
WHEN 'Address' THEN p.Address
WHEN 'Zip' THEN CAST(p.Zip AS varchar)
WHEN 'Country' THEN p.Country
WHEN 'Document' THEN p.Document
WHEN 'OwnerName' THEN p.OwnerName
END, CHAR(2)) +
CHAR(1) + KeyValues
FROM PetKey pk
INNER JOIN temp_pets p ON p.PetID = pk.PetID
INNER JOIN temp_builder b ON b.PetID = pk.PetID
WHERE b.Priority = @priority
SET @priority = @priority + 1
END
-- Loop through all distinct key combinations.
DECLARE @maxKeyNames varchar(200), @namesToAdd varchar(200), @index int
SELECT @maxKeyNames = MAX(KeyNames) FROM PetKey
WHILE @maxKeyNames <> '' BEGIN
-- Populate column KeyValues in table OwnerKey.
-- The order of the values is determined by the column names listed in @maxKeyNames.
UPDATE OwnerKey
SET KeyValues = ''
SET @namesToAdd = @maxKeyNames
WHILE @namesToAdd <> '' BEGIN
SET @index = CHARINDEX(CHAR(1), @namesToAdd)
UPDATE ok
SET KeyValues = KeyValues +
CASE LEFT(@namesToAdd, @index - 1)
WHEN 'Address' THEN o.Addr
WHEN 'Zip' THEN CAST(o.Zip AS varchar)
WHEN 'Country' THEN o.Country
WHEN 'Document' THEN o.Document
WHEN 'OwnerName' THEN o.Name
END +
CHAR(1)
FROM OwnerKey ok
INNER JOIN temp_owners o ON o.OwnerID = ok.OwnerID
SET @namesToAdd = SUBSTRING(@namesToAdd, @index + 1, 200)
END
-- Match pets with owners, based on their KeyValues.
UPDATE p
SET OwnerID = (SELECT TOP 1 ok.OwnerID FROM OwnerKey ok WHERE ok.KeyValues = pk.KeyValues)
FROM temp_pets p
INNER JOIN PetKey pk ON pk.PetID = p.PetID
WHERE pk.KeyNames = @maxKeyNames
-- Pets that were successfully matched are removed from PetKey.
DELETE FROM pk
FROM PetKey pk
INNER JOIN temp_pets p ON p.PetID = pk.PetID
WHERE p.OwnerID IS NOT NULL
-- For pets with no match, strip off the first (lowest priority) name and value.
SET @namesToAdd = SUBSTRING(@maxKeyNames, CHARINDEX(CHAR(1), @maxKeyNames) + 1, 200)
UPDATE pk
SET KeyNames = @namesToAdd,
KeyValues = SUBSTRING(KeyValues, CHARINDEX(CHAR(1), KeyValues) + 1, 900)
FROM PetKey pk
INNER JOIN temp_pets p ON p.PetID = pk.PetID
WHERE pk.KeyNames = @maxKeyNames
-- Next key combination.
SELECT @maxKeyNames = MAX(KeyNames) FROM PetKey
END
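The key-string idea in the script above can be illustrated on a single pet. The sketch below is a simplified Python model, not a translation of the script: it handles pet 2 from the sample data only, uses a dictionary as a stand-in for the indexed OwnerKey table, and skips the NULL handling. The names `make_key`, `match` and `TO_OWNER_COL` are hypothetical, and picking the lowest OwnerID on ties is an assumption (the script uses TOP 1 without an ORDER BY).

```python
SEP = "\x01"  # plays the role of CHAR(1) in the script
OWNER_COLS = ("Addr", "Zip", "Country", "Document", "Name")
TO_OWNER_COL = {"Address": "Addr", "Zip": "Zip", "Country": "Country",
                "Document": "Document", "OwnerName": "Name"}
OWNERS = {
    1: dict(zip(OWNER_COLS, ("456 8th st", 45678, "US", "c.csv", "Mike"))),
    2: dict(zip(OWNER_COLS, ("678 9th st", 45678, "US", "b.csv", "John"))),
    3: dict(zip(OWNER_COLS, ("890 10th st", 45678, "US", "b.csv", "Alex"))),
    4: dict(zip(OWNER_COLS, ("901 11th st", 23456, "US", "b.csv", "Alex"))),
    5: dict(zip(OWNER_COLS, ("234 5th st", 12345, "US", "b.csv", "John"))),
    6: dict(zip(OWNER_COLS, ("123 5th st", 45678, "US", "a.csv", "John"))),
}
# Pet 2 from the sample data, with its rules listed lowest priority first.
PET = {"Zip": 23456, "OwnerName": "Alex", "Document": "a.csv"}
RULES = ["Zip", "OwnerName", "Document"]


def make_key(row, columns):
    """Concatenate the chosen column values, each followed by the separator."""
    return "".join(str(row[c]) + SEP for c in columns)


def match(pet, rules, owners):
    """Return (owner_id, names that matched), trying ever shorter key strings."""
    names = list(rules)
    while names:
        owner_cols = [TO_OWNER_COL[n] for n in names]
        # Stand-in for the indexed OwnerKey table: key string -> lowest OwnerID.
        owner_index = {}
        for owner_id in sorted(owners):
            owner_index.setdefault(make_key(owners[owner_id], owner_cols), owner_id)
        pet_key = make_key(pet, names)
        if pet_key in owner_index:
            return owner_index[pet_key], names
        names = names[1:]  # strip the first (lowest priority) name and value
    return None, []


print(match(PET, RULES, OWNERS))
```

Because the comparison is a single equality on one string, all pets that share the same KeyNames can be matched in one set-based statement, which is where the script gets its speed.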
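【Comments】:
【Answer 5】: This was a tough task... Here is how I did it:
First, you need to add a table that will hold semi-where clauses, i.e. ready-to-use conditions based on the temp_builder table. Also, since you have 5 columns, I assumed there can be at most 5 conditions. Here is the table creation: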
CREATE TABLE [dbo].[temp_builder_with_where](
[petid] [int] NULL,
[priority1] [bit] NULL,
[priority2] [bit] NULL,
[priority3] [bit] NULL,
[priority4] [bit] NULL,
[priority5] [bit] NULL,
[whereClause] [varchar](200) NULL
)
--it's good to create index, for better performance
create clustered index idx on [temp_builder_with_where]([petid])
insert into temp_builder_with_where
select petid,[priority1],[priority2],[priority3],[priority4],[priority5],
'[pets].' + CAST(field as varchar(100)) + ' = [owners].' + CAST(matchto as varchar(100)) [whereClause]
from (
select petid, field, matchto, [priority],
1 Priority1,
case when [priority] > 1 then 1 else 0 end Priority2,
case when [priority] > 2 then 1 else 0 end Priority3,
case when [priority] > 3 then 1 else 0 end Priority4,
case when [priority] > 4 then 1 else 0 end Priority5
from temp_builder) [builder]
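Now we will loop over that table. You said that this table contains 8000 rows, so I chose another way: the dynamic query now inserts the results for only one petid at a time.
In order to do that, we need a table to store our results: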
CREATE TABLE [dbo].[TableWithNewId](
[petid] [int] NULL,
[ownerid] [int] NULL,
[priority] [int] NULL
)
Now the dynamic SQL for the insert statements:
declare @query varchar(max) = '' -- varchar(max) so that a long generated statement is not silently truncated
declare @i int, @max int
set @i = 1
select @max = MAX(petid) from temp_builder_with_where
while @i <= @max
begin
set @query = ''
select @query = @query + whereClause1 + whereClause2 + whereClause3 + whereClause4 + whereClause5 + ' union all ' from (
select 'insert into [MY_DATABASE].dbo.TableWithNewId select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 1 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where1] + ')' [whereClause1],
case when [where2] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 2 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where2] + ')' end [whereClause2],
case when [where3] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 3 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where3] + ')' end [whereClause3],
case when [where4] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 4 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where4] + ')' end [whereClause4],
case when [where5] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 5 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where5] + ')' end [whereClause5]
from (
select petid, 'petid = ' + CAST(petid as nvarchar(3)) [where_petid],
(select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority1 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where1],
(select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority2 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where2],
(select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority3 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where3],
(select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority4 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where4],
(select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority5 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where5]
from temp_builder_with_where [t]
where petid = @i
group by petid
) a
) a
--remove last union all
set @query = left(@query, len(@query) - 10)
exec (@query)
set @i = @i + 1
end
Remember that you have to replace [MY_DATABASE] in the code above with your database name.
Based on your sample data, this would be the result of the query select * from TableWithNewId:
PetId|OwnerId|Priority
1 |6 |4
2 |4 |2
2 |4 |3
3 |1 |1
3 |2 |1
3 |3 |1
3 |4 |1
3 |5 |1
3 |6 |1
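Based on that result, you can now assign an OwnerId to each PetId based on the lowest priority (well, you did not say how to handle the case where multiple OwnerIds are found with the same priority).
【Comments】:
This looks awesome! I just tested it and it seems to work well. Unfortunately, the first query returns 8000 matches for me, and when I execute the second part of the query (generating @query) it takes a very long time. I waited 5 minutes but it never finished. For small data sets it works, though. Now I need to figure out how to optimize the second part. Let me know if you have any ideas :)
@user194076 I updated my answer, you can give it a try.
【Answer 6】: This can be done without dynamic sql or loops. The key point is that the columns used to match pets and owners are static. Only the priorities are dynamic. However, performance depends heavily on your data. You will have to test it yourself and decide which approach you think is best.
The solution below basically finds all owners that match any given pet. The owners are then filtered to include only those that match priority 1, or 1 & 2, or 1 & 2 & 3, and so on. Finally, the "best" matching owner is found and the pet table is updated with this value.
I have added some explanatory comments to the query, but feel free to ask if anything is unclear.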
-- We start off by converting the priority values into int values that are suitable to add up to a bit array
-- I'll save those in a #Temp table to cut that piece of logic out of the final query
IF OBJECT_ID('tempdb..#TempBuilder') IS NOT NULL
BEGIN
DROP TABLE #TempBuilder
END
SELECT
PetID, Field, MatchTo,
CASE [Priority]
WHEN 1 THEN 16 -- Priority one goes on the 16-bit (10000)
WHEN 2 THEN 8 -- Priority two goes on the 8-bit (01000)
WHEN 3 THEN 4 -- Priority three goes on the 4-bit (00100)
WHEN 4 THEN 2 -- Priority four goes on the 2-bit (00010)
WHEN 5 THEN 1 END AS [Priority] -- Priority five goes on the 1-bit (00001)
INTO #TempBuilder
FROM dbo.temp_builder;
-- Then we pivot the match priorities to be able to join them on our pets
WITH PivotedMatchPriorities AS (
SELECT
PetId,
[Address], [Zip], [Country], [OwnerName], [Document]
FROM (SELECT PetId, Field, [Priority] FROM #TempBuilder) tb
PIVOT
(
SUM([Priority])
FOR [Field] IN ([Address], [Zip], [Country], [OwnerName], [Document])
)
AS PivotedMatchPriorities
),
-- Next we get (for each pet) all owners with ANY matching value
-- We want to filter the matching owners to find these that match priorities 1 (priority sum 10000, i.e. 16),
--- or match priorities 1 & 2 (priority sum 11000, i.e. 24)
--- or match priorities 1 & 2 & 3 (priority sum 11100, i.e. 28)
--- etc.
MatchingOwners AS (
SELECT o.*,
p.PetID,
pmp.[Address] AS AddressPrio,
pmp.Country AS CountryPrio,
pmp.Zip AS ZipPrio,
pmp.OwnerName AS OwnerPrio,
pmp.Document AS DocumentPrio,
CASE WHEN o.Addr = p.[Address] THEN ISNULL(pmp.[Address],0) ELSE 0 END
+ CASE WHEN o.Zip = p.Zip THEN ISNULL(pmp.Zip,0) ELSE 0 END
+ CASE WHEN o.Country = p.Country THEN ISNULL(pmp.Country,0) ELSE 0 END
+ CASE WHEN o.Document = p.Document THEN ISNULL(pmp.[Document],0) ELSE 0 END
+ CASE WHEN o.[Name] = p.OwnerName THEN ISNULL(pmp.OwnerName,0) ELSE 0 END AS MatchValue -- Calculate a match value for each matching owner
FROM dbo.temp_pets p
INNER JOIN dbo.temp_owners o
ON p.[Address] = o.Addr
OR p.Country = o.Country
OR p.Document = o.Document
OR p.OwnerName = o.[Name]
OR p.Zip = o.Zip
INNER JOIN PivotedMatchPriorities pmp ON pmp.PetId = p.PetId
),
-- Now we can get all owners that match the pet, along with a match value for each owner.
-- We want to rank the matching owners for each pet to allow selecting the best ranked owner
-- Note: In the demo data there are multiple owners that match petId 3 equally well. We'll pick a random one in such cases.
RankedValidMatches AS (
SELECT
PetID,
OwnerID,
MatchValue,
ROW_NUMBER() OVER (PARTITION BY PetID ORDER BY MatchValue DESC) AS OwnerRank
FROM MatchingOwners
WHERE MatchValue IN (16, 24, 28, 30, 31)
)
-- Finally we can get the best valid match per pet
--SELECT * FROM RankedValidMatches WHERE OwnerRank = 1
-- Or we can update our pet table to reflect our results
UPDATE dbo.temp_pets
SET OwnerID = rvm.OwnerID
FROM dbo.temp_pets tp
INNER JOIN RankedValidMatches rvm ON rvm.PetID = tp.PetID AND rvm.OwnerRank = 1
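The list `(16, 24, 28, 30, 31)` in the WHERE clause is the set of running sums of the priority weights, as the small Python check below shows. `WEIGHTS`, `VALID` and `match_value` are hypothetical names that mirror the CASE expression and the MatchValue column.

```python
from itertools import accumulate

# Priority 1 -> 16, 2 -> 8, 3 -> 4, 4 -> 2, 5 -> 1, as in the CASE expression.
WEIGHTS = {priority: 1 << (5 - priority) for priority in range(1, 6)}
# Running sums: priority 1 alone, 1 & 2, 1 & 2 & 3, and so on.
VALID = list(accumulate(WEIGHTS[p] for p in range(1, 6)))


def match_value(matched_priorities):
    """Sum of the weights of the priorities on which an owner matches a pet."""
    return sum(WEIGHTS[p] for p in matched_priorities)


print(VALID)
print(match_value({1, 2}), match_value({1, 2}) in VALID)
print(match_value({1, 3}), match_value({1, 3}) in VALID)
print(match_value({1, 2, 4}), match_value({1, 2, 4}) in VALID)
```

One consequence worth testing on real data: an owner that matches priorities 1 and 3 but not 2 gets the value 20 and is filtered out entirely, rather than being kept as a match on priority 1 alone. In the sample data, owner 6 matches pet 1 on priorities 1, 2 and 4 (value 26) and is dropped in the same way; this does not change the result there only because owner 5 is a better match.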
【Comments】:
【Answer 7】: I wrote another version using UNPIVOT, but with a simpler way of ranking and filtering the rows:
;with
-- r: rules table
r as (select * from temp_builder),
-- o0: owners table with all fields unpivotable (varchar)
o0 as (SELECT [OwnerID], [Addr], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [Name] FROM temp_owners ),
-- o: owners table unpivoted
o as (
SELECT * FROM o0
UNPIVOT (FieldValue FOR Field IN ([Addr], [Zip], [Country], [Document], [Name])) AS p
),
-- p0: pets table with all fields unpivotable (varchar)
p0 as (SELECT [PetID], [Address], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [OwnerName] FROM temp_pets),
-- p: pets table unpivoted
p as (
SELECT * FROM p0
UNPIVOT (FieldValue FOR Field IN ([Address], [Zip], [Country], [Document], [OwnerName])) AS p
),
-- d: join up all data and keep only matching priority
d as (
select petid, ownerid, priority
from (
select r.*, o.ownerid, ROW_NUMBER() over (partition by r.petid, o.ownerid order by r.petid, o.ownerid, priority) calc_priority
from r
join p on (r.field = p.field) and (p.petid = r.petid)
join o on (r.matchto = o.field) and (p.fieldvalue=o.fieldvalue)
) x
where calc_priority=priority
),
-- g: group by the matching rows to know the best priority reached for each pet
g as (
select petid, max(priority) max_priority
from d
group by petid
)
-- output only the rows with best priority
select d.*
from d
join g on d.petid = g.petid and d.priority = g.max_priority
order by petid, ownerid, priority
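This version does not perform better than @EdmondQuinton's (which I upvoted); mine is about 5% slower, but I think it is easier for non-expert users to understand and maintain.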
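The filter `calc_priority = priority` relies on a small property of ROW_NUMBER: if the matched priorities of one pet/owner pair are numbered in ascending order, the row number equals the priority only for as long as no priority has been skipped. A minimal Python sketch of that step, assuming (as in the sample data) that the priorities of a pet are numbered 1..n without gaps; `best_prefix` is a hypothetical name:

```python
def best_prefix(matched_priorities):
    """Highest priority k such that priorities 1..k all matched (0 if priority 1 is missing)."""
    kept = [priority
            for row_number, priority in enumerate(sorted(matched_priorities), start=1)
            if row_number == priority]  # same test as calc_priority = priority
    return max(kept, default=0)


print(best_prefix([1, 2, 3]))  # all three priorities matched
print(best_prefix([1, 2, 4]))  # priority 3 is missing, so only 1 and 2 count
print(best_prefix([2, 3]))     # priority 1 is missing, so nothing counts
```

For pet 1 in the sample data, owner 5 matches priorities 1, 2 and 3 while owner 6 matches 1, 2 and 4, so owner 5 reaches the higher value and is the one returned.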
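【Comments】:
Thanks! Unfortunately, a static pivot won't work for me. The fields can change.
Do you mean that the table structure of owners and pets can change? Are they "temp" tables that you build during the process? You could name the columns Col1..Col10 (up to the maximum number of columns you need) and leave nulls in the unused ones; that way you would have static column names for the UNPIVOT.
【Answer 8】: I would take a slightly different approach: instead of storing the columns to match, you could store the query to execute: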
create table builder
(
PetID int not null,
Query varchar(max)
)
INSERT INTO builder
VALUES (1, 'SELECT TOP 1 *
FROM pets
INNER JOIN Owners
ON Owners.Name = pets.OwnerName
WHERE petId = 1
ORDER BY
CASE WHEN Owners.Country = pets.Country THEN 0 ELSE 1 END,
CASE WHEN Owners.Zip = pets.Zip THEN 0 ELSE 1 END,
CASE WHEN Owners.Addr = pets.Address THEN 0 ELSE 1 END'),
(2, 'SELECT TOP 1 *
FROM pets
INNER JOIN Owners
ON Owners.Name = pets.OwnerName
WHERE petId = 2
ORDER BY
CASE WHEN Owners.Document = pets.Document THEN 0 ELSE 1 END,
CASE WHEN Owners.Name = pets.OwnerName THEN 0 ELSE 1 END,
CASE WHEN Owners.Zip = pets.Zip THEN 0 ELSE 1 END'),
(3, 'SELECT TOP 1 *
FROM pets
INNER JOIN Owners
ON Owners.Name = pets.OwnerName
WHERE petId = 3
ORDER BY
CASE WHEN Owners.Country = pets.Country THEN 0 ELSE 1 END
')
create table pets
(
PetID int null,
Address varchar(100) null,
Zip int null,
Country varchar(100) null,
Document varchar(100) null,
OwnerName varchar(100) null,
OwnerID int null,
Field1 bit null,
Field2 bit null
)
insert into pets values
(1, '123 5th st', 12345, 'US', 'test.csv', 'John', NULL, NULL, NULL),
(2, '234 6th st', 23456, 'US', 'a.csv', 'Alex', NULL, NULL, NULL),
(3, '345 7th st', 34567, 'US', 'b.csv', 'Mike', NULL, NULL, NULL)
create table owners
(
OwnerID int null,
Addr varchar(100) null,
Zip int null,
Country varchar(100) null,
Document varchar(100) null,
Name varchar(100) null,
OtherField bit null,
OtherField2 bit null,
)
insert into owners values
(1, '456 8th st', 45678, 'US', 'c.csv', 'Mike', NULL, NULL),
(2, '678 9th st', 45678, 'US', 'b.csv', 'John', NULL, NULL),
(3, '890 10th st', 45678, 'US', 'b.csv', 'Alex', NULL, NULL),
(4, '901 11th st', 23456, 'US', 'b.csv', 'Alex', NULL, NULL),
(5, '234 5th st', 12345, 'US', 'b.csv', 'John', NULL, NULL),
(6, '123 5th st', 45678, 'US', 'a.csv', 'John', NULL, NULL)
Now, to find the matching owner for a specific pet, you just look up the query in the table and execute it:
DECLARE @query varchar(max)
SELECT TOP 1 @query = query
FROM builder
WHERE petId =1
EXEC (@query)
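【Comments】:
【Answer 9】: With all of that in mind, here is an answer that strictly addresses your question.
It follows the rules you asked for: no loops, no cursors, no dynamic sql. It also takes your question strictly as stated, so this is not a generic solution; it is tailored to your problem and to the columns and test data you have.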
declare @Pets table
(
PetID int null,
Address varchar(100) null,
Zip int null,
Country varchar(100) null,
Document varchar(100) null,
OwnerName varchar(100) null,
OwnerID int null,
Field1 bit null,
Field2 bit null
)
insert into @Pets values
(1, '123 5th st', 12345, 'US', 'test.csv', 'John', NULL, NULL, NULL),
(2, '234 6th st', 23456, 'US', 'a.csv', 'Alex', NULL, NULL, NULL),
(3, '345 7th st', 34567, 'US', 'b.csv', 'Mike', NULL, NULL, NULL)
declare @owners table
(
OwnerID int null,
Addr varchar(100) null,
Zip int null,
Country varchar(100) null,
Document varchar(100) null,
Name varchar(100) null,
OtherField bit null,
OtherField2 bit null
)
insert into @owners values
(1, '456 8th st', 45678, 'US', 'c.csv', 'Mike', NULL, NULL),
(2, '678 9th st', 45678, 'US', 'b.csv', 'John', NULL, NULL),
(3, '890 10th st', 45678, 'US', 'b.csv', 'Alex', NULL, NULL),
(4, '901 11th st', 23456, 'US', 'b.csv', 'Alex', NULL, NULL),
(5, '234 5th st', 12345, 'US', 'b.csv', 'John', NULL, NULL),
(6, '123 5th st', 45678, 'US', 'a.csv', 'John', NULL, NULL)
declare @builder table
(
PetID int not null,
Field varchar(30) not null,
MatchTo varchar(30) not null,
Priority int not null
)
insert into @builder values
(1,'Address', 'Addr',4),
(1,'Zip', 'Zip', 3),
(1,'Country', 'Country', 2),
(1,'OwnerName', 'Name',1),
(2,'Zip', 'Zip',3),
(2,'OwnerName','Name', 2),
(2,'Document', 'Document', 1),
(3,'Country', 'Country', 1)
The code that solves the problem:
select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on
(
(case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 3 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 3 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 3 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 3 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 3 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 3 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 3 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 3 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 3 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 3 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 4 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 4 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 4 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 4 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 4 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 4 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 4 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 4 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 4 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 4 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 5 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 5 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 5 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 5 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 5 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 5 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 5 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 5 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 5 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 5 then o.Document else '-1' end)
)
group by p.PetID
union
--------------------------
select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on
(
(case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 3 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 3 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 3 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 3 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 3 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 3 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 3 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 3 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 3 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 3 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 4 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 4 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 4 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 4 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 4 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 4 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 4 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 4 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 4 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 4 then o.Document else '-1' end)
)
group by p.PetID
union
--------------------------
select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on
(
(case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 3 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 3 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 3 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 3 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 3 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 3 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 3 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 3 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 3 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 3 then o.Document else '-1' end)
)
group by p.PetID
union
------------------------
select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on
(
(case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)
)
AND
(
(case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)
)
group by p.PetID
union
------------------------
select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on
(
(case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)
)
group by p.PetID
Result
PetID OwnerID
1 2
2 6
3 1
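【Discussion】:
【Answer 10】: If you are looking for a simple solution without unions, loops, cursors, or dynamic SQL, the query below works for the example in the question.
SQL Fiddle: http://sqlfiddle.com/#!18/10982/41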
select
    pets.PetID,
    coalesce(
        -- 1st attempt: match on Zip, Name and Document
        (select top 1 OwnerID
         from temp_owners
         where Zip = pets.Zip
           and Name = pets.OwnerName
           and Document = pets.Document),
        -- 2nd attempt: drop Zip, match on Name and Document
        (select top 1 OwnerID
         from temp_owners
         where Name = pets.OwnerName
           and Document = pets.Document),
        -- 3rd attempt: match on Document only
        (select top 1 OwnerID
         from temp_owners
         where Document = pets.Document)
    ) as OwnerId
from temp_pets pets
Result:
PetID OwnerId
1 (null)
2 6
3 2
【Discussion】:
This one does not need dynamic SQL because it is hard-coded, and it completely ignores the contents of the "special matching table"...
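For comparison, here is a minimal sketch of a set-based query that does read the matching table, so the rules are not hard-coded per pet. It is not taken from any answer in this thread and has not been run against the linked fiddle. The CTE names (`pet_vals`, `owner_vals`, `cmp`, `scored`, `ranked`) and the columns `Val`, `IsMatch`, `FirstMiss` and `LowestPriority` are made up for illustration. It assumes that "dropping the highest-priority field" means removing the rule with the largest `Priority` number first, as in the PetID=2 walk-through, and that ties between owners can be broken by taking the lowest `OwnerID`. The `VALUES` constructor inside `CROSS APPLY` requires SQL Server 2008 or later.

```sql
-- Unpivot pets: one row per (PetID, column name, value as text).
WITH pet_vals AS (
    SELECT p.PetID, v.Field, v.Val
    FROM temp_pets AS p
    CROSS APPLY (VALUES
        ('Address',   p.Address),
        ('Zip',       CAST(p.Zip AS varchar(100))),
        ('Country',   p.Country),
        ('Document',  p.Document),
        ('OwnerName', p.OwnerName)
    ) AS v (Field, Val)
),
-- Unpivot owners the same way, keyed by the names used in MatchTo.
owner_vals AS (
    SELECT o.OwnerID, v.MatchTo, v.Val
    FROM temp_owners AS o
    CROSS APPLY (VALUES
        ('Addr',     o.Addr),
        ('Zip',      CAST(o.Zip AS varchar(100))),
        ('Country',  o.Country),
        ('Document', o.Document),
        ('Name',     o.Name)
    ) AS v (MatchTo, Val)
),
-- One row per (pet, owner, rule): does this rule's field match?
-- NULL on either side counts as a mismatch.
cmp AS (
    SELECT b.PetID, ov.OwnerID, b.Priority,
           CASE WHEN pv.Val = ov.Val THEN 1 ELSE 0 END AS IsMatch
    FROM temp_builder AS b
    JOIN pet_vals   AS pv ON pv.PetID = b.PetID AND pv.Field = b.Field
    JOIN owner_vals AS ov ON ov.MatchTo = b.MatchTo
),
-- FirstMiss = lowest priority number whose rule fails (NULL if none fails).
-- Every rule below FirstMiss matches, so a larger FirstMiss means the owner
-- survives more of the fallback steps.
scored AS (
    SELECT PetID, OwnerID,
           MIN(CASE WHEN IsMatch = 0 THEN Priority END) AS FirstMiss,
           MIN(Priority) AS LowestPriority
    FROM cmp
    GROUP BY PetID, OwnerID
),
-- Keep owners that satisfy at least the last remaining rule, best one first.
ranked AS (
    SELECT PetID, OwnerID,
           ROW_NUMBER() OVER (
               PARTITION BY PetID
               ORDER BY COALESCE(FirstMiss, 2147483647) DESC, OwnerID
           ) AS rn
    FROM scored
    WHERE FirstMiss IS NULL OR FirstMiss > LowestPriority
)
SELECT p.PetID, r.OwnerID
FROM temp_pets AS p
LEFT JOIN ranked AS r ON r.PetID = p.PetID AND r.rn = 1
ORDER BY p.PetID;
```

To write the result back, the final SELECT could be replaced with `UPDATE p SET p.OwnerID = r.OwnerID FROM temp_pets AS p JOIN ranked AS r ON r.PetID = p.PetID AND r.rn = 1;`. Note that `cmp` compares every pet rule against every owner, so on large tables this may be slower than the loop it replaces. The column lists in the two `VALUES` blocks are still written out by hand: a rule that names a column missing from them is silently ignored.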