Find matching records based on dynamic columns

【Posted】: 2018-08-15 03:48:43

【Question】:

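I have a list of pets:

I need to find the correct owner for each pet from the Owner table.

To match each pet to an owner correctly, I need to use a special matching table that looks like this:

So, for the pet with PetID=2, I need to find a matching owner based on three fields: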

    Pet.Zip = Owner.Zip 
    and Pet.OwnerName = Owner.Name 
    and Pet.Document = Owner.Document

In our example, it would work like this:

 select top 1 OwnerID from owners
         where Zip = 23456 
         and Name = 'Alex' 
         and Document = 'a.csv'

If no OwnerID is found, I then need to match on 2 fields (not using the field with the highest Priority value).

In our example:

 select top 1 OwnerID from owners where
             Name = 'Alex' 
             and Document = 'a.csv'

Since no record is found, we then need to match on fewer fields. In our example:

select top 1 OwnerID from owners where Document = 'a.csv'

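Now we have found the owner with OwnerID = 6.

Next we need to update the pet with OwnerID = 6, and then we can process the next pet.

The only way I can do this right now is with a loop or cursor plus dynamic SQL.

Can this be achieved without a loop plus dynamic SQL? Maybe with STUFF + PIVOT somehow?

SQL Fiddle: http://sqlfiddle.com/#!18/10982/1/0

Sample data: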

create table  temp_builder
(
    PetID int not null,
    Field varchar(30) not null,
    MatchTo varchar(30) not null,
    Priority int not null
)

insert into temp_builder values
(1,'Address', 'Addr',4),
(1,'Zip', 'Zip', 3),
(1,'Country', 'Country', 2),
(1,'OwnerName', 'Name',1),
(2,'Zip', 'Zip',3),
(2,'OwnerName','Name', 2),
(2,'Document', 'Document', 1),
(3,'Country', 'Country', 1)


create table temp_pets
(
    PetID int null,
    Address varchar(100) null,
    Zip int null,
    Country varchar(100) null,
    Document varchar(100) null,
    OwnerName varchar(100) null,
    OwnerID int null,
    Field1 bit null,
    Field2 bit null
)

insert into temp_pets values
(1, '123 5th st', 12345, 'US', 'test.csv', 'John', NULL, NULL, NULL),
(2, '234 6th st', 23456, 'US', 'a.csv', 'Alex', NULL, NULL, NULL),
(3, '345 7th st', 34567, 'US', 'b.csv', 'Mike', NULL, NULL, NULL)

create table temp_owners
(
    OwnerID int null,
    Addr varchar(100) null,
    Zip int null,
    Country varchar(100) null,
    Document varchar(100) null,
    Name varchar(100) null,
    OtherField bit null,
    OtherField2 bit null,
)

insert into temp_owners values
(1, '456 8th st',  45678, 'US', 'c.csv', 'Mike',  NULL, NULL),
(2, '678 9th st',  45678, 'US', 'b.csv', 'John',  NULL, NULL),
(3, '890 10th st', 45678, 'US', 'b.csv', 'Alex',  NULL, NULL),
(4, '901 11th st', 23456, 'US', 'b.csv', 'Alex',  NULL, NULL),
(5, '234 5th st',  12345, 'US', 'b.csv', 'John',  NULL, NULL),
(6, '123 5th st',  45678, 'US', 'a.csv', 'John',  NULL, NULL)

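Edit: I'm overwhelmed by the many great suggestions and responses. I've tested them, and many of them worked well for me. Unfortunately, I can only award the bounty to one solution.

【Comments】:

I don't understand your priority rules. Why does Country have a higher priority than Zip?

@TimBiegeleisen, for pet #1 I need to try matching on Address, Zip, Country, and OwnerName. If there is no match, then on Zip, Country, and OwnerName; if there is no match, then on Country and OwnerName; and if there is still no match, then on OwnerName alone. So we go from more specific to less specific. I came up with the column names in this example just for simplification.

There has to be dynamic SQL; otherwise, how would you use column names stored in another column...? If dynamic SQL is not an option, then IMO your question has no answer.

@MichałTurczyn I can use dynamic SQL, but I want to try to be more efficient (my table has over 1 million records). I'd like to avoid looping if possible.

I've edited your question and copied the sample data from your fiddle link into the body of the question. I've also added the [sql-server] and [tsql] tags for a wider audience.

【Answer 1】: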

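You can avoid cursors, loops, and dynamic SQL by treating the fields used for comparison as entries in a bit set for each pet. A bit set (the FieldSetRank column) is computed for each priority from the individual bit entries (the FieldRank column).

The Pets and Owners tables must be unpivoted so that the fields and their associated values can be compared. Each field and value that matches is assigned the corresponding FieldRank. A new bit set is then computed from the matched values (MatchSetRank). Only records where the matched set (MatchSetRank) equals the required set (FieldSetRank) are returned.

The query performs a final ranking to return the records with the highest MatchSetRank (the records that match the most columns while still satisfying the priority criteria). The following T-SQL demonstrates the concept.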

;WITH CTE_Builder
 AS
 (
     SELECT  [PetID]
            ,[Field]
            ,[Priority]
            ,[MatchTo]
            ,POWER(2, [Priority] - 1) AS [FieldRank] -- Define the field ranking as bit set numbered item.
            ,SUM(POWER(2, [Priority] - 1)) OVER (PARTITION BY [PetID] ORDER BY [Priority] ROWS UNBOUNDED PRECEDING) FieldSetRank -- Sum all the bit set IDs to define what constitutes a completed field set ordered by priority.
     FROM   temp_builder
 ),
CTE_PetsUnpivoted
AS
(   -- Unpivot pets table and assign Field Rank and Field Set Rank.
    SELECT   [PetsUnPivot].[PetID]
            ,[PetsUnPivot].[Field]
            ,[Builder].[MatchTo]
            ,[PetsUnPivot].[FieldValue]
            ,[Builder].[Priority]
            ,[Builder].[FieldRank]
            ,[Builder].[FieldSetRank]

    FROM 
       (
            SELECT [PetID], [Address], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [OwnerName]
            FROM temp_pets
        ) [Pets]
    UNPIVOT
       (FieldValue FOR Field IN 
          ([Address], [Zip], [Country], [Document], [OwnerName])
    ) AS [PetsUnPivot]
    INNER JOIN [CTE_Builder] [Builder] ON [PetsUnPivot].PetID = [Builder].PetID AND [PetsUnPivot].Field = [Builder].Field
),
CTE_Owners
AS
(
    -- Unpivot Owners table and join with unpivoted Pets table on field name and field value.  
    -- Next assign Pets field rank then calculated the field set rank (MatchSetRank) based on actual matches made.
    SELECT   [OwnersUnPivot].[OwnerID]
            ,[Pets].[PetID]
            ,[OwnersUnPivot].[Field]
            ,[Pets].Field AS [PetField]
            ,[Pets].FieldValue as PetFieldValue
            ,[OwnersUnPivot].[FieldValue]
            ,[Pets].[Priority]
            ,[Pets].[FieldRank]
            ,[Pets].[FieldSetRank]
            ,SUM([FieldRank]) OVER (PARTITION BY [Pets].[PetID], [OwnersUnPivot].[OwnerID] ORDER BY [Pets].[Priority] ROWS UNBOUNDED PRECEDING) MatchSetRank
    FROM 
       (
            SELECT [OwnerID], [Addr], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [Name]
            FROM temp_owners
        ) [Owners]
    UNPIVOT
       (FieldValue FOR Field IN 
          ([Addr], [Zip], [Country], [Document], [Name])
    ) AS [OwnersUnPivot]
    INNER JOIN [CTE_PetsUnpivoted] [Pets] ON [OwnersUnPivot].[Field] = [Pets].[MatchTo] AND [OwnersUnPivot].[FieldValue] = [Pets].[FieldValue]
),
CTE_FinalRanking
AS
(
    SELECT   [PetID]
            ,[OwnerID]
            -- Calculate final rank. If multiple matches have the same rank, then multiple rows will be returned per pet.
            -- Change the "RANK()" function to "ROW_NUMBER()" to return only one result per pet.
            ,RANK() OVER (PARTITION BY [PetID] ORDER BY [MatchSetRank] DESC) AS [FinalRank] 
    FROM    CTE_Owners
    WHERE   [FieldSetRank] = [MatchSetRank] -- Only return records where the field sets calculated based on 
                                            -- actual matches is equal to desired field set ranks. This will 
                                            -- eliminate matches where the number of fields that meets the 
                                            -- criteria is the same but does not meet priority requirements. 
)
SELECT   [PetID]
        ,[OwnerID]
FROM    CTE_FinalRanking
WHERE   [FinalRank] = 1
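
As a rough illustration of the bit-set arithmetic, here is a minimal sketch that computes FieldRank and FieldSetRank for the three PetID = 2 rules from the sample data. The inline VALUES list and the alias rules are stand-ins for temp_builder and are not part of the answer's query.

```sql
-- Stand-in for the PetID = 2 rows of temp_builder (assumption: inlined for illustration).
SELECT  Field,
        Priority,
        POWER(2, Priority - 1) AS FieldRank,          -- one bit per rule
        SUM(POWER(2, Priority - 1))
            OVER (ORDER BY Priority ROWS UNBOUNDED PRECEDING) AS FieldSetRank  -- bits of this rule and all lower-numbered rules
FROM (VALUES ('Zip', 3), ('OwnerName', 2), ('Document', 1)) AS rules(Field, Priority)
ORDER BY Priority;
```

With this numbering, Document gets bit 1, OwnerName bit 2, and Zip bit 4, so the running sets are 1, 3, and 7. An owner who matches only Document accumulates a MatchSetRank of 1, which equals the FieldSetRank on that row, so the row is kept. An owner who matches OwnerName and Zip but not Document accumulates 2 and then 6, which never equal 3 or 7, so that owner is filtered out.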

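【Discussion】:

I like this approach; I came up with a similar one. For more speed, you could store the unpivoted owner field data in a temp table and index it on field name and field value.

Similar to the approach I posted below. The key point is that the match columns are actually static. Only the priorities are dynamic.

Thanks for the answer! It looks good, but unfortunately static columns don't work for me. They can change.

【Answer 2】: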

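I'll say this right away to save you time:

    My solution uses dynamic SQL. Michał Turczyn correctly noted that you can't avoid it when the names of the compared columns are stored in the database.

    My solution uses loops. I strongly believe you won't solve this problem with a pure SQL query that runs fast enough on the data size you stated (tables with > 1M records). The logic you described inherently implies iteration, from larger sets of matching fields down to smaller ones. SQL as a query language was not designed to cover such tricky scenarios. You could try to solve your problem with a pure SQL query, but even if you manage to build one, it will be very tricky, complex, and unclear. I'm not a fan of such solutions, which is why I didn't even dig in that direction.

    On the other hand, my solution doesn't require creating temporary tables, which is an advantage.

Given this, my approach is fairly straightforward:

    There is an outer loop that iterates from the largest set of matchers (all matching fields) down to the smallest set (one field). On the first iteration, when we don't yet know how many matchers are stored in the database for the pet, we read and use all of them. On each subsequent iteration, we reduce the number of matchers used by 1 (dropping the one with the highest Priority value).

    The inner loop iterates over the current set of matchers and builds the filter clause that compares fields between the Pets and Owners tables.

    The current query is executed, and if some owner matches the given criteria, we break out of the outer loop.

Here is the code that implements this logic: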

DECLARE @PetId INT = 2;

DECLARE @MatchersLimit INT;
DECLARE @OwnerID INT;

WHILE (@MatchersLimit IS NULL OR @MatchersLimit > 0) AND @OwnerID IS NULL
BEGIN

    DECLARE @CurrMatchFilter VARCHAR(max) = ''
    DECLARE @Field VARCHAR(30)
    DECLARE @MatchTo VARCHAR(30)
    DECLARE @CurrMatchersNumber INT = 0;

    DECLARE @GetMatchers CURSOR;
    IF @MatchersLimit IS NULL
        SET @GetMatchers = CURSOR FOR SELECT Field, MatchTo FROM temp_builder WHERE PetID = @PetId ORDER BY Priority ASC;
    ELSE
        SET @GetMatchers = CURSOR FOR SELECT TOP (@MatchersLimit) Field, MatchTo FROM temp_builder WHERE PetID = @PetId ORDER BY Priority ASC;

    OPEN @GetMatchers;
    FETCH NEXT FROM @GetMatchers INTO @Field, @MatchTo;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @CurrMatchFilter <> '' SET @CurrMatchFilter = @CurrMatchFilter + ' AND ';
        SET @CurrMatchFilter = @CurrMatchFilter + ('temp_pets.' + @Field + ' = ' + 'temp_owners.' + @MatchTo);
        FETCH NEXT FROM @GetMatchers INTO @Field, @MatchTo;
        SET @CurrMatchersNumber = @CurrMatchersNumber + 1;
    END
    CLOSE @GetMatchers;
    DEALLOCATE @GetMatchers;

    IF @CurrMatchersNumber = 0 BREAK;

    DECLARE @CurrQuery nvarchar(max) = N'SELECT @id = temp_owners.OwnerID FROM temp_owners INNER JOIN temp_pets ON (' + CAST(@CurrMatchFilter AS NVARCHAR(MAX)) + N') WHERE temp_pets.PetID = ' + CAST(@PetId AS NVARCHAR(MAX));
    EXECUTE sp_executesql @CurrQuery, N'@id int OUTPUT', @id=@OwnerID OUTPUT;

    IF @MatchersLimit IS NULL
        SET @MatchersLimit = @CurrMatchersNumber - 1;
    ELSE
        SET @MatchersLimit = @MatchersLimit - 1;

END

SELECT @OwnerID AS OwnerID, @MatchersLimit + 1 AS Matched;

Performance considerations

Basically, two queries are executed in this approach:

    SELECT Field, MatchTo FROM temp_builder WHERE PetID = @PetId;

    You should add an index on the PetID field of the temp_builder table, and this query will then execute very quickly.

    SELECT @id = temp_owners.OwnerID FROM temp_owners INNER JOIN temp_pets ON (temp_pets.Document = temp_owners.Document AND temp_pets.OwnerName = temp_owners.Name AND temp_pets.Zip = temp_owners.Zip AND ...) WHERE temp_pets.PetID = @PetId;

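    This query looks scary because it joins two big tables, temp_owners and temp_pets. However, the temp_pets table is filtered by the PetID column, which should yield just one record. So, if you have an index on the temp_pets.PetID column (and you should, since this column looks like a primary key), the query will result in a scan of the temp_owners table. Such a scan should not take long even for a table with over 1M rows. If the query is still too slow, you could consider adding indexes on the temp_owners columns used in the matchers (Addr, Zip, etc.). Adding indexes has downsides, such as a bigger database and slower insert/update operations. So before adding indexes on the temp_owners columns, check the query speed on the table without them.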
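
The indexing advice above could look roughly like the following. This is only a sketch: the index names are hypothetical, and whether the last index pays off depends on your data, so measure before keeping it.

```sql
-- Hypothetical index names; the tables and columns come from the sample data.
CREATE INDEX IX_temp_builder_PetID ON temp_builder (PetID);
CREATE INDEX IX_temp_pets_PetID    ON temp_pets (PetID);

-- Optional, and only if the scan of temp_owners proves too slow:
-- index a column that the matchers actually compare on.
CREATE INDEX IX_temp_owners_Document ON temp_owners (Document);
```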

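【Discussion】:

【Answer 3】:

I'm not sure I got the end result right, but I'd suggest using a couple of common table expressions to generate a batch of update statements with dynamic SQL (I'm afraid it can't be done without dynamic SQL), and then executing them with Exec(sql).

The benefit of this approach is that it involves no loops or cursors.

Every update statement I generate uses an inner join between the pets and owners tables, updating the pets table's owner ID with the owners table's owner ID, and using the mappings from the builder table as the basis for the on clause. The first CTE is responsible for generating the on clause from the builder table, and the second CTE is responsible for generating the update statements. Finally, I select all the SQL statements from the second CTE into a single nvarchar(max) variable and execute it.

The way I handled the priority problem is to generate one update statement for each group of priorities: the first includes all priorities, and each following statement excludes one more, the highest priority first, until I'm left with an on clause that maps only a single set of columns.

So, the first thing is to declare a variable to hold the generated update statements: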

DECLARE @Sql nvarchar(max) = ''

Now, the first CTE uses cross apply with stuff and for xml to generate the on clause for each pair of PetId and Priority:

;WITH OnClauseCTE AS
(
SELECT DISTINCT PetId, Priority, OnClause
FROM temp_builder t0
CROSS APPLY
(
    SELECT STUFF (
    (  
        SELECT ' AND p.'+ Field +' = o.'+ MatchTo
        FROM temp_builder t1
        WHERE PetID = t0.PetId
        AND Priority <= t0.Priority
        FOR XML PATH('')  
    )
    , 1, 5, '') As OnClause
) onClauseGenerator
)

The second CTE generates one UPDATE statement for each combination of PetId and Priority:

, UpdateStatementCTE AS
(
    SELECT  PetId,
            Priority,
            'UPDATE p 
            SET OwnerID = o.OwnerID 
            FROM temp_pets p 
            INNER JOIN temp_owners o ON ' + OnClause + ' 
            WHERE p.PetId = '+ CAST(PetId as varchar(10)) +'
            AND p.OwnerID IS NULL; -- THIS IS CRITICAL!
            ' AS SQL
    FROM OnClauseCTE
)

Finally, generate a batch of update statements from UpdateStatementCTE:

SELECT @Sql = @Sql + SQL
FROM UpdateStatementCTE    
ORDER BY PetId, Priority DESC -- ORDER BY Priority is CRITICAL!

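The order by PetId is there only for readability when you print out the contents of @Sql. However, the Priority DESC part of the order by clause is critical, because we want to execute the highest priority first and the lowest priority last.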
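
To see the STUFF ... FOR XML PATH('') idiom from the first CTE in isolation, here is a minimal sketch that builds one on clause from an inline list of rules. The VALUES list and the alias b are stand-ins for the PetID = 2 rows of temp_builder, and the explicit ORDER BY is an addition for a predictable column order.

```sql
-- Stand-in for the PetID = 2 rows of temp_builder (assumption: inlined for illustration).
SELECT STUFF(
    (
        SELECT ' AND p.' + Field + ' = o.' + MatchTo
        FROM (VALUES ('Zip', 'Zip', 3),
                     ('OwnerName', 'Name', 2),
                     ('Document', 'Document', 1)) AS b(Field, MatchTo, Priority)
        WHERE Priority <= 2          -- keep the rules up to priority 2
        ORDER BY Priority
        FOR XML PATH('')             -- concatenates the rows into one string
    ), 1, 5, '') AS OnClause;        -- STUFF removes the leading ' AND ' (5 characters)
```

For these rows the result is the single string p.Document = o.Document AND p.OwnerName = o.Name, which is the same shape of on clause that appears in the generated update statements.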

Now, @Sql contains the following (shortened):

UPDATE p 
SET OwnerID = o.OwnerID 
FROM temp_pets p 
INNER JOIN temp_owners o ON p.Address = o.Addr AND p.Zip = o.Zip AND p.Country = o.Country AND p.OwnerName = o.Name 
WHERE p.PetId = 1
AND p.OwnerID IS NULL;

...

UPDATE p 
SET OwnerID = o.OwnerID 
FROM temp_pets p 
INNER JOIN temp_owners o ON p.OwnerName = o.Name 
WHERE p.PetId = 1
AND p.OwnerID IS NULL;

...

UPDATE p 
SET OwnerID = o.OwnerID 
FROM temp_pets p 
INNER JOIN temp_owners o ON p.OwnerName = o.Name AND p.Document = o.Document 
WHERE p.PetId = 2
AND p.OwnerID IS NULL;

...

UPDATE p 
SET OwnerID = o.OwnerID 
FROM temp_pets p 
INNER JOIN temp_owners o ON p.Country = o.Country 
WHERE p.PetId = 3
AND p.OwnerID IS NULL;

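As you can see, each update statement corresponds to an entry in the builder table, and it will only change the owner ID if a previous update statement has not already changed it, because of the AND p.OwnerID IS NULL part of the where clause.

After running this batch of update statements, your temp_pets table looks like this: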

PetID   Address         Zip     Country     Document    OwnerName   OwnerID     Field1  Field2
1       123 5th st      12345   US          test.csv    John        5           NULL    NULL
2       234 6th st      23456   US          a.csv       Alex        6           NULL    NULL
3       345 7th st      34567   US          b.csv       Mike        1           NULL    NULL

You can see a live demo on rextester.

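However, please note that the fewer conditions you have, the more records the join may return, which makes the update more likely to be inaccurate. For instance, for PetId 3 I got OwnerId 1, because the only thing I had to match records on was the Country column. That means it could actually be any OwnerId in this sample data, since every row has the same value, US, in the Country column. Under the given rules, there is not much I can do about that.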

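【Discussion】:

【Answer 4】:

The following approach is based on the fact that the number of different ways of selecting and ordering the columns to match on is limited, and probably much smaller than the number of records. For 5 columns, the total number of combinations is 325, but since it is unlikely that all possible combinations are used, the actual number is probably less than 100. Compared to the number of records (OP mentioned > 1M), it is worthwhile to try to process together the pets that share the same column combination.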
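
The figure of 325 is consistent with counting every ordered selection of 1 to 5 of the 5 columns. That reading is an assumption about what the answer means by a combination:

```latex
\sum_{k=1}^{5} \frac{5!}{(5-k)!} = 5 + 20 + 60 + 120 + 120 = 325
```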

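Features of the SQL script below:

    No dynamic SQL.

    Loops, but no cursors; the number of iterations is limited and does not grow proportionally with the number of records.

    Creates two (indexed) helper tables. (Feel free to make them temp tables or table variables.) This greatly speeds up the matching process (INNER JOIN), though it does introduce some overhead when populating the tables.

    Only simple SQL constructs (no pivots, no STUFF with FOR XML, not even CTEs).

    Relies only on indexes on the key columns (PetID, OwnerID), the Priority column, and the columns in the helper tables. No indexes on Address, Zip, Country, Document, or Name are needed.

At first sight, the script seems like complete overkill (47 SQL statements executed on the small amount of sample data presented by OP), but for bigger tables the advantages should become apparent. The worst-case time complexity should be O(n log n), which is much better than many alternatives. Of course, it still needs to prove itself in practice; I have not tested it with large data sets yet.

Fiddle: http://sqlfiddle.com/#!18/53320/1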

-- Adding indexes to OP's tables to optimize the queries that follow.
CREATE INDEX IX_PetID ON temp_builder (PetID)
CREATE INDEX IX_Priority ON temp_builder (Priority)
CREATE INDEX IX_PetID ON temp_pets (PetID)
CREATE INDEX IX_OwnerID ON temp_owners (OwnerID)

-- Helper table for pets. Each column has its own index.
CREATE TABLE PetKey (
    PetID int NOT NULL PRIMARY KEY CLUSTERED,
    KeyNames varchar(200) NOT NULL INDEX IX_KeyNames NONCLUSTERED,
    KeyValues varchar(900) NOT NULL INDEX IX_KeyValues NONCLUSTERED
)

-- Helper table for owners. Each column has its own index.
CREATE TABLE OwnerKey (
    OwnerID int NOT NULL PRIMARY KEY CLUSTERED,
    KeyValues varchar(900) NULL INDEX IX_KeyValues NONCLUSTERED
)

-- For every pet, create a record in table PetKey.
-- (Unless the pet already belongs to someone.)
INSERT INTO PetKey (PetID, KeyNames, KeyValues)
SELECT PetID, '', ''
FROM temp_pets
WHERE OwnerID IS NULL

-- For every owner, create a record in table OwnerKey.
INSERT INTO OwnerKey (OwnerID, KeyValues)
SELECT OwnerID, ''
FROM temp_owners

-- Populate columns KeyNames and KeyValues in table PetKey.
-- Lowest priority (i.e. highest number in column Priority) comes first.
-- We use CHAR(1) as a separator character; anything will do as long as it does not occur in any column values.
-- Example: when a pet has Address as prio 1, Zip as prio 2, then:
--    KeyNames = 'Zip' + CHAR(1) + 'Address' + CHAR(1)
--    KeyValues = '12345' + CHAR(1) + '123 5th st' + CHAR(1)
-- NULL is replaced by CHAR(2); can be any value as long as it does not match any owner's value.
DECLARE @priority int = 1
WHILE EXISTS (SELECT * FROM temp_builder WHERE Priority = @priority)
BEGIN
    UPDATE pk
    SET KeyNames = b.Field + CHAR(1) + KeyNames,
        KeyValues = ISNULL(CASE b.Field
                               WHEN 'Address' THEN p.Address
                               WHEN 'Zip' THEN CAST(p.Zip AS varchar)
                               WHEN 'Country' THEN p.Country
                               WHEN 'Document' THEN p.Document
                               WHEN 'OwnerName' THEN p.OwnerName
                           END, CHAR(2)) +
                    CHAR(1) + KeyValues
    FROM PetKey pk
    INNER JOIN temp_pets p ON p.PetID = pk.PetID
    INNER JOIN temp_builder b ON b.PetID = pk.PetID
    WHERE b.Priority = @priority

    SET @priority = @priority + 1
END

-- Loop through all distinct key combinations.
DECLARE @maxKeyNames varchar(200), @namesToAdd varchar(200), @index int
SELECT @maxKeyNames = MAX(KeyNames) FROM PetKey
WHILE @maxKeyNames <> '' BEGIN
    -- Populate column KeyValues in table OwnerKey.
    -- The order of the values is determined by the column names listed in @maxKeyNames.
    UPDATE OwnerKey
    SET KeyValues = ''

    SET @namesToAdd = @maxKeyNames
    WHILE @namesToAdd <> '' BEGIN
        SET @index = CHARINDEX(CHAR(1), @namesToAdd)

        UPDATE ok
        SET KeyValues = KeyValues +
                        CASE LEFT(@namesToAdd, @index - 1)
                            WHEN 'Address' THEN o.Addr
                            WHEN 'Zip' THEN CAST(o.Zip AS varchar)
                            WHEN 'Country' THEN o.Country
                            WHEN 'Document' THEN o.Document
                            WHEN 'OwnerName' THEN o.Name
                        END +
                        CHAR(1)
        FROM OwnerKey ok
        INNER JOIN temp_owners o ON o.OwnerID = ok.OwnerID

        SET @namesToAdd = SUBSTRING(@namesToAdd, @index + 1, 200)
    END

    -- Match pets with owners, based on their KeyValues.
    UPDATE p
    SET OwnerID = (SELECT TOP 1 ok.OwnerID FROM OwnerKey ok WHERE ok.KeyValues = pk.KeyValues)
    FROM temp_pets p
    INNER JOIN PetKey pk ON pk.PetID = p.PetID
    WHERE pk.KeyNames = @maxKeyNames

    -- Pets that were successfully matched are removed from PetKey.
    DELETE FROM pk
    FROM PetKey pk
    INNER JOIN temp_pets p ON p.PetID = pk.PetID
    WHERE p.OwnerID IS NOT NULL

    -- For pets with no match, strip off the first (lowest priority) name and value.
    SET @namesToAdd = SUBSTRING(@maxKeyNames, CHARINDEX(CHAR(1), @maxKeyNames) + 1, 200)

    UPDATE pk
    SET KeyNames = @namesToAdd,
        KeyValues = SUBSTRING(KeyValues, CHARINDEX(CHAR(1), KeyValues) + 1, 900)
    FROM PetKey pk
    INNER JOIN temp_pets p ON p.PetID = pk.PetID
    WHERE pk.KeyNames = @maxKeyNames

    -- Next key combination.    
    SELECT @maxKeyNames = MAX(KeyNames) FROM PetKey
END

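【Discussion】:

【Answer 5】:

This is quite a task... Here is how I did it:

First, you need to add a table that will contain semi-where clauses, i.e. ready-to-use conditions based on the temp_builder table. Also, because you have 5 columns, I assumed there can be at most 5 conditions. Here is the table creation: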

CREATE TABLE [dbo].[temp_builder_with_where](
    [petid] [int] NULL,
    [priority1] [bit] NULL,
    [priority2] [bit] NULL,
    [priority3] [bit] NULL,
    [priority4] [bit] NULL,
    [priority5] [bit] NULL,
    [whereClause] [varchar](200) NULL
) 
--it's good to create index, for better performance
create clustered index idx on [temp_builder_with_where]([petid])

insert into temp_builder_with_where
select petid,[priority1],[priority2],[priority3],[priority4],[priority5],
         '[pets].' + CAST(field as varchar(100)) + ' = [owners].' + CAST(matchto as varchar(100)) [whereClause]
from (
select petid, field, matchto, [priority],
        1 Priority1,
        case when [priority] > 1 then 1 else 0 end Priority2,
        case when [priority] > 2 then 1 else 0 end Priority3,
        case when [priority] > 3 then 1 else 0 end Priority4,
        case when [priority] > 4 then 1 else 0 end Priority5       
from temp_builder) [builder]

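Now we will loop over that table. You said this table contains 8000 rows, so I chose another way: the dynamic query now inserts results for only one petid at a time.

To do that, we need a table to store our results: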

CREATE TABLE [dbo].[TableWithNewId](
    [petid] [int] NULL,
    [ownerid] [int] NULL,
    [priority] [int] NULL
)

Now the dynamic SQL for the insert statements:

declare @query varchar(1000) = ''
declare @i int, @max int
set @i = 1
select @max = MAX(petid) from temp_builder_with_where

while @i <= @max
begin

    set @query = ''

    select @query = @query + whereClause1 + whereClause2 + whereClause3 + whereClause4 + whereClause5 + ' union all ' from (
    select 'insert into [MY_DATABASE].dbo.TableWithNewId  select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 1 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where1] + ')' [whereClause1],
           case when [where2] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 2 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where2] + ')' end [whereClause2], 
           case when [where3] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 3 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where3] + ')' end [whereClause3], 
           case when [where4] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 4 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where4] + ')' end [whereClause4], 
           case when [where5] is null then '' else ' union all select ' + CAST(petid as varchar(3)) + ' [petid], [owners].ownerid, 5 [priority] from temp_pets [pets], temp_owners [owners] where (' + [where_petid] + [where5] + ')' end [whereClause5]
    from (
            select petid, 'petid = ' + CAST(petid as nvarchar(3)) [where_petid],
               (select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority1 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where1],
               (select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority2 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where2],
               (select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority3 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where3],
               (select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority4 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where4],
               (select ' and ' + whereClause from temp_builder_with_where where petid = t.petid and priority5 = 1 for xml path(''),type).value('(.)[1]', 'varchar(500)') [where5]
       from temp_builder_with_where [t]
       where petid = @i
        group by petid
    ) a
    ) a
    --remove last union all
    set @query = left(@query, len(@query) - 10)
    exec (@query)

    set @i = @i + 1

end

Remember that you have to replace [MY_DATABASE] in the code above with your database name. Based on your sample data, this is the result of the query select * from TableWithNewId:

PetId|OwnerId|Priority
1    |6      |4
2    |4      |2
2    |4      |3
3    |1      |1
3    |2      |1
3    |3      |1
3    |4      |1
3    |5      |1
3    |6      |1

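Based on that result, you can now assign an OwnerId to each PetId by taking the lowest priority value (well, you didn't say how to handle the case where multiple OwnerIds are found with the same priority).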
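
One way that final assignment could be written is sketched below. It assumes that ties at the same priority are broken by the lowest ownerid, which is an arbitrary choice made here because the question does not specify one; the CTE name Ranked and the column rn are also introduced only for this illustration.

```sql
;WITH Ranked AS
(
    SELECT  petid,
            ownerid,
            [priority],
            -- rn = 1 marks the row with the lowest priority value per pet;
            -- ties are broken by the lowest ownerid (assumption).
            ROW_NUMBER() OVER (PARTITION BY petid ORDER BY [priority], ownerid) AS rn
    FROM    dbo.TableWithNewId
)
UPDATE p
SET    p.OwnerID = r.ownerid
FROM   temp_pets p
INNER JOIN Ranked r ON r.petid = p.PetID AND r.rn = 1;
```

Pets that have no row in TableWithNewId are left untouched by the inner join.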

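【Discussion】:

This looks awesome! I just tested it and it seems to work well. Unfortunately, the first query returns 8000 matches for me, and when I execute the second part of the query (generating @query), it takes a very long time. I waited 5 minutes and it never finished. It works for small data sets, though. Now I need to figure out how to optimize the second part. If you have any ideas, please let me know :)

@user194076 I've updated my answer; you can give it a try.

【Answer 6】:

This can be done without dynamic SQL or loops. The key is that the columns used to match pets and owners are static. Only the priorities are dynamic. However, performance will depend heavily on your data. You will have to test it yourself and decide which approach you think is best.

The solution below basically finds all owners that match any given pet. The owners are then filtered to include only those that match priority 1, or 1 & 2, or 1 & 2 & 3, and so on. Finally, the "best" matching owner is found, and the pet table is updated with this value.

I've added some explanatory comments to the query, but feel free to ask if anything is unclear.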

-- We start off by converting the priority values into int values that are suitable to add up to a bit array
-- I'll save those in a #Temp table to cut that piece of logic out of the final query
IF OBJECT_ID('tempdb..#TempBuilder') IS NOT NULL -- Checking the object avoids an error when the temp table does not exist yet
BEGIN
    DROP TABLE #TempBuilder
END
SELECT 
    PetID, Field, MatchTo, 
    CASE [Priority] 
    WHEN 1 THEN 16 -- Priority one goes on the 16-bit (10000)
    WHEN 2 THEN 8 -- Priority two goes on the 8-bit (01000)
    WHEN 3 THEN 4 -- Priority three goes on the 4-bit (00100)
    WHEN 4 THEN 2 -- Priority four goes on the 2-bit (00010)
    WHEN 5 THEN 1 END AS [Priority] -- Priority five goes on the 1-bit (00001)
INTO #TempBuilder
FROM dbo.temp_builder;

-- Then we pivot the match priorities to be able to join them on our pets
WITH PivotedMatchPriorities AS (
    SELECT
        PetId,
        [Address], [Zip], [Country], [OwnerName], [Document]
    FROM (SELECT PetId, Field, [Priority] FROM #TempBuilder) tb
        PIVOT 
        (
            SUM([Priority])
            FOR [Field] IN ([Address], [Zip], [Country], [OwnerName], [Document])
        )
        AS PivotedMatchPriorities
),
-- Next we get (for each pet) all owners with ANY matching value
-- We want to filter the matching owners to find these that match priorities 1 (priority sum 10000, i.e. 16), 
    --- or match priorities 1 & 2 (priority sum 11000, i.e. 24)
    --- or match priorities 1 & 2 & 3 (priority sum 11100, i.e. 28)
    --- etc.
MatchingOwners AS (
    SELECT o.*,
        p.PetID,
        pmp.[Address] AS AddressPrio,
        pmp.Country AS CountryPrio,
        pmp.Zip AS ZipPrio,
        pmp.OwnerName AS OwnerPrio,
        pmp.Document AS DocumentPrio,
        CASE WHEN o.Addr = p.[Address] THEN ISNULL(pmp.[Address],0) ELSE 0 END
        + CASE WHEN o.Zip = p.Zip THEN ISNULL(pmp.Zip,0) ELSE 0 END
        + CASE WHEN o.Country = p.Country THEN ISNULL(pmp.Country,0) ELSE 0 END
        + CASE WHEN o.Document = p.Document THEN ISNULL(pmp.[Document],0) ELSE 0 END
        + CASE WHEN o.[Name] = p.OwnerName THEN ISNULL(pmp.OwnerName,0) ELSE 0 END AS MatchValue -- Calculate a match value for each matching owner
    FROM dbo.temp_pets p
        INNER JOIN dbo.temp_owners o 
            ON p.[Address] = o.Addr
            OR p.Country = o.Country
            OR p.Document = o.Document
            OR p.OwnerName = o.[Name]
            OR p.Zip = o.Zip
        INNER JOIN PivotedMatchPriorities pmp ON pmp.PetId = p.PetId
),
-- Now we can get all owners that match the pet, along with a match value for each owner.
-- We want to rank the matching owners for each pet to allow selecting the best ranked owner
-- Note: In the demo data there are multiple owners that match petId 3 equally well. We'll pick a random one in such cases.
RankedValidMatches AS (
    SELECT 
        PetID,
        OwnerID,
        MatchValue,
        ROW_NUMBER() OVER (PARTITION BY PetID ORDER BY MatchValue DESC) AS OwnerRank
    FROM MatchingOwners
    WHERE MatchValue IN (16, 24, 28, 30, 31)
)
-- Finally we can get the best valid match per pet
--SELECT * FROM RankedValidMatches WHERE OwnerRank = 1
-- Or we can update our pet table to reflect our results
UPDATE dbo.temp_pets
SET OwnerID = rvm.OwnerID
FROM dbo.temp_pets tp
    INNER JOIN RankedValidMatches rvm ON rvm.PetID = tp.PetID AND rvm.OwnerRank = 1
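
To make the MatchValue filter concrete, here is a small sketch for PetID = 2, whose rules put Document on the 16-bit, OwnerName on the 8-bit, and Zip on the 4-bit. The two rows are worked out by hand from the sample data, and the alias m and the Verdict column are introduced only for this illustration.

```sql
SELECT  OwnerID,
        MatchValue,
        CASE WHEN MatchValue IN (16, 24, 28, 30, 31)
             THEN 'valid'      -- priorities 1..k all matched
             ELSE 'rejected'   -- a higher-ranked priority is missing
        END AS Verdict
FROM (VALUES (6, 16 + 0 + 0),   -- owner 6: only Document matches
             (4, 0 + 8 + 4))    -- owner 4: OwnerName and Zip match, Document does not
     AS m(OwnerID, MatchValue);
```

Owner 6 scores 16 and passes the filter, while owner 4 scores 12 and is rejected even though more of its columns match, because the priority 1 column does not.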

【Discussion】:

【Answer 7】:

I wrote another version using UNPIVOT, but it ranks and filters the rows in a simpler way.

;with
-- r: rules table
r as (select * from temp_builder),
-- o0: owners table with all fields unpivotable (varchar)
o0 as (SELECT [OwnerID], [Addr], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [Name] FROM temp_owners ),
-- o: owners table unpivoted
o as (
    SELECT * FROM o0 
    UNPIVOT (FieldValue FOR Field IN ([Addr], [Zip], [Country], [Document], [Name])) AS p
),
-- p0: pets table with all fields unpivotable (varchar)
p0 as (SELECT [PetID], [Address], CAST([Zip] AS VARCHAR(100)) AS [Zip], [Country], [Document], [OwnerName] FROM temp_pets),
-- p: pets table unpivoted
p as (
    SELECT * FROM p0
    UNPIVOT (FieldValue FOR Field IN ([Address], [Zip], [Country], [Document], [OwnerName])) AS p
),
-- d: join up all data and keep only matching priority
d as (
    select petid, ownerid, priority 
    from (
        select r.*, o.ownerid, ROW_NUMBER() over (partition by r.petid, o.ownerid order by r.petid, o.ownerid, priority) calc_priority
        from r
        join p on (r.field = p.field) and (p.petid = r.petid)
        join o on (r.matchto = o.field) and (p.fieldvalue=o.fieldvalue) 
    ) x
    where calc_priority=priority
),
-- g: group by the matching rows to know the best priority reached for each pet
g as (
    select petid, max(priority) max_priority
    from d
    group by petid
)
-- output only the rows with best priority
select d.*
from d
join g on d.petid = g.petid and d.priority = g.max_priority
order by petid, ownerid, priority

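This version does not perform better than @EdmondQuinton's (which I upvoted); mine is about 5% slower, but I think it is easier for non-expert users to understand and maintain.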
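
The filter calc_priority = priority in the d CTE keeps a rule only when every lower-numbered rule for the same pet and owner also matched. A minimal sketch of that effect, using matched rows for PetID = 2 worked out by hand from the sample data (owner 6 matched the priority 1 rule; owner 4 matched the priority 2 and 3 rules); the alias matched and the Outcome column are introduced only for this illustration:

```sql
SELECT  OwnerID, Priority, calc_priority,
        CASE WHEN calc_priority = Priority THEN 'kept' ELSE 'dropped' END AS Outcome
FROM (
    SELECT  OwnerID, Priority,
            -- Number the matched rules per owner in priority order.
            ROW_NUMBER() OVER (PARTITION BY OwnerID ORDER BY Priority) AS calc_priority
    FROM (VALUES (6, 1), (4, 2), (4, 3)) AS matched(OwnerID, Priority)
) x
ORDER BY OwnerID, Priority;
```

Owner 4's rows are numbered 1 and 2 against priorities 2 and 3, so both are dropped; owner 6's single row is numbered 1 against priority 1 and is kept.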

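【Discussion】:

Thanks! Unfortunately, a static pivot won't work for me. The fields can change.

Do you mean that the table structure of owners and pets can change? Are they "temporary" tables that you build during the process? You could name the columns Col1..Col10 (up to the maximum number of columns you need) and leave nulls in the unused ones; that way you would have static column names for the UNPIVOT.

【Answer 8】:

I would take a slightly different approach: instead of storing the columns to match on, you could store the query to execute: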

create table builder
(
    PetID int not null,
    Query varchar(max)
)

INSERT INTO builder
VALUES (1, 'SELECT TOP 1 *
FROM pets
INNER JOIN Owners
    ON Owners.Name = pets.OwnerName 
WHERE petId = 1
ORDER BY 
    CASE WHEN Owners.Country = pets.Country THEN 0 ELSE 1 END,
    CASE WHEN Owners.Zip = pets.Zip THEN 0 ELSE 1 END,
    CASE WHEN Owners.Addr = pets.Address THEN 0 ELSE 1 END'),
(2, 'SELECT TOP 1 *
FROM pets
INNER JOIN Owners
    ON Owners.Name = pets.OwnerName 
WHERE petId = 2
ORDER BY 
    CASE WHEN Owners.Document = pets.Document THEN 0 ELSE 1 END,
    CASE WHEN Owners.Name = pets.OwnerName THEN 0 ELSE 1 END,
    CASE WHEN Owners.Zip = pets.Zip THEN 0 ELSE 1 END'),
(3, 'SELECT TOP 1 *
FROM pets
INNER JOIN Owners
    ON Owners.Name = pets.OwnerName 
WHERE petId = 3
ORDER BY 
    CASE WHEN Owners.Country = pets.Country THEN 0 ELSE 1 END
')

create table pets
(
    PetID int null,
    Address varchar(100) null,
    Zip int null,
    Country varchar(100) null,
    Document varchar(100) null,
    OwnerName varchar(100) null,
    OwnerID int null,
    Field1 bit null,
    Field2 bit null
)

insert into pets values
(1, '123 5th st', 12345, 'US', 'test.csv', 'John', NULL, NULL, NULL),
(2, '234 6th st', 23456, 'US', 'a.csv', 'Alex', NULL, NULL, NULL),
(3, '345 7th st', 34567, 'US', 'b.csv', 'Mike', NULL, NULL, NULL)

create table owners
(
    OwnerID int null,
    Addr varchar(100) null,
    Zip int null,
    Country varchar(100) null,
    Document varchar(100) null,
    Name varchar(100) null,
    OtherField bit null,
    OtherField2 bit null,
)

insert into owners values
(1, '456 8th st',  45678, 'US', 'c.csv', 'Mike',  NULL, NULL),
(2, '678 9th st',  45678, 'US', 'b.csv', 'John',  NULL, NULL),
(3, '890 10th st', 45678, 'US', 'b.csv', 'Alex',  NULL, NULL),
(4, '901 11th st', 23456, 'US', 'b.csv', 'Alex',  NULL, NULL),
(5, '234 5th st',  12345, 'US', 'b.csv', 'John',  NULL, NULL),
(6, '123 5th st',  45678, 'US', 'a.csv', 'John',  NULL, NULL)

Now, to find the matching owner for a particular pet, you just look up the query in the table and execute it:

DECLARE @query varchar(max)
SELECT TOP 1 @query = query
FROM builder
WHERE petId =1

EXEC (@query)

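【Discussion】:

【Answer 9】:

With that in mind, here is an answer that strictly addresses your question.

It follows the rules you stated: no loops, no cursors, no dynamic SQL. It also takes your question strictly as asked, so this is not a generic solution; it is tailored to your problem and the columns you have.

Test data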

declare @Pets table 
(
    PetID int null,
    Address varchar(100) null,
    Zip int null,
    Country varchar(100) null,
    Document varchar(100) null,
    OwnerName varchar(100) null,
    OwnerID int null,
    Field1 bit null,
    Field2 bit null
)

insert into @Pets values
(1, '123 5th st', 12345, 'US', 'test.csv', 'John', NULL, NULL, NULL),
(2, '234 6th st', 23456, 'US', 'a.csv', 'Alex', NULL, NULL, NULL),
(3, '345 7th st', 34567, 'US', 'b.csv', 'Mike', NULL, NULL, NULL)

declare @owners table
(
    OwnerID int null,
    Addr varchar(100) null,
    Zip int null,
    Country varchar(100) null,
    Document varchar(100) null,
    Name varchar(100) null,
    OtherField bit null,
    OtherField2 bit null
)

insert into @owners values
(1, '456 8th st',  45678, 'US', 'c.csv', 'Mike',  NULL, NULL),
(2, '678 9th st',  45678, 'US', 'b.csv', 'John',  NULL, NULL),
(3, '890 10th st', 45678, 'US', 'b.csv', 'Alex',  NULL, NULL),
(4, '901 11th st', 23456, 'US', 'b.csv', 'Alex',  NULL, NULL),
(5, '234 5th st',  12345, 'US', 'b.csv', 'John',  NULL, NULL),
(6, '123 5th st',  45678, 'US', 'a.csv', 'John',  NULL, NULL)

declare @builder table  
(
    PetID int not null,
    Field varchar(30) not null,
    MatchTo varchar(30) not null,
    Priority int not null
)

insert into @builder values
(1,'Address', 'Addr',4),
(1,'Zip', 'Zip', 3),
(1,'Country', 'Country', 2),
(1,'OwnerName', 'Name',1),
(2,'Zip', 'Zip',3),
(2,'OwnerName','Name', 2),
(2,'Document', 'Document', 1),
(3,'Country', 'Country', 1)

Code that solves the problem

select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on 
( 
   (case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 3 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 3 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 3 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 3 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 3 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 3 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 3 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 3 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 3 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 3 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 4 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 4 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 4 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 4 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 4 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 4 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 4 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 4 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 4 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 4 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 5 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 5 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 5 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 5 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 5 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 5 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 5 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 5 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 5 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 5 then o.Document else '-1' end)                    
)
group by p.PetID

union
--------------------------

select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on 
( 
   (case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 3 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 3 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 3 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 3 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 3 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 3 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 3 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 3 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 3 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 3 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 4 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 4 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 4 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 4 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 4 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 4 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 4 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 4 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 4 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 4 then o.Document else '-1' end)                    
)
group by p.PetID

union
--------------------------

select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on 
( 
   (case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 3 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 3 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 3 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 3 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 3 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 3 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 3 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 3 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 3 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 3 then o.Document else '-1' end)                    
)
group by p.PetID

union
------------------------

select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on 
( 
   (case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)                    
)
AND
( 
   (case when b.Field = 'Address' and b.Priority = 2 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 2 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 2 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 2 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 2 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 2 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 2 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 2 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 2 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 2 then o.Document else '-1' end)                    
)
group by p.PetID

union
------------------------

select distinct p.PetID, min(o.OwnerID) as ownerID from @pets p
inner join @builder b on p.PetID = b.PetID
inner join @owners o on 
( 
   (case when b.Field = 'Address' and b.Priority = 1 then p.Address else '0' end) = (case when b.MatchTo = 'Addr' and b.Priority = 1 then o.Addr else '-1' end)                  
or (case when b.Field = 'Zip' and b.Priority = 1 then p.Zip else '0' end) = (case when b.MatchTo = 'Zip' and b.Priority = 1 then o.Zip else '-1' end)                    
or (case when b.Field = 'Country' and b.Priority = 1 then p.Country else '0' end) = (case when b.MatchTo = 'Country' and b.Priority = 1 then o.Country else '-1' end)                    
or (case when b.Field = 'OwnerName' and b.Priority = 1 then p.OwnerName else '0' end) = (case when b.MatchTo = 'Name' and b.Priority = 1 then o.Name else '-1' end)                  
or (case when b.Field = 'Document' and b.Priority = 1 then p.Document else '0' end) = (case when b.MatchTo = 'Document' and b.Priority = 1 then o.Document else '-1' end)                    
)
group by p.PetID

Result:

PetID   OwnerID
1       2
2       6
3       1
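
The fallback rule from the question (match on all listed fields, then drop the highest-priority field and retry) can also be written procedurally, which is handy as a reference when checking a query's output. Below is a minimal Python sketch, not a replacement for the SQL. It assumes ties are broken by the lowest OwnerID (the question's `top 1` leaves this open). The helper `find_owner` and the dictionary layout are hypothetical; the rows are copied from the sample data above.

```python
# Sample rows from the tables above; only the columns used for matching are kept.
pets = {
    1: {"Address": "123 5th st", "Zip": 12345, "Country": "US",
        "Document": "test.csv", "OwnerName": "John"},
    2: {"Address": "234 6th st", "Zip": 23456, "Country": "US",
        "Document": "a.csv", "OwnerName": "Alex"},
    3: {"Address": "345 7th st", "Zip": 34567, "Country": "US",
        "Document": "b.csv", "OwnerName": "Mike"},
}

owners = [
    {"OwnerID": 1, "Addr": "456 8th st", "Zip": 45678, "Country": "US", "Document": "c.csv", "Name": "Mike"},
    {"OwnerID": 2, "Addr": "678 9th st", "Zip": 45678, "Country": "US", "Document": "b.csv", "Name": "John"},
    {"OwnerID": 3, "Addr": "890 10th st", "Zip": 45678, "Country": "US", "Document": "b.csv", "Name": "Alex"},
    {"OwnerID": 4, "Addr": "901 11th st", "Zip": 23456, "Country": "US", "Document": "b.csv", "Name": "Alex"},
    {"OwnerID": 5, "Addr": "234 5th st", "Zip": 12345, "Country": "US", "Document": "b.csv", "Name": "John"},
    {"OwnerID": 6, "Addr": "123 5th st", "Zip": 45678, "Country": "US", "Document": "a.csv", "Name": "John"},
]

# (PetID, Field, MatchTo, Priority), as in the builder table.
builder = [
    (1, "Address", "Addr", 4),
    (1, "Zip", "Zip", 3),
    (1, "Country", "Country", 2),
    (1, "OwnerName", "Name", 1),
    (2, "Zip", "Zip", 3),
    (2, "OwnerName", "Name", 2),
    (2, "Document", "Document", 1),
    (3, "Country", "Country", 1),
]


def find_owner(pet_id):
    """Hypothetical helper: apply the fallback rule for a single pet."""
    # Highest priority number first, so rules[0] is the field dropped next.
    rules = sorted((r for r in builder if r[0] == pet_id),
                   key=lambda r: r[3], reverse=True)
    pet = pets[pet_id]
    while rules:
        hits = [o["OwnerID"] for o in owners
                if all(pet[field] == o[match_to] for _, field, match_to, _ in rules)]
        if hits:
            return min(hits)   # assumption: the lowest OwnerID wins a tie
        rules = rules[1:]      # no match: drop the highest-priority field
    return None                # nothing matched even on the last field


result = {pet_id: find_owner(pet_id) for pet_id in pets}
print(result)
```

Under these assumptions the sketch yields owner 5 for PetID 1, because owner 5 agrees on Zip, Country and Name once Address is dropped. That differs from the 2 listed above, which is what matching on the priority-1 field (OwnerName) alone gives.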

Discussion:

Answer 10:

If you are looking for a simple solution without unions, loops, cursors, or dynamic SQL, the query below works.

SQL Fiddle: http://sqlfiddle.com/#!18/10982/41

select pets.PetID,
       coalesce(
           -- 1st attempt: match on Zip, Name and Document
           (select top 1 OwnerID
            from temp_owners
            where Zip = pets.Zip
              and Name = pets.OwnerName
              and Document = pets.Document),
           -- 2nd attempt: drop Zip, match on Name and Document
           (select top 1 OwnerID
            from temp_owners
            where Name = pets.OwnerName
              and Document = pets.Document),
           -- 3rd attempt: match on Document only
           (select top 1 OwnerID
            from temp_owners
            where Document = pets.Document)
       ) as OwnerId
from temp_pets pets

Result:

PetID   OwnerId
1       (null)
2       6
3       2

Discussion:

This one needs no dynamic SQL only because it is hard-coded; it completely ignores the contents of the "special matching table"...
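
For comparison, here is a sketch of a set-based query that does read the matching table and still avoids loops, cursors and dynamic SQL. To keep it runnable without a SQL Server instance it uses SQLite through Python's `sqlite3` module as a stand-in; the query uses only CTEs, `case` and `group by`, which should carry over to T-SQL, but it has not been run there. The names `field_match`, `level_match`, `IsMatch` and `Lvl` are hypothetical. It assumes each (PetID, Priority) pair appears once in `temp_builder` and that ties go to the lowest OwnerID.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table temp_pets    (PetID int, Address text, Zip int, Country text, Document text, OwnerName text);
create table temp_owners  (OwnerID int, Addr text, Zip int, Country text, Document text, Name text);
create table temp_builder (PetID int, Field text, MatchTo text, Priority int);
""")
con.executemany("insert into temp_pets values (?,?,?,?,?,?)", [
    (1, "123 5th st", 12345, "US", "test.csv", "John"),
    (2, "234 6th st", 23456, "US", "a.csv", "Alex"),
    (3, "345 7th st", 34567, "US", "b.csv", "Mike"),
])
con.executemany("insert into temp_owners values (?,?,?,?,?,?)", [
    (1, "456 8th st", 45678, "US", "c.csv", "Mike"),
    (2, "678 9th st", 45678, "US", "b.csv", "John"),
    (3, "890 10th st", 45678, "US", "b.csv", "Alex"),
    (4, "901 11th st", 23456, "US", "b.csv", "Alex"),
    (5, "234 5th st", 12345, "US", "b.csv", "John"),
    (6, "123 5th st", 45678, "US", "a.csv", "John"),
])
con.executemany("insert into temp_builder values (?,?,?,?)", [
    (1, "Address", "Addr", 4), (1, "Zip", "Zip", 3),
    (1, "Country", "Country", 2), (1, "OwnerName", "Name", 1),
    (2, "Zip", "Zip", 3), (2, "OwnerName", "Name", 2),
    (2, "Document", "Document", 1), (3, "Country", "Country", 1),
])

query = """
with field_match as (
    -- one row per (pet, owner, builder rule): 1 if the rule's two columns agree
    select p.PetID, o.OwnerID, b.Priority,
           case when (b.Field = 'Address'   and b.MatchTo = 'Addr'     and p.Address   = o.Addr)
                  or (b.Field = 'Zip'       and b.MatchTo = 'Zip'      and p.Zip       = o.Zip)
                  or (b.Field = 'Country'   and b.MatchTo = 'Country'  and p.Country   = o.Country)
                  or (b.Field = 'OwnerName' and b.MatchTo = 'Name'     and p.OwnerName = o.Name)
                  or (b.Field = 'Document'  and b.MatchTo = 'Document' and p.Document  = o.Document)
                then 1 else 0 end as IsMatch
    from temp_pets p
    join temp_builder b on b.PetID = p.PetID
    cross join temp_owners o
),
level_match as (
    -- level N means: every rule with Priority <= N agrees
    -- (rules with a higher priority number have been dropped)
    select lvl.PetID, lvl.Priority as Lvl, fm.OwnerID
    from temp_builder lvl
    join field_match fm
      on fm.PetID = lvl.PetID and fm.Priority <= lvl.Priority
    group by lvl.PetID, lvl.Priority, fm.OwnerID
    having min(fm.IsMatch) = 1
)
-- keep the strictest level that produced a match, lowest OwnerID on ties
select lm.PetID, min(lm.OwnerID) as OwnerID
from level_match lm
where lm.Lvl = (select max(x.Lvl) from level_match x where x.PetID = lm.PetID)
group by lm.PetID
order by lm.PetID
"""

rows = con.execute(query).fetchall()
print(rows)
```

The column names are still listed in the `case` expression, so a field name that appears in `temp_builder` but has no branch there never matches. Pets with no match at any level are missing from the output; a left join from `temp_pets` would keep them with a null OwnerID.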
