SQL LEFT JOIN to many categories

【Title】SQL LEFT JOIN to many categories 【Posted】2021-11-27 02:15:50 【Question】:

Assume the following simple scenario, in which each product row is linked to one main category, one subcategory, and one sub-subcategory.

DECLARE @PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));

INSERT @PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1', '10', '100'),
(2, 'NIKE CORTEZ', '1', '12', '104'),
(3, 'ADIDAS PANTS', '2', '27', '238'),
(4, 'PUMA REVOLUTION 5', '3', '35', '374'),
(5, 'SALOMON SHELTER CS', '4', '15', '135'),
(6, 'NIKE EBERNON LOW', '2', '14', '157');

DECLARE @CATS TABLE (ID int, DESCR varchar(100));

INSERT @CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');

DECLARE @SUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');

DECLARE @SUBSUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');


SELECT prod.ID,
    prod.DESCRIPTION,
    CONCAT(cat1.DESCR, ' > ', cat2.DESCR, ' > ', cat3.DESCR) AS CATEGORIES
FROM @PRODUCTS AS prod
LEFT JOIN @CATS AS cat1 ON cat1.ID = prod.CAT
LEFT JOIN @SUBCATS AS cat2 ON cat2.ID = prod.SUBCAT
LEFT JOIN @SUBSUBCATS AS cat3 ON cat3.ID = prod.SUBSUBCAT;

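Now suppose the foreign keys on the @PRODUCTS table are not single IDs into their respective tables. Instead, each one is a comma-separated list of IDs for multiple categories, subcategories, and sub-subcategories (like here).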

DECLARE @PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));

INSERT @PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
(2, 'NIKE CORTEZ', '1, 5', '12, 15', '104, 374'),
(3, 'ADIDAS PANTS', '2, 6', '27, 35', '238, 374');

DECLARE @CATS TABLE (ID int, DESCR varchar(100));

INSERT @CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');

DECLARE @SUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');

DECLARE @SUBSUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');


SELECT prod.ID,
    prod.DESCRIPTION
    --CONCAT(cat1.DESCR, ' > ', cat2.DESCR, ' > ', cat3.DESCR) AS CATEGORIES
FROM @PRODUCTS AS prod
--LEFT JOIN @CATS AS cat1 ON cat1.ID = prod.CAT
--LEFT JOIN @SUBCATS AS cat2 ON cat2.ID = prod.SUBCAT
--LEFT JOIN @SUBSUBCATS AS cat3 ON cat3.ID = prod.SUBSUBCAT;

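In this case, I want to achieve the following:

    1. Be able to retrieve the respective names of the cats, subcats, and sub-subcats, i.e. for cats '1, 2' be able to retrieve their names (I tried LEFT JOIN @CATS AS cat1 ON cat1.ID IN prod.CAT but it did not work).
    2. Create triplets of the corresponding cats, subcats, and sub-subcats, i.e. for cats '1, 2', subcats '12, 17', sub-subcats '239, 372'.
    3. After retrieving the proper names, create pipe-delimited category routes, e.g. name of cat 1 > name of subcat 12 > name of sub-subcat 239 | name of cat 2 > name of subcat 17 > name of sub-subcat 372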

So, for a row such as (1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),

I would like to get the following result:

ID  DESCRIPTION     CATEGORIES
1   NIKE MILLENIUM  MEN > FOOTWEAR > RUNNING @ WOMEN > OUTERWEAR > FLEECE

(I had to use @ as the delimiter between the two triplets because the pipe interfered with the table's columns.)

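If the user has foolishly stored more cat IDs than subcat IDs or sub-subcat IDs, the query should match only the entries that have a counterpart at the same position, i.e. for

cats '1, 2', subcats '12', sub-subcats '239, 372'

it should create only one triplet, e.g. name of 1 > name of 12 > name of 239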

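【Comments】:

The obvious solution is to fix your design... this is a terrible way to store relationships, and it will keep haunting you until you fix it.

What have you actually tried? A SELECT is not really an attempt. You are going to have to do something like a string split, then a join, then a concatenation. As I said, this is awful; step back and have the application that is responsible for this do it properly. Failing that, see how far you get.

The kind of delimited mess you have been handed violates 1NF and is a serious PITA. Worse, a space has been inserted after the delimiter. Argh!!! You will need to use string_split here, since you are stuck with this design.

Sean said that first. I also gave you a starting hint: "You are going to have to do something like a string split, then a join, then a concatenation."

【Answer 1】: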

STRING_SPLIT() does not promise to return values in any particular order, so it will not work here, because ordinal position matters.
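As a side note for readers on newer versions than the one discussed in this thread: STRING_SPLIT() accepts an optional third argument, enable_ordinal, in SQL Server 2022 and Azure SQL Database, which adds an ordinal column with the 1-based position of each value. The sketch below is only an illustration under that version assumption; the variable @cat is made up for the example.

```sql
-- Assumption: SQL Server 2022+ or Azure SQL Database (enable_ordinal is not available earlier).
DECLARE @cat varchar(30) = '1, 2';

SELECT s.ordinal,                 -- 1-based position within the list
       LTRIM(s.value) AS CatId    -- the separator must be a single character, so trim the leading space
FROM STRING_SPLIT(@cat, ',', 1) AS s
ORDER BY s.ordinal;
```

On versions without enable_ordinal, the point made above still holds, which is why the rest of this answer uses OPENJSON().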

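Use OPENJSON() to split the strings into separate rows; this keeps the values in their original order. OPENJSON() also returns a key field, so you can join on the position of each value within its list. You need INNER JOIN because your requirement is that a value must be present in every one of the "columns". Use STUFF() to assemble the various cat > subcat > subsubcat values.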
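To see what the split step produces on its own, here is a minimal sketch that turns one list into a JSON array and reads it back. It assumes SQL Server 2016 or later with database compatibility level 130 or higher, which OPENJSON() requires; the variable @cat exists only for the illustration.

```sql
-- Assumption: SQL Server 2016+ and compatibility level >= 130.
DECLARE @cat varchar(30) = '1, 2';

-- REPLACE + CONCAT turn '1, 2' into the JSON array '["1","2"]'.
SELECT j.[key]   AS CatKey,   -- zero-based index of the element in the array
       j.[value] AS CatId     -- the element itself, returned as text
FROM OPENJSON(CONCAT('["', REPLACE(@cat, ', ', '","'), '"]')) AS j;
```

Because [key] is the array index, matching c.[key] = sc.[key] = ssc.[key] pairs up the first cat with the first subcat and the first sub-subcat, and so on.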

DECLARE @PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));

INSERT @PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
(2, 'NIKE CORTEZ', '1, 5', '12, 15', '104, 374'),
(3, 'ADIDAS PANTS', '2, 6, 1', '27, 35, 10', '238, 374, 100'),
(4, 'JOE THE PLUMBER JEANS', '1, 5', '27', '238, 374');

DECLARE @CATS TABLE (ID int, DESCR varchar(100));

INSERT @CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');

DECLARE @SUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');

DECLARE @SUBSUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');

;
with prod as (
    SELECT p.ID,
        p.DESCRIPTION
        --CONCAT(cat1.DESCR, ' > ', cat2.DESCR, ' > ', cat3.DESCR) AS CATEGORIES
        , c.value as CatId
        , c.[key] as CatKey
        , sc.value as SubCatId
        , sc.[key] as SubCatKey
        , ssc.value as SubSubCatId
        , ssc.[key] as SubSubCatKey
    FROM @PRODUCTS p
      cross apply OPENJSON(CONCAT('["', REPLACE(cat, ', ', '","'), '"]')) c
      cross apply OPENJSON(CONCAT('["', REPLACE(subcat, ', ', '","'), '"]')) sc
      cross apply OPENJSON(CONCAT('["', REPLACE(subsubcat, ', ', '","'), '"]')) ssc
    where c.[key] = sc.[key]
      and c.[key] = ssc.[key]
)
, a as (
    select p.ID
    , p.DESCRIPTION
    , c.DESCR + ' > ' + sc.DESCR + ' > ' + ssc.DESCR as CATEGORIES
    , p.CatKey
    from prod p
      inner join @CATS c on c.ID = p.CatId
      inner join @SUBCATS sc on sc.ID = p.SubCatId
      inner join @SUBSUBCATS ssc on ssc.ID = p.SubSubCatId
)

select DISTINCT ID
, DESCRIPTION
, replace(STUFF((SELECT distinct ' | ' + a2.CATEGORIES
            from a a2
            where a.ID = a2.ID
            FOR XML PATH(''))
        ,1,3,''), '&gt;', '>') CATEGORIES   -- drop the leading ' | ' and undo the XML escaping of '>'
from a
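The STUFF(... FOR XML PATH('')) idiom in the final SELECT can be easier to follow in isolation. The sketch below uses a made-up table variable, @routes, that stands in for the rows produced by the CTE named a; it is an illustration of the idiom, not a replacement for the query above.

```sql
-- Hypothetical stand-in for the rows of CTE "a" for one product.
DECLARE @routes TABLE (ID int, CATEGORIES varchar(200));

INSERT @routes (ID, CATEGORIES) VALUES
(1, 'MEN > FOOTWEAR > RUNNING'),
(1, 'WOMEN > OUTERWEAR > FLEECE');

-- Step 1: FOR XML PATH('') glues many rows into one string.
-- Because the result is XML, each '>' comes back escaped as '&gt;'.
SELECT (SELECT ' | ' + r.CATEGORIES
        FROM @routes r
        FOR XML PATH('')) AS Glued;

-- Step 2: STUFF(s, 1, 3, '') deletes the first three characters (the leading ' | '),
-- and REPLACE turns the '&gt;' entity back into '>'.
SELECT REPLACE(STUFF((SELECT ' | ' + r.CATEGORIES
                      FROM @routes r
                      FOR XML PATH('')), 1, 3, ''), '&gt;', '>') AS CATEGORIES;
```

CONCAT() combines values that sit in the same row, whereas this idiom collapses several rows into one string, which is why it is used here. On SQL Server 2017 and later, STRING_AGG() is a simpler way to do the same aggregation.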

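【Comments】:

Oh my, this is brilliant! My SQL Server skills are pretty advanced, but I could only adapt this to my case if it were compatible with SQL Server 2014! It even handles the "orphan" surplus IDs that may exist among the cats, subcats, or sub-subcats... :(

dougp, for my SQL Server 2014 problem there is no simple solution, or none at all, is there?

You would need to write a user-defined function to replace what OPENJSON(CONCAT()) is doing. I don't think it would be that difficult, especially if the number of values in each string is limited. Are you saying each string has at most 3 IDs? Do you have permission to create a UDF?

The split function Carlos mentioned looks like a good start, but I think you need to add a key so that you can join by position.

【Answer 2】: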

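This is a completely different answer because of the switch to an older technology. I think my original answer is still good for people on a current SQL Server version, so I don't want to delete it.

I don't remember where I got this function. When I found it today, it was named split_delimiter. I renamed it, added some comments, and added support for delimiters longer than one character.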

CREATE FUNCTION [dbo].[udf_split_string](@delimited_string VARCHAR(8000), @delimiter varchar(10))
RETURNS TABLE AS
RETURN
WITH cte10(num) AS (    --  10 rows
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)                         
, cte100(num) AS (  --  100 rows
    SELECT 1 
    FROM   cte10 t1, cte10 t2
)
, cte10000(num) AS (    --  10000 rows
    SELECT 1 
    FROM   cte100 t1, cte100 t2
)
, cte1(num) AS (    --  1 row per character
    SELECT TOP (ISNULL(DATALENGTH(@delimited_string), 0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
    FROM   cte10000
)
, cte2(num) AS (    --  locations of strings
    SELECT  1
    UNION ALL
    SELECT t.num + len(replace(@delimiter, ' ', '_'))
    FROM   cte1 t
    WHERE  SUBSTRING(@delimited_string, t.num, len(replace(@delimiter, ' ', '_'))) = @delimiter
)
, cte3(num, [len]) AS (
    SELECT t.num
         , ISNULL(NULLIF(CHARINDEX(@delimiter, @delimited_string, t.num), 0) - t.num, 8000)
    FROM  cte2 t
)

SELECT [Key]   = ROW_NUMBER() OVER (ORDER BY t.num)
     , [Value] = SUBSTRING(@delimited_string, t.num, t.[len])
FROM cte3 t;
GO
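A quick way to check the function before using it in the full query is to call it on a single list. This sketch assumes dbo.udf_split_string has been created exactly as defined above; the input string is taken from the sample data.

```sql
-- Assumption: dbo.udf_split_string exists as defined above.
SELECT s.[Key],     -- 1-based position (ROW_NUMBER), unlike the 0-based [key] of OPENJSON
       s.[Value]
FROM dbo.udf_split_string('27, 35, 10', ', ') AS s
ORDER BY s.[Key];
```

Since the delimiter passed in is ', ' (comma plus space), the returned values carry no leading spaces, and [Key] can be used to join the three lists by position.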



DECLARE @PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));

INSERT @PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
(2, 'NIKE CORTEZ', '1, 5', '12, 15', '104, 374'),
(3, 'ADIDAS PANTS', '2, 6, 1', '27, 35, 10', '238, 374, 100'),
(4, 'JOE THE PLUMBER JEANS', '1, 5', '27', '238, 374');

DECLARE @CATS TABLE (ID int, DESCR varchar(100));

INSERT @CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');

DECLARE @SUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');

DECLARE @SUBSUBCATS TABLE (ID int, DESCR varchar(100));

INSERT @SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');

;
with prod as (
    SELECT p.ID,
        p.DESCRIPTION
        , c.value as CatId
        , c.[key] as CatKey
        , sc.value as SubCatId
        , sc.[key] as SubCatKey
        , ssc.value as SubSubCatId
        , ssc.[key] as SubSubCatKey
    FROM @PRODUCTS p
      cross apply dbo.udf_split_string(cat, ', ') c
      cross apply dbo.udf_split_string(subcat, ', ') sc
      cross apply dbo.udf_split_string(subsubcat, ', ') ssc
    where c.[key] = sc.[key]
      and c.[key] = ssc.[key]
)
, a as (
    select p.ID
    , p.DESCRIPTION
    , c.DESCR + ' > ' + sc.DESCR + ' > ' + ssc.DESCR as CATEGORIES
    , p.CatKey
    from prod p
      inner join @CATS c on c.ID = p.CatId
      inner join @SUBCATS sc on sc.ID = p.SubCatId
      inner join @SUBSUBCATS ssc on ssc.ID = p.SubSubCatId
)

select DISTINCT ID
, DESCRIPTION
, replace(STUFF((SELECT distinct ' | ' + a2.CATEGORIES
            from a a2
            where a.ID = a2.ID
            FOR XML PATH(''))
        ,1,3,''), '&gt;', '>') CATEGORIES   -- drop the leading ' | ' and undo the XML escaping of '>'
from a

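【Comments】:

This works like a charm, dougp. Thank you so much for helping me, even though the requirements of my case are very specific and hard to deal with.

dougp, could you explain the STUFF() part of the query to me? Why do we use it? It is not very clear to me! Couldn't we just use CONCAT() to add the rest of the hierarchy?

【Answer 3】: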

That should do it. I changed your ">" character to "-" just to keep the data simpler.

Your table design is not perfect, but first attempts almost never are.

select mainp.ID, mainp.DESCRIPTION, stuff(ppaths.metapaths, len(ppaths.metapaths),1,'') metalinks
from @PRODUCTS mainp
cross apply(
select
(select 
  c.DESCR + '-' + sc.DESCR + '-' + sbc.DESCR + '|'
from @PRODUCTS p    
    cross apply (select row_number() over(order by Value) id, Value from split(p.CAT, ','))cat_ids
    inner join @cats c on c.ID = cat_ids.Value
    cross apply (select row_number() over(order by Value) id, Value from split(p.SUBCAT, ','))subcat_ids
    inner join @SUBCATS sc on sc.ID = subcat_ids.Value
    and subcat_ids.id = subcat_ids.id
    cross apply (select row_number() over(order by Value) id, Value  from split(p.SUBSUBCAT, ','))subsubcat_ids
    inner join @SUBSUBCATS sbc on sbc.ID = subsubcat_ids.Value
    and subsubcat_ids.id = subcat_ids.id
where p.id = mainp.ID
for xml path('')) metapaths
) ppaths

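Link to the split function: https://desarrolladores.me/2014/03/sql-server-funcion-split-para-dividir-un-string/

【Comments】:

It turns out it does not work well after all... unfortunately it creates every possible cat/subcat/subsubcat combination, whereas it should only match cats/subcats/subsubcats that have the same ordinal position. That means that for cat [1,2,3], subcat [10,20,30] and subsubcat [100,200,300] it should create only 3 hierarchies: 1 > 10 > 100, 2 > 20 > 200 and 3 > 30 > 300.

@FayeD. Just add a sequence number and join on it, simple.

Carlos, could you elaborate?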
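The "add a sequence number and join on it" suggestion can be sketched as follows. This is an illustration only: the three table variables are hypothetical stand-ins for already-split lists, with seq holding each ID's position in its original string. Note that in the query above the condition subcat_ids.id = subcat_ids.id compares a column with itself and is therefore always true, and the row numbers are ordered by Value rather than by position, which would explain the extra combinations.

```sql
-- Hypothetical pre-split lists for one product; seq is the position in the original string.
DECLARE @cat    TABLE (seq int, id int);
DECLARE @subcat TABLE (seq int, id int);
DECLARE @subsub TABLE (seq int, id int);

INSERT @cat    (seq, id) VALUES (1, 1),   (2, 2),   (3, 3);
INSERT @subcat (seq, id) VALUES (1, 10),  (2, 20),  (3, 30);
INSERT @subsub (seq, id) VALUES (1, 100), (2, 200), (3, 300);

-- Joining on seq pairs each cat only with the subcat and sub-subcat at the same position,
-- so this returns 3 rows instead of 27 combinations.
SELECT c.id AS cat, sc.id AS subcat, ssc.id AS subsubcat
FROM @cat c
  INNER JOIN @subcat sc  ON sc.seq  = c.seq
  INNER JOIN @subsub ssc ON ssc.seq = c.seq
ORDER BY c.seq;
```

The inner joins also drop any position that is missing from one of the lists, which matches the requirement stated in the question.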
