Avoiding multiple repetitive sub queries that are being used to derive columns in the select

【Posted】: 2018-11-21 05:49:49 【Question】:

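I have a view with multiple subqueries that are used to derive columns in the select list (for simplicity, I have not included all of them). Is it acceptable to write a query with this many subqueries, or is there a better way to rewrite it that avoids them? Are there any best practices to follow? I looked into derived queries and CTEs, but for some reason I could not put it together. If possible, I would like to eliminate the repetitive subqueries.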

  SELECT a.id,
   (
     SELECT TOP 1
      name
     FROM x.dbo.Info l
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND stamp =
       (
       SELECT MIN(stamp)
       FROM x.dbo.Info
       WHERE orderno = l.orderno
         AND releaseno = l.releaseno
         AND status = 'Released'
       )
     ORDER BY stamp DESC
   ) [shop_name],
   c.line_no,
   a.status,
   d.family,
   (
     SELECT TOP 1
      name
     FROM x.dbo.Info
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND status NOT LIKE 'backflus%'
       AND status NOT LIKE 'so%'
     ORDER BY stamp DESC
   ) AS [lastworkplace],
   (
     SELECT TOP 1
      lstatus
     FROM x.dbo.Info
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND status NOT LIKE 'backflus%'
       AND status NOT LIKE 'so%'
     ORDER BY stamp DESC
   ) AS [laststatus]
FROM BI.dbo.tblz a -- this is a view (not sure if that matters)
  LEFT JOIN X.dbo.tblx b
    ON b.id = a.salesorder
  LEFT JOIN X.dbo.tbls c
    ON c.tranid = a.salesorder
     AND c.itemid = a.assemblyid
     AND c.serialnum = a.ordercode
  LEFT JOIN Z.dbo.tbli d
    ON d.prodline = LEFT(COALESCE(STUFF(a.assemblyid, CHARINDEX('+', a.assemblyid), 1, ''), a.assemblyid), 2)
WHERE a.id = 'p'
  AND
  (
    LEFT(a.prun, 8) >= '20120101'
    OR a.prun IS NULL
  )
UNION ALL
SELECT a.id,
   (
     SELECT TOP 1
      name
     FROM x.dbo.Info l
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND stamp =
       (
       SELECT MIN(stamp)
       FROM x.dbo.Info
       WHERE orderno = l.orderno
         AND releaseno = l.releaseno
         AND status = 'Released'
       )
     ORDER BY stamp DESC
   ) [shop_name],
   c.line_no,
   a.status,
   d.family,
   (
     SELECT TOP 1
      name
     FROM x.dbo.Info
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND status NOT LIKE 'backflus%'
       AND status NOT LIKE 'so%'
     ORDER BY stamp DESC
   ) AS [lastworkplace],
   (
     SELECT TOP 1
      lstatus
     FROM x.dbo.Info
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND status NOT LIKE 'backflus%'
       AND status NOT LIKE 'so%'
     ORDER BY stamp DESC
   ) AS [laststatus]
FROM BI.dbo.tblz a -- this is a view (not sure if that matters)
  LEFT JOIN X.dbo.tblx b
    ON b.id = a.salesorder
  LEFT JOIN X.dbo.tbls c
    ON c.tranid = a.salesorder
     AND c.itemid = a.assemblyid
     AND c.serialnum = a.ordercode
  LEFT JOIN Z.dbo.tbli d
    ON d.prodline = LEFT(COALESCE(STUFF(a.assemblyid, CHARINDEX('+', a.assemblyid), 1, ''), a.assemblyid), 2)
WHERE a.id = 'm'
  AND
  (
    LEFT(a.prun, 8) >= '20120101'
    OR a.prun IS NULL
  );

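【Comments】:

What is the difference between the SELECT statements before and after the UNION ALL?

@Alex: they are exactly the same, except for the WHERE clause, which uses a.id = 'm' instead of 'p'.

Why not use WHERE a.id IN ('m', 'p') in a single query? Note: you have to measure the performance, because sometimes UNION ALL is faster.

【Answer 1】: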

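You can rewrite your select using a CTE, which makes it more readable. Quoting the documentation:

Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement. This clause can also be used in a CREATE VIEW statement as part of its defining SELECT statement. A common table expression can include references to itself. This is referred to as a recursive common table expression.

Just a simple example: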

WITH 
 step1 as 
  ( select a+1 as x, b-1 as y
    from t
  ),
 step2 as
  ( select x*2 as i, y/2 as j
    from step1
  )
select i+j as r
from step2;

You can chain several statements this way.
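Applied to the query in the question, the two repeated TOP 1 subqueries that produce lastworkplace and laststatus could be prepared once in a CTE with a window function and then joined. The following is a minimal sketch, not a tested replacement: latest_info, li and rn are illustrative names that do not appear in the original post, and only part of the select list is shown.

```sql
-- Sketch only: latest_info, li and rn are illustrative names, not from the post.
WITH latest_info AS
(
    SELECT orderno,
           releaseno,
           name,
           lstatus,
           -- rn = 1 marks the most recent qualifying row per order/release,
           -- which is the row the original TOP 1 ... ORDER BY stamp DESC picks.
           ROW_NUMBER() OVER (PARTITION BY orderno, releaseno
                              ORDER BY stamp DESC) AS rn
    FROM x.dbo.Info
    WHERE status NOT LIKE 'backflus%'
      AND status NOT LIKE 'so%'
)
SELECT a.id,
       a.status,
       li.name    AS [lastworkplace],
       li.lstatus AS [laststatus]
FROM BI.dbo.tblz a
  LEFT JOIN latest_info li
    ON li.orderno = a.orderno
   AND li.releaseno = a.releaseno
   AND li.rn = 1          -- keep only the latest row; NULLs if there is none
WHERE a.id = 'p';
```

Because the join keeps at most one row per (orderno, releaseno), it should not change the number of rows returned for a. If several rows share the latest stamp, ROW_NUMBER picks one of them arbitrarily, just as TOP 1 does.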

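【Discussion】:

If we switch to a CTE, will that reduce the subqueries above? As you may have noticed, these subqueries are repetitive and all query the same base table. More than performance, I care about reducing the code, or at least bringing it in line with best practices.

To bring your statement in line with best practices: 1) avoid correlated subqueries (one subquery per row), 2) write the statement in a readable way, 3) check performance. Regarding point 1: it is better to prepare the data in an earlier CTE select and then join to it than to run a subquery for every row. Also, use window functions instead of SELECT TOP 1.

I tried to avoid the correlated subqueries and to convert them into joins, but I think I lack some of the expertise. Also, if I choose to use a CTE, how would I go about it? Could you give me an example based on the query I posted?

【Answer 2】: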

For subqueries of this form:

SELECT TOP 1
    [some column]
FROM x.dbo.Info
WHERE orderno = a.orderno
    AND releaseno = a.releaseno
    AND status NOT LIKE 'backflus%'
    AND status NOT LIKE 'so%'
ORDER BY stamp DESC

you can try OUTER APPLY. For example:

SELECT a.id,
   (
     SELECT TOP 1
      name
     FROM x.dbo.Info l
     WHERE orderno = a.orderno
       AND releaseno = a.releaseno
       AND stamp =
       (
       SELECT MIN(stamp)
       FROM x.dbo.Info
       WHERE orderno = l.orderno
         AND releaseno = l.releaseno
         AND status = 'Released'
       )
     ORDER BY stamp DESC
   ) [shop_name],
   c.line_no,
   a.status,
   d.family,
   SomeInfo.name AS [lastworkplace], --<-- Note the change
   SomeInfo.lstatus AS [laststatus] --<-- Note the change
FROM BI.dbo.tblz a -- this is a view (not sure if that matters)
  LEFT JOIN X.dbo.tblx b
    ON b.id = a.salesorder
  LEFT JOIN X.dbo.tbls c
    ON c.tranid = a.salesorder
     AND c.itemid = a.assemblyid
     AND c.serialnum = a.ordercode
  LEFT JOIN Z.dbo.tbli d
    ON d.prodline = LEFT(COALESCE(STUFF(a.assemblyid, CHARINDEX('+', a.assemblyid), 1, ''), a.assemblyid), 2)

    OUTER APPLY(  --<-- Note the extra join
        SELECT TOP 1
            *
        FROM x.dbo.Info
        WHERE orderno = a.orderno
            AND releaseno = a.releaseno
            AND status NOT LIKE 'backflus%'
            AND status NOT LIKE 'so%'
        ORDER BY stamp DESC
    ) AS SomeInfo
WHERE a.id = 'p'
  AND
  (
    LEFT(a.prun, 8) >= '20120101'
    OR a.prun IS NULL
  )
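The remaining shop_name subquery could be moved into a second OUTER APPLY in the same way. The sketch below is an assumption-laden illustration rather than part of the original answer: FirstReleased and LastInfo are hypothetical aliases, and the nested MIN(stamp) lookup is replaced by TOP 1 ... ORDER BY stamp ASC over the 'Released' rows. That returns the same name as long as no non-'Released' row shares the stamp of the earliest 'Released' row for the same order and release.

```sql
-- Sketch only: FirstReleased and LastInfo are illustrative aliases.
SELECT a.id,
       FirstReleased.name AS [shop_name],
       a.status,
       LastInfo.name      AS [lastworkplace],
       LastInfo.lstatus   AS [laststatus]
FROM BI.dbo.tblz a
  OUTER APPLY (
      -- Earliest 'Released' row for this order/release.
      SELECT TOP 1 i.name
      FROM x.dbo.Info i
      WHERE i.orderno = a.orderno
        AND i.releaseno = a.releaseno
        AND i.status = 'Released'
      ORDER BY i.stamp ASC
  ) AS FirstReleased
  OUTER APPLY (
      -- Latest row that is neither 'backflus%' nor 'so%'.
      SELECT TOP 1 i.name, i.lstatus
      FROM x.dbo.Info i
      WHERE i.orderno = a.orderno
        AND i.releaseno = a.releaseno
        AND i.status NOT LIKE 'backflus%'
        AND i.status NOT LIKE 'so%'
      ORDER BY i.stamp DESC
  ) AS LastInfo
WHERE a.id = 'p';
```

Selecting only the needed columns inside each APPLY, instead of *, keeps the intent visible and may let the optimizer use a narrower index.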

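【Discussion】:

Thanks, Alex! I will try this. I am also hoping to reduce the code so that we query the table only once and get all the columns derived from it, instead of querying it again for every column in the select list.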
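One way to read x.dbo.Info only once is to rank its rows with two window functions and collapse them to one row per (orderno, releaseno) with conditional aggregation. The sketch below also folds the two UNION ALL branches into a.id IN ('p', 'm'), as suggested in the comments on the question. It is untested and rests on assumptions: ranked, per_order, p, first_released_rn and last_rn are illustrative names; name and lstatus are assumed to be character types that MAX accepts; and shop_name is taken from the earliest 'Released' row, which matches the original only if no non-'Released' row shares that row's stamp.

```sql
-- Sketch only: ranked, per_order, p, first_released_rn and last_rn are illustrative names.
WITH ranked AS
(
    SELECT orderno, releaseno, name, lstatus, status,
           -- 'Released' rows sort first, earliest stamp on top.
           ROW_NUMBER() OVER (PARTITION BY orderno, releaseno
                              ORDER BY CASE WHEN status = 'Released' THEN 0 ELSE 1 END,
                                       stamp ASC) AS first_released_rn,
           -- Rows that are neither 'backflus%' nor 'so%' sort first, latest stamp on top.
           ROW_NUMBER() OVER (PARTITION BY orderno, releaseno
                              ORDER BY CASE WHEN status NOT LIKE 'backflus%'
                                             AND status NOT LIKE 'so%' THEN 0 ELSE 1 END,
                                       stamp DESC) AS last_rn
    FROM x.dbo.Info
),
per_order AS
(
    -- One row per order/release; each CASE yields NULL when no qualifying
    -- row exists, like the original scalar subqueries.
    SELECT orderno, releaseno,
           MAX(CASE WHEN first_released_rn = 1 AND status = 'Released'
                    THEN name END)    AS shop_name,
           MAX(CASE WHEN last_rn = 1 AND status NOT LIKE 'backflus%'
                     AND status NOT LIKE 'so%' THEN name END)    AS lastworkplace,
           MAX(CASE WHEN last_rn = 1 AND status NOT LIKE 'backflus%'
                     AND status NOT LIKE 'so%' THEN lstatus END) AS laststatus
    FROM ranked
    GROUP BY orderno, releaseno
)
SELECT a.id,
       p.shop_name     AS [shop_name],
       c.line_no,
       a.status,
       d.family,
       p.lastworkplace AS [lastworkplace],
       p.laststatus    AS [laststatus]
FROM BI.dbo.tblz a
  LEFT JOIN X.dbo.tblx b
    ON b.id = a.salesorder
  LEFT JOIN X.dbo.tbls c
    ON c.tranid = a.salesorder
   AND c.itemid = a.assemblyid
   AND c.serialnum = a.ordercode
  LEFT JOIN Z.dbo.tbli d
    ON d.prodline = LEFT(COALESCE(STUFF(a.assemblyid, CHARINDEX('+', a.assemblyid), 1, ''), a.assemblyid), 2)
  LEFT JOIN per_order p
    ON p.orderno = a.orderno
   AND p.releaseno = a.releaseno
WHERE a.id IN ('p', 'm')   -- replaces the two UNION ALL branches
  AND (LEFT(a.prun, 8) >= '20120101' OR a.prun IS NULL);
```

This version is shorter to maintain, but it ranks every row of x.dbo.Info before joining, so whether it is faster than the OUTER APPLY form depends on the data and the available indexes. Comparing the execution plans of both forms is the safest way to decide.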
