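A SQL function in the SELECT list makes the query very slow
Posted: 2021-10-27 13:46:24
Question:
I am running a SQL SELECT query. In the [E] column there is a stored function, [defaultDB].[dbo].HighFirst, that determines whether the "High" or the "Low" occurred first.
Unfortunately, this function makes the query extremely slow. If I skip the function (replace the call with 1), the query finishes in about 2 seconds. With the function, it takes more than 30 minutes. Is it possible to embed the check directly in the query instead of using a stored function?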
USE defaultDB;
IF OBJECT_ID (N'[defaultDB].[dbo].[HighFirst]', N'FN') IS NOT NULL
DROP FUNCTION HighFirst;
GO
CREATE FUNCTION HighFirst(@Date INT, @Stime INT, @ETime INT, @High Float, @Low Float)
RETURNS BIT
AS
BEGIN
Declare @result Bit
IF
(SELECT min([time]) FROM [defaultDB].[dbo].[Table2] WHERE [date] = @Date AND [Start] >= @Stime AND [Time]<= @etime and [H] > @high) <= (SELECT min([time]) FROM [defaultDB].[dbo].[Table2] WHERE [date] = @Date AND [Start] >= @Stime AND [Time]<= @etime and [L] < @Low)
Set @result = 1
ELSE
Set @result = 0;
Return @result
END;
Select
[Stime],
[Etime],
[target],
[CL],
[WinRate] = count(CASE WHEN [Max] > [target] THEN 1 END) / cast( count(*) as float),
[E] = AVG( Case
WHEN [Max] >= [target] AND [Min]< [CL] THEN IIF ([defaultDB].[dbo].HighFirst([date], [Stime], [Etime], [Target], [CL]) =1 , [Target] ,[CL])
WHEN [Max] >= [target] AND [Min] > [CL] THEN [target] /*HIT*/
WHEN [Max] < [target] AND [Min] < [CL] THEN [CL] /*CL*/
WHEN [Max] < [target] AND [Min] > [CL] THEN [EndReturn] /*both not */
END ) ,
[count] = count(*)
FROM [defaultDB].[dbo].[Table1] M
JOIN (values (0.003), (0.0035), (0.004), (0.0045), (0.005), (0.0055), (0.006), (0.0065), (0.007), (0.0075), (0.008) ) as T([target])
on 1 =1
JOIN (values (-0.003), (-0.0035), (-0.004), (-0.0045), (-0.005), (-0.0055), (-0.006), (-0.0065), (-0.007), (-0.0075), (-0.008), (-0.0085), (-0.009), (-0.0095), (-0.01), (-0.0105), (-0.011), (-0.0115) ) as C([CL])
on [M].[date] in ( 20120307,20120601,20121109,20130826,20131002,20140117,20140122,20140311,20140529,20140718,20150619,20151014,20151022,20160411,20160516,20160721,20160818,20160909,20170127,20170213,20170921,20171025,20171229,20180116,20180315,20180926,20181022,20181128,20181211,20190104,20190329,20190502,20190521,20190528,20190611,20190627,20190823,20190930,20191104,20191211,20200214,20200318,20200529,20200706,20200828,20201230,20210112,20210305,20210318,20210408,20210525,20210617,20210625)
AND [Stime] >= 133000
Group By [Stime], [Etime], [Target], [CL]
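Comments:
If you turn your function into an inline table-valued function and reference it in the FROM clause (via APPLY), I suspect performance will be much better.
Side note: JOIN ... ON 1=1 is silly; just write CROSS JOIN. I also hope Table1.date is not an int column, because that would be odd.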
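To illustrate the side note above, here is a minimal sketch of the same row sources written with CROSS JOIN, with the row filter moved out of the ON clause into WHERE. The VALUES and date lists are abbreviated for readability, and the selected columns are chosen only for illustration; for inner joins the filter returns the same rows in either place.

```sql
-- Sketch only: CROSS JOIN instead of JOIN ... ON 1 = 1.
-- The VALUES lists and the date list are abbreviated from the question.
SELECT M.[date], M.[Stime], M.[Etime], T.[target], C.[CL]
FROM [defaultDB].[dbo].[Table1] AS M
CROSS JOIN (VALUES (0.003), (0.0035), (0.004)) AS T([target])
CROSS JOIN (VALUES (-0.003), (-0.0035), (-0.004)) AS C([CL])
-- Row filters read more clearly in WHERE than inside a join's ON clause.
WHERE M.[date] IN (20120307, 20120601, 20121109)
  AND M.[Stime] >= 133000;
```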
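Answer 1:
I see two ways to optimize this.
The first, which needs only small changes, is to convert Table2 into a memory-optimized table and then turn HighFirst into a natively compiled function.
The other is to use OUTER APPLY instead of the function: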
SELECT IIF (TargetTime.MinTime <= CLTime.MinTime , [Target] ,[CL])
FROM [defaultDB].[dbo].[Table1] AS M
JOIN(VALUES(0.003), (0.0035), (0.004), (0.0045), (0.005), (0.0055), (0.006), (0.0065), (0.007), (0.0075), (0.008)) AS T([target])
ON 1 = 1
JOIN(VALUES(-0.003), (-0.0035), (-0.004), (-0.0045), (-0.005), (-0.0055), (-0.006), (-0.0065), (-0.007), (-0.0075), (-0.008), (-0.0085), (-0.009), (-0.0095), (-0.01), (-0.0105), (-0.011), (-0.0115)) AS C([CL])
ON 1 = 1
OUTER APPLY
(
SELECT MIN([time]) AS MinTime
FROM [defaultDB].[dbo].[Table2]
WHERE [date] = M.[DATE]
AND [Start] >= M.Stime
AND [Time] <= M.etime
AND [H] > T.target
) as TargetTime
OUTER APPLY
(
SELECT MIN([time]) AS MinTime
FROM [defaultDB].[dbo].[Table2]
WHERE [date] = M.[DATE]
AND [Start] >= M.Stime
AND [Time] <= M.etime
AND L < C.CL
) as CLTime
WHERE [M].[date] IN(20120307, 20120601, 20121109, 20130826, 20131002, 20140117, 20140122, 20140311, 20140529, 20140718, 20150619, 20151014, 20151022, 20160411, 20160516, 20160721, 20160818, 20160909, 20170127, 20170213, 20170921, 20171025, 20171229, 20180116, 20180315, 20180926, 20181022, 20181128, 20181211, 20190104, 20190329, 20190502, 20190521, 20190528, 20190611, 20190627, 20190823, 20190930, 20191104, 20191211, 20200214, 20200318, 20200529, 20200706, 20200828, 20201230, 20210112, 20210305, 20210318, 20210408, 20210525, 20210617, 20210625)
AND [Stime] >= 133000;
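Comments:
Thank you!!! Amazing!!! It used to take more than an hour and a half. Now it takes 1 minute 28 seconds!!! Is there any way to make it even faster?
After applying your suggestion, the query is about 10 times faster, but it is still not fast enough. So I tried running it on a GPU with BlazingSQL. Unfortunately, it does not support the APPLY operator. How should I modify the SQL? Should I use a join?
Answer 2:
Given how the function is used, you could redefine it to return the value rather than a flag, so the IIF check is no longer needed. You can do that with conditional aggregation, so that [defaultDB].[dbo].[Table2] is scanned only once: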
CREATE FUNCTION HighFirstVal(@Date INT, @Stime INT, @ETime INT, @High Float, @Low Float)
RETURNS float
AS
BEGIN
RETURN (SELECT case when
min(case when [H] > @high then [time] end) <= min(case when [L] < @Low then [time] end)
then @high else @low end
FROM [defaultDB].[dbo].[Table2]
WHERE [date] = @Date AND [Start] >= @Stime AND [Time]<= @etime)
END
Alternatively, you can use the subquery from the RETURN statement directly in the query, after replacing the parameters with columns.
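For an engine that lacks APPLY (the case raised in the comment on Answer 1), one possible way to fold the same conditional aggregation into the query is a plain join followed by two levels of GROUP BY. This is a sketch, not a tested drop-in: the FirstHit column and the x alias are names introduced here, the VALUES and date lists are abbreviated, and it assumes the columns in the inner GROUP BY identify a single Table1 row. The syntax is T-SQL and may need adapting for another engine, and whether it is faster depends on the data, since the join repeats every matching Table2 row once per target/CL pair.

```sql
-- Sketch only: APPLY-free rewrite using a join plus conditional aggregation.
-- ASSUMPTION: the inner GROUP BY columns identify one row of Table1; if they
-- do not, add a key column of Table1 to the inner SELECT and GROUP BY.
SELECT
    x.[Stime], x.[Etime], x.[target], x.[CL],
    [E] = AVG(CASE
              WHEN x.[Max] >= x.[target] AND x.[Min] <  x.[CL] THEN x.[FirstHit]
              WHEN x.[Max] >= x.[target] AND x.[Min] >  x.[CL] THEN x.[target]    /* HIT */
              WHEN x.[Max] <  x.[target] AND x.[Min] <  x.[CL] THEN x.[CL]        /* CL */
              WHEN x.[Max] <  x.[target] AND x.[Min] >  x.[CL] THEN x.[EndReturn] /* neither */
          END),
    [count] = COUNT(*)
FROM (
    -- Inner level: one row per (Table1 row, target, CL) with the first-hit value.
    SELECT M.[date], M.[Stime], M.[Etime], M.[Max], M.[Min], M.[EndReturn],
           T.[target], C.[CL],
           [FirstHit] = CASE WHEN MIN(CASE WHEN t2.[H] > T.[target] THEN t2.[time] END)
                               <= MIN(CASE WHEN t2.[L] < C.[CL] THEN t2.[time] END)
                             THEN T.[target] ELSE C.[CL] END
    FROM [defaultDB].[dbo].[Table1] AS M
    CROSS JOIN (VALUES (0.003), (0.0035)) AS T([target])   -- abbreviated list
    CROSS JOIN (VALUES (-0.003), (-0.0035)) AS C([CL])     -- abbreviated list
    -- LEFT JOIN keeps Table1 rows that have no matching Table2 rows.
    LEFT JOIN [defaultDB].[dbo].[Table2] AS t2
           ON t2.[date] = M.[date]
          AND t2.[Start] >= M.[Stime]
          AND t2.[time] <= M.[Etime]
    WHERE M.[date] IN (20120307, 20120601)                 -- abbreviated list
      AND M.[Stime] >= 133000
    GROUP BY M.[date], M.[Stime], M.[Etime], M.[Max], M.[Min], M.[EndReturn],
             T.[target], C.[CL]
) AS x
GROUP BY x.[Stime], x.[Etime], x.[target], x.[CL];
```

The inner level is needed because T-SQL does not allow an aggregate such as AVG to wrap an expression that itself contains an aggregate or a subquery.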
Answer 3:
You can convert this function into an inline table-valued function, which will probably perform better.
CREATE FUNCTION dbo.HighFirst
(@Date INT, @Stime INT, @ETime INT, @High Float, @Low Float)
RETURNS TABLE
AS RETURN
SELECT Result =
-- High first (or a tie) returns @High, matching the <= test in the original scalar function
CASE WHEN MIN(CASE WHEN t2.H > @high THEN t2.[time] END) <=
MIN(CASE WHEN t2.L < @Low THEN t2.[time] END)
THEN @High ELSE @Low END
FROM dbo.Table2 t2
WHERE t2.[date] = @Date
AND t2.Start >= @Stime
AND t2.[Time] <= @etime;
GO
You can then use it with APPLY, like this:
Select
[Stime],
[Etime],
[target],
[CL],
[WinRate] = count(CASE WHEN [Max] > [target] THEN 1 END) / cast( count(*) as float),
[E] = AVG( Case
WHEN [Max] >= [target] AND [Min]< [CL] THEN h.Result
WHEN [Max] >= [target] AND [Min] > [CL] THEN [target] /*HIT*/
WHEN [Max] < [target] AND [Min] < [CL] THEN [CL] /*CL*/
WHEN [Max] < [target] AND [Min] > [CL] THEN [EndReturn] /*both not */
END ) ,
[count] = count(*)
FROM [defaultDB].[dbo].[Table1] M
CROSS JOIN (values (0.003), (0.0035), (0.004), (0.0045), (0.005), (0.0055), (0.006), (0.0065), (0.007), (0.0075), (0.008) ) as T([target])
CROSS JOIN (values (-0.003), (-0.0035), (-0.004), (-0.0045), (-0.005), (-0.0055), (-0.006), (-0.0065), (-0.007), (-0.0075), (-0.008), (-0.0085), (-0.009), (-0.0095), (-0.01), (-0.0105), (-0.011), (-0.0115) ) as C([CL])
-- APPLY can only reference row sources to its left, so it must come after T and C
CROSS APPLY [defaultDB].[dbo].HighFirst(M.[date], M.[Stime], M.[Etime], T.[target], C.[CL]) h
WHERE [M].[date] in ( 20120307,20120601,20121109,20130826,20131002,20140117,20140122,20140311,20140529,20140718,20150619,20151014,20151022,20160411,20160516,20160721,20160818,20160909,20170127,20170213,20170921,20171025,20171229,20180116,20180315,20180926,20181022,20181128,20181211,20190104,20190329,20190502,20190521,20190528,20190611,20190627,20190823,20190930,20191104,20191211,20200214,20200318,20200529,20200706,20200828,20201230,20210112,20210305,20210318,20210408,20210525,20210617,20210625)
AND [Stime] >= 133000
Group By [Stime], [Etime], [Target], [CL]
Comments:
I believe the bottleneck is the function itself. I tried running the same SQL both directly and wrapped in a function; the direct version is much faster.