Best way to interpolate values in SQL
I have a table with a rate at specific dates:
Rates
Id | Date | Rate
----+---------------+-------
1 | 01/01/2011 | 4.5
2 | 01/04/2011 | 3.2
3 | 04/06/2011 | 2.4
4 | 30/06/2011 | 5
I would like to output a rate based on simple linear interpolation.
So, if I enter 17/06/2011:
Date Rate
---------- -----
01/01/2011 4.5
01/04/2011 3.2
04/06/2011 2.4
17/06/2011
30/06/2011 5.0
The linear interpolation is (5 + 2.4) / 2 = 3.7
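Written out as a day-weighted average, the same example looks as follows. The symbols d_0, d_1 (neighbouring dates) and r_0, r_1 (their rates) are introduced here only for illustration. The plain mean works in this case because 17/06/2011 lies exactly halfway between its neighbours, 13 days from each:

```latex
r(d) = \frac{(d - d_0)\, r_1 + (d_1 - d)\, r_0}{d_1 - d_0}
     = \frac{13 \cdot 5 + 13 \cdot 2.4}{26}
     = 3.7
```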
Is there a way to do this with a simple query (SQL Server 2005), or does this kind of thing need to be done programmatically (C#...)?
Something like this (corrected):
SELECT CASE WHEN next.Date IS NULL THEN prev.Rate
            WHEN prev.Date IS NULL THEN next.Rate
            WHEN next.Date = prev.Date THEN prev.Rate
            ELSE ( DATEDIFF(d, prev.Date, @InputDate) * next.Rate
                 + DATEDIFF(d, @InputDate, next.Date) * prev.Rate
                 ) / DATEDIFF(d, prev.Date, next.Date)
       END AS interpolationRate
FROM
  ( SELECT TOP 1
        Date, Rate
    FROM Rates
    WHERE Date <= @InputDate
    ORDER BY Date DESC
  ) AS prev
  CROSS JOIN
  ( SELECT TOP 1
        Date, Rate
    FROM Rates
    WHERE Date >= @InputDate
    ORDER BY Date ASC
  ) AS next
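As a cross-check of the arithmetic, the query's logic can be sketched outside the database. This is only an illustrative Python sketch, not part of the answer: the names RATES and interpolate_rate are made up here, and the data is copied from the question's table.

```python
from datetime import date

# Sample rows copied from the Rates table in the question (dd/mm/yyyy there).
RATES = [
    (date(2011, 1, 1), 4.5),
    (date(2011, 4, 1), 3.2),
    (date(2011, 6, 4), 2.4),
    (date(2011, 6, 30), 5.0),
]


def interpolate_rate(rates, input_date):
    """Mirror the query: find prev/next rows, then weight each rate by the
    distance of input_date to the *other* neighbour."""
    # prev: latest row on or before input_date; nxt: earliest row on or after it.
    prev = max((r for r in rates if r[0] <= input_date),
               key=lambda r: r[0], default=None)
    nxt = min((r for r in rates if r[0] >= input_date),
              key=lambda r: r[0], default=None)
    if prev is None or nxt is None:
        # With CROSS JOIN an empty side means no row is returned at all.
        return None
    if prev[0] == nxt[0]:
        # input_date hits a stored date exactly.
        return prev[1]
    days_from_prev = (input_date - prev[0]).days
    days_to_next = (nxt[0] - input_date).days
    total_days = (nxt[0] - prev[0]).days
    return (days_from_prev * nxt[1] + days_to_next * prev[1]) / total_days


print(round(interpolate_rate(RATES, date(2011, 6, 17)), 2))  # 3.7
```

For 17/06/2011 both neighbours are 13 days away, so the weighted average reduces to the plain mean from the question.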
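The catch with CROSS JOIN here is that it returns no records at all if either derived table has no rows (1 * 0 = 0), so the query may break. A better way is to use FULL OUTER JOIN with an inequality condition (to avoid getting multiple rows):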
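-- same CASE expression as in the query above; only the join changes
SELECT CASE WHEN next.Date IS NULL THEN prev.Rate
            WHEN prev.Date IS NULL THEN next.Rate
            WHEN next.Date = prev.Date THEN prev.Rate
            ELSE ( DATEDIFF(d, prev.Date, @InputDate) * next.Rate
                 + DATEDIFF(d, @InputDate, next.Date) * prev.Rate
                 ) / DATEDIFF(d, prev.Date, next.Date)
       END AS interpolationRate
FROM
  ( SELECT TOP 1
        Date, Rate
    FROM Rates
    WHERE Date <= @InputDate
    ORDER BY Date DESC
  ) AS prev
  FULL OUTER JOIN
  ( SELECT TOP 1
        Date, Rate
    FROM Rates
    WHERE Date >= @InputDate
    ORDER BY Date ASC
  ) AS next
  ON (prev.Date <> next.Date)  -- or compare Rate, depending on what is unique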
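The difference in row counts between the two joins can be illustrated with a small Python sketch. It only models the join semantics for the zero-or-one-row inputs used here; the helper names cross_join, full_outer_join and different_dates are hypothetical, and None stands in for SQL NULL.

```python
def cross_join(left, right):
    """Every pairing of a left row with a right row: len(left) * len(right) rows."""
    return [(l, r) for l in left for r in right]


def full_outer_join(left, right, on):
    """Matching pairs, plus every unmatched row padded with None (SQL NULL)."""
    pairs = [(l, r) for l in left for r in right if on(l, r)]
    matched_left = [l for l, _ in pairs]
    matched_right = [r for _, r in pairs]
    pairs += [(l, None) for l in left if l not in matched_left]
    pairs += [(None, r) for r in right if r not in matched_right]
    return pairs


def different_dates(prev_row, next_row):
    """Stand-in for the join condition prev.Date <> next.Date."""
    return prev_row[0] != next_row[0]


# Input date before the first stored date: prev is empty, next has one row.
prev_rows, next_rows = [], [("01/01/2011", 4.5)]
print(len(cross_join(prev_rows, next_rows)))                    # 0
print(full_outer_join(prev_rows, next_rows, different_dates))   # [(None, ('01/01/2011', 4.5))]
```

In this model, an input date that matches a stored date exactly makes both sides return the same row, the inequality never matches, and two padded rows come back; both resolve to the same rate through the first two CASE branches.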
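As @Mark has already pointed out, the CROSS JOIN has its limitations: as soon as the target value falls outside the range of defined values, no records are returned.
Also, the above solution is limited to a single result. For my project I needed to interpolate a whole list of x values and came up with the following solution. Maybe it is of interest to other readers too?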
-- generate some grid data values in table #ddd
-- (multi-row VALUES lists need SQL Server 2008 or later):
CREATE TABLE #ddd (id int,x float,y float, PRIMARY KEY(id,x));
INSERT INTO #ddd VALUES (1,3,4),(1,4,5),(1,6,3),(1,10,2),
(2,1,4),(2,5,6),(2,6,5),(2,8,2);
SELECT * FROM #ddd;
-- target x-values in table #vals (results are to go into column yy):
CREATE TABLE #vals (xx float PRIMARY KEY,yy float null, itype int);
INSERT INTO #vals (xx) VALUES (1),(3),(4.3),(9),(12);
-- do the actual interpolation
WITH valstyp AS (
  SELECT id ii,xx,
         CASE WHEN min(x)<xx THEN CASE WHEN max(x)>xx THEN 1 ELSE 2 END ELSE 0 END flag,
         min(x) xmi,max(x) xma
  FROM #vals INNER JOIN #ddd ON id=1 GROUP BY xx,id
), ipol AS (
  SELECT v.*,(b.x-xx)/(b.x-a.x) f,a.y ya,b.y yb
  FROM valstyp v
  -- a: left neighbour. For flag=1 use x<=xx (rather than x<xx) so that a target
  -- equal to an interior grid point returns that point's own y value
  INNER JOIN #ddd a ON a.id=ii AND a.x=(SELECT max(x) FROM #ddd WHERE id=ii
             AND (flag=0 AND x=xmi OR flag=1 AND x<=xx OR flag=2 AND x<xma))
  -- b: right neighbour
  INNER JOIN #ddd b ON b.id=ii AND b.x=(SELECT min(x) FROM #ddd WHERE id=ii
             AND (flag=0 AND x>xmi OR flag=1 AND x>xx OR flag=2 AND x=xma))
)
UPDATE v SET yy=ROUND(f*ya+(1-f)*yb,8),itype=flag FROM #vals v INNER JOIN ipol i ON i.xx=v.xx;
-- list the interpolated results table:
SELECT * FROM #vals
When you run the above script you will get the following grid points in table #ddd:
id x y
-- -- -
1 3 4
1 4 5
1 6 3
1 10 2
2 1 4
2 5 6
2 6 5
2 8 2
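[[ The table contains grid points for two identities (id=1 and id=2). In my example I referenced only the 1-group, through the id=1 condition in the valstyp CTE. This can be changed according to your requirements. ]]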
and the results table #vals with the interpolated data in column yy:
xx yy itype
--- ---- -----
1 2 0
3 4 0
4.3 4.7 1
9 2.25 1
12 1.5 2
The last column, itype, indicates the kind of interpolation/extrapolation that was used to calculate the value:
0: extrapolation to lower end
1: interpolation within given data range
2: extrapolation to higher end
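To check the numbers in the results table independently of SQL Server, the same piecewise-linear rule can be sketched in Python. This is an illustrative sketch only: the names interpolate_all and GRID_ID1 are invented here, it assumes at least two grid points, and a target that coincides with an interior grid point takes that point as its left neighbour.

```python
def interpolate_all(points, targets):
    """Return {x: (y, itype)} for each target x.

    itype: 0 = extrapolated at the lower end, 1 = interpolated inside the
    data range, 2 = extrapolated at the upper end.
    Assumes `points` holds at least two (x, y) pairs with distinct x.
    """
    pts = sorted(points)
    results = {}
    for x in targets:
        if x <= pts[0][0]:
            itype, a, b = 0, pts[0], pts[1]      # use the first segment
        elif x >= pts[-1][0]:
            itype, a, b = 2, pts[-2], pts[-1]    # use the last segment
        else:
            itype = 1
            # index of the largest grid x that is <= the target
            i = max(k for k, p in enumerate(pts) if p[0] <= x)
            a, b = pts[i], pts[i + 1]
        f = (b[0] - x) / (b[0] - a[0])           # weight of the left point
        results[x] = (round(f * a[1] + (1 - f) * b[1], 8), itype)
    return results


# Grid points for id=1, copied from table #ddd.
GRID_ID1 = [(3, 4), (4, 5), (6, 3), (10, 2)]

for x, (y, itype) in interpolate_all(GRID_ID1, [1, 3, 4.3, 9, 12]).items():
    print(x, y, itype)
```

For the five target values this should agree with the yy and itype columns listed above, up to floating-point rounding.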