Making SQL query more efficient
【Title】: Making SQL query more efficient 【Posted】: 2011-06-14 01:10:53 【Question】: How can I make this SQL query more efficient?
SELECT
(SELECT COUNT(*) FROM table WHERE price < 10) AS priceUnder10,
(SELECT COUNT(*) FROM table WHERE price BETWEEN 10 AND 20) AS price10to20,
(SELECT COUNT(*) FROM table WHERE price > 20) AS priceOver20,
(SELECT COUNT(*) FROM table WHERE colour = 'Red') AS colourRed,
(SELECT COUNT(*) FROM table WHERE colour = 'Green') AS colourGreen,
(SELECT COUNT(*) FROM table WHERE colour = 'Blue') AS colourBlue;
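I already have indexes on the price and colour columns, so I am looking for a better way to aggregate the data.
I have looked into using GROUP BY, HAVING, self joins and window functions, but cannot work out how to get the same result.
Any suggestions are much appreciated.
【Question comments】: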
【Answer 1】:
SELECT
COUNT(CASE WHEN price < 10 THEN 1 END) AS priceUnder10,
COUNT(CASE WHEN price BETWEEN 10 AND 20 THEN 1 END) AS price10to20,
COUNT(CASE WHEN price > 20 THEN 1 END) AS priceOver20,
COUNT(CASE WHEN colour = 'Red' THEN 1 END) AS colourRed,
COUNT(CASE WHEN colour = 'Green' THEN 1 END) AS colourGreen,
COUNT(CASE WHEN colour = 'Blue' THEN 1 END) AS colourBlue
from YourTable
WHERE price IS NOT NULL OR colour IN ('Red','Green','Blue' )
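As a rough check of the query above, here is a minimal sketch using Python's built-in sqlite3 module. The table name products and the sample rows are assumptions made up for illustration; they are not from the question. It compares the original six subqueries with the single-pass COUNT(CASE ...) form, with and without the WHERE clause.

```python
import sqlite3

# Hypothetical sample data; "products" stands in for the real table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (price REAL, colour TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(5, "Red"), (10, "Green"), (15, "Blue"), (20, "Red"),
     (25, "Green"), (None, "Yellow"), (None, "Red"), (30, "Yellow")],
)

# Original approach: one scalar subquery per count.
subqueries = """
SELECT
  (SELECT COUNT(*) FROM products WHERE price < 10),
  (SELECT COUNT(*) FROM products WHERE price BETWEEN 10 AND 20),
  (SELECT COUNT(*) FROM products WHERE price > 20),
  (SELECT COUNT(*) FROM products WHERE colour = 'Red'),
  (SELECT COUNT(*) FROM products WHERE colour = 'Green'),
  (SELECT COUNT(*) FROM products WHERE colour = 'Blue')
"""

# Conditional aggregation: COUNT ignores the NULL that CASE yields
# when no WHEN branch matches, so each column counts only its own rows.
single_pass = """
SELECT
  COUNT(CASE WHEN price < 10 THEN 1 END),
  COUNT(CASE WHEN price BETWEEN 10 AND 20 THEN 1 END),
  COUNT(CASE WHEN price > 20 THEN 1 END),
  COUNT(CASE WHEN colour = 'Red' THEN 1 END),
  COUNT(CASE WHEN colour = 'Green' THEN 1 END),
  COUNT(CASE WHEN colour = 'Blue' THEN 1 END)
FROM products
"""

# Optional filter: it only drops rows that cannot match any CASE branch.
where_clause = " WHERE price IS NOT NULL OR colour IN ('Red','Green','Blue')"

baseline = conn.execute(subqueries).fetchone()
unfiltered = conn.execute(single_pass).fetchone()
filtered = conn.execute(single_pass + where_clause).fetchone()

print(baseline, unfiltered, filtered)
```

On this sample, all three queries return the same six counts. The only row the WHERE clause removes is the one with a NULL price and the colour Yellow, which no column would have counted anyway.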
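【Comments】:
Thanks, great answer, but why is the WHERE necessary?
@gjb - Quite possibly it isn't, but imagine you have a 1,000,000-row table in which only 1 row has a non-null price or a colour in 'Red','Green','Blue'. The other 999,999 rows would not contribute to the result, but without the WHERE they would still be scanned. Of course, you may well know that your data does not have this distribution (price might not even be nullable), in which case you can simply remove it!
【Answer 2】: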
Depending on how your database treats boolean expressions, this:
select sum(price<10),sum(price between 10 and 20)... from tab;
or this:
select sum(case when price<10 then 1 else 0 end),sum(case when price between 10 and 20 then 1 else 0 end)... from tab;
may help.
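To illustrate the two forms, here is a small sketch run against SQLite through Python's sqlite3 module. The sample prices are made up, and the third column (price > 20) is added here only to round out the example. SQLite evaluates a comparison to 0 or 1, so SUM over the bare condition works there; engines with a strict boolean type (PostgreSQL, for example) will not sum a boolean directly, which is where the CASE form is the safer choice.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (price REAL)")
conn.executemany(
    "INSERT INTO tab VALUES (?)",
    [(5,), (10,), (15,), (20,), (25,), (None,)],
)

# Form 1: sum the boolean expression itself (relies on 0/1 comparisons).
bool_sum = conn.execute(
    "SELECT SUM(price < 10), SUM(price BETWEEN 10 AND 20), SUM(price > 20) "
    "FROM tab"
).fetchone()

# Form 2: spell out the 1/0 mapping with CASE, which is portable SQL.
case_sum = conn.execute(
    "SELECT SUM(CASE WHEN price < 10 THEN 1 ELSE 0 END), "
    "SUM(CASE WHEN price BETWEEN 10 AND 20 THEN 1 ELSE 0 END), "
    "SUM(CASE WHEN price > 20 THEN 1 ELSE 0 END) "
    "FROM tab"
).fetchone()

print(bool_sum, case_sum)
```

The row with a NULL price does not count in either form: the comparison yields NULL, which SUM skips in the first form and which falls through to ELSE 0 in the second.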
【Comments】:
【Answer 3】:
SELECT count(*) as products,
if(price < 10, 'price band 1',
if (price between 10 and 20, 'price band 2',
'price band 3'
)
) as priceband,
t.colour
from table t
group by t.colour, priceband
This will give you
products  colour  priceband
53        red     price band 1
65        red     price band 2
12        blue    price band 1
23        blue    price band 2
and so on
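A runnable sketch of the same grouping idea, using Python's sqlite3 module with invented sample rows and an assumed table name products. The nested if(...) calls are swapped for a standard CASE expression here, since if() as a function is not available in every database.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (price REAL, colour TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(5, "red"), (7, "red"), (12, "red"),
     (8, "blue"), (15, "blue"), (25, "blue")],
)

# One output row per (colour, price band) combination.
# Note: ELSE also catches NULL prices, as the nested if() version would.
rows = conn.execute("""
SELECT COUNT(*) AS products,
       CASE WHEN price < 10 THEN 'price band 1'
            WHEN price BETWEEN 10 AND 20 THEN 'price band 2'
            ELSE 'price band 3'
       END AS priceband,
       colour
FROM products
GROUP BY colour, priceband
ORDER BY colour, priceband
""").fetchall()

for row in rows:
    print(row)
```

Unlike the single-row result the question asks for, this returns one row per combination of colour and band, so the counts are split by both dimensions at once.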
【Comments】:
This nests the results, which isn't what I am looking for. Still, it is useful to know, so thanks.