mysql有窗口函数吗
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了mysql有窗口函数吗相关的知识,希望对你有一定的参考价值。
开窗函数就有点麻烦了。源代码:select t.*, ROW_NUMBER() OVER( PARTITION BY t.ticket_type ORDER BY t.open_ticket_time DESC ) as rn
from
term t
where
t.ticket_type in(
6, 5, 0, 1, **, *, **, **, *2, 2, 3, 4, 8, 7
)
and t.term_status >= 5
这个代码的意思是根据open_ticket_time倒序的顺序将ticket_type 是( 6, 5, 0, 1, **, *, **, **, *2, 2, 3, 4, 8, 7)还有term_status >= 5的数据对ticket_type 进行分组后排名ROW_NUMBER() 。 参考技术A
一直以来,mysql 只有针对聚合函数的汇总类功能,比如MAX, AVG 等,没有从 SQL 层针对聚合类每组展开处理的功能。不过 MySQL 开放了 UDF 接口,可以用 C 来自己写UDF,这个就增加了功能行难度。
这种针对每组展开处理的功能就叫窗口函数,有的数据库叫分析函数。
在 MySQL 8.0 之前,我们想要得到这样的结果,就得用以下几种方法来实现:
1. session 变量
2. group_concat 函数组合
3. 自己写 store routines
接下来我们用经典的 学生/课程/成绩 来做窗口函数演示
准备
学生表
mysql> show create table student \\G*************************** 1. row *************************** Table: studentCreate Table: CREATE TABLE student ( sid int(10) unsigned NOT NULL, sname varchar(64) DEFAULT NULL, PRIMARY KEY (sid)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)
课程表
mysql> show create table course\\G*************************** 1. row *************************** Table: courseCreate Table: CREATE TABLE `course` ( `cid` int(10) unsigned NOT NULL, `cname` varchar(64) DEFAULT NULL, PRIMARY KEY (`cid`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)
成绩表
mysql> show create table score\\G*************************** 1. row *************************** Table: scoreCreate Table: CREATE TABLE `score` ( `sid` int(10) unsigned NOT NULL, `cid` int(10) unsigned NOT NULL, `score` tinyint(3) unsigned DEFAULT NULL, PRIMARY KEY (`sid`,`cid`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)
测试数据
mysql> select * from student;
+-----------+--------------+
| sid | sname |
+-----------+--------------+
| 201910001 | 张三 |
| 201910002 | 李四 |
| 201910003 | 武松 |
| 201910004 | 潘金莲 |
| 201910005 | 菠菜 |
| 201910006 | 杨发财 |
| 201910007 | 欧阳修 |
| 201910008 | 郭靖 |
| 201910009 | 黄蓉 |
| 201910010 | 东方不败 |
+-----------+--------------+
10 rows in set (0.00 sec)
mysql> select * from score;;
+-----------+----------+-------+
| sid | cid | score |
+-----------+----------+-------+
| 201910001 | 20192001 | 50 |
| 201910001 | 20192002 | 88 |
| 201910001 | 20192003 | 54 |
| 201910001 | 20192004 | 43 |
| 201910001 | 20192005 | 89 |
| 201910002 | 20192001 | 79 |
| 201910002 | 20192002 | 97 |
| 201910002 | 20192003 | 82 |
| 201910002 | 20192004 | 85 |
| 201910002 | 20192005 | 80 |
| 201910003 | 20192001 | 48 |
| 201910003 | 20192002 | 98 |
| 201910003 | 20192003 | 47 |
| 201910003 | 20192004 | 41 |
| 201910003 | 20192005 | 34 |
| 201910004 | 20192001 | 81 |
| 201910004 | 20192002 | 69 |
| 201910004 | 20192003 | 67 |
| 201910004 | 20192004 | 99 |
| 201910004 | 20192005 | 61 |
| 201910005 | 20192001 | 40 |
| 201910005 | 20192002 | 52 |
| 201910005 | 20192003 | 39 |
| 201910005 | 20192004 | 74 |
| 201910005 | 20192005 | 86 |
| 201910006 | 20192001 | 42 |
| 201910006 | 20192002 | 52 |
| 201910006 | 20192003 | 36 |
| 201910006 | 20192004 | 58 |
| 201910006 | 20192005 | 84 |
| 201910007 | 20192001 | 79 |
| 201910007 | 20192002 | 43 |
| 201910007 | 20192003 | 79 |
| 201910007 | 20192004 | 98 |
| 201910007 | 20192005 | 88 |
| 201910008 | 20192001 | 45 |
| 201910008 | 20192002 | 65 |
| 201910008 | 20192003 | 90 |
| 201910008 | 20192004 | 89 |
| 201910008 | 20192005 | 74 |
| 201910009 | 20192001 | 73 |
| 201910009 | 20192002 | 42 |
| 201910009 | 20192003 | 95 |
| 201910009 | 20192004 | 46 |
| 201910009 | 20192005 | 45 |
| 201910010 | 20192001 | 58 |
| 201910010 | 20192002 | 52 |
| 201910010 | 20192003 | 55 |
| 201910010 | 20192004 | 87 |
| 201910010 | 20192005 | 36 |
+-----------+----------+-------+
50 rows in set (0.00 sec)
mysql> select * from course;
+----------+------------+
| cid | cname |
+----------+------------+
| 20192001 | mysql |
| 20192002 | oracle |
| 20192003 | postgresql |
| 20192004 | mongodb |
| 20192005 | dble |
+----------+------------+
5 rows in set (0.00 sec)
MySQL 8.0 之前
比如我们求成绩排名前三的学生排名,我来举个用 session 变量和 group_concat 函数来分别实现的例子:
session 变量方式
每组开始赋一个初始值序号和初始分组字段。
SELECT
b.cname,
a.sname,
c.score, c.ranking_score
FROM
student a,
course b,
(
SELECT
c.*,
IF(
@cid = c.cid,
@rn := @rn + 1,
@rn := 1
) AS ranking_score,
@cid := c.cid AS tmpcid
FROM
(
SELECT
*
FROM
score
ORDER BY cid,
score DESC
) c,
(
SELECT
@rn := 0 rn,
@cid := ''
) initialize_table
) c
WHERE a.sid = c.sid
AND b.cid = c.cid
AND c.ranking_score <= 3
ORDER BY b.cname,c.ranking_score;
+------------+-----------+-------+---------------+
| cname | sname | score | ranking_score |
+------------+-----------+-------+---------------+
| dble | 张三 | 89 | 1 |
| dble | 欧阳修 | 88 | 2 |
| dble | 菠菜 | 86 | 3 |
| mongodb | 潘金莲 | 99 | 1 |
| mongodb | 欧阳修 | 98 | 2 |
| mongodb | 郭靖 | 89 | 3 |
| mysql | 李四 | 100 | 1 |
| mysql | 潘金莲 | 81 | 2 |
| mysql | 欧阳修 | 79 | 3 |
| oracle | 武松 | 98 | 1 |
| oracle | 李四 | 97 | 2 |
| oracle | 张三 | 88 | 3 |
| postgresql | 黄蓉 | 95 | 1 |
| postgresql | 郭靖 | 90 | 2 |
| postgresql | 李四 | 82 | 3 |
+------------+-----------+-------+---------------+
15 rows in set, 5 warnings (0.01 sec)
group_concat 函数方式
利用 findinset 内置函数来返回下标作为序号使用。
SELECT
*
FROM
(
SELECT
b.cname,
a.sname,
c.score,
FIND_IN_SET(c.score, d.gp) score_ranking
FROM
student a,
course b,
score c,
(
SELECT
cid,
GROUP_CONCAT(
score
ORDER BY score DESC SEPARATOR ','
) gp
FROM
score
GROUP BY cid
ORDER BY score DESC
) d
WHERE a.sid = c.sid
AND b.cid = c.cid
AND c.cid = d.cid
ORDER BY d.cid,
score_ranking
) ytt
WHERE score_ranking <= 3;
+------------+-----------+-------+---------------+
| cname | sname | score | score_ranking |
+------------+-----------+-------+---------------+
| dble | 张三 | 89 | 1 |
| dble | 欧阳修 | 88 | 2 |
| dble | 菠菜 | 86 | 3 |
| mongodb | 潘金莲 | 99 | 1 |
| mongodb | 欧阳修 | 98 | 2 |
| mongodb | 郭靖 | 89 | 3 |
| mysql | 李四 | 100 | 1 |
| mysql | 潘金莲 | 81 | 2 |
| mysql | 欧阳修 | 79 | 3 |
| oracle | 武松 | 98 | 1 |
| oracle | 李四 | 97 | 2 |
| oracle | 张三 | 88 | 3 |
| postgresql | 黄蓉 | 95 | 1 |
| postgresql | 郭靖 | 90 | 2 |
| postgresql | 李四 | 82 | 3 |
+------------+-----------+-------+---------------+
15 rows in set (0.00 sec)
MySQL 8.0 窗口函数
MySQL 8.0 后提供了原生的窗口函数支持,语法和大多数数据库一样,比如还是之前的例子:
用 row_number() over () 直接来检索排名。
mysql>
SELECT
*
FROM
(
SELECT
b.cname,
a.sname,
c.score,
row_number() over (
PARTITION BY b.cname
ORDER BY c.score DESC
) score_rank
FROM
student AS a,
course AS b,
score AS c
WHERE a.sid = c.sid
AND b.cid = c.cid
) ytt
WHERE score_rank <= 3;
+------------+-----------+-------+------------+
| cname | sname | score | score_rank |
+------------+-----------+-------+------------+
| dble | 张三 | 89 | 1 |
| dble | 欧阳修 | 88 | 2 |
| dble | 菠菜 | 86 | 3 |
| mongodb | 潘金莲 | 99 | 1 |
| mongodb | 欧阳修 | 98 | 2 |
| mongodb | 郭靖 | 89 | 3 |
| mysql | 李四 | 100 | 1 |
| mysql | 潘金莲 | 81 | 2 |
| mysql | 欧阳修 | 79 | 3 |
| oracle | 武松 | 98 | 1 |
| oracle | 李四 | 97 | 2 |
| oracle | 张三 | 88 | 3 |
| postgresql | 黄蓉 | 95 | 1 |
| postgresql | 郭靖 | 90 | 2 |
| postgresql | 李四 | 82 | 3 |
+------------+-----------+-------+------------+
15 rows in set (0.00 sec)
那我们再找出课程 MySQL 和 DBLE 里不及格的倒数前两名学生名单。
mysql>
SELECT
*
FROM
(
SELECT
b.cname,
a.sname,
c.score,
row_number () over (
PARTITION BY b.cid
ORDER BY c.score ASC
) score_ranking
FROM
student AS a,
course AS b,
score AS c
WHERE a.sid = c.sid
AND b.cid = c.cid
AND b.cid IN (20192005, 20192001)
AND c.score < 60
) ytt
WHERE score_ranking < 3;
+-------+--------------+-------+---------------+
| cname | sname | score | score_ranking |
+-------+--------------+-------+---------------+
| mysql | 菠菜 | 40 | 1 |
| mysql | 杨发财 | 42 | 2 |
| dble | 武松 | 34 | 1 |
| dble | 东方不败 | 36 | 2 |
+-------+--------------+-------+---------------+
4 rows in set (0.00 sec)
到此为止,我们只是演示了row_number() over() 函数的使用方法,其他的函数有兴趣的朋友可以自己体验体验,方法都差不多。
MYSQL窗口函数-获取最后一个值
我正在尝试在特定时间范围内获取玩家余额的最后一个值。
我有一个transactions
表。
玩家余额不是最大值或最小值。
SELECT project_id,
player_id,
FIRST_VALUE(balance) OVER (PARTITION BY player_id ORDER BY event_arrival_time DESC) AS balance
FROM transactions
WHERE event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-03 23:59:59'
AND project_id='aaa'
GROUP BY project_id, player_id
[我得到值,但是如果我使用对单个玩家吹的查询来测试它们,则我会得到不同的余额,并且在期中的某个地方会看到给定的结果余额。
此外,如果我多次运行此查询,则会得到不同的余额,就像它选择了不同的事务一样(我们正在谈论10分钟的差异)。
SELECT *
FROM transacitions
where event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-03 23:59:59'
AND project_id='aaa' and player_id = 'player1'
ORDER BY event_arrival_time desc
我想获得那个时期的球员名单,以及他们的最新余额(不是MAX值-也许是最大日期)。
删除GROUP BY
子句,如果需要,请使用DISTINCT
中的SELECT
:
SELECT DISTINCT
project_id,
player_id,
FIRST_VALUE(balance) OVER (PARTITION BY player_id ORDER BY event_arrival_time DESC) AS balance
FROM transactions
WHERE event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-03 23:59:59'
AND project_id='aaa'
您需要过滤,而不是聚合。
您可以使用相关的子查询来做到这一点:
SELECT project_id, player_id, balance
FROM transactions t
WHERE event_arrival_time = (
SELECT MAX(t1. event_arrival_time)
FROM transactions t1
WHERE
t1.player_id = t.player_id
AND t1.event_arrival_time >= '2019-12-02'
AND t1.event_arrival_time < '2019-12-03'
AND t1.project_id = 'aaa'
)
为了提高性能,您需要在(project_id, player_id, event_arrival_time)
上建立索引。您也可以尝试使用covering索引:(project_id, player_id, event_arrival_time, balance)
;使用这样的索引,数据库将可能仅通过查看索引来执行整个查询,而无需实际访问基础数据。
您也可以使用窗口功能:
SELECT project_id, player_id, balance
FROM (
SELECT
t.*,
RANK() OVER(PARTITION BY player_id ORDER BY event_arrival_time DESC) rn
FROM transactions t
WHERE
event_arrival_time >= '2019-12-02'
AND event_arrival_time < '2019-12-03'
AND project_id='aaa'
) t
WHERE rn = 1
我认为这应该可行。没有样品很难分辨。
SELECT
t.project_id,
t.player_id,
GROUP_CONCAT(t.balance) AS Balance -- supposed to have single value
FROM transacitions t
LEFT JOIN transactions t2 ON t.project_id = t2.project_id AND t.player_id = t2.player_id
AND t.event_arrival_time < t2.event_arrival_time
where t2.player_id IS NULL
AND t.event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-03 23:59:59'
GROUP BY t.project_id, t.player_id
ORDER BY t.event_arrival_time desc
以上是关于mysql有窗口函数吗的主要内容,如果未能解决你的问题,请参考以下文章