Consecutive streaks/non-streaks based on Win, Draw & Loss grouped by team


【Title】: Consecutive streaks/non-streaks based on Win, Draw & Loss grouped by team 【Posted】: 2019-08-10 13:28:23 【Question】:

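I'm trying to calculate streaks in a database of results, based on win, draw and loss criteria.

Goal: get the longest run of consecutive wins/non-wins, grouped by team.

I've tried different SQL query suggestions from other threads, but they either lack the grouping/team column or only handle a 2-way option (win and loss). I need a 3-way option (win, loss and draw, plus non-win, non-loss and non-draw).

I have looked at this: https://www.sqlteam.com/articles/detecting-runs-or-streaks-in-your-data

But I can't figure out how to include the team grouping in the mix.

Schema: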

CREATE TABLE teamresults (matchid varchar(255), date DATE, time TIME, team varchar(255), teamresult varchar(255))

Sample data:

INSERT INTO teamresults (matchid,"date","time",team,teamresult) VALUES 
('030420181800acfc','2018-04-03','18:00:00','AC Horsens','L')
,('080420181600brac','2018-04-08','16:00:00','AC Horsens','L')
,('150420181400aaac','2018-04-15','14:00:00','AC Horsens','L')
,('180420181800acfc','2018-04-18','18:00:00','AC Horsens','D')
,('210420181600fcac','2018-04-21','16:00:00','AC Horsens','L')
,('270420181900acfc','2018-04-27','19:00:00','AC Horsens','L')
,('040520181900acaa','2018-05-04','19:00:00','AC Horsens','W')
,('110520181900fcac','2018-05-11','19:00:00','AC Horsens','L')
,('180520182000acbr','2018-05-18','20:00:00','AC Horsens','D')
,('210520181800fcac','2018-05-21','18:00:00','AC Horsens','L')
,('120520191200veac','2019-05-12','12:00:00','AC Horsens','W')
,('190520191400acve','2019-05-19','14:00:00','AC Horsens','D')
,('140720191400acfc','2019-07-14','14:00:00','AC Horsens','L')
,('210720191200siac','2019-07-21','12:00:00','AC Horsens','W')
,('270720191730acfc','2019-07-27','17:30:00','AC Horsens','L')
,('040820191600brac','2019-08-04','16:00:00','AC Horsens','W')
,('010420181400hoag','2018-04-01','14:00:00','AGF','W')
,('080420181800agsi','2018-04-08','18:00:00','AGF','W')
,('130420181900agfc','2018-04-13','19:00:00','AGF','W')
,('170420181900fcag','2018-04-17','19:00:00','AGF','L')
,('230420181900agho','2018-04-23','19:00:00','AGF','L')
,('300420181900siag','2018-04-30','19:00:00','AGF','W')
,('060520181200agob','2018-05-06','12:00:00','AGF','W')
,('130520181800obag','2018-05-13','18:00:00','AGF','W')
,('190520181600ags�','2018-05-19','16:00:00','AGF','D')
;

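The query below does work, but it only accepts a single input condition, so I can only get streaks of wins, losses or draws, not of non-wins, non-losses and non-draws.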

SELECT
   team,
   MAX(cnt)
FROM
 (
   SELECT
      team,
      COUNT(*) AS cnt
   FROM
    (
      SELECT
         team,
         date,
         teamresult,
         SUM(CASE WHEN teamresult <> 'W' THEN 1 ELSE 0 END)
         OVER (PARTITION BY team
               ORDER BY date
               ROWS UNBOUNDED PRECEDING) AS dummy
      FROM teamresults
    ) dt
   WHERE teamresult = 'W'
   GROUP BY team, dummy
 ) dt
GROUP BY team;

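I'd also like to be able to find the longest non-streak (for example, the longest run without a win), grouped by team.

An SQL fiddle is available here: http://sqlfiddle.com/#!18/3a2ac/1

Thanks in advance.

Update: Gordon's queries work, but they don't run in Postgres/CockroachDB, so I'm now trying to convert them into a supported form using the window function rank().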

select team, teamresult, cnt, rank() over (order by cnt desc) from
(SELECT team, teamresult, COUNT(*) as cnt
FROM (SELECT tr.*,
             ROW_NUMBER() OVER (PARTITION BY team ORDER BY "date", "time") as seqnum,
             ROW_NUMBER() OVER (PARTITION BY team, teamresult ORDER BY "date", "time") as seqnum_r
      FROM teamresults tr
     ) tr
WHERE teamresult = 'W'
GROUP BY team, teamresult, (seqnum - seqnum_r)
ORDER BY ROW_NUMBER() OVER (PARTITION BY team ORDER BY COUNT(*) DESC)) as ranked

This gives me output like the following (a sample from my database):

FC København    W   9   1
AaB             W   8   2
FC København    W   8   2
FC København    W   8   2
FC København    W   8   2
Brøndby IF      W   7   6
FC Midtjylland  W   7   6
FC København    W   7   6
FC København    W   7   6
FC København    W   7   6
Esbjerg fB      W   6   11
FC Midtjylland  W   6   11
AaB             W   6   11
Brøndby IF      W   6   11
Brøndby IF      W   6   11

Expected output:

Team           Longest consecutive streak
FC København       9
AaB                8
Brøndby IF         7
FC Midtjylland     7
Esbjerg fB         6

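【Comments】:

define calculate streaks

Show us the database schema, sample data, and current and expected output. Please read How-to-Ask. And here is a great place to START to learn how to improve your question quality and get better answers. How to create a Minimal, Complete, and Verifiable example

Thanks for the feedback - I have edited my post to be more precise about the expected outcome, incl. the current schema, data and query.

Please include the desired output for that data.

【Answer 1】:

You can get all the streaks using: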

SELECT team, teamresult, COUNT(*) as cnt
FROM (SELECT tr.*,
             ROW_NUMBER() OVER (PARTITION BY team ORDER BY "date", "time") as seqnum,
             ROW_NUMBER() OVER (PARTITION BY team, teamresult ORDER BY "date", "time") as seqnum_r
      FROM teamresults tr
     ) tr
GROUP BY team, teamresult, (seqnum - seqnum_r);
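
Why the difference of the two row numbers identifies a streak can be seen in a small sketch. This is a minimal Python illustration, not part of the original answer, and the function name `streaks` is made up for the example. For each row, `seqnum` counts all of a team's matches and `seqnum_r` counts only matches with the same result, so the difference stays constant while a result repeats and changes once another result has intervened.

```python
from collections import defaultdict


def streaks(results):
    """Return (result, length) per run for ONE team's results in date order.

    Mirrors GROUP BY teamresult, (seqnum - seqnum_r) from the SQL above.
    """
    seen = defaultdict(int)   # how many times each result has occurred so far
    groups = {}               # (result, seqnum - seqnum_r) -> run length
    for seqnum, result in enumerate(results, start=1):
        seen[result] += 1                 # this is seqnum_r for the row
        key = (result, seqnum - seen[result])
        groups[key] = groups.get(key, 0) + 1
    # dicts keep insertion order, so runs come back chronologically
    return [(result, length) for (result, _), length in groups.items()]


# AGF's results from the sample data, oldest first
print(streaks(list("WWWLLWWWD")))
# [('W', 3), ('L', 2), ('W', 3), ('D', 1)]
```

The two separate `('W', 3)` entries show that the grouping key separates the two winning runs even though the result value is the same.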

You can modify this to get the longest winning streak for each team:

SELECT TOP(1) WITH TIES team, teamresult, COUNT(*) as cnt
FROM (SELECT tr.*,
             ROW_NUMBER() OVER (PARTITION BY team ORDER BY "date", "time") as seqnum,
             ROW_NUMBER() OVER (PARTITION BY team, teamresult ORDER BY "date", "time") as seqnum_r
      FROM teamresults tr
     ) tr
WHERE teamresult = 'W'
GROUP BY team, teamresult, (seqnum - seqnum_r)
ORDER BY ROW_NUMBER() OVER (PARTITION BY team ORDER BY COUNT(*) DESC);

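If you just want the longest streak of any type, remove the WHERE. If you want the longest streak of each type for each team, add teamresult to the PARTITION BY.

Here is a dbfiddle.

Edit:

If you want non-wins, you need to partition by an expression: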

SELECT TOP(1) WITH TIES team,
       (CASE WHEN teamresult = 'W' THEN 'W' END) as is_win,
       COUNT(*) as cnt
FROM (SELECT tr.*,
             ROW_NUMBER() OVER (PARTITION BY team ORDER BY "date", "time") as seqnum,
             ROW_NUMBER() OVER (PARTITION BY team, (CASE WHEN teamresult = 'W' THEN 'W' END) ORDER BY "date", "time") as seqnum_r
      FROM teamresults tr
     ) tr
-- WHERE teamresult = 'W'
GROUP BY team, (CASE WHEN teamresult = 'W' THEN 'W' END), (seqnum - seqnum_r)
ORDER BY ROW_NUMBER() OVER (PARTITION BY team ORDER BY COUNT(*) DESC)
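
Partitioning by an expression amounts to collapsing the three results into a yes/no flag before looking for runs. The following is a rough Python cross-check of that idea rather than a translation of the query; `longest_run` and `HORSENS` are names invented here, and the list holds AC Horsens' results from the question's sample data in date order.

```python
from itertools import groupby


def longest_run(results, predicate):
    """Length of the longest consecutive run for which predicate(result) is true."""
    return max(
        (sum(1 for _ in run) for flag, run in groupby(results, key=predicate) if flag),
        default=0,  # no matching row at all
    )


# AC Horsens, oldest match first (taken from the INSERT statement in the question)
HORSENS = list("LLLDLLWLDLWDLWLW")

print(longest_run(HORSENS, lambda r: r != "W"))  # longest run without a win: 6
print(longest_run(HORSENS, lambda r: r == "W"))  # longest winning streak: 1
```

Swapping the predicate for `r != "L"` or `r != "D"` gives the non-loss and non-draw variants the question asks about.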

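【Comments】:

Thanks - but unfortunately this breaks on my DB (cockroach): SQL Error [42601]: ERROR: syntax error at or near "with" Detail: source SQL: SELECT TOP(1) WITH TIES team, teamresult, COUNT(*) as cnt

What would the non-winning streak query look like? Replacing = with a not-equal operator for non-wins returns 3 for Horsens, whereas the longest non-winning run in the dataset is actually 6.

Thanks for the non-win version :-) - how can the query be converted to support cockroach/postgres? They don't support 'with ties'; instead they support rank() as a window function. I looked at this ***.com/questions/9629953/…, but couldn't work out how to change the query logic.

@MortenStensgaard . . . Without ties, you can use rank() instead of row_number()

【Answer 2】:

Thanks to Gordon, I solved my problem with the following queries:

Longest winning streak grouped by team (replace W with D for draws, or with L for losses)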

select team, max(cnt) longeststreak from (
SELECT team, teamresult, COUNT(*) as cnt
FROM (SELECT tr.*,
             -- RANK() matches ROW_NUMBER() here only while ("date", "time") is unique per team
             RANK() OVER (PARTITION BY team ORDER BY "date", "time") as seqnum,
             RANK() OVER (PARTITION BY team, teamresult ORDER BY "date", "time") as seqnum_r
      FROM teamresults tr
     ) tr
WHERE teamresult = 'W'
GROUP BY team, teamresult, (seqnum - seqnum_r)
) streaks  -- Postgres before version 16 requires an alias on a derived table
group by team
order by longeststreak DESC

Longest run without a win grouped by team (replace W with D for non-draws, or with L for non-losses)

select team, max(cnt) longestnonstreak from (
SELECT team,
       (CASE WHEN teamresult = 'W' THEN 'W' END) as is_win,
       COUNT(*) as cnt
FROM (SELECT tr.*,
             RANK() OVER (PARTITION BY team ORDER BY "date", "time") as seqnum,
             RANK() OVER (PARTITION BY team, (CASE WHEN teamresult = 'W' THEN 'W' END) ORDER BY "date", "time") as seqnum_r
      FROM teamresults tr
     ) tr
-- keep only the non-win runs; without this filter max(cnt) can pick up a winning streak
WHERE teamresult <> 'W'
GROUP BY team, (CASE WHEN teamresult = 'W' THEN 'W' END), (seqnum - seqnum_r)
) nonstreaks  -- Postgres before version 16 requires an alias on a derived table
group by team
order by longestnonstreak desc
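
The non-streak logic can be checked against a subset of the sample data. The following is a minimal sketch, assuming Python's built-in `sqlite3` module with SQLite 3.25 or later (needed for window functions) as a stand-in for Postgres/CockroachDB. The CTE names `numbered` and `runs`, the helper `longest_non_streak` and the `:r` parameter are invented for this example, the `matchid` column is left out, and `ROW_NUMBER()` is used in place of `RANK()`.

```python
import sqlite3

# Subset of the question's sample data: (team, date, time, teamresult)
ROWS = [
    ("AC Horsens", "2018-04-03", "18:00:00", "L"),
    ("AC Horsens", "2018-04-08", "16:00:00", "L"),
    ("AC Horsens", "2018-04-15", "14:00:00", "L"),
    ("AC Horsens", "2018-04-18", "18:00:00", "D"),
    ("AC Horsens", "2018-04-21", "16:00:00", "L"),
    ("AC Horsens", "2018-04-27", "19:00:00", "L"),
    ("AC Horsens", "2018-05-04", "19:00:00", "W"),
    ("AGF", "2018-04-01", "14:00:00", "W"),
    ("AGF", "2018-04-08", "18:00:00", "W"),
    ("AGF", "2018-04-13", "19:00:00", "W"),
    ("AGF", "2018-04-17", "19:00:00", "L"),
    ("AGF", "2018-04-23", "19:00:00", "L"),
    ("AGF", "2018-04-30", "19:00:00", "W"),
    ("AGF", "2018-05-06", "12:00:00", "W"),
    ("AGF", "2018-05-13", "18:00:00", "W"),
    ("AGF", "2018-05-19", "16:00:00", "D"),
]

QUERY = """
WITH numbered AS (
    SELECT team, teamresult,
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY "date", "time") AS seqnum,
           ROW_NUMBER() OVER (PARTITION BY team,
                                           CASE WHEN teamresult = :r THEN 1 ELSE 0 END
                              ORDER BY "date", "time") AS seqnum_r
    FROM teamresults
),
runs AS (
    SELECT team, COUNT(*) AS cnt
    FROM numbered
    WHERE teamresult <> :r            -- keep only runs WITHOUT result :r
    GROUP BY team, seqnum - seqnum_r
)
SELECT team, MAX(cnt) AS longest
FROM runs
GROUP BY team
ORDER BY longest DESC, team
"""


def longest_non_streak(conn, result):
    """Longest run per team that does not contain `result` ('W', 'D' or 'L')."""
    return conn.execute(QUERY, {"r": result}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE teamresults (team TEXT, "date" TEXT, "time" TEXT, teamresult TEXT)')
conn.executemany("INSERT INTO teamresults VALUES (?, ?, ?, ?)", ROWS)

print(longest_non_streak(conn, "W"))
# [('AC Horsens', 6), ('AGF', 2)]
```

The 6 for AC Horsens matches the figure mentioned in the comments. Passing "L" or "D" instead of "W" gives the non-loss and non-draw runs without editing the SQL.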

Thanks to Gordon for helping solve this.

【Comments】:
