Combining Oracle with Premier League Data Statistics and Presentation

Posted by bisal (Chen Liu)


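Technology exists to serve the business, a point that has been borne out in field after field. It only shows its value when it has a real use case; otherwise it is just self-amusement.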

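Anyone who follows modern football or basketball leagues probably knows that matches now generate a great deal of data, in both the number of dimensions and the sheer volume. Running data and shot data are the basics. Leagues also derive second-order analytics and predictions, and offer higher-end data that users can pay for. Data is arguably being used to its fullest.

This data should also be compiled and displayed dynamically as the match unfolds. Chinese domestic leagues can do this for some data, but the effort is still at an early stage, whereas many leagues abroad, such as the Premier League and Major League Soccer, are already quite advanced. During a match, data on teams, players, and the venue can all be compiled and displayed in real time. This helps fans and spectators as well as players and coaches. If you are interested, look out for it the next time you watch a match.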

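Starting with last season's Premier League, Oracle has served as a technology provider supporting the league's data statistics and analysis work. This article from the official Oracle blog describes how Oracle Cloud Infrastructure (OCI) services helped calculate the Most Improbable Comeback award. Reading it lets you practice your English and learn about the technology at the same time.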


The article is titled "Oracle Cloud crunches reams of match data to inform Premier League’s two new end-of-season awards".

Original article:

https://blogs.oracle.com/cloud-infrastructure/post/oracle-cloud-crunches-data-premier-league-end-of-season-awards

Steven Bergwijn for the win (Photo credit: Geoff Caddick/Getty Images)

When the Premier League’s Tottenham Hotspur trailed Leicester City by a goal with only minutes left to play, a draw seemed unlikely and a win nearly impossible. After two Spurs goals in less than 90 seconds, fans couldn’t believe what they had just witnessed.

But was it, in fact, the Most Improbable Comeback in the 2021-2022 Premier League season? After crunching the data—1.2 billion rows of it, totaling more than 10 billion data points from all 380 matches—we determined that it absolutely was.

Most Improbable Comeback is one of two brand new end-of-season awards the Premier League announced on May 26, each one based on a rigorous data analysis using Oracle Cloud Infrastructure (OCI) services.

Tottenham Hotspur take home the Most Improbable Comeback trophy, for their 3-2 come-from-behind win at Leicester on January 19. Equally stunning was the season’s Most Powerful Goal, for which Manchester City midfielder Fernandinho takes home the trophy for his laser strike against Leeds United on April 30.

To arrive at the award winners, the Premier League partnered with Oracle, which deployed two of its data scientists to analyze the massive amounts of match data using several cutting-edge OCI services. What follows is a behind-the-scenes look at that analysis.


Most Improbable Comeback: How it’s calculated

The Oracle data science duo—Brian Macdonald and Nithin TS—arrived at candidates for this new Premier League team award using the Win Probability statistic, a third-party stat that calculates the chance of a team securing a win or draw in each match by simulating the remainder of the match 100,000 times.

That statistical model, based on four years of match data generated by Stats Perform, factors in the current score at different times throughout each match, the time remaining in a given match, the number of players on the pitch for each team (to account for any players ejected because of a red card), and whether a team is home or away.
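The idea of simulating the rest of a match can be illustrated with a small Monte Carlo sketch. This is not Stats Perform's model. It assumes goals arrive as independent Poisson processes, and the function names and scoring rates (`rate_for` and `rate_against`, in goals per 90 minutes) are invented for illustration. In a fuller model, factors such as home advantage and red cards would adjust those rates.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson-distributed goal count (Knuth's method, fine for small lam)."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_outcomes(goal_diff, minutes_left, rate_for=1.4, rate_against=1.2,
                      n_sims=100_000, seed=0):
    """Estimate (P(win), P(draw)) by simulating the rest of the match n_sims times.

    goal_diff is the current score difference from the team's point of view.
    The rates are assumed goals per 90 minutes, not fitted values.
    """
    rng = random.Random(seed)
    lam_for = rate_for * minutes_left / 90.0
    lam_against = rate_against * minutes_left / 90.0
    wins = draws = 0
    for _ in range(n_sims):
        final_diff = (goal_diff
                      + sample_poisson(rng, lam_for)
                      - sample_poisson(rng, lam_against))
        if final_diff > 0:
            wins += 1
        elif final_diff == 0:
            draws += 1
    return wins / n_sims, draws / n_sims


if __name__ == "__main__":
    # A team trailing by one goal with two minutes left.
    p_win, p_draw = simulate_outcomes(goal_diff=-1, minutes_left=2)
    print(f"win: {p_win:.4%}, draw: {p_draw:.4%}")
```

Even this toy version shows why a late one-goal deficit yields a tiny win probability: the trailing team needs two unanswered goals in the time that remains.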

Using OCI Data Science Service, Oracle analyzed the win probabilities for each team in 30-second intervals for each of the 380 matches to calculate which team came back from the lowest win probability to defeat its opponent.

For the Most Improbable Comeback winner, Tottenham Hotspur, OCI Data Science determined that the spread between the Spurs’ lowest win probability at any time in their match against Leicester and their win probability at the end of the competition was the largest for any team in any match during the season.

Specifically, that spread was calculated at 99.98 percentage points, the difference between the Spurs’ 0.02% win probability when they trailed Leicester by a goal during stoppage time, and 100%, when they finished shortly thereafter with a 3-2 victory, thanks to goals in the 95th and 97th minutes by winger Steven Bergwijn.

Not only did the Spurs come from behind to win the match, but they also clinched the victory with mere seconds left in the competition. To put the improbability of the Spurs' comeback into perspective, a 0.02% win probability states that this outcome will happen only once in every 5,000 games—or once every 13 seasons.
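The "once in every 5,000 games" figure follows directly from the 0.02% probability, and the season count comes from dividing by the 380 matches played across the league each season. A quick check of that arithmetic, with variable names chosen here for illustration:

```python
MATCHES_PER_SEASON = 380  # total Premier League matches per season, per the article

win_probability_pct = 0.02

# A probability of p percent corresponds to one occurrence in 100 / p games.
games_per_occurrence = 100 / win_probability_pct
seasons_per_occurrence = games_per_occurrence / MATCHES_PER_SEASON

print(f"about once in every {games_per_occurrence:,.0f} games")
print(f"about once every {seasons_per_occurrence:.1f} seasons")
```

The season figure comes out slightly above 13, which the article rounds down.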

What’s more, that victory turned out to be critical to the Spurs’ end-of-season qualification for next season’s pan-European Champions League tournament.

Macdonald, the lead Oracle data scientist on the project, says a surprising finding from this OCI Data Science analysis was that a few matches during the course of the 2021-2022 Premier League season produced a win-probability spread of more than 99 percentage points. So the spread in the Spurs-Leicester match, while so close to 100, wasn’t an outlier among those matches with huge comebacks, he notes.

Another surprising OCI Data Science finding was that one team, Wolverhampton Wanderers, finished second and third in the Most Improbable Comeback category—on the winning and losing ends. In the #2 comeback of the season, the Wolves rose from a 2-0 deficit to defeat Aston Villa 3-2 away on October 16, overcoming a win probability of only 0.10% at one point in the match, while the Wolves gave up a two-goal lead at home to Leeds on March 18 in the #3 comeback, losing late in the match 3-2. Leeds had a win probability of only 0.74% earlier in that match.

In the remaining two Most Improbable Comeback matches in the Top 5, Watford lost at home to Burnley and away to Brentford (see table at the end of this article for more details).
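The ranking described in this section can be sketched in a few lines of Python. This is an illustration under assumptions, not Oracle's code. The function names are hypothetical, and each series holds only a handful of values, whereas the real analysis sampled every 30 seconds. The lowest values are the figures quoted above. The other values are invented placeholders, and the final value of 100 for Wolves and Leeds is assumed from the fact that they won.

```python
def comeback_spread(win_probs):
    """Return final minus lowest win probability, in percentage points.

    win_probs is one team's win probability (in percent) at each sampled
    interval of a match, in chronological order.
    """
    if not win_probs:
        raise ValueError("win_probs must not be empty")
    return win_probs[-1] - min(win_probs)


def rank_comebacks(matches):
    """Sort (label, spread) pairs from largest spread to smallest."""
    spreads = [(label, comeback_spread(series)) for label, series in matches.items()]
    return sorted(spreads, key=lambda item: item[1], reverse=True)


# Lowest values come from the article; everything else is a placeholder.
matches = {
    "Leeds at Wolves": [25.0, 0.74, 40.0, 100.0],
    "Tottenham at Leicester": [35.0, 20.0, 0.02, 100.0],
    "Wolves at Aston Villa": [30.0, 5.0, 0.10, 100.0],
}

for label, spread in rank_comebacks(matches):
    print(f"{label}: {spread:.2f} percentage points")
```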


Most Powerful Goal: Data shows a clear winner

This new Premier League award recognizes the player whose goal-scoring shot had the highest average velocity from the time it was struck to the time it crossed the goal line, with the caveats that the strike was from beyond the box’s 18-yard line and was not deflected.
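One plausible way to compute an average velocity from ball-tracking data is to divide the length of the ball's path by the elapsed time. The sketch below assumes the tracking data arrives as timestamped 3D coordinates in metres. The sample format and function name are hypothetical, and the article does not say whether the award uses path length or straight-line distance.

```python
import math

MPS_TO_MPH = 3600 / 1609.344  # metres per second to miles per hour


def average_speed_mps(samples):
    """Average ball speed in m/s over (t_seconds, x, y, z) samples in metres.

    Sums the straight-line distance between consecutive samples and divides
    by the elapsed time between the first and last sample.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    path_length = sum(
        math.dist(a[1:], b[1:]) for a, b in zip(samples, samples[1:])
    )
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return path_length / elapsed


# Invented samples: a ball travelling 20 m along the x axis in one second.
shot = [(0.0, 0.0, 0.0, 0.0), (0.5, 10.0, 0.0, 0.0), (1.0, 20.0, 0.0, 0.0)]
speed = average_speed_mps(shot)
print(f"{speed:.2f} m/s = {speed * MPS_TO_MPH:.2f} mph")
```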

In contrast to the Most Improbable Comeback category, where the extent of the biggest comebacks was very close, the highest-velocity shot in the Most Powerful Goal category was in a league of its own. The OCI Data Science analysis revealed that Fernandinho’s 21.21-meter strike against Leeds—at an average velocity of 73.08 miles per hour (117.61 kilometers per hour)—was a full 12.3% more powerful than the season’s #2 rocket launch. “The rest of the top 10 in this category were all kind of close,” Macdonald says. “Each increment was small, and then boom, there’s this big jump for the winner.”

For the fans watching at home, it can be tricky to discern between shots of such power, particularly when some shots skim the pitch surface and others fly into the top corner of the goal. “That’s one reason the data analytics behind these awards are so important,” says Will Brass, the Premier League’s chief commercial officer. “The calculations are complex, involving player and ball tracking as well as detailed analysis of the moment the ball is struck. Oracle Cloud Infrastructure gives us confidence in these precise computations and allows us clarity in declaring a deserved winner.”

The rest of the Top 10 in the Most Powerful Goal category ranged from the 65.05-mph strike by Southampton’s James Ward-Prowse, at #2, to the 60.13-mph strike by Aston Villa’s Jacob Ramsey, at #10. (See table for full details.)
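The unit conversion and the 12.3% margin quoted above can be reproduced from the two published speeds. The helper names below are chosen for illustration.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def mph_to_kmh(mph):
    """Convert miles per hour to kilometres per hour."""
    return mph * KM_PER_MILE


def percent_faster(winner, runner_up):
    """How much larger winner is than runner_up, as a percentage."""
    return (winner / runner_up - 1) * 100


winner_mph = 73.08     # Fernandinho, per the article
runner_up_mph = 65.05  # James Ward-Prowse, per the article

print(f"{mph_to_kmh(winner_mph):.2f} km/h")
print(f"{percent_faster(winner_mph, runner_up_mph):.1f}% faster than #2")
```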

As might be expected, all the finalists for Most Powerful Goal were shots taken from near the center of the goal, just outside the box. “It makes sense,” Macdonald says, “because as I look at these shots, a lot of them involve deflected passes coming back to the shooter, away from the goal, which gives the ball extra velocity. It’s just basic physics.”


Setting up, using the OCI environment

Macdonald says he was able to set up the OCI instances applied to both award evaluations in just 30 minutes.

The first step was to write Bash scripts on OCI Compute virtual machines to pull data from the APIs of the Premier League’s two main data providers and put it into OCI Object Storage. Those scripts pulled updated data after every match day.

One provider is Second Spectrum, which supplies location data on the positioning (3D coordinates) of all 22 players on the pitch, as well as the ball, throughout each Premier League match by using machine learning and computer vision algorithms. The other provider is Stats Perform, whose Opta service enhances the location data to identify match “events,” such as shots (including their location on the pitch, distance from goal, and whether they were left-footed or right-footed), corner kicks, fouls, penalties, and so on.

From there, the Oracle data scientists uploaded the data to Oracle Autonomous Data Warehouse, using the cloud-based warehouse’s built-in JSON capabilities to handle the complex, nested JSON structures needed to represent a football match. They then conducted a series of in-depth analyses using the OCI Data Science machine learning platform.
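To show what handling nested match JSON involves, here is a small Python sketch that flattens a nested document into one row per event. The payload and its field names are invented and do not reflect the providers' real schemas. The article does not say which warehouse JSON features were used. In Oracle SQL, `JSON_TABLE` is one feature that projects nested JSON into relational rows in a similar way.

```python
import json

# Hypothetical payload: field names are invented for illustration only.
raw = """
{
  "match_id": "m001",
  "home": "Team A",
  "away": "Team B",
  "events": [
    {"minute": 12, "type": "shot", "player": "P1",
     "detail": {"distance_m": 21.5, "foot": "right"}},
    {"minute": 40, "type": "corner", "player": "P2", "detail": {}}
  ]
}
"""


def flatten_events(match):
    """Yield one flat dict per event, repeating the match-level identifier."""
    for event in match.get("events", []):
        row = {
            "match_id": match["match_id"],
            "minute": event["minute"],
            "type": event["type"],
            "player": event["player"],
        }
        # Promote nested detail fields to top-level columns.
        for key, value in event.get("detail", {}).items():
            row[f"detail_{key}"] = value
        yield row


rows = list(flatten_events(json.loads(raw)))
for row in rows:
    print(row)
```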

In all, they took in billions of data points from all 380 matches to calculate myriad analytical metrics about each game and goal, ultimately generating a short list of candidates for each award, culminating in the Premier League’s selection of a single winner in each category.

“Connecting to the APIs of the two data providers was probably the most complicated part, because we had to work through the normal first-time authentication steps,” Macdonald says. “As soon as I got those working, it's just running the same commands over and over again. The rest was easy.”

[Figure: the architecture Oracle data scientists used to calculate the awards]

To try out the OCI environment and produce some preliminary results, the two Oracle data scientists started their analyses in late April, well before the Premier League season ended on May 22, refreshing their leader lists for each award after every match. Macdonald did most of the data analysis for the Most Powerful Goal evaluation and Nithin did most of it for Most Improbable Comeback.

“It was separate work tracks in a shared OCI environment with shared code snippets in OCI Data Science,” Macdonald explains. “We did a lot of in-depth analytics and discussions of the results, validating and comparing the data, ensuring that we didn’t miss anything.”


Key OCI products used

OCI Data Science Service, the fulcrum of their analyses, is a fully managed and serverless platform for data science teams to build, train, and manage high-quality machine learning models. Automated ML capabilities rapidly examine the data and recommend the optimal algorithms, while tuning the model and explaining its results.

OCI Data Science’s drag-and-drop data integration and preparation tools make it easy for users to move data into a data lake or data warehouse. The cloud platform’s security tools and user interfaces enable users with multiple roles to participate in projects and share models. Model-agnostic explanations help data scientists, business analysts, and executives have confidence in the results.

Oracle Autonomous Data Warehouse is a cloud-based data warehouse service that eliminates operational complexities by automating provisioning, configuration, patching, tuning, scaling, and backup.

OCI Compute provides fast, flexible, and affordable compute capacity—from bare metal servers and virtual machines to lightweight containers—to fit any workload. OCI Compute’s uniquely flexible VM and bare metal instances deliver optimal price-performance.

OCI Object Storage enables customers to securely store any type of data in its native format. With built-in redundancy, OCI Object Storage is ideal for building modern applications that require scale and flexibility, as it can be used to consolidate multiple data sources for analytics, backup, or archive purposes.

The Oracle data scientists also used Oracle Analytics Cloud to present a complete leaderboard for each award, allowing them to re-sort the data based on different criteria—say, to include Most Powerful Goal candidates for shots that occurred within the 18-yard box or narrow the analysis to players on a certain team.

Oracle Analytics Cloud provides a complete set of tools for deriving and sharing data insights. The platform lets analysts visualize any data findings, on any device. It also lets users ingest, profile, and cleanse data using a variety of algorithms, as well as aggregate data and then run ML models at scale.

Most Improbable Comeback (top 5)

Most Powerful Goal (top 10)
