IBC2022: This Technical Paper discusses an AI application that impacts the entire workflow of collecting and logging data from live sporting events, with a case study in cricket.


Data will continue to play a major role in sports from both performance and consumer engagement perspectives. An increasing amount of digital sports content is generated and made available through television broadcasts or internet streaming. Such content is a powerful source of data in sports. However, it is not used to its full potential because of the tedious work required to tag it, curate it, and extract value from it. To date, most tactical analyses are performed by manually reviewing match videos. In an elite broadcast production setting, partial automation is currently possible, but only to retrieve the finest and most contentious moments of a match. Metadata tags are generated from the pitch of audio signals, from generic classifications of themes, or entirely manually by human loggers. In this paper, we address the problem of automated tagging with a domain-specific software system built on tracking technologies and on action recognition using computer vision and deep learning models. This goes beyond creating general tags for highlight generation. We examine the sport of cricket as a case study and present the practical impact of this technology on match broadcasts and on other sectors of the sports industry, such as scouting and high-performance training.


Detailed, accurate data collection will become more important and valuable as sport evolves. The growing volume of data that teams and fans are interested in requires a larger and more skilled team to collect and log it. The majority of sporting leagues require large logging teams, sometimes with as many as twenty individuals logging live match content, which is processed in a control room for broadcast purposes such as replay and for use by teams' coaching and analytics departments. Introducing automation across the entire workflow of data collection, processing, curation, and maintenance will significantly help the industry overcome these challenges. This paper discusses an AI application that impacts that entire workflow. The introduced technology provides reliable and accurate data to broadcasters and teams. For cricket alone, it generates over fifty tags for each ball bowled in a match. Each tag is associated with millions of data points collected from every frame of the footage of a delivery. The rich data is then consolidated using fifty parameters in a team profile and seventy parameters in a player profile to extract insights.
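To make the consolidation step concrete, the following is a minimal sketch of how per-delivery tags might be rolled up into a player profile. It is not the system described in the paper: the `DeliveryTags` fields, the `batter_profiles` function, and the three profile parameters shown are hypothetical stand-ins for the fifty-plus tags and seventy player parameters mentioned above. Strike rate is computed in the usual cricket sense of runs scored per 100 balls faced.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class DeliveryTags:
    """Hypothetical subset of the tags attached to one ball bowled."""
    batter: str
    bowler: str
    shot_type: str
    runs: int


def batter_profiles(deliveries: Iterable[DeliveryTags]) -> Dict[str, dict]:
    """Consolidate per-delivery tags into one summary per batter."""
    totals = defaultdict(lambda: {"balls": 0, "runs": 0, "shots": defaultdict(int)})
    for d in deliveries:
        entry = totals[d.batter]
        entry["balls"] += 1
        entry["runs"] += d.runs
        entry["shots"][d.shot_type] += 1

    profiles = {}
    for name, entry in totals.items():
        profiles[name] = {
            "balls": entry["balls"],
            "runs": entry["runs"],
            # Strike rate: runs scored per 100 balls faced.
            "strike_rate": 100.0 * entry["runs"] / entry["balls"],
            "shots": dict(entry["shots"]),
        }
    return profiles


# Illustrative data only; the names and values are invented for the example.
sample = [
    DeliveryTags("Batter A", "Bowler X", "cover drive", 4),
    DeliveryTags("Batter A", "Bowler X", "defence", 0),
    DeliveryTags("Batter A", "Bowler Y", "pull", 1),
    DeliveryTags("Batter A", "Bowler Y", "cover drive", 1),
    DeliveryTags("Batter B", "Bowler X", "sweep", 2),
]
profiles = batter_profiles(sample)
```

A team profile could be built the same way by grouping on a team field instead of on the batter.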

Essentially, our AI-powered system discretises the video into game-event and non-game-event segments and further breaks each game-event segment down into single units of analysis (shot, play, strike, etc.) in that game. We examine the sport of cricket as a case study and present an architecture that allows automatic recognition of deliveries. Cricket is the second most watched sport after soccer, with over a billion fans. Cricket matches are long compared with those of other sports. An accurate recognition model that detects deliveries and distinguishes their semantic taxonomy allows the content to be searched and the event types they encode to be retrieved. This rich information, combined with the quantitative nature of cricket matches, where many metrics are measured for every delivery, provides an unprecedented opportunity to create innovations, investigate game tactics, and create customised highlights and other sources of fan engagement.
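The discretisation idea can be illustrated with a small sketch. It assumes that an upstream classifier has already assigned a label to every frame. The label names `"game"` and `"non_game"`, the `Segment` type, and the `min_frames` threshold are all assumptions made for illustration, not details taken from the paper. The sketch merges runs of identical frame labels into segments and then keeps the game-event segments that are long enough to count as one unit of analysis, such as a delivery.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class Segment:
    label: str  # "game" or "non_game" (hypothetical label names)
    start: int  # index of the first frame, inclusive
    end: int    # index of the last frame, inclusive


def segment_frames(frame_labels: List[str]) -> List[Segment]:
    """Merge consecutive frames that share a label into one segment."""
    segments = []
    index = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append(Segment(label, index, index + length - 1))
        index += length
    return segments


def game_units(segments: List[Segment], min_frames: int = 2) -> List[Segment]:
    """Keep game-event segments that span at least min_frames frames."""
    return [
        s for s in segments
        if s.label == "game" and (s.end - s.start + 1) >= min_frames
    ]


# Illustrative per-frame labels; a real feed would have thousands of frames.
labels = ["non_game"] * 3 + ["game"] * 4 + ["non_game"] * 2 + ["game"] * 5
segments = segment_frames(labels)
deliveries = game_units(segments)
```

In this toy input the two game-event segments cover frames 3 to 6 and 9 to 13. A production system would also need to smooth noisy frame labels before merging them.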

The system is compatible with both archived and live sports content. We use an asynchronous, multi-process implementation so that it runs in near real time. In building the underlying models, we have considered a wide variety of sports production types and have trained the models on a large amount of content, so that they generalise across different conditions and production setups. We also investigate the use of natural language processing on top of the automated tags to generate commentary scripts in English automatically. This can further enhance sports production where professional English commentary is unavailable, as is the case for lower-tier broadcast and streaming productions.
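As a rough illustration of turning automated tags into a line of commentary, the sketch below uses plain string templates. This is a deliberately simple stand-in, not the natural language processing approach investigated in the paper, which is not described in this summary. The tag keys (`bowler`, `batter`, `length`, `shot_type`, `runs`) and the `commentary_line` function are hypothetical.

```python
# Hypothetical mapping from runs scored to a commentary phrase.
OUTCOME_PHRASES = {
    0: "no run",
    1: "a single",
    2: "two runs",
    3: "three runs",
    4: "FOUR",
    6: "SIX",
}


def commentary_line(tags: dict) -> str:
    """Render one delivery's tags as a short English commentary line."""
    runs = tags["runs"]
    # Fall back to a generic phrase for run counts without a template.
    result = OUTCOME_PHRASES.get(runs, f"{runs} runs")
    return (
        f"{tags['bowler']} to {tags['batter']}, "
        f"{tags['length']} delivery, {tags['shot_type']} for {result}."
    )


# Illustrative tags only; the names and values are invented for the example.
example_tags = {
    "bowler": "Bowler X",
    "batter": "Batter A",
    "length": "full",
    "shot_type": "cover drive",
    "runs": 4,
}
line = commentary_line(example_tags)
```

Templates like this are easy to audit but produce repetitive text, which is one reason a learned language model might be preferred for longer scripts.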
