As master control room operations grow in density and potential stress-points, the need for multi-faceted audio and video monitoring solutions – underpinned by increasing levels of automation – is becoming more acute.
Although the primary components of master control broadcast monitoring – including down-mixing, loudness measurements and compliance, and audio delay for verifying lip sync – have not shifted too much, the amount of content to which they must be applied has changed significantly in recent years.
With master control rooms already among the most complex facilities in a broadcast centre, the focus of R&D has therefore been on allowing broadcasters to streamline their monitoring operations while simultaneously accommodating higher levels of throughput than ever before.
Individual approaches do vary, but without exception there is a need for core solutions that can analyse incoming and outgoing signals, and which enable concise visual feedback within operators’ monitor stacks.
In particular, with regard to audio, expectations around loudness have become increasingly universal since the publication of the EBU R128 recommendation in 2010, with a recently announced supplement detailing how loudness normalisation can be applied to streaming services.
- Read more: EBU tackles loudness in streaming
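The core arithmetic of loudness normalisation is simple once a programme's integrated loudness has been measured. As a minimal sketch (a real meter applies K-weighting and gating per ITU-R BS.1770, which is omitted here; the function names are illustrative, not from any named product):

```python
# Minimal sketch of loudness normalisation toward the EBU R128 target of
# -23 LUFS. A real meter measures integrated loudness with K-weighting and
# gating (ITU-R BS.1770); here the measured value is assumed to be known.

R128_TARGET_LUFS = -23.0

def normalisation_gain_db(measured_lufs: float,
                          target_lufs: float = R128_TARGET_LUFS) -> float:
    """Gain (in dB) needed to bring the programme to the target loudness."""
    return target_lufs - measured_lufs

def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale linear PCM samples by a dB gain."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]

# A programme measured at -19.2 LUFS needs 3.8 dB of attenuation:
gain = normalisation_gain_db(-19.2)
quieter = apply_gain([0.5, -0.25, 0.1], gain)
```

The same offset-and-scale principle underlies normalisation for streaming delivery; only the target level and measurement gating differ between platforms.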
Assisting broadcasters to maintain compliance with such regulations has therefore been a key facet of solution development in recent years. But there has also been a consistent focus on streamlining monitoring workflows as well as finessing the synchronicity between “alarming opportunities – for example around captioning and subtitling – and those instances where human interaction is required” to resolve an issue, as Imagine Communications CTO Networking & Infrastructure John Mailhot observes.
Having established a notable presence in the broadcast market relatively recently, Focusrite has been well-positioned to chart the emergence of a more responsive audio monitoring environment.
As Anthony Wilkins, director of EMEA Sales at Focusrite Pro, remarks: “The most noticeable change has been the need to be flexible and adapt very quickly to different environments, and to be able to install and configure monitoring solutions with a minimum of cabling complications and set-up complexity.”
With this in mind, Focusrite has looked to support the migration from SDI to IP-based broadcast infrastructures – and the “need to be able to maintain connectivity with legacy audio formats and devices” – with its RedNet interfaces. This range offers “the ability to ‘bridge’ between Dante audio-over-IP networks and AES, MADI and analogue sources”, with Wilkins highlighting the X2P and AM2 products’ enabling of “a simple way to add monitoring functionality to an existing Dante network”.
Observing developments from a vantage point that allows him to track changes in broadcast and post, Wilkins anticipates a growing impact from Next Generation Audio, with Focusrite already discerning “an increase in the demand for immersive audio monitoring – in particular as part of a Dolby Atmos mixing workflow. [On a related note] our recently introduced RedNet R1 desktop remote/monitor controller allows for the flexible monitoring of common immersive audio formats from mono to 7.1.4 configurations.”
Along with flexibility, Interra Systems highlights the role to be played by increased automation – and specifically machine learning (ML) – in today’s demanding monitoring environments. Anupama Anantharaman, VP product management at Interra Systems, says that content creators are now “dealing with a massive volume of content, and media processing has become more complicated in recent years, increasing the potential for audio quality issues. Using automated QC solutions and advances in digital media technology and ML, content creators can address the audio quality issues with great ease.”
With all this in mind, Interra Systems announced an ML-driven automated software tool for lip sync detection and verification, Baton LipSync, in spring 2020. The new solution acknowledges that the scope for lip sync issues is expanding. “Lip sync errors are introduced in content at various stages in the workflows, from transcoding to format and frame-rate conversion, live capture of content and content delivery,” says Anantharaman. “With broadcasters receiving hundreds of videos captured by novices with all kinds of devices, the number of lip-sync issues is increasing. These problems are real, causing broadcasters to spend a significant amount of money on manual checks and software tools to ensure that their content has no visible lip-sync issues.”
Support for exhaustive metadata and audio quality checks is integral to Baton LipSync, which uses computer vision and “well-trained” ML algorithms to detect where audio lead and lag exist – independent of language or content format. The content creator can further view the original and adjusted video side by side along with the exact location of errors on the timeline, using Interra Systems’ BMP media player to enable complete analysis and resolution.
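Interra has not published Baton LipSync’s internals, so the following is only an illustrative sketch of the general principle behind automated lead/lag detection: correlating a per-frame audio-activity envelope against a per-frame mouth-motion signal derived from computer vision, and taking the lag that best aligns the two. All names and signals here are hypothetical:

```python
# Illustrative sketch (not Interra's proprietary algorithm): estimate audio
# lead/lag by cross-correlating a per-frame audio-energy envelope against a
# per-frame mouth-motion signal produced by a computer-vision stage.

def best_lag(audio_env: list[float], mouth_motion: list[float],
             max_lag: int) -> int:
    """Return the lag (in frames) that maximises correlation.
    A positive result means the audio lags the video."""
    n = min(len(audio_env), len(mouth_motion))
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += audio_env[j] * mouth_motion[i]
        if score > best_score:
            best, best_score = lag, score
    return best

# Mouth motion peaks at frames 2 and 6; audio energy peaks 3 frames later,
# so the estimator reports the audio lagging by 3 frames.
video = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
audio = [0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
offset_frames = best_lag(audio, video, max_lag=4)  # → 3
```

A production system works with far noisier signals, which is where the “well-trained” ML models Anantharaman describes come in; the correlation step above only conveys the shape of the problem.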
Looking ahead, Anantharaman expects continued usage of both on-premises and cloud deployments, as well as increased reliance on “AI and ML technologies – combined with computer vision techniques – to improve audio and video quality.”
Monitoring by exception
For Imagine Communications, Mailhot also recognises the impact that the transition to IP is having – something acknowledged within its own multiviewer products that support the transition from SDI to IP – and the increasing role that AI and ML are likely to have. But he also stresses the “continuing importance of a technique such as monitoring by exception”, whereby analysis of comprehensive information – typically encompassing equipment and signal information, and the content of playlists – makes it possible to determine specific alarms and consolidate them into customised displays.
“These alarms make it possible to trigger operator actions,” confirms Mailhot, who also points out the increased ability of this approach to help identify potential issues upstream in the signal path.
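The consolidation step Mailhot describes can be sketched very simply: of the many routine status events a master control system generates, only those breaching a severity threshold are grouped into a per-channel operator view. The event shape and channel names below are hypothetical:

```python
# Hedged sketch of "monitoring by exception": routine status events are
# filtered by severity and consolidated per channel, so operators see only
# the items that may require action. Event fields here are illustrative.

from collections import defaultdict

SEVERITY = {"info": 0, "warning": 1, "alarm": 2}

def consolidate(events, min_severity="warning"):
    """Group events at or above min_severity by channel."""
    floor = SEVERITY[min_severity]
    display = defaultdict(list)
    for ev in events:
        if SEVERITY[ev["severity"]] >= floor:
            display[ev["channel"]].append(ev["message"])
    return dict(display)

events = [
    {"channel": "CH1", "severity": "info",    "message": "playlist loaded"},
    {"channel": "CH1", "severity": "alarm",   "message": "audio silent > 10 s"},
    {"channel": "CH2", "severity": "warning", "message": "captions missing"},
]
print(consolidate(events))
# → {'CH1': ['audio silent > 10 s'], 'CH2': ['captions missing']}
```

In practice the inputs span equipment telemetry, signal probes and playlist data, as the article notes; the filtering principle is the same.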
As for the next shift in broadcast monitoring, Mailhot believes that “qualitative monitoring” will become more critical as premium content services mature: “You are seeing more channels come through where qualitative monitoring is more important – for example, where there is no ad insertion and it’s [vital that there be] no interruption to programming.” Whilst higher density monitoring is crucial here, so is “some level of deep alarming, such as that [pertaining to questions like] ‘how long should something be silent, or how long should there be a blank screen’?”
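A duration-threshold alarm of the kind Mailhot alludes to can be sketched as follows, assuming per-second audio level readings; the floor and duration values are illustrative, not drawn from any Imagine product:

```python
# Sketch of a duration-based "deep alarm": per-second audio levels (dBFS)
# are scanned, and an alarm fires only when silence persists longer than a
# configurable limit. Threshold values below are purely illustrative.

def silence_alarms(levels_dbfs, silence_floor=-60.0, max_silent_secs=10):
    """Return (start, length) of every silent run longer than the limit."""
    alarms, run_start = [], None
    for t, level in enumerate(levels_dbfs + [0.0]):  # sentinel ends last run
        if level <= silence_floor:
            if run_start is None:
                run_start = t
        elif run_start is not None:
            length = t - run_start
            if length > max_silent_secs:
                alarms.append((run_start, length))
            run_start = None
    return alarms

# Twelve consecutive silent seconds starting at t=5 trigger one alarm:
levels = [-20.0] * 5 + [-70.0] * 12 + [-18.0] * 3
print(silence_alarms(levels))  # → [(5, 12)]
```

The same run-length logic applies to black-frame detection on the video side; only the measurement being thresholded changes.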
There is also a personnel dimension to be factored in here. “As the number of channels goes up and the operator numbers go down, I think you are going to see more of the presentation suite-type models in which there is a higher density [of monitoring for] Master Control operators as opposed to Engineering operators.” In that context, he indicates, the driving impetus may be how to “allow the density to be increased but without the operator losing track of what they are doing.”
Perhaps more than any other broadcast technology discipline, monitoring seems to be the area in which the skilful blend of new with tried-and-trusted technologies and techniques is instrumental to getting the job done. In that context the rise of more sophisticated alarming/notification and AI/ML can be seen as the latest vital assets in helping broadcasters manage increased expectations around both the quality and quantity of content.