As interest in object-based audio (OBA) continues to grow, new production tools are being developed by research institutions, broadcasters and vendors to make it happen.

Among the different technologies now frequently bracketed together under the label of Next Generation Audio (NGA), it is perhaps object-based audio (OBA) that is generating the most excitement in broadcasting.


Salsa Sound: In the studio

OBA has the potential to reinvent the traditional audio picture for home viewers – giving them the freedom to emphasise their preferred audio elements, such as dialogue, commentary or crowd. This potential has been seized upon with particular vigour by sports and entertainment producers.

The general opinion is that, within a few years, it will be relatively commonplace for viewers to enjoy a degree of personalisation, using an interface that lets them adjust the prominence of various audio objects and elements.

But it does require a paradigm shift on the part of content creators, with BBC R&D neatly summarising the principles behind OBA as follows: “By breaking down a piece of media into separate objects, attaching meaning to them, and describing how they can be rearranged, a programme can change to reflect the context of an individual viewer.”

“People’s interest in object-based broadcasting varies enormously depending on their level of understanding of it. In some areas, for example BBC Radio Engineering, it is the focus of a significant amount of effort.” Andrew Mason, BBC R&D

Given the cutting-edge nature of this area – and its somewhat uncertain commercial future – it makes sense for broadcasters and research institutions to be figuring prominently alongside vendors in developing solutions that can support OBA, binaural sound and other NGA technologies.

Developing production tools
In the UK, the BBC has been at the forefront of OBA research, with BBC R&D playing an integral role in the now-completed Orpheus project – a European initiative to develop, implement and validate a new end-to-end object-based media chain for audio content.

It has also contributed to a new ITU recommendation, ITU-R BS.2125 ‘A serial representation of the Audio Definition Model’, published in February 2019, which builds upon the Audio Definition Model – a metadata specification that can be used to describe object-based, scene-based and channel-based audio.
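To give a flavour of what such metadata looks like: the Audio Definition Model is expressed in XML, with programmes referencing contents, which in turn reference objects. The heavily simplified fragment below is purely illustrative – the element names follow the ADM's general conventions, but the IDs, names and values are invented for this example and it is nowhere near a complete, valid ADM document.

```xml
<!-- Illustrative, simplified ADM-style fragment. IDs and names are
     invented; a real BS.2076 document carries many more elements
     (pack formats, channel formats, track UIDs, etc.). -->
<audioFormatExtended>
  <audioProgramme audioProgrammeID="APR_1001"
                  audioProgrammeName="Match coverage">
    <audioContentIDRef>ACO_1001</audioContentIDRef>
    <audioContentIDRef>ACO_1002</audioContentIDRef>
  </audioProgramme>
  <!-- The commentary is flagged as dialogue and exposed as an
       interactive object the listener can rebalance. -->
  <audioContent audioContentID="ACO_1001" audioContentName="Commentary">
    <audioObjectIDRef>AO_1001</audioObjectIDRef>
    <dialogue>1</dialogue>
  </audioContent>
  <audioContent audioContentID="ACO_1002" audioContentName="Crowd">
    <audioObjectIDRef>AO_1002</audioObjectIDRef>
    <dialogue>0</dialogue>
  </audioContent>
  <audioObject audioObjectID="AO_1001" audioObjectName="Commentary"
               interact="1"/>
  <audioObject audioObjectID="AO_1002" audioObjectName="Crowd"
               interact="1"/>
</audioFormatExtended>
```

BS.2125's contribution is to define a serial, frame-by-frame representation of this model, so that the metadata can travel alongside the audio through a live broadcast chain rather than only as a one-off file header.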


Andrew Mason, BBC R&D

Surveying the current OBA landscape, BBC R&D senior research engineer Andrew Mason says: “People’s interest in object-based broadcasting varies enormously depending on their level of understanding of it. In some areas, for example BBC Radio Engineering, it is the focus of a significant amount of effort, designing the next generation of radio broadcasting infrastructure.”

Mason adds: “The impact on production areas – both TV and radio – is still modest, being limited at the moment to an underpinning technology for binaural productions, many of which have now been aired or published on the BBC website. [Meanwhile] the interest of programme commissioners and programme makers in the possibilities of personalisation – for speech/music balance control, as an example – is still being developed.”

BBC R&D is certainly putting in the groundwork, having developed – and continuing to develop – production tools to introduce programme makers to workflows for immersive audio experiences, as well as creating plug-ins for popular digital audio workstations. In addition, the BBC’s educational arm, the BBC Academy, now runs a training programme for producers, designed by BBC R&D staff, that teaches binaural production. 


BBC Sounds binaural session 

Another organisation involved in the Orpheus project was Fraunhofer IIS, the German research institute that is one of the key supporters of MPEG-H, an audio system devised for immersive and interactive sound for TV and VR applications.

At IBC 2018 Fraunhofer IIS exhibited a complete production and transmission chain that included MPEG-H monitoring units for real-time monitoring and content authoring, post-production tools, MPEG-H Audio real-time broadcast encoders, and decoders in professional and consumer receivers. Ateme, Jünger Audio, Lawo and Linear Acoustic were among the leading manufacturers participating in the showcase.

“Viewers can personalise a programme’s audio mix by switching between different languages, enhancing hard-to-understand dialogue, or adjusting the volume of the commentator in sports broadcasts.” Adrian Murtaza, Fraunhofer IIS 


Adrian Murtaza, Fraunhofer IIS

With MPEG-H, says Fraunhofer IIS technical standards and business development senior manager Adrian Murtaza, it is possible to offer “immersive sound to increase the realism and immersion in the scene, [as well as] the use of audio objects to enable interactivity. This means viewers can personalise a programme’s audio mix, for instance by switching between different languages, enhancing hard-to-understand dialogue, or adjusting the volume of the commentator in sports broadcasts.”
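The object-gain mechanism Murtaza describes can be sketched in a few lines. The toy renderer below is an invented illustration, not MPEG-H's actual renderer: each object carries its audio plus metadata (a default gain and an interactivity flag), and personalisation amounts to applying a listener-chosen gain offset to each adjustable object before summing everything into the output.

```python
from dataclasses import dataclass

@dataclass
class AudioObject:
    """One element of the mix, carried separately with its own metadata."""
    name: str
    samples: list          # mono PCM samples, for brevity
    gain_db: float = 0.0   # producer's default level
    interact: bool = True  # may the listener adjust this object?

def render(objects, preferences):
    """Sum objects into one channel, applying listener gain offsets in dB.
    A deliberately minimal stand-in for the renderer in an NGA decoder."""
    length = max(len(o.samples) for o in objects)
    out = [0.0] * length
    for obj in objects:
        offset_db = preferences.get(obj.name, 0.0) if obj.interact else 0.0
        gain = 10 ** ((obj.gain_db + offset_db) / 20)  # dB to linear
        for i, s in enumerate(obj.samples):
            out[i] += s * gain
    return out

# A listener turns the crowd down 6 dB relative to the commentary.
mix = render(
    [AudioObject("commentary", [0.5, 0.5]), AudioObject("crowd", [0.2, 0.2])],
    {"crowd": -6.0},
)
```

The same structure supports the other use cases mentioned: language switching is selecting one dialogue object among several, and dialogue enhancement is a positive offset on the speech object.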

Along with Dolby’s AC-4 format – which natively supports the Dolby Atmos immersive audio technology – MPEG-H is expected to make a significant impact on broadcast NGA services over the next few years.

Its supporters can already point to a successful deployment in South Korea – where it was specified as part of the country’s first 4K UHD TV service, which has been on air 24/7 since May 2017 – as well as a slew of high-profile trials, including that conducted by France Télévisions at the 2018 French Open in Paris.

Although industry heavy-hitters like Dolby and Fraunhofer are clearly making a crucial contribution to the OBA/NGA revolution, it’s also affording opportunities for newer players.

Take the example of UK-based Salsa Sound, a startup that originated as a research initiative at Salford University and has developed a set of tools for automatic mixing that are both channel- and object-based.

“With audio over IP and OTT services being increasingly adopted and rolled out by OB companies and broadcasters, all of the infrastructure is there for OBA broadcasts.” Rob Oldfield, Salsa Sound

Co-founder Rob Oldfield says that Salsa’s primary focus is on live sports, where its machine learning engine will automatically create a mix of the on-pitch sounds without any additional equipment, services or human input – freeing sound supervisors to create better mixes and author immersive OBA experiences.


Rob Oldfield, Salsa Sound

“Our solutions not only create a mix for a channel-based world, but also allow for the individual objects to be broadcast separately with accompanying metadata from our optimised triangulation procedure which places all of the sounds in 3D space – even in a high noise environment – which helps facilitate immersive and interactive applications,” says Oldfield.
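Salsa's optimised triangulation procedure is proprietary, but the general idea of localising an on-pitch sound from fixed microphones can be illustrated with a generic time-difference-of-arrival search. The example below is an invented, deliberately crude sketch – real systems use far more microphones and far more efficient, noise-robust solvers.

```python
import itertools
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

# Hypothetical pitch-side microphone positions in metres.
MICS = [(0.0, 0.0), (50.0, 0.0), (0.0, 30.0), (50.0, 30.0)]

def arrival_times(source):
    """Time of flight from a source position to each microphone."""
    return [math.dist(source, m) / SPEED_OF_SOUND for m in MICS]

def localise(times, step=0.5):
    """Grid-search the position whose predicted inter-microphone delays
    best match the observed ones (a stand-in for a real TDOA solver)."""
    pairs = list(itertools.combinations(range(len(MICS)), 2))
    observed = {(i, j): times[j] - times[i] for i, j in pairs}
    best, best_err = None, float("inf")
    for xi in range(101):            # 0 .. 50 m in 0.5 m steps
        for yi in range(61):         # 0 .. 30 m in 0.5 m steps
            x, y = xi * step, yi * step
            t = arrival_times((x, y))
            err = sum((t[j] - t[i] - observed[(i, j)]) ** 2
                      for i, j in pairs)
            if err < best_err:
                best, best_err = (x, y), err
    return best

est = localise(arrival_times((20.0, 10.0)))  # recovers roughly (20, 10)
```

Once a sound has a position, it can be carried as an object with positional metadata rather than baked into fixed channels – which is exactly what makes the immersive and interactive applications Oldfield mentions possible downstream.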

Calls for customisation
Although it may take some years yet for content creatives to become fully au fait with the potential of OBA, one by one the primary pieces of the workflow jigsaw are falling into place.

Oldfield remarks: “Both broadcasters and viewers are increasingly looking for more interactive, personalised and immersive experiences with content that is not restricted to certain reproduction formats (stereo, 5.1, etc). With audio over IP and OTT services being increasingly adopted and rolled out by OB companies and broadcasters, all of the infrastructure is there for OBA broadcasts – and this can only be a good thing for viewers who are looking for more dynamic, rich and entertaining content that they can tailor to their own needs.”

Mason agrees that personalisation will be a big driver of OBA, but appears slightly more circumspect about the trajectory of adoption.

“Obstacles are still present in that some equipment that the workflow requires does not exist, or exists but is not yet robust enough for broadcast deployment,” he says. “Getting object-based audio to work at a large scale, for sustained broadcasting, rather than ‘one offs’, presents a challenge. Also, interoperability of tools from different manufacturers is still not as well-developed as will be required. One approach that we are currently developing is that of defining the production, interchange and archive formats (e.g. through the adoption of Audio Definition Model) so that productions can be ready when the technology further down the chain becomes stable enough.”

With 4K, HDR and potentially, in a few years’ time, 8K making the visual side of the home entertainment experience ever more impressive, the role of OBA in making the audio aspect equally compelling is bound to be fundamental to the overall success of these cutting-edge broadcast services.