The subject of advances in audio is explored by Rob Oldfield in his paper 'Cloud-based AI for automatic audio production' and by Matteo Torcoli in the paper 'Dialog+ in Broadcasting: First Field Tests using Deep-Learning-based Dialogue Enhancement.'

Every broadcaster knows that the most common complaint from viewers is that programme dialogue is hard to discern against a background of atmospheric sounds, mood music and competing voices. It is especially a problem for older viewers: 90% of people over 60 report difficulties.

Decades of research have sought a way of enhancing the intelligibility of TV dialogue, without success, until now! We present the results of trials by a collaboration of researchers using their deep-neural-network-based technology across a wide range of TV content and age groups. The results show startling performance; join us to judge the benefits for yourself!

Advances in audio

We shall also hear about exciting research using cloud-based AI and 5G connectivity to deliver live immersive experiences to a variety of consumer devices. Key to the experience is the viewers' ability to change their viewpoint on the content, with live rendering taking place in the cloud. The presentation focuses on the audio, which is object-based and AI-driven and carries with it the metadata necessary for personalised rendering of the scene.

The capture of the background is also critical to recreating the audio scene; for this, the team chose second-order ambisonics accompanied by Serialised Audio Definition Model descriptive metadata. The presentation will explore detailed aspects of the audio processing and production. Altogether, a fascinating glimpse of the technology required to convey 360° audio for free-viewpoint XR!
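As background to that choice: a full-sphere ambisonic representation of order N uses (N+1)² channels, so the second-order capture mentioned above carries nine channels of scene audio. A minimal sketch of that relationship (the function name is illustrative, not from the paper):

```python
def ambisonic_channel_count(order: int) -> int:
    """Channels in a full-sphere ambisonic scene of the given order: (N + 1)^2."""
    return (order + 1) ** 2

# First order (classic B-format) uses 4 channels; second order, as chosen
# by the team for the background capture, uses 9.
print(ambisonic_channel_count(1))  # → 4
print(ambisonic_channel_count(2))  # → 9
```

Higher orders sharpen spatial resolution at the cost of a quadratically growing channel count, which is one reason second order is a common compromise for broadcast capture.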

