Object-based audio production has begun to attract consumer attention over the past year, with Dolby Atmos technology incorporated into headphones, speakers, and soundbars from a variety of vendors. Apple, Amazon, and Tidal all offer Atmos-encoded spatial audio streams.
Traditional surround sound takes a linear, channel-based approach. A 5.1 mix is made up of six discrete audio streams – centre, left, right, left surround, right surround, and LFE (Low Frequency Effects) – intended to be played back through six separate speakers.
The challenge with this approach is that creators have no control over the configuration of the playback systems in people's homes. A carefully crafted surround mix can completely fall apart when listened to over a set of cheap stereo speakers or the tinny mono speaker of a smartphone.
One of the main objectives of an object-based approach is to allow the playback devices themselves to control the final mix. Instead of a fixed linear stream, audio elements are treated as independent objects with metadata associated with them that specifies volume, location, and other attributes that the playback system renders based on its capabilities.
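The idea can be sketched in a few lines of code. This is a deliberately simplified, hypothetical model – real formats such as Atmos carry far richer metadata – but it shows the core principle: the object carries metadata, and the playback device decides how to realise it for whatever speakers it has.

```python
from dataclasses import dataclass

# Hypothetical sketch of the object-based model: each sound carries
# metadata, and the playback device renders it to its own layout.
@dataclass
class AudioObject:
    name: str
    samples: list[float]   # mono source signal
    gain: float            # linear gain, 0.0-1.0
    azimuth: float         # horizontal angle in degrees, 0 = front

def render(obj: AudioObject, layout: str) -> dict[str, list[float]]:
    """Render one object for a given speaker layout.

    A mono device folds everything down; a stereo device pans the
    object by its azimuth using a simple linear pan law.
    """
    scaled = [s * obj.gain for s in obj.samples]
    if layout == "mono":
        return {"M": scaled}
    if layout == "stereo":
        # map azimuth -90..+90 degrees to a pan position 0..1
        pan = (max(-90.0, min(90.0, obj.azimuth)) + 90.0) / 180.0
        return {
            "L": [s * (1.0 - pan) for s in scaled],
            "R": [s * pan for s in scaled],
        }
    raise ValueError(f"unsupported layout: {layout}")

shaker = AudioObject("shaker", [1.0, 0.5], gain=0.8, azimuth=90.0)
stereo_mix = render(shaker, "stereo")  # all energy in the right channel
mono_mix = render(shaker, "mono")      # same object, folded down
```

The same object renders sensibly on both devices, which is exactly what a fixed 5.1 stream cannot do.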
But the potential of the object approach goes well beyond simply replicating the surround experience. There is the opportunity to create truly immersive and personalised experiences for audiences.
Charlie Morrow, founder of Morrowsound and one of the pioneers of immersive audio, believes the most important thing is to capture the sound correctly at the start. He references mono orchestral recordings from the 1930s: “The German recordings of orchestral music of the classics are superb,” he said. “Because all they were concerned about was being in the right spot at the right time. Because people were making music so that it would sound right if you were in the right spot at the right time.”
Morrow discussed his approach to recording concerts in spatial audio. “If it’s an immersive sound experience, you also have to record the ambience,” he said. “I’ve found it very useful to place some microphones in interesting places by spending time in rehearsal going around and finding which alcove off the stage sounds interesting - maybe it’s a dressing room, maybe it’s someplace off to the side near a delivery exit, maybe there’s something in the back of the theatre - and then to be able to capture all of that with reasonable accuracy so that it can be combined with a very good stereo image from right in the audience to work with as a kind of a base.”
Tony Churnside, an audio technologist and producer who has worked on cutting-edge audio productions for the BBC, The Guardian, the musician Björk, and many others, believes that aesthetics should guide the use of technology. “If the creative intent of the experience is just a stereo mix listened to on headphones and that’s the only way it’s ever going to be listened to, you can represent the whole thing as a single object of two channels,” he said. “Whereas if your intention is for it to be listened to on stereo headphones, but also in a 5.1 environment, and also in a 22.2 or a properly rendered binaural experience where all of the different sources have got different directions then your objects are probably somewhere in between all of the sound sources in the space, and the channels that feed your loudspeakers.”
“For example, if you’re going to make a fully object-based version of the BBC Proms that people can listen to on headphones or a 22.2 channel setup, or on a 7.1 setup at home, sending every single one of the 150 mic channels to a device that’s going to render those is overkill,” he said. “The answer might be to send a bed of the space and a few key objects that need directionality. So, you might only be sending 10 objects, not 150.”
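Churnside's "bed plus key objects" idea can be sketched as follows. This is an illustrative toy, not a real broadcast API: most of the microphone channels are pre-mixed into a static ambience bed, and only sources that need their own directionality travel as separate objects.

```python
# Sketch of packaging many mic channels as a few objects plus a bed.
# The mic names and the naive summing are illustrative assumptions.

def package(channels: dict[str, list[float]], key_names: set[str]):
    """Split mics into discrete objects and a summed ambience bed."""
    objects = {n: ch for n, ch in channels.items() if n in key_names}
    rest = [ch for n, ch in channels.items() if n not in key_names]
    length = max((len(ch) for ch in rest), default=0)
    # sum everything that is not a key object into one bed channel
    bed = [sum(ch[i] for ch in rest if i < len(ch)) for i in range(length)]
    return objects, bed

mics = {
    "soloist": [0.9, 0.7],
    "audience_l": [0.1, 0.2],
    "audience_r": [0.2, 0.1],
}
objs, bed = package(mics, {"soloist"})
# one directional object plus a single ambience bed, instead of
# sending every mic channel to the renderer
```

A real production would use a proper spatial bed (e.g. several channels) rather than a mono sum, but the data-reduction principle – 150 mics in, perhaps 10 objects out – is the same.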
Personalised and interactive audio
Object-based audio offers more than immersive experiences. Allowing the audience a certain amount of control of the playback unlocks huge potential for personalisation and interactive experiences.
Audio technology consultant Rupert Brun has a different take on the opportunities afforded by object-based audio. “I’m absolutely convinced personalization is far more important than immersion for the vast majority of broadcast content,” he said.
“Broadcasters receive more complaints about inaudible dialogue than anything else. And sometimes it’s because the actors have mumbled and sometimes it’s because the sound balance is not good. But the problem is far more widespread than those isolated incidents,” he said.
“We have an ageing population. And one of the first things to go when you get older is your ability to understand speech in the presence of background sound. You add that to the frankly poor sound quality of low budget televisions and it’s a growing problem. We’re not all just consuming media sitting in a quiet living room. I think one sound balance will no longer serve everybody.
“If the broadcasters were to mix the dialogue sufficiently prominently that an ageing population with inexpensive televisions could hear every word, then those who have a better hearing and have invested in a wonderful home cinema system would get a very unpleasant experience because the dialogue would be far too dominant. I think the ability to rebalance the dialogue against everything else will be a huge use case.”
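Because dialogue travels as its own object, the rebalancing Brun describes becomes a playback-time decision rather than a broadcast-time compromise. A minimal sketch, assuming a simple one-channel mix and an illustrative `dialogue` object name:

```python
# Illustrative sketch of listener-controlled dialogue rebalancing:
# the device applies the listener's preferred boost to the dialogue
# object before summing, instead of using one fixed balance for all.

def mix(objects: dict[str, list[float]], dialogue_boost: float = 1.0):
    """Sum objects to one channel, scaling dialogue by preference."""
    length = max(len(s) for s in objects.values())
    out = [0.0] * length
    for name, samples in objects.items():
        g = dialogue_boost if name == "dialogue" else 1.0
        for i, s in enumerate(samples):
            out[i] += s * g
    return out

programme = {"dialogue": [0.2, 0.2], "music": [0.5, 0.5]}
default = mix(programme)                         # one balance for all
accessible = mix(programme, dialogue_boost=2.0)  # clearer speech
```

The home-cinema listener keeps the default balance; the listener with age-related hearing loss or a cheap TV turns the dialogue up, and neither spoils the mix for the other.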
Churnside emphasised the benefits that moving away from a linear approach brings to creating interactive works.
“When you move away from channels and stems and work with objects is when you start talking about interactivity in the time domain,” he said. “In an object-based world, things aren’t just given a location in space - they’re given a location in time as well. It’s then you can start really playing with interactivity by navigating through multi-stranded storylines and triggering different bits at different times into something that’s more like a computer game than a fixed narrative.”
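The idea of objects placed in time as well as space can be sketched with a toy scheduler. The object fields, branch names, and cues below are invented for illustration; they are not drawn from any real production system.

```python
from dataclasses import dataclass

# Hypothetical sketch: each object has a trigger time and belongs to
# a story strand, and an interactive choice selects which strand's
# objects actually get scheduled.
@dataclass
class TimedObject:
    name: str
    start_s: float   # when to trigger, seconds into the piece
    branch: str      # which storyline strand it belongs to

cues = [
    TimedObject("intro", 0.0, "common"),
    TimedObject("door_creak", 12.5, "haunted"),
    TimedObject("crowd_cheer", 12.5, "parade"),
]

def schedule(cues, chosen_branch: str):
    """Return the playback order for the chosen narrative strand."""
    active = [c for c in cues if c.branch in ("common", chosen_branch)]
    return sorted(active, key=lambda c: c.start_s)

order = [c.name for c in schedule(cues, "haunted")]
# ["intro", "door_creak"] - the "parade" strand never fires
```

Swapping the listener's choice swaps which objects fire at 12.5 seconds, which is the branching, game-like behaviour Churnside describes.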
Churnside was involved in the creation of “The Vostok K Incident,” an experimental multichannel, multi-device production.
“This was an interactive, immersive audio experience with objects that were being triggered in space and time,” he said. “Different audio objects were sent to different devices. You could hear explosions coming from behind you, and other elements, without necessarily having a 5.1 system – you could connect simultaneously with a mobile phone, an iPad, and a laptop, for example. It wasn’t just about being immersive; we introduced additional secret, hidden bits of story, depending on how many devices you had connected.”
Although Atmos is grabbing most of the attention right now, there are other standards and companies competing in the space. DTS has introduced its DTS:X technology, and several broadcasters and device manufacturers have adopted MPEG-H to support object-based audio.
Leaked news of a new immersive audio standard from Google has garnered a lot of attention, and with the Xbox supporting Dolby Atmos and Sony integrating its 3D audio technology into the PlayStation 5, all signs indicate that the concept is developing significant traction beyond traditional broadcast and cinema.