Technology consistently offers new ways for video to be made and consumed. AI and XR are two of the new frontiers currently rolling out fresh pathways to be explored.
BT spent 18 months exploring the opportunities of the latter. XR stands for extended reality and encompasses virtual reality, augmented reality and mixed reality. It covers everything from apps that map contextual content onto a phone screen to fully immersive VR experiences that replace the outside world with a new reality.
This project was led by Andrew Gower, BT’s Head of Immersive Content Research, who described what the team did during this experimental research phase in a presentation titled Technical Paper ‘XR and Avatars’.
Prototyping and experimentation
The team created a series of prototypes in association with industry partners, aiming to get a clearer picture of the opportunities and challenges ahead. These included a MotoGP mixed reality application in which the user wears a virtual reality headset to get a top-down view of the race through a “very large wraparound video gallery” and a 3D render of a full-size bike.
Another prototype revolved around an interactive multi-view football experience. The user can choose the camera, and in some cases can effectively select where it looks, thanks to the use of Kandao Obsidian R Professional 360-degree cameras. “People can look wherever they want to and jump to a different viewpoint,” said Gower.
The BT team also experimented with volumetric video, where environments in video are modelled in full 3D. They “combine RGB data with depth sensing data, in order to generate a polygonal mesh and a texture map,” said Gower.
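As a rough illustration of the idea Gower describes, the sketch below turns a single depth image into mesh vertices, triangles and texture coordinates that index into the matching RGB frame. It is a minimal sketch, not BT’s or its partners’ pipeline: the function name backproject_depth, the pinhole camera intrinsics (fx, fy, cx, cy) and the simple grid triangulation are all assumptions made for the example, and a production system would also have to fuse several cameras and cope with missing depth samples.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Convert a depth image into a simple textured grid mesh.

    depth[v][u] is the distance in metres along the camera's z axis for
    pixel (u, v). fx, fy, cx, cy are assumed pinhole camera intrinsics.
    The image must be at least 2 x 2 pixels.
    """
    height, width = len(depth), len(depth[0])

    # One 3D vertex per pixel, using the pinhole camera model.
    vertices = []
    for v in range(height):
        for u in range(width):
            z = depth[v][u]
            vertices.append(((u - cx) * z / fx, (v - cy) * z / fy, z))

    # Two triangles per 2 x 2 block of neighbouring pixels.
    triangles = []
    for v in range(height - 1):
        for u in range(width - 1):
            i = v * width + u
            triangles.append((i, i + 1, i + width))
            triangles.append((i + 1, i + width + 1, i + width))

    # Texture coordinates: each vertex samples its own pixel in the RGB frame.
    uvs = [(u / (width - 1), v / (height - 1))
           for v in range(height) for u in range(width)]
    return vertices, triangles, uvs
```

The RGB frame itself serves as the texture map here, which is why each vertex only needs a normalised pixel coordinate rather than a colour.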
Spatial audio was also explored. This is where sound played to the viewer is placed in the sound field based on their own position in an environment, just like real-world sound. “When you move through a space or you orientate yourself, those audio objects stay static, if that’s how the audio engineer has designed them.”
These cutting-edge proof-of-concept experiences were made with a handful of partners including Condense Reality, The Grid Factory, Bristol University, Dance East and Salsa Sound, and were funded by the DCMS.
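The behaviour described in that quote can be sketched in a few lines: the audio object keeps a fixed position in the world, and only its direction and distance relative to the listener change as the listener moves or turns. This is a simplified 2D sketch under stated assumptions, not the renderer used in the project: the function name, the yaw convention (degrees, counter-clockwise from the +x axis) and the 1/distance gain rule are all illustrative choices.

```python
import math

def object_relative_to_listener(source_xy, listener_xy, listener_yaw_deg):
    """Return (azimuth_deg, distance, gain) of a static audio object.

    Assumed conventions: yaw is measured counter-clockwise from the +x
    axis, and a positive azimuth means the object is to the listener's left.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)

    # Direction of the object in world space, then relative to the listener's
    # facing direction, wrapped into the range [-180, 180).
    world_bearing = math.degrees(math.atan2(dy, dx))
    azimuth = (world_bearing - listener_yaw_deg + 180.0) % 360.0 - 180.0

    # Illustrative inverse-distance attenuation, clamped inside 1 metre.
    gain = 1.0 / max(distance, 1.0)
    return azimuth, distance, gain
```

With the object fixed two metres along +y, a listener facing it gets an azimuth of about 0 degrees. If the listener turns to face +x, the same call reports the object about 90 degrees to the left, even though the object has not moved.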
Democratising XR via the cloud
However, short of having a VR headset to provide you with an on-the-spot demo, it’s the technical learnings of this project that turn out to be the most intriguing parts. And that all starts with the central premise, as Gower explains.
“The goal was to enable many more people to have the experience and not have to invest in high-end equipment,” he said. BT’s XR work was built around cloud computing and 5G, meaning the end user can access the experiences from almost anywhere and is not limited by the power of the phone or computer used.
“It is difficult to deliver these purely on client-side devices without some form of cloud capability,” said Gower.
BT used computers with Nvidia graphics cards to provide the cloud power to render the content involved in each of its prototype experiences. Gower called it “a relatively small cluster, with just 12 RTX8000 graphics cards.”
The Nvidia RTX8000 is the pro equivalent of a high-end gaming graphics card, one with 48GB RAM — double the amount of the top Nvidia RTX 4090 consumer card.
The processing power of these 12 cloud-based graphics cards handles all the real rendering work involved, meaning the job of the user’s phone or VR headset is largely to display the resulting video feed. “The system is only delivering personalised video,” Gower explained. This is the root of how the cloud-based approach can give a consistent experience regardless of the specific device someone uses.
Of course, this is a little more complicated than a VOD scenario, due to the innate interactivity of XR.
“As someone is moving around, changing their position, we need to update the system, its orientation,” said Gower. “The 3D scene gets rendered in the cloud and at that time we also create a virtual video viewpoint. That replicates in real time the position, the orientation of the user.”
There is a constant stream of information heading both ways. The end user’s device has to relay its position, using motion sensor data, so the cloud server knows the exact view of the scene it needs to render. And any snags in this process will be immediately jarring and, in the case of VR, may even induce nausea. “The delay and the latency has a massive impact on the quality of experience of the end user,” said Gower.
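The upstream half of that exchange can be pictured as a small pose message that the device sends and the server turns into a camera direction for its virtual viewpoint. The sketch below is hypothetical throughout: the PoseUpdate fields, the JSON encoding and the axis convention (y up, yaw 0 looking along -z, positive yaw turning towards +x) are assumptions for illustration, not the protocol BT used.

```python
import json
import math
from dataclasses import dataclass, asdict

@dataclass
class PoseUpdate:
    """Hypothetical message sent from the headset or phone to the cloud."""
    timestamp_ms: int   # when the motion sensors were sampled
    position: tuple     # (x, y, z) in metres
    yaw_deg: float      # rotation about the vertical axis
    pitch_deg: float    # looking up (+) or down (-)

def encode_pose(pose):
    """Serialise a pose for sending to the server."""
    return json.dumps(asdict(pose))

def decode_pose(payload):
    """Rebuild the pose on the server side."""
    d = json.loads(payload)
    return PoseUpdate(d["timestamp_ms"], tuple(d["position"]),
                      d["yaw_deg"], d["pitch_deg"])

def camera_forward(pose):
    """Unit vector the virtual camera should look along for this pose."""
    yaw = math.radians(pose.yaw_deg)
    pitch = math.radians(pose.pitch_deg)
    return (math.cos(pitch) * math.sin(yaw),
            math.sin(pitch),
            -math.cos(pitch) * math.cos(yaw))

def pose_age_ms(pose, now_ms):
    """How stale the pose is by the time the server uses it."""
    return now_ms - pose.timestamp_ms
```

The age of the pose when the frame is rendered is only one part of the delay the viewer perceives, since the encoded video still has to travel back to the device and be decoded and displayed.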
The challenges of deploying at scale
The BT team found it was possible to “deliver some really compelling experiences.” But the sheer technical challenge on the horizon is immense, particularly if you begin talking about deploying this form of experience at scale. A live football match makes a good case study.
“When we’re thinking about sports broadcasting, you have tens of thousands of people watching the same content,” Gower noted. But the number of feeds, the number of people, that can be supported per block of hardware is an issue.
“In the prototypes we developed we only managed to get two concurrent streams per GPU. A sister project to us, called the Green Planet, which was using exactly the same setup, doing similar types of experiences, managed about five but couldn’t really push it beyond that,” said Gower.
“The techno-economics of delivering these services to consumers in a broadcasting context is quite a challenge. There is a lot more work to do,” he said.
“The number of users per GPU needs to increase, by a factor of 10 at the very least… That needs a reimagining of how cloud XR, or another edge cloud rendering platform, needs to be architected.”
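Taking the figures quoted here at face value, the scale of the gap is easy to work out. The short sketch below uses the 12-GPU cluster and the two and five streams per GPU that Gower mentions; the audience of 10,000 is an assumed, illustrative figure for the low end of “tens of thousands”, and the helper names are made up for the example.

```python
import math

def concurrent_streams(gpus, streams_per_gpu):
    """Total viewers a cluster can serve at once."""
    return gpus * streams_per_gpu

def gpus_needed(viewers, streams_per_gpu):
    """GPUs required to give every viewer their own rendered stream."""
    return math.ceil(viewers / streams_per_gpu)

CLUSTER_GPUS = 12          # size of BT's cluster, as quoted
BT_STREAMS = 2             # streams per GPU in BT's prototypes
GREEN_PLANET_STREAMS = 5   # streams per GPU in the sister project
TARGET_STREAMS = BT_STREAMS * 10   # the "factor of 10" improvement
AUDIENCE = 10_000          # assumed audience, for illustration only

print(concurrent_streams(CLUSTER_GPUS, BT_STREAMS))            # 24
print(concurrent_streams(CLUSTER_GPUS, GREEN_PLANET_STREAMS))  # 60
print(gpus_needed(AUDIENCE, BT_STREAMS))                       # 5000
print(gpus_needed(AUDIENCE, TARGET_STREAMS))                   # 500
```

On these assumptions the prototype cluster serves 24 viewers at a time, and even a tenfold improvement in streams per GPU would still leave an audience of 10,000 needing around 500 GPUs.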
One possible avenue mentioned was in sharing the processing load between the cloud and the end user’s device.
There’s another issue too: what the hardware is doing when demand for these XR experiences drops off. Again, this is hugely relevant when considered in the context of live sport. “We can’t deliver this type of service to consumers just thinking about broadcasting,” said Gower. “We have to think about how that cluster gets used 24/7, 365. During the day, for example, it’s used by industry and enterprise, for architects and engineers. In the evening, perhaps it’s a consumer offering, for gaming and broadcasting.”
This part of cloud computing is often forgotten. While the “cloud” part makes the hardware seem almost ethereal, the power is still very much provided by real computers. They just live in someone else’s building.
The idea of a million football fans watching a cup final, and all directing their own footage via cloud computing, may be a way off still. However, BT’s experiments in XR show what is possible with a little imagination.
BT’s Head of Immersive Content Research, Andrew Gower, presented the technical whitepaper ‘XR and Avatars’, covering a series of experiments in extended reality, in an IBC2022 session hosted by Alberto Duenas, Video Specialist at Warner Bros. Discovery.