Authors: Dr. M. Oskar van Deventer; Jean-Claude Dufourd; Sejin Oh; Seong Yong Lim; Youngkwon Lim; Krishna Chandramouli; Rob Koenen
The proliferation of affordable smart devices that can capture, process and render audio-visual media content creates a need for coordination and orchestration between these devices and their capabilities, and of the content flowing to and from them.
The upcoming MPEG Media Orchestration standard (“MORE”, ISO/IEC 23001-13) enables the temporal and spatial orchestration of multiple media and metadata streams. Temporal orchestration is about time synchronisation of media and sensor captures, processing and renderings, for which the MORE standard uses and extends a DVB standard.
Spatial orchestration is about the alignment of (global) position, altitude and orientation, for which the MORE standard provides dedicated timed metadata. Other types of orchestration involve timed metadata for region of interest, perceptual quality of media, audio-feature extraction and media timeline correlation.
This article presents the status of the MORE standard, as well as associated technical and experimental support materials. We also link MORE to the recently initiated MPEG-I (MPEG Immersive) project.
A typical household may own more than 10 internet-connected media devices, including smart TVs, tablet devices, smartphones and smart watches. The combined use of devices may enhance the media consumption experience. For example, the recent HbbTV 2.0 standard enables a smart TV to be connected to a tablet device for new types of interactive and synchronised media applications.
New opportunities for media orchestration also arise at the capture side, as the number of cameras, microphones and sensors (location, orientation) may match the number of people present at large sports, music or other events.
Moreover, the emergence of 360° video (“virtual reality”) and associated 3D audio offers opportunities for less TV-centric orchestrations of media capture, processing and rendering. MPEG (Moving Picture Experts Group) initiated its Media Orchestration (“MORE”) activity in early 2015, to create tools to manage multiple, heterogeneous devices over multiple, heterogeneous networks, orchestrating the devices, media streams and resources to create a single media experience.
The focus of the activity has been on temporal and spatial orchestration, that is, protocols and metadata for the time synchronisation of media capture and media rendering, as well as their spatial alignment. The work has resulted in a draft international standard, which is expected to be formally published in early 2018.
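To make the idea of temporal orchestration concrete, the following minimal sketch shows how a linear correlation between two timelines can map a capture timestamp onto a shared clock. It is an illustration only, not the MORE metadata syntax: the class name, field names and tick rates (a 90 kHz capture timeline and a millisecond wall clock) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class TimelineCorrelation:
    """Hypothetical linear mapping between timelines A and B (illustrative only)."""
    ref_a: int   # a tick value on timeline A ...
    ref_b: int   # ... and the tick value on timeline B at the same instant
    rate_a: int  # ticks per second of timeline A
    rate_b: int  # ticks per second of timeline B

    def a_to_b(self, ticks_a: int) -> Fraction:
        # Seconds elapsed since the reference point, measured on timeline A.
        elapsed = Fraction(ticks_a - self.ref_a, self.rate_a)
        return self.ref_b + elapsed * self.rate_b

    def b_to_a(self, ticks_b: int) -> Fraction:
        # Inverse mapping: seconds elapsed on timeline B, expressed in A ticks.
        elapsed = Fraction(ticks_b - self.ref_b, self.rate_b)
        return self.ref_a + elapsed * self.rate_a


# Assumed example: a camera timeline at 90 kHz whose tick 0 coincides with
# tick 1000 of a shared wall clock that counts milliseconds.
corr = TimelineCorrelation(ref_a=0, ref_b=1000, rate_a=90_000, rate_b=1_000)

# A frame captured two seconds into the camera timeline.
wall_clock_ms = corr.a_to_b(180_000)
print(wall_clock_ms)  # 3000
```

Exact rational arithmetic is used here so that repeated conversions between timelines do not accumulate rounding error; a real system would also have to account for clock drift, which this sketch ignores.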
The remainder of this paper discusses use cases for media orchestration, details of the functional architecture and associated metadata, and how media orchestration fits in the MPEG Immersive (MPEG-I) roadmap.