The move towards end-to-end IP between media producers and audiences will make new broadcasting systems vastly more agnostic to data formats and to diverse sets of consumption and production devices.

In this world, object-based media becomes increasingly important: it delivers efficiencies in the production chain, enables the creation of new experiences that continue to engage the audience, and gives us the ability to adapt our media to new platforms, services and devices.

This paper describes a series of practical case studies of our work in object-based user experiences since 2014. These projects encompass speech audio, on-line news and enhanced drama.

In each case, we are working with production teams to develop systems, tools and algorithms for an object-based world: these technologies and techniques enable its creation (often using traditional linear media assets) and post-production, transforming the experience for both audiences and production teams.


In 2014 BBC R&D presented an IBC paper on object-based broadcasting [1]: the representation of media content as a set of individual assets, together with metadata describing their relationships and associations, and the ability to bring these back together to make new content experiences. This work has continued to progress since those very early prototypes and proofs-of-concept.
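As a minimal sketch of this representation, content can be modelled as assets plus relationship metadata that a renderer recombines into an experience. The class and field names below are hypothetical illustrations, not a BBC API:

```python
# Illustrative sketch only: a minimal object-based media model. All names
# (MediaObject, Experience, render_plan) are hypothetical, not a BBC API.
from dataclasses import dataclass, field

@dataclass
class MediaObject:
    object_id: str
    kind: str                      # e.g. "audio", "video", "caption"
    uri: str                       # location of the underlying asset
    metadata: dict = field(default_factory=dict)

@dataclass
class Experience:
    """A composition: objects plus metadata describing their relationships."""
    title: str
    objects: list = field(default_factory=list)
    relationships: list = field(default_factory=list)  # (subject, relation, object)

    def render_plan(self):
        # Recombine objects into a simple playback plan; a real renderer
        # would interpret the relationship metadata per target device.
        return [(o.object_id, o.uri) for o in self.objects]

interview = Experience(
    title="Morning interview",
    objects=[
        MediaObject("q1", "audio", "assets/q1.wav", {"speaker": "presenter"}),
        MediaObject("a1", "audio", "assets/a1.wav", {"speaker": "guest"}),
    ],
    relationships=[("q1", "precedes", "a1")],
)
print(interview.render_plan())  # [('q1', 'assets/q1.wav'), ('a1', 'assets/a1.wav')]
```

The point of the separation is that the same objects and relationships can be re-rendered differently for each platform or device.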

We have now created a range of object-based experiences, together with experimental tools to enable the sustainable creation of such content. Collectively, these systems have formed a valuable catalyst for building our knowledge and understanding of how producers of creative content can design and deliver these experiences.

We find them to be useful, practical case studies that should help broadcasting organisations thrive in the new broadcasting systems evolving from end-to-end IP and ubiquitous computing.

In each of the scenarios described here, we have worked with production teams to develop systems, tools and algorithms for an object-based world: these technologies and techniques enable its creation (often using traditional linear media assets) and post-production, transforming user experiences for audiences and enhancing the craft of production professionals.

In [1] we emphasised the continued importance of skilled craft in the curation of audio and video objects, as well as the data objects that describe them and their relationships and roles in the audience experience. This included the construction of a layered curatorial model, relating richer description of content relationships to more responsive experiences.

In this paper we will see this distilled into the craft and the opportunities in curating the semantics of objects, exploiting descriptive relationships between experiences and the elements comprising them. Specifically, this paper describes the following projects:

  • Discourse – a text-based semantic editing system for audio production

  • Atomising News – structured storylining of content to support dynamic presentation

  • Squeezebox – a tool for adding prioritisation semantics to segmented linear content, to allow simple control of content duration

  • StoryExplorer / StoryArc – presenting an interactive experience based on the semantics of a drama and assisting writers in their craft

  • Visual Perceptive Media – a pilot of a richly annotated set of video assets, which can be assembled into a short drama based on each viewer’s current context
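Of the projects above, Squeezebox's duration control lends itself to a compact sketch: drop the lowest-priority segments of a segmented programme until it fits a target duration. The segment fields and the greedy strategy here are assumptions for illustration, not the behaviour of the shipped tool:

```python
# Hedged sketch of the Squeezebox idea: trim a segmented linear programme to a
# target duration by removing segments in ascending priority order.
# (Greedy strategy and field names are illustrative assumptions.)

def squeeze(segments, target_seconds):
    """segments: list of dicts with 'id', 'duration', 'priority' (higher = keep).
    Returns the retained segments in their original order."""
    total = sum(s["duration"] for s in segments)
    # Consider the least important segments first as removal candidates.
    for seg in sorted(segments, key=lambda s: s["priority"]):
        if total <= target_seconds:
            break
        segments = [s for s in segments if s is not seg]
        total -= seg["duration"]
    return segments

prog = [
    {"id": "intro",   "duration": 30,  "priority": 3},
    {"id": "feature", "duration": 120, "priority": 5},
    {"id": "vox-pop", "duration": 60,  "priority": 1},
    {"id": "outro",   "duration": 20,  "priority": 2},
]
# 230 s of content, 170 s slot: the lowest-priority segment is dropped first.
print([s["id"] for s in squeeze(prog, 170)])  # ['intro', 'feature', 'outro']
```

A production implementation would also need to respect editorial constraints, such as segments that must appear together or transitions that need re-rendering.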


Speech radio listenership remains high and podcasting continues to grow in popularity. Although much speech content is still broadcast live, a large proportion is pre-recorded and the experience constructed using audio editing software.

Commonly, such tools represent sound as simple waveforms, allowing users to scan and search audio content visually but conveying very limited information; this approach does not scale well [2]. Efficient navigation and editing of speech is crucial to the radio production process.

However, unlike text, speech audio must be navigated sequentially and does not naturally support visual search techniques [3]. Furthermore, the authoring of object-based experiences may also require the annotation of the speech audio with semantic mark-up describing various useful attributes, functionality not generally offered by waveform editors.

Semantic analysis techniques can be used to extract higher-level information from the audio, such as: whether the content is speech or music [4], where different people are speaking [5] or a transcript of what they are saying. Presenting this information to users could allow them to navigate and edit audio content much more efficiently. These techniques can also be used to create new experiences such as responsive radio or variable-length programmes.
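One way to picture this is as time-stamped labels over the audio timeline, which then support navigation queries such as "where does this person speak?". The label schema below is an assumption for illustration only:

```python
# Sketch of how semantic analysis output might drive navigation: time-stamped
# labels (speech/music, speaker, transcript) over an audio timeline.
# The annotation schema is a hypothetical illustration.

annotations = [
    {"start": 0.0,  "end": 12.5, "type": "music"},
    {"start": 12.5, "end": 45.0, "type": "speech", "speaker": "presenter",
     "text": "Welcome to the programme..."},
    {"start": 45.0, "end": 90.0, "type": "speech", "speaker": "guest",
     "text": "Thanks for having me..."},
]

def segments_for_speaker(annotations, speaker):
    """Return (start, end) jump points where the given speaker talks."""
    return [(a["start"], a["end"]) for a in annotations
            if a.get("speaker") == speaker]

print(segments_for_speaker(annotations, "guest"))  # [(45.0, 90.0)]
```

The same annotation layer could feed variable-length or responsive experiences, since each segment carries enough description to be included, skipped or re-ordered.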

Over the last year we have developed Discourse, a semantic audio editing system that uses a text-based interface to enable users to navigate and edit speech using an automatically generated transcript. Development included a qualitative study of current radio production and an evaluation of semantic editing. We found that current practice involves time-consuming note-taking and logging before editing the audio based on those notes. The semantic editing system allows producers to complete this process up to twice as fast in some cases.

However, the semantic system was not as efficient for short recordings. Participants commented that Discourse allowed them to navigate and edit the audio much faster and that the accuracy of the transcript was good enough for their purposes. Both of these results support previous findings [6, 7].
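The core mechanism of this style of text-based editing can be sketched simply: each transcript word carries its audio timings, so deletions made in the text map to cut regions in the audio. The data layout below is hypothetical, not Discourse's internal format:

```python
# Minimal sketch of transcript-driven audio editing: deleting words in the
# text produces the list of audio regions to keep. (Layout is illustrative.)

words = [
    {"w": "So",    "start": 0.00, "end": 0.25},
    {"w": "um",    "start": 0.25, "end": 0.60},
    {"w": "today", "start": 0.60, "end": 1.05},
    {"w": "we",    "start": 1.05, "end": 1.20},
    {"w": "begin", "start": 1.20, "end": 1.70},
]

def edit_list(words, deleted_indices):
    """Return (start, end) audio regions to keep, merging adjacent kept words."""
    regions = []
    for i, w in enumerate(words):
        if i in deleted_indices:
            continue
        if regions and abs(regions[-1][1] - w["start"]) < 1e-6:
            # Contiguous with the previous kept region: extend it.
            regions[-1] = (regions[-1][0], w["end"])
        else:
            regions.append((w["start"], w["end"]))
    return regions

# Deleting the filler "um" (index 1) keeps two regions of audio.
print(edit_list(words, {1}))  # [(0.0, 0.25), (0.6, 1.7)]
```

A real editor would additionally apply crossfades at each cut point and keep the edit non-destructive, but the word-to-time mapping is what makes editing by text possible at all.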