Timing is a fundamental property of media experiences. In particular, multi-device scenarios require shared timing to provide engaging, coherent services to users. People increasingly have several devices available and do not limit their usage to one device at a time.

The industry caters to these developments in a variety of ways, often with expensive, custom solutions that address narrow areas or short-term issues and lock users in to particular devices and services. This paper discusses some existing solutions and indicates how they could benefit from shared timing, as provided by the proposed HTML Timing Object and online Shared Motions.


People are using and interacting with an increasing number of devices, often simultaneously or in combination with more traditional TV viewing. While watching, people also research products, shop online, and interact with content by voting, commenting, or sharing thoughts and content on social networks.

The availability of sensor data opens the way for advanced multi-screen services. Key to engaging users is providing coherent user experiences across devices and technologies, ensuring correctly timed presentation of media, interactive components, and user input.

Popular multi-device solutions still face some fundamental limitations, however, from both a technical and a user experience point of view. For example, while Sonos plays audio perfectly in a home and a SmartTV can show IP content, the Sonos speakers can't easily play the audio from the SmartTV while maintaining sync. Twitter might integrate fairly well with live broadcasts, but much less so with time-shifted content. Netflix is great for binge-watching, but it lacks mechanisms for social viewing beyond physically watching the same screen.

We see a pattern where successful multi-device solutions like Sonos and Chromecast seem to focus on small, well-defined use cases. They commonly depend on a predictable environment, e.g. similar hardware, closed software solutions and a physically shared network. This makes Web integration difficult and adds usability hurdles, as the user must largely ensure that the requirements for successful operation are met.

A web-based online solution could meet user expectations more easily, as users can interact with their account rather than their physical, yet typically invisible, network. Online solutions are, however, typically unable to meet user expectations in multi-device scenarios.

Streaming a web radio station on two devices will more likely than not create a confusing or annoying experience, as the devices fail to play back content in synchrony. Given differences in hardware, network latency, buffer sizes and several other factors, providing tightly synchronized audio/video playback over the Internet might seem a challenging task. There is a relatively simple solution, though: shared timing and control.

The World Wide Web Consortium (W3C) Multi-Device Timing Community Group (1) has been working on a new model that exposes timing as a first-class citizen on the Web. The Timing Object (2) and Shared Motions (3) provide a programming model and an approach for creating natively multi-device experiences using HTML5 by adding globally shared timing and control.
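To make the idea of timing as a programmable object more concrete, the following is a minimal sketch, not the API of the Timing Object draft or of its reference implementation. It assumes a timing resource can be modelled as a vector (position, velocity, acceleration, timestamp) that is extrapolated on query and replaced on update. The names `SketchTimingObject`, `Vector`, `query`, `update` and `onChange` are illustrative, and the sketch is purely local: distributing the vector between devices, which is the role of Shared Motions, is left out.

```typescript
// Illustrative sketch only; all names here are assumptions, not a published API.

interface Vector {
  position: number;     // e.g. seconds of media, valid at `timestamp`
  velocity: number;     // position units per second (1 = normal playback)
  acceleration: number; // position units per second squared
  timestamp: number;    // clock time in seconds when the vector was valid
}

class SketchTimingObject {
  private clock: () => number;
  private vector: Vector;
  private listeners: Array<(v: Vector) => void> = [];

  // `clock` returns seconds; it is injectable so the sketch can be driven
  // deterministically instead of by the system clock.
  constructor(clock: () => number = () => Date.now() / 1000) {
    this.clock = clock;
    this.vector = { position: 0, velocity: 0, acceleration: 0, timestamp: clock() };
  }

  // Extrapolate the stored vector to the current clock time.
  query(): Vector {
    const now = this.clock();
    const d = now - this.vector.timestamp;
    const { position: p, velocity: v, acceleration: a } = this.vector;
    return {
      position: p + v * d + 0.5 * a * d * d,
      velocity: v + a * d,
      acceleration: a,
      timestamp: now,
    };
  }

  // Control: override any subset of position/velocity/acceleration from now on.
  update(change: Partial<Omit<Vector, "timestamp">>): void {
    this.vector = { ...this.query(), ...change };
    for (const listener of this.listeners) listener(this.vector);
  }

  // Register a callback that runs whenever the vector is changed.
  onChange(listener: (v: Vector) => void): void {
    this.listeners.push(listener);
  }
}
```

In this sketch, play, pause and seek are all expressed as updates to the same vector, for example `update({ velocity: 1 })` to play or `update({ position: 30 })` to seek, and any component that calls `query()` derives its current position from that single source.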

In our work, we have used the open reference implementation of the Timing Object proposal (4) and a hosted implementation of Shared Motion by the Motion Corporation (5).

In this paper, we look at how the industry approaches timing-related issues, how these approaches compare to a shared timing model, and how shared timing can provide further opportunities.


Timing is typically still regarded as internal to the system: a player has its own internal clock. The clock might be exported, for example to allow progress bars to be visualized, or an API might be available to change position, playback state or other control properties. A different approach is explicit timing, where timing is kept external to the player. The player then slaves to the external clock as best it can. If there are internal clocks, e.g. in hardware decoders, the player makes the necessary adjustments itself.
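As an illustration of a player slaving to an external clock, the sketch below shows one possible correction strategy. It is an assumption made for this example, not a method described in this paper: small deviations are absorbed by nudging the playback rate, large ones by seeking. The function name `correct`, the `Correction` type, and the threshold values (1 s, 20 ms, 5 % rate nudge) are all hypothetical.

```typescript
// Hypothetical sketch of a player following an external timeline.
// Names and threshold values are illustrative assumptions.

type Correction =
  | { kind: "seek"; position: number }      // jump straight to the target
  | { kind: "rate"; playbackRate: number }; // keep playing at this rate

// playerPosition: where the player currently is, in seconds
// targetPosition: where the external clock says it should be, in seconds
// targetVelocity: playback rate of the external timeline (1 = normal, 0 = paused)
function correct(
  playerPosition: number,
  targetPosition: number,
  targetVelocity: number,
  seekThreshold: number = 1.0, // skew (s) beyond which we seek
  tolerance: number = 0.02     // skew (s) we accept without correcting
): Correction {
  const skew = targetPosition - playerPosition; // > 0: player is behind

  // Close enough: simply run at the rate of the external timeline.
  if (Math.abs(skew) <= tolerance) {
    return { kind: "rate", playbackRate: targetVelocity };
  }
  // Far off, or the timeline is paused: rate changes cannot help, so seek.
  if (Math.abs(skew) > seekThreshold || targetVelocity === 0) {
    return { kind: "seek", position: targetPosition };
  }
  // Small skew: nudge the rate so the player drifts back towards the target.
  const nudge = Math.sign(skew) * 0.05;
  return { kind: "rate", playbackRate: targetVelocity + nudge };
}
```

A player built this way would call `correct` periodically, comparing its own reported position with the externally provided one, and apply the result to whatever seek and rate controls the underlying media element or decoder offers.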