ABSTRACT

The emergence of Beyond HD resolutions has triggered new competition in both the broadcast and cinema industries.

ProRes, XAVC, AVC-Ultra, JPEG-2000, HEVC, VP9… All of these codecs support 4K/UHD. Manufacturers’ marketing slides promise excellent quality.

However, these claims may apply only to a single encoding/decoding generation. Real-life workflows involve more complicated parameter configurations, especially in post-production.

Mesclado’s internal applied research lab simulated a complete media chain, based on different production genres such as sport and fiction. Production, editing, mastering and distribution: each step may add a decoding-processing-encoding generation.

Our goal is to objectively measure distortion levels between an original sample and its distributed version.

We were able to identify good and sub-optimal codec combinations. We conclude by recommending good engineering practices to save both bandwidth and storage throughout the production process and to increase the delivered image quality through conventional distribution channels.

This work was conducted in partnership with Image Matters, with the direct involvement of Dwarf Animation Studio.

INTRODUCTION

Audio-visual professionals are now shifting in large numbers towards Beyond HD resolutions. There is worldwide excitement about a new user experience built on video and audio immersion.

In cinema, the shift is moving fast and many productions are made in 4K. In broadcast, tests are being performed during global public events (the Linkin Park Berlin 2014 concert UHD live broadcast, Roland Garros since 2013, the FIFA World Cup, the 8K UHD 2020 Tokyo Olympics, etc.).

In parallel, new UHD-supporting codec schemes are being introduced to the market, such as XAVC and HEVC implementations. Many doubts have been raised about the maturity of such codecs when they are used in complex media workflows.

ORIGIN OF THE IDEA

Mesclado took an interest in this issue and launched a measurement campaign in the first quarter of 2014. It covered a single encoding/decoding generation on UHD samples to test codec performance.

Three 10-second samples were used for the tests: a ballet show (50 fps, 3840x2160), France Televisions’ series “Plus Belle La Vie” (50 fps, 3840x2160) and the French Tennis Open Roland Garros (59.94 fps, 3840x2160). Each one was encoded with every codec in the panel.

The comparison was made between the raw input reference and the encoded output after it had been decoded back to raw data.

We chose objective measurement using Peak Signal-to-Noise Ratio (PSNR), as involving human viewers would introduce a wider error range and subjective parameters. Figure 1 and Table 1 illustrate the results of the tests conducted on the Roland Garros sample.

Figure 1 PSNR comparison for Roland Garros 2013 sample


Table 1 PSNR comparison of Roland Garros 2013 samples


Source: Quality analysis report, Mesclado

These “one shot” operations were not sufficient to draw conclusions about the true potential of each codec through a complete chain (production, post-production, mastering, distribution). It was therefore necessary to take the project to the next level, using the same PSNR algorithm but different encoding platforms.

This time, we wanted to go further with multi-generation encoding, by simulating a complete media chain with these codecs through real-life professional pipelines. The aim is to measure the level of distortion at the delivery stage. The issue was brought to us by our partner Image Matters, a company involved in JPEG-2000 media workflows.
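The idea of measuring distortion after several generations can be illustrated with a toy model. The sketch below is not the test bench used in this study: each "codec" is replaced by a simple quantizer with a hypothetical step size, the input is a synthetic 8-bit sequence rather than video, and the names quantize, psnr and run_chain are ours. It only shows how PSNR against the original can be tracked after every generation of a chain.

```python
import math

def quantize(samples, step):
    """Toy lossy 'codec' (assumption): snap 8-bit samples to a grid of spacing step."""
    return [min(255, max(0, round(s / step) * step)) for s in samples]

def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized sample sequences."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def run_chain(samples, steps):
    """Apply one quantizer per generation; return PSNR vs. the original after each."""
    current, scores = list(samples), []
    for step in steps:
        current = quantize(current, step)
        scores.append(psnr(samples, current))
    return scores

# Synthetic 8-bit test signal (a permutation of 0..255), not real video.
original = [(37 * i) % 256 for i in range(256)]

same_codec = run_chain(original, [4, 4, 4])    # same toy codec at every step
mixed_codecs = run_chain(original, [4, 6, 5])  # a different toy codec at each step

for name, scores in (("same", same_codec), ("mixed", mixed_codecs)):
    print(name, [f"{s:.2f} dB" for s in scores])
```

In this toy model, re-applying the same quantizer adds no further loss, whereas mixing step sizes ends with a lower PSNR. Real codecs are far more complex, so the sketch says nothing about how any actual codec combination behaves.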

OBJECTIVE MEASUREMENT

Objective quality measurement delivers an unbiased judgement: it is based on mathematical algorithms that correlate strongly with human perception.

Many metrics are in use, such as PSNR, Structural SIMilarity (SSIM) and Difference Mean Opinion Score (DMOS).

Although SSIM and DMOS offer the best correlation with human perception, SSIM has been shown to be inefficient at high bitrates (a finding from our previous video quality measurement work with a French media group), and DMOS needs calibration for every test: it is an average between 1 and 100, and for video with high dynamics it must be calibrated with a Minkowski variable.

This calibration is required for every test campaign, and any comparison must take it into account. For this reason, we decided to use PSNR to measure video quality.
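For reference, PSNR is derived from the mean squared error (MSE) between the reference and the decoded signal: PSNR = 10 log10(peak² / MSE), where peak is the largest value a sample can take. The following is a minimal sketch of that standard definition, not the measurement tool used for this study. The function name, the bit_depth parameter and the sample values are illustrative assumptions.

```python
import math

def psnr(reference, decoded, bit_depth=8):
    """Return the PSNR in dB between two equally sized sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    peak = (1 << bit_depth) - 1  # 255 for 8-bit, 1023 for 10-bit samples
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion to measure
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical 8-bit luma samples, for illustration only.
reference = [16, 64, 128, 200, 235]
decoded = [17, 63, 128, 202, 234]
print(f"{psnr(reference, decoded):.2f} dB")
```

A higher value means the decoded signal is closer to the reference. For a whole clip, the per-frame values are usually averaged, and the peak must match the bit depth of the material being compared.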
