Technical Paper: This paper presents the development and implementation of novel perceptual quantization matrices for coding High Dynamic Range (HDR) video content for mobile devices.

Abstract

The proposed perceptual quantization matrices are based on the Human Visual System (HVS) and are used to reduce the video transmission bit-rate and to optimize the perceived visual quality of video content displayed on mobile devices, such as tablets and smartphones.

According to the proposed video coding scheme, perceptual quantization matrices are first calculated from HVS characteristics and predefined viewing conditions, and are then used in the encoding loop to remove non-perceptible visual information, with particular emphasis on the Ultra High Definition (UltraHD) resolution and the H.265/MPEG-HEVC video coding standard.

Extensive experimental results show that the visual quality of HDR UltraHD video content is significantly improved at substantially the same bit-rate, as measured by the objective quality metric SSIMPlus. Alternatively, the video transmission bit-rate is reduced by up to about 25% while the visual quality of the video content displayed on a mobile device screen is kept at substantially the same level.

Introduction

There is currently a strong demand for high-resolution video content, particularly high-definition (HD) and UltraHD video content, for a variety of mobile devices, such as tablets, smartphones and even smartwatches. According to a recent Cisco® report (1), IP video traffic is expected to account for 82% of all Internet traffic by 2022, and there is a continuous need to decrease the video transmission bit-rate, especially for delivery over wireless or cellular networks, without reducing visual presentation quality.

In addition, HDR UltraHD video content has recently attracted a lot of attention due to its relatively high luminance levels and fine shadow details, which extend far beyond those of conventional Standard Dynamic Range (SDR) content. HDR technology makes it possible to present very bright signals along with very dark signals in the same video frame, thereby providing a high contrast ratio within the same image. HDR video content is also usually combined with a Wide Color Gamut (WCG), such as BT.2020 (2), (3), which makes it possible to present video with a significantly extended color spectrum.

In particular, HDR gained popularity after the development and approval of the High Efficiency Video Coding (HEVC) standard (4), i.e. H.265/MPEG-HEVC (2)-(9), in 2013. As is well known, HEVC was designed especially for coding HD and UltraHD video content, with a much larger coding gain (10)-(13) than its predecessor H.264/MPEG-AVC (14). It reduces both spatial and temporal redundancies in video content much more efficiently, which in turn significantly assists the compression of HDR UltraHD video content. However, coding HDR video content remains challenging because of users’ demands for high visual quality, which requires allocating more bits and increasing the video coding bit depth (e.g., from 8 bits to 10 bits). In addition, the transmission bandwidth is normally limited by the existing network infrastructure, especially for transmission over wireless/cellular networks. As a result, to stay within the transmission bandwidth limits, high-resolution HDR video content is often compressed to the point where coding artifacts become visible. Moreover, encoding HDR content normally consumes significant computational resources because fine details within the HDR video must be preserved. Therefore, there is a strong demand to improve the perceived visual quality of compressed HDR video without substantially increasing its bit-rate (15)-(18).
