Technical paper: This paper introduces a workflow to down-convert existing UHD HDR videos to their HD SDR versions and proposes a joint super-resolution, gamut extension, and inverse tone-mapping network.
With the rapid development of display technology in recent years, ultra-high definition (UHD) high dynamic range (HDR) displays have emerged in consumer markets. However, due to the lack of UHD HDR video content, it is necessary to up-convert legacy high definition (HD) standard dynamic range (SDR) videos to their UHD HDR versions.
In this paper, we first introduce a workflow to down-convert existing UHD HDR videos to their HD SDR versions and then propose a joint super-resolution, gamut extension, and inverse tone-mapping network (JSGIN), which directly learns the up-conversion from HD SDR videos to their UHD HDR versions. JSGIN enhances the viewing experience by reconstructing lost information and achieves better subjective visual quality with fewer artifacts than recent state-of-the-art methods.
Display technology has developed rapidly in recent years, and ultra-high definition (UHD) high dynamic range (HDR) displays have become available to consumers. Nevertheless, because of the shortage of UHD HDR video content, legacy high definition (HD) standard dynamic range (SDR) videos need to be up-converted to their UHD HDR versions. Compared with current HD SDR television systems (1), UHD television systems (2) provide higher spatial resolution and a wider colour gamut, and HDR television systems (3) provide a higher dynamic range.
Super-resolution (SR) methods up-scale low-resolution images to high-resolution images. Recent convolutional neural network (CNN) based methods have achieved considerable improvements over conventional SR methods. SRCNN (Dong et al. (4)) was the first CNN-based SR method. The CNN architecture was subsequently improved in various ways, such as sub-pixel convolution (Shi et al. (5)) and modified residual blocks (Lim et al. (6)).
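The core of sub-pixel convolution (5) is a rearrangement step, often called pixel shuffle: a convolution first produces C·r² feature channels at low resolution, which are then reordered into an image that is r× larger in each spatial dimension. A minimal NumPy sketch of that rearrangement (the function name and array layout are illustrative, not from the cited work):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r^2, H, W) array into (C, H*r, W*r),
    as done by the sub-pixel convolution layer's final step."""
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)     # interleave: (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)  # merge into (C, H*r, W*r)
```

Because the up-scaling happens only at the very end, all preceding convolutions run at low resolution, which is what makes this layer efficient.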
Gamut extension (GE) algorithms extend colours from a source gamut to a wider destination gamut. Linear colour space conversion cannot restore colour information outside the source gamut, so conventional GE algorithms attempt to make full use of the wider destination gamut. Recently, Takeuchi et al. (7) proposed a CNN-based GE algorithm that achieves significant gains over conventional GE algorithms.
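Why linear conversion alone cannot widen the gamut can be seen from the colorimetric BT.709-to-BT.2020 matrix specified in ITU-R BT.2087: it reproduces each colour exactly, so every converted colour stays inside the BT.709 sub-gamut of BT.2020, and a fully saturated BT.709 primary becomes a visibly unsaturated BT.2020 code value. A small NumPy illustration:

```python
import numpy as np

# Linear-light RGB conversion matrix from BT.709 to BT.2020 primaries
# (ITU-R BT.2087). Purely colorimetric: same colours, new encoding.
M_709_TO_2020 = np.array([
    [0.6274, 0.3293, 0.0433],
    [0.0691, 0.9195, 0.0114],
    [0.0164, 0.0880, 0.8956],
])

def convert_709_to_2020(rgb_linear):
    """Map linear BT.709 RGB to linear BT.2020 RGB."""
    return M_709_TO_2020 @ np.asarray(rgb_linear, dtype=float)

# Saturated BT.709 red lands well inside the BT.2020 gamut,
# leaving the extra colour volume unused:
red_2020 = convert_709_to_2020([1.0, 0.0, 0.0])
```

GE algorithms, CNN-based or conventional, exist precisely to populate that otherwise unused region of the destination gamut in a perceptually plausible way.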
Inverse tone-mapping (ITM) methods expand SDR images to HDR images. Whereas conventional ITM methods only map the dynamic range, CNN-based ITM methods can also restore lost details in highlights and shadows. Eilertsen et al. (8) introduced a deep learning system that reconstructs an HDR image from a single-exposure SDR image.
UHD HDR videos can be reconstructed from HD SDR videos by cascading SR, GE, and ITM methods. However, errors from each stage may accumulate through the cascade, leading to less accurate results and higher overall complexity than joint learning of SR, GE, and ITM. A multi-purpose CNN structure (Kim and Kim (9)) was first proposed to jointly learn SR, GE, and ITM and directly up-convert HD SDR videos to UHD HDR videos. Deep SR-ITM (Kim et al. (10)) then improved on (9) by introducing input decomposition methods and modulation blocks.
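The error-accumulation argument can be made concrete with placeholder stages: each stage consumes only the previous stage's output, so an artefact introduced early (here, blocking from nearest-neighbour upscaling) is carried into GE and then amplified by the range expansion of ITM. The stage functions below are deliberately simplistic stand-ins for illustration, not the networks discussed above:

```python
import numpy as np

def sr_stage(x):
    # Placeholder 2x upscaling: nearest-neighbour introduces blocking.
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def ge_stage(x):
    # Placeholder gamut extension: mild channel-independent boost.
    return np.clip(1.05 * x, 0.0, None)

def itm_stage(x):
    # Placeholder range expansion: amplifies any earlier errors.
    return x ** 2.0

def cascaded_upconvert(hd_sdr):
    # Each stage sees only the previous stage's (possibly flawed)
    # output; a jointly trained network instead optimises one model
    # end-to-end against the final UHD HDR target.
    return itm_stage(ge_stage(sr_stage(hd_sdr)))
```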
ResNet (He et al. (11)) introduced local residual learning to ease the training of deep CNNs. Global residual learning for SR was first adopted by VDSR (Kim et al. (12)) to facilitate training convergence of a deep CNN. Our method adopts both local and global residual learning.
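The two skip-connection patterns can be sketched side by side. To keep the example self-contained, the "convolution" below is an elementwise scaling; the names, weights, and block count are illustrative, not from our network:

```python
import numpy as np

def conv_like(x, w):
    """Stand-in for a learned convolution: elementwise scaling."""
    return w * x

def residual_block(x, w1=0.1, w2=0.1):
    # Local residual learning (ResNet (11)): each block adds a learned
    # correction to its own input through a short skip connection.
    return x + conv_like(conv_like(x, w1), w2)

def network(x, n_blocks=4, w_out=0.05):
    # Global residual learning (VDSR (12)): the whole network predicts
    # a residual that is added to the input via a long skip connection,
    # so it only has to learn the difference from the input.
    y = x
    for _ in range(n_blocks):
        y = residual_block(y)
    return x + conv_like(y, w_out)
```

Both skips give gradients a direct path to earlier layers, which is why combining them eases training of deep reconstruction networks.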
In this paper, we first introduce a workflow to down-convert existing UHD HDR videos to their HD SDR versions. Then, we propose a single CNN that jointly learns SR, GE, and ITM and directly up-converts HD SDR videos to their UHD HDR versions. Compared with recent state-of-the-art methods (9), (10), the UHD HDR videos generated by our method provide a better visual experience.