Digital video, and the compression that makes it possible, is so ubiquitous that we no longer think about it. But it is worth remembering that every time we process a video signal we are making a compromise.

That compromise is between a number of factors: most obviously, image quality; the processing required to encode and decode; and the bitrate of the compressed signal, which determines both the bandwidth needed to move it and the disk space needed to store it.

These are decisions we still face, not least because we are continuing to push technical boundaries. Ultra HD, with 4K resolution, extended colour gamut and possibly even higher frame rates, creates a lot more raw data.

NHK’s Super Hi-Vision – an 8K, HFR, HDR system – will start trial broadcasts next year prior to the Tokyo Olympics in 2020. 360-degree VR produces the same sorts of raw data sizes as UHD.

For video, the MPEG family of compression schemes is the most familiar: particularly MPEG-2, MPEG-4 (whose Part 10 is now better known as H.264) and the modern derivative H.265.

The Moving Picture Experts Group was founded by Hiroshi Yasuda and Leonardo Chiariglione in 1988. Chiariglione remains the organisation’s chair, and is a holder of the IBC International Honour for Excellence.

MPEG-2, the first high-quality delivery standard, was published in 1995. Its authors took the pragmatic approach of specifying what could be achieved with the processing power of the day. So the video is broken up into blocks, each of which is processed using a mathematical algorithm known as the discrete cosine transform (DCT). MPEG-2 typically uses groups of pictures, in which only one frame, the I-frame, is fully encoded; the rest are described as variants on it.
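The block transform at the heart of MPEG-2 can be sketched in a few lines. This is a minimal NumPy DCT-II, purely illustrative: real encoders use heavily optimised integer approximations, and this sketch omits the quantisation and entropy-coding stages entirely.

```python
import numpy as np

def dct2(block):
    """2-D DCT-II of an N x N block, conceptually as used by MPEG-2."""
    n = block.shape[0]
    k = np.arange(n)
    # Basis: C[u, x] = a(u) * cos(pi * (2x + 1) * u / (2N))
    c = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    a = np.full(n, np.sqrt(2.0 / n))
    a[0] = np.sqrt(1.0 / n)          # DC row gets the smaller scale factor
    basis = a[:, None] * c
    return basis @ block @ basis.T   # transform rows and columns

# A flat 8x8 block concentrates all its energy in one DC coefficient,
# which is what makes the transformed block easy to compress:
coeffs = dct2(np.full((8, 8), 100.0))
print(round(coeffs[0, 0]))   # 800: the DC term carries everything
```

The compression win comes from what happens next: for natural images most of the energy lands in a few low-frequency coefficients, and the rest can be quantised coarsely or discarded.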

With more than 20 years of Moore’s Law under our belts, processing power is now far greater. That is leading to increasing interest in a different compression scheme: JPEG2000. It is capable of compressing even 4K images in a single pass, with no need to divide the picture into blocks.

It is an intraframe compression scheme, so each picture is compressed individually. And rather than discrete cosine transforms it uses wavelets, subjectively a better, ‘kinder’ form of compression.

Processing and storage

While all of that is good, there is one element of JPEG2000 that is easily overlooked but has a very practical outcome. Without going into the maths behind it, the compression process involves dividing the image into high- and low-frequency bands, both vertically and horizontally.

After the first compression pass you have, as a by-product of the compression scheme, a quarter-size copy of the original (half the number of pixels in each direction). The process is iterative, so you also get a quarter-size copy of that copy, and so on.
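The band-splitting idea can be illustrated with a one-level Haar wavelet, the simplest wavelet there is (JPEG2000 itself uses the 5/3 and 9/7 wavelets, so this is a simplified stand-in):

```python
import numpy as np

def haar_level(img):
    """One level of 2-D Haar analysis: split into low/high bands
    vertically, then horizontally. Returns the LL band, which is a
    half-resolution copy in each direction, plus three detail bands."""
    lo = (img[0::2, :] + img[1::2, :]) / 2.0   # vertical low band
    hi = (img[0::2, :] - img[1::2, :]) / 2.0   # vertical high band
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0     # low-low: the small copy
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return ll, (lh, hl, hh)

img = np.random.rand(16, 16)
ll1, _ = haar_level(img)    # quarter-size copy: 8 x 8
ll2, _ = haar_level(ll1)    # copy of the copy: 4 x 4
print(ll1.shape, ll2.shape)
```

The LL band is simply a smaller version of the picture; the detail bands hold what is needed to rebuild full resolution, and iterating on LL produces the pyramid of ever-smaller copies described above.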

In the decoder, you do not have to go all the way back to the original: you can decode to the size you want. So, whereas today we need a separate processing and storage chain to create browse-resolution proxies, they are inherent and automatically available in JPEG2000, a big saving. If you have a 4K file, you automatically have an HD version, something approximately SD (960 x 540), and so on, with no other processing.
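The decode-to-size behaviour can be sketched as follows. This toy reproduces only the resolution-halving property of the LL band, not real JPEG2000 decoding (a real decoder would stop reconstructing from its stored subbands at the requested level rather than recompute them):

```python
import numpy as np

def haar_ll(img):
    """The LL (low-low) band of one Haar split: averaging 2x2 blocks
    gives a copy with half the pixels in each direction."""
    lo = (img[0::2, :] + img[1::2, :]) / 2.0
    return (lo[:, 0::2] + lo[:, 1::2]) / 2.0

def proxy(frame, levels):
    """'Decode to the size you want': stop after the requested number
    of levels instead of going all the way back to full resolution."""
    for _ in range(levels):
        frame = haar_ll(frame)
    return frame

frame_4k = np.zeros((2160, 3840))       # luma plane of a UHD frame
print(proxy(frame_4k, 1).shape)          # (1080, 1920): HD for free
print(proxy(frame_4k, 2).shape)          # (540, 960): roughly SD
```

One level down from 4K is exactly HD; two levels down is the approximately-SD 960 x 540 proxy mentioned above.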

The downside is that JPEG2000 encoding requires significant processing power. But thanks to researchers like Jiri Matela, realtime encoding can be accomplished using massively parallel processing. That sounds like another layer of complexity, but in fact the massively parallel processing can be provided by a typical GPU, making JPEG2000 practical in a standard workstation.
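The reason the workload maps so well onto a GPU is that JPEG2000 divides each frame into independent tiles and code-blocks, each of which can be transformed and coded in isolation. The sketch below uses a thread pool as a stand-in for a GPU's many cores; the per-tile "encode" is a placeholder, not a real codec:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def encode_tile(tile):
    """Placeholder for the per-tile wavelet transform and entropy
    coding; each tile is encoded with no reference to its neighbours."""
    return float(np.abs(tile).sum())

def encode_frame_parallel(frame, tile=256):
    """Cut the frame into independent tiles and encode them concurrently.
    The thread pool stands in for a GPU's thousands of parallel threads."""
    tiles = [frame[r:r + tile, c:c + tile]
             for r in range(0, frame.shape[0], tile)
             for c in range(0, frame.shape[1], tile)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(encode_tile, tiles))

frame = np.ones((1080, 1920))
print(len(encode_frame_parallel(frame)))   # 5 rows x 8 cols = 40 tiles
```

Because no tile depends on another, the work scales with the number of processing units available, which is exactly the shape of problem GPUs are built for.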

What are the applications? JPEG2000 is already widely used for high quality contribution circuits, especially in remote production systems.

Pac-12, the western USA college sports network, uses JPEG2000 over IP for all camera feeds from venues to its control rooms in San Francisco, saving an estimated $12k–$15k on each of the 850 games it covers in a year.

But the instant access to proxies could provide a powerful offering for post as we move into Ultra HD and VR finishing. Full resolution files remain clumsily large to store and time-consuming to move around.

Could JPEG2000 – maybe with mathematically lossless compression for the final render – be the way to make workflows practical?