The future of communications is about video.
The recent focus on media applications like TikTok and business services such as Zoom during the pandemic is a clear indication of how important streaming video has become in people's lives.
Video, though, is a bandwidth hog. Over the years, standards have been created to compress video streams, reducing bandwidth requirements without losing visual quality. Compression also allows higher-quality video to be stored and streamed on existing hardware. The MPEG (Moving Picture Experts Group) committee of the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) have a history of developing standards that have been widely deployed and licensed internationally. The MPEG and ITU standards have evolved over the years, but they have always been the work of multiple companies that pool their IP and license it to equipment vendors and content providers.
In July 2020, the latest version of the standard was finalized: the Versatile Video Coding (VVC) standard, also known as H.266. Final publication will occur within a few months, and first deployments are expected in 2021.
One of the key contributors was Qualcomm, through a group led by Marta Karczewicz, vice president of technology at Qualcomm Technologies. Karczewicz is also a nominee for the European Inventor Award's Lifetime Achievement laurel for her seminal work in video coding. In a conversation, she explained some of the changes from the previous standard, High Efficiency Video Coding (HEVC), also known as H.265.
HEVC has been critical to delivering 4K resolution and high dynamic range (HDR) video to consumers, but it has been around since 2013, and there have been many developments since then. Without video compression, a 4K/60 Hz HDR stream would require roughly 7 Gbps. For reference, Roku and other streaming media boxes require only up to 25 Mbps for 4K UHD video with existing compression standards.
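The arithmetic behind those figures is straightforward. A minimal sketch, assuming 10-bit samples with 4:2:0 chroma subsampling (an average of 15 bits per pixel; other chroma formats and bit depths change the exact number):

```python
# Back-of-the-envelope bitrate for an uncompressed 4K/60 Hz HDR stream.
# Assumption: 10-bit samples, 4:2:0 chroma subsampling (15 bits/pixel avg).
WIDTH, HEIGHT = 3840, 2160        # 4K UHD resolution
FPS = 60                          # frames per second
BITS_PER_PIXEL = 10 * 1.5         # 10-bit luma + subsampled chroma

uncompressed_bps = WIDTH * HEIGHT * FPS * BITS_PER_PIXEL
print(f"Uncompressed: {uncompressed_bps / 1e9:.2f} Gbps")          # ~7.46 Gbps

STREAMING_BPS = 25e6              # ~25 Mbps for compressed 4K UHD streaming
print(f"Compression ratio: {uncompressed_bps / STREAMING_BPS:.0f}:1")  # ~299:1
```

That roughly 300:1 reduction is why the article's 7 Gbps figure shrinks to a stream an ordinary home connection can carry.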
VVC will further compress content by 40% over HEVC as the industry prepares for 8K video streams. Several tools for HDR and wide color gamut were annexes to the previous HEVC/H.265 specification and are now more tightly integrated into VVC. The new standard also has enhanced support for VR and 360-degree video for the next generation of immersive experiences. Game streaming will be a key entertainment option in the future, so the standard must be cognizant of latency issues. There are also text and graphics compression features designed for video conferencing and other business applications like Zoom.
The other goal is that the process has to support asymmetric operation: compressing the original content requires the most compute, because it is often done once, while decompression occurs across many clients. In addition, receiving devices are often power- and cost-constrained, so video decompression needs to be power- and cost-efficient. Both power and cost are largely determined by the complexity of the logic, and less complex decompression logic has both lower power and lower cost.
The goal is to make the delivered video appear as close to the original uncompressed stream as possible, though the end result depends on the content. All video compression technologies are based on finding redundant patterns of raw pixels that can be replaced by smaller codes.
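The simplest illustration of replacing redundancy with smaller codes is run-length encoding. This is a toy stand-in for the idea, not anything VVC itself does; real codecs use transforms, prediction, and entropy coding:

```python
from itertools import groupby

def rle_encode(pixels):
    """Run-length encoding: collapse runs of repeated pixel values into
    (value, count) pairs. A crude illustration of replacing redundant
    patterns with smaller codes; not an actual VVC coding tool."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

# A row of pixels with long flat runs compresses to three short pairs.
row = [0, 0, 0, 0, 255, 255, 0, 0, 0]
print(rle_encode(row))   # [(0, 4), (255, 2), (0, 3)]
```

Nine pixel values become three pairs; the more repetitive the content, the bigger the win, which is why compressed size depends on the content.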
Part of the compression comes from comparing multiple video frames and finding pixel patterns that are similar between frames but have moved. By modeling the motion between frames, the encoder can send only a motion vector and the local changes instead of the full pixel data.
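The classic way to find such a motion vector is block matching: search a window in the previous frame for the patch that best matches a block of the new frame. A minimal NumPy sketch (the function name, block size, and search range are illustrative choices, not part of any standard):

```python
import numpy as np

BLOCK = 4  # block size in pixels

def best_motion_vector(prev_frame, block, top, left, search=3):
    """Exhaustive block matching: find the (dy, dx) offset into prev_frame
    whose BLOCKxBLOCK patch best matches `block`, measured by the sum of
    absolute differences (SAD)."""
    best, best_sad = (0, 0), np.inf
    h, w = prev_frame.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y and y + BLOCK <= h and 0 <= x and x + BLOCK <= w:
                ref = prev_frame[y:y + BLOCK, x:x + BLOCK]
                sad = np.abs(ref.astype(int) - block.astype(int)).sum()
                if sad < best_sad:
                    best_sad, best = sad, (dy, dx)
    return best, best_sad

# Toy frames: the second frame is the first shifted right by 2 pixels.
rng = np.random.default_rng(0)
frame0 = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
frame1 = np.roll(frame0, 2, axis=1)

block = frame1[4:8, 4:8]                      # a block from the new frame
mv, sad = best_motion_vector(frame0, block, 4, 4)
print(mv, sad)   # (0, -2) 0 -- the block is found 2 pixels to the left
```

Instead of 16 raw pixel values, the encoder can transmit the vector (0, -2) plus a (here, zero) residual, which is the essence of inter-frame prediction.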
These motion models have become more sophisticated, requiring less information to restore the original image on the end device. Optical flow techniques, first developed for computer vision, are also being used in the standard. The mathematical models used to encode images rely on more complex, higher-order polynomials, but new logic designs enabled by process scaling can handle these workloads. Image quality also benefits from adaptive filters. For HDR video, the standard includes luma mapping with chroma scaling (LMCS) to handle the wide dynamic range.
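One example of these more sophisticated motion models is affine motion, which can describe rotation and zoom, not just translation. The sketch below shows the idea behind a 4-parameter affine model, deriving a motion vector per block from two control-point vectors; it is a floating-point illustration only, whereas the actual standard operates on subblocks with fixed-point arithmetic:

```python
import numpy as np

def affine_mv_field(v0, v1, width, height, block=4):
    """Sketch of a 4-parameter affine motion model: interpolate a motion
    vector for each block from the control-point motion vectors v0 (top-left
    corner) and v1 (top-right corner). Illustrative only."""
    field = np.zeros((height // block, width // block, 2))
    ax = (v1[0] - v0[0]) / width   # horizontal gradient of mvx
    ay = (v1[1] - v0[1]) / width   # horizontal gradient of mvy
    for by in range(height // block):
        for bx in range(width // block):
            x = bx * block + block / 2   # block-centre sample position
            y = by * block + block / 2
            mvx = v0[0] + ax * x - ay * y
            mvy = v0[1] + ay * x + ax * y
            field[by, bx] = (mvx, mvy)
    return field

# Identical control points describe pure translation: a uniform field.
field = affine_mv_field((2, 3), (2, 3), 16, 16)
print(field[0, 0], field[-1, -1])   # both [2. 3.]
```

When the two control points differ, the field varies smoothly across the block grid, letting a handful of parameters stand in for per-pixel motion.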
Today all of this work is based on digital signal processing algorithms, but in future standards machine learning will play a part in predicting pixel-grouping replacements. Future development will also look at compressing time-of-flight (depth) information and point-cloud data, such as that generated by lidar in autonomous vehicles, along with more 3D space and time information for AR and VR applications. Karczewicz grew up liking to solve puzzles, and squeezing ever-higher-quality video through wired and wireless data channels is a puzzle that will never be finished.
Qualcomm has published a blog post explaining its involvement in VVC.