Next Generation Video Coding

TNT members involved in this project:
Dr.-Ing. Marco Munderloh
Prof. Dr.-Ing. Jörn Ostermann

Video coding has played an important role in communication and multimedia systems since the 1990s, and bandwidth remains a valuable commodity. Video coding techniques make it possible to code video at the lowest possible bit rate while maintaining a given level of video quality. The latest video coding standard, High Efficiency Video Coding (HEVC), can double the data compression ratio compared to H.264/MPEG-4 AVC at the same level of video quality. The current standard is built on major cornerstones such as intra-frame prediction, inter-frame prediction, scaling, transform, and scalar quantization.

However, the increasing diversity of services and the growing demand for applications based on high-resolution video (e.g. 4k x 2k, 8k x 4k) still leave room for improvement. At the Institut für Informationsverarbeitung (TNT), a variety of aspects of video coding have been studied to further improve the current HEVC standard. They are listed below.

Intra Coding

The prediction tools which led to the successful application of modern video coding standards can be roughly divided into inter and intra coding tools. While intra coding relies solely on information contained in the current picture, inter coding exploits the redundancy between different pictures to further increase the coding efficiency. Therefore, in general, intra coding requires considerably higher bit rates than inter coding to achieve the same visual quality for typical video signals.

Nevertheless, intra coding is an essential part of all video coding systems: it is required when new content appears in a sequence, to start a video transmission, for random access into ongoing transmissions, and for error concealment. In this project, we aim at increasing the coding efficiency of intra prediction. Additionally, we combine contour-driven technologies with video coding technologies to further improve our prediction.

Texture Synthesis

The human visual system (HVS) automatically separates relevant from irrelevant information. To construct images that appear visually identical to a viewer, only the relevant information needs to be shown. As the relevant information is only a small part of the original image, video coding can benefit greatly from modeling the HVS. We develop an algorithm in which a large region of similar texture (e.g. a homogeneously textured wall) is represented by a small patch extracted from that region. A region perceptually similar to the original can be reconstructed from this small patch. Coding only a small patch instead of the whole region results in high bit rate savings.
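The idea of rebuilding a region from a small patch can be illustrated with a deliberately naive sketch. It simply tiles the patch periodically. This is an assumption made for illustration only and not the synthesis algorithm developed in the project, which aims at perceptual similarity. The function name `tile_patch` is hypothetical.

```python
from typing import List

Image = List[List[int]]


def tile_patch(patch: Image, height: int, width: int) -> Image:
    """Fill a height x width region by repeating the patch periodically.

    Naive stand-in for texture synthesis: only the patch would have to be
    coded, and the decoder would regenerate the full region from it.
    """
    ph, pw = len(patch), len(patch[0])
    return [[patch[y % ph][x % pw] for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    patch = [[1, 2], [3, 4]]
    region = tile_patch(patch, 3, 5)
    for row in region:
        print(row)
    # Samples to code for the patch vs. samples in the rebuilt region.
    print(len(patch) * len(patch[0]), "vs", len(region) * len(region[0]))
```

Plain tiling produces visible repetition on natural textures, which is why practical synthesis methods are considerably more elaborate than this sketch.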

Deep Learning-based Encoder Control

The superior coding efficiency of HEVC and its extensions is achieved at the expense of very complex encoders. Analyses have shown that HEVC encoders are several times more complex than AVC encoders. One main complexity driver for HEVC encoders is the comprehensive rate-distortion (RD) optimization, which is indispensable to fully exploit all benefits of the HEVC standard. A major disadvantage of the RD optimization is that encoders which cannot afford a comprehensive RD optimization will likely not achieve the optimal coding efficiency. The RD optimization consists of evaluating all possible combinations (coding modes, parameters for these coding modes, partitioning, etc.) and selecting the combination with the smallest RD cost. In the case of intra prediction, the RD optimization determines the intra prediction mode. Specifically, for HEVC, there are 33 angular intra prediction modes, the DC mode and the planar mode.
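The exhaustive selection described above can be sketched as a loop that minimizes a Lagrangian cost J = D + λ·R over the candidate modes. The following is a minimal sketch under several assumptions: only three toy predictors (DC, horizontal, vertical) stand in for the 35 HEVC modes, the distortion D is the sum of squared differences of the prediction, and the rate R is a hypothetical per-mode bit count in `MODE_BITS`. A real encoder would also account for the coded residual.

```python
from typing import Callable, Dict, List, Tuple

Block = List[List[int]]
Predictor = Callable[[List[int], List[int], int], Block]


def predict_dc(top: List[int], left: List[int], n: int) -> Block:
    """Fill the block with the rounded mean of the neighboring samples."""
    count = len(top) + len(left)
    dc = (sum(top) + sum(left) + count // 2) // count
    return [[dc] * n for _ in range(n)]


def predict_horizontal(top: List[int], left: List[int], n: int) -> Block:
    """Propagate each left neighbor along its row."""
    return [[left[y]] * n for y in range(n)]


def predict_vertical(top: List[int], left: List[int], n: int) -> Block:
    """Propagate the top neighbors down every column."""
    return [list(top[:n]) for _ in range(n)]


def ssd(a: Block, b: Block) -> int:
    """Sum of squared differences between two blocks."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


PREDICTORS: Dict[str, Predictor] = {
    "dc": predict_dc,
    "horizontal": predict_horizontal,
    "vertical": predict_vertical,
}
# Hypothetical signaling cost in bits per mode (assumption for illustration).
MODE_BITS: Dict[str, int] = {"dc": 1, "horizontal": 3, "vertical": 3}


def rd_mode_decision(block: Block, top: List[int], left: List[int],
                     lam: float) -> Tuple[str, float]:
    """Evaluate every candidate mode and return the one with minimal D + lam * R."""
    n = len(block)
    best_mode, best_cost = "", float("inf")
    for mode, predictor in PREDICTORS.items():
        distortion = ssd(block, predictor(top, left, n))
        cost = distortion + lam * MODE_BITS[mode]
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


if __name__ == "__main__":
    top = [10, 20, 30, 40]
    left = [10, 10, 10, 10]
    block = [list(top) for _ in range(4)]  # vertical stripes
    print(rd_mode_decision(block, top, left, lam=10.0))
```

The cost of this loop grows with the number of modes, parameters and partitionings that are tested, which is the complexity that a fast mode decision tries to avoid.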

Therefore, to overcome the described disadvantage, we aim at avoiding the RD optimization complexity for the intra prediction mode decision. Considering that the intra prediction mode decision can be formulated as a classification problem with the different intra prediction modes forming the classes, we suggest machine learning approaches as a solution. Deep learning is a very active topic in the machine learning community, and deep convolutional neural networks (CNNs) have been shown to provide superior results for classification problems. For this reason, we use CNNs for the intra prediction mode decision.

Motion Blur Compensation

General motion compensation uses previously coded blocks to predict the content of the current block, which reduces the bit rate because only the displacement vector and the prediction difference are coded instead of the original block content. The prediction accuracy, however, is decreased by varying motion blur, which accompanies accelerating objects.
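Block-based motion compensation can be sketched as a full search over a small window in the reference frame. The sketch below assumes integer-sample displacements, the sum of absolute differences (SAD) as matching criterion, and hypothetical helper names such as `motion_search`. It returns the displacement vector and the residual, which are the two quantities that would be coded.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]


def get_block(frame: Frame, y: int, x: int, n: int) -> Frame:
    """Extract the n x n block whose top-left sample is at (y, x)."""
    return [row[x:x + n] for row in frame[y:y + n]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two blocks."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def motion_search(cur: Frame, ref: Frame, y: int, x: int, n: int,
                  search_range: int) -> Tuple[Tuple[int, int], Frame]:
    """Full search: return the best displacement (dy, dx) and the residual block."""
    h, w = len(ref), len(ref[0])
    target = get_block(cur, y, x, n)
    best_cost: Optional[int] = None
    best_mv = (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + n > h or rx + n > w:
                continue  # candidate block lies outside the reference frame
            cost = sad(target, get_block(ref, ry, rx, n))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    prediction = get_block(ref, y + best_mv[0], x + best_mv[1], n)
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(target, prediction)]
    return best_mv, residual


if __name__ == "__main__":
    ref = [[8 * r + c for c in range(8)] for r in range(8)]
    # Current frame: content of the reference shifted right by one sample.
    cur = [[ref[r][max(c - 1, 0)] for c in range(8)] for r in range(8)]
    print(motion_search(cur, ref, y=2, x=2, n=2, search_range=2))
```

When the match is exact, the residual is zero and only the displacement vector remains to be coded. Blur that differs between the two frames prevents such exact matches.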

To compensate for the inaccuracy of general motion compensation, the characteristics of motion blur are studied, and several approaches based on reference frame filtering have been considered and tested. These approaches are referred to as motion blur compensation methods. The current approaches target the case of single-layer coding and avoid additional signaling of the filter choice or the filter coefficients. An example is shown below in Fig. 1.
Figure 1: Example prediction result showing the prediction mode distribution. Red: Blur, Green: Skip, Yellow: Inter, White: Intra.
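Reference frame filtering can be illustrated with a small sketch that blurs the reference with a fixed-length filter and checks which version predicts the current samples better. Everything specific here is an assumption for illustration: a horizontal box filter, zero motion, SAD as criterion, and the hypothetical names `box_blur_rows` and `select_reference`. The sketch does not model how the filter choice is derived without signaling.

```python
from typing import List, Tuple

Frame = List[List[int]]


def box_blur_rows(frame: Frame, length: int) -> Frame:
    """Apply a horizontal box filter of odd length with edge clamping."""
    half = length // 2
    blurred = []
    for row in frame:
        w = len(row)
        new_row = []
        for x in range(w):
            window = [row[min(max(x + k, 0), w - 1)]
                      for k in range(-half, half + 1)]
            new_row.append((sum(window) + length // 2) // length)  # rounded mean
        blurred.append(new_row)
    return blurred


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def select_reference(cur: Frame, ref: Frame, length: int) -> Tuple[str, int]:
    """Compare the sharp and the blurred reference and return the better one."""
    sad_sharp = sad(cur, ref)
    sad_blur = sad(cur, box_blur_rows(ref, length))
    if sad_blur < sad_sharp:
        return "blur", sad_blur
    return "sharp", sad_sharp


if __name__ == "__main__":
    ref = [[0, 0, 0, 90, 90, 90]]    # sharp edge in the reference
    cur = [[0, 0, 30, 60, 90, 90]]   # same edge, smeared by motion blur
    print(box_blur_rows(ref, 3))
    print(select_reference(cur, ref, 3))
```

In this toy case the blurred reference matches the smeared edge, so filtering the reference removes a residual that the sharp reference would leave behind.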

Screen Content Coding

Applications like remote computing and wireless displays, together with the growing usage of mobile devices (e.g. smartphones), have led to new scenarios for how computers are used. In these scenarios, the output is displayed on a device (e.g. a tablet computer) that differs from the program execution device (e.g. a server in the cloud). For this purpose, the program output needs to be coded and transmitted from the execution device to the display device. The coding of computer-generated video signals is commonly referred to as screen content coding.

A screen content coding extension for the state-of-the-art video coding standard HEVC was developed by the Joint Collaborative Team on Video Coding (JCT-VC). New coding tools, which address several typical characteristics of screen content (e.g. no noise, small number of different colors, RGB source material), were included in this extension. However, none of the new coding tools addresses static content.

Therefore, taking into account that the absence of temporal changes is very common for screen content videos, we propose a coding tool that specifically addresses this kind of content. This coding mode, which we refer to as copy mode, is based on the direct copy of the collocated block from the reference frame. Additionally, with real-time applications in mind, the copy mode is enhanced with several encoder optimizations to enable fast encoding of static content.
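The principle of copying the collocated block can be sketched as a per-block comparison between the current frame and the reference frame. This is a minimal sketch, assuming an exact-match criterion and a hypothetical function `copy_mode_map`. The encoder optimizations mentioned above are not modeled.

```python
from typing import List

Frame = List[List[int]]


def copy_mode_map(cur: Frame, ref: Frame, n: int) -> List[List[str]]:
    """Label each n x n block "copy" if it equals the collocated reference block.

    Blocks labeled "copy" need no motion vector and no residual in this sketch.
    All other blocks are labeled "other" and left to the regular coding modes.
    """
    h, w = len(cur), len(cur[0])
    modes = []
    for y in range(0, h, n):
        row_modes = []
        for x in range(0, w, n):
            static = all(cur[yy][x:x + n] == ref[yy][x:x + n]
                         for yy in range(y, min(y + n, h)))
            row_modes.append("copy" if static else "other")
        modes.append(row_modes)
    return modes


if __name__ == "__main__":
    ref = [[0] * 4 for _ in range(4)]
    cur = [row[:] for row in ref]
    cur[3][3] = 5  # a single changed sample, e.g. a blinking cursor
    for row in copy_mode_map(cur, ref, 2):
        print(row)
```

For mostly static screen content, the majority of blocks fall into the "copy" class, so the encoder can skip the mode search for them.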

  • Conference Contributions
    • Thorsten Laude, Felix Haub, Jörn Ostermann
      HEVC Inter Coding using Deep Recurrent Neural Networks and Artificial Reference Pictures
      Picture Coding Symposium (PCS), IEEE, Ningbo, CN, November 2019
    • Deyao Zhu, Marco Munderloh, Bodo Rosenhahn, Jörg Stückler
      Learning to Disentangle Latent Physical Factors for Video Prediction
      German Conference on Pattern Recognition (GCPR), September 2019
    • Holger Meuel, Stephan Ferenz, Yiqun Liu, Jörn Ostermann
      Rate-Distortion Theory for Simplified Affine Motion Compensation Used in Video Coding
      Proceedings of the IEEE International Conference on Visual Communications and Image Processing (VCIP), Taichung, Taiwan, December 2018
    • Felix Haub, Thorsten Laude, Jörn Ostermann
      HEVC Inter Coding Using Deep Recurrent Neural Networks and Artificial Reference Pictures
      arXiv Preprint 1812.02137, December 2018
    • Holger Meuel, Stephan Ferenz, Yiqun Liu, Jörn Ostermann
      Rate-Distortion Theory for Affine Global Motion Compensation in Video Coding
      Proceedings of the 25th IEEE International Conference on Image Processing (ICIP), pp. 3593-3597, Athens, Greece, October 2018
    • Holger Meuel
      Rate-Distortion Theory for Affine (Global) Motion Compensation in Video Coding
      Proceedings of the 4th Summer School on Video Compression and Processing (SVCP) 2018, Leibniz Universität Hannover, Institut für Informationsverarbeitung, p. 6, Hannover, Germany, July 2018, edited by Voges, Jan
    • Thorsten Laude, Yeremia Gunawan Adhisantoso, Jan Voges, Marco Munderloh, Jörn Ostermann
      A Comparison of JEM and AV1 with HEVC: Coding Tools, Coding Efficiency and Complexity
      2018 Picture Coding Symposium (PCS), pp. 36-40, June 2018
    • Yiqun Liu, Jörn Ostermann
      Scene-based KLT for Intra Coding in HEVC
      Picture Coding Symposium (PCS), June 2018
    • Bastian Wandt, Thorsten Laude, Bodo Rosenhahn, Jörn Ostermann
      Extending HEVC with a Texture Synthesis Framework using Detail-aware Image Decomposition
      Proceedings of the Picture Coding Symposium (PCS), IEEE, San Francisco, US, June 2018
    • Bastian Wandt, Thorsten Laude, Bodo Rosenhahn, Jörn Ostermann
      Detail-aware image decomposition for an HEVC-based texture synthesis framework
      Data Compression Conference (DCC), March 2018
    • Bastian Wandt, Thorsten Laude, Yiqun Liu, Bodo Rosenhahn, Jörn Ostermann
      Extending HEVC Using Texture Synthesis
      IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, Florida, USA, December 2017
    • Thorsten Laude, Jörn Ostermann
      Contour-based Multidirectional Intra Coding for HEVC
      Proceedings of 32nd Picture Coding Symposium (PCS), Nuremberg, Germany, December 2016
    • Thorsten Laude, Jörn Ostermann
      Deep learning-based intra prediction mode decision for HEVC
      Proceedings of 32nd Picture Coding Symposium (PCS), Nuremberg, Germany, December 2016
    • Thorsten Laude, Jörn Ostermann
      Copy Mode for Static Screen Content Coding with HEVC
      IEEE International Conference on Image Processing (ICIP), IEEE, pp. 1930-1934, Québec City, Canada, September 2015
    • Yiqun Liu, Jörn Ostermann
      Fast Motion Blur Compensation in HEVC Using Fixed-length Filter
      IEEE International Conference on Image Processing (ICIP), Québec City, Canada, September 2015
    • Yiqun Liu, Wei Wu, Jörn Ostermann
      Motion Blur Compensation in HEVC Using Fixed-Length Adaptive Filter
      Proceedings of 31st Picture Coding Symposium, Cairns, Australia, May 2015
    • Thorsten Laude, Xiaoyu Xiu, Jie Dong, Yuwen He, Yan Ye, Jörn Ostermann
      Improved Inter-Layer Prediction for the Scalable Extensions of HEVC
      Data Compression Conference (DCC), IEEE, p. 412, Snowbird, UT, USA, March 2014
    • Thorsten Laude, Xiaoyu Xiu, Jie Dong, Yuwen He, Yan Ye, Jörn Ostermann
      Scalable Extension of HEVC Using Enhanced Inter-Layer Prediction
      IEEE International Conference on Image Processing (ICIP), IEEE, pp. 3739-3743, Paris, France, 2014
    • Thorsten Laude, Holger Meuel, Yiqun Liu, Jörn Ostermann
      Motion Blur Compensation in Scalable HEVC Hybrid Video Coding
      Proceedings of 30th Picture Coding Symposium (PCS), IEEE, pp. 313-316, San Jose, California, USA, December 2013
    • Holger Meuel, Julia Schmidt, Marco Munderloh, Jörn Ostermann
      Analysis of Coding Tools and Improvement of Text Readability for Screen Content
      Proceedings of Picture Coding Symposium (PCS), IEEE, Krakow, Poland, May 2012
    • Sven Klomp, Marco Munderloh, Jörn Ostermann
      Block Size Dependent Error Model for Motion Compensation
      IEEE International Conference on Image Processing, pp. 969-972, Hong Kong, September 2010
    • Marco Munderloh, Sven Klomp, Jörn Ostermann
      Mesh-based Decoder-Side Motion Estimation
      IEEE International Conference on Image Processing, pp. 2049-2052, Hong Kong, September 2010
    • Robert Cohen, Sven Klomp, Anthony Vetro, Huifang Sun
      Direction-Adaptive Transforms for Coding Prediction Residuals
      IEEE International Conference on Image Processing, pp. 185-188, Hong Kong, September 2010
  • Journals
    • Holger Meuel, Jörn Ostermann
      Analysis of Affine Motion-Compensated Prediction in Video Coding
      IEEE Transactions on Image Processing, IEEE, Vol. 29, pp. 7359-7374, June 2020
    • Thorsten Laude, Yeremia Gunawan Adhisantoso, Jan Voges, Marco Munderloh, Jörn Ostermann
      A Comprehensive Video Codec Comparison
      APSIPA Transactions on Signal and Information Processing, Vol. 8, No. 1, November 2019
    • Thorsten Laude, Jan Tumbrägel, Marco Munderloh, Jörn Ostermann
      Non-linear Contour-based Multidirectional Intra Coding
      APSIPA Transactions on Signal and Information Processing, Cambridge University Press, Vol. 7, No. 11, Cambridge, October 2018, edited by Tatsuya Kawahara
    • Fernando Pereira, Luis Torres, Christine Guillemot, Touradj Ebrahimi, Riccardo Leonardi, Sven Klomp
      Distributed Video Coding: Selecting the most promising application scenarios
      Signal Processing: Image Communication, Elsevier B.V., Vol. 23, No. 5, pp. 339-352, June 2008
    • Joern Ostermann
      Object-based analysis-synthesis Coding based on the source model of moving rigid 3D-objects
      Signal Processing: Image Communication, No. 6, pp. 143-161, January 1994
  • Books
    • Dipl.-Ing. Thorsten Laude
      Konturbasierte multidirektionale Intra-Prädiktion für die Videocodierung
      Fortschritt-Berichte VDI, VDI Verlag GmbH, Vol. 10, No. 871, 2021
    • Holger Meuel
      Analysis of Affine Motion-Compensated Prediction and its Application in Aerial Video Coding
      Fortschritt-Berichte VDI, VDI Verlag GmbH, Vol. 10, No. 865, Düsseldorf, December 2019
  • Book Chapters
    • Holger Meuel, Julia Schmidt, Marco Munderloh, Jörn Ostermann
      Advanced Video Coding for Next-Generation Multimedia Services - Chapter 3: Region of Interest Coding for Aerial Video Sequences Using Landscape Models
      Advanced Video Coding for Next-Generation Multimedia Services, Intech, pp. 51-78, January 2013, edited by Yo-Sung Ho
  • Standardisation Contributions
    • Holger Meuel, Stephan Preihs, Jörn Ostermann
      On Light-field Cameras using the Example of Lytro Illum
      ISO/IEC JTC1/SC29/WG11 Document M36169, 112th MPEG-Meeting, Warsaw, PL, June 2015
  • Technical Report
    • Holger Meuel, Marco Munderloh, Jörn Ostermann
      Radial Distortion in Hybrid Video Coding
      Technical Report, Institut für Informationsverarbeitung, Hannover, Germany, June 2012