Next Generation Video Coding

Video coding has played an important role in communication and multimedia systems since the 1990s, and bandwidth is still a valuable commodity. Video coding techniques make it possible to code video at the lowest possible bit rate while maintaining a given level of video quality. The latest video coding standard, High Efficiency Video Coding (HEVC), is capable of doubling the data compression ratio compared to H.264/MPEG-4 AVC at the same level of video quality. The current standard builds on major cornerstones such as intra-frame prediction, inter-frame prediction, scaling, transform, and scalar quantization.

However, the increasing diversity of services and the growing demand for applications based on high-resolution video (e.g. 4k x 2k, 8k x 4k) still leave room for improvement. At the Institut für Informationsverarbeitung (TNT), a variety of aspects of video coding have been studied to further improve the current HEVC standard. They are listed below.

Intra Coding, Texture Synthesis, and Deep Learning-based Encoder Control

Intra Coding

The prediction tools which led to the successful application of modern video coding standards can be roughly divided into inter and intra coding tools. While intra coding relies solely on information contained in the current picture, inter coding exploits the redundancy between different pictures to further increase the coding efficiency. Therefore, in general, intra coding requires considerably higher bit rates than inter coding to achieve the same visual quality for typical video signals.

Nevertheless, intra coding is an essential part of all video coding systems: it is required if new content appears in a sequence, to start a video transmission, for random access into ongoing transmissions, and for error concealment. In this project, we aim at increasing the coding efficiency of intra prediction. Additionally, we combine contour-driven technologies with video coding technologies to further improve our prediction.
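
To illustrate what relying only on the current picture means, the following minimal sketch predicts a block from its already reconstructed neighbours using a DC-style prediction (the mean of the samples above and to the left). The function names and sample values are hypothetical, and the boundary smoothing that HEVC applies to DC-predicted blocks is left out.

```python
from typing import List


def dc_predict(above: List[int], left: List[int]) -> List[List[int]]:
    """Fill an N x N block with the rounded mean of its 2N neighbouring samples."""
    n = len(above)
    if n == 0 or len(left) != n:
        raise ValueError("above and left must be non-empty and equally long")
    dc = (sum(above) + sum(left) + n) // (2 * n)  # integer mean with rounding
    return [[dc] * n for _ in range(n)]


def residual(block: List[List[int]], prediction: List[List[int]]) -> List[List[int]]:
    """Sample-wise difference between the original block and its prediction."""
    return [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, prediction)]


# Hypothetical reconstructed neighbours and original 4x4 block.
above = [100, 102, 104, 106]
left = [98, 100, 102, 104]
block = [[101, 102, 103, 104] for _ in range(4)]

prediction = dc_predict(above, left)
diff = residual(block, prediction)
```

Only the small residual values in diff would have to be coded, which is why a better prediction directly lowers the bit rate.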

Texture Synthesis

The human visual system (HVS) automatically separates relevant from irrelevant information. To construct an image that looks identical to a viewer, only the relevant information needs to be shown. As the relevant information is only a small part of the original image, video coding can profit greatly from modeling the HVS. We develop an algorithm in which a large region of similar texture (e.g. a homogeneously textured wall) is represented by a small patch extracted from that region. A region perceptually similar to the original can be reconstructed from this small patch. Coding only a small patch instead of the whole region results in high bit rate savings.
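
The idea of representing a region by a small patch can be sketched as follows. This is a deliberately naive illustration that rebuilds the region by periodic tiling. The synthesis method actually used in the project is not described on this page, and the function name and sample values are hypothetical.

```python
from typing import List

Patch = List[List[int]]


def synthesize_region(patch: Patch, height: int, width: int) -> Patch:
    """Rebuild a height x width region by repeating the patch periodically."""
    if not patch or not patch[0]:
        raise ValueError("patch must not be empty")
    ph, pw = len(patch), len(patch[0])
    return [[patch[y % ph][x % pw] for x in range(width)] for y in range(height)]


# Hypothetical 2x2 patch standing in for a homogeneous texture.
patch = [[1, 2], [3, 4]]
region = synthesize_region(patch, height=4, width=6)

# Fraction of samples that must be coded when only the patch is transmitted.
coded_fraction = (len(patch) * len(patch[0])) / (4 * 6)
```

In this toy case only 4 of the 24 samples are coded. A practical scheme would also have to hide patch seams and signal where the synthesized region lies.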

Deep Learning-based Encoder Control

The superior coding efficiency of HEVC and its extensions is achieved at the expense of very complex encoders. Analyses have shown that HEVC encoders are several times more complex than AVC encoders. One main complexity driver for HEVC encoders is the comprehensive rate-distortion (RD) optimization, which is indispensable to fully exploit all benefits of the HEVC standard. A major disadvantage of the RD optimization is that encoders which cannot afford it in its comprehensive form will likely not achieve the optimal coding efficiency. The RD optimization consists of evaluating all possible combinations (coding modes, parameters for these coding modes, partitioning, etc.) and selecting the combination with the smallest RD cost. In the case of intra prediction, the RD optimization determines the intra prediction mode. Specifically, HEVC provides 35 intra prediction modes: 33 angular modes, the DC mode, and the planar mode.
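
The selection rule described above can be sketched in a few lines. The sketch assumes the common Lagrangian form of the RD cost, J = D + lambda * R, which this page does not spell out. The function names and the distortion and rate values are made up for illustration.

```python
from typing import Dict, Tuple


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def select_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the candidate with the smallest RD cost.

    candidates maps a mode name to a (distortion, rate_bits) pair that the
    encoder obtained by actually coding the block with that mode.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Hypothetical measurements for three intra modes of one block.
candidates = {
    "planar": (120.0, 10.0),
    "dc": (150.0, 6.0),
    "angular_26": (90.0, 18.0),
}

best_at_low_lambda = select_mode(candidates, lam=1.0)
best_at_high_lambda = select_mode(candidates, lam=10.0)
```

Each entry in candidates requires a trial encoding of the block, so the cost of the search grows with the number of modes, parameters, and partitionings that are tested.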

Therefore, to overcome the described disadvantage, we aim at avoiding the RD optimization complexity for the intra prediction mode decision. Since the intra prediction mode decision can be formulated as a classification problem with the different intra prediction modes forming the classes, we suggest machine learning approaches as a solution. Deep learning is a very active topic in the machine learning community, and deep convolutional neural networks (CNNs) have provided superior results for many classification problems. For this reason, we use CNNs for the intra prediction mode decision.
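
Framed as classification, the mode decision reduces to one classifier evaluation followed by an argmax over the class scores. The sketch below shows only this framing. The toy_classifier function is a hypothetical stand-in for a trained CNN and is not the network used in the project. The mode indices follow the HEVC numbering (0: planar, 1: DC, 2 to 34: angular).

```python
from typing import Callable, List, Sequence

NUM_INTRA_MODES = 35  # planar (0), DC (1), angular (2..34)

Block = List[List[int]]


def decide_intra_mode(block: Block,
                      classifier: Callable[[Block], Sequence[float]]) -> int:
    """Pick the intra mode with the highest classifier score."""
    scores = classifier(block)
    if len(scores) != NUM_INTRA_MODES:
        raise ValueError("classifier must return one score per intra mode")
    return max(range(NUM_INTRA_MODES), key=lambda mode: scores[mode])


def toy_classifier(block: Block) -> List[float]:
    """Hypothetical stand-in for a CNN: DC for flat blocks, planar otherwise."""
    samples = [s for row in block for s in row]
    scores = [0.0] * NUM_INTRA_MODES
    if max(samples) == min(samples):
        scores[1] = 1.0  # DC
    else:
        scores[0] = 1.0  # planar
    return scores


flat_mode = decide_intra_mode([[5, 5], [5, 5]], toy_classifier)
gradient_mode = decide_intra_mode([[1, 2], [3, 4]], toy_classifier)
```

In contrast to an exhaustive RD search, no trial encoding per mode is needed. The price is that a misclassification leads to a suboptimal mode.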

Motion Blur Compensation

General motion compensation uses previously coded blocks to predict the content of the current block in order to reduce the bit rate: only the displacement vector and the prediction difference are coded instead of the original block content. The prediction accuracy, however, is decreased by varying motion blur, which accompanies accelerating objects.
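
A minimal block-matching sketch of this principle is given below. It assumes a full search with the sum of absolute differences (SAD) as matching criterion and integer displacements. Real encoders use faster searches and sub-sample accuracy, and all names and sample values here are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]


def get_block(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_search(ref: Frame, cur: Frame, top: int, left: int, size: int,
                  search_range: int) -> Tuple[Tuple[int, int], Frame]:
    """Return the best displacement (dy, dx) and the remaining residual."""
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would leave the reference frame
            cost = sad(target, get_block(ref, t, l, size))
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    pred = get_block(ref, top + best_vec[0], left + best_vec[1], size)
    residual = [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(target, pred)]
    return best_vec, residual


# Hypothetical frames: the current frame is the reference shifted right by one.
ref = [[10 * y + x for x in range(6)] for y in range(6)]
cur = [[0] + row[:-1] for row in ref]

vector, residual = motion_search(ref, cur, top=2, left=2, size=2, search_range=1)
```

For this pure translation the residual is zero, so only the displacement vector would have to be coded. Motion blur breaks this assumption because the blurred block no longer matches any sharp block in the reference.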

To compensate for the inaccuracy of general motion compensation, the characteristics of motion blur are studied, and several approaches based on reference frame filtering have been considered and tested. These approaches are referred to as motion blur compensation methods. Current approaches target the case of single-layer coding and avoid additional signaling of the filter choice or the filter coefficients. An example is shown below in Fig. 1.

Figure 1: Prediction mode distribution. Red: Blur, Green: Skip, Yellow: Inter, White: Intra.
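
The principle of reference frame filtering can be sketched as follows. The sketch assumes a fixed horizontal averaging filter of length 3 and a SAD comparison between the plain and the blurred prediction. Both are illustrative assumptions. The filters and the decision rule of the actual methods are not described on this page.

```python
from typing import List, Tuple

Block = List[List[int]]


def blur_rows(block: Block, length: int = 3) -> Block:
    """Horizontal moving average of odd length with edge replication."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd number")
    half = length // 2
    out = []
    for row in block:
        n = len(row)
        out.append([
            (sum(row[min(max(i + k, 0), n - 1)] for k in range(-half, half + 1))
             + half) // length  # rounded integer average
            for i in range(n)
        ])
    return out


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def predict(ref_block: Block, cur_block: Block) -> Tuple[str, Block]:
    """Use the blurred reference only if it matches the current block better."""
    blurred = blur_rows(ref_block)
    if sad(cur_block, blurred) < sad(cur_block, ref_block):
        return "blur", blurred
    return "plain", ref_block


# Hypothetical data: a sharp edge in the reference, a blurred edge in the
# current block.
ref_block = [[0, 0, 0, 90, 90, 90] for _ in range(2)]
cur_block = [[0, 0, 30, 60, 90, 90] for _ in range(2)]

mode, prediction = predict(ref_block, cur_block)
```

Because the filter is fixed in this sketch, no filter coefficients would have to be transmitted.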

Screen Content Coding

Applications like remote computing and wireless displays, together with the growing usage of mobile devices (e.g. smartphones), have led to new scenarios of how computers are used. In these scenarios, the output is displayed on a different device (e.g. a tablet computer) than the device which executes the program (e.g. a server in the cloud). For this purpose, the program output needs to be coded and transmitted from the execution device to the display device. The coding of computer-generated video signals is commonly referred to as screen content coding.

A screen content coding extension for the state-of-the-art video coding standard HEVC was developed by the Joint Collaborative Team on Video Coding (JCT-VC). New coding tools, which address several typical characteristics of screen content (e.g. no noise, small number of different colors, RGB source material), were included in this extension. However, none of the new coding tools addresses static content.

Therefore, taking into account that the absence of temporal changes is very common for screen content videos, we propose a coding tool specifically addressing this kind of content. This coding mode, which we refer to as copy mode, is based on the direct copy of the collocated block from the reference frame. Additionally, with real-time applications in mind, the copy mode is enhanced with several encoder optimizations to enable the fast encoding of static content.
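
The copy mode idea can be illustrated with a short sketch. It assumes that the encoder selects the mode whenever the current block is identical to the collocated block in the reference frame. This decision rule and all names are hypothetical, and the encoder optimizations mentioned above are not modeled.

```python
from typing import List

Frame = List[List[int]]


def collocated_block(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Extract the size x size block at position (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def can_use_copy_mode(ref: Frame, cur: Frame, top: int, left: int,
                      size: int) -> bool:
    """Encoder side: the block is static if it equals its collocated block."""
    return (collocated_block(cur, top, left, size)
            == collocated_block(ref, top, left, size))


def decode_copy_block(ref: Frame, top: int, left: int, size: int) -> Frame:
    """Decoder side: reconstruct the block by copying from the reference."""
    return collocated_block(ref, top, left, size)


# Hypothetical frames: only the top-left sample changed between two pictures.
ref = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
cur = [row[:] for row in ref]
cur[0][0] = 99

static_block = can_use_copy_mode(ref, cur, top=0, left=2, size=2)
changed_block = can_use_copy_mode(ref, cur, top=0, left=0, size=2)
reconstructed = decode_copy_block(ref, top=0, left=2, size=2)
```

For a block coded this way, neither a motion vector nor a residual is needed, since the decoder already holds the collocated block.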

Publications

Conference Contributions
- Thorsten Laude, Felix Haub, Jörn Ostermann
  HEVC Inter Coding using Deep Recurrent Neural Networks and Artificial Reference Pictures
  Picture Coding Symposium (PCS), IEEE, Ningbo, CN, November 2019
- Deyao Zhu, Marco Munderloh, Bodo Rosenhahn, Jörg Stückler
  Learning to Disentangle Latent Physical Factors for Video Prediction
  German Conference on Pattern Recognition (GCPR), September 2019
- Holger Meuel, Stephan Ferenz, Yiqun Liu, Jörn Ostermann
  Rate-Distortion Theory for Simplified Affine Motion Compensation Used in Video Coding
  Proceedings of the IEEE International Conference on Visual Communications and Image Processing (VCIP), Taichung, Taiwan, December 2018
- Felix Haub, Thorsten Laude, Jörn Ostermann
  HEVC Inter Coding Using Deep Recurrent Neural Networks and Artificial Reference Pictures
  arXiv Preprint 1812.02137, December 2018
- Holger Meuel, Stephan Ferenz, Yiqun Liu, Jörn Ostermann
  Rate-Distortion Theory for Affine Global Motion Compensation in Video Coding
  Proceedings of the 25th IEEE International Conference on Image Processing (ICIP), pp. 3593-3597, Athens, Greece, October 2018
- Holger Meuel
  Rate-Distortion Theory for Affine (Global) Motion Compensation in Video Coding
  Proceedings of the 4th Summer School on Video Compression and Processing (SVCP) 2018, Leibniz Universität Hannover, Institut für Informationsverarbeitung, p. 6, Hannover, Germany, July 2018, edited by Voges, Jan
- Thorsten Laude, Yeremia Gunawan Adhisantoso, Jan Voges, Marco Munderloh, Jörn Ostermann
  A Comparison of JEM and AV1 with HEVC: Coding Tools, Coding Efficiency and Complexity
  2018 Picture Coding Symposium (PCS), pp. 36-40, June 2018
- Yiqun Liu, Jörn Ostermann
  Scene-based KLT for Intra Coding in HEVC
  Picture Coding Symposium (PCS), June 2018
- Bastian Wandt, Thorsten Laude, Bodo Rosenhahn, Jörn Ostermann
  Extending HEVC with a Texture Synthesis Framework using Detail-aware Image Decomposition
  Proceedings of the Picture Coding Symposium (PCS), IEEE, San Francisco, US, June 2018
- Bastian Wandt, Thorsten Laude, Bodo Rosenhahn, Jörn Ostermann
  Detail-aware image decomposition for an HEVC-based texture synthesis framework
  Data Compression Conference (DCC), March 2018
- Bastian Wandt, Thorsten Laude, Yiqun Liu, Bodo Rosenhahn, Jörn Ostermann
  Extending HEVC Using Texture Synthesis
  IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, Florida, USA, December 2017
- Thorsten Laude, Jörn Ostermann
  Contour-based Multidirectional Intra Coding for HEVC
  Proceedings of 32nd Picture Coding Symposium (PCS), Nuremberg, Germany, December 2016
- Thorsten Laude, Jörn Ostermann
  Deep learning-based intra prediction mode decision for HEVC
  Proceedings of 32nd Picture Coding Symposium (PCS), Nuremberg, Germany, December 2016
- Thorsten Laude, Jörn Ostermann
  Copy Mode for Static Screen Content Coding with HEVC
  IEEE International Conference on Image Processing (ICIP), IEEE, pp. 1930-1934, Québec City, Canada, September 2015
- Yiqun Liu, Jörn Ostermann
  Fast Motion Blur Compensation in HEVC Using Fixed-length Filter
  IEEE International Conference on Image Processing (ICIP), Québec City, Canada, September 2015
- Yiqun Liu, Wei Wu, Jörn Ostermann
  Motion Blur Compensation in HEVC Using Fixed-Length Adaptive Filter
  Proceedings of 31st Picture Coding Symposium, Cairns, Australia, May 2015
- Thorsten Laude, Xiaoyu Xiu, Jie Dong, Yuwen He, Yan Ye, Jörn Ostermann
  Improved Inter-Layer Prediction for the Scalable Extensions of HEVC
  Data Compression Conference (DCC), IEEE, p. 412, Snowbird, UT, USA, March 2014
- Thorsten Laude, Xiaoyu Xiu, Jie Dong, Yuwen He, Yan Ye, Jörn Ostermann
  Scalable Extension of HEVC Using Enhanced Inter-Layer Prediction
  IEEE International Conference on Image Processing (ICIP), IEEE, pp. 3739-3743, Paris, France, 2014
- Thorsten Laude, Holger Meuel, Yiqun Liu, Jörn Ostermann
  Motion Blur Compensation in Scalable HEVC Hybrid Video Coding
  Proceedings of 30th Picture Coding Symposium (PCS), IEEE, pp. 313-316, San Jose, California, USA, December 2013
- Holger Meuel, Julia Schmidt, Marco Munderloh, Jörn Ostermann
  Analysis of Coding Tools and Improvement of Text Readability for Screen Content
  Proceedings of Picture Coding Symposium (PCS), IEEE, Krakow, Poland, May 2012
- Sven Klomp, Marco Munderloh, Jörn Ostermann
  Block Size Dependent Error Model for Motion Compensation
  IEEE International Conference on Image Processing, pp. 969-972, Hong Kong, September 2010
- Marco Munderloh, Sven Klomp, Jörn Ostermann
  Mesh-based Decoder-Side Motion Estimation
  IEEE International Conference on Image Processing, pp. 2049-2052, Hong Kong, September 2010
- Robert Cohen, Sven Klomp, Anthony Vetro, Huifang Sun
  Direction-Adaptive Transforms for Coding Prediction Residuals
  IEEE International Conference on Image Processing, pp. 185-188, Hong Kong, September 2010
Journals
- Holger Meuel, Jörn Ostermann
  Analysis of Affine Motion-Compensated Prediction in Video Coding
  IEEE Transactions on Image Processing, IEEE, Vol. 29, pp. 7359-7374, June 2020
- Thorsten Laude, Yeremia Gunawan Adhisantoso, Jan Voges, Marco Munderloh, Jörn Ostermann
  A Comprehensive Video Codec Comparison
  APSIPA Transactions on Signal and Information Processing, Vol. 8, No. 1, November 2019
- Thorsten Laude, Jan Tumbrägel, Marco Munderloh, Jörn Ostermann
  Non-linear Contour-based Multidirectional Intra Coding
  APSIPA Transactions on Signal and Information Processing, Cambridge University Press, Vol. 7, No. 11, Cambridge, October 2018, edited by Tatsuya Kawahara
- Fernando Pereira, Luis Torres, Christine Guillemot, Touradj Ebrahimi, Riccardo Leonardi, Sven Klomp
  Distributed Video Coding: Selecting the most promising application scenarios
  Signal Processing: Image Communication, Elsevier B.V., Vol. 23, No. 5, pp. 339-352, June 2008
- Jörn Ostermann
  Object-based analysis-synthesis Coding based on the source model of moving rigid 3D-objects
  Signal Processing: Image Communication, No. 6, pp. 143-161, January 1994
Books
- Dipl.-Ing. Thorsten Laude
  Konturbasierte multidirektionale Intra-Prädiktion für die Videocodierung
  Fortschritt-Berichte VDI, VDI Verlag GmbH, Vol. 10, No. 871, 2021
- Holger Meuel
  Analysis of Affine Motion-Compensated Prediction and its Application in Aerial Video Coding
  Fortschritt-Berichte VDI, VDI Verlag GmbH, Vol. 10, No. 865, Düsseldorf, December 2019
Book Chapters
- Holger Meuel, Julia Schmidt, Marco Munderloh, Jörn Ostermann
  Advanced Video Coding for Next-Generation Multimedia Services - Chapter 3: Region of Interest Coding for Aerial Video Sequences Using Landscape Models
  Advanced Video Coding for Next-Generation Multimedia Services, Intech, pp. 51-78, January 2013, edited by Yo-Sung Ho
Standardisation Contributions
- Holger Meuel, Stephan Preihs, Jörn Ostermann
  On Light-field Cameras using the Example of Lytro Illum
  ISO/IEC JTC1/SC29/WG11 Document M36169, 112th MPEG-Meeting, Warsaw, PL, June 2015
Technical Report
- Holger Meuel, Marco Munderloh, Jörn Ostermann
  Radial Distortion in Hybrid Video Coding
  Technical Report, Institut für Informationsverarbeitung, Hannover, Germany, June 2012