Conditional Coding for Learned Image and Video Compression

TNT members involved in this project:
Martin Benjak, M. Sc.
Prof. Dr.-Ing. Jörn Ostermann

In this project, we investigate learning-based video coding from the perspective of conditional coding, combined with a meta-learning-based method for regularization and dynamic adaptation. The project is carried out in cooperation with the Department of Computer Science of the National Chiao Tung University in Taiwan.

New deep neural network architectures such as variational autoencoders (VAEs) and augmented normalizing flows (ANFs) open up new possibilities for learning-based video coding. In this project, we develop a new method for conditional video coding based on ANFs instead of the commonly used VAEs. ANFs have the advantage of being more expressive than VAEs while still including VAEs as a special case.

A second aspect of this project concerns the adaptability and generalization ability of learning-based video codecs. A known weakness of such codecs is the mismatch between the data distributions of training and test data: a codec may achieve good results on its training data but poor results on unseen data. To improve generalization, we develop a meta cost function that preserves properties shared across the frames of a video sequence. We further use this meta cost function to dynamically adapt the decoder to the distribution of the input data during inference.
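The key structural property that makes ANFs attractive here is that their coupling steps are exactly invertible, so encoding and decoding use the same sub-networks run in opposite order. The following minimal NumPy sketch illustrates one conditional additive-coupling step; all names, shapes, and the linear stand-in "networks" are illustrative assumptions, not the project's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM_X, DIM_C, DIM_Z = 8, 8, 16

# Stand-in "networks": fixed random linear maps (a real codec would use CNNs).
W_SHIFT = rng.standard_normal((DIM_X + DIM_C, DIM_Z)) * 0.1
W_DEC = rng.standard_normal((DIM_Z + DIM_C, DIM_X)) * 0.1

def shift_net(x, c):
    # Predicts an update for the latent z from the frame x and the condition c.
    return np.concatenate([x, c]) @ W_SHIFT

def decode_net(z, c):
    # Predicts a reconstruction of x from the latent z and the condition c.
    return np.concatenate([z, c]) @ W_DEC

def encode_step(x, z, c):
    # Additive coupling: each update depends only on the *other* variable
    # (plus the condition c), which keeps the transform exactly invertible.
    z = z + shift_net(x, c)
    x = x - decode_net(z, c)
    return x, z

def decode_step(x, z, c):
    # Exact inverse: the same sub-networks, in reverse order, with opposite signs.
    x = x + decode_net(z, c)
    z = z - shift_net(x, c)
    return x, z

# A current frame x, an augmenting latent z, and a conditioning signal c
# (in conditional video coding, c could be a motion-compensated reference frame).
x0 = rng.standard_normal(DIM_X)
z0 = rng.standard_normal(DIM_Z)
c = rng.standard_normal(DIM_C)

x_enc, z_enc = encode_step(x0, z0, c)
x_dec, z_dec = decode_step(x_enc, z_enc, c)

# The round trip recovers x and z up to floating-point precision.
assert np.allclose(x_dec, x0) and np.allclose(z_dec, z0)
```

In an actual ANF-based codec, several such coupling steps are stacked, and the final latent is quantized and entropy-coded; the conditioning signal supplies the temporal prediction. Under a test-time adaptation scheme as described above, a meta cost evaluated on decoded frames could drive small updates to decoder-side sub-networks like `decode_net` during inference.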

  • Conference Contributions
    • Yi-Hsin Chen, Kuan-Wei Ho, Martin Benjak, Jörn Ostermann, Wen-Hsiao Peng
      On the Rate-Distortion-Complexity Trade-offs of Neural Video Coding
      IEEE 26th International Workshop on Multimedia Signal Processing (MMSP), 2024
    • Martin Benjak, Yi-Hsin Chen, Wen-Hsiao Peng, Jörn Ostermann
      Learning-Based Scalable Video Coding with Spatial and Temporal Prediction
      IEEE International Conference on Visual Communications and Image Processing (VCIP), Jeju, Korea, December 2023
    • Hong-Sheng Xie, Yi-Hsin Chen, Wen-Hsiao Peng, Martin Benjak, Jörn Ostermann
      Rate Adaptation for Learned Two-layer B-frame Coding without Signaling Motion Information
      IEEE International Conference on Visual Communications and Image Processing (VCIP), Jeju, Korea, December 2023
  • Journals
    • Yi-Hsin Chen, Hong-Sheng Xie, Cheng-Wei Chen, Zong-Lin Gao, Martin Benjak, Wen-Hsiao Peng, Jörn Ostermann
MaskCRT: Masked Conditional Residual Transformer for Learned Video Compression
      IEEE Transactions on Circuits and Systems for Video Technology, 2024
  • Standardisation Contributions
    • Yi-Hsin Chen, Yi-Chen Yao, Kuan-Wei Ho, Martin Benjak, Wen-Hsiao Peng
      Bitstream Generation and Bit Rate Fitting Results of MaskCRT for CVQM UHD and 4K Sequences
      17th Meeting of ISO/IEC JTC 1/SC 29/AG 5 Document m69870, November 2024
    • Yi-Hsin Chen, Chen-Wei Cheng, Martin Benjak, Wen-Hsiao Peng
      Progress report on MaskCRT
      15th Meeting of ISO/IEC JTC 1/SC 29/AG 5 Document m66976, April 2024
    • Yi-Hsin Chen, Cheng-Wei Chen, Zong-Lin Gao, Martin Benjak, Wen-Hsiao Peng
      Results on the Bit Rate Fitting for MaskCRT
      15th Meeting of ISO/IEC JTC 1/SC 29/AG 5 Document m67455, April 2024
    • Yi-Hsin Chen, Zong-Lin Gao, Yi-Chen Yao, Kuan-Wei Ho, Martin Benjak, Wen-Hsiao Peng
      Bitstream Generation and Bit Rate Fitting Results of MaskCRT for CVQM HD Sequences
      15th Meeting of ISO/IEC JTC 1/SC 29/AG 5 Document m68079, April 2024
    • Yi-Hsin Chen, Zong-Lin Gao, Martin Benjak, Wen-Hsiao Peng
      Response to Call for Learning-Based Video Codecs for Study of Quality Assessment by NYCU and LUH
      14th Meeting of ISO/IEC JTC 1/SC 29/AG 5 Document m66163, January 2024