Conditional Coding for Learned Image and Video Compression

TNT members involved in this project:
Martin Benjak, M. Sc.
Prof. Dr.-Ing. Jörn Ostermann

In this project, we investigate learning-based video coding from the perspective of conditional coding and combine it with a meta-learning-based method for regularization and dynamic adaptation. The project is carried out in cooperation with the Department of Computer Science of the National Chiao Tung University in Taiwan.

The development of new deep neural network architectures such as variational autoencoders (VAEs) and augmented normalizing flows (ANFs) opens up new possibilities for learning-based video coding. In this project, we develop a new method for conditional video coding based on ANFs instead of the commonly used VAEs. ANFs are more expressive than VAEs while still including them as a special case.

A second aspect of this project is the adaptability and generalization ability of learning-based video codecs. A weakness of such codecs is their sensitivity to deviations between the data distributions of training and test data: a codec may achieve good results on the training data but poor results on unseen data. To improve generalization, we develop a meta cost function that preserves properties common to the frames of a video sequence. Furthermore, we use this meta cost function to dynamically adapt the decoder to the distribution of the input data during inference.

Publications:
  • Martin Benjak, Yi-Hsin Chen, Wen-Hsiao Peng, Jörn Ostermann,
    Scalable COOL-CHIC: Dual-Resolution Images from a Single Bitstream
    Picture Coding Symposium (PCS), Aachen, Germany, December 2025
  • Martin Benjak, Yi-Hsin Chen, Wen-Hsiao Peng, Jörn Ostermann,
    Progressive COOL-CHIC: Efficient Decoding for Dual-Resolution Images
    IEEE Visual Communications and Image Processing (VCIP), Klagenfurt, Austria, December 2025
  • Kuan-Wei Ho, Yi-Hsin Chen, Martin Benjak, Jörn Ostermann, Wen-Hsiao Peng,
    A Cross-Framework Study of Temporal Information Buffering Strategies for Learned Video Compression
    Picture Coding Symposium (PCS), Aachen, Germany, December 2025