Genomic Data Science

TNT members involved in this project:
Yeremia G. Adhisantoso, M.Sc.
Fabian Müntefering, M.Sc.
Prof. Dr.-Ing. Jörn Ostermann
Stephanie Kristin Schröder, M.Eng.
Dr.-Ing. Jan Voges

In recent years, technological advances in DNA sequencing have made it faster and more cost-efficient to sequence individual genomes and other genomic samples. The enormous amount of sequencing data generated makes processing, storing, and analyzing it a new challenge for the scientific community. New methods and tools are needed to improve understanding of the underlying biology and to overcome current limitations such as storage space and processing speed.

  • Conference Contributions
    • Christian Rohlfing, Thibaut Meyer, Jens Schneider, Jan Voges
      Python Wrapper for Context-based Adaptive Binary Arithmetic Coding
      2023 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2023
    • Fabian Müntefering, Jörn Ostermann, Jan Voges
      BACON: Bacterial Clone Recognition from Metagenomic Sequencing Data
      AICPM 2023, Hannover (DE), September 2023
    • Yeremia Gunawan Adhisantoso, Jan Voges, Jörn Ostermann
      PEKORA: High-Performance 3D Genome Reconstruction Using K-th Order Spearman's Rank Correlation Approximation
      ISMB/ECCB 2023, Lyon (FR), July 2023
    • Idoia Ochoa, Hongyi Li, Florian Baumgarte, Charles Hergenrother, Jan Voges, Mikel Hernaez
      AliCo: A New Efficient Representation for SAM Files
      2019 Data Compression Conference (DCC), pp. 93-102, March 2019
    • Ana A. Hernandez-Lopez, Jan Voges, Claudio Alberti, Marco Mattavelli, Jörn Ostermann
      Lossy Compression of Quality Scores in Differential Gene Expression: A First Assessment and Impact Analysis
      2018 Data Compression Conference (DCC), pp. 167-176, March 2018
    • Ana A. Hernandez-Lopez, Jan Voges, Claudio Alberti, Marco Mattavelli, Jörn Ostermann
      Differential Gene Expression with Lossy Compression of Quality Scores in RNA-Seq Data
      2017 Data Compression Conference (DCC), p. 444, April 2017
    • Claudio Alberti, Noah Daniels, Mikel Hernaez, Jan Voges, Rachel L. Goldfeder, Ana A. Hernandez-Lopez, Marco Mattavelli, Bonnie Berger
      An Evaluation Framework for Lossy Compression of Genome Sequencing Quality Values
      2016 Data Compression Conference (DCC), pp. 221-230, March 2016
    • Jan Voges, Marco Munderloh, Jörn Ostermann
      Predictive Coding of Aligned Next-Generation Sequencing Data
      2016 Data Compression Conference (DCC), pp. 241-250, March 2016
  • Journals
    • Yeremia Gunawan Adhisantoso, Jan Voges, Christian Rohlfing, Viktor Tunev, Jens-Rainer Ohm, Jörn Ostermann
      GVC: efficient random access compression for gene sequence variations
      BMC Bioinformatics, Vol. 24, No. 1, p. 121, March 2023
    • Ilona Rosenboom, Tobias Scheithauer, Fabian C. Friedrich, Sophia Pörtner, Lisa Hollstein, Marie‑Madlen Pust, Konstantinos Sifakis, Tom Wehrbein, Bodo Rosenhahn, Lutz Wiehlmann, Patrick Chhatwal, Burkhard Tümmler, Colin F Davenport
      Wochenende - modular and flexible alignment-based shotgun metagenome analysis
      BMC Genomics, Springer Nature, November 2022
    • Jan Voges, Mikel Hernaez, Marco Mattavelli, Jörn Ostermann
      An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data
      Proceedings of the IEEE, Vol. 109, No. 9, pp. 1607-1622, September 2021
    • Jan Voges, Tom Paridaens, Fabian Müntefering, Liudmila S. Mainzer, Brian Bliss, Mingyu Yang, Idoia Ochoa, Jan Fostier, Jörn Ostermann, Mikel Hernaez
      GABAC: an arithmetic coding solution for genomic data
      Bioinformatics, Vol. 36, No. 7, pp. 2275-2277, April 2020
    • Jan Voges, Ali Fotouhi, Jörn Ostermann, M. Oguzhan Külekci
      A Two-level Scheme for Quality Score Compression
      Journal of Computational Biology, Vol. 25, No. 10, pp. 1141-1151, October 2018
    • Jan Voges, Jörn Ostermann, Mikel Hernaez
      CALQ: compression of quality values of aligned sequencing data
      Bioinformatics, Vol. 34, No. 10, pp. 1650-1658, May 2018
    • Ibrahim Numanagic, James K. Bonfield, Faraz Hach, Jan Voges, Jörn Ostermann, Claudio Alberti, Marco Mattavelli, S. Cenk Sahinalp
      Comparison of high-throughput sequencing data compression tools
      Nature Methods, Vol. 13, No. 12, pp. 1005-1008, December 2016
  • Books
    • Jan Voges
      Compression of DNA Sequencing Data
      VDI Verlag, 2022