Coding and processing of high-throughput sequencing data

TNT members involved in this project:
Dr.-Ing. Marco Munderloh
Prof. Dr.-Ing. Jörn Ostermann
Dipl.-Ing. Jan Voges

In recent years, technological advances in sequencing (i.e., the process of reading out genomic information) have made it faster and more cost-efficient to sequence individual genomes. Because high-throughput sequencing (HTS) machines generate enormous amounts of data, processing, storing, and analyzing that data poses new challenges for the scientific community. New processes and tools have to be developed to overcome current limitations, for example in storage space and processing speed.

Raw sequencing data generated by HTS machines passes through many analysis steps. Our goal is to develop novel algorithms that improve information processing "from the tissue to the hard drive".

Publications:
  • Jan Voges, Ali Fotouhi, Jörn Ostermann, M. Oguzhan Külekci
    A Two-Level Scheme for Quality Score Compression
    Accepted at 10th International Conference on Bioinformatics and Computational Biology (BICOB 2018), International Society for Computers and their Applications (ISCA), Las Vegas, NV (US), March 2018
  • Ana A Hernandez-Lopez, Jan Voges, Claudio Alberti, Marco Mattavelli, Jörn Ostermann
    Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis
    Accepted at 2018 Data Compression Conference (DCC), IEEE Computer Society Conference Publishing Services (CPS), Snowbird, UT (US), March 2018
  • Jan Voges, Jörn Ostermann, Mikel Hernaez
    CALQ: compression of quality values of aligned sequencing data
    Bioinformatics (Advance article btx737), Oxford University Press, November 2017, edited by Bonnie Berger