The Institut für Informationsverarbeitung (TNT) has extensive expertise in audio signal processing and develops algorithms for a range of application areas. These include methods for compressing audio signals and techniques for automatically classifying acoustic content. Further research addresses algorithms for localizing acoustic events and techniques for estimating impulse responses in solid bodies. We are also working on automating the analysis of early childhood language acquisition, using spontaneous speech samples and automatic speech recognition to identify potential language support needs. This overview presents selected current research projects at our institute.
In live applications such as concerts or internet broadcasts, the delay introduced by audio encoding is a problem. The technique developed at TNT reduces the data rate with almost no added delay while maintaining high audio quality.
The method developed at TNT detects rotor blade damage with significantly fewer sensors than comparable methods. Audio classification techniques are used to detect damage in airborne sound signals, even in the presence of background noise.
For cochlear implant users, understanding speech in noisy environments is challenging. The excitation pattern compression techniques developed at TNT enable binaural processing strategies, which can improve speech intelligibility.
Successful language acquisition forms the basis for social participation, academic success, and long-term career prospects. Given growing societal challenges, in particular the shortage of professionals in education and therapy and the lack of sufficiently standardized diagnostic procedures, AI-based automation offers considerable potential for identifying language support needs. A central component of our approach is the automatic transcription of children's spontaneous speech samples, which serves as the foundation for further analysis of both oral language skills (e.g., lexicon, morpho-syntactic structures, articulatory features) and written language skills (e.g., reading speed, accuracy, and comprehension). Our system architecture employs state-of-the-art models for speaker diarization, automatic speech recognition, and linguistic analysis. We adapt these models to the domain and continuously develop them further to ensure high robustness, accuracy, and interpretability in the diagnostic context.
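To illustrate the kind of analysis that can follow transcription, here is a minimal Python sketch that derives a few simple descriptive measures from diarized, transcribed utterances. It is not the TNT system: the `Utterance` structure, the speaker labels, the helper names, and the choice of measures (mean length of utterance in words, type-token ratio, words per minute) are assumptions made for illustration, and the sample data is invented.

```python
import re
from dataclasses import dataclass


@dataclass
class Utterance:
    """One diarized, transcribed segment (hypothetical structure)."""
    speaker: str   # label assumed to come from speaker diarization
    start: float   # segment start in seconds
    end: float     # segment end in seconds
    text: str      # transcript assumed to come from speech recognition


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters only)."""
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())


def child_measures(utterances, child_label="child"):
    """Compute simple descriptive measures over the child's utterances."""
    child = [u for u in utterances if u.speaker == child_label]
    tokens_per_utt = [tokenize(u.text) for u in child]
    tokens_per_utt = [toks for toks in tokens_per_utt if toks]  # drop empty ones
    all_tokens = [tok for toks in tokens_per_utt for tok in toks]
    if not all_tokens:
        return {"utterances": 0, "mlu_words": 0.0,
                "type_token_ratio": 0.0, "words_per_minute": 0.0}
    speaking_time = sum(u.end - u.start for u in child)  # seconds
    return {
        "utterances": len(tokens_per_utt),
        # mean length of utterance, counted in words rather than morphemes
        "mlu_words": len(all_tokens) / len(tokens_per_utt),
        # lexical diversity: distinct word forms divided by total words
        "type_token_ratio": len(set(all_tokens)) / len(all_tokens),
        "words_per_minute": (60.0 * len(all_tokens) / speaking_time
                             if speaking_time > 0 else 0.0),
    }


# Invented sample data, for illustration only.
sample = [
    Utterance("adult", 0.0, 2.0, "What did you see?"),
    Utterance("child", 2.5, 4.5, "I see a dog"),
    Utterance("child", 5.0, 7.0, "The dog is big"),
]
result = child_measures(sample)
print(result)
```

Measures like these are only a starting point; a diagnostic setting would additionally need morpho-syntactic and articulatory analysis, as well as handling of recognition errors in the transcript.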
Irrelevance Reducing Encoding, Classification Methods, Artificial Neural Networks, Lossless Encoding Methods, Audio Feature Design, Adaptive Vector Quantization, Context-Adaptive Binary Arithmetic Coding, Time Difference of Arrival Localization, Head-Related Transfer Function