Audio and Speech Signal Processing
Prof. Dr.-Ing. Waldo Nogueira
Organisation
Exercises and labs supervised by:
Prof. Dr.-Ing. Waldo Nogueira
Course Numbers: 36460 (Lecture), 36462 (Exercises/Labs)
Dates (Winter Semester 2016/2017, preliminary, 4 CP):
Lecture dates (preliminary):
(Room 1307, Building 3408: Mehrzweckgebäude) Fri. 21.10., Fri. 28.10., Fri. 11.11., Fri. 25.11., Fri. 02.12., Fri. 09.12., Fri. 16.12., Fri. 23.12., Fri. 20.01., 09:00-11:00
Exercise dates (preliminary):
(Room 1307, Building 3408: Mehrzweckgebäude) Fri. 28.10., Fri. 11.11., Fri. 25.11., Fri. 02.12., Fri. 09.12., Fri. 16.12., Fri. 23.12., Fri. 20.01., 11:00-12:00
(Room 1122, Building 3408: Mehrzweckgebäude) Fri. 04.11., Fri. 18.11., Fri. 13.01., Fri. 27.01., 09:00-11:00
Preliminary course and room overview
Exam:
- Written
- Duration: 60 minutes
Contents
- Introduction
- Fundamentals of speech acoustics: mechanisms of speech production, sound classification, sound representation
- Fundamentals of perception: pitch, intensity and timbre
- Spectral analysis of audio and speech signals
- Speech Models: Physical models of speech
- Fundamentals of speech perception
- Spectral transforms of audio and speech signals
- Models based on speech production: Predictive analysis based on speech production (LPC)
- Application 1: Text-to-Speech Synthesis
- Application 2: Automatic Speech Recognition: Cepstrum analysis, Hidden Markov Models
Goals of the Lecture
In this lecture, students will develop a methodology to analyze, code, recognize, and synthesize audio signals
using signal processing techniques. More concretely, students should acquire the theoretical and practical competences
related to:
- Fundamentals of the acoustics, physiology, and perception of sound
- Fundamentals of digital signal processing of audio signals
- Methods for modeling and processing audio and speech signals
Prerequisites
Necessary Prior Knowledge:
Fundamentals of Digital Signal Processing, Probability and Information Theory.
Recommended (but not required) Prior Knowledge:
Knowledge of the lectures "Digitale Signalverarbeitung", "Statistische Methoden der Nachrichtentechnik", "Informationstheorie", and "Quellencodierung"; fundamentals of Matlab.
Basic Literature
- Quatieri, T. F. 2001. Discrete-Time Speech Signal Processing: Principles and Practice. Prentice Hall
- Rabiner, L. R. and R. W. Schafer. 2007. Introduction to Digital Speech Processing. Foundations and Trends in Signal Processing, Vol. 1, Nos. 1-2
Additional Literature
- Rabiner, L. R. and R. W. Schafer. 1978. Digital Processing of Speech Signals. Prentice Hall
- O'Shaughnessy, D. 1999. Speech Communications: Human and Machine. John Wiley & Sons
- Rabiner, L. R. and B. H. Juang. 1993. Fundamentals of Speech Recognition. Prentice Hall
- Park, Sung-won. Linear Predictive Speech Processing
- Spanias, Andreas. 1994. "Speech Coding: A Tutorial Review". Proceedings of the IEEE
- Pan, Davis. 1995. "A Tutorial on MPEG/Audio Compression". IEEE MultiMedia
- Rabiner, Lawrence. 1989. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition". Proceedings of the IEEE
Supplementary Lectures
Software
- Matlab, Praat, Audacity, Sonic Visualiser