Audio and Speech Signal Processing

Prof. Dr.-Ing. Waldo Nogueira

Organisation
Contents
Goal of the Lecture
Prerequisites
Literature
Supplementary Lectures
Software

Organisation

Exercises and labs supervised by: Prof. Dr.-Ing. Waldo Nogueira

Course Numbers: 36460 (Lecture), 36462 (Exercises/Labs)
Dates (Winter Semester 2016/2017, preliminary, 4 CP):
Lecture dates (preliminary):
(Room 1307, Building 3408: Mehrzweckgebäude) Fri. 21.10., Fri. 28.10., Fri. 11.11., Fri. 25.11., Fri. 02.12., Fri. 09.12., Fri. 16.12., Fri. 23.12., Fri. 20.01. 09:00-11:00
Exercise dates (preliminary):
(Room 1307, Building 3408: Mehrzweckgebäude) Fri. 28.10., Fri. 11.11., Fri. 25.11., Fri. 02.12., Fri. 09.12., Fri. 16.12., Fri. 23.12., Fri. 20.01. 11:00-12:00
(Room 1122, Building 3408: Mehrzweckgebäude) Fri. 04.11., Fri. 18.11., Fri. 13.01., Fri. 27.01. 09:00-11:00

Preliminary schedule and room overview
Exam:

Written
Duration: 60 minutes

Contents

Introduction
Fundamentals of speech acoustics: mechanisms of speech production, sound classification, sound representation
Fundamentals of perception: pitch, intensity and timbre
Spectral analysis of audio and speech signals
Speech Models: Physical models of speech
Fundamentals of speech perception
Spectral transforms of audio and speech signals
Models based on speech production: linear predictive analysis (LPC)
Application 1: Text-to-Speech Synthesis
Application 2: Automatic Speech Recognition: Cepstrum analysis, Hidden Markov Models

Goals of the Lecture

In this lecture, students will develop a methodology to analyze, code, recognize, and synthesize audio signals using signal processing techniques. More concretely, students should acquire theoretical and practical competences related to:

Fundamentals of the acoustics, physiology, and perception of sound
Fundamentals of digital signal processing of audio signals
Methods for modeling and processing audio and speech signals

Prerequisites

Necessary Prior Knowledge:
Fundamentals of Digital Signal Processing, Probability and Information Theory.

Recommended (but not required) Prior Knowledge:
Knowledge of the lectures "Digitale Signalverarbeitung", "Statistische Methoden der Nachrichtentechnik", "Informationstheorie", and "Quellencodierung"; fundamentals of Matlab.

Basic Literature

Quatieri, T. F. 2001. Discrete-Time Speech Signal Processing: Principles and Practice. Prentice Hall
Rabiner, L. R. and R. W. Schafer. 2007. Introduction to Digital Speech Processing. Foundations and Trends in Signal Processing, Vol. 1, Nos. 1-2

Additional Literature

Rabiner, L. R. and R. W. Schafer. 1978. Digital Processing of Speech Signals. Prentice Hall
O'Shaughnessy, D. 1999. Speech Communications: Human and Machine. John Wiley & Sons
Rabiner, L. R. and B. H. Juang. 1993. Fundamentals of Speech Recognition. Prentice Hall
Park, Sung-won. Linear Predictive Speech Processing
Spanias, Andreas. 1994. "Speech Coding: A Tutorial Review". Proceedings of the IEEE
Pan, Davis. 1995. "A Tutorial on MPEG/Audio Compression". IEEE Multimedia Journal
Rabiner, Lawrence. 1989. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition". Proceedings of the IEEE

Supplementary Lectures

Software

Matlab, Praat, Audacity, Sonic Visualiser

Last changed: 2016-10-18