
Expressive Database for Image-based Facial Animation Systems

TNT members involved in this project:
Stella Graßhof, M.Sc.

The expressive database includes recordings of 145 sentences (about 20 minutes), each with three expression states: neutral, smile while speaking, and smile after speaking. Along with the original spoken text, we offer three types of data for each sentence:

  1. Video in yuv format, 576 x 720 at 50 frames per second
  2. Audio in wav format
  3. Phonetic information in txt format
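
Because yuv files are raw and carry no header, a reader has to supply the frame geometry itself. The following is a minimal sketch of how the frames of such a file might be stepped through. It rests on assumptions not stated on this page: that the files are planar 8-bit YUV, that the chroma layout is 4:2:0 (the `chroma` parameter lets you try 4:2:2 or 4:4:4 instead), and that 576 is the width and 720 the height. The function names are hypothetical; please check the readme file for the actual layout.

```python
from typing import BinaryIO, Iterator

# Bytes per pixel, expressed in halves, for common planar 8-bit YUV layouts.
# ASSUMPTION: the page does not say which layout the .yuv files use.
HALF_BYTES_PER_PIXEL = {"420": 3, "422": 4, "444": 6}


def frame_size_bytes(width: int, height: int, chroma: str = "420") -> int:
    """Return the size in bytes of one raw frame for the given chroma layout."""
    return width * height * HALF_BYTES_PER_PIXEL[chroma] // 2


def iter_luma_planes(stream: BinaryIO, width: int, height: int,
                     chroma: str = "420") -> Iterator[bytes]:
    """Yield the luma (Y) plane of every complete frame in a planar YUV stream."""
    size = frame_size_bytes(width, height, chroma)
    while True:
        frame = stream.read(size)
        if len(frame) < size:  # end of file, or a truncated trailing frame
            return
        yield frame[: width * height]  # in a planar layout the Y plane comes first


# Hypothetical usage (width/height order and chroma layout are guesses):
#
#     with open("s001_1.yuv", "rb") as f:
#         n = sum(1 for _ in iter_luma_planes(f, 576, 720))
#     print(f"{n} frames, about {n / 50:.2f} s at 50 fps")
```

If the file size is not a whole multiple of `frame_size_bytes(...)`, one of the assumed parameters is probably wrong.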

As an example, the data for the sentence "Upton saw a disaster coming." is listed below. For quick preview, each recording is also provided as an avi file with a reduced image size of 288 x 360 at 50 fps.

expression mode         audio wav file   video yuv file   mpeg avi file
neutral                 s001_1.wav       s001_1.yuv       s001_1.avi
smile while speaking    s001_2.wav       s001_2.yuv       s001_2.avi
smile after speaking    s001_3.wav       s001_3.yuv       s001_3.avi
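
The file names in this example suggest a pattern of sentence number plus expression index. The sketch below generates names under that pattern. It is only an inference from the single example shown here: it assumes that sentences are numbered 001 to 145 with three-digit zero padding and that the suffixes 1, 2, 3 always map to the three expression modes in this order. The helper names are hypothetical, and the name of the phonetic txt file is not shown on this page, so it is left out.

```python
# Expression index as it appears in the example file names (assumed to be
# the same for every sentence).
EXPRESSIONS = {
    1: "neutral",
    2: "smile while speaking",
    3: "smile after speaking",
}

NUM_SENTENCES = 145


def recording_files(sentence: int, expression: int) -> dict[str, str]:
    """Return the assumed wav/yuv/avi file names for one recording."""
    if not 1 <= sentence <= NUM_SENTENCES:
        raise ValueError(f"sentence must be in 1..{NUM_SENTENCES}")
    if expression not in EXPRESSIONS:
        raise ValueError("expression must be 1, 2 or 3")
    stem = f"s{sentence:03d}_{expression}"
    return {ext: f"{stem}.{ext}" for ext in ("wav", "yuv", "avi")}


def all_stems() -> list[str]:
    """List the assumed name stems of all recordings in the database."""
    return [
        f"s{s:03d}_{e}"
        for s in range(1, NUM_SENTENCES + 1)
        for e in EXPRESSIONS
    ]
```

Under these assumptions, `recording_files(1, 2)` yields the names in the second row of the table, and `all_stems()` has 145 x 3 = 435 entries, one per recording.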

A detailed description of the database can be found in the readme file and the text corpus, which we also offer as separate downloads here:

  • Detailed description of the full database (included in full database download)
  • Original text corpus of the 145 sentences (included in full database download)
  • The full database includes the above description and corpus file. Please be aware that the provided zip file is about 30 GB, while the unpacked database is about 62 GB in total.
  • Additionally, we offer the avi files of all 435 recordings (145 sentences, each in three expression states) as a separate zip file, which is only about 4 GB.

Terms of Use

By downloading, you agree to the following terms of use:
  1. We provide the data as is, without any guarantee.
  2. The data may only be used for non-commercial purposes. (If you intend commercial use, contact the person in charge named at the top of this page.)
  3. If the provided data (or parts of it) contributes to any publicly or partly publicly available publication of any kind, the paper mentioned below must be cited as the source. Please be aware that this explicitly includes academic use of any kind!
The article to be cited:
"Realistic Facial Expression Synthesis for an Image-based Talking Head" by Kang Liu and Joern Ostermann, IEEE Conference on Multimedia and Expo, ICME 2011, p. 6, Barcelona, Spain, July 2011

The provided data was recorded in 2011 by:
Kang Liu
Institut für Informationsverarbeitung
Leibniz Universität Hannover
Appelstr. 9A, 30167 Hannover
Germany