Expressive Database for Image-based Facial Animation Systems

The expressive database contains recordings of 145 sentences (about 20 minutes), each recorded in three expression states: neutral, smile while speaking, and smile after speaking. In addition to the original spoken text, we provide three types of data for each sentence:

Video in YUV format, 576 x 720 pixels at 50 frames per second
Audio in WAV format
Phonetic information in TXT format
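
The raw video files could be read frame by frame along the following lines. This is only a sketch under stated assumptions: the page does not say which of "576 x 720" is the width, nor which chroma subsampling or plane layout the YUV files use, so the 720 x 576 orientation, the 8-bit planar layout, and the default 4:2:0 subsampling below are guesses that should be checked against the readme file. The function names are hypothetical.

```python
import io
from typing import Iterator

# Assumed values: the page gives "576 x 720" without naming width and height,
# and does not state the chroma subsampling. Verify against the readme file.
WIDTH = 720
HEIGHT = 576
FPS = 50  # stated on the page


def frame_size_bytes(width: int, height: int, subsampling: str = "420") -> int:
    """Bytes per frame for 8-bit planar YUV with the given chroma subsampling."""
    luma = width * height
    if subsampling == "420":
        chroma = 2 * (width // 2) * (height // 2)
    elif subsampling == "422":
        chroma = 2 * (width // 2) * height
    elif subsampling == "444":
        chroma = 2 * luma
    else:
        raise ValueError(f"unsupported subsampling: {subsampling}")
    return luma + chroma


def iter_luma_planes(stream: io.BufferedIOBase, width: int, height: int,
                     subsampling: str = "420") -> Iterator[bytes]:
    """Yield the Y (luma) plane of each complete frame in a raw planar YUV stream."""
    size = frame_size_bytes(width, height, subsampling)
    luma = width * height
    while True:
        frame = stream.read(size)
        if len(frame) < size:
            break  # end of file, or a truncated trailing frame
        yield frame[:luma]
```

For example, `open("s001_1.yuv", "rb")` could be passed as `stream`. If the file size is not a whole multiple of `frame_size_bytes(WIDTH, HEIGHT)`, the assumed format is probably wrong.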

Examples

As an example, the data for the sentence "Upton saw a disaster coming." is listed below. For quick preview, the AVI files have a reduced image size of 288 x 360 at 50 fps.


expression mode	audio wav file	video yuv file	mpeg avi file
neutral	s001_1.wav	s001_1.yuv	s001_1.avi
smile while speaking	s001_2.wav	s001_2.yuv	s001_2.avi
smile after speaking	s001_3.wav	s001_3.yuv	s001_3.avi
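
The file names in the table suggest a pattern of sentence number plus expression index. A small sketch of that pattern follows. It assumes that the three-digit, zero-padded numbering shown for s001 continues for all 145 sentences, which the page does not confirm. The helper name `recording_files` is hypothetical.

```python
# Expression index as it appears in the example file names (s001_1, s001_2, s001_3).
EXPRESSION_MODES = {
    1: "neutral",
    2: "smile while speaking",
    3: "smile after speaking",
}


def recording_files(sentence: int, mode: int) -> dict:
    """Return the assumed wav/yuv/avi file names for one sentence and expression mode."""
    if mode not in EXPRESSION_MODES:
        raise ValueError(f"unknown expression mode: {mode}")
    # Assumption: zero-padded three-digit sentence number, as in "s001".
    stem = f"s{sentence:03d}_{mode}"
    return {ext: f"{stem}.{ext}" for ext in ("wav", "yuv", "avi")}


# All recordings: 145 sentences, each in three expression modes.
all_recordings = [
    recording_files(sentence, mode)
    for sentence in range(1, 146)
    for mode in EXPRESSION_MODES
]
```

Here `recording_files(1, 2)` gives the names in the "smile while speaking" row of the table, and `all_recordings` has 145 x 3 = 435 entries.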

Download

A detailed description of the database can be found in the readme file and the text corpus, both of which are also offered as separate downloads:

Detailed description of the full database (included in the full database download)
Original text corpus of the 145 sentences (included in the full database download)
Full database, including the description and corpus file above. Please note that the zip file is about 30 GB, while the unpacked database is about 62 GB.
AVI files of all 435 recordings (145 sentences x 3 expression states) as a separate zip file, which has a size of only about 4 GB

Terms of Use

By downloading, you agree to the following terms of use:

We provide the data as is, without any guarantee.
The data may be used for non-commercial purposes only. (If you intend commercial use, contact the person in charge listed at the top of this website.)
If the provided data (or parts of it) contribute to any publication of any kind that is publicly or partly publicly available, the paper listed below must be cited as the source. Please note that this explicitly includes academic use of any kind.

The article to be cited:
"Realistic Facial Expression Synthesis for an Image-based Talking Head" by Kang Liu and Joern Ostermann, IEEE Conference on Multimedia and Expo, ICME 2011, p. 6, Barcelona, Spain, July 2011

Additional Information

The provided data was recorded in 2011 by:
Kang Liu
Institut für Informationsverarbeitung
Leibniz Universität Hannover
Appelstr. 9A, 30167 Hannover
Germany