Timo v. Marcard, Gerard Pons-Moll, and Bodo Rosenhahn
Video-based human motion capture has been an active research area for decades. The articulated structure of the human body, occlusions, partial observations, and image ambiguities make it very hard to accurately track the many degrees of freedom of the human pose. Recent approaches have shown that adding sparse orientation cues from Inertial Measurement Units (IMUs) helps to disambiguate and improve full-body human motion capture. As a complementary data source, inertial sensors allow for accurate estimation of limb orientations even under fast motions.
In the research landscape of marker-less motion capture, publicly available benchmarks for video-based trackers (e.g. HumanEva, Human3.6M) generally lack inertial data. One exception is the MPI08 dataset, which provides inertial data of 5 IMUs along with video data.
This new dataset, called TNT15, consists of synchronized data streams from 8 RGB-cameras and 10 IMUs. In contrast to MPI08, it was recorded in a normal office environment, and the larger number of 10 IMUs can be used for new tracking approaches or for improved evaluation.
Included data: 3D scans | 8 RGB-cameras | 10 IMUs
In order to download the dataset and obtain more information please visit our project page at TNT15 dataset.
Gerard Pons-Moll, Andreas Baak, Thomas Helten, Meinard Müller, Hans-Peter Seidel, and Bodo Rosenhahn
In this work, we present an approach to fuse video with orientation data obtained from inertial sensors to improve and stabilize full-body human motion capture. Even though video data is a strong cue for motion analysis, tracking artifacts occur frequently due to ambiguities in the images, rapid motions, occlusions or noise. As a complementary data source, inertial sensors allow for drift-free estimation of limb orientations even under fast motions. However, accurate position information cannot be obtained in continuous operation. Therefore, we propose a hybrid tracker that combines video with a small number of inertial units to compensate for the drawbacks of each sensor type: on the one hand, we obtain drift-free and accurate position information from video data and, on the other hand, we obtain accurate limb orientations and good performance under fast motions from inertial sensors. In several experiments we demonstrate the increased performance and stability of our human motion tracker.
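The complementary nature of the two sensor types can be illustrated with a toy 2D example: video provides a (noisy) position observation of a limb endpoint, while an IMU provides the limb's orientation. A minimal sketch, combining both cues in a weighted least-squares energy over the limb angle (the energy terms, weights, and grid-search solver here are illustrative assumptions, not the paper's actual formulation):

```python
import numpy as np

def fuse_limb_angle(parent, endpoint_obs, imu_angle, L,
                    w_video=1.0, w_imu=1.0):
    """Estimate the angle of a limb of length L hinged at `parent`
    by minimizing a weighted sum of a video position residual and an
    IMU orientation residual. Toy illustration only."""
    angles = np.linspace(-np.pi, np.pi, 3601)  # dense grid over angles
    # candidate endpoint positions for each angle
    tips = parent + L * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    e_video = np.sum((tips - endpoint_obs) ** 2, axis=1)
    # angular residual wrapped into [-pi, pi]
    d = np.angle(np.exp(1j * (angles - imu_angle)))
    e_imu = d ** 2
    return angles[np.argmin(w_video * e_video + w_imu * e_imu)]
```

When the video observation is occluded or unreliable, `w_video` can be lowered and the IMU term keeps the limb orientation stable; conversely, the video term anchors the absolute position that inertial sensing alone cannot provide.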
Included data: 3D scans | Multiview sequences | 5 sensors
In order to download the dataset and obtain more information please visit our project page at MPI08 dataset.
Thomas Brox, Bodo Rosenhahn, Juergen Gall, and Daniel Cremers
N. Hasler, B. Rosenhahn, T. Thormählen, M. Wand, J. Gall, and H.-P. Seidel
In response to several requests and the interest of other research groups in the image sequences we use for model-driven tracking, we have made them freely available for your own tests and experiments. Each sequence (usually) contains the images, projection matrices, a 3D model, and a pose initialization (as a 4x4 matrix and, optionally, a list of joint angles).
In order to download the dataset and obtain more information please visit our project page at TPAMI09 benchmark.
MOOF (MOvements Of the Feet) is a video dataset
designed to support the evaluation of 3D foot motion reconstruction from monocular video.
Existing benchmarks for human motion capture either lack diversity in foot movements or are
limited to a single controlled indoor environment, making it difficult to assess foot pose
reconstruction in isolation. MOOF addresses this gap by focusing specifically on complex
foot articulations.
The dataset comprises 41 videos of 15 subjects (9 female, 6 male),
captured at 30 fps. Video durations range from 4 to 37 seconds, totaling 14,589 frames.
Recordings include subjects performing movements with complex foot motions — such as ankle
circles, ankle stretches, and heel–toe walking — and are augmented with in-the-wild dance
and ballet videos collected online.
Each video is annotated with 2D ground truth keypoints for three foot landmarks per foot —
big toe, small toe, and heel — obtained via a semi-automatic annotation pipeline.
Evaluation on MOOF uses foot-specific metrics: PCKF (Percentage of Correct foot Keypoints)
and N-FKE2d (Normalized 2D Foot Keypoint Error).
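Both metrics follow standard keypoint-evaluation conventions. A minimal sketch of how such metrics are typically computed (the normalization scale and the PCKF threshold are assumptions for illustration, not the dataset's official definitions):

```python
import numpy as np

def pckf(pred, gt, scale, thresh=0.1):
    """Percentage of Correct foot Keypoints: fraction of predicted 2D
    foot keypoints whose pixel error is below thresh * scale.
    pred, gt: arrays of shape (N, K, 2); scale: normalization length
    (assumed here, e.g. a per-frame foot size)."""
    err = np.linalg.norm(pred - gt, axis=-1)  # (N, K) pixel errors
    return float(np.mean(err < thresh * scale))

def n_fke2d(pred, gt, scale):
    """Normalized 2D Foot Keypoint Error: mean pixel error divided by
    the normalization scale."""
    err = np.linalg.norm(pred - gt, axis=-1)
    return float(np.mean(err / scale))
```

With six annotated landmarks per frame (big toe, small toe, and heel for each foot), `K = 6` in the arrays above.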
MOOF is available for non-commercial research purposes only. To request access,
please download and complete the Data Use Agreement (DUA) below, then send the signed document to
moof-dataset@tnt.uni-hannover.de.
You will receive a download link once your request has been reviewed.
If you use the MOOF dataset in your research, please cite the following paper.
For more details on the method, visit the
FootMR project page.
Citation
@InProceedings{wehrbein26footmr,
author = {Wehrbein, Tom and Rosenhahn, Bodo},
title = {Improving 3D Foot Motion Reconstruction in Markerless Monocular Human Motion Capture},
booktitle = {International Conference on 3D Vision (3DV)},
year = {2026},
}