Video Inertial Multiple People Tracking

TNT members involved in this project:
Prof. Dr.-Ing. Bodo Rosenhahn


Video-based multiple people tracking has been a very active research area for decades. Yet (partial) occlusions, motion model assumptions, and image ambiguities make it very hard to accurately track all persons.

For example, it is often assumed that humans move at low velocities and with nearly zero acceleration while being filmed. However, these assumptions are often violated, e.g. in sports. Several effects also make it difficult to interpret image information correctly. Varying lighting conditions, a camera's changing view of a person over time, or people changing their clothes violate the assumption of appearance constancy. Similarly dressed people, e.g. people in workwear, also make the appearance information misleading or ambiguous.


To tackle these issues, we propose a new problem called Video Inertial Multiple People Tracking (VIMPT):

A scene is filmed by a video camera, and each person to be tracked has an IMU (inertial measurement unit) attached to their back. The task is to simultaneously perform multiple people tracking and to assign each trajectory to the corresponding IMU device.
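
To make the assignment part of the task concrete, the following is a minimal illustrative sketch, not the method used in this project. It assumes strong simplifications that are not part of the task definition: trajectories are already given as 1D positions, the IMU accelerations are gravity-free, time-synchronized, and expressed in the same coordinate frame as the trajectories, and there are few enough people that all assignments can be enumerated. All function and variable names are hypothetical.

```python
from itertools import permutations


def second_difference(positions, dt):
    """Finite-difference acceleration estimate from a list of 1D positions."""
    return [
        (positions[i + 1] - 2 * positions[i] + positions[i - 1]) / dt ** 2
        for i in range(1, len(positions) - 1)
    ]


def mismatch(track_acc, imu_acc):
    """Mean squared difference between two acceleration series."""
    return sum((a - b) ** 2 for a, b in zip(track_acc, imu_acc)) / len(track_acc)


def assign_tracks_to_imus(tracks, imus, dt):
    """Return {track_id: imu_id} with the lowest total acceleration mismatch.

    Brute force over all permutations, so only suitable for a handful of people.
    """
    track_ids = sorted(tracks)
    imu_ids = sorted(imus)
    acc = {t: second_difference(tracks[t], dt) for t in track_ids}
    best, best_cost = None, float("inf")
    for perm in permutations(imu_ids, len(track_ids)):
        cost = sum(mismatch(acc[t], imus[i]) for t, i in zip(track_ids, perm))
        if cost < best_cost:
            best, best_cost = dict(zip(track_ids, perm)), cost
    return best


# Toy data (assumed): track "A" moves at constant velocity, track "B" accelerates.
tracks = {"A": [0.0, 1.0, 2.0, 3.0, 4.0], "B": [0.0, 1.0, 4.0, 9.0, 16.0]}
# Hypothetical IMU accelerations, one sample per interior trajectory frame.
imus = {"imu_1": [2.1, 1.9, 2.0], "imu_2": [0.0, 0.1, -0.1]}

assignment = assign_tracks_to_imus(tracks, imus, dt=1.0)
print(assignment)
```

In this toy setting the accelerating track is matched to the IMU that measured a sustained acceleration, independent of any appearance information. A real system would have to solve tracking and assignment jointly rather than assume finished trajectories as input.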


Conceptual benefits of the VIMPT task are:

  • VIMPT automatically labels each trajectory with a person-specific ID.
  • Motion assumptions can be weakened or dropped entirely, as local motion measurements are provided by the IMU devices.
  • Appearance assumptions can be weakened or dropped entirely, as the IMU signal allows a person to be identified based on their measurements.
  • Fusing video information with person-specific IMU signals provides more consistent trajectories, as each IMU delivers local motion information at all times, independent of the person's visibility or any change in appearance.



We provide the first VIMPT dataset named VIMPT2019, which contains video and IMU information, as well as ground-truth data.



  • Conference Contributions
    • Roberto Henschel, Timo von Marcard, Bodo Rosenhahn
      Simultaneous Identification and Tracking of Multiple People using Video and IMUs
      Computer Vision and Pattern Recognition Workshops (CVPRW), June 2019