What is 3D human sensing

Bachelor thesis: Real-time Multi-person 3D Human Pose Estimation with a Single Camera

Real-time Multi-person 3D Human Pose Estimation with a Single Camera in Surveillance Videos

The Fraunhofer Institute for Optronics, System Technology and Image Evaluation IOSB is one of the largest institutes for applied research in the field of image acquisition and evaluation in Europe. The Video Evaluation Systems (VID) department deals with the automatic evaluation of signals from moving imaging sensors in complex, possibly non-cooperative scenarios. Such sensors are used, for example, in reconnaissance and surveillance as integrated components of airborne, space-based, or mobile land-based platforms. For this purpose, VID develops and integrates image evaluation algorithms for autonomous or interactive systems.


 Source: [1]



Today, video footage from surveillance cameras is an important tool for investigating crimes and identifying suspects. Analyzing the huge amounts of data recorded by numerous cameras poses enormous challenges for police investigators. Systems are therefore needed that help staff recognize attacks in real time. A first step toward detecting violent activity is estimating the 3D poses of people within crowds.



As part of the bachelor thesis, the literature on real-time multi-person 3D pose estimation is to be reviewed. Building on the results of this review and on the sources listed below [1, 2], a method is then to be extended so that it performs multi-person 3D pose estimation on full HD video data in real time.



  • Field of study: computer science, mathematics, applied physics, or comparable
  • Good understanding of the theoretical foundations of deep learning
  • Good programming skills (ideally Python)
  • Experience with the deep learning framework PyTorch is an advantage
  • Ability to work independently
  • Willingness to familiarize yourself with new subject areas and enthusiasm for contributing your own ideas

If you are interested, please send your application documents (short cover letter, tabular CV, transcript of records) in electronic form to Mickael Cormier.



[1] Mehta, Dushyant, et al. "XNect: Real-time multi-person 3D human pose estimation with a single RGB camera." arXiv preprint arXiv:1907.00837 (2019). https://arxiv.org/pdf/1907.00837.pdf

[2] Zhao, Long, et al. "Semantic graph convolutional networks for 3D human pose regression." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019. https://arxiv.org/pdf/1904.03345.pdf