Human articulated poses are estimated from a monocular video by combining appearance-based and model-based approaches. A silhouette is extracted from each frame, and its similarity to a motion model (a set of joint angles) is measured; the motion model is obtained from CMU motion capture data and projected onto the 2D image plane. The optimal sequence of poses is then determined from the similarity scores and the smoothness of the joint angles.


The system includes the following components.
1. Silhouette extraction
2. 2D projection of motion capture data
3. Similarity measurement between silhouette and motion data
4. Dynamic programming matching to estimate a sequence of articulated poses
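
The dynamic programming step can be illustrated with a minimal sketch. This is not the system's actual implementation: the cost function, the squared-difference smoothness term, the `weight` parameter, and all function names are assumptions made for illustration. The sketch takes a per-frame dissimilarity score for every candidate pose (lower means the projected pose matches the silhouette better) and selects the pose sequence that minimizes the total dissimilarity plus a penalty on joint-angle changes between consecutive frames.

```python
from typing import List, Sequence, Tuple


def smoothness_penalty(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared joint-angle differences between two poses (assumed form)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_pose_sequence(
    dissimilarity: Sequence[Sequence[float]],
    poses: Sequence[Sequence[float]],
    weight: float = 1.0,
) -> Tuple[List[int], float]:
    """Viterbi-style dynamic programming over candidate poses.

    dissimilarity[t][k]: mismatch between the silhouette of frame t and
                         candidate pose k (hypothetical input, lower is better).
    poses[k]:            joint angles of candidate pose k.
    weight:              trade-off between silhouette fit and smoothness.
    Returns the chosen pose index for each frame and the total cost.
    """
    n_frames, n_poses = len(dissimilarity), len(poses)
    if n_frames == 0 or n_poses == 0:
        return [], 0.0

    total = list(dissimilarity[0])      # best cost of any path ending in pose k
    back: List[List[int]] = []          # back[t-1][k] = best predecessor of k at frame t

    for t in range(1, n_frames):
        new_total, pointers = [], []
        for k in range(n_poses):
            # Cheapest way to arrive at pose k from any pose j of the previous frame.
            best_j = min(
                range(n_poses),
                key=lambda j: total[j] + weight * smoothness_penalty(poses[j], poses[k]),
            )
            arrival = total[best_j] + weight * smoothness_penalty(poses[best_j], poses[k])
            new_total.append(arrival + dissimilarity[t][k])
            pointers.append(best_j)
        total = new_total
        back.append(pointers)

    # Trace the optimal path backwards from the cheapest final pose.
    k = min(range(n_poses), key=lambda i: total[i])
    best_cost = total[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path, best_cost
```

With `weight` set to 0 the sketch reduces to picking the best-matching pose independently in each frame; larger values favor smoother joint-angle trajectories at the expense of silhouette fit.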


Human Gait Tracking
A movie demonstrating the estimated articulated poses for a straight-line walking sequence.
(Near limbs are shown in green and far limbs in red.)
Download: avi [1.1 MB]

