3D modeling of motion dynamics.
The human visual system is exquisitely sensitive to an enormous range of human movements. We can distinguish coarse differences between simple motions (left leg up vs. right hand down), actions (walking vs. running) and activities (playing vs. dancing). We can also distinguish far more subtle differences in movement style. For instance, we can identify friends by their walking styles, infer mood and intent from hand or arm gestures, or evaluate the grace and athleticism of a ballerina. These discriminative abilities depend on a visual system that can process shape and motion information in 3D. In contrast, most existing models in computer vision and neuroscience are based on the analysis of 2D image features and 2D image motion. While some 3D models exist, they typically represent 3D shape at levels that are either too fine (point clouds) or too coarse (skeletons) to be effective. Moreover, existing approaches do not explicitly model 3D dynamics, even though such modeling is critical for distinguishing movements and movement styles. As a consequence, there is currently no computer vision system with movement discrimination abilities comparable to those of humans.
In this project, we are developing bio-inspired algorithms for discriminating human movements and movement styles. Recent results from co-PI Connor's lab have revealed a rich representation of static 3D shape structure in the ventral pathway of visual cortex, carried by populations of neurons with diverse selectivities for 3D configurations of 3D structural fragments. Drawing on these biological findings, we propose an analogous representation of moving 3D shapes based on 4D (space+time) structure-in-motion (SiM) fragments. We are studying how the ventral pathway encodes SiM fragments using an evolutionary stimulus morphing strategy that proved successful in the recent studies of static 3D shape. We are using the SiM fragment models emerging from this approach to develop algorithms for automatically extracting candidate SiM fragments from videos acquired by a network of stereo cameras. To model the temporal evolution of these candidate SiM fragments and to learn a dictionary of human movements, we are developing hybrid system identification and clustering techniques. This dictionary will in turn influence the design of the stimuli used in the neural experiments. Finally, we are developing classification methods in the space of hybrid systems for recognizing movements and their styles. We plan to evaluate our methods using a real-time tele-immersion system for teaching Tai Chi.
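As a rough illustration of what identifying and classifying movements by their dynamics can mean, the sketch below fits a single linear model x[t+1] ≈ A x[t] to each example trajectory by least squares and then labels a new trajectory by whichever model predicts it best. It is not the project's method: a hybrid system would switch among several such linear models over time, and the 2D rotating point, the function names, and the "slow"/"fast" labels are all assumptions made for the example.

```python
import math

def fit_linear_dynamics(traj):
    """Least-squares fit of the 2x2 matrix A in x[t+1] ~= A x[t]."""
    # Accumulate C = sum x[t+1] x[t]^T and G = sum x[t] x[t]^T.
    c = [[0.0, 0.0], [0.0, 0.0]]
    g = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(traj[:-1], traj[1:]):
        for i in range(2):
            for j in range(2):
                c[i][j] += y[i] * x[j]
                g[i][j] += x[i] * x[j]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if abs(det) < 1e-12:
        raise ValueError("trajectory does not excite both dimensions")
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # Normal-equation solution: A = C G^{-1}.
    return [[sum(c[i][k] * g_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def prediction_error(a, traj):
    """Mean squared one-step prediction error of model A on a trajectory."""
    err = 0.0
    for x, y in zip(traj[:-1], traj[1:]):
        for i in range(2):
            pred = a[i][0] * x[0] + a[i][1] * x[1]
            err += (y[i] - pred) ** 2
    return err / (len(traj) - 1)

def classify(traj, models):
    """Return the label of the model that best predicts the trajectory."""
    return min(models, key=lambda name: prediction_error(models[name], traj))

def rotation_trajectory(theta, steps, start=(1.0, 0.0)):
    """Synthetic stand-in for a tracked feature: rotate a point by theta per frame."""
    traj = [start]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    for _ in range(steps):
        x = traj[-1]
        traj.append((cos_t * x[0] - sin_t * x[1], sin_t * x[0] + cos_t * x[1]))
    return traj

# Hypothetical two-entry "dictionary" of movements, one model per label.
models = {
    "slow": fit_linear_dynamics(rotation_trajectory(0.1, 50)),
    "fast": fit_linear_dynamics(rotation_trajectory(0.4, 50)),
}
# A new trajectory with a different start point and a slightly different speed.
label = classify(rotation_trajectory(0.38, 30, start=(0.0, 2.0)), models)
```

Comparing trajectories through fitted models rather than raw positions is what lets the new trajectory match the "fast" entry even though it starts elsewhere and has a different amplitude. Real data would be noisy and higher-dimensional, and would need a regularized or library-based solver in place of the closed-form 2x2 inverse.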
This research is supported by NSF Grant 0941463.