Available at: http://digitalcommons.calpoly.edu/theses/1195
MS in Computer Science
Interacting with a computer using only a mouse and keyboard can be frustrating, especially when navigating a virtual three-dimensional (3D) scene such as those found in video games. Recent work has focused on alternative modes of interaction, including 3D tracking of the human body. An essential step in this process is acquiring depth information about the scene. Stereo vision uses two separate images of the same scene, taken from slightly different positions, to recover a three-dimensional view of it. One of the largest obstacles to dense stereo map generation is its high processor usage, which usually prevents the process from running in real time. To address this problem, this project attempts to move the bulk of the processing to the GPU. Depth map extraction is done by matching points between the two images and using the difference in their positions to determine depth, over multiple passes through a series of OpenGL vertex and fragment shaders. Once a depth map has been created, the software uses it to track a person’s movement and pose in three dimensions by tracking key points on the person across frames and using the depth map to supply the third dimension.
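The matching step described above can be illustrated with a minimal CPU sketch. It is not the shader implementation from this project: it assumes rectified images (so matches lie on the same scanline), uses a sum-of-absolute-differences window as a stand-in matching cost, and converts disparity to depth with the standard pinhole relation Z = f * B / d. The function names, the window size, and the focal length and baseline values are hypothetical choices made for illustration.

```python
def match_disparity(left_row, right_row, x, half_window=1, max_disparity=4):
    """Find the disparity at column x of one scanline (illustrative only).

    Compares the window centred at x in the left row against windows
    shifted left by d pixels in the right row, and returns the d with the
    lowest sum of absolute differences. Assumes x is at least half_window
    away from both ends of left_row.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break  # candidate window would fall off the left edge
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d (zero disparity maps to infinity)."""
    if disparity_px <= 0:
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Toy scanlines: the bright feature sits 2 pixels further left in the right view.
left = [10, 10, 10, 50, 90, 50, 10, 10, 10, 10]
right = [10, 50, 90, 50, 10, 10, 10, 10, 10, 10]

d = match_disparity(left, right, x=4)
# Hypothetical camera: 700 px focal length, 0.1 m baseline.
z = depth_from_disparity(d, focal_length_px=700.0, baseline_m=0.1)
```

Each pixel's search is independent of every other pixel's, which is the property that makes this kind of matching a natural fit for per-fragment processing on the GPU.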