Structure from Motion
Benjamin Leib
bgleib

The Algorithm

This program implements structure from motion: it recovers the 3D structure of a scene from a sequence of video frames of that scene. The pipeline has several steps. First, the algorithm selects 500 keypoints in the first frame that should be easy to track. It does this by running Harris corner detection and keeping the 500 strongest corners. See below for a visualization of the chosen points.
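To make the keypoint selection step concrete, here is a minimal NumPy sketch of a Harris response followed by a top-N pick. It is not the project's actual code: the function names (`box_filter`, `harris_response`, `strongest_corners`), the box-shaped window (a Gaussian window is more common), the window radius, and the constant k = 0.05 are all assumptions made for illustration, and no non-maximum suppression is applied.

```python
import numpy as np

def box_filter(img, radius):
    """Sum each pixel's (2r+1)x(2r+1) neighborhood using an integral image."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    return (integral[k:, k:] - integral[:-k, k:]
            - integral[k:, :-k] + integral[:-k, :-k])

def harris_response(img, radius=2, k=0.05):
    """Harris corner response det(M) - k * trace(M)^2 at every pixel."""
    iy, ix = np.gradient(img.astype(float))  # derivatives along rows, columns
    sxx = box_filter(ix * ix, radius)
    syy = box_filter(iy * iy, radius)
    sxy = box_filter(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

def strongest_corners(img, n=500, radius=2, k=0.05):
    """Return the n pixels with the highest response as (x, y) rows."""
    response = harris_response(img, radius, k)
    flat = np.argsort(response, axis=None)[::-1][:n]
    rows, cols = np.unravel_index(flat, response.shape)
    return np.stack([cols, rows], axis=1)
```

Because this sketch skips non-maximum suppression, the selected points can cluster around a few very strong corners; a real tracker would likely enforce a minimum spacing between keypoints.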

Next, it uses optical flow to track those keypoints through the frames. For each frame, it computes the optical flow at every pixel, interpolates that flow field at each keypoint's position, and moves the keypoint by the interpolated amount to obtain its position in the next frame. See below for a visualization of the tracked movement of 20 of the keypoints.
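The interpolation and update step can be sketched as follows. This assumes the dense flow fields have already been computed by some optical flow routine and are passed in as a list of (u, v) array pairs, one per consecutive frame pair; the names `bilinear_sample` and `track_keypoints` and the choice of bilinear interpolation are assumptions, since the writeup does not say which interpolation scheme was used.

```python
import numpy as np

def bilinear_sample(field, x, y):
    """Bilinearly interpolate a 2D array at float (x, y) positions."""
    h, w = field.shape
    x = np.clip(x, 0, w - 1)
    y = np.clip(y, 0, h - 1)
    # Top-left corner of the surrounding 2x2 cell, kept inside the array.
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * field[y0, x0] + fx * field[y0, x0 + 1]
    bottom = (1 - fx) * field[y0 + 1, x0] + fx * field[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def track_keypoints(points, flows):
    """Advect (N, 2) keypoints (x, y) through a list of dense flows.

    flows[i] is a pair (u, v) of HxW arrays giving the per-pixel motion
    from frame i to frame i + 1. Returns an array of shape (F, N, 2),
    where F = len(flows) + 1.
    """
    track = [np.asarray(points, dtype=float)]
    for u, v in flows:
        x, y = track[-1][:, 0], track[-1][:, 1]
        dx = bilinear_sample(u, x, y)
        dy = bilinear_sample(v, x, y)
        track.append(np.stack([x + dx, y + dy], axis=1))
    return np.stack(track)
```

Positions are kept as floats throughout, so small sub-pixel motions accumulate instead of being rounded away at each frame.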

Keypoints that leave the image in any frame are eliminated. See below for the eliminated keypoints, shown at their positions in the first frame.
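A filter of this kind might look like the sketch below. It assumes the tracks are stored as an (F, N, 2) array of (x, y) positions and that "inside the image" means within the pixel grid from 0 to width - 1 and 0 to height - 1; the function name `in_bounds_mask` is hypothetical.

```python
import numpy as np

def in_bounds_mask(tracks, width, height):
    """Return a length-N boolean mask of keypoints that stay inside the
    image in every frame. tracks has shape (F, N, 2) with (x, y) rows."""
    x, y = tracks[..., 0], tracks[..., 1]
    inside = (x >= 0) & (x <= width - 1) & (y >= 0) & (y <= height - 1)
    return inside.all(axis=0)  # True only if inside in all F frames
```

The surviving tracks are then `tracks[:, mask]`, and `~mask` picks out the eliminated keypoints for visualization.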

Once the points have been tracked, their positions across the frames are used to recover both their 3D structure and the camera position throughout the video, using the structure from motion algorithm described in "A Sequential Factorization Method for Recovering Shape and Motion from Image Streams" (Morita and Kanade, 1997).
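For intuition, the sketch below shows the simpler batch factorization idea (in the style of Tomasi and Kanade) that sequential factorization methods build on. It is not the sequential algorithm from the cited paper and not this project's code. It assumes an orthographic camera and tracks stored as an (F, N, 2) array; the function name `factorize` is hypothetical. The result is only determined up to an invertible 3x3 transform, because the metric upgrade that enforces orthonormal camera axes is omitted here.

```python
import numpy as np

def factorize(tracks):
    """Rank-3 factorization of tracked points under an orthographic model.

    tracks: (F, N, 2) array of (x, y) positions, with F >= 2 and N >= 3.
    Returns (motion, shape, W_centered), where motion is (2F, 3), shape is
    (3, N), and motion @ shape is the best rank-3 fit to W_centered.
    """
    # Measurement matrix: F rows of x coordinates above F rows of y.
    W = np.concatenate([tracks[..., 0], tracks[..., 1]], axis=0)
    # Subtracting each row's mean removes the per-frame translation.
    W_centered = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    # Keep the three largest singular values and split them evenly.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root
    shape = root[:, None] * Vt[:3]
    return motion, shape, W_centered
```

With noise-free orthographic projections the centered measurement matrix has rank at most 3, so `motion @ shape` reproduces it exactly; with real tracks the truncation to rank 3 acts as a least-squares fit.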

Results

See below for several viewpoints of the final structure determined by the algorithm, along with the recovered camera positions. While the reconstruction is not perfect, it reproduces the front and side faces of the house, the two faces visible in the video, very well.