CS143 Introduction to Computer Vision:
Project 5 Tracking and Structure from Motion

Gili Kliger (gkliger)

 

Overview

In this assignment, we reconstruct the 3D shape of an object from a series of observations. The three essential steps are:

    Keypoint selection
    Feature tracking
    Structure from motion

Keypoint Selection

We use a Harris corner detector to select keypoints. Here's what the selected keypoints look like overlaid on the first video frame:
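As a rough illustration of the keypoint selection step, here is a minimal NumPy sketch of a Harris corner response. It is not the code used for this project: the box-shaped summation window, the constant k = 0.04, and the helper names box_filter, harris_response, and top_keypoints are all assumptions made for the example, and no non-maximum suppression is applied.

```python
import numpy as np

def box_filter(img, radius):
    """Sum each pixel's (2*radius+1)^2 neighborhood, padding edges by replication."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def harris_response(img, radius=2, k=0.04):
    """Harris score R = det(M) - k * trace(M)^2 at every pixel.

    Assumptions for this sketch: a box window (a Gaussian is also common)
    and k = 0.04 (a commonly quoted value).
    """
    img = img.astype(float)
    Iy, Ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    Sxx = box_filter(Ix * Ix, radius)  # entries of the second-moment matrix M
    Syy = box_filter(Iy * Iy, radius)
    Sxy = box_filter(Ix * Iy, radius)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

def top_keypoints(response, n):
    """Return the n highest-scoring pixels as (x, y) rows, strongest first."""
    order = np.argsort(response, axis=None)[::-1][:n]
    ys, xs = np.unravel_index(order, response.shape)
    return np.stack([xs, ys], axis=1)
```

Corners give a large positive score, straight edges a negative one, and flat regions a score near zero, so keeping the highest-scoring pixels favors points that can be localized in both directions.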


Feature Tracking

We implement a Lucas-Kanade tracker for the detected keypoints. It computes optical flow between successive video frames and moves each keypoint from its position in the first frame along the flow field. Here are the tracks of 20 randomly chosen keypoints over a sequence of frames:
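The core of a Lucas-Kanade step can be sketched as a small least-squares problem. The version below is a simplification and not the project's implementation: it solves for a single displacement in one window around an integer keypoint position rather than computing a dense flow field, and the function name lucas_kanade_step and the window radius are assumptions made for the example.

```python
import numpy as np

def lucas_kanade_step(frame0, frame1, x, y, radius=7):
    """Estimate the displacement (u, v) of the window centered at integer (x, y).

    Uses the brightness-constancy linearization Ix*u + Iy*v = -It and solves it
    in the least-squares sense over the window. Assumes small motion and a
    window that lies fully inside the image.
    """
    I0 = frame0.astype(float)
    I1 = frame1.astype(float)
    Iy, Ix = np.gradient(I0)  # spatial derivatives of the first frame
    It = I1 - I0              # temporal derivative between the two frames
    win = (slice(y - radius, y + radius + 1),
           slice(x - radius, x + radius + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)  # one row per pixel
    b = -It[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # array [u, v]
```

Adding the returned (u, v) to the keypoint position and repeating for each pair of successive frames yields a track. A fuller tracker would typically also interpolate at sub-pixel positions and discard keypoints that leave the frame.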


Structure from Motion

Now we use the keypoint tracks as input to the affine structure from motion procedure. For the n images and m tracked features, we first center the feature coordinates and then construct a 2n x m measurement matrix D holding the x and y coordinates of every feature in every frame. We then compute the SVD of D to get the matrices U, W, and V, where D = U W V'. We take the first 3 columns of U, the first 3 columns of V, and the upper-left 3x3 block of W, and use them to build the motion and shape matrices. Here are three views of the reconstructed structure:

Here's the predicted camera motion along the X, Y, and Z axes:
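The factorization described in this section can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the project's code: centering is done by subtracting the per-frame centroid of the tracked points, the singular values are split evenly between the two factors (one common convention), and the function name affine_sfm and its argument layout are hypothetical. The result is only determined up to an affine ambiguity, which this sketch does not resolve.

```python
import numpy as np

def affine_sfm(tracks_x, tracks_y):
    """Rank-3 factorization of tracked points into motion and shape.

    tracks_x, tracks_y: arrays of shape (n_frames, m_points) holding the x and
    y coordinates of every tracked feature in every frame.
    Returns motion (2n x 3) and shape (3 x m).
    """
    # Assumption: center by subtracting each frame's centroid of the points.
    x = tracks_x - tracks_x.mean(axis=1, keepdims=True)
    y = tracks_y - tracks_y.mean(axis=1, keepdims=True)
    D = np.vstack([x, y])  # 2n x m measurement matrix

    # D = U W V'; NumPy returns the singular values as a vector and V' directly.
    U, w, Vt = np.linalg.svd(D, full_matrices=False)
    U3 = U[:, :3]          # first 3 columns of U
    W3 = np.diag(w[:3])    # upper-left 3x3 block of W
    V3t = Vt[:3, :]        # first 3 columns of V, transposed

    # Assumption: split W evenly between the factors.
    sqrt_W3 = np.sqrt(W3)
    motion = U3 @ sqrt_W3  # stacked affine camera rows
    shape = sqrt_W3 @ V3t  # 3D point coordinates, one column per feature
    return motion, shape
```

For noise-free affine projections the product motion @ shape reproduces the centered measurement matrix exactly. With real tracks it is the best rank-3 approximation, so tracking errors show up directly as distortion in the recovered shape.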

Results

The Lucas-Kanade tracker followed the features well over time. On the other hand, the structure from motion algorithm did not recover the 3D box shape quite as well as we might like.