Project 5: Tracking and Structure from Motion

CS 143: Introduction to Computer Vision

Project Overview

In this project I reconstruct the 3D shape of an object from a series of observations. There are three essential components to this: keypoint selection, feature tracking, and structure from motion.

Keypoint Selection

I used the Harris corner detector to detect keypoints in the first frame and kept the 500 strongest.
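As a rough illustration of this step (a sketch, not the code behind the results here), the Harris response can be computed with NumPy alone. The function names, the 5×5 summation window, k = 0.04, the 3×3 local-maximum test, and the relative threshold are all assumptions made for the example; only the choice of the 500 strongest points comes from the text above.

```python
import numpy as np


def window_sum(img, radius):
    """Sum img over a (2*radius+1)^2 window, zero-padded at the borders."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="constant")
    # Integral image with a leading row and column of zeros.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    return (integral[k:, k:] - integral[:-k, k:]
            - integral[k:, :-k] + integral[:-k, :-k])


def harris_keypoints(image, n_points=500, radius=2, k=0.04, rel_thresh=0.01):
    """Return up to n_points (x, y) corners, strongest first (hypothetical helper)."""
    image = np.asarray(image, dtype=float)
    # np.gradient returns the derivative along axis 0 (rows, y) first.
    Iy, Ix = np.gradient(image)
    Sxx = window_sum(Ix * Ix, radius)
    Syy = window_sum(Iy * Iy, radius)
    Sxy = window_sum(Ix * Iy, radius)
    # Harris response: det(M) - k * trace(M)^2 of the structure tensor M.
    response = Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2

    # Assumption: keep only 3x3 local maxima so the picks do not cluster.
    H, W = response.shape
    padded = np.pad(response, 1, mode="constant", constant_values=-np.inf)
    is_peak = np.ones((H, W), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                is_peak &= response >= padded[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    # Assumption: discard weak responses relative to the strongest corner.
    is_peak &= response > max(rel_thresh * response.max(), 0.0)

    ys, xs = np.nonzero(is_peak)
    order = np.argsort(response[ys, xs])[::-1][:n_points]
    return np.stack([xs[order], ys[order]], axis=1)
```

Calling `harris_keypoints(first_frame)` on a grayscale array would give an N×2 array of (x, y) positions with N ≤ 500, ordered from strongest to weakest response.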

Feature Tracking

In this part, I use the Kanade-Lucas-Tomasi (KLT) tracker to track features. This essentially involves computing the optical flow between successive video frames and moving the keypoints selected in the first frame along the flow field.
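Moving keypoints along a flow field can be sketched as follows, assuming the flow is already available as two arrays `u` and `v` with one value per pixel. The helper names, the bilinear interpolation of the flow at sub-pixel positions, and the rule that a point leaving the frame is dropped for good are assumptions made for this illustration.

```python
import numpy as np


def bilinear_sample(field, xs, ys):
    """Sample a 2D array at fractional (x, y) positions inside the array."""
    H, W = field.shape
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 2)
    ax = xs - x0
    ay = ys - y0
    top = (1 - ax) * field[y0, x0] + ax * field[y0, x0 + 1]
    bottom = (1 - ax) * field[y0 + 1, x0] + ax * field[y0 + 1, x0 + 1]
    return (1 - ay) * top + ay * bottom


def advect_keypoints(points, valid, u, v):
    """Move the still-valid (x, y) keypoints one frame along the flow (u, v).

    points: (N, 2) array of x, y positions; valid: (N,) boolean mask.
    Returns updated copies; points that leave the frame become invalid.
    """
    H, W = u.shape
    points = np.asarray(points, dtype=float).copy()
    valid = np.asarray(valid, dtype=bool).copy()
    idx = np.flatnonzero(valid)
    xs, ys = points[idx, 0], points[idx, 1]
    new_x = xs + bilinear_sample(u, xs, ys)
    new_y = ys + bilinear_sample(v, xs, ys)
    points[idx, 0] = new_x
    points[idx, 1] = new_y
    inside = (new_x >= 0) & (new_x <= W - 1) & (new_y >= 0) & (new_y <= H - 1)
    valid[idx[~inside]] = False  # assumption: never re-acquire a lost point
    return points, valid
```

Applied once per consecutive frame pair, with `valid` starting as all `True`, this yields one 2D track per keypoint plus a mask of the points that stayed in frame for the whole sequence.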

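The first assumption the KLT tracker makes is brightness constancy: a point keeps the same intensity after it moves by (u, v) in the next frame (where I is the image function):

\[ I(x, y, t) = I(x + u, y + v, t + 1) \]

Take the first-order Taylor expansion of I(x + u, y + v, t + 1), where I_x and I_y are the x- and y-derivatives of the image and I_t is the temporal gradient:

\[ I(x + u, y + v, t + 1) \approx I(x, y, t) + I_x u + I_y v + I_t \]

Therefore:

\[ I(x + u, y + v, t + 1) - I(x, y, t) \approx I_x u + I_y v + I_t \]

By the first equation the left-hand side is zero, so, treating the approximation as exact:

\[ I_x u + I_y v + I_t = 0 \]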

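This is only one constraint when we have two unknowns (u and v). We get more by assuming that nearby pixels at points p_i, i ∈ [1, 225] (in a 15×15 box around the pixel) move with the same u and v:

\[ \begin{bmatrix} I_x(p_1) & I_y(p_1) \\ \vdots & \vdots \\ I_x(p_{225}) & I_y(p_{225}) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} I_t(p_1) \\ \vdots \\ I_t(p_{225}) \end{bmatrix} \]

We can solve this overconstrained linear system via least squares (abbreviating the above to Ad = b):

\[ (A^T A)\, d = A^T b \]

\[ \begin{bmatrix} \sum I_x I_x & \sum I_x I_y \\ \sum I_x I_y & \sum I_y I_y \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix} \]

This results in two equations with two unknowns for each pixel. I solved for u and v by inverting the 2×2 matrix on the left-hand side and multiplying it by the vector on the right-hand side.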
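The per-pixel 2×2 solve can be sketched in NumPy as below. This is a single-scale illustration under stated assumptions, not the code used for the results here: the function names, the central-difference gradients from `np.gradient`, the plain frame difference for the temporal gradient, and the `min_det` guard against near-singular matrices are all choices made for the example. The 15×15 window (radius 7) follows the text.

```python
import numpy as np


def window_sum(img, radius):
    """Sum img over a (2*radius+1)^2 window, zero-padded at the borders."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="constant")
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    return (integral[k:, k:] - integral[:-k, k:]
            - integral[k:, :-k] + integral[:-k, :-k])


def lucas_kanade_flow(frame0, frame1, radius=7, min_det=1e-6):
    """Estimate per-pixel flow (u, v) from frame0 to frame1 (hypothetical helper)."""
    I0 = np.asarray(frame0, dtype=float)
    I1 = np.asarray(frame1, dtype=float)
    Iy, Ix = np.gradient(I0)  # derivative along rows (y) is returned first
    It = I1 - I0              # temporal gradient

    # Entries of A^T A and A^T b, summed over the window around each pixel.
    Sxx = window_sum(Ix * Ix, radius)
    Sxy = window_sum(Ix * Iy, radius)
    Syy = window_sum(Iy * Iy, radius)
    Sxt = window_sum(Ix * It, radius)
    Syt = window_sum(Iy * It, radius)

    # Closed-form inverse of [[Sxx, Sxy], [Sxy, Syy]] applied to -[Sxt, Syt].
    det = Sxx * Syy - Sxy ** 2
    ok = det > min_det                 # assumption: zero flow where singular
    safe_det = np.where(ok, det, 1.0)
    u = np.where(ok, (-Syy * Sxt + Sxy * Syt) / safe_det, 0.0)
    v = np.where(ok, (Sxy * Sxt - Sxx * Syt) / safe_det, 0.0)
    return u, v
```

Because the derivation relies on a first-order Taylor expansion, a sketch like this is only reasonable for small displacements between frames; larger motions would call for an iterative or coarse-to-fine scheme, which is not shown.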

Figure: 2D paths of 20 random keypoints over the sequence of frames.

Figure: keypoints that moved out of frame at some point in the sequence.

Structure from Motion

The major steps are: