Tracking and Structure from Motion

Original Images

Optical Flow

Project Description

In this project I used the Kanade-Lucas-Tomasi (KLT) tracker to compute the optical flow of detected interest points through a video clip. In the second part of the project, I used the resulting tracks of the interest points to reconstruct the 3D geometry of the object in the video.

Feature Tracking

The first step was to select the points to track with an interest point detector. The stencil code suggested the Harris detector for finding corner points, and I kept only the 200 highest-scoring points. Beyond the first 200, too many spurious points from the background region were being selected.
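As a rough illustration of this selection step, here is a minimal Python/NumPy sketch (not the MATLAB code from the handin). The function names `box_filter` and `harris_top_n`, the window radius, and the constant `k = 0.04` are assumptions made for the example. It simply keeps the n highest Harris responses and does no non-maximum suppression, so neighboring pixels around the same corner can all be returned.

```python
import numpy as np

def box_filter(a, radius):
    """Sum of `a` over a (2*radius+1)^2 window around each pixel (zero padded)."""
    size = 2 * radius + 1
    padded = np.pad(a, radius, mode="constant")
    out = np.zeros_like(a, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_top_n(image, n=200, radius=2, k=0.04):
    """Return an (n, 2) array of (row, col) for the n highest Harris scores."""
    image = image.astype(float)
    iy, ix = np.gradient(image)            # derivative along rows, then columns
    sxx = box_filter(ix * ix, radius)      # entries of the structure tensor
    syy = box_filter(iy * iy, radius)
    sxy = box_filter(ix * iy, radius)
    # Harris response: det(M) - k * trace(M)^2
    response = sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2
    best = np.argsort(response, axis=None)[::-1][:n]   # highest score first
    rows, cols = np.unravel_index(best, response.shape)
    return np.stack([rows, cols], axis=1)
```

Raising `n` admits lower-scoring corners, which is where background points start to appear.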

Once the points were selected, I used the KLT tracker to follow them through the frames of the 'hotel' image sequence. I used the MATLAB 'gradient' function to compute the horizontal and vertical image gradients needed by the tracker, and a 15x15 pixel window as the neighborhood for estimating the displacement of the center pixel. Summing over these neighborhoods could probably have been sped up a little by using integral images for the Ix, Iy, It matrices, but my integral-image code had a bug, so I used convolution to compute the summed neighborhood values instead. The code for both of these methods is included with my handin.
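The window sums and the per-pixel displacement solve can be sketched as follows, in Python/NumPy rather than the project's MATLAB. This is an illustrative single Lucas-Kanade step under stated assumptions, not the handin code: `window_sums` and `lucas_kanade_flow` are hypothetical names, the window sums are taken over products of the gradients using an integral image, and pixels outside the image are treated as zero.

```python
import numpy as np

def window_sums(a, half):
    """Sum of `a` over the (2*half+1)^2 window centred on every pixel,
    computed with an integral image (out-of-image pixels count as zero)."""
    size = 2 * half + 1
    padded = np.pad(a, half, mode="constant")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    # Four-corner lookup: each window sum costs O(1) regardless of its size.
    return ii[size:, size:] - ii[:-size, size:] - ii[size:, :-size] + ii[:-size, :-size]

def lucas_kanade_flow(frame0, frame1, half=7):
    """One Lucas-Kanade solve per pixel with a 15x15 window (half=7).
    Returns (u, v): displacement along columns (x) and rows (y)."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    iy, ix = np.gradient(f0)               # spatial gradients
    it = f1 - f0                           # temporal difference
    sxx = window_sums(ix * ix, half)
    sxy = window_sums(ix * iy, half)
    syy = window_sums(iy * iy, half)
    sxt = window_sums(ix * it, half)
    syt = window_sums(iy * it, half)
    # Solve [[sxx, sxy], [sxy, syy]] [u, v]^T = -[sxt, syt]^T in closed form.
    det = sxx * syy - sxy ** 2
    det = np.where(np.abs(det) < 1e-12, np.nan, det)   # untextured windows
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

A tracker would evaluate `u` and `v` at each feature location, move the point, and repeat for the next pair of frames. Windows without texture in both directions give an ill-conditioned system and are returned as NaN here.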

The animated image below shows the predicted locations of the feature points through the whole image sequence. There is very good agreement for all of the points that actually lie on the hotel. One point, on the field in the background, is not tracked correctly. To capture the correct motion for this point, I would have had to use an image pyramid.
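For context on the image pyramid mentioned above, which was not implemented in this project: the idea is to track on downsampled copies of the frames first, because a displacement of d pixels at full resolution shrinks to d / 2^k pixels at pyramid level k, where a small-motion tracker can handle it. A minimal Python/NumPy sketch of building such a pyramid by 2x2 averaging follows. `build_pyramid` is a hypothetical name, and a real implementation would usually blur before subsampling.

```python
import numpy as np

def build_pyramid(image, levels=3):
    """Return [full-res, half-res, quarter-res, ...] using 2x2 block averaging."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        # Crop to even dimensions so the image splits cleanly into 2x2 blocks.
        h = prev.shape[0] // 2 * 2
        w = prev.shape[1] // 2 * 2
        prev = prev[:h, :w]
        down = 0.25 * (prev[0::2, 0::2] + prev[1::2, 0::2]
                       + prev[0::2, 1::2] + prev[1::2, 1::2])
        pyramid.append(down)
    return pyramid
```

In a coarse-to-fine tracker, the displacement estimated at the coarsest level is doubled and used as the starting guess at the next finer level.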

The paths of each feature point during the image sequence. Each point has a different track color.

Feature points (the blue circles) that drifted off of the image during the sequence.

Structure from Motion

After tracking the feature points through the image sequence, I used the factorization methods of Tomasi and Kanade, "Shape and Motion from Image Streams under Orthography: a Factorization Method" (1992), and Morita and Kanade, "A Sequential Factorization Method for Recovering Shape and Motion from Image Streams" (1997), to compute the 3D coordinates of the feature points and the view direction vector of the orthographic camera for each frame. I followed the method outlined by Morita and Kanade as closely as possible.
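To make the factorization step concrete, here is a minimal Python/NumPy sketch of the batch, Tomasi-Kanade style version of the method. It is not the sequential variant of Morita and Kanade and not the project's MATLAB code. The name `factorize` and its argument layout (two F x P arrays of tracked x and y coordinates) are assumptions for the example. The result is defined only up to a global rotation and a depth reflection, and the Cholesky step assumes the estimated metric matrix is positive definite, which noisy tracks do not guarantee.

```python
import numpy as np

def factorize(tracks_x, tracks_y):
    """tracks_x, tracks_y: (F, P) image coordinates of P points in F frames.
    Returns motion (2F, 3), shape (3, P), and per-frame view directions (F, 3)."""
    F = tracks_x.shape[0]
    W = np.vstack([tracks_x, tracks_y]).astype(float)   # measurement matrix
    W -= W.mean(axis=1, keepdims=True)                  # move origin to the centroid
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:3])                               # keep the rank-3 part
    M_hat = U[:, :3] * root                             # affine motion
    S_hat = root[:, None] * Vt[:3]                      # affine shape

    def row(a, b):
        # Coefficients of a^T L b in the 6 unknowns of the symmetric matrix L.
        return [a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]]

    I, J = M_hat[:F], M_hat[F:]
    # Metric constraints: |i_f| = 1, |j_f| = 1, i_f . j_f = 0 for every frame.
    G = np.array([row(i, i) for i in I] + [row(j, j) for j in J]
                 + [row(i, j) for i, j in zip(I, J)])
    c = np.concatenate([np.ones(2 * F), np.zeros(F)])
    l = np.linalg.lstsq(G, c, rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    Q = np.linalg.cholesky(L)           # L = Q Q^T; fails if L is not positive definite
    motion = M_hat @ Q
    shape = np.linalg.inv(Q) @ S_hat
    view_dirs = np.cross(motion[:F], motion[F:])   # k_f = i_f x j_f
    return motion, shape, view_dirs
```

The rows of `motion` are the camera's image axes in each frame, and their cross product gives the direction the orthographic camera is looking along.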

My 3D reconstruction of the hotel is shown in several different views below, as well as the estimated path of the camera. An interactive 3D model of the hotel reconstruction is at the top of this page.

A view from in front and above the hotel.
A view from above and to the left.
A view of the back side of the model. This clearly shows the 3D deformation. The red vectors are the different camera view vectors as they moved through the sequence.
A view from the front in height-mapped color.
The estimated 3D camera trajectory through time.

cs143 - Project 5: Tracking and Structure from Motion
