Tracking and Structure from Motion

Dylan Field, CS 143

Objective

The objective of this assignment was to implement the Kanade-Lucas-Tomasi tracker and reconstruct the geometry of an object from observations of key points over time.

Introduction

This project attempts to estimate the geometry of a 3D object given a video sequence that shows different views of the object. To do this, we first select corner points in the image using the Harris corner detector. We then estimate the per-pixel optical flow and use it to track those key points over the video sequence. After determining the locations of the key points over time, we use linear algebra to factor those observations into a matrix representing the motion of the camera and a matrix representing the real-world geometry of the scene.

Harris corner detection

The Harris corner detector does a reasonably good job of detecting key points to track.
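As a rough illustration of what the detector computes (a minimal sketch, not the code used for this project), the Harris response can be built from smoothed products of the image gradients. The Gaussian window width sigma, the constant k = 0.04, and the helper names smooth and harris_response are assumptions chosen for this example.

```python
import numpy as np

def smooth(img, sigma):
    """Separable Gaussian blur with zero padding at the borders."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, out)
    return out

def harris_response(img, sigma=1.0, k=0.04):
    """Harris corner response R = det(A) - k * trace(A)^2 at every pixel."""
    Iy, Ix = np.gradient(img.astype(float))   # derivatives along rows (y) and columns (x)
    Sxx = smooth(Ix * Ix, sigma)               # entries of the structure tensor A
    Syy = smooth(Iy * Iy, sigma)
    Sxy = smooth(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy**2
    trace = Sxx + Syy
    return det - k * trace**2

# Usage: a bright square on a dark background has its strongest response near a corner.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
peak = np.unravel_index(np.argmax(R), R.shape)
```

Key points would then be taken as local maxima of R above some threshold; the response is positive at corners, negative along straight edges, and near zero in flat regions.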

Optical Flow

After detecting the key points to track, we compute the spatial derivatives of each frame in the x and y directions, along with its temporal gradient (the change in intensity between consecutive frames).

Using the method outlined in this paper, we combine these gradients to estimate the per-pixel motion between each pair of consecutive frames.
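The sketch below shows one standard way to turn these gradients into per-pixel motion: a single-scale Lucas-Kanade step that solves a 2x2 least-squares system in a window around each pixel. It is an illustration under stated assumptions rather than the exact method of the referenced paper; the Gaussian window, the eps threshold, and the names smooth and lucas_kanade_flow are hypothetical choices for this example.

```python
import numpy as np

def smooth(img, sigma):
    """Separable Gaussian window sum with zero padding at the borders."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, out)
    return out

def lucas_kanade_flow(frame0, frame1, sigma=2.0, eps=1e-9):
    """Estimate per-pixel flow (u, v) from frame0 to frame1.

    Solves [Sxx Sxy; Sxy Syy] [u v]^T = -[Sxt Syt]^T at every pixel,
    which follows from the brightness constancy equation Ix*u + Iy*v + It = 0.
    """
    I0 = frame0.astype(float)
    I1 = frame1.astype(float)
    Iy, Ix = np.gradient(I0)        # spatial derivatives
    It = I1 - I0                    # temporal gradient
    Sxx, Syy, Sxy = smooth(Ix * Ix, sigma), smooth(Iy * Iy, sigma), smooth(Ix * Iy, sigma)
    Sxt, Syt = smooth(Ix * It, sigma), smooth(Iy * It, sigma)
    det = Sxx * Syy - Sxy**2
    ok = det > eps                  # skip pixels where the system is ill-conditioned
    safe_det = np.where(ok, det, 1.0)
    u = np.where(ok, (-Syy * Sxt + Sxy * Syt) / safe_det, 0.0)
    v = np.where(ok, (Sxy * Sxt - Sxx * Syt) / safe_det, 0.0)
    return u, v

# Usage: a smooth synthetic pattern shifted by (0.2, -0.1) pixels.
ys, xs = np.mgrid[0:64, 0:64].astype(float)
pattern = lambda x, y: np.sin(0.3 * x) + np.sin(0.25 * y)
f0 = pattern(xs, ys)
f1 = pattern(xs - 0.2, ys + 0.1)
u, v = lucas_kanade_flow(f0, f1)
```

Because the derivation linearizes the image, a single step like this is only reliable for small displacements; larger motions usually need iteration or a coarse-to-fine pyramid.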

Then we can use that estimated motion to track key points across video frames.
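A minimal sketch of this tracking step follows, assuming each flow field is stored as a pair of arrays (u, v) with the same shape as the frame. Since tracked positions are generally not integers, the flow is sampled with bilinear interpolation. The function names bilinear_sample and track_points are hypothetical.

```python
import numpy as np

def bilinear_sample(field, x, y):
    """Sample a 2D array at fractional (x, y) positions with bilinear interpolation."""
    h, w = field.shape
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    ax = np.clip(x - x0, 0.0, 1.0)
    ay = np.clip(y - y0, 0.0, 1.0)
    top = (1 - ax) * field[y0, x0] + ax * field[y0, x0 + 1]
    bottom = (1 - ax) * field[y0 + 1, x0] + ax * field[y0 + 1, x0 + 1]
    return (1 - ay) * top + ay * bottom

def track_points(x, y, flows):
    """Advance points through a sequence of flow fields.

    flows is an iterable of (u, v) pairs, one per consecutive frame pair.
    Returns two arrays of shape (n_frames, n_points) with x and y positions.
    """
    xs = [np.asarray(x, dtype=float)]
    ys = [np.asarray(y, dtype=float)]
    for u, v in flows:
        cx, cy = xs[-1], ys[-1]
        xs.append(cx + bilinear_sample(u, cx, cy))
        ys.append(cy + bilinear_sample(v, cx, cy))
    return np.stack(xs), np.stack(ys)

# Usage: two points carried along by a constant flow of (1.5, -0.5) for three steps.
flow = (np.full((32, 32), 1.5), np.full((32, 32), -0.5))
track_x, track_y = track_points([10.0, 20.5], [10.0, 7.25], [flow] * 3)
```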

Motion of 20 Random Points

Motion of All Points

However, some of the tracked points move out of the frame during the sequence. We simply discard these points.
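One simple way to do this, assuming the tracks are stored as arrays of shape (n_frames, n_points), is to keep only the points that stay inside the image bounds in every frame. The name discard_out_of_frame and the array layout are assumptions for this sketch.

```python
import numpy as np

def discard_out_of_frame(xs, ys, width, height):
    """Keep only the tracks that stay inside the image in every frame.

    xs and ys have shape (n_frames, n_points). Returns the filtered tracks
    and the boolean mask of points that were kept.
    """
    inside = (xs >= 0) & (xs <= width - 1) & (ys >= 0) & (ys <= height - 1)
    keep = inside.all(axis=0)       # a point must be inside in all frames
    return xs[:, keep], ys[:, keep], keep

# Usage: the second point drifts past the right edge of a 100x80 frame.
xs = np.array([[10.0, 95.0, 50.0],
               [12.0, 99.5, 50.0],
               [14.0, 104.0, 50.0]])
ys = np.array([[20.0, 30.0, 40.0],
               [20.0, 30.0, 41.0],
               [20.0, 30.0, 42.0]])
kept_x, kept_y, keep = discard_out_of_frame(xs, ys, width=100, height=80)
```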

Motion of Points That Move Out of Frame

Structure from Motion

Finally, we use the locations of the tracked key points to estimate the real-world geometry of the scene and the movement of the camera. (See Tomasi and Kanade, 1992 or Morita and Kanade, 1997 for more details.)
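The following is a compact sketch of the factorization idea from Tomasi and Kanade under an orthographic camera assumption: stack the tracks into a measurement matrix, subtract each row's mean, take a rank-3 SVD, and then resolve the remaining ambiguity with the metric constraints (unit-length, mutually orthogonal camera axes). The function name factorize, the track layout, and the use of an eigendecomposition to recover Q are choices made for this example, not necessarily what this project did.

```python
import numpy as np

def factorize(xs, ys):
    """Factor tracks of shape (n_frames, n_points) into motion M (2F x 3) and shape S (3 x P)."""
    F = xs.shape[0]
    W = np.vstack([xs, ys]).astype(float)           # measurement matrix, 2F x P
    W = W - W.mean(axis=1, keepdims=True)           # remove per-frame translation
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    M_hat = U[:, :3] * np.sqrt(s[:3])               # affine motion, rank 3
    S_hat = np.sqrt(s[:3])[:, None] * Vt[:3]        # affine shape

    # Metric constraints on L = Q Q^T: i.L.i = 1, j.L.j = 1, i.L.j = 0 for each frame.
    def coeffs(a, b):
        return [a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]]

    I, J = M_hat[:F], M_hat[F:]
    A = np.array([coeffs(i, i) for i in I] + [coeffs(j, j) for j in J]
                 + [coeffs(i, j) for i, j in zip(I, J)])
    b = np.concatenate([np.ones(2 * F), np.zeros(F)])
    l = np.linalg.lstsq(A, b, rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    # Recover Q from L; clipping guards against slightly negative eigenvalues under noise.
    w, V = np.linalg.eigh(L)
    Q = V * np.sqrt(np.clip(w, 1e-12, None))
    return M_hat @ Q, np.linalg.inv(Q) @ S_hat

# Usage: noise-free synthetic points seen by 8 random orthographic cameras.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 30))
n_frames = 8
xs = np.empty((n_frames, 30))
ys = np.empty((n_frames, 30))
for f in range(n_frames):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # random orthonormal camera axes
    xs[f] = R[0] @ X + rng.normal()                 # arbitrary image translation
    ys[f] = R[1] @ X + rng.normal()
M, S = factorize(xs, ys)
```

The recovered shape S matches the true geometry only up to a global rotation or reflection, so it is best compared through quantities such as pairwise distances between points.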

We can also plot the motion of the camera over time.
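The text does not say how the camera path was obtained, so the following is an assumption: a common convention with the factorization method is to take the camera's viewing direction in each frame as the normalized cross product of its two image axes, k = i x j, where i and j are the corresponding rows of the motion matrix. The name camera_directions and the (2F x 3) layout of M are hypothetical.

```python
import numpy as np

def camera_directions(M):
    """Return unit viewing directions k_f = i_f x j_f for a motion matrix M of shape (2F, 3).

    The first F rows of M are assumed to hold the i axes and the last F rows the j axes.
    """
    F = M.shape[0] // 2
    k = np.cross(M[:F], M[F:])
    return k / np.linalg.norm(k, axis=1, keepdims=True)

# Usage: two frames, the second with the camera axes rotated about the x axis.
M = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
k = camera_directions(M)
```

Each column of k can then be plotted against the frame index to give the per-dimension motion, or all three together as a 3D path.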

Motion of Camera in 3D

x-dimension of Camera Motion

y-dimension of Camera Motion

z-dimension of Camera Motion

Credits

Hari Narayanan helped me understand the Structure from Motion algorithm.