The objective of this assignment was to implement the Kanade-Lucas-Tomasi tracker and reconstruct the geometry of an object from observations of key points over time.
This project attempts to estimate the geometry of a 3D object given a video sequence that shows different views of the object. To do this, we first select corner points of the image using the Harris corner detector. We then estimate the per-pixel optical flow and track those key points over the video sequence. After determining the locations of the key points over time, we use linear algebra to factor these observations into one matrix representing the motion of the camera and another representing the real-world geometry of the scene.
The Harris corner detector does a reasonably good job of detecting key points.
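As a rough illustration of the detection step, here is a minimal NumPy sketch of a Harris response map. It is not the code used for this project: the names `box_sum`, `harris_response` and `top_corners`, the box-shaped window, and the constant `k = 0.05` are all assumptions made for the example, and there is no non-maximum suppression.

```python
import numpy as np

def box_sum(a, r):
    """Sum each pixel's (2r+1) x (2r+1) neighbourhood, with zero padding."""
    h, w = a.shape
    p = np.pad(a, r)  # constant (zero) padding by default
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, r=1, k=0.05):
    """Harris corner response R = det(A) - k * trace(A)^2 at every pixel."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)      # derivatives along rows (y) and columns (x)
    Sxx = box_sum(Ix * Ix, r)      # entries of the structure tensor A
    Syy = box_sum(Iy * Iy, r)
    Sxy = box_sum(Ix * Iy, r)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

def top_corners(R, n):
    """Return the n strongest responses as (x, y) pixel coordinates."""
    idx = np.argsort(R, axis=None)[::-1][:n]
    ys, xs = np.unravel_index(idx, R.shape)
    return np.stack([xs, ys], axis=1)
```

Large positive responses indicate corners, while negative responses indicate straight edges, which is why corners are easier to track than edge points.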
After we detect key points to track, we calculate the spatial derivatives of the image in the x and y directions, along with its temporal gradient between consecutive frames.
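The gradient computation can be sketched in a few lines. This is an illustrative version only: it assumes central differences via `np.gradient` for the spatial derivatives and a plain frame difference for the temporal one, and the name `image_gradients` is made up for the example.

```python
import numpy as np

def image_gradients(frame0, frame1):
    """Return (Ix, Iy, It) for two consecutive grayscale frames."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    Iy, Ix = np.gradient(f0)  # np.gradient returns the row (y) derivative first
    It = f1 - f0              # temporal gradient as a simple frame difference
    return Ix, Iy, It
```

For small motions these three quantities are tied together by the brightness constancy constraint, `Ix * u + Iy * v + It ≈ 0`, where `(u, v)` is the motion of a pixel between the two frames.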
Using the method outlined in this paper, we can combine these gradients to calculate the per-pixel motion between consecutive frames.
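The sketch below shows one common way to turn the gradients into per-pixel motion. It assumes the standard Lucas-Kanade least-squares formulation over a small square window; the referenced paper may differ in its details. The names `window_sum` and `lucas_kanade_flow`, the window radius `r`, and the threshold `eps` are assumptions made for the example.

```python
import numpy as np

def window_sum(a, r):
    """Sum each pixel's (2r+1) x (2r+1) neighbourhood, with zero padding."""
    h, w = a.shape
    p = np.pad(a, r)
    return sum(p[dy:dy + h, dx:dx + w]
               for dy in range(2 * r + 1) for dx in range(2 * r + 1))

def lucas_kanade_flow(frame0, frame1, r=2, eps=1e-6):
    """Estimate per-pixel flow (u, v) by solving a 2x2 system in each window."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    Iy, Ix = np.gradient(f0)
    It = f1 - f0
    # Normal equations: [[Sxx, Sxy], [Sxy, Syy]] @ [u, v] = -[Sxt, Syt]
    Sxx = window_sum(Ix * Ix, r)
    Syy = window_sum(Iy * Iy, r)
    Sxy = window_sum(Ix * Iy, r)
    Sxt = window_sum(Ix * It, r)
    Syt = window_sum(Iy * It, r)
    det = Sxx * Syy - Sxy ** 2
    ok = det > eps                      # skip windows with too little texture
    safe = np.where(ok, det, 1.0)       # avoid dividing by zero
    u = np.where(ok, (-Syy * Sxt + Sxy * Syt) / safe, 0.0)
    v = np.where(ok, (Sxy * Sxt - Sxx * Syt) / safe, 0.0)
    return u, v
```

Windows where the determinant is close to zero (flat regions or straight edges) have no unique solution, so this sketch simply reports zero motion there.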
Then we can use that estimated motion to track key points across video frames.
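One simple way to do the tracking is to move each key point by the flow sampled at its current (sub-pixel) position. The sketch below assumes bilinear interpolation of the flow field and marks points that have left the frame with NaN; the function names and the `(u, v)` list layout are hypothetical.

```python
import numpy as np

def bilinear(field, x, y):
    """Sample a 2D field at a sub-pixel location (x, y)."""
    h, w = field.shape
    x0 = int(np.clip(np.floor(x), 0, w - 2))
    y0 = int(np.clip(np.floor(y), 0, h - 2))
    ax, ay = x - x0, y - y0
    return ((1 - ax) * (1 - ay) * field[y0, x0]
            + ax * (1 - ay) * field[y0, x0 + 1]
            + (1 - ax) * ay * field[y0 + 1, x0]
            + ax * ay * field[y0 + 1, x0 + 1])

def track_points(points, flows):
    """Propagate (x, y) points through a list of (u, v) flow fields.

    Returns an array of shape (num_frames, num_points, 2). A point that
    leaves the frame is set to NaN from the following frame onwards.
    """
    pts = np.asarray(points, dtype=float)
    tracks = [pts.copy()]
    for u, v in flows:
        h, w = u.shape
        nxt = np.full_like(pts, np.nan)
        for i, (x, y) in enumerate(pts):
            # NaN coordinates fail this test, so lost points stay lost.
            if 0 <= x <= w - 1 and 0 <= y <= h - 1:
                nxt[i] = (x + bilinear(u, x, y), y + bilinear(v, x, y))
        pts = nxt
        tracks.append(pts.copy())
    return np.stack(tracks)
```

For example, a point at `(2, 5)` under a constant flow of `(1.5, -0.5)` pixels per frame ends up at `(6.5, 3.5)` after three frames.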
However, some of the points move out of the frame during the sequence. We simply discard the data for these points.
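Discarding those points amounts to keeping only the tracks that stay inside the image in every frame. A minimal sketch, assuming the tracks are stored as an array of shape `(num_frames, num_points, 2)` holding `(x, y)` positions (the name `keep_visible` and this layout are assumptions for the example):

```python
import numpy as np

def keep_visible(tracks, width, height):
    """Keep only the points that stay inside the frame for the whole sequence."""
    x = tracks[..., 0]
    y = tracks[..., 1]
    # Comparisons with NaN are False, so lost points are rejected as well.
    inside = (x >= 0) & (x <= width - 1) & (y >= 0) & (y <= height - 1)
    keep = inside.all(axis=0)  # one flag per point
    return tracks[:, keep], keep
```

Dropping whole tracks keeps the measurement matrix complete, which the factorization step relies on.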
Finally, we use the locations of the tracked key points to estimate the real-world geometry of the scene and the movement of the camera. (See Tomasi and Kanade, 1992 or Morita and Kanade, 1997 for more details.)
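The core of the factorization idea can be sketched as a rank-3 SVD of the centred measurement matrix. This is a simplified version under an orthographic camera assumption: it omits the metric constraints described in the cited papers, so motion and shape are only recovered up to an affine ambiguity. The name `factorize` and the `(num_frames, num_points, 2)` input layout are assumptions for the example.

```python
import numpy as np

def factorize(tracks):
    """Factor tracked points into motion M (2F x 3) and shape S (3 x N).

    tracks has shape (F, N, 2) with (x, y) positions. The first F rows of M
    correspond to the x coordinates, the last F rows to the y coordinates.
    """
    # Measurement matrix: x rows stacked on top of y rows, shape (2F, N).
    W = np.concatenate([tracks[..., 0], tracks[..., 1]], axis=0)
    # Subtract each row's mean, which removes the per-frame translation.
    W = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the three largest singular values and split them evenly.
    root = np.sqrt(s[:3])
    M = U[:, :3] * root
    S = root[:, None] * Vt[:3]
    return M, S
```

With noise-free orthographic data the centred measurement matrix has rank 3, so `M @ S` reproduces it exactly; with real tracks the discarded singular values give a rough indication of the noise level.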
We can also plot the motion of the camera over time.
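One way to obtain something to plot is to recover the viewing direction of the camera in each frame as the cross product of its two image axes. The sketch below assumes a motion matrix of shape `(2F, 3)` whose first `F` rows are the camera x axes and whose last `F` rows are the y axes, and that the axes are already approximately orthonormal (that is, after a metric upgrade). The name `camera_directions` is hypothetical.

```python
import numpy as np

def camera_directions(M):
    """Return a unit viewing direction for each frame, shape (F, 3)."""
    F = M.shape[0] // 2
    i_axes = M[:F]   # camera x axis in each frame
    j_axes = M[F:]   # camera y axis in each frame
    k = np.cross(i_axes, j_axes)
    return k / np.linalg.norm(k, axis=1, keepdims=True)
```

Plotting the three components of these directions against the frame index gives a simple view of how the camera orientation changes over the sequence.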
Hari Narayanan helped me understand the Structure from Motion algorithm.