CSCI1430 Project 5: Structure from Motion

Michael Price (mprice)


The goal of this project is to recover the three-dimensional shape of an object, along with the motion of an orthographic camera, from a video of that object.


Algorithm


The basic algorithm can be broken up into the following steps:

Choose points to track: We run a Harris corner detector on the first frame of the sequence to identify trackable candidates.
Compute optical flow and track points: For each pair of consecutive frames, we compute the optical flow and update each tracked point's position by adding the flow offset at that point. If a point comes within 7 pixels of an image edge, it is removed and no longer tracked.
Solve for the 3D structure as a factorization problem: We follow the method described in Tomasi and Kanade, 1992. The input is the set of tracked point positions, and the output is the 3D structure together with the camera directions.
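
The tracking update and the factorization can be sketched roughly as follows. This is an illustration in Python with NumPy, not the code used for this project. The function names (advance_points, factorize), the dense flow arrays flow_u and flow_v, the nearest-pixel flow lookup, and the layout of the measurement matrix (all x rows first, then all y rows) are assumptions made for the example. The metric upgrade shown here, a linear least-squares solve followed by an eigendecomposition, is one common way to apply the Tomasi and Kanade constraints.

```python
import numpy as np

def advance_points(points, flow_u, flow_v, margin=7):
    """Move (N, 2) points of (x, y) by the flow sampled at each point.

    Returns the moved points and a boolean mask of points still at least
    `margin` pixels away from every image edge.
    """
    height, width = flow_u.shape
    # Assumption: nearest-pixel lookup; bilinear sampling would be smoother.
    cols = np.clip(np.rint(points[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(points[:, 1]).astype(int), 0, height - 1)
    moved = points + np.stack([flow_u[rows, cols], flow_v[rows, cols]], axis=1)
    keep = ((moved[:, 0] >= margin) & (moved[:, 0] <= width - 1 - margin) &
            (moved[:, 1] >= margin) & (moved[:, 1] <= height - 1 - margin))
    return moved, keep

def _constraint_row(a, b):
    """Coefficients of a^T L b in the 6 unknowns of a symmetric 3x3 L."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def factorize(W):
    """Factor a (2F, P) measurement matrix into motion M (2F, 3) and shape S (3, P)."""
    n_frames = W.shape[0] // 2
    # Subtracting each row's mean moves the origin to the point centroid,
    # which removes the camera translation.
    W_centered = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    root = np.sqrt(s[:3])
    M_hat = U[:, :3] * root            # affine motion, rank 3
    S_hat = root[:, None] * Vt[:3]     # affine shape, rank 3

    # Metric constraints on L = Q Q^T: |i_f| = 1, |j_f| = 1, i_f . j_f = 0.
    I, J = M_hat[:n_frames], M_hat[n_frames:]
    A = np.array([_constraint_row(i, i) for i in I] +
                 [_constraint_row(j, j) for j in J] +
                 [_constraint_row(i, j) for i, j in zip(I, J)])
    b = np.concatenate([np.ones(2 * n_frames), np.zeros(n_frames)])
    l = np.linalg.lstsq(A, b, rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    # Recover Q from L; clipping guards against small negative eigenvalues
    # that noisy tracks can produce.
    w, V = np.linalg.eigh(L)
    Q = V * np.sqrt(np.clip(w, 1e-12, None))
    M = M_hat @ Q
    S = np.linalg.solve(Q, S_hat)
    return M, S
```

Given M, the viewing direction of the camera in frame f can be taken as the cross product of rows f and f + F of M, for example np.cross(M[:F], M[F:]) with F the number of frames. The recovered structure is only defined up to a global rotation and a reflection, so it may need to be flipped or reoriented before plotting.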


Results

The results I got seem quite reasonable.
Here are the paths of 20 points, randomly chosen from the initial 500, over the course of the sequence.
The circles show the initial position of each tracked point:



Here are the points that were initially detected but ultimately removed because they came too close to the edge of the frame at some point during the sequence:
Here are three views of the structure that was ultimately generated:



And finally, here is a better view of the camera's motion throughout the sequence:

Extra Credit: For extra credit, I attempted to implement two-view structure from motion with a perspective camera, using the 8-point algorithm. Unfortunately, my implementation is not quite working, and I have nothing to show for it other than the code, but I am leaving this note here for posterity.
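
For reference, the core of the normalized 8-point algorithm can be sketched as below. This is a generic textbook-style illustration in Python with NumPy, not the non-working implementation mentioned above. The function names (normalize_points, eight_point) and the convention that F satisfies x2^T F x1 = 0 are assumptions made for the example. It only estimates the fundamental matrix; recovering the camera pose and triangulating the structure would be further steps.

```python
import numpy as np

def normalize_points(pts):
    """Hartley normalization of (N, 2) points.

    Returns homogeneous (N, 3) points with zero mean and mean distance
    sqrt(2) from the origin, plus the 3x3 transform T that was applied.
    """
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - centroid, axis=1))
    T = np.array([[scale, 0.0, -scale * centroid[0]],
                  [0.0, scale, -scale * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return homog @ T.T, T

def eight_point(pts1, pts2):
    """Estimate F with x2^T F x1 = 0 from N >= 8 matches, each (N, 2)."""
    x1, T1 = normalize_points(pts1)
    x2, T2 = normalize_points(pts2)
    # Each match contributes one row of A f = 0, with f = F in row-major order.
    A = np.column_stack([x2[:, [0]] * x1, x2[:, [1]] * x1, x2[:, [2]] * x1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # A valid fundamental matrix has rank 2, so zero the smallest singular value.
    U, s, Vt = np.linalg.svd(F)
    F = U @ np.diag([s[0], s[1], 0.0]) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

Normalizing the coordinates before building A matters in practice: with raw pixel coordinates the columns of A differ by several orders of magnitude, and the estimate becomes very sensitive to noise in the matches.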