Nicholas Ragosta
CSCI 1430 Project 5
Structure from Motion
Goal:
Our goal is to recover 3D structure information from a series of 2D images taken at different orientations.
Method:
Detecting Points to Track
The points to be tracked are obtained from the first frame. Corner points are chosen because they are easy to track, and they are located
with a Harris corner detector, which responds where the image gradient is strong in both the horizontal and vertical directions. Only the 500
strongest corner points were used for tracking. The points found for the hotel image are shown below.
Figure: Interest Points to Track
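The corner selection described above can be sketched in a few lines. This is only an illustration in Python with NumPy, not the project's code: the function names, the box-filter window, and the constant alpha = 0.04 are assumptions made for the example. It computes a Harris response from windowed gradient products and keeps the n strongest responses.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean over a (2*radius+1)^2 window, with edge padding (assumed window)."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def harris_response(img, alpha=0.04, radius=1):
    """Harris corner response det(A) - alpha * trace(A)^2 at every pixel."""
    img = img.astype(float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(img)
    Sxx = box_blur(Ix * Ix, radius)
    Syy = box_blur(Iy * Iy, radius)
    Sxy = box_blur(Ix * Iy, radius)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - alpha * trace ** 2

def strongest_corners(response, n=500):
    """Return the (x, y) coordinates of the n largest responses, strongest first."""
    order = np.argsort(response, axis=None)[::-1][:n]
    rows, cols = np.unravel_index(order, response.shape)
    return np.column_stack([cols, rows])
```

A practical detector would usually add non-maximum suppression so that the 500 points are not clustered around a few strong corners; that step is left out here to keep the sketch short.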
Tracking Points Frame to Frame
- A Kanade-Lucas-Tomasi (KLT) tracker was implemented to follow the interest points obtained from the Harris corner detector.
- The KLT tracker assumes that brightness stays constant across sequential frames and that nearby pixels move similarly. Using these constraints it is possible to follow the motion of image pixels.
- The function optical_flow takes in two sequential images and returns the displacement components u and v for each pixel in the image.
- First, the x and y gradients are obtained from the first of the two images. The temporal gradient is obtained by simply subtracting the second image from the first.
- Assuming brightness constancy and similar motion within a region gives, for each pixel, a solvable system of equations that relates that pixel's displacement to its x, y, and temporal gradients.
- Keypoints are moved by adding the calculated displacement. The returned displacement field is floating point, so interest points can move to non-integer locations. Interp2 is used to interpolate between the surrounding pixels to find the displacement at an interest point whose coordinates are not integers.
- The motion paths of 20 randomly sampled interest points are displayed below.
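The per-pixel system mentioned in the list above can be written out as a short sketch. It is a Python/NumPy illustration under stated assumptions, not the project's optical_flow function: the name optical_flow_sketch, the 15 x 15 summation window, and the determinant threshold are made up for the example. The temporal gradient is taken here as second image minus first, so that the solved (u, v) points from the first frame toward the second; the text's subtraction order may differ with the sign handled elsewhere.

```python
import numpy as np

def window_sum(a, radius):
    """Sum of a over a (2*radius+1)^2 neighbourhood, with edge padding."""
    k = 2 * radius + 1
    padded = np.pad(a, radius, mode="edge")
    out = np.zeros(a.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def optical_flow_sketch(im1, im2, radius=7, eps=1e-9):
    """Lucas-Kanade style flow: one 2x2 least-squares solve per pixel."""
    im1 = im1.astype(float)
    im2 = im2.astype(float)
    Iy, Ix = np.gradient(im1)      # spatial gradients of the first image
    It = im2 - im1                 # temporal gradient (assumed sign convention)
    Sxx = window_sum(Ix * Ix, radius)
    Sxy = window_sum(Ix * Iy, radius)
    Syy = window_sum(Iy * Iy, radius)
    Sxt = window_sum(Ix * It, radius)
    Syt = window_sum(Iy * It, radius)
    # Solve [[Sxx, Sxy], [Sxy, Syy]] [u, v]^T = -[Sxt, Syt]^T in closed form.
    det = Sxx * Syy - Sxy ** 2
    ok = np.abs(det) > eps
    det_safe = np.where(ok, det, 1.0)
    u = np.where(ok, (-Syy * Sxt + Sxy * Syt) / det_safe, 0.0)
    v = np.where(ok, (Sxy * Sxt - Sxx * Syt) / det_safe, 0.0)
    return u, v
```

Pixels whose window has no usable texture (near-zero determinant) are given zero flow in this sketch. The linearisation only holds for small displacements, which is why trackers of this kind are often run coarse to fine.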
Interest Points That Move Out of Image Frame
Some of the interest points obtained in the first frame may move out of the image window by the final frame. Such points must be discarded.
As the proj5 function loops through sequential image pairs and computes displacement fields, it also checks for any interest point whose location has become NaN after
movement. The indices of these interest points are recorded. Likewise, the indices of interest points that never drift out of frame are recorded. The interest points
that drift out of frame during motion are displayed below.
Figure: Motion Paths
Figure: Drifted Points
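The sub-pixel lookup and the NaN check can be illustrated together. The sketch below is a Python/NumPy stand-in, not the project's code: bilinear_sample and track_points are hypothetical names, and the sampler is written to return NaN for queries outside the image, which is the behaviour the NaN test in the text relies on.

```python
import numpy as np

def bilinear_sample(field, x, y):
    """Bilinearly interpolate field at float (x, y); NaN outside the grid."""
    h, w = field.shape
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    out = np.full(x.shape, np.nan)
    # NaN coordinates fail every comparison, so they stay NaN in the output.
    inside = (x >= 0) & (x <= w - 1) & (y >= 0) & (y <= h - 1)
    xi, yi = x[inside], y[inside]
    x0 = np.minimum(np.floor(xi).astype(int), w - 2)
    y0 = np.minimum(np.floor(yi).astype(int), h - 2)
    fx, fy = xi - x0, yi - y0
    out[inside] = (field[y0, x0] * (1 - fx) * (1 - fy)
                   + field[y0, x0 + 1] * fx * (1 - fy)
                   + field[y0 + 1, x0] * (1 - fx) * fy
                   + field[y0 + 1, x0 + 1] * fx * fy)
    return out

def track_points(points, flows):
    """Move (x, y) points through a list of (u, v) fields.

    Returns X, Y of shape (frames, points) and a boolean mask of points
    whose position became NaN at any frame.
    """
    xs = [points[:, 0].astype(float)]
    ys = [points[:, 1].astype(float)]
    for u, v in flows:
        x, y = xs[-1], ys[-1]
        xs.append(x + bilinear_sample(u, x, y))
        ys.append(y + bilinear_sample(v, x, y))
    X, Y = np.array(xs), np.array(ys)
    drifted = np.isnan(X).any(axis=0) | np.isnan(Y).any(axis=0)
    return X, Y, drifted
```

With this scheme a point turns NaN on the first lookup after it has left the image, so a point that leaves on the very last step is not flagged unless bounds are also checked explicitly. The columns of X and Y where drifted is False are the tracks that would be kept.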
Obtaining Structure from Motion
- Combining the tracked coordinates of all interest points over all frames, we can create a matrix D that contains all of the motion information.
- This matrix can then be decomposed into two matrices: S, which contains the 3D point coordinates, and M, which contains the camera directions, such that D is equal to the product of M and S.
- The M and S matrix pair obtained by decomposing D is not unique. To eliminate the affine ambiguity of these coordinate and camera matrices, an additional
refinement of the M and S matrices must be made.
- The matrix L is obtained by imposing the constraint that all camera direction vectors must be orthogonal and solving the matrix equations described in A Sequential Factorization Method for Recovering Shape and Motion from Image Streams (Morita and Kanade, 1997).
- Performing a Cholesky decomposition of the matrix L yields a matrix A, which is used to modify M and S to produce unique motion and shape matrices.
- The resulting shape matrix is used to produce a mesh representing the 3D shape.
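The factorization steps listed above can be sketched as follows. This is a Python/NumPy illustration under several assumptions that are not stated in the text: D is taken to be 2F x N with the F rows of x coordinates stacked above the F rows of y coordinates, each row is centred before decomposition, the rank-3 split uses an SVD, and the constraints on L are the usual metric ones (each frame's two camera axes have unit length and are orthogonal to each other). The function names are hypothetical.

```python
import numpy as np

def _constraint_row(a, b):
    """Coefficients of the six unique entries of symmetric L in a^T L b."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def factorize(D):
    """Split a 2F x N measurement matrix into motion M (2F x 3) and shape S (3 x N)."""
    F = D.shape[0] // 2
    Dc = D - D.mean(axis=1, keepdims=True)        # remove per-row translation
    U, w, Vt = np.linalg.svd(Dc, full_matrices=False)
    root = np.sqrt(w[:3])
    M_hat = U[:, :3] * root                       # affine motion, rank 3
    S_hat = root[:, None] * Vt[:3]                # affine shape
    # Least-squares solve for L from i.L.i = 1, j.L.j = 1, i.L.j = 0 per frame.
    rows, rhs = [], []
    for f in range(F):
        i, j = M_hat[f], M_hat[F + f]
        rows += [_constraint_row(i, i), _constraint_row(j, j), _constraint_row(i, j)]
        rhs += [1.0, 1.0, 0.0]
    l = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    A = np.linalg.cholesky(L)                     # L = A A^T; fails if L is not positive definite
    M = M_hat @ A
    S = np.linalg.solve(A, S_hat)                 # A^{-1} S_hat, so M @ S still equals Dc
    return M, S
```

With noisy tracks the estimated L is not guaranteed to be positive definite, in which case the Cholesky step raises an error and some fallback is needed. The recovered M and S are also still free to rotate together as a pair, since any rotation applied to both leaves their product unchanged.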
Results
The 3D coordinates of all interest points are plotted below. The coordinates are plotted in three different orientations so that the object structure is easy to see.
Camera motion is also plotted below, separated into x, y, and z components and plotted against frame number.
Figure: X Camera Motion
Figure: Y Camera Motion
Figure: Z Camera Motion
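The text does not say how the per-frame camera components were computed. One common choice, assumed here rather than taken from the report, is to use the cross product of the two camera axes of each frame as the viewing direction and to normalise it. The short Python/NumPy sketch below does this for a motion matrix M laid out as F rows of x axes followed by F rows of y axes; camera_directions is a hypothetical name.

```python
import numpy as np

def camera_directions(M):
    """Unit viewing direction k = i x j for each frame of a 2F x 3 motion matrix."""
    F = M.shape[0] // 2
    i_axes, j_axes = M[:F], M[F:2 * F]
    k = np.cross(i_axes, j_axes)                  # one cross product per frame
    return k / np.linalg.norm(k, axis=1, keepdims=True)
```

Plotting the three columns of the returned array against the frame index gives one curve each for the x, y, and z components.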