CS143 Introduction to Computer Vision Project 5 - tracking and Structure from motion

Hung-I Chuang Login:hichuang

Introduction

In this project, we reconstruct the 3D shape of an object from a series of video frames. The reconstruction has three steps: selecting keypoints to track, tracking those features across frames using optical flow, and recovering structure from motion.

Algorithm

Brown University - Computer Science

We use the Harris corner detector to find keypoint candidates, and keep the 500 candidates with the highest corner strength as our keypoints. The figure below shows the first frame of the sequence with the 500 selected keypoints.
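The selection step can be sketched in a few lines of NumPy. This is a minimal illustration and not the code used for the project: the function names, the box-shaped summation window, the window radius, and the constant k = 0.04 are all assumptions, and no non-maximum suppression is applied, so the strongest points may cluster around the same corner.

```python
import numpy as np

def box_filter(img, radius):
    """Sum each pixel's (2*radius+1)^2 neighbourhood, using edge padding."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def harris_top_n(image, n_points=500, k=0.04, radius=2):
    """Return the n_points strongest Harris responses as (x, y) pixel coordinates.

    k and radius are assumed values, not taken from the project.
    """
    image = np.asarray(image, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(image)
    # Entries of the second-moment matrix, summed over a local window.
    Sxx = box_filter(Ix * Ix, radius)
    Syy = box_filter(Iy * Iy, radius)
    Sxy = box_filter(Ix * Iy, radius)
    # Harris corner strength: det(M) - k * trace(M)^2.
    response = Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
    # Indices of the strongest responses, best first.
    flat = np.argsort(response, axis=None)[::-1][:n_points]
    rows, cols = np.unravel_index(flat, response.shape)
    return np.stack([cols, rows], axis=1), response
```

Calling harris_top_n(frame, n_points=500) on a grayscale frame returns a 500 x 2 array of (x, y) positions together with the full response map.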

Key Points Selection

We implement a Kanade-Lucas-Tomasi (KLT) tracker for the 500 keypoints selected above. We compute the optical flow between each pair of successive video frames and move the keypoints along the flow to their new positions in the second frame.

From the three KLT assumptions (1. brightness constancy, 2. small motion, 3. spatial coherence) and a least-squares fit, we obtain the formula below for the flow field between two frames.
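The formula itself was an image and did not survive in this text. As an assumption about what it showed, the standard Lucas-Kanade derivation goes as follows. Brightness constancy and small motion give one linear constraint per pixel, and spatial coherence lets us stack these constraints over a window and solve for a single (u, v) by least squares. Here I_x, I_y, and I_t denote the spatial and temporal image derivatives, and the sums run over the pixels of the window.

```latex
% Brightness constancy, linearized under the small-motion assumption:
I_x u + I_y v + I_t = 0

% Least-squares solution over a window (spatial coherence):
\begin{bmatrix}
  \sum I_x^2   & \sum I_x I_y \\
  \sum I_x I_y & \sum I_y^2
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
=
-\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}
```

The 2 x 2 matrix on the left is the same second-moment matrix used by the Harris detector, which is why corner-like points are well suited to tracking: the system is well conditioned there.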

Feature Tracking

We keep tracking the keypoints from the first frame to the last and discard any point that moves outside the image borders. The upper-left figure shows the optical flow between two frames, with the red and green channels representing u and v respectively. The upper-right figure shows the points that move out of the frame (drawn with a green outline). The lower figures show 20 randomly selected keypoints and their paths through the video.
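One tracking step, moving the keypoints along the flow and flagging those that leave the frame, could look like the sketch below. The function name and the nearest-pixel sampling of the flow are assumptions made for brevity; sampling the flow with sub-pixel interpolation would be the more usual choice.

```python
import numpy as np

def advect_keypoints(points, u, v):
    """Move (x, y) keypoints by the flow (u, v) and report which stay in frame.

    points: (N, 2) float array of (x, y) positions.
    u, v:   (H, W) arrays holding the horizontal and vertical flow.
    The flow is sampled at the nearest pixel (a simplifying assumption).
    """
    h, w = u.shape
    cols = np.clip(np.rint(points[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(points[:, 1]).astype(int), 0, h - 1)
    moved = points + np.stack([u[rows, cols], v[rows, cols]], axis=1)
    # A point is kept only if its new position lies inside the image.
    inside = ((moved[:, 0] >= 0) & (moved[:, 0] <= w - 1) &
              (moved[:, 1] >= 0) & (moved[:, 1] <= h - 1))
    return moved, inside
```

Applying this once per frame pair and accumulating the inside mask across the whole sequence leaves only the points that were visible in every frame, which is what the factorization step needs.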

Finally, we use the tracked keypoints as the input to affine structure from motion, and solve the problem using the method in Shape and Motion from Image Streams under Orthography: a Factorization Method (Tomasi and Kanade 1992). We then eliminate the affine ambiguity by solving for QQ^T using least squares and recovering Q by Cholesky decomposition, following the method described in A Sequential Factorization Method for Recovering Shape and Motion from Image Streams.
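The two stages of this step, rank-3 factorization followed by the metric upgrade, can be sketched as below. This is an illustrative version under stated assumptions and not the project's code: the function name is hypothetical, the measurement matrix stacks all x rows above all y rows, and the Cholesky call assumes the estimated QQ^T is positive definite, which can fail on noisy tracks.

```python
import numpy as np

def affine_sfm(xs, ys):
    """Factor tracked points into motion M (2F x 3) and shape S (3 x P).

    xs, ys: (F, P) arrays with the x and y coordinates of P points in F frames.
    """
    F = xs.shape[0]
    # Measurement matrix, with the per-frame centroid removed from each row.
    W = np.vstack([xs, ys]).astype(float)
    W -= W.mean(axis=1, keepdims=True)
    # Rank-3 factorization W ~ M_hat @ S_hat, defined up to a 3x3 matrix Q.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:3])
    M_hat = U[:, :3] * root
    S_hat = root[:, None] * Vt[:3]

    def row(a, b):
        # Coefficients of a^T L b in the 6 unique entries of symmetric L.
        return [a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]]

    # Metric constraints per frame: |i| = 1, |j| = 1, i . j = 0, with L = Q Q^T.
    A, rhs = [], []
    for f in range(F):
        i, j = M_hat[f], M_hat[F + f]
        A += [row(i, i), row(j, j), row(i, j)]
        rhs += [1.0, 1.0, 0.0]
    l = np.linalg.lstsq(np.array(A), np.array(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    # Recover Q from L = Q Q^T; raises LinAlgError if L is not positive definite.
    Q = np.linalg.cholesky(L)
    return M_hat @ Q, np.linalg.inv(Q) @ S_hat
```

The columns of the returned S are the 3D points. The recovered shape and motion are still only defined up to a global rotation (and possibly a reflection), since any orthogonal matrix can be inserted between M and S without violating the constraints.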

The figures below show the reconstructed 3D shape of the hotel from the video. Another figure shows the camera positions, with each color representing one dimension of the position.

Structure from Motion