CS 143 / Project 2 / Local Feature Matching

The goal of this project is to implement a local feature matching algorithm using a simplified version of the SIFT pipeline.

Algorithms

Interest point detection

First we use the Harris corner detector (Algorithm 4.1 in Szeliski) to detect the interest points in the images. To compute the horizontal and vertical derivatives of the image, we convolve the original image with two vectors [-2 -1 0 1 2] and [-2 -1 0 1 2]T. The non-maximum suppression is implemented by using the COLFILT function: after we've thresholded (threashold = 0.01) the cornerness score, COLFILT can apply MAX function on each window of the image so that we can get local maximum conrness on each sliding window.

Local feature description

We implemented a SIFT-like location feature descriptor as described in Szeliski 4.1.2. For each interest point, we take a feature_width*feature_width matrix, divide into a 4*4 grid of cells. For each cell we build a histogram of the gradient in 8 different orientations, which gives us a 128-dimension vector for each interest point. While building the histogram, we also apply a Gaussian filter on the gradient so that a closer point contributes more to the histogram than a farther point. Then the resulting normalized 128-dimension vector is the descriptor of a local feature.

Feature matching

For feature matching we use the nearest neighbour distance ratio test: for each interest point we find its two closest neighbours (in feature description space). If the ratio of the two distance d1/d2 is less than a threshold, we consider the closest neighbour is a match to this interest point. Here we use 0.75 as the ratio threshold.

We also calculate d2-d1 as the confidence score for each feature match. A higher confidence score means it is more likely to be a correct match. By ordering on the confidence score and limiting the output size to 100, the good match rate went from 85% (146/172) to 93% (93/100).

Results

Notre Dame

Here is the top 100 local feature matches from our implementation, where 93 are correct and 7 are incorrect.

Other Examples