CS 143 Project

CS 143 / Project 2 / Local Feature Matching

Overview

The goal of this assignment is to create a local feature matching algorithm using techniques described in Szeliski chapter 4.1. The pipeline we suggest is a simplified version of the famous SIFT pipeline. The matching pipeline is intended to work for instance-level matching -- multiple views of the same physical scene.

Interest Point Detection

Harris corner detector was used to find the interest points. This corner detector will calculate the gradient of each pixel in x and y directions and look for pixels with high variation in both directions. The figure below shows the result of applying this corner detector with a Notre Dame cathedral image and a treshold of 0.1. The white points on the image represent the corners. The black area was tresholded.

Result of Harris corner detector on Notre Dame cathedral image.

Local Feature Description

After finding the interest points, these features were described using a SIFT-like local feature descriptor. The get_features function will, for each interest point, do the following steps:

Get a patch with dimensions given by a parameter (usually, 16x16), with the interest point being on the center of the patch
Divide this patch in a 4x4 grid
For each cell of this grid, build a histogram representing the gradient direction of the pixels in the cell
Concatenate these 16 histograms, in a 128 dimension histogram, which represents the patch

Feature Matching

For the feature matching, the nearest neighbor distance ratio test was used. The function will, for each feature from the first image calculated in the previous step, do the following steps:

Find the two nearest neighbors on the other image
Calculate the ratio of the distances of these two neighbors
Threshold this ratio

The following figure shows an example of applying this algorithm with two Notre Dame cathedral images using a treshold of 0.85. On the image, the circles with green border represents correct matches, while the circles with red border represents wrong matches.

The following figure shows an example of applying this algorithm with the same images but with a treshold of 0.9. As it can be seen, this allowed more false positives, which gives a worse result than the previous treshold.

The figures below show the results of applying the process, but with different tresholds used with the Notre Dame cathedral images in different pairs of images. As it can be seen from the images, it still find some correct matches, but its result is worse compared to the Notre Dame pair.

Conclusions