CS 143 / Project 2 / Local Feature Matching

Overview

The goal of this assignment is to create a local feature matching algorithm using techniques described in Szeliski chapter 4.1. The pipeline we suggest is a simplified version of the famous SIFT pipeline. The matching pipeline is intended to work for instance-level matching -- multiple views of the same physical scene.

Interest Point Detection

Harris corner detector was used to find the interest points. This corner detector will calculate the gradient of each pixel in x and y directions and look for pixels with high variation in both directions. The figure below shows the result of applying this corner detector with a Notre Dame cathedral image and a treshold of 0.1. The white points on the image represent the corners. The black area was tresholded.

Result of Harris corner detector on Notre Dame cathedral image.

Local Feature Description

After finding the interest points, these features were described using a SIFT-like local feature descriptor. The get_features function will, for each interest point, do the following steps:

  1. Get a patch with dimensions given by a parameter (usually, 16x16), with the interest point being on the center of the patch
  2. Divide this patch in a 4x4 grid
  3. For each cell of this grid, build a histogram representing the gradient direction of the pixels in the cell
  4. Concatenate these 16 histograms, in a 128 dimension histogram, which represents the patch

Feature Matching

For the feature matching, the nearest neighbor distance ratio test was used. The function will, for each feature from the first image calculated in the previous step, do the following steps:

  1. Find the two nearest neighbors on the other image
  2. Calculate the ratio of the distances of these two neighbors
  3. Threshold this ratio

The following figure shows an example of applying this algorithm with two Notre Dame cathedral images using a treshold of 0.85. On the image, the circles with green border represents correct matches, while the circles with red border represents wrong matches.

The following figure shows an example of applying this algorithm with the same images but with a treshold of 0.9. As it can be seen, this allowed more false positives, which gives a worse result than the previous treshold.

The figures below show the results of applying the process, but with different tresholds used with the Notre Dame cathedral images in different pairs of images. As it can be seen from the images, it still find some correct matches, but its result is worse compared to the Notre Dame pair.

Conclusions

The Harris corner detection, with SIFT descriptors and nearest neighbor matching can be used to find feature matching between two images from the same scene. This procedure has a lot of free parameters that gives different results. These parameters will change in each pair, and the result of the procedure is also variable according to the image pair.