The goal of this project is to implement a local feature matching algorithm using a simplified version of the SIFT pipeline.
First we use the Harris corner detector (Algorithm 4.1 in Szeliski) to detect the interest points in the images. To compute the horizontal and vertical derivatives of the image, we convolve the original image with two vectors [-2 -1 0 1 2] and [-2 -1 0 1 2]T. The non-maximum suppression is implemented by using the COLFILT
function: after we've thresholded (threashold = 0.01) the cornerness score, COLFILT
can apply MAX
function on each window of the image so that we can get local maximum conrness on each sliding window.
We implemented a SIFT-like location feature descriptor as described in Szeliski 4.1.2. For each interest point, we take a feature_width*feature_width matrix, divide into a 4*4 grid of cells. For each cell we build a histogram of the gradient in 8 different orientations, which gives us a 128-dimension vector for each interest point. While building the histogram, we also apply a Gaussian filter on the gradient so that a closer point contributes more to the histogram than a farther point. Then the resulting normalized 128-dimension vector is the descriptor of a local feature.
For feature matching we use the nearest neighbour distance ratio test: for each interest point we find its two closest neighbours (in feature description space). If the ratio of the two distance d1/d2 is less than a threshold, we consider the closest neighbour is a match to this interest point. Here we use 0.75 as the ratio threshold.
We also calculate d2-d1 as the confidence score for each feature match. A higher confidence score means it is more likely to be a correct match. By ordering on the confidence score and limiting the output size to 100, the good match rate went from 85% (146/172) to 93% (93/100).
![]() ![]() |
![]() ![]() |
![]() ![]() |
![]() ![]() |
![]() ![]() |