The goal of this assignment is to create a local feature matching algorithm using techniques described in Szeliski chapter 4.1. The pipeline we suggest is a simplified version of the famous SIFT pipeline. The matching pipeline is intended to work for instance-level matching -- multiple views of the same physical scene.
The Harris corner detector was used to find the interest points. This detector computes the image gradient at each pixel in the x and y directions and looks for pixels with high variation in both directions. The figure below shows the result of applying this corner detector to a Notre Dame cathedral image with a threshold of 0.1. The white points in the image represent the corners. The black area was removed by the threshold.
![]()
![]()
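As a rough illustration of the corner detection step, the sketch below computes a Harris response map and thresholds it. It is written in plain Python so it runs without dependencies; a practical version would normally use NumPy or SciPy. The function names `harris_response` and `detect_corners`, the constant `k=0.04`, the box-shaped window, and the reading of the 0.1 threshold as a fraction of the maximum response are all assumptions made for this example, not details taken from the project code.

```python
def harris_response(img, k=0.04, radius=1):
    """Harris response R = det(M) - k * trace(M)^2 for a 2D list of floats."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            sxx = sxy = syy = 0.0
            # Sum gradient products over a square window (assumption: a box
            # window is used here; a Gaussian window is the more common choice).
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


def detect_corners(img, rel_threshold=0.1):
    """Return (x, y) pixels whose response exceeds rel_threshold * max response.

    Assumption: the threshold is relative to the strongest response.
    """
    resp = harris_response(img)
    peak = max(max(row) for row in resp)
    if peak <= 0:
        return []
    return [(x, y) for y, row in enumerate(resp)
            for x, r in enumerate(row) if r > rel_threshold * peak]
```

On a synthetic image of a bright square on a dark background, the response is positive at the square's corners, negative along its edges, and zero in flat regions, which is why thresholding keeps only the corner pixels.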
After finding the interest points, these features were described using a SIFT-like local feature descriptor. The get_features function will, for each interest point, do the following steps:
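A minimal sketch of one common SIFT-like descriptor is given here for orientation. It assumes a 16x16 patch around the interest point, split into a 4x4 grid of cells, with an 8-bin gradient orientation histogram per cell (weighted by gradient magnitude) and a final normalization to unit length. These choices, and the helper name `get_descriptor`, are assumptions for this example and may differ from what `get_features` does in this project.

```python
import math


def get_descriptor(img, x, y, width=16, bins=8):
    """Build a 4*4*bins SIFT-like descriptor for the patch centered near (x, y).

    img is a 2D list of floats. Returns None when the patch (plus a one-pixel
    margin for the gradients) does not fit inside the image.
    """
    h, w = len(img), len(img[0])
    half, cell = width // 2, width // 4
    if x - half < 1 or y - half < 1 or x + half > w - 1 or y + half > h - 1:
        return None
    hist = [0.0] * (4 * 4 * bins)
    for py in range(width):
        for px in range(width):
            row, col = y - half + py, x - half + px
            # Central-difference gradient at this pixel.
            gx = (img[row][col + 1] - img[row][col - 1]) / 2.0
            gy = (img[row + 1][col] - img[row - 1][col]) / 2.0
            mag = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Orientation bin; the trailing modulo guards against rounding
            # that would otherwise produce an index equal to bins.
            b = int(angle / (2 * math.pi) * bins) % bins
            c = (py // cell) * 4 + (px // cell)  # index of the 4x4 cell
            hist[c * bins + b] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

With the default parameters the result has 128 entries. This sketch omits refinements found in full SIFT, such as Gaussian weighting, trilinear interpolation between bins, and rotation to a dominant orientation.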
For the feature matching, the nearest neighbor distance ratio test was used. The function will, for each feature from the first image calculated in the previous step, do the following steps:
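The nearest neighbor distance ratio test can be sketched as follows. For each feature in the first image, the two closest features in the second image are found, and the match is kept only when the ratio of the nearest distance to the second-nearest distance is below a threshold. The function name `match_features`, the use of Euclidean distance, and the confidence score `1 - ratio` are assumptions made for this example.

```python
import math


def match_features(feats1, feats2, ratio=0.85):
    """Nearest neighbor distance ratio test.

    Returns (index_in_feats1, index_in_feats2, confidence) triples,
    sorted from most to least confident.
    """
    matches = []
    if len(feats2) < 2:
        return matches  # the ratio needs at least two candidates
    for i, f1 in enumerate(feats1):
        # Distances from this feature to every feature in the second image.
        dists = sorted((math.dist(f1, f2), j) for j, f2 in enumerate(feats2))
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d2 == 0:
            continue  # two exact duplicates: the match is ambiguous
        r = d1 / d2
        if r < ratio:
            matches.append((i, j1, 1.0 - r))
    matches.sort(key=lambda m: m[2], reverse=True)
    return matches
```

A lower `ratio` keeps only matches whose nearest neighbor is much closer than the runner-up, so it returns fewer but more reliable matches; raising it admits more ambiguous matches.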
The following figure shows an example of applying this algorithm to two Notre Dame cathedral images using a threshold of 0.85. In the image, circles with a green border represent correct matches, while circles with a red border represent wrong matches.
The following figure shows an example of applying this algorithm to the same images but with a threshold of 0.9. As can be seen, the higher threshold admitted more false positives, giving a worse result than the previous threshold.
The figures below show the results of applying the same process to other image pairs, using thresholds different from those used for the Notre Dame cathedral images. As the images show, the algorithm still finds some correct matches, but the results are worse than for the Notre Dame pair.
Harris corner detection, combined with SIFT-like descriptors and nearest neighbor matching, can be used to find feature matches between two images of the same scene. This procedure has many free parameters that affect the results. The best parameter values change from one image pair to another, and the quality of the result also varies with the image pair.