For project 2, I was tasked with creating a program that generates SIFT-like features from images and then matches those features to features in other, similar images. The basic pipeline was to gather interest points from two images, create features from those interest points, and then match the features between the two images. This pipeline can be built in a number of different ways and leaves many free parameters. I found that adding just a few lines of code and making seemingly small changes to those free parameters more than doubled my matching accuracy. Over the course of three days of incrementally improving my algorithm, I went from less than 20% accuracy among my top 100 most confident matches to 71%.
![Top 100 most confident matches]() ![Top 300 most confident matches]()
Top 100 and Top 300 Most Confident Matches
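For context on what "most confident" means here: the matching stage scores each candidate match by how distinctive it is, and the accuracy numbers above are computed over the highest-scoring matches. A common way to produce such a score, shown below as a minimal sketch rather than my exact implementation (the function name, threshold, and confidence formula are illustrative), is the nearest-neighbor distance ratio test:

```python
import numpy as np

def match_features(feats1, feats2, ratio_threshold=0.8):
    """Match two sets of feature vectors with the nearest-neighbor
    distance ratio (NNDR) test. A sketch; the 0.8 threshold and the
    confidence formula are illustrative, not tuned values."""
    matches, confidences = [], []
    for i, f in enumerate(feats1):
        # Euclidean distance from this feature to every feature in image 2
        dists = np.linalg.norm(feats2 - f, axis=1)
        nearest, second = np.argsort(dists)[:2]
        ratio = dists[nearest] / dists[second]
        # A low ratio means the best match is much closer than the
        # runner-up, i.e. the match is distinctive.
        if ratio < ratio_threshold:
            matches.append((i, nearest))
            confidences.append(1.0 - ratio)  # higher = more confident
    return np.array(matches), np.array(confidences)
```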
The incremental improvements I made included the following:
Programming a Harris corner detector proved to be the most algorithmically difficult portion of this assignment. However, corner detection was actually relatively simple once I found the following formula: g(Ix²)g(Iy²) − [g(Ix Iy)]² − α[g(Ix²) + g(Iy²)]², where Ix and Iy are the image derivatives, g(·) denotes Gaussian smoothing, and α is a free parameter. For me, the most difficult portion of this assignment was debugging and adjusting parameters. In particular, I spent a large number of hours hunting down what turned out to be two small and relatively simple problems.
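In code, that formula amounts to only a few lines. Here is a minimal NumPy/SciPy sketch (the sigma and alpha defaults are illustrative placeholders, not the parameter values I settled on):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(image, sigma=1.0, alpha=0.06):
    """Harris cornerness score for a grayscale image (sketch).

    sigma controls the g(.) smoothing of the derivative products;
    alpha is the trace weight. Both defaults are illustrative."""
    # Image derivatives in x and y
    Ix = sobel(image, axis=1)
    Iy = sobel(image, axis=0)

    # Gaussian-smoothed products of derivatives -- the g(.) terms
    gIx2 = gaussian_filter(Ix * Ix, sigma)
    gIy2 = gaussian_filter(Iy * Iy, sigma)
    gIxIy = gaussian_filter(Ix * Iy, sigma)

    # g(Ix^2)g(Iy^2) - [g(Ix Iy)]^2 - alpha[g(Ix^2) + g(Iy^2)]^2
    return gIx2 * gIy2 - gIxIy ** 2 - alpha * (gIx2 + gIy2) ** 2
```

Interest points can then be taken as local maxima of this response above some threshold.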
The first of those issues was that I was using a wide Gaussian blur to preprocess my images before looking for interest points. This blurring resulted in a 40 percent decrease in the accuracy of my results. I have since determined that any amount of preprocessing blur, at least for my algorithm, decreases performance. I suspected this might be due to an interaction with the strength of the blur used when calculating image derivatives later in my interest point generation, but after testing I determined that this was not the case.
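To be clear about which blur is which: the preprocessing blur smooths the raw image itself, before any detection happens, and is separate from the g(·) smoothing of derivative products inside the Harris formula. A small illustrative snippet (the sigma value is a placeholder):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

image = np.random.rand(128, 128)  # stand-in for a grayscale input image

# Preprocessing blur: smooths the raw image before interest point
# detection. In my tests, any nonzero sigma here hurt accuracy.
preprocessed = gaussian_filter(image, sigma=2.0)  # illustrative sigma

# By contrast, the g(.) blur in the Harris response smooths the
# *products of derivatives* (Ix*Ix, Iy*Iy, Ix*Iy), not the image.
```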
The second bug that held me up was an even simpler issue: my points were being ordered from least confident to most confident, rather than the other way around. This meant that when I visualized what I thought were my top 100 best matches, I was in fact visualizing my 100 worst matches.
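The fix was a one-line change to the sort order. Roughly, with hypothetical values standing in for the matcher's output:

```python
import numpy as np

# Hypothetical matcher output: higher confidence = better match
confidences = np.array([0.2, 0.9, 0.5, 0.7])
matches = np.array([[0, 3], [1, 1], [2, 0], [3, 2]])

# np.argsort sorts ascending, so slicing the front of this ordering
# yields the *least* confident matches -- exactly my bug.
worst_first = matches[np.argsort(confidences)]

# Negating the key (or reversing the order) puts the best matches first.
best_first = matches[np.argsort(-confidences)]
top_matches = best_first[:100]
```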
Each of the following images demonstrates one of the weaknesses of my feature matching algorithm.
![Matches on two images of the Capitol Building]()
My algorithm actually performs somewhat reasonably on this pair of Capitol Building images. However, it clearly performs nowhere near as well as it does on the Notre Dame images. Among other issues, the difference in scale between the two pictures probably accounts for a significant portion of the mismatches.
![Matches on two images of the Eiffel Tower]()
This set of Eiffel Tower images shows even poorer performance than the first set, demonstrating the trouble SIFT feature matching has with self-similarity: the tower's repeating lattice produces many nearly identical features that are easily confused with one another.
![Matches between a giraffe and Miley Cyrus]()
This third set of images demonstrates that SIFT features are incapable of capturing more abstract similarities, such as those between a giraffe and Miley Cyrus.