CS 143 / Project 2 / Local Feature Matching

Overview

For project 2, I was tasked with creating a program that generates SIFT-like features from images and then matches those features to features in other, similar images. The basic pipeline for this project was to gather interest points from two images, build feature descriptors around those interest points, and then match the features between the two images. This pipeline can be built in a number of different ways and exposes many free parameters. I found that adding just a few lines of code and making seemingly small changes to free parameters more than doubled my algorithm's matching accuracy. Over the course of three days of incrementally improving my algorithm, I went from less than 20% accuracy among my top 100 most confident matches to 71% accuracy.
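
One standard way to implement the final matching stage of this pipeline is the nearest-neighbor distance ratio test. The sketch below (Python with NumPy) is illustrative rather than a copy of my submission; in particular, the 0.8 threshold and the use of the inverted ratio as a confidence score are assumptions:

    import numpy as np

    def match_features(features1, features2, ratio_threshold=0.8):
        """Match descriptors with the nearest-neighbor distance ratio test.

        Each row of features1/features2 is one descriptor. A match is kept
        when its nearest neighbor is much closer than its second-nearest,
        and the confidence is the inverse of that distance ratio.
        """
        # Pairwise Euclidean distances between all descriptor pairs.
        dists = np.linalg.norm(
            features1[:, None, :] - features2[None, :, :], axis=2)

        # Two closest neighbors in image 2 for each descriptor in image 1.
        nearest = np.argsort(dists, axis=1)
        rows = np.arange(len(features1))
        d1 = dists[rows, nearest[:, 0]]
        d2 = dists[rows, nearest[:, 1]]

        ratio = d1 / (d2 + 1e-10)       # small ratio => distinctive match
        keep = ratio < ratio_threshold
        matches = np.stack([np.nonzero(keep)[0], nearest[keep, 0]], axis=1)
        confidences = 1.0 / (ratio[keep] + 1e-10)
        return matches, confidences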

Top 100 and Top 300 Most Confident Matches

The incremental improvements I made included the following:

  1. Using a Gaussian weighting function, so that pixels near an interest point contribute more to the feature than pixels farther away, was worth about 1% accuracy on the Notre Dame test images: removing it caused a 1% decrease. Including this weighting when testing on images of the Capitol Building resulted in what appeared to be a significant increase in feature matching accuracy, possibly because of the additional texture and noise surrounding the interest points in my image of the Capitol.
  2. Precomputing versions of the image filtered with oriented Sobel filters, as opposed to filtering image blocks before calculating gradients, resulted in a 15% increase in feature matching accuracy (see the sketch after this list). This is likely due to the padding that occurred when filtering the smaller image blocks.
  3. Raising the elements of each feature to a power less than 1 (which compresses large values and effectively increases the contrast among the weaker gradients) resulted in a 2% increase in performance on the Notre Dame test images.
  4. Adjusting the alpha parameter of my Harris corner detector resulted in a 3% increase in algorithmic performance.
  5. Taking the absolute value of gradients before summing their intensities into an orientation histogram resulted in a 6% increase in algorithmic performance.
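
To make these changes concrete, here is a minimal sketch of the descriptor construction in Python with NumPy and SciPy. The window size, orientation count, Gaussian width, and the power of 0.7 are illustrative values of my own rather than the tuned parameters referenced above, and I approximate the oriented Sobel responses with directional derivatives of the gradient:

    import numpy as np
    from scipy.ndimage import sobel

    def get_features(image, xs, ys, feature_width=16, n_orientations=8,
                     power=0.7):
        """SIFT-like descriptors: a sketch, not my exact submitted code.

        Assumes every interest point lies at least feature_width/2 pixels
        from the image border and feature_width is divisible by 8.
        """
        # Improvement 2: filter the FULL image once, so there are no
        # padding artifacts from filtering small blocks around each point.
        iy = sobel(image, axis=0)
        ix = sobel(image, axis=1)
        thetas = np.arange(n_orientations) * 2 * np.pi / n_orientations
        # Improvement 5: absolute values before any summing.
        oriented = np.abs(np.cos(thetas)[:, None, None] * ix +
                          np.sin(thetas)[:, None, None] * iy)

        # Improvement 1: Gaussian weights so pixels near the interest
        # point contribute more to the descriptor.
        half = feature_width // 2
        coords = np.arange(-half, half) + 0.5
        g1d = np.exp(-0.5 * (coords / (half / 2)) ** 2)
        weights = np.outer(g1d, g1d)

        features = []
        for x, y in zip(xs, ys):
            window = oriented[:, y - half:y + half, x - half:x + half]
            window = window * weights
            # Sum each orientation channel over a 4x4 grid of cells.
            cell = half // 2
            cells = window.reshape(n_orientations, 4, cell, 4, cell)
            hist = cells.sum(axis=(2, 4)).ravel()   # 8 * 4 * 4 = 128 values
            hist /= np.linalg.norm(hist) + 1e-10
            # Improvement 3: raise to a power below 1.
            features.append(hist ** power)
        return np.array(features)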

Issues

Programming a Harris corner detector proved to be the most algorithmically difficult portion of this assignment. However, corner detection was actually relatively simple once I found the following formula, where g() denotes Gaussian smoothing and Ix and Iy are the image's horizontal and vertical derivatives:

    har = g(Ix^2) * g(Iy^2) - [g(Ix*Iy)]^2 - alpha * [g(Ix^2) + g(Iy^2)]^2

For me, the most difficult portion of this assignment was debugging and adjusting parameters. In particular, I spent a large number of hours hunting down what turned out to be two small and relatively simple problems.
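
Translated into code, that formula looks something like the sketch below. The alpha, sigma, suppression window, and threshold here are illustrative values of my own, not the tuned ones referenced in the list above:

    import numpy as np
    from scipy.ndimage import gaussian_filter, maximum_filter, sobel

    def get_interest_points(image, alpha=0.06, sigma=1.0, threshold=0.01):
        """Harris corner detector: a sketch, not my exact submitted code."""
        iy = sobel(image, axis=0)
        ix = sobel(image, axis=1)

        # g(): Gaussian smoothing of the gradient products.
        gxx = gaussian_filter(ix * ix, sigma)
        gyy = gaussian_filter(iy * iy, sigma)
        gxy = gaussian_filter(ix * iy, sigma)

        # har = g(Ix^2)*g(Iy^2) - [g(Ix*Iy)]^2 - alpha*[g(Ix^2) + g(Iy^2)]^2
        har = gxx * gyy - gxy ** 2 - alpha * (gxx + gyy) ** 2

        # Keep local maxima above a fraction of the strongest response.
        local_max = har == maximum_filter(har, size=7)
        ys, xs = np.nonzero(local_max & (har > threshold * har.max()))
        return xs, ys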

The first of those issues was that I was using a wide Gaussian blur to preprocess my images before looking for interest points. This blurring resulted in a 40% decrease in the accuracy of my results. I've since determined that any amount of preprocessing blur, at least for my algorithm, decreases performance. I suspected that this might be due to its interaction with the blur applied when calculating image derivatives later in interest point generation, but testing showed that this was not the case.

The second bug that held me up was an even simpler issue: my matches were being ordered from least confident to most confident, as opposed to the other way around. This meant that when I visualized what I thought were my top 100 best matches, I was in fact visualizing my 100 worst matches.
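
In NumPy terms, the bug and the fix look like this (with stand-in data; my actual code was structured differently):

    import numpy as np

    rng = np.random.default_rng(0)
    confidences = rng.random(500)   # stand-in match confidences
    matches = np.arange(500)        # stand-in match indices

    # np.argsort sorts ascending, so this picks the 100 WORST matches:
    worst_100 = matches[np.argsort(confidences)[:100]]

    # Reversing the order puts the most confident matches first:
    best_100 = matches[np.argsort(confidences)[::-1][:100]]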

Feature Detection Run On Alternate Images

Each of the following images demonstrates one of the weaknesses of my feature matching algorithm.

My algorithm appears to perform somewhat reasonably on this image of the Capitol Building. However, it clearly performs nowhere near as well as it does on the image of Notre Dame. Among other issues, the difference in scale between the two pictures probably accounts for a significant portion of the mismatches.

This set of Eiffel Tower images shows even poorer performance than the first set. This demonstrates the trouble SIFT feature matching has with self-similarity: descriptors drawn from one part of the tower's repeating lattice match nearly as well to many other parts.

This third set of images demonstrates that SIFT features are incapable of capturing more abstract similarities, such as those between a giraffe and Miley Cyrus.