CS 143 / Project 2 / Local Feature Matching

In this project, the task was to find distinctive interest points in two views of the same object, compute a local feature at each interest point, and finally match similar features between the two images. My implementation is just the baseline, which cannot match points between images that differ drastically in scale or lighting.

Interest Point Detection

I used a basic version of the Harris corner detector. At a high level, the algorithm looks for image patches whose appearance changes drastically when the patch is shifted a short distance in any direction (i.e., patches with low self-similarity under shifts). To accomplish this, the Harris corner detector uses the second moment matrix to compute a "cornerness" score (a proxy for the shape of the local autocorrelation surface) at each pixel, discards any scores below a given threshold, and finally keeps the local maximum within each connected component of the remaining pixels.
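Concretely, this is the standard Harris & Stephens formulation (and what the code below computes), where M is the 2x2 second moment matrix built from the Gaussian-smoothed products of the image derivatives:

M = [Ix2 Ixy; Ixy Iy2]
cornerness = det(M) - alpha*trace(M)^2 = (Ix2.*Iy2 - Ixy.^2) - alpha*(Ix2 + Iy2).^2

with alpha an empirical constant (0.04 here).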


%derivative filter: Sobel (see discussion below); fspecial('sobel') approximates
%the vertical gradient, so its transpose gives the horizontal one
sobel = fspecial('sobel');
x_derivatives = imfilter(image,sobel');
y_derivatives = imfilter(image,sobel);
%smooth the squared/cross derivative images with a Gaussian window
window = fspecial('gaussian',[7 7],3);
Ix2 = imfilter(x_derivatives.*x_derivatives,window);
Iy2 = imfilter(y_derivatives.*y_derivatives,window);
Ixy = imfilter(x_derivatives.*y_derivatives,window);
%calculate cornerness = det(M) - alpha*trace(M)^2
alpha = .04;
M_det = Ix2.*Iy2 - Ixy.^2;
M_tr = Ix2 + Iy2;
cornerness = M_det - alpha*(M_tr).^2;
%threshold: discard weak responses
threshold = 0.05;
cornerness(cornerness < threshold) = 0;

For the derivative filter, I decided to use a Sobel filter. The book recommends a Difference of Gaussian, which I initially tried. It worked when processing the full-size Notre Dame image, but when I ran the 'proj2' file, it didn't play nicely with the downsampling. The Sobel filter was the only derivative filter that worked ([-1 0 1] didn't work either). I also had to suppress outputs near the edges, because any points which can't have a 16x16 window drawn around them break get_features. Otherwise no real design decisions were made. Pretty straightforward.
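The snippet above stops at thresholding; the local-maximum-per-connected-component step (and the border suppression just mentioned) comes afterwards. A minimal sketch of that step, assuming the Image Processing Toolbox's bwconncomp is available and a 16-pixel border is being cleared, could look like:

cornerness([1:16 end-15:end],:) = 0;   %suppress responses too close to the border
cornerness(:,[1:16 end-15:end]) = 0;
components = bwconncomp(cornerness > 0);   %connected components of surviving pixels
x = zeros(components.NumObjects,1);
y = zeros(components.NumObjects,1);
for c = 1:components.NumObjects
    pixels = components.PixelIdxList{c};
    [~,best] = max(cornerness(pixels));   %strongest response within this component
    [y(c),x(c)] = ind2sub(size(cornerness),pixels(best));
end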

Feature Construction

The next step of the image processing pipeline is the construction of a feature at each interest point. In general, a feature is a set of distinguishing information about a point in the image. Here, each feature is a 128-dimensional vector formed by concatenating gradient-orientation histograms computed over a grid of subwindows around the interest point.


%gradients over the whole image (a simple difference filter worked better here than Sobel)
x_gradients = imfilter(image,[-1 0 1]);
y_gradients = imfilter(image,[-1 0 1]');
features = zeros(length(x),128);
for k = 1:length(x)
    %16x16 window of gradients around the interest point
    x_window = x(k)-7:x(k)+8;
    y_window = y(k)-7:y(k)+8;
    x_feat_gradients = x_gradients(y_window,x_window);
    y_feat_gradients = y_gradients(y_window,x_window);
    angles = atan2(y_feat_gradients,x_feat_gradients);
    %bin edges are offset by -pi/8 so each bin is centered on a compass direction;
    %shift anything below the first edge up by 2*pi so it bins correctly
    bins = [0 pi/4 pi/2 3*pi/4 pi 5*pi/4 3*pi/2 7*pi/4 2*pi] - (pi/8);
    angles(angles < bins(1)) = angles(angles < bins(1)) + 2*pi;
    %break the 16x16 window into a 4x4 grid of 4x4 subwindows, 8 orientation bins each
    feat = zeros(1,128);
    corners = [1 5 9 13];
    for i = 1:length(corners)
        for j = 1:length(corners)
            subwindow = angles(corners(i):corners(i)+3,corners(j):corners(j)+3);
            start = 1 + 32*(i-1) + 8*(j-1);
            count = histc(subwindow(:),bins);
            feat(start:start+7) = count(1:8);
        end
    end
    %normalize, then raise to a power < 1 to compress the larger entries
    features(k,:) = (feat/norm(feat)).^.88;
end

For each interest point, I just calculate the angle of the gradient at each pixel in its neighborhood and throw those angles into a series of histograms. The gradients are calculated using a different filter here, because Sobel wasn't giving me promising results. Again, pretty straightforward. I add 2*pi to any angles that fall below the first bin edge so everything bins consistently, and at the end I take everything to a power less than 1, which compresses the larger histogram entries relative to the smaller ones and makes the vectors easier to compare in the next step. I arrived at 0.88 after messing around and finding that it gave me pretty decent matches.
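To see what the sub-unity exponent actually does, here's a made-up example (the numbers are arbitrary): the exponent lifts the small normalized entries proportionally more than the large ones, so a few strong gradients don't dominate the distance comparisons.

v = [0.9 0.3 0.05];   %made-up normalized histogram entries
v.^0.88               %roughly [0.91 0.35 0.07]: the small entry gains the most, relatively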

Feature Matching

The final step of the pipeline is feature matching, i.e. deciding which points in one image correspond to points in another, similar image (one that differs slightly, e.g. in lighting or viewpoint). Here, a simple ratio of Euclidean distances is used to decide which points match and how well.


threshold = 0.85;
matches = zeros(num_features1,2);
confidences = zeros(num_features1,1);

for i = 1:num_features1
    %Euclidean distance from feature i in image 1 to every feature in image 2
    distances = sqrt(sum((repmat(features1(i,:),num_features2,1)-features2).^2,2));
    [dists,indices] = sort(distances,'ascend');
    %ratio test: nearest-neighbor distance over second-nearest-neighbor distance
    nnmeasure = dists(1)/dists(2);
    if nnmeasure < threshold
        matches(i,:) = [i indices(1)];
        confidences(i) = 1-nnmeasure;
    end
end

Here, as described in the textbook, I threshold on the ratio of the distance (in feature space) to a point's nearest neighbor over the distance to its second-nearest neighbor. I picked the threshold to be 0.85 because it was the smallest threshold that would give me more than 20 good matches. It doesn't really seem that harsh to demand that the best match be 15% closer than the next-best match, but apparently that's a pretty tall order.
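One bit of housekeeping the snippet above leaves out: rows of matches that never pass the ratio test stay all zeros, so they need to be stripped (and it's convenient to sort by confidence) before visualizing the best matches. A minimal sketch, assuming matches and confidences were preallocated to zeros as above:

valid = matches(:,1) > 0;   %keep only the features that actually found a match
matches = matches(valid,:);
confidences = confidences(valid);
[confidences,order] = sort(confidences,'descend');   %strongest matches first
matches = matches(order,:);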

Results and Comparison

Best 20 matches for Notre Dame example

All matches for Notre Dame example (threshold = 0.95, 35 good, 111 bad)

The algorithm performs pretty poorly on the given Notre Dame data, likely because my code contains few of the bells and whistles that would increase accuracy. I actually get around 50% match accuracy (13 good, 15 bad) when leaving the threshold at 0.85, but raising it to 0.95 gives more points and a (I think) more representative percentage of about 24%. Many of the poor matches are wildly off; quite a few leap between the two (admittedly similar) spires of the cathedral, since there's no geometric verification going on.

The above images show the deteriorating performance of the algorithm as the object under consideration rotates; they compare views of the object differing by rotations of 20, 30, and 40 degrees, respectively. The first two comparisons are pretty good, with zero misses in the first and only two in the second. The third, at double the rotation of the first, ends up wildly mismatching everything but the green/yellow/pink cluster on the front edge.

The images used in the last example are from the Amsterdam Library of Object Images (http://staff.science.uva.nl/~aloi/). I would have included an example showing deterioration under lighting changes, but the lighting variations in their data sets aren't drastic enough.