I stuck fairly close to the baseline implementation for all parts of the project, but found that I was still able to achieve relatively good performance.
I implemented a simple Harris corner detector.
The code for this algorithm is relatively straightforward and explains my method well:
function [x, y, confidence, scale, orientation] = get_interest_points(image, feature_width)
% 1. Compute derivatives using the Sobel filter
s = [1, 0, -1; 2, 0, -2; 1, 0, -1];
dx = imfilter(image, s);
dy = imfilter(image, s');
% Entries of the second moment matrix are elementwise products of the derivatives
dx2 = dx .^ 2;
dy2 = dy .^ 2;
dxdy = dx .* dy;
% 2. Filter with a large Gaussian
g = fspecial('gaussian', 25, 1.5);
ax2 = imfilter(dx2, g);
ay2 = imfilter(dy2, g);
axy = imfilter(dxdy, g);
% 3. Compute scalar interest measure
alpha = 0.06;
har = (ax2 .* ay2) - axy .^ 2 - alpha .* (ax2 + ay2) .^ 2;
% 4. Suppress features close to edges
% (we wouldn't be able to build a descriptor there anyway)
har(1 : feature_width, :) = 0;
har(end - feature_width : end, :) = 0;
har(:, 1 : feature_width) = 0;
har(:, end - feature_width : end) = 0;
% 5a. Find connected components
CC = bwconncomp(im2bw(har, graythresh(har)));
% 5b. Take the max from each component region
x = zeros(CC.NumObjects, 1);
y = zeros(CC.NumObjects, 1);
for i = 1 : CC.NumObjects
region = CC.PixelIdxList{i};
[~, ind] = max(har(region));
[y(i), x(i)] = ind2sub(size(har), region(ind));
end
end
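For readers outside MATLAB, the same pipeline can be sketched in Python/NumPy. This is a hypothetical port, not the submitted code; `scipy.ndimage` stands in for `imfilter`/`fspecial`, and the kernel sizes are assumptions:

```python
import numpy as np
from scipy import ndimage

def harris_response(image, sigma=1.5, alpha=0.06):
    """Compute the Harris cornerness map for a grayscale float image."""
    sobel = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
    dx = ndimage.convolve(image, sobel)    # horizontal derivative
    dy = ndimage.convolve(image, sobel.T)  # vertical derivative
    # Gaussian-weighted entries of the second moment matrix M
    ax2 = ndimage.gaussian_filter(dx * dx, sigma)
    ay2 = ndimage.gaussian_filter(dy * dy, sigma)
    axy = ndimage.gaussian_filter(dx * dy, sigma)
    # har = det(M) - alpha * trace(M)^2
    return ax2 * ay2 - axy ** 2 - alpha * (ax2 + ay2) ** 2

# A flat patch has zero response everywhere; a step corner scores positive.
flat = np.zeros((32, 32))
corner = np.zeros((32, 32))
corner[16:, 16:] = 1.0  # bright quadrant -> corner near (16, 16)
```

The non-maximum suppression step (connected components, one maximum per region) is omitted here since it is spelled out in the MATLAB above.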
This algorithm generated a reasonable set of interest points for each image. Below are images representing the sets returned for the two main test images.
![]() ![]()
![]() ![]()
I implemented the baseline SIFT-like feature descriptors as outlined in the header comment of the function stencil.
The basic approach is implemented in the following code:
function [features] = get_features(image, x, y, feature_width)
angle_bins = -180:45:180;
features = zeros(length(x), 128);
% for each keypoint
for i = 1 : length(x)
% Get window
window = image(y(i) - feature_width/2 : y(i) + feature_width/2 - 1, ...
x(i) - feature_width/2 : x(i) + feature_width/2 - 1);
% Get gradient of window
[gmag, gdir] = imgradient(window);
% Weigh magnitudes with gaussian
gmag_weighted = imfilter(gmag, fspecial('gaussian', 10, sqrt(feature_width/2)));
% Transform to cells
gmag_cols = im2col(gmag_weighted, [feature_width/4, feature_width/4], 'distinct');
gdir_cols = im2col(gdir, [feature_width/4, feature_width/4], 'distinct');
% for each cell in 4x4 array
descriptor = zeros(1, 128);
for j = 1 : size(gdir_cols, 2) % one column per cell
col = gdir_cols(:, j);
[~, inds] = histc(col, angle_bins);
inds(inds > 8) = 8; % histc puts values equal to 180 in a ninth bin; fold it into the last
buckets = zeros(1, 8); % Sum magnitudes for each histogram bucket
for k = 1 : length(inds)
buckets(inds(k)) = buckets(inds(k)) + gmag_cols(k, j);
end
start_ind = (j - 1) * 8 + 1; % Compute start and end indices into descriptor
end_ind = (j - 1) * 8 + 8;
descriptor(1, start_ind : end_ind) = buckets;
end
descriptor = descriptor / norm(descriptor); % Normalize descriptor
features(i, :) = descriptor; % Add to features
end
% Trick: raise each element of the feature matrix to a power less than one
power = 0.8;
features = features .^ power;
end
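The cell-histogram step is the easiest part to get wrong, so here is a hedged NumPy sketch of the same idea (a hypothetical illustration, not the submitted code): split a gradient patch into a 4x4 grid of cells and accumulate a magnitude-weighted 8-bin orientation histogram per cell.

```python
import numpy as np

def sift_descriptor(gmag, gdir, feature_width=16):
    """Build a 128-D descriptor from gradient magnitude/direction patches.

    gmag, gdir: (feature_width, feature_width) arrays; gdir in degrees [-180, 180].
    """
    cell = feature_width // 4
    hists = []
    for cy in range(4):
        for cx in range(4):
            m = gmag[cy*cell:(cy+1)*cell, cx*cell:(cx+1)*cell].ravel()
            d = gdir[cy*cell:(cy+1)*cell, cx*cell:(cx+1)*cell].ravel()
            # 8 orientation bins of 45 degrees; fold the +180 edge into the last bin
            bins = np.minimum(((d + 180.0) // 45.0).astype(int), 7)
            hists.append(np.bincount(bins, weights=m, minlength=8))
    v = np.concatenate(hists)  # 16 cells x 8 bins = 128 dimensions
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

The Gaussian magnitude weighting and the power-law trick from the MATLAB version are left out to keep the sketch focused on the binning.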
I decided to complete the baseline, brute-force implementation, computing the Euclidean distance between each pair of features.
I then sorted each column of the resulting matrix to obtain the nearest and second nearest neighbors for each feature in the first set.
After computing the ratio of the distances to these two neighbors, I rejected all matches in which it was greater than 0.8.
My confidence values were simply one minus the ratio of nearest neighbor distances.
function [matches, confidences] = match_features(features1, features2)
num_features1 = size(features1, 1);
num_features2 = size(features2, 1);
matches = zeros(num_features1, 2);
confidences = zeros(num_features1, 1);
% find the Euclidean distance between each pair of descriptors in feature
% space
distances = zeros(num_features2, num_features1);
for i = 1 : num_features1
for j = 1 : num_features2
distances(j, i) = norm(features1(i, :) - features2(j, :));
end
end
thresh = 0.8;
[sorted_distances, inds] = sort(distances); % sort distances in ascending order
% calculate ratio of nearest neighbor distance to second nearest neighbor distance;
% if below the threshold, record the match and save one minus the ratio as confidence
for i = 1 : num_features1
ratio = sorted_distances(1, i) / sorted_distances(2, i);
if ratio < thresh
matches(i, :) = [i, inds(1, i)];
confidences(i) = 1 - ratio;
end
end
% keep only those that matched
match_inds = find(confidences > 0);
matches = matches(match_inds, :);
confidences = confidences(match_inds);
% Sort the matches so that the most confident ones are at the top of the list
[confidences, ind] = sort(confidences, 'descend');
matches = matches(ind, :);
end
For the Notre Dame test images, 96 of my 100 highest confidence pairings correctly matched.
![]()
![]()
Of the total matches, 210 of 229 were correct (92%).
![]()
![]()
The only improvement I made was to perform the 'trick' mentioned in the feature descriptor stencil code (i.e. raising each element in the descriptor to a power less than one).
Other than tweaking a few parameters (I raised the nearest neighbor ratio threshold from 0.6 to 0.8), I stuck almost entirely to the baseline implementation, and found that it matched local features well on the initial test images.
Other images proved trickier: because my implementation does not account for scale or orientation, its performance suffered on them. Below are the results of matching features on two other image pairs:
Both have some correct matches, but neither comes close to the performance I obtained on the Notre Dame images.