CS129 / Project 6 / Automated Panorama Stitching

For this project, I implemented a basic version of the image stitching algorithms detailed in 'Multi-Image Matching using Multi-Scale Oriented Patches'. We can use the technique outlined in the paper to construct panoramas from multiple images. In the simplest case, however, we use the technique to stitch together two images that were taken pointing in slightly different directions. The images below show example input images.

The technique can be broken down into a number of separate steps. I outline my implementation of each as follows:

Harris Interest Point Detector

The Harris Interest Point Detector was already implemented for us for this project. Given an image, it determines locations in the image that constitute 'interesting' points. These are typically things that we can think of as 'corners' - importantly, features whose locations remain stable under projective transformations, so they can be found again in a second view. For example, the following images demonstrate a good point and a bad point - the good point is clearly a corner, whereas the bad point is a random point along an edge.
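
The detector was provided for us, but its core computation is compact enough to sketch. A rough Python/NumPy version (illustrative only, not the provided code; the sigma and k values here are conventional choices, not parameters taken from the handout):

    import numpy as np
    from scipy import ndimage

    def harris_response(image, sigma=1.0, k=0.04):
        """Harris corner response for a grayscale float image (a sketch)."""
        # Image gradients in x and y
        Ix = ndimage.sobel(image, axis=1)
        Iy = ndimage.sobel(image, axis=0)
        # Entries of the second-moment matrix, smoothed with a Gaussian window
        Ixx = ndimage.gaussian_filter(Ix * Ix, sigma)
        Iyy = ndimage.gaussian_filter(Iy * Iy, sigma)
        Ixy = ndimage.gaussian_filter(Ix * Iy, sigma)
        # Corner response: det(M) - k * trace(M)^2; it is large only where
        # both eigenvalues of M are large, i.e. at corner-like points
        det = Ixx * Iyy - Ixy ** 2
        trace = Ixx + Iyy
        return det - k * trace ** 2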

Adaptive Non-Maximal Suppression

Once we have the interest points from the detector, the next step is to choose a good representative sample from the set. The algorithm makes sure that we choose points that score highly according to the metric used by the Harris Detector, but also spreads them out so that they are not all clustered together in a single region of the image.
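
Following the adaptive non-maximal suppression scheme from the paper, each point's suppression radius is the distance to its nearest sufficiently stronger neighbour, and we keep the points with the largest radii. A brute-force sketch of the idea (c_robust = 0.9 follows the paper; num_points is an illustrative assumption):

    import numpy as np

    def anms(coords, responses, num_points=500, c_robust=0.9):
        """Adaptive non-maximal suppression over (N, 2) point coordinates.

        Each point's suppression radius is the distance to the nearest point
        whose robustified response is strictly greater; we keep the
        num_points points with the largest radii.
        """
        n = len(coords)
        radii = np.full(n, np.inf)  # the global maximum keeps radius infinity
        for i in range(n):
            # Points 'sufficiently stronger' than point i (responses assumed > 0)
            stronger = responses > responses[i] / c_robust
            if stronger.any():
                dists = np.linalg.norm(coords[stronger] - coords[i], axis=1)
                radii[i] = dists.min()
        # Strong, well-separated points have the largest suppression radii
        return np.argsort(-radii)[:num_points]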

Feature descriptor extraction

The next part of the project is to extract a feature descriptor for a small neighbourhood surrounding each interest point. Ideally, the feature descriptor would be rotation-invariant and have sub-pixel accuracy. For this implementation, I simply smooth the neighbourhood by filtering the image and normalize the patches. This still gives satisfactory results. The image patches above are examples of the outputs of this step; they are further normalized to have a mean of 0 and a standard deviation of 1.
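
A sketch of this step (the patch size and blur sigma are illustrative assumptions rather than the exact values I used):

    import numpy as np
    from scipy import ndimage

    def extract_descriptors(image, coords, patch_size=8, blur_sigma=2.0):
        """Normalized patch descriptors around (row, col) integer coordinates.

        The image is smoothed first so each patch samples a blurred
        neighbourhood, then every patch is normalized to zero mean and unit
        standard deviation. Assumes each point lies at least half a patch
        away from the image border.
        """
        smoothed = ndimage.gaussian_filter(image.astype(float), blur_sigma)
        half = patch_size // 2
        descriptors = []
        for r, c in coords:
            patch = smoothed[r - half:r + half, c - half:c + half]
            # Bias/gain normalization makes matching robust to lighting changes
            patch = (patch - patch.mean()) / (patch.std() + 1e-8)
            descriptors.append(patch.ravel())
        return np.array(descriptors)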

Feature matching

Having performed the previous steps on two input images A and B, the next step is to compare the sets of features from the two images and match them together. For each feature from A, we find the 'closest' feature from B according to a distance metric, and we do the same from B to A. The result is a set of pairs of features, one from image A and one from image B. We reject outliers using Lowe's ratio of 1-NN to 2-NN distances, as described in the paper, which eliminates about 80% of false positives at a cost of about 10% of correct matches. The following image shows the final interest point sets for images A and B.
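
A sketch of the matching in one direction with the ratio test (the 0.6 threshold is an assumption; running the same function from B to A and intersecting the results gives the symmetric matching described above):

    import numpy as np

    def match_features(desc_a, desc_b, ratio=0.6):
        """Match rows of desc_a to rows of desc_b with Lowe's ratio test.

        A match is accepted only when the nearest neighbour in B is much
        closer than the second nearest, which filters out ambiguous matches.
        Assumes desc_b has at least two rows.
        """
        # Pairwise Euclidean distances between all descriptor pairs
        diffs = desc_a[:, None, :] - desc_b[None, :, :]
        dists = np.sqrt((diffs ** 2).sum(axis=2))
        matches = []
        for i in range(len(desc_a)):
            nn1, nn2 = np.argsort(dists[i])[:2]
            if dists[i, nn1] < ratio * dists[i, nn2]:
                matches.append((i, nn1))
        return matches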

Recover the homography using RANSAC

The next step in the process is to use the pairs of corresponding features to determine the homography that maps one image onto the other. This is a fairly straightforward task - we can select four points from image A and their corresponding points in image B, then solve a system of linear equations to obtain the homography (four point pairs give eight equations, enough for the eight degrees of freedom of a homography). The RANSAC algorithm then determines how many of the remaining point pairs also line up under this homography. The homography selected as the 'correct' one is the one with the greatest number of such inliers.
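
A sketch of both pieces - the direct linear transform solve from four point pairs and the RANSAC loop around it (the iteration count and pixel threshold are illustrative assumptions):

    import numpy as np

    def homography_from_points(src, dst):
        """Direct linear transform: 3x3 homography mapping src -> dst."""
        rows = []
        for (x, y), (u, v) in zip(src, dst):
            rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
            rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
        # The homography entries form the null vector of this system
        _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
        return vt[-1].reshape(3, 3)

    def ransac_homography(pts_a, pts_b, iters=1000, threshold=3.0, seed=0):
        """Fit homographies to random 4-point samples; keep the one that the
        most remaining matched pairs agree with. pts_a and pts_b are (N, 2)
        arrays of matched (x, y) coordinates."""
        rng = np.random.default_rng(seed)
        ones = np.ones((len(pts_a), 1))
        pts_a_h = np.hstack([pts_a, ones])  # homogeneous coordinates
        best_h, best_inliers = None, -1
        for _ in range(iters):
            idx = rng.choice(len(pts_a), size=4, replace=False)
            # A degenerate (e.g. collinear) sample simply scores few inliers
            h = homography_from_points(pts_a[idx], pts_b[idx])
            projected = pts_a_h @ h.T
            projected = projected[:, :2] / projected[:, 2:3]  # dehomogenize
            errors = np.linalg.norm(projected - pts_b, axis=1)
            inliers = int((errors < threshold).sum())
            if inliers > best_inliers:
                best_h, best_inliers = h, inliers
        return best_h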

At this point, we have successfully lined up the images. Various further techniques can be used to blend the images together. The following images show the results of applying the algorithm to various photos.
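
For reference, a sketch of one simple way to perform the final warp and blend, with plain averaging in the overlap standing in for the more sophisticated blending techniques mentioned above:

    import numpy as np
    from scipy import ndimage

    def warp_and_blend(img_a, img_b, h_ab, out_shape):
        """Warp img_b into image A's frame and average the overlap.

        h_ab maps image A (output) coordinates to image B coordinates,
        i.e. the homography recovered from the A -> B matches above.
        Images are 2-D float arrays; out_shape is (height, width).
        """
        ys, xs = np.indices(out_shape)
        coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
        # Inverse warping: find each output pixel's source location in img_b
        src = h_ab @ coords
        src = src[:2] / src[2]
        warped_b = ndimage.map_coordinates(
            img_b.astype(float),
            [src[1].reshape(out_shape), src[0].reshape(out_shape)],
            order=1, cval=np.nan)
        canvas = np.full(out_shape, np.nan)
        canvas[:img_a.shape[0], :img_a.shape[1]] = img_a
        # Average where both images are defined; elsewhere keep whichever exists
        stack = np.stack([canvas, warped_b])
        valid = ~np.isnan(stack)
        total = np.where(valid, stack, 0.0).sum(axis=0)
        counts = valid.sum(axis=0)
        return np.where(counts > 0, total / np.maximum(counts, 1), 0.0)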