Project 6 Writeup: Automated Panorama Stitching
Basia Korel (bkorel)
April 17, 2010
This project entailed seamlessly stitching two or more images together to create a panorama.
Input
The first step of this project is to start with a set of images taken from the same position, so that they share the same center of projection.
Left Input Image
Right Input Image
Recover Homographies
A projective transformation matrix H must be calculated before warping one image into the second. The transformation is a homography, p' = Hp, which can be solved for using a system of linear equations. p' and p are vectors of correspondence points, which in the first part of the project are defined by the user. A projective mapping has eight degrees of freedom, so at least four pairs of correspondence points are needed to solve for the eight unknowns. I used the transformation matrix defined in the paper "Projective Mappings for Image Warping". The transformation can be solved for using x = A\b in Matlab, where A is the matrix of linear equations and b is a column vector of correspondence points for the left image. Although 4 pairs of correspondence points suffice to solve for the transformation, I generated the A matrix in a for loop so that any number of correspondence points can be used.
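The writeup's Matlab code is not shown; below is a minimal NumPy sketch of the same least-squares solve (the function name and point layout are my own). Each correspondence contributes two rows to A, and the system is solved in the same spirit as Matlab's A\b.

```python
import numpy as np

def compute_homography(src, dst):
    """Solve for the 3x3 homography H mapping src points to dst points.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Builds the linear system A h = b row by row (as in the writeup's
    for loop), then solves by least squares, the NumPy analogue of
    Matlab's A\b. The ninth entry h33 is fixed to 1.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # x' = (a x + b y + c) / (g x + h y + 1), similarly for y'
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly 4 pairs the system is square and the solution exact; with more pairs the least-squares fit spreads the error over all correspondences.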
User-defined correspondence points
Warp Images
The right image is projectively warped using the computed homography. An inverse warp is performed, mapping each pixel in the output back to a location in the input image. Pixel values are resampled with bilinear interpolation to avoid aliasing.
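The inverse-warping step can be sketched as follows in NumPy (a slow per-pixel version for clarity; the function name and grayscale-image assumption are mine, not from the writeup):

```python
import numpy as np

def inverse_warp(img, H, out_h, out_w):
    """Inverse warp: for each output pixel, map back through H^-1 and
    bilinearly sample the input image. H maps input homogeneous (x, y)
    coordinates to output coordinates; img is a 2D grayscale array.
    Pixels that map outside the input stay 0.
    """
    Hinv = np.linalg.inv(H)
    in_h, in_w = img.shape
    out = np.zeros((out_h, out_w), dtype=float)
    for y in range(out_h):
        for x in range(out_w):
            sx, sy, sw = Hinv @ (x, y, 1.0)
            sx, sy = sx / sw, sy / sw            # dehomogenize
            x0, y0 = int(np.floor(sx)), int(np.floor(sy))
            if 0 <= x0 < in_w - 1 and 0 <= y0 < in_h - 1:
                dx, dy = sx - x0, sy - y0        # bilinear weights
                out[y, x] = ((1 - dx) * (1 - dy) * img[y0, x0]
                             + dx * (1 - dy) * img[y0, x0 + 1]
                             + (1 - dx) * dy * img[y0 + 1, x0]
                             + dx * dy * img[y0 + 1, x0 + 1])
    return out
```

Mapping from output to input (rather than forward-warping input pixels) guarantees every output pixel gets a value, with no holes.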
Composite Warped Images
The size of the final composited image is computed by warping the corners of the input images. The images are then composited into a mosaic by copying the left image into the warped (right) image.
Detect Features
Instead of allowing the user to specify correspondence points, the second part of this project consisted of implementing automatic feature detection and matching based on the paper "Multi-Image Matching using Multi-Scale Oriented Patches". First, the Harris corner detection algorithm, which was provided by the TAs, is used to extract features, or interest points, in an image using both corner and edge detection. Adaptive Non-Maximal Suppression (ANMS) is then applied to limit the number of features extracted, while ensuring that the remaining features are spatially well distributed and their corner strengths are relatively high. ANMS defines a minimal suppression radius as follows:

r_i = min_j |x_i - x_j|, subject to f(x_i) < c_robust * f(x_j), for x_j in I

where x_i is an interest point in the image, I is the set of all interest points, f(x_i) is the corner strength for the given interest point, and c_robust = 0.9. A minimal suppression radius is calculated for every interest point, and the 500 interest points with the largest radii are selected.
All results from Harris Interest Point Detector:
Strongest 500 from Harris:
500 Features from ANMS:
Match Features
Next we must find correspondences between pairs of features from the two images. First, a feature descriptor is extracted for each of the 500 interest points output from ANMS. Feature descriptors are useful because, for a given interest point, they provide a local description of the image that can be compared across images for matching. A 7x7 patch of pixels around each interest point is sampled with a spacing of 5 pixels between samples. The Harris corner detector does not select interest points within 15 pixels of the image border, so I did not have to handle special cases when sampling pixels. Each feature vector is normalized by subtracting the mean and dividing by the standard deviation.
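Descriptor extraction amounts to a strided slice plus normalization; a NumPy sketch (the function name is mine):

```python
import numpy as np

def feature_descriptor(img, x, y, size=7, spacing=5):
    """Sample a size x size grid of pixels centered on (x, y), with
    `spacing` pixels between samples (a 31x31 footprint for the
    defaults, matching the 15-pixel border margin), then normalize to
    zero mean and unit standard deviation.
    """
    r = (size // 2) * spacing                    # 15 for the defaults
    patch = img[y - r:y + r + 1:spacing, x - r:x + r + 1:spacing]
    v = patch.astype(float).ravel()
    return (v - v.mean()) / v.std()
```

Normalizing to zero mean and unit variance makes the descriptor invariant to affine changes in brightness, so the same corner photographed under different exposure still produces a similar vector.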
To match features, I compute the distance (error) between every possible pair of features from the left and right images, using the dist2 function provided by the TAs. For each feature, an error ratio is computed: the error of its best match divided by the error of its second-best match. Thresholding on this ratio is better than thresholding on the best-match distance alone because "the distributions of e1/e2 for correct and incorrect matches are better separated than the distributions of e1 alone."
I threshold on ((max(ratio) + min(ratio)) / 2), and every pair of features with a ratio above the threshold is discarded.
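The ratio-test matching can be sketched as follows (a NumPy version of the dist2-style pairwise squared distances; the function name is mine):

```python
import numpy as np

def match_features(desc1, desc2):
    """Ratio-test matching between two descriptor sets, each (N, D).

    Computes squared distances between every descriptor pair (as the
    TAs' dist2 does), then for each feature in image 1 takes
    ratio = best error / second-best error. Pairs whose ratio falls at
    or above the midpoint of the ratio range, (max + min) / 2, are
    discarded. Returns matched index arrays (into desc1 and desc2).
    """
    d2 = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    order = np.argsort(d2, axis=1)
    rows = np.arange(len(desc1))
    best = d2[rows, order[:, 0]]
    second = d2[rows, order[:, 1]]
    ratio = best / second
    keep = ratio < (ratio.max() + ratio.min()) / 2
    return np.flatnonzero(keep), order[keep, 0]
```

A low ratio means the best match is much closer than the runner-up, i.e. the match is unambiguous; ambiguous features (ratio near 1) are exactly the ones this test throws away.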
Resulting matched features:
Robustly Recover Homographies
The feature pairs resulting from the matching phase will still contain outliers. The RANdom SAmple Consensus (RANSAC) algorithm is used to eliminate the remaining outliers and compute a robust homography. First, four feature pairs are selected at random. Second, a homography is computed from those four pairs. Third, the homography is applied to all of the features, and the pairs satisfying SSD(p', Hp) < 0.5 are stored as inliers. Steps 1 through 3 are repeated for 1000 iterations, keeping the largest set of inliers found. Finally, the projective transformation matrix is recomputed using all of the inliers. Projective warping and compositing follow to produce the final mosaic.
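The RANSAC loop described above can be sketched in NumPy as follows (function names, the embedded least-squares homography fit, and the seeded RNG are my own choices, not the writeup's code):

```python
import numpy as np

def solve_homography(src, dst):
    """Least-squares homography (8 unknowns, h33 = 1) from >= 4 pairs."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(src, dst, iters=1000, tol=0.5, rng=None):
    """RANSAC: fit H to 4 random pairs, count pairs with
    SSD(dst, H*src) < tol as inliers, keep the largest inlier set over
    all iterations, then refit H on every inlier.
    """
    rng = rng or np.random.default_rng(0)
    P = np.hstack([src, np.ones((len(src), 1))])
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        pick = rng.choice(len(src), 4, replace=False)
        H = solve_homography(src[pick], dst[pick])
        proj = P @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        err = ((proj - dst) ** 2).sum(axis=1)    # SSD per pair
        inliers = err < tol
        if inliers.sum() > best.sum():
            best = inliers
    return solve_homography(src[best], dst[best]), best
```

Refitting on the full inlier set at the end is what makes the final homography robust: the 4-point sample only has to be good enough to separate inliers from outliers.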
Inliers recovered from RANSAC: