CS129 Project #6: Automatic Panorama Construction

Bryce Aebi
November 25, 2012

Introduction

The object of this project was to create panoramic images composed of several photographs of the same scene. At a high level, this is done by finding corresponding "interest points" between images. From these corresponding points, it is possible to derive a homography that allows one image to be "warped" to achieve the same perspective as the first. (One cannot simply overlap different images of the same scene taken from the same position, as angles and distances change depending on the direction the photograph was taken). The corresponding parts of the images can be overlapped and the process can be repeated to append additional photos to the panorama.

Diana's Bath, New Hampshire

Defining Correspondences

To find correspondences between two images, it is first necessary to find the "feature points" of each image. Corners are chosen to be the feature point in this project. If one examines a square "window" of pixels around a corner in an image, any shift in this window should change the average intensity of the window. If this window was instead placed at any point on a white image, any shift would not change the intensity at all. As to be expected, no feature points would be found on a white image. Likewise, if the window was placed on a black line in a white image, any shift along this line would not change the intensity either. Corners provide unique features in images, as there is little uncertainty about their actual location.

Image B
Image A
Image B Harris Points
Image A Harris Points
These corner feature points are known as Harris interest points. Simply acquiring the Harris points provides too many feature points often clustered in regions of the image that have high frequencies. It is ideal to have an even spread of interest points so that there is less ambiguity as to which points from two images of the same scene match each other. The number of interest points are reduced with an algorithm called non-maximal suppression. In general, this algorithm works by taking into account each interest point's "strength" as well as its distance from other interest points. An interest point that is close to another point must be sufficiently strong to avoid suppression.

Image B Suppression
Image A Suppression
Once the Harris points have been suppressed to a reasonable number, it is possible to pick out the corresponding points between the images. First, a square window of pixels is centered around each interest point. Each window in the image is compared to every single window in the other image, which provides a series of similarity "scores" (based on normalized intensity variations). Because there is often repetition in images, it is possible to falsely match two points. To reduce false matches, features with the highest scores are not chosen to be the correspondence features. Instead, for each feature, the ratio is calculated between its two highest scores with the features in the other image. If this ratio is close to 1, it means that the two best matches had very similar scores. It is therefore unclear which match is the correct match and which match is a false positive. On the other hand, ratios farther from 1 imply that there is a clear best match. These points are chosen to be the corresponding features.
Image B Correspondence Points
Image A Correspondence Points

Recovering the Homography with RANSAC

Only 4 corresponding points are required to recover the homography necessary to transform one image to match the perspective of the first. Despite the careful elimination of unusable features, some correspondences are still false. If one recovers the homography by taking into account the influence of each pair of interest points, the false matches will negatively affect the result.

To circumvent this problem, the RANSAC (Random Sample Consensus) algorithm is used. This algorithm randomly selects 4 correspondences, calculates the homography, then counts the number of correspondence that are appropriately transformed by the homography. The algorithm repeats a fixed number of times. The homography with the highest correspondence count is used for the final image warping.

Once the images are warped, the corresponding regions are overlapped and their pixel intensities averaged. The entire process can be repeated by running the algorithm on the previously stitched image and the next image of the scene.

Warped and Composited Images
Cropped

Results

The quality of a panorama proved highly dependent on the algorithm's sampling and RANSAC parameters. It was necessary to adjust the number of Harris features to suppress and the error tolerance levels and iterations of the RANSAC algorithm to achieve a satisfactory result.

When stitching together multiple images, a poor alignment between two of the images would adversely affect subsequent stitching, as the blurred overlapping regions would not provide recognizable feature points.

Lastly, because different photos of the same scene were taken under slightly different conditions, there are occasionally visible seams in the panoramas. This could perhaps be fixed with a Poisson blending algorithm.