Project 6: Automated Panorama Stitching
Tristan Hale (thale)
Overview
The goal is to stitch multiple images into a single mosaic while preserving the natural perspective of the scene. By finding corresponding points in each image, we can compute a perspective transform that warps the images until they fit together seamlessly. One large challenge is finding the exact points in the two images that correspond to the same real-world spot; using feature matching and the RANSAC algorithm, this is done automatically.
Algorithm
To recover a homography, we solve the linear system that maps four points in the first image to the corresponding four points in the second image. This gives us the eight unknowns of the perspective transform matrix (the ninth entry is fixed to 1). To transform a pixel, we multiply the matrix by its homogeneous coordinates and divide by the resulting scale factor to get the new coordinates.
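The linear system above can be sketched as follows. This is a minimal illustration, not the project's actual code; the function names `compute_homography` and `apply_homography` are mine, and the solver fixes h33 = 1 and uses least squares so it also works with more than four correspondences.

```python
import numpy as np

def compute_homography(src, dst):
    """Solve for the 3x3 homography mapping src points to dst points.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Each correspondence contributes two rows to the linear system
    A h = b in the 8 unknowns (h33 is fixed to 1).
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b += [u, v]
    h = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)[0]
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide out the scale factor
```

For a pure translation, for example, the recovered matrix is the identity with the offset in its last column.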
To detect matching image points automatically, we use a feature-matching system followed by the RANSAC algorithm. First, we find interest points in both images using the Harris corner detector. We then generate a feature descriptor for each point from the area around it: we take the 40x40 region centered on the pixel and average blocks of pixels to form a smaller, lower-frequency 8x8 patch. This patch constitutes the point's feature. To match features, we compare every feature of one image against every feature of the other using the sum of squared differences (SSD). After finding the two best matches for each feature, we assign it an error value: the SSD of the best match divided by the SSD of the second-best match. This ratio gives a lower error to features with a single clear match and a higher error to more ambiguous ones.
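The descriptor extraction and ratio test might look like the sketch below. The names `describe` and `match` and the ratio threshold of 0.6 are my assumptions (the writeup does not state a threshold); interest points are taken as (row, col) pixel coordinates.

```python
import numpy as np

def describe(img, points, patch=40, out=8):
    """Block-average the 40x40 patch around each (row, col) interest
    point down to an 8x8 descriptor, flattened to 64 values."""
    half, step = patch // 2, patch // out
    feats = []
    for r, c in points:
        region = img[r - half:r + half, c - half:c + half]
        if region.shape != (patch, patch):
            continue  # skip points too close to the border
        small = region.reshape(out, step, out, step).mean(axis=(1, 3))
        feats.append(small.ravel())
    return np.asarray(feats)

def match(f1, f2, ratio=0.6):
    """Match descriptors by SSD, keeping only matches whose best SSD
    is well below the second-best SSD (the ratio test)."""
    d = ((f1[:, None, :] - f2[None, :, :]) ** 2).sum(-1)  # pairwise SSD
    pairs = []
    for i, row in enumerate(d):
        best, second = np.argsort(row)[:2]
        if row[best] < ratio * row[second]:  # unambiguous match only
            pairs.append((i, best))
    return pairs
```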
Finally, the 100 best-matched pairs are run through the RANSAC algorithm to find the four pairs of points used to recover the final homography. RANSAC works by repeatedly selecting four random pairs, computing the homography they imply, and counting how many of the matched pairs this transform maps correctly; more agreeing pairs means a more likely correct matrix. After a number of iterations (I used 5,000), the best homography is used for the perspective transform.
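The RANSAC loop can be sketched as below. This is an illustrative version, not the project's code: the names, the inlier tolerance of 3 pixels, and the fixed random seed are my choices, and the homography fit is repeated inline so the sketch stands alone.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (h33 = 1) from point correspondences."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b += [u, v]
    h = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)[0]
    return np.append(h, 1.0).reshape(3, 3)

def ransac(src, dst, iters=5000, tol=3.0, seed=0):
    """Keep the 4-point homography with the most inliers: matched pairs
    whose projection lands within tol pixels of the expected point."""
    rng = np.random.default_rng(seed)
    best_H, best_count = None, -1
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        proj = np.hstack([src, np.ones((len(src), 1))]) @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        count = (np.linalg.norm(proj - dst, axis=1) < tol).sum()
        if count > best_count:
            best_H, best_count = H, count
    return best_H
```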
Results
I ran the program on the assignment pictures and some pictures of my street, and the results look quite good. There is no fancy blending in the final pictures; I simply warped the second image and pasted it onto the first.
Image 1

Image 2

Resulting stitched image
This warp demonstrates rectifying an image containing a region known to be square. The user clicks the four corners of the square, and the image is warped so that the square depicted in the original scene becomes an actual square on the image plane.
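A minimal sketch of this rectification, reusing the same linear system from the Algorithm section: the function name `rectify`, the output size, and the nearest-neighbor inverse warping are my assumptions, not the project's exact implementation.

```python
import numpy as np

def rectify(img, corners, size=200):
    """Warp img so the quad `corners` (clockwise (x, y) points from the
    top-left, as clicked by the user) fills a size x size square.

    Solves for the homography from the output square TO the clicked quad,
    then inverse-warps: each output pixel samples its source pixel
    (nearest neighbor) so no holes appear in the result.
    """
    dst = np.array([[0, 0], [size - 1, 0],
                    [size - 1, size - 1], [0, size - 1]], float)
    A, b = [], []
    for (x, y), (u, v) in zip(dst, corners):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b += [u, v]
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    H = np.append(h, 1.0).reshape(3, 3)
    # Map every output pixel back into the source image.
    ys, xs = np.mgrid[0:size, 0:size]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(size * size)])
    src = H @ pts
    src = src[:2] / src[2]
    sx = np.clip(np.round(src[0]).astype(int), 0, img.shape[1] - 1)
    sy = np.clip(np.round(src[1]).astype(int), 0, img.shape[0] - 1)
    return img[sy, sx].reshape(size, size, *img.shape[2:])
```

Solving for the square-to-quad direction (rather than quad-to-square) avoids inverting the matrix before sampling.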