CS 129 Project 1 Writeup

Ian Strickman (istrickm)
September 17 2012

Introduction

For this project, I implemented two versions of an image alignment algorithm for aligning three color channels of the same image, such as those found in the Prokudin-Gorskii Collection. Images from this collection were taken sequentially through three different color filters, so the negatives contain three separate greyscale images, one for each of the RGB channels. The problem with this strategy is that minute movements between shots, either by the photographer or by the subjects (i.e. the camera moved or the subjects shifted a little in place), throw the images out of alignment. If one naïvely stacks the channels on top of one another, they will not line up and the image will look wrong. An alignment algorithm figures out how to align the three channels and outputs an image that, if aligned correctly, is a good color rendering of the original scene.

Algorithm and Design Decisions

The first of the two algorithms takes the three channels and, for two pairs (Blue/Green and Blue/Red), shifts the pixels of the second channel and checks the quality of the alignment. Shifts in [-15, 15] are tried in both the x and y directions, and the metric used to score each alignment is the Normalized Cross-Correlation (NCC). We keep the alignment with the highest summed NCC, since that indicates the alignment at which the two channels are most similar. When computing the metric, we shift the second channel and then crop both channels so that pixels wrapped around by the shift are not taken into account. This markedly improves the quality of the resulting images: the wrapped border regions come from opposite sides of the frame, so they are essentially never similar, and including them skews the score while contributing nothing to the final image.
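The single-scale search described above can be sketched as follows. This is an illustrative NumPy version, not the submitted code; the function names are my own, and the crop width is fixed at the window size for simplicity:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return (a * b).sum() / denom if denom else 0.0

def align_single_scale(fixed, moving, window=15):
    """Try every shift in [-window, window]^2 and keep the best NCC.

    Returns the (dy, dx) that best aligns `moving` to `fixed`.
    """
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            # Crop the borders so wrapped-around pixels never influence
            # the score -- the cropping step described in the writeup.
            m = window
            score = ncc(fixed[m:-m, m:-m], shifted[m:-m, m:-m])
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The exhaustive search is O(window²) NCC evaluations per pair, which is cheap at [-15, 15] but motivates the pyramid version below for larger displacements.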

The second algorithm uses a similar strategy to the first, but runs it over an image pyramid built from the original image instead of on the original image alone. We first construct the pyramid: apply a Gaussian filter to blur the image, scale the result down by a factor of 2, and repeat with the resulting image until we reach a fairly small image. We then apply the first algorithm to this smallest image, but with a window of [-5, 5] instead of [-15, 15]. The resulting shift is doubled and applied as the initial shift for the second-smallest image, where we run the same [-5, 5] search starting from that estimate instead of from [0, 0]. That shift is in turn doubled and passed up to the third-smallest image, and so on. At the end of this process we have a shift at the resolution of the original image, which we use to produce the output.
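The coarse-to-fine recursion can be sketched like this. Again an illustrative NumPy version: a simple 2x2 box-filter downsample stands in for the Gaussian-blur-then-halve step described above, and the NCC scoring mirrors the single-scale metric:

```python
import numpy as np

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (a box filter
    standing in for the Gaussian blur described in the writeup)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(fixed, moving, window=5, min_size=16):
    """Coarse-to-fine alignment: recurse down to the smallest level,
    then double each level's shift and refine it at the next finer one."""
    if min(fixed.shape) <= min_size:
        base = (0, 0)                      # coarsest level starts at [0, 0]
    else:
        dy, dx = align_pyramid(downsample(fixed), downsample(moving),
                               window, min_size)
        base = (2 * dy, 2 * dx)            # pass the doubled shift upward
    best_score, best = -np.inf, base
    for dy in range(base[0] - window, base[0] + window + 1):
        for dx in range(base[1] - window, base[1] + window + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            # Crop away wrapped pixels, then score with NCC.
            c = max(abs(dy), abs(dx), 1)
            a = fixed[c:-c, c:-c] - fixed[c:-c, c:-c].mean()
            b = shifted[c:-c, c:-c] - shifted[c:-c, c:-c].mean()
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            score = (a * b).sum() / denom if denom else 0.0
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best
```

Each doubling step means a [-5, 5] window at a coarse level corresponds to a much larger displacement at full resolution, which is what lets the pyramid search a wide range cheaply.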

There are two beneficial consequences of using this approach: the search window at each level stays small, which keeps the algorithm fast even on large images, and the doubled shifts accumulated across levels let the final alignment cover displacements far larger than any single [-5, 5] search could.
Extra Credit

I implemented one type of extra credit: Better Features. Instead of running the raw greyscale channels through my algorithms, I applied a vertical and a horizontal Sobel filter to each channel, summed the resulting vertical and horizontal edge-detected images, and used those edge maps to generate my alignments. The idea is that the actual color channels may differ greatly in intensity at any given pixel, but if we align the high-gradient areas (i.e. the edges of objects), we get a better overall alignment. This worked quite well for many of the images, although it faltered a bit on images with blurrier edges or strong gradients near the borders of the picture (such as these)
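The edge-feature preprocessing might look like this in NumPy. This is an illustrative sketch: I take absolute values before summing the two Sobel responses (a detail the text above leaves unspecified, but without it the signed gradients can cancel), and the small hand-rolled filter uses edge padding:

```python
import numpy as np

# Standard 3x3 Sobel kernel for horizontal gradients; its transpose
# responds to vertical gradients.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(channel, kernel):
    """Apply a 3x3 kernel as shifted, weighted sums (edge-padded)."""
    padded = np.pad(channel, 1, mode="edge")
    h, w = channel.shape
    out = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def edge_features(channel):
    """Edge map used in place of raw intensities during alignment."""
    gx = filter3x3(channel, SOBEL_X)   # horizontal edge response
    gy = filter3x3(channel, SOBEL_Y)   # vertical edge response
    return np.abs(gx) + np.abs(gy)
```

Running either alignment algorithm on `edge_features(channel)` instead of `channel` leaves the rest of the pipeline unchanged; only the score now rewards matching edges rather than matching intensities.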

Results Images

Results where Edge Detection via Sobel Filtering helped

Without Edge Detection With Edge Detection

Results where Edge Detection via Sobel Filtering didn't help

Without Edge Detection With Edge Detection