Ben Swanson - Assignment 1

February 6, 2010

This assignment was to create single-scale and multi-scale image aligners for the Prokudin-Gorskii photo collection. The input to the process is a set of three exposures taken through red, blue, and green filters respectively. The goal is to align the three images so that visible artifacts are minimized.

The algorithm used here is an image pyramid, which estimates the alignment on a small, coarse version of the image first and then refines it at increasingly larger scales in order to save time. The matching metric I employed was Normalized Cross Correlation, which proved much better than the sum of squared differences metric. Additionally, I eliminated a thick boundary on each side of the image during matching. This saves time in computing the metric and avoids statistics from the image boundaries, which might confuse alignment.
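As a rough illustration of the approach described above, here is a minimal Python/NumPy sketch of coarse-to-fine alignment scored with normalized cross correlation on the image interior. It is not the code used for this assignment. The function names, the 2x2 block averaging, the wrap-around shifts via np.roll, and the radius and frac parameters are all assumptions chosen to keep the example short.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized 2-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def interior(img, frac=0.2):
    """Drop a border of `frac` of each dimension before scoring (assumed value)."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def best_shift(ref, mov, center=(0, 0), radius=3, frac=0.2):
    """Exhaustively try shifts around `center`; return the best (dy, dx)."""
    best, best_score = center, -np.inf
    ref_in = interior(ref, frac)
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # np.roll wraps pixels around; the interior crop hides the wrapped edge.
            shifted = np.roll(mov, (dy, dx), axis=(0, 1))
            score = ncc(ref_in, interior(shifted, frac))
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (a simple stand-in filter)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(ref, mov, min_size=32, radius=3, frac=0.2):
    """Align at half resolution first, then refine the doubled estimate by +/-1."""
    if min(ref.shape) <= min_size:
        return best_shift(ref, mov, (0, 0), radius, frac)
    cy, cx = pyramid_align(downsample(ref), downsample(mov), min_size, radius, frac)
    return best_shift(ref, mov, (2 * cy, 2 * cx), radius=1, frac=frac)
```

With one channel as the reference, a call such as pyramid_align(blue, red) would return the (dy, dx) to apply to the red channel with np.roll. Because each finer level only searches one pixel around the doubled coarse estimate, the number of full-resolution comparisons stays small.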

Extra Credit

White Balance

For automatic white balance I used the gray world method, which says that the mean value of each color channel should be equal. This is implemented by simply calculating the mean value of each color channel and scaling the other two channels so that their means match the minimum of the three means. I scaled down to the minimum to avoid color values greater than 1.0 in the double representation. I also chose to do this before alignment in the hope that white balancing would lead to better matching performance.
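The gray world scaling described above can be sketched in a few lines of Python/NumPy. This is an illustration rather than the assignment's code: the function name gray_world is hypothetical, and it assumes an H x W x 3 floating-point array with values in [0, 1] and no channel that is entirely zero.

```python
import numpy as np

def gray_world(rgb):
    """Scale each channel so all three means equal the smallest channel mean."""
    means = rgb.reshape(-1, 3).mean(axis=0)   # one mean per color channel
    target = means.min()
    # Every gain is <= 1, so values that start in [0, 1] stay in [0, 1].
    gains = target / means                    # assumes no channel mean is zero
    return rgb * gains
```

Since the gains depend only on the per-channel means, the same scaling can be applied to the three exposures separately before they are aligned.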

Contrast

To automatically adjust contrast, my goal was for each color channel to span the full range from 0.0 to 1.0 in the double representation. This meant finding the minimum value in a channel, subtracting it from all values in that channel, and then scaling the values back up so that the maximum was 1.0.
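A minimal Python/NumPy sketch of this per-channel stretch is shown here. The function name stretch_contrast and the guard for a completely flat channel are assumptions added for the example, not part of the original implementation.

```python
import numpy as np

def stretch_contrast(rgb):
    """Map each channel's [min, max] range linearly onto [0.0, 1.0]."""
    flat = rgb.reshape(-1, 3)
    lo = flat.min(axis=0)                     # per-channel minimum
    hi = flat.max(axis=0)                     # per-channel maximum
    # Assumed guard: leave a constant channel at 0 instead of dividing by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (rgb - lo) / span
```

One caveat of a pure min/max stretch is that a single very dark or very bright pixel in a channel fixes that end of the range, so isolated outliers can limit how much the contrast actually changes.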

Cropping

I employed two cropping methods. The first was to automatically crop out any edges which did not have true data from all three color channels. This can be determined directly from the red and green offsets. The second was to attempt to detect large single-color bars on the boundary which are clearly not part of the actual photographs. I did this by computing, for each row or column, the variance of the average value of the three color channels. I then tuned a threshold on this variance such that once a row or column with variance over the threshold was found, everything up to that row or column would be cropped. I tuned the threshold conservatively so that no image was cropped too much, but as a result the threshold was too low for some images and not enough cropping was performed.
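The first cropping method follows directly from the offsets, as this short Python sketch suggests. It assumes the third channel is the unshifted reference and that a positive offset moves a channel down or to the right. The function name valid_region and that sign convention are illustrative assumptions.

```python
def valid_region(shape, offsets):
    """Return (top, bottom, left, right) covered by every channel.

    `shape` is (height, width); `offsets` holds one (dy, dx) per shifted channel.
    The unshifted reference channel contributes an implicit (0, 0).
    """
    h, w = shape
    dys = [0] + [dy for dy, _ in offsets]
    dxs = [0] + [dx for _, dx in offsets]
    # A channel moved down by dy has no true data in its first dy rows;
    # one moved up by |dy| has none in its last |dy| rows. Same for columns.
    top, bottom = max(dys), h + min(dys)
    left, right = max(dxs), w + min(dxs)
    return top, bottom, left, right
```

For example, with a 100 x 80 image and offsets (5, -3) and (-2, 4), the rows 5 to 97 and columns 4 to 76 contain data from all three channels, so the aligned result could be sliced as image[5:98, 4:77].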

Analysis

While my implementation works, I have two major sources of concern: the cropping procedure and the speed.

To do automatic cropping of the second type mentioned above, I first averaged the three color channels at each pixel to get an average intensity, and then calculated the variance of this statistic over each row or column. I experimented with a threshold on this variance for determining where the "real" photo began, but it was not possible to find one which did a good job on all images. The real culprit was the third image below, of the man in the blue robe sitting in a chair. At threshold levels which did a good job of automatically cropping most images, this image would be cropped down to almost nothing. This makes me suspect that the variance of the mean intensity is not a good statistic. I also experimented with calculating the variance on each color channel individually and ending my search as soon as any single channel exhibited variance above a threshold, but I ran into the same problem.
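For reference, the variance test described above can be sketched in Python/NumPy as follows. This is a simplified illustration, not the original code: the names first_textured_index and border_crop are hypothetical, and the threshold is left as a parameter because no single value worked for every image.

```python
import numpy as np

def first_textured_index(variances, threshold):
    """Index of the first row or column whose variance exceeds `threshold`.

    Returns 0 (no cropping) if no entry exceeds the threshold.
    """
    above = np.flatnonzero(variances > threshold)
    return int(above[0]) if above.size else 0

def border_crop(rgb, threshold):
    """Return (top, bottom, left, right) bounds found by scanning in from each edge."""
    gray = rgb.mean(axis=2)        # average of the three channels at each pixel
    row_var = gray.var(axis=1)     # one variance per row
    col_var = gray.var(axis=0)     # one variance per column
    top = first_textured_index(row_var, threshold)
    bottom = len(row_var) - first_textured_index(row_var[::-1], threshold)
    left = first_textured_index(col_var, threshold)
    right = len(col_var) - first_textured_index(col_var[::-1], threshold)
    return top, bottom, left, right
```

The sketch makes the weakness easy to see: the scan stops at the first row or column over the threshold, so a photograph with large, nearly uniform regions near its edge looks just like a border bar to this statistic.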

Another weakness of my implementation is its speed. I tried using large pads on the alignment areas, which is to say that I cropped out a large amount of the image and computed alignment statistics only on the center. This was the only method I could come up with that effectively improved my speed, but it meant I had to process the large and small images in separate batches with different settings of this padding factor.

Sample Outputs