Project 1: Color Alignment
cs195-g: Computational Photography
Patrick Doran (pdoran)
Spring 2010
Parameter values were chosen to make the algorithm work on as many images as possible.
Variable   Value        Explanation
S          [128, 128]   The smallest size to which an image is scaled in the image pyramid
W          2            The window half-width, such that the offsets examined at each scale are [-2, 2], i.e. [-2, -1, 0, 1, 2], in both the horizontal and vertical directions
B          0.20         The fraction of the border to ignore, so that the examined area is the internal 64% (0.8 * 0.8)
SR         0.9          The sigma rule: if a border column/row is 'too far' (mean +/- 0.9*sigma) from the mean value, it is automatically cropped
C          0.01         The fraction of values ignored when choosing the "high" and "low" thresholds for automatic contrasting (applied only to saturation and value)
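For concreteness, the table could be captured as a parameter set like this (the dictionary layout and name are hypothetical; only the values come from the table above):

```python
# Hypothetical parameter set mirroring the table above; the names follow
# the table's variable column, the dict itself is my own convention.
PARAMS = {
    "S": (128, 128),  # smallest pyramid size
    "W": 2,           # search window half-width: offsets in [-W, W]
    "B": 0.20,        # fraction of border ignored when scoring alignment
    "SR": 0.9,        # sigma rule for automatic cropping
    "C": 0.01,        # fraction trimmed at each end for auto-contrast
}
```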
High level description of the algorithm.
1 Load the image and separate it into its 3 components by dividing it into three equal vertical sections (that is how the images are stored). Note that this may place a division through an image rather than through the black border of the negative.
2 Create a Gaussian pyramid for each of the images to be aligned.
3 Align the images (align everything to blue):
  1. Start at the lowest resolution of the Gaussian pyramid and search for the best alignment over offsets in [-W, W]. Because each pyramid level is half the size of the next, a shift of 1 pixel at the lowest level corresponds to 2^(n-1) pixels at full resolution, where n is the depth of the pyramid; a shift of 1 at the second-lowest level corresponds to 2^(n-2), and so on. The maximum total shift representable is therefore W * (2^(n-1) + 2^(n-2) + ... + 1) = W * (2^n - 1) pixels.
  2. At the next higher resolution, double the current shift and apply it to the image, then search again for the best alignment within [-W, W].
  3. Choose the best alignment by evaluating some metric between the two images at each window location. Better results are obtained by ignoring the borders and comparing only the internal pixels.
  4. Update the running shift by adding the newly found offset to the current shift (which was already doubled when moving up a scale).
4 Apply automatic cropping
5 Apply automatic white balancing
6 Apply automatic contrasting
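The coarse-to-fine search in step 3 can be sketched as follows. This is a minimal NumPy illustration of the idea, not the project code: the pyramid here uses simple 2x2 box downsampling as a stand-in for a true Gaussian pyramid, and the function names are my own.

```python
import numpy as np

def build_pyramid(img, min_size=128):
    """Pyramid via 2x2 box downsampling (a stand-in for a Gaussian
    pyramid); returns levels ordered coarsest first."""
    levels = [img]
    while min(levels[-1].shape) // 2 >= min_size:
        a = levels[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w]  # trim odd edge so the image halves evenly
        levels.append(0.25 * (a[0::2, 0::2] + a[1::2, 0::2]
                              + a[0::2, 1::2] + a[1::2, 1::2]))
    return levels[::-1]

def ssd(a, b, border=0.2):
    """Sum of squared differences over the interior, ignoring a border
    fraction as the writeup suggests."""
    h, w = a.shape
    bh, bw = int(h * border / 2), int(w * border / 2)
    d = a[bh:h - bh, bw:w - bw] - b[bh:h - bh, bw:w - bw]
    return np.sum(d * d)

def align(channel, ref, min_size=128, window=2):
    """Coarse-to-fine translation search: estimate at the coarsest level,
    double the shift at each finer level, refine within [-window, window]."""
    pc, pr = build_pyramid(channel, min_size), build_pyramid(ref, min_size)
    dy = dx = 0
    for c, r in zip(pc, pr):
        dy, dx = 2 * dy, 2 * dx  # shift doubles when resolution doubles
        best, best_off = np.inf, (0, 0)
        for oy in range(-window, window + 1):
            for ox in range(-window, window + 1):
                score = ssd(np.roll(c, (dy + oy, dx + ox), axis=(0, 1)), r)
                if score < best:
                    best, best_off = score, (oy, ox)
        dy, dx = dy + best_off[0], dx + best_off[1]
    return dy, dx
```

Applying `np.roll(channel, (dy, dx), axis=(0, 1))` with the returned shift then registers the channel against the reference.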
General Comments

All metrics tried were about equally effective, which I found surprising, since applying sum of squared differences or normalized cross-correlation directly to the RGB channels does not compensate for the differing brightness of each image. There were two images that none of the metrics could align properly without breaking the alignment of other images (the goal being to apply the same algorithm to all images without user input): image 3 (00153v.jpg) and image 15 (01861a.tif). They could be aligned, but only by tweaking the input parameters specifically for those images, and that ruined more images than it fixed.
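The brightness issue mentioned above is exactly what the zero-mean form of normalized cross-correlation is designed to discount: subtracting each patch's mean cancels a uniform brightness offset. A minimal sketch of that metric (my own implementation, not the project code):

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation: higher is better.
    Subtracting the mean makes the score insensitive to a uniform
    brightness offset, and dividing by the norms to a uniform gain."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

For example, `ncc(x, 2*x + 1)` is 1.0 for any non-constant `x`, since gain and offset both cancel.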

The algorithm simply divides the height by 3, which does not necessarily separate the images along the black borders. If each image were extracted exactly at its black borders, this problem would be resolved, likely improving the results.

The algorithm assumes that the camera was only translated; it does not consider affine transformations such as rotation. Handling those seems like a logical next step to improve the algorithm, though it would be more computationally expensive.

Bells and Whistles
Automatic Cropping
Automatic cropping tries to remove two types of edges: black/white borders and colored edges. Black/white borders come from the negative and the scans of the negatives; they are abnormally bright compared to the rest of the image. Colored edges are side effects of the alignment process; they have abnormal variation across the RGB channels. The cropping algorithm works by calculating, for each pixel, the standard deviation across its RGB channels and their mean. It then averages both of those metrics across each row and column, and finally across the whole image. Rows and columns that are too far from the image-wide mean of both metrics are removed, where 'too far' means some number (SR) of standard deviations from the mean. This works well for most images, though some borders persist anyway.
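My reading of that procedure can be sketched as follows. The removal rule here (a row or column is dropped when it is extreme in both metrics, and trimming happens only inward from the borders) is an assumption about the details, not the project code:

```python
import numpy as np

def auto_crop(img, sr=0.9):
    """Sketch of the sigma-rule border crop. img is HxWx3, float in [0, 1].
    Assumption: lines extreme in BOTH metrics are dropped, and only
    contiguously from the borders inward."""
    px_mean = img.mean(axis=2)  # per-pixel brightness
    px_std = img.std(axis=2)    # per-pixel disagreement between channels
    def trim(axis):
        m = px_mean.mean(axis=axis)  # axis=1: per-row, axis=0: per-column
        s = px_std.mean(axis=axis)
        far_m = np.abs(m - m.mean()) > sr * m.std()
        far_s = np.abs(s - s.mean()) > sr * s.std()
        ok = ~(far_m & far_s)
        idx = np.flatnonzero(ok)
        return idx[0], idx[-1] + 1   # keep the interior span only
    r0, r1 = trim(axis=1)
    c0, c1 = trim(axis=0)
    return img[r0:r1, c0:c1]
```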
Automatic White Balance
Automatic white balance is done using the gray-world assumption. First, it calculates a mean gray (the mean of the mean R, G, and B values). It then scales the R, G, and B channels by mean gray / mean channel (listed as "balance" in the statistics). Finally, it finds the maximum value over all of R, G, and B and normalizes all the channels by it. This generally does not help any of the images: it mostly tints them toward a color that was not present in the input, or darkens them.
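A gray-world sketch matching that description (the function name and layout are mine):

```python
import numpy as np

def gray_world(img):
    """Gray-world white balance: scale each channel so its mean matches
    the global mean gray, then renormalize by the overall maximum.
    img is HxWx3, float."""
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gray = channel_means.mean()                # mean of the mean R, G, B
    balanced = img * (gray / channel_means)    # per-channel balance factors
    return balanced / balanced.max()           # normalize to [0, 1]
```

After balancing, the three channel means are equal by construction, which is precisely why it tints images whose true palette was not gray on average.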
Automatic Contrast
Automatic contrast seeks to make the images more vibrant. The image is converted to HSV space and the saturation and value channels are stretched: a low and a high threshold are selected, all values are rescaled by that range, and the results are clipped to [0, 1]. The image is then converted back to RGB space. The low and high thresholds are found by sorting the values and taking the entries at positions NUM_PIXELS*C and NUM_PIXELS - NUM_PIXELS*C of the sorted S and V values; in other words, discard the largest and smallest C fraction of values and stretch the rest. This often enhances the images, but occasionally they look clearly, artificially vibrant. That could be fixed by stretching not to the full range [0, 1] but to some smaller range, such as halving the distance from the current low to 0 and from the high to 1.
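The threshold selection and stretch can be sketched for a single channel; in the full pipeline this would be applied to the S and V channels after an RGB-to-HSV conversion. The function is my own sketch, not the project code:

```python
import numpy as np

def stretch(channel, c=0.01):
    """Percentile stretch as described above: values at or below the
    C-fraction threshold map to 0, at or above the (1-C)-fraction
    threshold map to 1, linear in between, clipped to [0, 1]."""
    flat = np.sort(channel.ravel())
    n = flat.size
    lo = flat[int(n * c)]                 # value at position NUM_PIXELS*C
    hi = flat[min(n - 1, int(n - n * c))] # value at NUM_PIXELS - NUM_PIXELS*C
    return np.clip((channel - lo) / (hi - lo), 0.0, 1.0)
```

The gentler variant suggested above would replace the targets 0 and 1 with `lo / 2` and `(1 + hi) / 2` respectively, stretching only half the remaining distance.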