I used an image-pyramid-based multiscale alignment algorithm to align the Prokudin-Gorskii glass plate images, thereby producing a color image. I chose normalized cross-correlation (NCC) as the image-matching metric: the three color channels of each plate clearly had different intensities, and NCC's mean subtraction and normalization compensate for that.
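As a sketch of the metric (in MATLAB, which the writeup's mention of fspecial suggests was the implementation language), the mean subtraction and normalization are what make the score insensitive to per-channel intensity differences:

```matlab
function s = ncc(a, b)
% Normalized cross-correlation of two equally sized grayscale images.
% Subtracting the mean and dividing by the norm makes the score
% invariant to per-channel brightness and contrast differences.
a = double(a(:)) - mean(double(a(:)));
b = double(b(:)) - mean(double(b(:)));
s = dot(a / norm(a), b / norm(b));
end
```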
The algorithm has a single-scale and a multiscale component. Both take two images as arguments, along with a search range in x and y. The single-scale function exhaustively shifts the second image over the search range, recording the NCC between the two images at each offset, and returns the shift vector that produced the highest NCC. The multiscale function takes the same arguments, plus an integer indicating how deep to build the image pyramid. It creates half-resolution copies of the two images and recursively calls itself on them with the depth decremented by one and the search range halved. It then calls the single-scale function on the original images, with the search restricted to a 5x5 window centered at twice the shift vector returned by the recursive call. The [-2, 2] refinement range was the smallest search window that still minimized alignment errors across the images. The number of pyramid levels is chosen so that the search window at the coarsest level of the pyramid is on the order of 5x5.
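A minimal sketch of the two components, using the ncc helper above; the function names, argument order, and recursion base case are my assumptions rather than the original code:

```matlab
function shift = align_single(ref, img, rx, ry, center)
% Exhaustive search over shifts in center + [-rx,rx] x [-ry,ry],
% returning the [dy dx] shift of img that maximizes NCC against ref.
if nargin < 5, center = [0 0]; end
best = -Inf; shift = center;
for dy = center(1)-ry : center(1)+ry
    for dx = center(2)-rx : center(2)+rx
        s = ncc(ref, circshift(img, [dy dx]));
        if s > best
            best = s; shift = [dy dx];
        end
    end
end
end

function shift = align_multi(ref, img, rx, ry, depth)
% Coarse-to-fine alignment: recurse on half-resolution copies with a
% halved search range, then refine in a 5x5 window centered at twice
% the coarse estimate.
if depth == 0
    shift = align_single(ref, img, rx, ry);
else
    coarse = align_multi(imresize(ref, 0.5), imresize(img, 0.5), ...
                         ceil(rx / 2), ceil(ry / 2), depth - 1);
    shift = align_single(ref, img, 2, 2, 2 * coarse);
end
end
```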
Because the intent of this project was to align the images, not the borders around them, I ran the algorithm on only the middle 9/16 (3/4 x 3/4) of each image. The resulting shift vector was, however, applied to the whole image. On a few images this improved alignment dramatically.
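A sketch of how the central region might be extracted for scoring (only the crop fractions come from the writeup):

```matlab
function center = middle_crop(img)
% Keep the middle 3/4 of each dimension (9/16 of the area) so plate
% borders do not influence the NCC score; the resulting shift is still
% applied to the full-size image.
[h, w] = size(img);
center = img(round(h/8) : round(7*h/8), round(w/8) : round(7*w/8));
end
```

The shift would then be computed as, e.g., `align_multi(middle_crop(b), middle_crop(r), rx, ry, depth)` and applied to the uncropped channel.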
The image is cropped by first reducing it to the overlapping area of the three color plates. A function then determines which rows and columns have per-channel averages close to white, and crops the image to the area between the first and last non-white rows and columns. A similar process removes black from the borders. Averages > 240 are considered whitish, and averages < 50 are considered blackish. These thresholds are not optimal for every image, but overall they noticeably removed border pixels without cropping away too much of the original image.
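A sketch of the white-border pass under those thresholds, assuming an 8-bit RGB image (the black pass is the same scan with the `< 50` test); how "close to white" combines the three channels is my guess:

```matlab
function img = crop_white_border(img)
% Keep only the rows/columns between the first and last non-white ones.
% A row or column counts as whitish if its average exceeds 240 in every
% channel; a mirrored pass with "< 50 in every channel" removes black.
whitish = @(m) all(m > 240, 2);
rowAvg = squeeze(mean(img, 2));   % one average per row, per channel
colAvg = squeeze(mean(img, 1));   % one average per column, per channel
keepR = find(~whitish(rowAvg));
keepC = find(~whitish(colAvg));
img = img(keepR(1):keepR(end), keepC(1):keepC(end), :);
end
```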
Significant time was spent developing an unsharp mask to increase contrast before I discovered that I could generate an unsharp filter with fspecial. The built-in filter worked better, though some edges in the smaller images show distorted colors.
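The built-in route is roughly the following (fspecial's default unsharp parameters assumed; my hand-rolled version is not shown):

```matlab
% Sharpen with MATLAB's built-in unsharp kernel (3x3, default alpha).
f = fspecial('unsharp');
sharp = imfilter(im, f, 'replicate');   % filters each RGB channel
```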
I used a grey-world-hypothesis-based algorithm that scales the average RGB values of the image to (138, 138, 138). I chose 138 because scaling to 128 made some images look too dark.
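A sketch of that scaling, assuming 8-bit values; only the 138 target comes from the writeup:

```matlab
function out = gray_world(im, target)
% Scale each channel so its mean lands at the target gray level (138
% here, since 128 made some images look too dark), then clip to range.
if nargin < 2, target = 138; end
im = double(im);
for c = 1:3
    im(:,:,c) = im(:,:,c) * (target / mean(mean(im(:,:,c))));
end
out = uint8(min(im, 255));
end
```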
The above algorithm gave reasonable results for most images, particularly the large .tiff files. I was unable to properly align the image of the man in the blue robe. I believe this is because the robe has high values in the blue channel but very low ones in the red channel, so the two channels correlate poorly even when aligned. Restricting the search range would have given more reasonable results for this image, but would have hurt the alignment of other images.
The auto-crop function is clearly not robust enough to remove the border in all images. In general, the white balance and contrast enhancing functions did not adversely affect the images.