Introduction
The Prokudin-Gorskii collection consists of sets of three black-and-white photographs, each set captured through red, green, and blue filters. The objective of this project is to align these images and combine them into a single color image while minimizing visual artifacts.
My algorithm was successful in aligning all the images in the default assigned set of 16 images and many images in the online Library of Congress data set as well. Some examples are shown below.
![]() |
![]() |
![]() |
![]() |
Algorithm Overview
Extract Interior Pixels for Image Alignment
The proper translation offset from green to blue and red to blue needs to be calculated for alignment. Image alignment will only be performed on the center 70% of the image to reduce edge effects and other visual artifacts.
Compute Image Pyramids
Image pyramids are then preconstructed to optimize the translation offset search in the next step. A subsampling factor of two is used between each pyramid level. At least two pyramid levels are constructed. In order to scale the number of pyramid levels beyond two for larger images, the following equation is used:
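One form that reproduces the examples in the following paragraph, given here as an assumption because the exact constants are not stated in the text, is:

$$N = \max\left(2,\ \left\lfloor \log_2 \min(w, h) \right\rfloor - 5\right)$$

where $w$ and $h$ are the image width and height in pixels and $N$ is the number of pyramid levels.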
This ensures that a new pyramid level is added every time the width/height doubles. For example, the number of pyramid levels is 3 for a 500x500 image, 4 for a 1000x1000 image, 5 for a 2000x2000 image, etc.
Create Unsharp Masks and Search for Translation Offsets
Unsharp masks are created to emphasize the edges in the images. New unsharp masks are created at each pyramid level. Searching is done over a wider range at the smaller and coarser pyramid levels. The minimum sum of squared differences metric is used to determine the best offset at each pyramid level. The translation offset estimate is refined as larger and finer pyramid levels are processed.
Image Alignment
After the two translation offsets (green to blue and red to blue) are computed, the image color channels are aligned into one color image.
Automatic Cropping
After alignment, the image is then cropped automatically by detecting the straight lines in border regions of the image. This step attempts to conservatively eliminate edge artifacts and is described in greater detail in the Automatic Cropping section below.
Automatic Contrasting
After cropping, the colors in the image are statistically analyzed using a histogram and then retouched to create a more vivid and realistic image. This step is described in greater detail in the Automatic Contrasting section below.
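The alignment steps in this overview can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`center_crop`, `num_levels`, `build_pyramid`, `best_offset`, `align`, `compose`), the search radii of 8 and 2 pixels, and the level-count formula are all assumptions chosen to match the description. The sketch compares raw intensities rather than unsharp masks, and it uses `np.roll`, which wraps pixels around the border; the wrapped pixels fall outside the center 70% that is scored.

```python
import numpy as np


def center_crop(img, keep=0.7):
    """Return the central `keep` fraction of the image in each dimension."""
    h, w = img.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]


def num_levels(h, w):
    """Hypothetical level count: at least 2, one more per doubling in size."""
    return max(2, int(np.floor(np.log2(min(h, w)))) - 5)


def build_pyramid(img, levels):
    """Subsample by a factor of two per level; index 0 is full resolution."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(pyramid[-1][::2, ::2])
    return pyramid


def best_offset(moving, fixed, center, radius):
    """Exhaustive search around `center` for the shift with minimum SSD."""
    target = center_crop(fixed)
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = center_crop(np.roll(moving, (dy, dx), axis=(0, 1)))
            score = float(np.sum((shifted - target) ** 2))
            if score < best_score:
                best, best_score = (dy, dx), score
    return best


def align(moving, fixed, coarse_radius=8, fine_radius=2):
    """Coarse-to-fine estimate of the (dy, dx) shift aligning moving to fixed."""
    moving, fixed = moving.astype(float), fixed.astype(float)
    levels = num_levels(*fixed.shape)
    pm, pf = build_pyramid(moving, levels), build_pyramid(fixed, levels)
    offset = (0, 0)
    for level in range(levels - 1, -1, -1):
        # Wide search at the coarsest level, narrow refinement afterwards.
        radius = coarse_radius if level == levels - 1 else fine_radius
        offset = best_offset(pm[level], pf[level], offset, radius)
        if level > 0:
            offset = (offset[0] * 2, offset[1] * 2)  # rescale for finer level
    return offset


def compose(r, g, b):
    """Shift red and green onto blue and stack the channels into RGB."""
    r_al = np.roll(r, align(r, b), axis=(0, 1))
    g_al = np.roll(g, align(g, b), axis=(0, 1))
    return np.dstack([r_al, g_al, b])
```

Because each finer level only searches a few pixels around the doubled coarse estimate, the total number of SSD evaluations stays small even for large scans.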
Better Features: Unsharp Mask
An unsharp mask (left) is created by subtracting the original image (right) from a Gaussian-blurred copy of it. This emphasizes the edges in an image because the difference is largest in areas with higher contrast. Matching on these edge features makes the minimum sum of squared differences metric more reliable.
![]() |
![]() |
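A minimal sketch of the unsharp mask described above, using only NumPy. The helper names (`gaussian_kernel`, `gaussian_blur`, `unsharp_mask`), the default `sigma` of 2, and the edge-replicating padding are assumptions for illustration. The sign convention follows the text (blurred minus original); for SSD matching the sign does not matter as long as it is the same for every channel.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at three standard deviations."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge-replicating padding."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(img, r, mode="edge")
    # Convolve along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")


def unsharp_mask(img, sigma=2.0):
    """Blurred image minus the original: near zero in flat areas, large at edges."""
    img = img.astype(float)
    return gaussian_blur(img, sigma) - img
```

On a perfectly flat region the mask is zero, while next to a step edge its magnitude is large, which is why the mask acts as an edge feature.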
Automatic Cropping
Create Grayscale Border Regions
First, four grayscale border regions (top, bottom, left, right) are created from the pixels within 10% of the image height/width of each edge. Shown below on the right is the grayscale right border region of the color image on the left.
Run Canny Edge Detector on Grayscale Border Regions
Next, a Canny edge detector is run on the grayscale border regions to extract the edges. Shown below on the right are the extracted edges of the grayscale right border region on the left.
Perform Hough Transform to Detect the "Best" Straight Line
Next, a Hough transform is performed to map each edge pixel into polar coordinates. Each edge pixel corresponds to a single curve in the polar coordinates plot below. The intersections or "peaks" in the plot correspond to straight lines detected in the border region. For example, the square in the plot below directly corresponds to the green line in the image on the right. The best several peaks are selected and their corresponding lines are calculated. The line that is farthest from the edge and still meets all length requirements is selected as the "best" straight line. The Hough transform was chosen because it still works on partially occluded lines.
Crop Each Region with "Best" Straight Lines
The original image is then cropped for each border region according to the "best" straight line. The final cropped image is shown below on the right. As you may notice, sometimes not all of the undesirable edge artifacts are cropped off. To avoid accidentally removing good regions from the image, the algorithm is deliberately conservative: it is parameterized to err on the side of cropping too little rather than too much.
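The line-detection step can be illustrated with a small Hough accumulator written in plain NumPy. This is a sketch under assumptions: it takes a binary edge map as input (for example, the output of a Canny detector), the function names `hough_accumulator` and `strongest_line` are hypothetical, and it returns only the single strongest peak rather than applying the distance and length rules described above.

```python
import numpy as np


def hough_accumulator(edges, n_theta=180):
    """Vote in (rho, theta) space for every nonzero pixel of a binary edge map."""
    h, w = edges.shape
    thetas = np.deg2rad(np.arange(-90, 90, 180 / n_theta))
    diag = int(np.ceil(np.hypot(h, w)))
    acc = np.zeros((2 * diag + 1, len(thetas)), dtype=int)
    ys, xs = np.nonzero(edges)
    for t_idx, theta in enumerate(thetas):
        # Normal form of a line: rho = x*cos(theta) + y*sin(theta).
        rho = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        np.add.at(acc, (rho + diag, t_idx), 1)
    rhos = np.arange(-diag, diag + 1)
    return acc, rhos, thetas


def strongest_line(edges):
    """Return (rho, theta in degrees, votes) for the highest accumulator peak."""
    acc, rhos, thetas = hough_accumulator(edges)
    r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
    theta_deg = float(np.rad2deg(thetas[t_idx]))
    return int(rhos[r_idx]), theta_deg, int(acc[r_idx, t_idx])


# A vertical border line at x = 12 with a gap, mimicking partial occlusion.
edges = np.zeros((50, 30), dtype=bool)
edges[5:45, 12] = True
edges[20:26, 12] = False
rho, theta_deg, votes = strongest_line(edges)
```

For a vertical line, `theta_deg` is 0 and `rho` is the column at which a left or right border region would be cropped. The gap in the line lowers the vote count but does not move the peak, which illustrates the occlusion tolerance mentioned above.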
Automatic Contrasting
After cropping, automatic contrasting is performed to correct the colors in the image. The input image is divided into separate color channels and a histogram is constructed for each color channel. Next, the pixels below the 0.05th percentile and above the 99.95th percentile are ignored to remove outliers. The mean of the remaining pixels is then calculated and subtracted from each pixel in the color channel. The pixel values are then rescaled and offset to stretch the color range to 0-255. The color channels are then combined to form the final image. This algorithm was inspired by a similar technique described in the GIMP documentation.
![]() |
![]() |
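The contrast stretch described in this section could look roughly like the sketch below. The function names `stretch_channel` and `auto_contrast` are hypothetical, and two details are assumptions not stated in the text: outliers are clamped to the percentile bounds rather than dropped, and a completely flat channel is mapped to zero.

```python
import numpy as np


def stretch_channel(channel, low_pct=0.05, high_pct=99.95):
    """Clamp outliers, center on the inlier mean, then stretch to 0-255."""
    c = channel.astype(float)
    lo, hi = np.percentile(c, [low_pct, high_pct])
    inliers = c[(c >= lo) & (c <= hi)]
    centered = np.clip(c, lo, hi) - inliers.mean()
    span = centered.max() - centered.min()
    if span == 0:
        # Assumption: a flat channel carries no contrast to stretch.
        return np.zeros_like(c, dtype=np.uint8)
    scaled = (centered - centered.min()) / span * 255.0
    return np.round(scaled).astype(np.uint8)


def auto_contrast(rgb):
    """Apply the stretch independently to each color channel."""
    channels = [stretch_channel(rgb[..., i]) for i in range(rgb.shape[-1])]
    return np.dstack(channels)
```

Stretching each channel independently can shift the color balance as well as the contrast, which is consistent with the goal of correcting colors rather than only brightening the image.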
Extra Data
The algorithm generally performed well on other Prokudin-Gorskii images from the Library of Congress archive that were tested. Even most of the images with some damage were successfully aligned.
![]() |
![]() |
![]() |
![]() |
Failure Case
The algorithm failed on one damaged Prokudin-Gorskii image from the Library of Congress archive. The likely cause is that one of the color channels has too much damage in its interior pixels for a reliable alignment.
![]() |
![]() |
Source of Title Image
Giza, Pyramids of. [Photograph]. In Encyclopædia Britannica. Retrieved from http://www.britannica.com/EBchecked/media/108145/Pyramids-of-Giza-southwest-of-Cairo-Egypt