Project 1: Image Alignment with Pyramids

CS 129: Computational Photography

Hang Su

Email: hangsu@cs.brown.edu

Sep. 17th, 2012

 

Fig 1 Alignment results

1 Introduction

The Prokudin-Gorskii Collection records an early attempt to capture color photographs. It contains 1,902 black-and-white triple-frame images made with color separation filters; color photos can be recovered by properly aligning the three frames. This project solves that alignment problem.

2 Algorithm

Since the three frames of each image usually differ by quite small transformations, a simple “shift & match” strategy works well (here we assume no rotation or scaling is involved): shift one frame by various displacements, and find the displacement where it best fits the other frames.
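The exhaustive search could look like the following MATLAB sketch (a sketch only, not the original code; the function name, the search radius r, and the assumption of two equally-sized grayscale frames in double format are all illustrative):

    % Exhaustive "shift & match" sketch: return the displacement in
    % [-r, r] x [-r, r] that minimizes the SSD score against 'fixed'.
    function [best_dy, best_dx] = align_exhaustive(moving, fixed, r)
        best_score = inf;
        for dy = -r:r
            for dx = -r:r
                shifted = circshift(moving, [dy dx]);    % cyclic shift
                score = sum((shifted(:) - fixed(:)).^2); % SSD over all pixels
                if score < best_score
                    best_score = score;
                    best_dy = dy; best_dx = dx;
                end
            end
        end
    end

In practice the wrap-around borders introduced by circshift (and the plate borders themselves) would be cropped before scoring; that detail is omitted here for brevity.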

Fig 2 Find best alignment according to SSD

Fig 3 Find best alignment according to normalized cross correlation (cosine)

2.1 Alignment with Pyramids

For large images, doing this directly requires searching a large space and therefore takes quite a long time (several minutes even for a low-resolution image). Using image pyramids speeds up the process considerably: align coarse, downsampled copies first, then refine the estimate at successively finer levels. On a single department machine, aligning low-resolution images (1024 px height) takes less than 1 second, while high-resolution images (~10000 px height) take about half a minute.
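A coarse-to-fine recursion along these lines might look as follows (again a sketch under assumptions: imresize is from the Image Processing Toolbox, the 64 px base-case size and the search radii are illustrative, and align_exhaustive is the function sketched above):

    % Coarse-to-fine alignment sketch: estimate the shift at half
    % resolution, double it, then refine within +/-1 px at this level.
    function [dy, dx] = align_pyramid(moving, fixed)
        if min(size(moving)) < 64                    % coarsest level (assumed)
            [dy, dx] = align_exhaustive(moving, fixed, 15);
            return;
        end
        [dy, dx] = align_pyramid(imresize(moving, 0.5), imresize(fixed, 0.5));
        dy = 2*dy; dx = 2*dx;                        % scale estimate up
        moving = circshift(moving, [dy dx]);
        [ddy, ddx] = align_exhaustive(moving, fixed, 1);  % local refinement
        dy = dy + ddy; dx = dx + ddx;
    end

Since each finer level only refines the estimate by a pixel or so, the total work is dominated by the small exhaustive search at the coarsest level, which is what makes the runtimes above possible.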

2.2 Image Features

A straightforward approach is to compare pixel intensities directly. However, since the frames to be matched do not actually have the same brightness values, other image features can work better. I tried using the gradient magnitude at each pixel, which improves performance vastly. A comparison can be found in Section 4.
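Computing this feature is simple; a sketch using plain finite differences (variable names are illustrative, and conv2 is base MATLAB):

    % Gradient-magnitude feature (sketch). 'frame' is one grayscale plate.
    im = double(frame);
    gx = conv2(im, [-1 0 1],  'same');   % horizontal finite difference
    gy = conv2(im, [-1 0 1]', 'same');   % vertical finite difference
    feat = sqrt(gx.^2 + gy.^2);          % gradient magnitude at each pixel

Matching feat instead of im makes the score depend on edge structure rather than absolute brightness, which is exactly what differs between the three color-filtered plates.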

2.3 Matching Metrics

Two matching metrics are tried in this project:

·  Sum of squared differences (SSD): sum( (image1(:) - image2(:)).^2 )

·  Normalized cross correlation (NCC): dot( image1(:)/norm(image1(:)), image2(:)/norm(image2(:)) )

These two metrics are very similar (Fig 2 & Fig 3), though experiments (see Section 4) show the second metric is the better choice in this specific case.
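As a sketch, the NCC score above is just a cosine between the flattened images (the function name is illustrative); note that unlike SSD it is maximized rather than minimized:

    % Cosine-style normalized cross correlation (sketch). Flatten both
    % images and take the dot product of the corresponding unit vectors.
    function score = ncc(image1, image2)
        a = image1(:);
        b = image2(:);
        score = dot(a / norm(a), b / norm(b));   % in [-1, 1], higher is better
    end

Normalizing by the vector lengths makes the score invariant to a global intensity scaling of either frame, which SSD is not.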

3 Better Alignment

The method discussed in Section 2 works quite well for this dataset. However, in cases where rotation or scaling must be considered, it is clearly insufficient. Also, this naive “shift & match” approach cannot achieve sub-pixel precision.

To overcome these drawbacks, more advanced methods should be used. For example, candidate matching points can be found using SIFT descriptors (Fig 4); RANSAC can then identify the reliable matches, and finally the resulting correspondences can be used to compute a transformation matrix. This is similar to what will be done in Project 6, so I leave it for then.

Fig 4 Matching of SIFT descriptors
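For concreteness, here is a sketch of that pipeline assuming the VLFeat toolbox is on the path (vl_sift and vl_ubcmatch are VLFeat functions; the pure-translation model, the iteration count, and the 2 px inlier threshold are assumptions for illustration):

    % SIFT + RANSAC alignment sketch (assumes VLFeat).
    [f1, d1] = vl_sift(single(im1));         % keypoints + descriptors
    [f2, d2] = vl_sift(single(im2));
    matches  = vl_ubcmatch(d1, d2);          % putative descriptor matches
    p1 = f1(1:2, matches(1,:));              % matched point locations
    p2 = f2(1:2, matches(2,:));
    best_inliers = [];
    for iter = 1:1000                        % RANSAC: sample one match,
        k = randi(size(p1, 2));              % hypothesize a translation,
        t = p2(:,k) - p1(:,k);               % and count agreeing matches
        err = sqrt(sum((bsxfun(@plus, p1, t) - p2).^2, 1));
        inliers = find(err < 2);             % 2 px inlier threshold
        if numel(inliers) > numel(best_inliers)
            best_inliers = inliers;
        end
    end
    t = mean(p2(:,best_inliers) - p1(:,best_inliers), 2);  % refit on inliers

A similarity or affine model would replace the one-point translation hypothesis with a two- or three-point minimal sample, but the structure of the loop stays the same.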

4 Experiments

4.1 On Prokudin-Gorskii Collection

Fig 5 & Fig 6 Performance comparison of different features & matching metrics

The collection contains 1,902 black-and-white triple-frame images. From the website of the Library of Congress I downloaded two versions of this dataset:

·  High-resolution TIFF images, 128.6 GB, urls

·  Low-resolution JPEG images, 319 MB, urls

For each dataset I tried different combinations of image features and matching metrics, and manually counted the number of photos correctly aligned. Fig 5 & Fig 6 show the performance comparison. It is worth mentioning that with gradient magnitude as the image feature and normalized cross correlation as the matching metric, only 8 out of 1,902 images are wrongly aligned, an accuracy of 99.58%.

4.2 Image from Other Sources

I tried exactly the same algorithm on data from other sources. For example, Fig 7 shows a pseudo-color composition of “Magnificent CME Erupts on the Sun”.

 

Fig 7 Alignment & composition of pseudo-color image

Last update: Dec. 17th, 2012