CS 129 Project 1 Writeup
Greg Yauney (gyauney)
17 September 2012

I first implemented single-resolution alignment, which is relatively simple: every alignment in the suggested 30x30 pixel window of possible shifts is considered. The sum of squared differences (SSD) is calculated for each one, and the alignment with the lowest SSD is taken as the best. To keep the plate borders from skewing the SSD, they are cropped off each plate before any alignment is done (this is, unfortunately, not dynamic: a fixed 10% is taken from each side).
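The search described above can be sketched roughly as follows. This is an illustrative Python version using plain lists, not the code used for the project; the function names are made up for the example, and the circular (wrap-around) shift is an assumption, since the writeup does not say how pixels shifted past the edge are handled. A radius of 15 covers shifts from -15 to 15, which approximates the 30x30 window.

```python
def crop_border(img, frac=0.10):
    """Drop a fixed fraction of rows and columns from every side."""
    h, w = len(img), len(img[0])
    dy, dx = int(h * frac), int(w * frac)
    return [row[dx:w - dx] for row in img[dy:h - dy]]


def ssd(a, b):
    """Sum of squared differences between two equal-sized images."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def shift(img, dy, dx):
    """Circularly shift an image down by dy rows and right by dx columns.

    Assumption: wrap-around shifting; the writeup does not specify this.
    """
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def align_single(moving, reference, radius=15):
    """Try every shift within `radius` pixels and return the (dy, dx)
    that gives the lowest SSD against the reference plate."""
    best, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = ssd(shift(moving, dy, dx), reference)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best
```

In use, each plate would be passed through `crop_border` first, and then something like `align_single(green, blue)` would return the shift to apply to the green plate.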

The coarse-to-fine image pyramid alignment algorithm is slightly more complex. First, the input image is blurred with a Gaussian filter and then subsampled to a quarter of its current size (each dimension is halved by discarding every other row and column). This is done three times, giving an image pyramid with four levels. (I also tried a more dynamic approach in which successive levels were generated until the smallest was smaller than 32x32, but that produced poor alignments for some images.) Then, starting with the smallest level, the algorithm searches for an alignment in a much smaller window (I chose 8x8 pixels, admittedly somewhat arbitrarily). The alignment is then propagated down to the next, larger image in the pyramid by multiplying the shifts by two and using them as the starting point of the next search. This continues all the way down the pyramid until the original, largest image is reached and aligned.
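Here is a minimal sketch of that coarse-to-fine loop, again in dependency-free Python and again only an illustration rather than the project code. The names are hypothetical, the [1, 2, 1] / 4 kernel is a small stand-in for whatever Gaussian filter was actually used, and the circular shift is an assumption. The defaults follow the text: four levels, and a radius of 4 (shifts from -4 to 4) to approximate the 8x8 window.

```python
def blur(img):
    """Separable [1, 2, 1] / 4 blur, a small stand-in for a Gaussian filter.

    Edge pixels are handled by clamping indices to the image bounds.
    """
    h, w = len(img), len(img[0])
    horiz = [[(row[max(x - 1, 0)] + 2 * row[x] + row[min(x + 1, w - 1)]) / 4
              for x in range(w)] for row in img]
    return [[(horiz[max(y - 1, 0)][x] + 2 * horiz[y][x]
              + horiz[min(y + 1, h - 1)][x]) / 4
             for x in range(w)] for y in range(h)]


def build_pyramid(img, levels=4):
    """Return [full size, half size, ...]; each level is the previous one
    blurred and then subsampled by keeping every other row and column."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append([row[::2] for row in blur(pyramid[-1])[::2]])
    return pyramid


def ssd(a, b):
    """Sum of squared differences between two equal-sized images."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def shift(img, dy, dx):
    """Circular shift (an assumption): down by dy rows, right by dx columns."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def best_shift(moving, reference, center, radius):
    """Lowest-SSD shift within `radius` pixels of the `center` estimate."""
    best, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(shift(moving, dy, dx), reference)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best


def align_pyramid(moving, reference, levels=4, radius=4):
    """Align at the smallest level first, then double the estimate and
    refine it with a small search at each larger level."""
    estimate = (0, 0)
    levels_m = build_pyramid(moving, levels)
    levels_r = build_pyramid(reference, levels)
    for m, r in zip(reversed(levels_m), reversed(levels_r)):
        estimate = (2 * estimate[0], 2 * estimate[1])
        estimate = best_shift(m, r, estimate, radius)
    return estimate
```

The point of the design is that only the small window is ever searched at full resolution, while larger displacements are found cheaply on the small images: one pixel at the smallest of four levels corresponds to eight pixels in the original.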

Here's a sampling of the final aligned photos produced by both algorithms, along with the elapsed time for each. As you can see, the multi-resolution version looks just as accurate yet is much faster, especially in the larger cases. (It's worth noting that I've scaled down the high-resolution .tif images so that they fit nicely into the table; rest assured that I ran the alignment algorithms on the full-size images.)

single resolution    image                                          multi resolution
3.296 seconds        00125v.jpg 321x273                             0.5122 seconds
571.1735 seconds     01064u.tif 2993x2570 (displayed at 400x343)    153.3201 seconds
1.3297 seconds       01044v.jpg 321x273                             0.5210 seconds
3.2973 seconds       00170v.jpg 321x273                             0.0468 seconds
544.9957 seconds     00458u.tif 2993x2592 (displayed at 400x346)    157.6313 seconds

Both algorithms fail on the last two images above, along with plenty more that I've left out for the sake of brevity. This makes me think that these particular images are poorly suited to the SSD metric; I'd implement normalized cross-correlation if I had the time.
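For reference, normalized cross-correlation could be sketched as below. This was not implemented for the project; it is only an illustration of the metric, with a hypothetical function name. Unlike SSD, the score is unchanged when one plate is uniformly brighter or has more contrast than the other, and the best alignment is the one with the highest score rather than the lowest.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-sized images.

    Returns a value in [-1, 1]; higher means a better match.
    Returns 0.0 if either image is constant (zero variance).
    """
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else 0.0
```

Swapping this in for SSD would mean keeping the shift that maximizes `ncc` instead of the one that minimizes the squared difference.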

I unfortunately didn't implement anything above and beyond for extra credit, so there is of course not much else to say.
