Project 4 Writeup

Alexander Hills (ahills)
March 12, 2010

This project was an experiment in scene completion, culminating in an intelligent brute-force alignment of 120 pictures per image tested. The idea is this: take an image (call it the source), retrieve the 120 most similar images from a 6.5 million image repository, then "complete" the source. The source has data "missing" from it, a chunk cut out; this is represented by a pixel mask the size of the source, indicating what part of the original image has been "lost". The next step is to align each of the 120 most similar images with the source as closely as possible, find the best matches, and reconstruct the source with data taken from those images.

There are a number of different ways of determining what constitutes a "similar" image, and of figuring out what portion of an image to paste into the source. The approach I used is an image pyramid. Say we have a damaged source image, called source, and an image to patch it with, called patch. First, we scale patch to the size of source. Second, we scale both images down (in the first step, to one third of the original height and width), reducing their resolution with intelligent filtering, say bicubic interpolation. Third, we shift the patch around until it most closely matches the source image. We then keep track of the best shift (some x,y translation), scale the images back up a step (my second scale was to 66% of the original size), and search again starting from the optimal shift we already calculated, continuing until we've reached the original image size. This amounts to a search that is refined further and further, coarse to fine.
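Below is a minimal sketch of this coarse-to-fine alignment in Python with NumPy and OpenCV. It is an illustration rather than the code actually used for the project; the scale schedule (1/3, 2/3, 1) comes from the description above, while the size of the shift search window is an assumption.

    import numpy as np
    import cv2

    def ssd(a, b):
        # Sum of squared differences between two same-size images.
        d = a.astype(np.float64) - b.astype(np.float64)
        return float(np.sum(d * d))

    def shift_image(img, dx, dy):
        # Translate img by (dx, dy) pixels, replicating edge pixels at the border.
        m = np.float32([[1, 0, dx], [0, 1, dy]])
        return cv2.warpAffine(img, m, (img.shape[1], img.shape[0]),
                              borderMode=cv2.BORDER_REPLICATE)

    def align(source, patch, scales=(1 / 3, 2 / 3, 1.0), radius=10):
        # Coarse-to-fine search for the (dx, dy) shift of patch that best
        # matches source; patch is resized to the current level at each step.
        h, w = source.shape[:2]
        best_dx, best_dy = 0, 0
        for s in scales:
            size = (max(1, int(w * s)), max(1, int(h * s)))
            src_s = cv2.resize(source, size, interpolation=cv2.INTER_CUBIC)
            pat_s = cv2.resize(patch, size, interpolation=cv2.INTER_CUBIC)
            # Start from the best shift found at the previous (coarser) level.
            cx, cy = int(round(best_dx * s)), int(round(best_dy * s))
            best_err = np.inf
            for dy in range(cy - radius, cy + radius + 1):
                for dx in range(cx - radius, cx + radius + 1):
                    err = ssd(src_s, shift_image(pat_s, dx, dy))
                    if err < best_err:
                        best_err, bx, by = err, dx, dy
            # Store the winner in full-resolution coordinates for the next level.
            best_dx, best_dy = int(round(bx / s)), int(round(by / s))
        return best_dx, best_dy

In the real pipeline the missing region would presumably be excluded from the comparison; this sketch compares whole images for simplicity.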

We do this for all 120 "matching" images and rank their best fits. The ranking I used is a simple sum of squared differences (SSD) metric: I took the squared Euclidean distance between the colors of corresponding pixels in the two images and summed those values into a single number representing how far apart the images are. I then used these numbers to rank the patch images and took the best four to patch the original image.
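A small sketch of the ranking step follows, reusing the ssd, shift_image, and align helpers from the sketch above. Keeping the best four matches comes from the writeup; everything else is illustrative.

    import cv2

    def rank_candidates(source, candidates, keep=4):
        # Align every candidate to the source, score it by SSD, and keep the
        # `keep` closest matches (lower SSD means a closer match).
        h, w = source.shape[:2]
        scored = []
        for patch in candidates:
            patch = cv2.resize(patch, (w, h), interpolation=cv2.INTER_CUBIC)
            dx, dy = align(source, patch)
            aligned = shift_image(patch, dx, dy)
            scored.append((ssd(source, aligned), aligned))
        scored.sort(key=lambda t: t[0])
        return [img for _, img in scored[:keep]]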

The patching process is as follows. Use a graphcut algorithm described here to determine the best possible "cut" between the two images. The masks used were the original mask (indicating the "damage" to the source image) and a version of that mask dilated by 100 pixels in all directions; the optimal cut was computed in the area between these two masks. This produces a mask which is then used to compute a blended version of the patch image that incorporates the coloring of the source image. (This was done using Poisson blending, described here. I used my non-transparent blend; I found that, since the images were already made to be so similar, rendering one transparent actually made it look worse, every time.)
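The sketch below shows the rough shape of this compositing step, again with NumPy/OpenCV. It is not the actual implementation: the graph-cut seam is skipped (the dilated mask is used directly as the blend region), and OpenCV's seamlessClone, which implements Poisson image editing, stands in for the non-transparent Poisson blend described above. The 100-pixel dilation comes from the writeup; the mask convention (255 = missing) is an assumption.

    import numpy as np
    import cv2

    def composite(source, patch, hole_mask, band=100):
        # source, patch: same-size color images; hole_mask: uint8, 255 where
        # the source data is missing.
        # Dilate the hole by `band` pixels in every direction. In the real
        # pipeline the optimal cut is found in the ring between hole_mask and
        # this dilated mask; here we simply blend over the whole dilated region.
        kernel = np.ones((2 * band + 1, 2 * band + 1), np.uint8)
        blend_mask = cv2.dilate(hole_mask, kernel)

        # Poisson-blend the patch into the source over the blend region so the
        # pasted pixels pick up the source image's coloring.
        ys, xs = np.nonzero(blend_mask)
        center = (int((xs.min() + xs.max()) // 2), int((ys.min() + ys.max()) // 2))
        return cv2.seamlessClone(patch, source, blend_mask, center, cv2.NORMAL_CLONE)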

Finally, this algorithm was applied to a large number of images. Here are the more interesting results (some images are stretched so they all fit on the page nicely):

Results Images

Image 4
This is nice (and quite amusing; the juxtaposition of cultures is fun).
But this is nicer.

Image 26
Best.
Decent; you can see an odd building-over-building artifact at the bottom.
Again, decent. Some clipping of houses happened, and the color changes are a bit jarring.

Image 35
These three are amazing. I dare you to tell me where the cut between the images is.
Sky and oceans are just made for this.
The worst part, I suppose, is the color difference. But it's darn pretty.

Image 41
Impossible to tell where that cut is...
Almost impossible in this one; you can see an artifact at the top left of the lake.
And the lake is replaced by... a beach! What?