Project 4 Writeup

Tim St. Clair (tstclair)
March 16, 2010


My implementation of this project is very different from the methods that were suggested in the project handout. The algorithm works by dividing images up into small patches. Completing an image region is done by matching patches from other images using a nearest neighbor approach. Specifically, a patch is matched by finding a patch in the support images (provided matches) using a nearest neighbor approach on the neighboring patches.

For this project I used 32x32 pixel patches, inspired by the tiny images project. The provided match images are 1024x768 pixels, which is 92160 patches over 120 images. As each patch is 32x32x3=3072 dimensions, finding the nearest neighbor for a patch would obviously take an unreasonably long amount of time. To get around this I used Principle Component Analysis (PCA) to greatly reduce the dimensions of each patch. Trying to perform PCA on 92160 patches would also take an unreasonable amount of time, and require more memory than most computers currently have. So, I used a random set of approximately 32 patches per image, or 3840 total. Once the images are converted to PCA space, I weight the dimensions by their variance to computed the disstance between images, since the first basis acounts for far more image information than later bases.

Below is a display of the first 32 bases (normalized in color space), with the first basis in the top left, advancing down and then across.

Below are examples of one of the images (src_img001.jpg) with the patches converted to PCA space, and then reconstructed using 8, 20, 32, and 64 bases respectively. For this project I used 32 bases. One interesting thing to point out is that there is not a lot of green in the training (matched) images, hence there is no green in the first 8 bases. Also note that this image was NOT used to compute the bases.

Once the PCA bases are computed, and the 120 matched images are converted into PCA space, the algorithm is able to find the nearest neighbor match for a given patch over all 120 images in a few seconds. On an intel centrino 2.16 GHz processor, computing the PCA bases takes approximately 10 minutes. Projecting the 120 matches to PCA space takes approximately 4.5 minutes, and finding the results takes only 14 seconds.


I am fairly pleased with my results, but I think they could be greatly improved using some of the suggestions in the following further work section. I thought the most noticeable problem in my results is the blocky nature of the filled in patch, due to using unform square patches. I actually thought that the results using only 10 images were actually as good or better than using the full set of 120, especially for the MIT building (example 1). Overall, I would say the results look very good from a distance, but up close some of the finer details are not blended well.

Further Work

There are several extensions to this project which I would like to explore given more time. The first would be to explore the effects of using different patch sizes. Another addition which I think would potentially help a great deal would be to use the graph cut algorithm to find a better boundary between patches in the results image, which would hopefully get rid of the blocky look of the patches. One more idea would be to try and propogate edge structure better. This could be done either by weighting neighbors with more structure more heavily, or by shifting the matched patches to find a better alignment.

Results Images

The results below were computed on only 10 training images. The columns show (1)The PCA bases, (2)The result showing the matched patches as PCA reconstructions, (3) the result with the matched patches taken directly from the original images, and (4) the same as 3 but averaged with a poisson blend.

These results were computed using the full set of 120 matched images.