Final Project: Image Analogies

Diana Huang (dkh) - April 19, 2010

Description

For this project, I implemented the algorithm described in the paper Image Analogies. The paper describes a general method for creating an image analogy from three images: a pair (A and A') that captures the relationship we would like to replicate, and an image (B) for which we would like to find the analogous image (B').

This technique supports a wide range of applications, from replicating simple filters to transferring a texture onto a new image.

The Algorithm

The algorithm for this project has a few different parts.

At its core, the algorithm tries to find, for every point q in B', a matching point p in A'. To accomplish this, we synthesize B' pixel by pixel, so that previously generated results can be used to improve the results of later searches.

In order to do this, we first construct a Gaussian pyramid for each image, with each level half the size of the previous one. This allows us to do the synthesis on several levels and prevents small-scale problems from replicating themselves at the higher resolutions. This becomes key when we are dealing with large images.
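As a rough illustration of the pyramid step, here is a minimal pure-Python sketch. The function names, the 5-tap binomial kernel, and the clamped borders are assumptions made for this example and are not taken from the project's code. The levels are returned coarsest first, so that index l-1 is the coarser neighbor of index l, as in the text.

```python
def blur_1d(values, kernel=(1, 4, 6, 4, 1)):
    """Blur a 1D list with a small binomial kernel, clamping at the edges."""
    n, half, total = len(values), len(kernel) // 2, sum(kernel)
    return [
        sum(k * values[min(max(i + j - half, 0), n - 1)]
            for j, k in enumerate(kernel)) / total
        for i in range(n)
    ]


def blur(image):
    """Separable blur of a 2D grid: rows first, then columns."""
    rows = [blur_1d(row) for row in image]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def gaussian_pyramid(image, levels):
    """Return a list of 2D grids, coarsest first and the input image last."""
    pyramid = [image]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])
        # Keep every second row and column to halve the size.
        pyramid.append([row[::2] for row in smoothed[::2]])
    return pyramid[::-1]
```

For an 8x8 luminance grid and three levels, this yields grids of size 2x2, 4x4, and 8x8.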

Then we construct a feature vector for each pixel q at each level l of an image A by concatenating luminance values. The feature vector includes not only the luminance at q, but also the 5x5 neighborhood of q at level l, the 3x3 neighborhood of q at level l-1, and the corresponding neighborhoods of A'. This lets us capture information from both A and A', as well as from levels l and l-1.
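The feature construction described above could be sketched as follows. This is only an illustration: the helper names are hypothetical, borders are clamped, and full neighborhoods are used everywhere, whereas a real synthesis pass can only use the pixels of B' that have already been generated. The pyramids are assumed to be lists of 2D luminance grids ordered coarsest first.

```python
def neighborhood(image, y, x, size):
    """Flatten the size x size window centred on (y, x), clamping at borders."""
    h, w, r = len(image), len(image[0]), size // 2
    return [
        image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
    ]


def feature_vector(src_pyr, flt_pyr, level, y, x):
    """Concatenate luminance windows from the unfiltered and filtered pyramids.

    src_pyr holds A (or B) and flt_pyr holds A' (or B'), coarsest level first,
    so index level - 1 is the coarser neighbour of index level.
    """
    feat = neighborhood(src_pyr[level], y, x, 5)
    feat += neighborhood(flt_pyr[level], y, x, 5)
    if level > 0:  # the coarsest level has no level - 1 to draw from
        feat += neighborhood(src_pyr[level - 1], y // 2, x // 2, 3)
        feat += neighborhood(flt_pyr[level - 1], y // 2, x // 2, 3)
    return feat
```

With this layout, a pixel above the coarsest level gets 25 + 25 + 9 + 9 = 68 values, and a pixel at the coarsest level gets 50.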

This diagram from the paper displays the neighborhoods that are used to construct the features:

So what metrics do we use to find the correct p for a pixel q at a given level l? We would like the new image to have properties of both A' and B, but we would also like to be able to choose between picking a value because it closely resembles A' and picking a value because it has properties of B. We find the value that most closely resembles B by doing an Approximate Nearest Neighbors (ANN) search using the feature vector for pixel q.
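To make the search step concrete, the sketch below uses an exhaustive nearest-neighbor search as a stand-in for the ANN search. It returns the exact best match and is far slower on real images, but the interface is the same: a query feature vector goes in, and the position p with the closest feature vector comes out. The function names and the dictionary layout are assumptions for this example.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_match(query, candidates):
    """Return the position p whose feature vector is closest to the query.

    candidates maps a pixel position p = (row, col) in the A / A' pair to
    its feature vector; query is the feature vector of pixel q.
    """
    return min(candidates, key=lambda p: squared_distance(candidates[p], query))
```

For example, with candidates at (0, 0), (0, 1), and (1, 0) holding the vectors [0, 0], [1, 1], and [5, 5], the query [0.9, 1.2] selects (0, 1).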

In order to make sure that the new image B' preserves some of the features of A', we also do a coherence search over A' for pixels that could be a better visual match because they keep the structure of A' intact. We then compare the result of the ANN search to the result of the coherence search and choose which one is 'better' using an attenuation factor.
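The text above does not spell out how the coherence candidates are generated or how the two results are compared, so the following sketch fills those in with assumptions based on the paper: each already-synthesized neighbor r of q proposes the candidate s(r) + (q - r), where s(r) is the position in A' that r was copied from, and the coherence match wins if its distance is within a factor 1 + kappa * 2^(l - L) of the ANN distance. The names source_of and kappa, and the window radius, are hypothetical.

```python
def coherence_candidates(q, source_of, shape, radius=2):
    """Yield candidate positions in A' proposed by synthesized neighbours of q.

    source_of maps each already-synthesized position r in B' to the position
    s(r) in A' it was copied from; shape is (height, width) of A'.
    """
    qy, qx = q
    height, width = shape
    seen = set()
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r = (qy + dy, qx + dx)
            if r not in source_of:  # not synthesized yet (or outside B')
                continue
            sy, sx = source_of[r]
            p = (sy - dy, sx - dx)  # s(r) + (q - r)
            if 0 <= p[0] < height and 0 <= p[1] < width and p not in seen:
                seen.add(p)
                yield p


def choose_match(p_app, d_app, p_coh, d_coh, level, finest_level, kappa):
    """Prefer the coherence match unless the ANN match is clearly closer.

    The bonus given to coherence is attenuated at coarser levels.
    """
    weight = 1 + kappa * 2 ** (level - finest_level)
    return p_coh if d_coh <= d_app * weight else p_app
```

Setting kappa to 0 reduces the rule to a plain distance comparison, while larger values favor copying coherent patches from A'.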

Now that we have this pixel p, we can copy its luminance value in A' into B' at position q at the current level l. Once we have done this process for all the levels, we can use the luminance values at the finest level L to create the new version of the image.
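One way to turn the synthesized luminance back into a color image is sketched below. It assumes, following the paper rather than anything stated here, that the color channels are taken from B in YIQ space and only the Y channel is replaced. The function name recolor is hypothetical; the conversions come from Python's standard colorsys module.

```python
import colorsys


def recolor(b_rgb, b_prime_luma):
    """Combine synthesized luminance with the I and Q chroma channels of B.

    b_rgb is a 2D grid of (r, g, b) tuples with components in [0, 1];
    b_prime_luma is a 2D grid of luminance values of the same size.
    """
    result = []
    for rgb_row, luma_row in zip(b_rgb, b_prime_luma):
        row = []
        for (r, g, b), y_new in zip(rgb_row, luma_row):
            _, i, q = colorsys.rgb_to_yiq(r, g, b)  # discard B's own luminance
            row.append(colorsys.yiq_to_rgb(y_new, i, q))
        result.append(row)
    return result
```

Note that colorsys.yiq_to_rgb clamps its output to the range [0, 1], so extreme luminance values cannot push a pixel out of range.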