CS 143 / Project 1 / Image Filtering and Hybrid Images

Walter White and Voldemort.

Introduction

In this project, I implement the idea behind the paper "Hybrid Images" by Oliva, Torralba and Schyns (2006). The work consisted of first coding an image filtering function akin to MATLAB's built-in imfilter, and then using that function to create hybrid images.

Motivation

Hybrid images are a form of optical illusion that depends on viewing distance. Seen from far away, the brain may interpret an image very differently than from close by, because the higher frequencies of the image are no longer resolved. Dali exploited this idea in several of his perspective-dependent paintings: what looks like a woman staring out to sea from close by looks like a portrait of Abraham Lincoln from far enough away. With computational image processing, it should be straightforward to keep the high frequencies of one image and the low frequencies of another, and then to combine the two into one hybrid, ambiguous image.

Steps

Image filtering. To create my_imfilter I read images into MATLAB, which stores them as a multidimensional array (pixel height x pixel width x number of channels). The image needs to be padded so that the filter fits over even the edge pixels; I decided to pad the images with zeroes, which has the disadvantage that blurring operations darken the edges somewhat. After padding, I loop over the pixels of the actual image, compute the dot product of the filter and the pixel's immediate neighborhood, and use that value as the value of that pixel in the filtered image. After this loop, I return the filtered image without the padding, at the same resolution as the input.
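The padding-and-loop procedure can be sketched as follows. This is a minimal illustration in plain Python, not the MATLAB my_imfilter itself: it assumes a single-channel image stored as a list of rows and a filter with odd dimensions (a color image would be filtered one channel at a time). The function name is reused only for readability, and the small `box` and `flat` test arrays are made up for the example.

```python
def my_imfilter(image, kernel):
    """Filter a 2-D grayscale image (list of rows) with an odd-sized kernel.

    Zero padding is used, so the output has the same size as the input.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    if k_h % 2 == 0 or k_w % 2 == 0:
        raise ValueError("kernel dimensions must be odd")
    pad_h, pad_w = k_h // 2, k_w // 2

    # Build a zero-padded copy so the kernel fits over the edge pixels.
    padded = [[0.0] * (img_w + 2 * pad_w) for _ in range(img_h + 2 * pad_h)]
    for r in range(img_h):
        for c in range(img_w):
            padded[r + pad_h][c + pad_w] = float(image[r][c])

    # For every original pixel, take the dot product of the kernel and the
    # neighborhood centered on that pixel.
    output = [[0.0] * img_w for _ in range(img_h)]
    for r in range(img_h):
        for c in range(img_w):
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    total += kernel[i][j] * padded[r + i][c + j]
            output[r][c] = total
    return output


# A 3x3 box blur applied to a uniformly bright 4x4 image.
box = [[1.0 / 9.0] * 3 for _ in range(3)]
flat = [[9.0] * 4 for _ in range(4)]
blurred = my_imfilter(flat, box)
```

In this toy case the interior pixels keep their value of about 9, while a corner pixel drops to about 4 because five of the nine values under the filter are padding zeroes. That is the edge darkening described above.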

The original image

A blurred image; notice the dark edges due to the padding.

A filter with a larger standard deviation.

A Laplacian filter - sensitive to higher frequencies

A Sobel filter applied to the image - sensitive to vertical edges.

Hybrid images. To create hybrid images, I take two equally sized images and follow the algorithm below:

  1. Filter the first image with a Gaussian filter, blurring the image and removing the high frequencies. Some experimentation with the cutoff frequency (that is, the standard deviation of the Gaussian curve) is necessary to provide the best results.
  2. Subtract a blurred version of image 2 from the original image 2, leaving behind the high frequencies of image 2. A similar effect could be obtained by filtering with a Laplacian filter.
  3. Add the filtered version of image 1 to the filtered version of image 2.
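The three steps can be sketched in plain Python as below. This is an illustration under stated assumptions, not the code used for the results: images are single-channel lists of rows of equal size, one standard deviation `sigma` serves as the cutoff for both images, and the helper names (`gaussian_kernel`, `zero_pad_filter`, `hybrid_image`) are hypothetical. The kernel radius of three standard deviations is also just a common choice.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 2-D Gaussian kernel with radius ceil(3 * sigma)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
              for y in range(-radius, radius + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def zero_pad_filter(image, kernel):
    """Filter a 2-D image, treating pixels outside the image as zero."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        total += kernel[i][j] * image[rr][cc]
            out[r][c] = total
    return out


def hybrid_image(image1, image2, sigma):
    """Combine the low frequencies of image1 with the high frequencies of image2."""
    kernel = gaussian_kernel(sigma)
    # Step 1: blur image 1 to keep only its low frequencies.
    low = zero_pad_filter(image1, kernel)
    # Step 2: subtract a blurred image 2 from itself to keep its high frequencies.
    blurred2 = zero_pad_filter(image2, kernel)
    high = [[p - b for p, b in zip(row, brow)]
            for row, brow in zip(image2, blurred2)]
    # Step 3: add the two filtered images.
    return [[l + h for l, h in zip(lrow, hrow)]
            for lrow, hrow in zip(low, high)]
```

A quick sanity check on this sketch: passing the same image as both inputs gives back that image up to rounding, since the blurred part that is added in step 1 is exactly what step 2 removes. The high-frequency component contains negative values, so for display the result usually needs to be clipped or rescaled to the valid pixel range.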

Results

I will use one of the given test cases as an example of the entire procedure.

The original cat picture.

The original dog picture.

The cat has been filtered so only the high frequencies remain.

The dog is filtered so only low frequencies remain.

High cat + low dog = confused catpug. Cug. Pat.

The cat and the dog make for biological disaster. This image works really well because of the good alignment.

Here I combine Marilyn Monroe with Albert Einstein; the high frequencies belong to Marilyn.

The same two images are combined again, this time with Albert Einstein in the high frequencies. A slightly different cutoff frequency was chosen.

A beetle and a Beetle. This one is not as effective due to the poor alignment.

A foray into before-and-after. The scene is the Plaza de la Revolucion in Havana, Cuba. One picture was taken during the May Day celebrations, the other on an ordinary day. The illusion works somewhat: from far away, the street seems empty; from close up, the crowd is visible. Yet because of the inaccurate alignment the illusion is not flawless, and the picture is very noisy up close. The picture would be more effective if both shots had been taken from exactly the same place.

The universally evil Voldemort is conjoined with the good-guy-gone-bad Walter White. Here the bald heads are quite well aligned, but as the original paper mentions, the effects of perceptual grouping are important in a successful hybrid image. Walter White's glasses and facial hair are obtrusive features that make it harder for the viewer to perceive Voldemort.

Conclusion

With my_imfilter up and running, it is not too hard to create hybrid images. Some improvements can be made though. On a technical level, zero-padding could be replaced by reflecting the image, and we could work with not one but two cutoff frequencies (one for the high pass, one for the low) for optimal results. On a less technical level, my experiments with various hybrid images taught me it is necessary to pay attention to alignment and to the role of perceptual grouping, without which the resulting images are not as convincing as they could be.
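As an illustration of the padding improvement, here is a minimal Python sketch of reflective padding. It assumes the symmetric variant, in which the border pixel itself is repeated in the mirror; other variants exist. The helper names `reflect_index` and `pad_reflect` are hypothetical.

```python
def reflect_index(i, n):
    """Map any integer index onto [0, n) by mirroring at the image borders."""
    period = 2 * n
    i %= period  # Python's % gives a non-negative result for a positive period
    return i if i < n else period - 1 - i


def pad_reflect(image, pad_h, pad_w):
    """Return a copy of a 2-D image padded with mirrored pixels instead of zeroes."""
    h, w = len(image), len(image[0])
    return [[image[reflect_index(r, h)][reflect_index(c, w)]
             for c in range(-pad_w, w + pad_w)]
            for r in range(-pad_h, h + pad_h)]


# A single row padded by two pixels on each side.
row = pad_reflect([[1, 2, 3]], 0, 2)
```

Here `row` is `[[2, 1, 1, 2, 3, 3, 2]]`. Because the padding carries values similar to the nearby image content, a blur near the border averages over plausible pixels and not over zeroes, which should avoid the dark edges.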