Project 1: Hybrid Images
Nathan Malkin, September 2011
CS 143: Computer Vision
This project as a Twitter post
In this project, we create hybrid images: images that look different depending on how far away from them you are. #bieber
Introduction and explanation
The effect
Look closely at the image above. What do you see? Why, it's Justin Bieber -- the outline of his face is clearly visible. Now move back from the screen. You can no longer make out the lines that defined Mr Bieber. In their place, only the abstract shapes are visible. Who is it? Why, it's a baby!
What's happening here?
This illusion exploits the "multiscale perceptual mechanisms of human vision". When viewing an image from close up, we are able to identify the high-frequency parts of it (those with prominent and well-defined edges), and they dominate our perception. However, when viewing an image from a distance (or for very short bursts of time -- on the order of 30 ms), the high frequencies of the image cannot be perceived. We therefore perceive the image on the basis of low frequencies alone.
Hybrid images use this by incorporating the high frequencies from one image and the low frequencies of another. Then, when looking at the image from up close, the high frequencies dominate, and you see the first image. But when you move away, only the low frequencies are visible -- and they form the second image.
Generating hybrid images
As explained above, hybrid images are obtained by combining the high frequency values of one image with the low frequency values from another. It follows, then, that the first step in creating hybrid images is separating the frequencies in each of the images. This is done by creating Gaussian and Laplacian pyramids.
Building image pyramids
The Gaussian and Laplacian pyramids can be built by following these simple steps:
- Take your source image and apply a Gaussian filter to it.
- Subtract the blurred image from the original. The result is the next level of the Laplacian pyramid.
- Take the blurred image and scale it down (in our case, by a factor of 2). The result is the next level of the Gaussian pyramid.
- Repeat as necessary, with the current level of the Gaussian pyramid serving as the source image.
Gaussian pyramid
Laplacian pyramid
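The steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the code used for this project: the function names (gaussian_kernel, blur, build_pyramids), the default sigma of 1.0, the edge padding, and the choice of keeping every second pixel when scaling down are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian, truncated at about 3 sigma and normalized to sum to 1.
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def blur(image, sigma=1.0):
    # Separable Gaussian blur of a 2-D array: filter the rows, then the columns.
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def build_pyramids(image, levels, sigma=1.0):
    # Level 0 of the Gaussian pyramid is the source image itself.
    gaussian, laplacian = [image], []
    for _ in range(levels):
        current = gaussian[-1]
        blurred = blur(current, sigma)        # step 1: Gaussian filter
        laplacian.append(current - blurred)   # step 2: original minus blurred
        gaussian.append(blurred[::2, ::2])    # step 3: scale down by a factor of 2
    return gaussian, laplacian                # step 4 is the loop itself

image = np.random.default_rng(0).random((32, 32))  # stand-in for a grayscale photo
gaussian, laplacian = build_pyramids(image, levels=3)
print([g.shape for g in gaussian])  # [(32, 32), (16, 16), (8, 8), (4, 4)]
```

Each Laplacian level has the same size as the Gaussian level it was computed from, so the Laplacian pyramid has one level fewer than the Gaussian pyramid.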
Why did I just do that?
Applying a Gaussian filter to an image achieves a blurring or smoothing effect. The effect also corresponds to a reduction of the frequencies in the image, since high frequencies occur in regions of contrast, and the blurring reduces that contrast. The successive application of the Gaussian filter therefore results in an image that is dominated by lower frequencies (and is progressively blurrier).
In contrast, images in the Laplacian pyramid were created by subtracting the Gaussian-blurred images from the original (not blurred or less-blurred) versions. This means that they encode the high(er) frequencies that were lost due to blurring.
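The frequency argument can be checked on a one-dimensional signal. The sketch below blurs a slowly varying sine wave and a rapidly varying one with the same Gaussian kernel and compares their amplitudes afterwards. The periods (64 and 4 samples), the sigma of 2.0, and the helper names are arbitrary choices for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian, truncated at about 3 sigma and normalized to sum to 1.
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def amplitude(signal):
    # Peak magnitude, ignoring the borders where the filter runs off the signal.
    return np.abs(signal[32:-32]).max()

x = np.arange(256)
slow = np.sin(2 * np.pi * x / 64)   # low frequency: one cycle per 64 samples
fast = np.sin(2 * np.pi * x / 4)    # high frequency: one cycle per 4 samples
kernel = gaussian_kernel(2.0)

blurred_slow = np.convolve(slow, kernel, mode="same")
blurred_fast = np.convolve(fast, kernel, mode="same")
detail_fast = fast - blurred_fast   # what a Laplacian level would keep

print(amplitude(blurred_slow))  # close to 1: the low frequency survives the blur
print(amplitude(blurred_fast))  # close to 0: the high frequency is removed
print(amplitude(detail_fast))   # close to 1: it ends up in the difference instead
```

The slow wave passes through the blur almost unchanged, while the fast wave is nearly wiped out and reappears in the difference signal, which is the role the Laplacian levels play for images.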
Reconstructing the image
An important observation is that the Laplacian pyramid can be used to reconstruct the original image from any level of the Gaussian pyramid.
Suppose we had the final image in the Gaussian pyramid -- the smallest and most blurry version of the original image. To get the previous level of the Gaussian pyramid, we could upscale it back to the size of that level. This would produce (approximately) the Gaussian-filtered version of the previous level. How do we undo the blur?
Luckily, the information we need is encoded in the corresponding level of the Laplacian pyramid. (Recall that we found it by subtracting the filtered version from the unfiltered.) By adding it to the upscaled image, we get the previous pyramid level.
This process can then be repeated until we reach the bottom level of the Gaussian pyramid. But that is just our original image. Voilà! We have reconstructed our source image.
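The reconstruction loop might look like the following sketch. It makes one deliberate change, which is an assumption of this example rather than something stated in the writeup: each Laplacian level is stored as the difference between a Gaussian level and the upscaled next level (a common convention), instead of the difference with the blurred image before downscaling. With that choice, "upscale and add" recovers each level exactly, up to floating-point rounding. The function names and the nearest-neighbor upscaling are hypothetical choices.

```python
import numpy as np

def blur(image, sigma=1.0):
    # Separable Gaussian blur of a 2-D array with edge padding.
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def upscale(image, shape):
    # Nearest-neighbor upscaling by 2, cropped to the target shape (for odd sizes).
    big = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return big[:shape[0], :shape[1]]

def build_pyramids(image, levels, sigma=1.0):
    gaussian, laplacian = [image], []
    for _ in range(levels):
        current = gaussian[-1]
        smaller = blur(current, sigma)[::2, ::2]
        # Assumed variant: store what is lost by blurring, shrinking AND upscaling.
        laplacian.append(current - upscale(smaller, current.shape))
        gaussian.append(smaller)
    return gaussian, laplacian

def reconstruct(top, laplacian):
    # Start from a Gaussian level and add back the detail, coarsest level first.
    result = top
    for detail in reversed(laplacian):
        result = upscale(result, detail.shape) + detail
    return result

image = np.random.default_rng(0).random((33, 20))  # stand-in for a grayscale photo
gaussian, laplacian = build_pyramids(image, levels=3)
restored = reconstruct(gaussian[-1], laplacian)
print(np.allclose(restored, image))  # True
```

Starting from an intermediate level works the same way: reconstruct(gaussian[2], laplacian[:2]) also returns the original image, because only the Laplacian levels below the starting level are needed.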
Creating the hybrid image
And now, the final step: making the hybrid image. To do it, we will combine the low frequencies from one image with the high frequencies of another. We do this by taking the final (topmost) level of the first image's Gaussian pyramid and adding successive levels of its Laplacian pyramid to it.
At some point (this is our cutoff frequency), we switch over to the second image and add the remainder of its Laplacian pyramid levels to the result. This means that it is only contributing to the high frequencies of the final image.
The result (to reiterate one last time) contains the low frequencies from the first image and the high frequencies from the second image. This is our hybrid image.
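As a rough sketch of this procedure, the code below rebuilds an image from the top of the first image's Gaussian pyramid, taking the coarse Laplacian levels from the first image and the fine ones from the second. The names (hybrid, cutoff, low_image, high_image), the default parameters, and the Laplacian variant (difference with the upscaled next level, so that reconstruction is exact) are assumptions of this example, not the project's actual code. Both images are assumed to be grayscale arrays of the same size.

```python
import numpy as np

def blur(image, sigma=1.0):
    # Separable Gaussian blur of a 2-D array with edge padding.
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def upscale(image, shape):
    # Nearest-neighbor upscaling by 2, cropped to the target shape.
    big = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return big[:shape[0], :shape[1]]

def build_pyramids(image, levels, sigma=1.0):
    gaussian, laplacian = [image], []
    for _ in range(levels):
        current = gaussian[-1]
        smaller = blur(current, sigma)[::2, ::2]
        laplacian.append(current - upscale(smaller, current.shape))
        gaussian.append(smaller)
    return gaussian, laplacian

def hybrid(low_image, high_image, levels=4, cutoff=2, sigma=1.0):
    # Levels are indexed from 0 (finest) to levels - 1 (coarsest).
    gauss_low, lap_low = build_pyramids(low_image, levels, sigma)
    _, lap_high = build_pyramids(high_image, levels, sigma)
    result = gauss_low[-1]                      # top of the first image's pyramid
    for k in reversed(range(levels)):
        # Coarse detail (k >= cutoff) from the first image, fine detail from the second.
        source = lap_low if k >= cutoff else lap_high
        result = upscale(result, source[k].shape) + source[k]
    return result

rng = np.random.default_rng(0)
far = rng.random((64, 64))    # stand-in for the image seen from a distance
near = rng.random((64, 64))   # stand-in for the image seen up close
mixed = hybrid(far, near, levels=4, cutoff=2)
print(mixed.shape)  # (64, 64)
```

Moving cutoff toward 0 lets the first image contribute more of the detail, and moving it toward levels hands more of it to the second image. With cutoff=0 the result is simply the first image reconstructed.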
Adding color
All the operations above can be easily extended into the third dimension, color. So, for example, the Gaussian filter is applied separately to the red, green, and blue components of the image; each is downscaled and upscaled separately; and each pyramid level actually consists of three images: the red, green, and blue intensity values as separate two-dimensional images.
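Applying the filter channel by channel could be sketched as follows, assuming the image is stored as a height x width x 3 NumPy array. The function names (blur, blur_rgb) and the layout of the array are assumptions for illustration.

```python
import numpy as np

def blur(image, sigma=1.0):
    # Separable Gaussian blur of a single 2-D channel with edge padding.
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def blur_rgb(image, sigma=1.0):
    # Treat red, green, and blue as three independent grayscale images.
    channels = [blur(image[..., c], sigma) for c in range(image.shape[-1])]
    return np.stack(channels, axis=-1)

color_image = np.random.default_rng(0).random((16, 16, 3))  # stand-in for a photo
blurred = blur_rgb(color_image)
print(blurred.shape)  # (16, 16, 3)
```

The same wrapping works for the other per-level operations: scaling down, scaling up, and subtracting are each done on the three channels independently.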
Results
Here are some images that have been generated using the method above.
John Quincy Adams, in the shadow of his father
A cat and a dog
The same image, in grayscale.
A donkey and a horse
Albert Einstein and Marilyn Monroe
The same pair of images, but now Marilyn Monroe is in the foreground.
A hidden message
Catman (Derek and Nutmeg)
The same image, but with color
Natalie Portman, as the Black Swan
Rhino/Car
Tank/Rhino
Tiger/Cat
Tiger/Cat (grayscale)
Who came up with this?
As early as 1938, Salvador Dali used an effect reminiscent of hybrid images in one of his paintings. The technique presented here is based on a SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns. (The paper also contains references to previous work on this topic.)