CSCI1430 Project 1: Hybrid Images

Soravit Beer Changpinyo (schangpi)

1. Introduction

A hybrid image is a combination of the low spatial frequencies of one picture with the high spatial frequencies of another. Its interpretation changes as a function of viewing distance: the high spatial frequencies dominate when the image is viewed from a short distance, while stepping away from the image switches the interpretation to the low-frequency content. The goal of this project is to create a hybrid image from any two source images.

2. Algorithm

2.1 Grayscale Images

First, Gaussian and Laplacian pyramids are generated for both source images. A Gaussian pyramid is normally generated by repeating the following process: apply a Gaussian filter to smooth the image, then downsample the result. In this assignment, so that the images in the pyramids can be combined easily, I kept every level at the original resolution: instead of downsampling (reducing the size of the image) at each iteration, I applied a Gaussian filter with a greater radius than that of the previous level. The radius grows exponentially by a factor of 2; i.e., the radii for the first iterations are 3, 6, 12, 24, ...
Each level of the Laplacian pyramid is simply the difference between two adjacent levels of the Gaussian pyramid, L_i = G_i - G_(i+1), where G_i is the filtered image at the i-th iteration.
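
As a rough sketch of this step, the following Python code builds both pyramids at full resolution. NumPy/SciPy, the function and parameter names, treating the filter "radius" as SciPy's sigma, and starting the pyramid at the unfiltered image are all my illustrative assumptions, not the report's actual implementation.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def build_pyramids(img, num_levels=8, base_radius=3.0):
        """Full-resolution Gaussian and Laplacian pyramids.

        Instead of downsampling, every level keeps the original size and the
        Gaussian "radius" doubles at each iteration (3, 6, 12, 24, ...).
        The radius is passed to SciPy as sigma, which is an assumption.
        """
        img = np.asarray(img, dtype=np.float64)
        gaussian = [img]                                  # G_0: the original image
        for i in range(num_levels):
            gaussian.append(gaussian_filter(img, sigma=base_radius * 2 ** i))
        # L_i = G_i - G_(i+1)
        laplacian = [gaussian[i] - gaussian[i + 1] for i in range(num_levels)]
        return gaussian, laplacian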

Second, the first N1 levels of the Laplacian pyramid of one image (the high-frequency source) are combined with the last N2 levels of the Laplacian pyramid of the other image (the low-frequency source), together with the last level of that image's Gaussian pyramid. N1 and N2 are the cutoff numbers; they are tuned manually for each pair of images to get the best result.
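
Under the same assumptions as the sketch above (full-resolution pyramids stored as lists, hypothetical function and argument names), the combination step amounts to summing the selected levels:

    def combine_pyramids(lap_high, lap_low, gauss_low, n1, n2):
        """Hybrid image from two full-resolution pyramids.

        lap_high  : Laplacian pyramid of the high-frequency source
        lap_low   : Laplacian pyramid of the low-frequency source
        gauss_low : Gaussian pyramid of the low-frequency source
        n1, n2    : manually tuned cutoff numbers
        """
        high_part = sum(lap_high[:n1])       # first N1 Laplacian levels
        low_part = sum(lap_low[-n2:])        # last N2 Laplacian levels
        return high_part + low_part + gauss_low[-1]  # plus the most blurred Gaussian level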

The number of pyramid levels is fixed at 8 (determined by how blurred the image is at the last level of the Gaussian pyramid). Examples of pyramids are presented in the next section.

2.2 Colored Images

The algorithm remains the same as for grayscale images, except that the images have one more dimension (for the color channels) and the cutoff numbers are adjusted accordingly.
To combine a grayscale image with a colored image, I replicate the grayscale image into three identical channels, one for each color component.
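
A minimal sketch of this replication, assuming NumPy arrays with the color channel as the last axis (the function name is hypothetical):

    import numpy as np

    def gray_to_rgb(gray):
        """Replicate a single-channel image into three identical channels so
        it can be combined with a color image of shape (H, W, 3)."""
        return np.stack([gray, gray, gray], axis=-1)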

As stated in the handout, the images in the Laplacian pyramid are visualized by adding 0.5, so that light gray values correspond to positive differences and dark gray values to negative ones.
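
For completeness, the visualization step might look like the following, assuming intensities in the [0, 1] range (the clipping is my addition):

    import numpy as np

    def visualize_laplacian(level):
        """Shift a Laplacian level by 0.5 for display: positive differences
        appear lighter than mid-gray, negative differences darker."""
        return np.clip(level + 0.5, 0.0, 1.0)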

3. Results

3.1 Gaussian and Laplacian Pyramids

3.1.1 Grayscale Images

3.1.2 Colored Images

3.2 Hybrid Images from Sample Images

(Figure: original image 1, hybrid image, original image 2.)

3.3 More Hybrid Images

(Figures: original image 1, hybrid image, original image 2, for each of the following pairs.)
Michelle and Barack Obama
Opened and Closed Eyes
Sad and Happy Faces
Durian and Ukulele
Shoes and Butterfly
Changes of Shanghai over time

3.4 Colored Hybrid Images

(Figure panels: no colored components, colored high-frequency component, colored both components, colored low-frequency component.)

4. Discussion

Creating successful hybrid images is not easy in practice. However, based on the results above, I believe there is a set of rules and strategies one can follow to produce good hybrid images.

First, I would like to discuss the features of the source images. I found that good high-pass filtered images should have high-frequency features such as whiskers, fur, hair, or thorns. Moreover, such features should be important factors in distinguishing the high-frequency image from the low-frequency one. For example, we can distinguish between a cat and a dog by the cat's whiskers; between a durian and a ukulele by the durian's thorns; between Derek and his cat by the cat's fur; and between Michelle and Barack Obama by Michelle's long hair. Since these high-frequency features disappear at a far distance, we would lose information about the high-pass filtered image if we used its low-frequency components instead. Examples are shown below where one might perceive a durian as a watermelon, or a cat as a bear.

Good low-pass filtered images, on the other hand, should "lack a precise definition of object shapes and region boundaries" (Oliva et al.). In other words, one should be able to tell what the object is from a far distance just from its approximate shape and regions. Examples of such objects used in the above experiments are a car, a dog with a dark nose, a ukulele, Einstein with a tie and mustache, and Derek with smooth skin and dark hair.

Moreover, it is also important that the source images have similar shapes and share many edges so that they blend well. This is probably why alignment is necessary.

Second, I would like to discuss the effect of color on hybrid images. According to the above results, using color only for the high-frequency component seems to work best. Color provides additional information about the high-frequency component at a short distance, but rarely does so at a long distance (for instance, we can tell that the yellow face is crying at a short distance because of the light blue tears). Using color for the low-frequency component seems to make that component more noticeable from a short distance, which is undesirable.

Lastly, cutoff frequencies also play an important role; the evidence here is simply that different cutoff numbers had to be chosen for each pair of images in the experiments above. Other parameters (the number of pyramid levels, the filter sizes) could also be optimized further based on the source images.