Hybrid Images
Charles Yeh, 9/26/2011
A hybrid image changes with viewing distance: it mixes the high frequencies of one image with the low frequencies of another. The first step is to build a Gaussian pyramid and a Laplacian pyramid for each of the two images.
Each pyramid is built iteratively. The image (1) is blurred with a Gaussian filter, then subsampled so that only one pixel in four is kept. The difference between the unblurred image and the blurred one is also calculated. This difference (2) is stored in the Laplacian pyramid.
Image (1) - blurred image = difference (2)
Subsampling yields a new image a quarter the size of the old one, which is stored in the Gaussian pyramid. The same procedure is then applied to this new image, and so on.
The result is a list of differences, the Laplacian pyramid, and a list of blurred images, the Gaussian pyramid.
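The construction above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used for the project: the function names are hypothetical, "one pixel in four" is assumed to mean keeping every other row and column, image edges are assumed to be replicated before filtering, and the original image is assumed to count as the first Gaussian level.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Return a normalized 1-D Gaussian kernel with `size` taps."""
    offsets = np.arange(size) - (size - 1) / 2.0
    weights = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()


def blur(image, size, sigma):
    """Separable Gaussian blur of a 2-D float array (edges replicated)."""
    kernel = gaussian_kernel(size, sigma)
    before = (size - 1) // 2
    after = size - 1 - before
    padded = np.pad(image, ((before, after), (before, after)), mode="edge")
    # Filter along rows, then columns; "valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def build_pyramids(image, levels, size, sigma):
    """Return (gaussian, laplacian) lists, ordered largest to smallest.

    `gaussian` has `levels` entries (assumption: the original image is
    level 0); `laplacian` has `levels - 1` entries.
    """
    current = np.asarray(image, dtype=float)
    gaussian, laplacian = [current], []
    for _ in range(levels - 1):
        blurred = blur(current, size, sigma)
        laplacian.append(current - blurred)  # the difference (2)
        current = blurred[::2, ::2]          # keep one pixel in four
        gaussian.append(current)
    return gaussian, laplacian


# Usage on a synthetic 64x64 image (stand-in for a real photo).
image = np.random.default_rng(0).random((64, 64))
gaussian, laplacian = build_pyramids(image, levels=4, size=5, sigma=2.0)
print([g.shape for g in gaussian])  # [(64, 64), (32, 32), (16, 16), (8, 8)]
```

Each Laplacian level has the same size as the Gaussian level it was computed from, so the two lists line up index by index.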
Original Image:
Gaussian Pyramid (they actually get smaller but are resized here for convenience):
Laplacian Pyramid:
Next, the smallest image in the Gaussian pyramid of the first image is taken as the starting point for the hybrid image. Levels from the Laplacian pyramid of the first image are added to it, smallest first, up to the cutoff; the remaining, larger levels come from the Laplacian pyramid of the second image (all images here are resized for viewing convenience).
Using a cutoff of 5:
Starting image (smallest level of the Gaussian pyramid of Wolf Blitzer)
+ the smallest [cutoff = 5] levels of the Laplacian pyramid of Wolf Blitzer
+ the biggest [N - cutoff - 1 = 2] levels of the Laplacian pyramid of the wild wolf
= resulting image with Wolf Blitzer's low frequencies and the wild wolf's high frequencies.
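The combination step can be sketched as follows. It is an illustration under stated assumptions, not the project's implementation: `upsample` and `hybrid_from_pyramids` are hypothetical names, pyramids are assumed to be lists ordered from largest to smallest level, and nearest-neighbour enlargement stands in for whatever resizing was actually used between levels.

```python
import numpy as np


def upsample(image, shape):
    """Nearest-neighbour 2x enlargement, cropped to `shape` (assumption)."""
    bigger = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return bigger[:shape[0], :shape[1]]


def hybrid_from_pyramids(base, low_laplacian, high_laplacian, cutoff):
    """Rebuild an image from `base` plus one Laplacian level per scale.

    The `cutoff` smallest levels come from `low_laplacian`; the remaining
    larger levels come from `high_laplacian`.
    """
    count = len(low_laplacian)
    result = base
    for index in range(count - 1, -1, -1):        # smallest level first
        from_low = (count - 1 - index) < cutoff
        level = low_laplacian[index] if from_low else high_laplacian[index]
        result = upsample(result, level.shape) + level
    return result


# Toy pyramids with constant levels, so the choice of source is visible:
# low-frequency levels contribute 1 each, high-frequency levels 10 each.
shapes = [(16, 16), (8, 8), (4, 4)]
low = [np.full(s, 1.0) for s in shapes]
high = [np.full(s, 10.0) for s in shapes]
base = np.zeros((2, 2))

# N = 4 levels, cutoff = 2: two low levels and N - cutoff - 1 = 1 high level.
result = hybrid_from_pyramids(base, low, high, cutoff=2)
print(result.shape, result[0, 0])
```

With N = 8 and a cutoff of 5, as in the Wolf Blitzer example, the same loop would take five levels from the first pyramid and two from the second.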
Experimentation
The size of the Gaussian filter was an adjustable parameter. A smaller size gave less well-defined high frequencies, most likely because the filter did not blur the image enough for the subsampled image to represent the original accurately. With more blurring, the differences define both low and high frequencies better, as shown in the left image below, where one level of Firefox's high frequencies is better defined than three levels of Firefox's high frequencies in the right image. The same goes for the low frequencies.
High-frequency Firefox and low-frequency Chrome:
Smaller filter sizes also made the low frequencies show more strongly, but this could be compensated for by increasing the cutoff. However, a difference remains no matter which cutoff is chosen: the image made with the smaller filter size will always be less defined in either its low or its high frequencies.
Filter size of 3 and filter standard deviation of 5.
Cutoff at 3 of 8 levels (3 levels of the Chrome and 4 levels of the Firefox Laplacian pyramids).
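One possible reading of the filter-size effect is that a Laplacian level is the image minus its blurred copy, so a filter that barely blurs leaves very little in the difference. The 1-D sketch below illustrates that reading only; the step-edge signal, the helper names, and the edge handling are all assumptions made for the example.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized 1-D Gaussian weights for a filter `size` taps wide."""
    centre = (size - 1) / 2.0
    weights = [math.exp(-((i - centre) ** 2) / (2.0 * sigma ** 2))
               for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, kernel):
    """Blur a 1-D list, replicating the first and last samples."""
    half = (len(kernel) - 1) // 2
    last = len(signal) - 1
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - half, 0), last)
            acc += weight * signal[j]
        blurred.append(acc)
    return blurred


def detail_energy(signal, size, sigma):
    """Sum of |signal - blurred|: how much lands in the Laplacian level."""
    blurred = blur(signal, gaussian_kernel(size, sigma))
    return sum(abs(s - b) for s, b in zip(signal, blurred))


# A single step edge, 64 samples long.
step = [0.0] * 32 + [1.0] * 32
small = detail_energy(step, size=3, sigma=5.0)
large = detail_energy(step, size=20, sigma=5.0)
print(f"size 3:  {small:.3f}")
print(f"size 20: {large:.3f}")
```

The wider filter moves more of the edge into the difference signal, which matches the observation that the high frequencies look better defined with a larger filter size.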
Swapping the images, so that Firefox's low frequencies and Chrome's high frequencies are shown, yielded an interesting observation: because the Chrome logo was lower in frequency than the Firefox logo to begin with, the hybrid seemed more natural when Chrome's low frequencies and Firefox's high frequencies were used.
Filter size of 3 and filter standard deviation of 5; cutoff at 5 of 8 levels.
The filter size also needed to be smaller for smaller images, which is expected, since the blurring should be proportional to the image size for it to "accurately" blur details for subsampling. For smaller images such as those of Wolf Blitzer and the wolf, a smaller size of just 4 pixels was used.
Wolf Blitzer low frequencies and wild wolf high frequencies. 8 pyramid levels; cutoff of 5; filter size of 4; sigma of 5.
Another adjustable parameter was the filter's standard deviation, sigma. For small filter sizes, changing sigma made essentially no difference. With a larger filter the effect becomes visible: the image on the left has a sigma of 5, while the image on the right has a sigma of 20. The difference is most apparent in the tie, whose higher frequencies are better defined with the higher sigma; the tie is more visible in the right miniature image than in the left one.
Cheney low frequencies and Bush high frequencies. 8 levels; cutoff of 10; filter size of 20; sigma of 20.
Cheney low frequencies and Bush high frequencies. 12 levels; cutoff of 10; filter size of 20; sigma of 20.
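The weak effect of sigma at small filter sizes can be checked directly on the kernel weights. The sketch below assumes that "filter size" means the number of taps in the kernel and uses a hypothetical 1-D kernel helper; a 2-D filter built from the same weights would behave the same way.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized 1-D Gaussian weights for a filter `size` taps wide."""
    centre = (size - 1) / 2.0
    weights = [math.exp(-((i - centre) ** 2) / (2.0 * sigma ** 2))
               for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def largest_weight_change(size, sigma_a, sigma_b):
    """Largest per-tap difference between two kernels of the same size."""
    a = gaussian_kernel(size, sigma_a)
    b = gaussian_kernel(size, sigma_b)
    return max(abs(x - y) for x, y in zip(a, b))


# Compare sigma = 5 against sigma = 20 for a small and a large filter.
small_filter = largest_weight_change(3, 5.0, 20.0)
large_filter = largest_weight_change(20, 5.0, 20.0)
print(f"size 3:  {small_filter:.4f}")
print(f"size 20: {large_filter:.4f}")
```

When the kernel is only a few taps wide and sigma is larger than the kernel, the Gaussian is almost flat across it, so both sigmas produce nearly the same box-like blur. Only a wider kernel covers enough of the curve for sigma to matter.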
The number of pyramid levels to generate was another adjustable parameter. It was generally left alone, since decreasing the number of levels has the same effect as increasing the cutoff, and increasing it simply gives more room for lower cutoffs of the low frequencies. Changing the number of levels would only be worthwhile with a cutoff of one on an image large enough to be subdivided further. Usually, though, increasing the number of pyramid levels also required increasing the cutoff to get a good hybrid image.
Cheney low frequencies and Bush high frequencies. 8 levels; cutoff of 5; filter size of 20; sigma of 20.
Cheney low frequencies and Bush high frequencies. 12 levels; cutoff of 9; filter size of 20; sigma of 20.
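Whether an image is "large enough to be split up further" follows from how quickly the levels shrink. The sketch below lists the level sizes under the assumption that each level keeps every other row and column; the function names and the 256x256 example size are hypothetical.

```python
def level_sizes(height, width, levels):
    """Size of each Gaussian level when every other row and column is kept."""
    sizes = [(height, width)]
    for _ in range(levels - 1):
        # Keeping every other sample of n leaves ceil(n / 2) samples.
        height, width = (height + 1) // 2, (width + 1) // 2
        sizes.append((height, width))
    return sizes


def max_useful_levels(height, width):
    """Number of levels until the shorter side has shrunk to one pixel."""
    levels = 1
    while min(height, width) > 1:
        height, width = (height + 1) // 2, (width + 1) // 2
        levels += 1
    return levels


print(level_sizes(256, 256, 8))
print(max_useful_levels(256, 256))
```

Beyond the level at which the shorter side reaches one pixel, extra levels no longer add distinct scales, so they only shift where the cutoff has to sit.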
Arriving at each image took a lot of experimentation to find a good filter size, sigma, cutoff, and number of pyramid levels. I found that the smaller the filter size, the more pyramid levels could be added for finer tuning of the cutoff. However, this made the hybrid image more ambiguous and less easily recognizable as either the low-frequency or the high-frequency image.