Nabeel Gillani - Hybrid Images

Algorithms and Decisions

Pyramids

The first decision I made for this step was the half-width of the Gaussian filter. I set sigma to 1/3 of the half-width, following the lecture slides. I started with a fairly large half-width of 9, but found it difficult to recover low-frequency information in the hybrid image without including most of the low-frequency pyramid levels (often all but one). So I shrank the half-width to approximately 4 to achieve more modest blurring; I change this value manually for each image set.
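To make the sigma rule concrete, here is a minimal Python sketch of a 1-D Gaussian kernel built from a half-width. It is not the MATLAB code from this project; the function name gaussian_kernel_1d is hypothetical, and a 2-D filter would be the outer product of this kernel with itself.

```python
import math

def gaussian_kernel_1d(half_width):
    """Return a normalized 1-D Gaussian kernel with 2*half_width + 1 taps."""
    sigma = half_width / 3.0  # rule from the write-up: sigma = half-width / 3
    taps = [math.exp(-(x * x) / (2.0 * sigma * sigma))
            for x in range(-half_width, half_width + 1)]
    total = sum(taps)
    # Normalize so the taps sum to 1 and filtering preserves mean brightness.
    return [t / total for t in taps]

kernel = gaussian_kernel_1d(4)  # half-width 4 gives sigma = 4/3 and 9 taps
```

With a half-width of 9 the same rule gives sigma = 3, which is the heavier blur I started with.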

I used a cell array to store each pyramid. The procedure is as follows. The base of the Gaussian pyramid is the image itself. The base of the Laplacian pyramid is the image minus the image filtered with a Gaussian whose size and sigma were determined experimentally. Each successive level of the Gaussian pyramid is then just a subsampled version of the previous Gaussian level, and each successive Laplacian level is that subsampled Gaussian level minus the same level filtered with another Gaussian. (Note: I don't filter the first term in the subtraction, since imresize basically takes care of this when it subsamples.) Hence, I use a difference-of-Gaussians approach to approximate the Laplacian filter. pyramids.m shows the code for this procedure.
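The construction above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a port of pyramids.m: images are lists of rows of floats, borders are handled by clamping, and stride-2 decimation of the blurred copy stands in for MATLAB's imresize (which does its own smoothing). All function names are hypothetical.

```python
import math

def gaussian_kernel_1d(half_width):
    sigma = half_width / 3.0  # sigma = half-width / 3, as in the write-up
    taps = [math.exp(-(x * x) / (2.0 * sigma * sigma))
            for x in range(-half_width, half_width + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def blur_rows(img, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    hw = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([sum(k * row[min(max(x + d, 0), n - 1)]
                        for d, k in zip(range(-hw, hw + 1), kernel))
                    for x in range(n)])
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def blur(img, half_width):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel_1d(half_width)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def subsample(img):
    """Keep every second row and column (assumed stand-in for imresize)."""
    return [row[::2] for row in img[::2]]

def build_pyramids(img, levels, half_width):
    """Return (gaussian, laplacian) lists, finest level first."""
    gaussian, laplacian = [], []
    current = img
    for _ in range(levels):
        blurred = blur(current, half_width)
        gaussian.append(current)
        # Laplacian level = Gaussian level minus its blurred copy.
        laplacian.append([[a - b for a, b in zip(row_a, row_b)]
                          for row_a, row_b in zip(current, blurred)])
        current = subsample(blurred)
    return gaussian, laplacian

image = [[float((x + y) % 7) for x in range(16)] for y in range(16)]
gauss_pyr, lap_pyr = build_pyramids(image, levels=3, half_width=2)
```

Each Laplacian level here has the same size as its Gaussian level, so a level and its blurred copy can be subtracted directly.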

Hybrid Image

hybridImage.m shows the code for this procedure. The purpose of this routine is to "undo" the pyramid-building phase, i.e. select some number of levels from the top of the low-pass pyramid and others from the bottom of the high-pass pyramid, and sum them to produce the hybrid image. Construction starts by taking the peak of the Gaussian pyramid for the low-passed image (i.e., the level with information about the lowest frequencies) and upsampling it. I wrote a helper function, fixImageSize, to account for the fact that imresize may make the image a little bigger than the original size due to rounding. One thing I decided to do after upsampling was to blur. Even though imresize includes some sort of blurring/averaging during upsampling, I found that imresize alone led to images that were either speckly or distorted due to too much high- or low-frequency information (hence, images appearing very white or very black).
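The size mismatch that fixImageSize handles can be illustrated with a small Python sketch. The body of fix_image_size below is my guess at the idea (crop the extra trailing rows and columns), not the actual MATLAB helper, and upsample2 uses pixel replication as an assumed stand-in for imresize.

```python
def upsample2(img):
    """Double width and height by pixel replication (stand-in for imresize)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fix_image_size(img, height, width):
    """Crop img to height x width, dropping extra trailing rows and columns."""
    return [row[:width] for row in img[:height]]

small = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]          # e.g. a 5x5 level subsampled to 3x3
big = upsample2(small)             # 6x6: one row and one column too many
fixed = fix_image_size(big, 5, 5)  # back to the 5x5 size of the finer level
```

An odd-sized level is the typical trigger: halving 5 rounds up to 3, and doubling 3 gives 6 rather than 5.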

To continue building the image, I add the filtered, upsampled image to the corresponding Laplacian level of the low-pass image, and repeat until the cutoff1 parameter is reached. For my implementation, I simply let cutoff2 = N - cutoff1, which appeared to yield decent results in most cases. After summing the low-pass levels, I added the bottom N - cutoff1 levels of the high-pass image's Laplacian pyramid to arrive at the final hybrid image.
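One way to read the cutoff rule is as a split of the N level indices between the two pyramids. The sketch below assumes level 0 is the base (finest) level and level N - 1 is the peak; select_levels is a hypothetical name, and this indexing is my interpretation rather than something taken from hybridImage.m.

```python
def select_levels(n_levels, cutoff1):
    """Split level indices between the low-pass and high-pass pyramids.

    Level 0 is the base (finest) level; level n_levels - 1 is the peak.
    """
    if not 0 <= cutoff1 <= n_levels:
        raise ValueError("cutoff1 must be between 0 and n_levels")
    cutoff2 = n_levels - cutoff1
    high_levels = list(range(0, cutoff2))        # bottom cutoff2 levels
    low_levels = list(range(cutoff2, n_levels))  # top cutoff1 levels
    return low_levels, high_levels

low_levels, high_levels = select_levels(10, 8)   # N=10, cutoff1=8, cutoff2=2
```

Under this reading, tying cutoff2 to N - cutoff1 means every level index is used exactly once, by one image or the other.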

Results

I depicted three different types of transitions in the hybrid images below: 1) Change of expression, 2) Morph between objects, 3) Change over time.

Change of expression

Surprised Dwyane Wade (high freq) becomes a smiling LeBron (low freq).


Low Frequency Image	High Frequency Image	Hybrid Image: N=10, cutoff1=8, cutoff2=2, sigma=1

Morph between objects

Cute bunny (high freq) becomes a cute dog (low freq).


Low Frequency Image	High Frequency Image	Hybrid Image: N=8, cutoff1=5, cutoff2=3, sigma=4/3

Change over time

A basketball rotates over time.


Low Frequency Image	High Frequency Image	Hybrid Image: N=12, cutoff1=8, cutoff2=4, sigma=1

Here are some additional results from the images provided in the data gallery.


Low Frequency Image	High Frequency Image	Hybrid Image: N=10, cutoff1=8, cutoff2=2, sigma=4/3


Low Frequency Image	High Frequency Image	Hybrid Image: N=8, cutoff1=4, cutoff2=4, sigma=5/3

Summary

Overall, we can see that tweaking the filter support size and sigma (half-width = 3*sigma in every case), the number of pyramid levels, and the cutoff frequency all produced hybrid images of varying quality. The parameters mentioned in the results above were all determined via experimentation. I found that decreasing sigma would often lead to a better hybrid image, since a smaller sigma means a less dramatic blur between successive pyramid levels, and hence more of a chance to capture some of the details found in the low-passed image. Decreasing sigma and increasing the cutoff frequency for a particular low-pass image both achieved similar effects (capturing more low-frequency details), giving us two different ways of making sure that the "balance" between the represented high and low frequencies produces the desired dual-image effect.

Looking forward, one way I could improve this implementation is to use an independent cutoff frequency for each pyramid. I decided not to go this route because my results weren't as good when the cutoff frequencies were independent, but this is probably because I wasn't normalizing the sums in my resultant image. With further investigation, I think I could leverage this to produce higher-quality hybrids.
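As a starting point for that investigation, one possible normalization is a linear rescale of the summed image to the range [0, 1]. This is only a sketch of one option, not something I implemented; the function name normalize_to_unit_range and the sample values are made up for illustration.

```python
def normalize_to_unit_range(img):
    """Linearly rescale pixels so the minimum maps to 0 and the maximum to 1."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: avoid dividing by zero
        return [[0.0 for _ in row] for row in img]
    scale = 1.0 / (hi - lo)
    return [[(v - lo) * scale for v in row] for row in img]

summed = [[-0.4, 0.2],
          [0.9, 1.6]]  # sums of pyramid levels can fall outside [0, 1]
normalized = normalize_to_unit_range(summed)
```

Whether a min-max rescale or some other scheme (for example, dividing by the number of summed levels) works better would have to be checked on the actual image pairs.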

Shoutouts

Shoutout to vmoussav for helping me understand pyramid synthesis and image reconstruction better.
