Steve Gomez (steveg), March 26, 2010

In this project, we find smooth morphs between images with corresponding features.
Morphing color is trivial with alpha blending, but we also want to align corresponding features between images. We'll look at computing a smooth per-pixel warp from a sparse set of pixel correspondences, and apply it to create shape and color morphs. We compute the average face from a collection of face data, show how to exaggerate an individual's features, and compare face morphing to morphing of similar outdoor scenes.

Mathematics of morphing

The hard part of this task is interpreting differences in feature location (e.g. "where is the left eyeball in the image?") between the two images being morphed. These spatial differences matter when we look up the proper reference colors in the original images while synthesizing a new pixel in a morph.

Here, we have some ratio of warp to apply, and the first step (given some corresponding features, which are manually selected and saved) is to compute the weighted average location of those features in the morph. This is straightforward and works the same as the color blending we'll eventually do: for ratio x and points p, q, the average point is x*p + (1-x)*q. We then find the offset from this morphed location back to the original point in each image. This gives us a backwards map to where each pixel (and its color) lives in the originals.
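As a concrete sketch in Matlab (assuming P and Q are N-by-2 arrays of the hand-picked (x, y) feature points in the two images, and x is the warp ratio):

    % Weighted-average feature locations and backward offsets.
    M  = x .* P + (1 - x) .* Q;   % morphed feature locations
    dP = P - M;                   % offsets from the morph back into image 1
    dQ = Q - M;                   % offsets from the morph back into image 2

These sparse offsets are the 'knowns' for the diffusion step described next.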

We're almost out of the woods, but we've only morphed the sparse set of pixels given as correspondences, and we want to morph the entire image. We need the backwards mapping for all pixels, but for most of them it is unknown.
Fortunately, if we assume the morph is smooth in the unknown regions, then this is equivalent to solving a system of Poisson equations, where the knowns are offset values on a 2D grid. This is essentially a Poisson image fill from a few known offset 'colors' onto a blank canvas. For the implementation, I'm reusing my code from the color diffusion experiment in the last project. With the solved morph field, we now have offsets back to the reference images. To find the colors, I use Matlab's meshgrid command and add the grid coordinates to the offsets to get color indices. Then I just grab the reference colors and alpha blend them for each pixel.
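To make this concrete, here is a minimal sketch of the diffusion-and-sampling step. The Laplacian setup below stands in for my reused color diffusion code, and names like diffuseField are illustrative, not the actual functions from my implementation:

    % diffuseField.m: diffuse sparse offsets into a dense field by solving
    % the discrete Laplace equation, with the known offsets as Dirichlet
    % constraints. idx: linear indices of pixels with known offsets;
    % d: their offset values.
    function D = diffuseField(h, w, idx, d)
        n = h * w;
        [rr, cc] = ndgrid(1:h, 1:w);
        rr = rr(:); cc = cc(:);
        right = find(cc < w);                % links to (r, c+1): index + h
        down  = find(rr < h);                % links to (r+1, c): index + 1
        i = [right; down];  j = [right + h; down + 1];
        Adj = sparse([i; j], [j; i], 1, n, n);   % 4-neighbor grid graph
        L = diag(sum(Adj, 2)) - Adj;             % graph Laplacian
        % replace constrained rows with identity rows: D(idx) = d
        L(idx, :) = sparse(1:numel(idx), idx(:).', 1, numel(idx), n);
        b = zeros(n, 1);  b(idx) = d;
        D = reshape(L \ b, h, w);
    end

    % Usage: with dense offset fields solved for both images (one field per
    % coordinate), sample reference colors via meshgrid and alpha blend
    % (shown for one grayscale channel; run per channel for color):
    [X, Y] = meshgrid(1:w, 1:h);
    colorA = interp2(double(A), X + DXa, Y + DYa, 'linear', 0);
    colorB = interp2(double(B), X + DXb, Y + DYb, 'linear', 0);
    morph  = x .* colorA + (1 - x) .* colorB;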

Morphing faces

We morphed the faces of our class using manually selected correspondence features. In my implementation, I used 36 control points covering the eyes, ears, nose, jaw, mouth, and neck. I experimented with controlling the silhouette of the hair and the hairline, but this was difficult because hair varies so much across the faces. All my results are shown here, and I made an animation morphing through the portraits (20 steps between each pair). The warp is linear in the step number, but the color dissolve follows a half-period sine function, so the steep transition in color happens in the middle of the transfer and flattens out at the ends. This looks a little better to me than linearly blending the color.
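For reference, one way to realize these two schedules (the exact easing in my code may differ slightly, but it is a half-period sinusoid like this):

    t = linspace(0, 1, 20);               % 20 steps between a pair
    warp_ratio  = t;                      % warp: linear in the step
    color_ratio = (1 - cos(pi * t)) / 2;  % half-period sine ease: steepest
                                          % at t = 0.5, flat at the endpoints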



The left side of the image uses the control points to morph. On the right, we have a simple cross-dissolve of the colors. Putting these side by side really shows how necessary the feature morphing is for making realistic blends. For instance, here is a freeze-frame where the cross-dissolve looks horrible while the morph succeeds.

Mean face ... grrr

I also computed the mean face from the class.
I averaged all control points across the portraits to get the 'mean control points', then computed the warp from this mean control set to each portrait's points. The colors of the mean face are averaged using the backwards mapping to get each portrait's corresponding color contribution. As expected, the face looks relatively handsome and reflects our mostly male demographic.
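A sketch of the averaging, assuming pts is a K-by-N-by-2 array of the K portraits' control points and warpToPoints is an illustrative stand-in for the warp-and-sample step described above:

    meanPts  = squeeze(mean(pts, 1));     % N-by-2 mean control points
    meanFace = 0;
    for k = 1:K
        % warp portrait k from its own points onto the mean points,
        % then accumulate an equal 1/K color contribution
        warped   = warpToPoints(imgs{k}, squeeze(pts(k, :, :)), meanPts);
        meanFace = meanFace + warped / K;
    end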

Exaggerating features

With the mean face, we can assess which of an individual's facial features differ from the norm. We've also already written the tools to vary the differences between two faces (i.e. face morphing). My approach here is to take the average face and its mean control points, and face morph with an individual's face and points to exaggerate that person's unique features. I can do this by pushing the blend parameter beyond [0,1], extrapolating a new warp in the direction of the individual and making a caricature. Mathematically, this exaggerates how the individual's face shape differs from the mean face.
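In the notation from earlier (x*p + (1-x)*q, taking p as the mean points and q as the individual's), the extrapolated control points for a caricature are just:

    x = -0.75;                                    % 0 = identity morph;
                                                  % negative pushes away
                                                  % from the mean
    carPts = x .* meanPts + (1 - x) .* indivPts;  % extrapolated points

so carPts = indivPts + 0.75*(indivPts - meanPts): the individual's shape differences from the mean, scaled up.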

Below are some experiments with this. First we have the individual's face, then the face morphed at ratio -0.75 (where zero is the identity morph) away from the mean face, with the original color. We see that distinct features of the face (the shape of the upper lip) are exaggerated. In the third photo below, we also extrapolate in the color parameter. The effect is almost like saturating the colors that differ from the mean. One particularly salient color patch is the brightness on the forehead, which is exaggerated because the mean face has some hair hanging over the forehead, making that area darker. Here, the difference in light is emphasized.



Morphing scenes and skylines

One application I experimented with was morphing between photographic scene matches that are "coarsely similar". This comes off the scene completion project, where we tried to fill holes in an image intelligently using large numbers of photographs from the web. My idea was that, if coarsely similar scenes can be morphed into another realistic scene, we have more data points to pump into the completion algorithm when finding a best match.

I have an animation below of several outdoor scenes morphing between one another. The coarsely similar scenes were gathered from top-scoring coarse-level scene matches provided by James Hays from his Scene Completion project. The assumption is that, because these all score well as matches against a common photograph, they likely match well with one another and have similar structure. In this case, I used 7 manually picked control points, and (given that these are all outdoor shots with a horizon) the points are distributed regularly along the x-axis and placed at the horizon in y, as in the sketch below. So we expect the horizons to blend into one another, with the sky and ground relatively constant between images. The effect is interesting, at least (earthquake-y?).
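Assuming w is the image width and yHorizon the hand-picked horizon row for a given scene, the control point layout looks like:

    xs  = linspace(1, w, 7).';          % 7 points, regular in x
    pts = [xs, yHorizon * ones(7, 1)];  % all pinned at the horizon in y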



I'm not convinced we could synthesize images for scene completion this way. In blending, the semantic integrity is really hit-or-miss (even with good matches), and foreground objects (which I did not try to account for in the control points) fade in and out. Maybe if we restricted the morphing to only change the warp, and NOT the color, we could get new images with reasonable structure. More to investigate there.

At the same time, this exploration was interesting. I computed the "mean scene", similar to the class face; it is shown below. While it looks fairly chaotic, this average tells us a few clear things: 1) people usually take outdoor photos when the sky is blue, and this blueness fades near the horizon; 2) cities photographed from a distance (as these are) tend to be framed in the center of photos, with more green and trees along the perimeter. The foreground ghost objects (buildings, trees) also look very cool in this photo and give some idea of how the photographers composed their shots.

All images