CS129 Project 2: Image Blending

This project explores image blending through gradient-domain processing, which allows a user to implant a region of a source image into a target image. The most basic implementation would copy the source pixels directly into the target image, but the result is rarely convincing: direct copying creates conspicuous seams, or high-frequency discontinuities, along the edges of the copied region. A more perceptually subtle blend relies on the observation that human visual perception is more sensitive to gradients than to absolute intensities. Therefore, to create a seamless blend, the original pixel gradients of both the target image and the copied source region must be preserved as much as possible.

Algorithm

The main steps of the algorithm are laid out below, with additional detail given for the inner steps of the imblend function, which performs the image blending. The objective of the imblend function is to find the vector x in the equation Ax = b, where A is a sparse matrix that applies the discrete Laplacian to the unknown pixels of the output, and b is a vector holding the desired output gradients and known pixel values. The vector x, once reshaped, is the final blended image.
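Written out per pixel, the system takes roughly the following form. This is a sketch consistent with the construction described in this write-up (4 on the diagonal, -1 for in-mask neighbors, identity rows outside the mask). The symbols are notation introduced here for illustration: Ω is the masked region, N(p) is the set of 4-connected neighbors of pixel p, s is the source image, and t is the target image. It assumes every masked pixel has all four neighbors inside the image.

```latex
\begin{aligned}
4\,x_p - \sum_{q \in N(p) \cap \Omega} x_q
  &= 4\,s_p - \sum_{q \in N(p)} s_q + \sum_{q \in N(p) \setminus \Omega} t_q,
  && p \in \Omega \\
x_p &= t_p, && p \notin \Omega
\end{aligned}
```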

Read in the source, mask, and target images.
Expand the boundaries of the source and mask, based on the given offset values, to ensure that each input image is the same size.
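The expansion step can be sketched as follows, using nested lists in place of real image arrays. The function name place_on_canvas, the (row, column) offset convention, and the zero fill value are assumptions made for illustration, not details taken from the project code.

```python
def place_on_canvas(image, canvas_h, canvas_w, offset_y, offset_x, fill=0):
    """Return a canvas_h x canvas_w grid with `image` pasted at the given offset.

    Pixels that would land outside the canvas are dropped; canvas pixels
    not covered by `image` keep the `fill` value.
    """
    canvas = [[fill] * canvas_w for _ in range(canvas_h)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            ty, tx = y + offset_y, x + offset_x
            if 0 <= ty < canvas_h and 0 <= tx < canvas_w:
                canvas[ty][tx] = value
    return canvas


# Toy 2x2 source and mask, expanded to a 4x5 target size.
source = [[10, 20], [30, 40]]
mask = [[1, 1], [0, 1]]
padded_source = place_on_canvas(source, 4, 5, offset_y=1, offset_x=2)
padded_mask = place_on_canvas(mask, 4, 5, offset_y=1, offset_x=2)
```

Applying the same offset to the source and the mask keeps them registered with each other, so a mask pixel always refers to the same source pixel after padding.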

imblend

Produce the sparse matrix A by pre-computing the indices and values of each of its elements, so that A applies the discrete Laplacian to the unknown pixels in the masked region of the final image (x).
Produce the vector b by evaluating the discrete Laplacian over the source image within the mask region, and combining those values with known pixel values of the target image.
Knowing the formulations of A and b, solve for x.
Clip pixel values of x that extend outside the valid intensity range and reshape to the proper image dimensions.
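The last two steps can be illustrated on a toy system. The write-up does not say which solver was used, so the sketch below substitutes a plain Gauss-Seidel iteration, which converges for diagonally dominant systems like this one; a real implementation would more likely call a sparse direct solver. The row format (a list of (column, value) pairs per row), the [0, 1] intensity range, and the row-major reshape are assumptions.

```python
def gauss_seidel(rows, b, iterations=500):
    """Approximately solve A x = b; rows[i] is a list of (column, value) pairs.

    Assumes every row has a nonzero diagonal entry.
    """
    x = [0.0] * len(b)
    for _ in range(iterations):
        for i, entries in enumerate(rows):
            diagonal = 0.0
            off_diagonal_sum = 0.0
            for j, value in entries:
                if j == i:
                    diagonal = value
                else:
                    off_diagonal_sum += value * x[j]
            x[i] = (b[i] - off_diagonal_sum) / diagonal
    return x


def clip_and_reshape(x, height, width, low=0.0, high=1.0):
    """Clamp values to [low, high] and reshape the flat vector row by row."""
    clipped = [min(max(v, low), high) for v in x]
    return [clipped[r * width:(r + 1) * width] for r in range(height)]


# Toy system: unknowns 0 and 3 are fixed, unknowns 1 and 2 are coupled.
rows = [
    [(0, 1.0)],
    [(1, 4.0), (2, -1.0)],
    [(1, -1.0), (2, 4.0)],
    [(3, 1.0)],
]
b = [0.2, 5.0, 2.0, 0.9]
x = gauss_seidel(rows, b)
image = clip_and_reshape(x, height=2, width=2)
```

In this toy case the unclipped solution for unknown 1 exceeds 1, so the clipping step pulls it back to the top of the assumed intensity range.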

To increase efficiency and decrease runtime, the values used to build the A matrix were precomputed. To that end:

First, the mask image was filtered to identify the edge pixels of the mask. These pixels require values from the target image when computing the b vector.
Each mask pixel was assigned a value of 4 on the diagonal of its row in the A matrix, and each neighbor of that pixel inside the mask was assigned a value of -1 in the same row, to emulate the discrete Laplacian.
Pixels outside the mask were assigned a value of 1 on the diagonal of the A matrix, as their values in the final image are the same as in the target image.
Neighbors that fell outside the mask region were not assigned values of -1; instead, the values of these neighbor pixels in the target image were later added to the b vector.
These precomputed indices and values were used to generate a sparse representation of the A matrix, which saves space and allows for faster and more efficient matrix operations.
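A minimal sketch of this construction for a single color channel is given below. It is not the project's code: the function name build_system, the row-major pixel indexing, and the use of nested lists are assumptions, and the edge-pixel filtering step is replaced here by a simple per-neighbor mask test. The sketch also assumes the mask does not touch the image border.

```python
def build_system(source, target, mask):
    """Return (entries, b) for one channel.

    entries is a list of (row, column, value) triplets describing A;
    b is the right-hand side. All three inputs are equal-sized 2-D lists.
    """
    height, width = len(mask), len(mask[0])
    entries = []
    b = [0.0] * (height * width)
    for y in range(height):
        for x in range(width):
            p = y * width + x
            if not mask[y][x]:
                # Outside the mask: identity row, value taken from the target.
                entries.append((p, p, 1.0))
                b[p] = float(target[y][x])
                continue
            # Inside the mask: discrete Laplacian row.
            entries.append((p, p, 4.0))
            b[p] = 4.0 * source[y][x]
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                b[p] -= source[ny][nx]
                if mask[ny][nx]:
                    entries.append((p, ny * width + nx, -1.0))
                else:
                    # Known neighbor: move its target value into b.
                    b[p] += target[ny][nx]
    return entries, b


# Flat source and target with a two-pixel mask in the interior.
mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
source = [[0.5] * 4 for _ in range(3)]
target = [[0.2] * 4 for _ in range(3)]
entries, b = build_system(source, target, mask)
```

The triplet list is the kind of input a sparse-matrix constructor typically accepts, for example scipy.sparse.coo_matrix((values, (rows, columns))) in Python, so the matrix can be assembled in one call rather than filled element by element.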

Extra Credit

A realignment step was added to the basic pipeline to fine-tune the hard-coded offsets through a greedy local-minimum search. Starting from the hard-coded alignment, each one-pixel move in the 8 possible directions was scored by taking the sum of squared differences between the source and target Laplacians within the mask region. At each iteration, the best-scoring move became the new offset, and this refinement was repeated until a local minimum was found. In general, the realignment produced relatively small adjustments, but in some cases the effects were more noticeable, better aligning edges in the mask region.
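The search loop itself can be sketched independently of the scoring function. In the sketch below, score is a hypothetical callable that maps a (dy, dx) offset to a cost, where lower is better; in the pipeline described above it would compute the sum of squared differences between Laplacians, but a toy quadratic stands in for it here. The max_steps safeguard is also an addition for illustration.

```python
def refine_offset(score, start, max_steps=100):
    """Greedy local search over one-pixel moves in the 8 neighboring directions."""
    moves = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
             if (dy, dx) != (0, 0)]
    best, best_cost = start, score(start)
    for _ in range(max_steps):
        candidates = [(best[0] + dy, best[1] + dx) for dy, dx in moves]
        candidate = min(candidates, key=score)
        cost = score(candidate)
        if cost >= best_cost:
            break  # no neighbor improves the score: local minimum reached
        best, best_cost = candidate, cost
    return best


# Toy score with a single minimum at offset (3, -2).
def toy_score(offset):
    return (offset[0] - 3) ** 2 + (offset[1] + 2) ** 2


refined = refine_offset(toy_score, start=(0, 0))
```

Because the search only ever accepts an improving move, it stops at the first local minimum it reaches, which matches the small adjustments reported above when the starting offset is already close.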

Results

The results of the algorithm's blending operation are displayed below. The target, mask, and source images are displayed to the left, followed by the output of the direct pixel-copying strategy, and the final image, after blending and realignment, is displayed to the far right. Failure cases are presented and addressed after.

Failure Cases

In this first failure case, the blending operation works fairly well to replicate the overall background color, but the blended region obstructs some high-frequency areas in the target image, creating obvious visual defects. The algorithm fails to recognize and preserve these areas of the target image, instead preserving only the gradients of the source image. Since the algorithm is founded on human perceptual sensitivity to gradients, destroying the sharp gradients of strong edges in the target image undermines its performance. To counteract such effects, the algorithm could preserve, at each pixel, whichever of the source and target gradients is stronger.
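One possible reading of that suggestion, often called mixed gradients, is sketched below. It was not part of the project as described, and the function name mixed_guidance is hypothetical. For each of the four neighbors of a pixel, it keeps whichever of the source and target differences has the larger magnitude; the sum would replace the source Laplacian term when building b.

```python
def mixed_guidance(source, target, y, x):
    """Sum over the 4 neighbors of the stronger of the source/target differences."""
    total = 0.0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        s = source[y][x] - source[y + dy][x + dx]
        t = target[y][x] - target[y + dy][x + dx]
        # Keep the difference with the larger magnitude (source wins ties).
        total += s if abs(s) >= abs(t) else t
    return total


# Flat source placed over a target with a strong vertical edge.
source = [[0.5, 0.5, 0.5] for _ in range(3)]
target = [[0.0, 0.0, 1.0] for _ in range(3)]
guidance = mixed_guidance(source, target, 1, 1)
```

In this toy case the flat source contributes no gradient at all, so the guidance term comes entirely from the edge in the target, which is the behavior that would keep strong target edges visible through the blended region.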

In this second failure case, the source image blends too much and takes on a ghostly appearance. Fortunately, that effect suits this particular image. Still, this case illustrates a potential limitation of the algorithm: in most of the results, the algorithm shifts the overall hue of the source content towards that of the surrounding target image. In the trebuchet image above, the wood of the source image takes on a green hue from the grass of the target image. In the target image below, the dominant color is blue, so the blue channel of the source image is emphasized. However, the source image does not have a strongly expressed blue channel, so when that channel is raised to meet the intensity of the blue channel in the target image, it dominates the other channels, which recede at the same time because the red and green channels of the target image are weak. This weakness could possibly be addressed by considering the gradients of all channels together, instead of one channel at a time, and minimizing the overall difference between gradients.

Reese Kuppig (rkuppig)
