CS129 Project 2: Image Blending

Reese Kuppig (rkuppig)

This project explores image blending by gradient-domain processing, which lets a user implant a region of a source image into a target image. The most basic implementation would copy the source pixels directly into the target image, but the result is rarely convincing: pixel copying creates conspicuous seams, abrupt high-frequency intensity changes, at the edges of the copied region. A more perceptually subtle blend relies on the observation that human visual perception is more sensitive to gradients than to absolute intensities. Therefore, to create a seamless blend, the original pixel gradients of both the target image and the copied source region must be preserved as much as possible.


Algorithm

The main steps of the algorithm are laid out below, with additional detail given for the inner steps of the imblend function, which performs the image blending. The objective of the imblend function is to solve the equation Ax = b for the vector x, where A is a matrix encoding the gradient constraints on the unknown pixels, and b is a vector holding the desired output gradients and the known pixel values. The solution x contains the pixel values of the final blended image.
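As an illustration of the Ax = b formulation, here is a minimal single-channel sketch in Python with NumPy. It is not the project's imblend code: the function name poisson_blend_channel, the 4-neighbour Laplacian constraints, and the dense matrix are assumptions made for clarity, and the sketch assumes the source and target are already aligned, have the same shape, and that the mask does not touch the image border.

```python
import numpy as np

def poisson_blend_channel(source, target, mask):
    """Blend one channel by solving A x = b for the pixels inside the mask.

    source, target: 2-D float arrays of the same shape (already aligned).
    mask: 2-D bool array, True where source content is implanted.
    Assumption: the mask does not touch the image border.
    """
    h, w = target.shape
    ys, xs = np.nonzero(mask)
    n = len(ys)
    index = -np.ones((h, w), dtype=int)   # maps a pixel to its row/column in A
    index[ys, xs] = np.arange(n)

    A = np.zeros((n, n))   # dense for readability; a sparse matrix is needed at real sizes
    b = np.zeros(n)
    for k, (y, x) in enumerate(zip(ys, xs)):
        A[k, k] = 4.0
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            # Desired gradient is taken from the source image.
            b[k] += source[y, x] - source[ny, nx]
            if mask[ny, nx]:
                A[k, index[ny, nx]] = -1.0   # neighbour is also unknown
            else:
                b[k] += target[ny, nx]       # neighbour is a known target pixel
    x_sol = np.linalg.solve(A, b)

    result = target.astype(float).copy()
    result[ys, xs] = x_sol
    return result
```

For a color image, this sketch would be run once per channel. A useful sanity check is that a source differing from the target only by a constant offset blends back to the target exactly, since its gradients are identical.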

To increase efficiency and decrease runtime, the values used to build the A matrix were precomputed. To that end:

Extra Credit

A realignment step was added to the basic pipeline to fine-tune the hard-coded offsets through a greedy local-minimum search. Starting from the hard-coded alignment, each one-pixel move in the 8 possible directions was scored by the sum of squared differences between the Laplacians of the source and target images within the mask region. At each iteration, the best-scoring move became the new offset, and this refinement was repeated until a local minimum was found. In general, realignment produced relatively small adjustments, but in some cases the effects were more noticeable, better aligning edges in the mask region:
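The greedy search described above can be sketched as follows in Python with NumPy. The function names (laplacian, score, refine_offset) and the coordinate convention are assumptions for illustration: the mask is given in source coordinates, and an offset (dy, dx) places source pixel (y, x) at target pixel (y + dy, x + dx). Offsets that push the mask outside the target are treated as infinitely bad.

```python
import numpy as np

def laplacian(img):
    """4-neighbour Laplacian, replicating edge pixels at the borders."""
    p = np.pad(img.astype(float), 1, mode="edge")
    return (4.0 * p[1:-1, 1:-1]
            - p[:-2, 1:-1] - p[2:, 1:-1]
            - p[1:-1, :-2] - p[1:-1, 2:])

def score(src_lap, tgt_lap, mask, offset):
    """Sum of squared differences between the Laplacians over the mask."""
    dy, dx = offset
    ys, xs = np.nonzero(mask)          # assumes a non-empty mask
    ty, tx = ys + dy, xs + dx
    h, w = tgt_lap.shape
    if ty.min() < 0 or tx.min() < 0 or ty.max() >= h or tx.max() >= w:
        return np.inf                  # mask would fall outside the target
    diff = src_lap[ys, xs] - tgt_lap[ty, tx]
    return float(np.sum(diff ** 2))

def refine_offset(source, target, mask, start):
    """Greedy descent over one-pixel moves until no move improves the score."""
    src_lap, tgt_lap = laplacian(source), laplacian(target)
    best = tuple(start)
    best_score = score(src_lap, tgt_lap, mask, best)
    moves = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
             if (dy, dx) != (0, 0)]
    while True:
        candidates = [(best[0] + dy, best[1] + dx) for dy, dx in moves]
        scores = [score(src_lap, tgt_lap, mask, c) for c in candidates]
        i = int(np.argmin(scores))
        if scores[i] >= best_score:    # local minimum reached
            return best
        best, best_score = candidates[i], scores[i]
```

Because the score must strictly decrease at every step, the loop always terminates, but it only finds a local minimum near the starting offset, which is consistent with the small adjustments reported above.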

xc_result_08.jpg result_08.jpg

Left: original offset. Right: after realignment, the rainbow aligns better with the left edge.


Results

The results of the algorithm's blending operation are displayed below. For each example, the target, mask, and source images are shown first, followed by the output of the direct pixel-copying strategy and then the final image after blending and realignment. Failure cases are presented and addressed afterward.

target_01.jpg mask_01.jpg source_01.jpg
naive_result_01.jpg result_01.jpg
target_02.jpg mask_02.jpg source_02.jpg
naive_result_02.jpg result_02.jpg
target_03.jpg mask_03.jpg source_03.jpg
naive_result_03.jpg result_03.jpg
target_04.jpg mask_04.jpg source_04.jpg
naive_result_04.jpg result_04.jpg
target_05.jpg mask_05.jpg source_05.jpg
naive_result_05.jpg result_05.jpg
target_07.jpg mask_07.jpg source_07.jpg
naive_result_07.jpg result_07.jpg
target_08.jpg mask_08.jpg source_08.jpg
naive_result_08.jpg result_08.jpg
target_09.jpg mask_09.jpg source_09.jpg
naive_result_09.jpg result_09.jpg
target_10.jpg mask_10.jpg source_10.jpg
naive_result_10.jpg result_10.jpg

Failure Cases

In this first failure case, the blending operation replicates the overall background color fairly well, but the blended region covers some high-frequency areas of the target image, creating obvious visual defects. The algorithm does not recognize and preserve these areas of the target image; it preserves the gradients of the source image instead. Since the approach is founded on human perceptual sensitivity to gradients, destroying the sharp gradients of strong edges in the target image undermines its performance. One way to counteract this would be to keep, at each pixel, whichever of the source and target gradients is stronger.

target_06.jpg mask_06.jpg source_06.jpg
naive_result_06.jpg result_06.jpg
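The remedy suggested for the first failure case, keeping the stronger of the source and target gradients, could be sketched as below in Python with NumPy. This was not implemented in the project; the function name mixed_guidance is hypothetical, and the sketch assumes aligned single-channel images of the same shape with a mask that does not touch the image border. The returned values would take the place of the pure-source gradient term in the b vector of Ax = b.

```python
import numpy as np

def mixed_guidance(source, target, mask):
    """Per-pixel guidance term built from the stronger gradient.

    For each of the 4 neighbours, take the pixel-minus-neighbour difference
    from whichever image has the larger magnitude, then sum the four.
    Assumption: the mask does not touch the image border, so the
    wrap-around introduced by np.roll never reaches a masked pixel.
    """
    s = source.astype(float)
    t = target.astype(float)
    guidance = np.zeros_like(s)
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        # np.roll with shift (-dy, -dx) brings the neighbour at
        # (y + dy, x + dx) to position (y, x).
        ds = s - np.roll(s, (-dy, -dx), axis=(0, 1))
        dt = t - np.roll(t, (-dy, -dx), axis=(0, 1))
        guidance += np.where(np.abs(ds) > np.abs(dt), ds, dt)
    guidance[~mask] = 0.0   # only masked pixels are unknowns
    return guidance
```

With this choice, a strong edge in the target that passes under the mask contributes its own gradient wherever the source is locally flat, so it would be less likely to be smoothed away.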

In this second failure case, the source image blends too much and takes on a ghostly appearance. Fortunately, that happens to suit this particular image. However, this case illustrates a potential limitation of the algorithm: in most of the results, the algorithm shifts the overall hue of the source content towards that of the surrounding target image. In the trebuchet image above, the wood of the source image takes on a green hue from the grass of the target image. In the target image below, the dominant color is blue, so the blue channel of the source content is raised to meet the intensity of the blue channel in the target image. Because the source image has only a weakly expressed blue channel to begin with, the raised blue channel dominates the red and green channels, which are themselves lowered to match the weak red and green channels of the target image. This weakness could possibly be addressed by taking the gradient of all channels together, instead of one channel at a time, and minimizing the overall difference between gradients.

target_11.jpg mask_11.jpg source_11.jpg
naive_result_11.jpg result_11.jpg