Algorithm

The input consists of three components: a source, a target, and a mask which defines which region of the target are to be maintained, and which regions of the source are to be brought to the target. The mask is a binary image: a 0 means the target background should be maintained, while a 1 means the source object should be copied over.

All three images need to be the same size during the blending process. This can be true at the outset, or the source and/or mask can be cropped/extrapolated as needed to fit the target image. The images must be the same size because the matrix manipulations of the algorithm requrie it.

Speaking of matrix manipulations, the actual algorithm requires solving a system of matrix equations. Because each pixel serves as an unknown in the output image, the matrices can get quite large.

The first matrix, A, is a sparse height*width × height*width. This ensures every pixel has its own column in the system of equations. The second matrix, b, is a column vector of height height*width.

These matrices are filled by iterating over the mask and determing which values to choose for each pixel. If the mask is 0 at a certain pixel, then that pixel in the final output will be the same as the target pixel. This means that matrix A would get a 1 at that pixel index, and matrix b would receive the pixel value of the target. However, if mask has a 1 at that pixel, then we must calculate the gradient of the source image.

The gradient was determined using a discrete Laplacian with at most four neighbors. Wherever we must calculate the gradient of the source, the A matrix at that pixel gets a 4 at that pixel value. Then, the neighboring pixels in the source image (left, right, above, and below) get a -1 in matrix A. In other words, the column in A that corresponds to the neighboring pixel in the source and the row in A that matches that of the current pixel receives the value of -1. The value of b at that pixel index is calculated using the following equation:

4*s(i,j) - s(i-1, j) - s(i+1, j) - s(i, j-1) - s(i, j+1)

where s(i,j) is the source image, and (i,j) are the columns and rows of that image.

Once A and b are filled, we can solve the system of equaions. To set up the equation, we have Ax = b where x is the output column vector. Once the system is solved, x is simply reshaped back to the normal dimensions of the target image.

Overview

Algorithm

Results

Here's what the program produced!