CS 129 Project 2 Writeup

Jonathan Mace (jcmace)
September 24 2012

Project Description

The goal of this project is to blend two images together. Given a source image, a target image, and a mask for the source image, I implemented three blending techniques to insert the source image into the target image. The details of the techniques and the results are outlined below. The following image gives an idea of the purpose of the project.

Part 1: First-order derivative blending

The first part of the implementation was a simple blending technique that minimises the error on the first derivatives over the masked area. I constructed a set of equations representing the properties the blended image should have; for example, that there is minimal gradient between pixels from the source image and pixels from the target image where we transition from a masked area to a non-masked area. With first-order derivative blending, the equations are formulated as follows, where v is the blended output and s is the source image:

  • v(i,j) - v(i-1, j) = s(i,j) - s(i-1, j)
  • v(i,j) - v(i+1, j) = s(i,j) - s(i+1, j)
  • v(i,j) - v(i, j-1) = s(i,j) - s(i, j-1)
  • v(i,j) - v(i, j+1) = s(i,j) - s(i, j+1)

For a more verbose description of the equation formulation, see the project description page.

To implement these equations in MATLAB, I built a very large sparse matrix to represent the system of equations. MATLAB then calculated the least-squares solution to the system, which produces a smooth transition between the source and target images.
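The idea of stacking the gradient equations into a system and solving it in the least-squares sense can be sketched on a single row of pixels. This is a minimal illustration in Python rather than the MATLAB code used for the project: the function names are hypothetical, the dense normal-equations solve stands in for MATLAB's sparse solver, and it assumes that a neighbour outside the mask contributes its target value as a known quantity.

```python
def solve_linear(M, y):
    """Solve the square system M x = y by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [y[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x


def blend_first_order_1d(source, target, mask):
    """Blend one row: masked pixels are unknowns, all other pixels keep the target."""
    # Map each masked pixel position to a column of the system matrix.
    idx = {p: k for k, p in enumerate(i for i, m in enumerate(mask) if m)}
    rows, rhs = [], []
    for i in idx:
        for j in (i - 1, i + 1):
            if not 0 <= j < len(source):
                continue  # no neighbour beyond the end of the row
            # Equation: v(i) - v(j) = s(i) - s(j)
            row = [0.0] * len(idx)
            row[idx[i]] = 1.0
            b = source[i] - source[j]
            if j in idx:
                row[idx[j]] = -1.0   # neighbour is also unknown
            else:
                b += target[j]       # assumed: known neighbour takes the target value
            rows.append(row)
            rhs.append(b)
    # Least squares via the normal equations (A^T A) v = A^T b.
    n = len(idx)
    AtA = [[sum(r[a] * r[c] for r in rows) for c in range(n)] for a in range(n)]
    Atb = [sum(r[a] * b for r, b in zip(rows, rhs)) for a in range(n)]
    v = solve_linear(AtA, Atb)
    out = [float(t) for t in target]
    for p, k in idx.items():
        out[p] = v[k]
    return out
```

For instance, blending a flat source into the target row [0, 0, 0, 0, 10, 10] with the two middle pixels masked gives values of about 4 and 6 for those pixels, a ramp between the two sides of the target. A real image needs the same construction over both axes, which is where a sparse matrix becomes necessary.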

To solve the problem of masks extending to the image boundaries, I added a 1 px border before processing and removed it again afterwards.
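The border trick can be sketched as a pair of helpers. This is only an illustration with hypothetical names, for images stored as lists of rows; the writeup does not say what value the border pixels held, so the fill value of zero here is an assumption.

```python
def pad_border(image, value=0):
    """Return a copy of a 2D image (list of rows) with a 1 px border of `value`."""
    width = len(image[0])
    blank = [value] * (width + 2)
    return [blank[:]] + [[value] + list(row) + [value] for row in image] + [blank[:]]


def strip_border(image):
    """Remove the outermost 1 px border from a 2D image."""
    return [row[1:-1] for row in image[1:-1]]
```

Padding the mask with zeros in the same way guarantees that every masked pixel has four neighbours inside the image, so the equations never index outside it.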

Part 2: Second-order derivative blending

For the second part of the implementation, I formulated my equations using the second-order approach. The equations were formulated as follows:

  • 4*v(i,j) - v(i-1, j) - v(i+1, j) - v(i, j-1) - v(i, j+1) = 4*s(i,j) - s(i-1, j) - s(i+1, j) - s(i, j-1) - s(i, j+1)

Again, the project description contains more information about the second-order approach.

Qualitatively, these results are a little better than those from the first-order derivative approach. Most noticeably, the colours seem more balanced (for example, the aeroplane superposition looks much sharper).
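The second-order equation above can be sketched in Python as follows. This is not the project's MATLAB code: it swaps the least-squares solve for a simple Gauss-Seidel iteration, the function name and iteration count are hypothetical, and it assumes that non-masked pixels keep their target values and that the mask does not touch the image border (masked border pixels are skipped).

```python
def blend_second_order(source, target, mask, iterations=500):
    """Solve 4*v(i,j) - sum(neighbours of v) = 4*s(i,j) - sum(neighbours of s)
    for the masked pixels by Gauss-Seidel iteration."""
    h, w = len(target), len(target[0])
    v = [[float(x) for x in row] for row in target]  # start from the target image
    inside = [(i, j) for i in range(1, h - 1) for j in range(1, w - 1) if mask[i][j]]
    for _ in range(iterations):
        for i, j in inside:
            # Right-hand side: the same second-order expression on the source.
            lap_s = (4 * source[i][j] - source[i - 1][j] - source[i + 1][j]
                     - source[i][j - 1] - source[i][j + 1])
            # Rearranged equation, using the latest neighbour values of v.
            v[i][j] = (lap_s + v[i - 1][j] + v[i + 1][j]
                       + v[i][j - 1] + v[i][j + 1]) / 4.0
    return v
```

With a perfectly flat source the right-hand side is zero, so each masked pixel settles to the average of its four neighbours and the target's boundary values are smoothly interpolated across the masked area.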

Part 3: Laplacian pyramid blending

For the third part of the implementation, I implemented Laplacian pyramid blending. To do this, I take the source and target images and construct a Gaussian pyramid of smaller images for each, where each successive image is half the size of the previous one. From each Gaussian pyramid, I then construct a Laplacian pyramid. Each entry in the Laplacian pyramid is calculated by taking an entry from the Gaussian pyramid and the entry one level above it, and subtracting the lower-resolution image (resized to match) from the higher-resolution image. The corresponding levels of the source and target Laplacian pyramids are then blended together using the same blending function; in my algorithm, I used a 3x3 Gaussian with a standard deviation of 0.5. The result of this step is a Laplacian pyramid of blended images. The output image is then reconstructed by adding all of the blended levels back on top of each other.
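The pyramid steps described above can be sketched on a one-dimensional signal. This is a minimal Python illustration, not the project's MATLAB code: all function names are hypothetical, a 3-tap [0.25, 0.5, 0.25] kernel stands in for the 3x3 Gaussian with standard deviation 0.5, upsampling is nearest-neighbour, and weighting each level by a Gaussian pyramid of the mask is an assumption about how the blending function is applied.

```python
def blur(signal):
    """Smooth with a 3-tap kernel [0.25, 0.5, 0.25], replicating the end samples."""
    n = len(signal)
    return [0.25 * signal[max(i - 1, 0)] + 0.5 * signal[i]
            + 0.25 * signal[min(i + 1, n - 1)] for i in range(n)]


def downsample(signal):
    """Blur, then keep every second sample (half the size)."""
    return blur(signal)[::2]


def upsample(signal, length):
    """Nearest-neighbour upsampling back to `length` samples."""
    return [signal[min(i // 2, len(signal) - 1)] for i in range(length)]


def gaussian_pyramid(signal, levels):
    pyramid = [[float(x) for x in signal]]
    for _ in range(levels):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def laplacian_pyramid(signal, levels):
    gauss = gaussian_pyramid(signal, levels)
    pyramid = []
    for fine, coarse in zip(gauss, gauss[1:]):
        up = upsample(coarse, len(fine))
        # Higher-resolution level minus the resized lower-resolution level.
        pyramid.append([f - u for f, u in zip(fine, up)])
    pyramid.append(gauss[-1])  # keep the coarsest Gaussian level as the last entry
    return pyramid


def collapse(pyramid):
    """Reconstruct a signal by adding the levels back together, coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        up = upsample(current, len(band))
        current = [b + u for b, u in zip(band, up)]
    return current


def blend_pyramids(source, target, mask, levels=2):
    """Blend level by level, weighting each level by a smoothed, resized mask."""
    ls = laplacian_pyramid(source, levels)
    lt = laplacian_pyramid(target, levels)
    gm = gaussian_pyramid(mask, levels)
    blended = [[m * s + (1 - m) * t for s, t, m in zip(bs, bt, bm)]
               for bs, bt, bm in zip(ls, lt, gm)]
    return collapse(blended)
```

Because each Laplacian level stores exactly what was lost by going one level down, collapsing an unblended pyramid returns the original signal, and a mask of all ones or all zeros returns the source or the target respectively.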

On the whole, the Laplacian pyramid technique did not work as well as the first- and second-order derivative approaches on the images provided. I believe that this is because the technique preserves more of the original image, and therefore requires more fine-grained masks. A different blending function might also have helped: I chose a 3x3 Gaussian with a standard deviation of 0.5, but could not find an alternative that produced significantly better results.

Results

The following table shows the input images and the results of blending naively and with the three techniques described above.

Source | Mask | Target | Naive | First-order | Second-order | Laplacian Pyramid

I also picked some of my own images, which produced variable results.

Source | Mask | Target | Naive | Laplacian Pyramid