CS129 / Project 3 / Seam Carving

Cristina's World by Andrew Wyeth resized in x and y

In this project, we resize images intelligently by removing low energy seams from the image, instead of cropping or resizing the image. The basic algorithm for seam carving is described well by Wikipedia, but I'll describe it more below. For the purposes of this assignment, we consider a seam in, say, the vertical direction to consist of one pixel for each y value in the image such that each pixel in the seam at a given y touches the pixels in the seam above it and below it. In the horizontal direction it is the same, but replace x for y. This formulation allows us to remove continuous parts of the image, which limits the amount of noise and discontinuities we generate.

Algorithm

The first step of the algorithm is to calculate an energy map of the image. We do this by looking at the difference between a given pixel and it's four adjacent pixels in each of the RGB channels. The energy of the pixel is the sum of the absolute values of the differences. This is meant to assign high energy to high gradient areas (such as edges) that we might want to preserve in the final output, and assign low energy to pixels in low gradient areas. What we want to find is the seam with the lowest cost in the image so that we can remove it. To find the lowest cost seam, we use a dynamic programming approach. At each row, we calculate the best seam up to this point for each pixel. This is done by, for each pixel, looking at the three pixels that it touches above it (i.e. top right, top center, and top left), and then choosing the one which has the best path up to that point. At each choice, we make sure that the pixel knows which pixel above it it chose, and what the cost of the best seam to that point would be (i.e. the cost of the best seam above it plus the energy at this pixel). For the top row, we assign the best seam up to that point as the cost of the pixel, and then for each row below it we employ this algorithm row by row, until we get to the bottom. Once we have done this calculation for every pixel in the image, we go through the bottom row and find the best seam up to that point (i.e. the best seam over the whole image), and then we figure out which pixels are in that seam by following the path of previous pixels we set as we were doing our calculation. Once we have figured out the best seam, we remove it from the image, making it one pixel smaller in whichever direction the seam was removed from. To resize an image by more than one pixel, you simply repeat the whole process once for each pixel you want to resize by.

Results

Original Retargeted to 200px less width

Discussion

As we can see by these results, our formulation of the seamcarving algorithm works alright for images with large areas with low detail (such as the Wyeth painting or the Lava scene), but is worse with images that include things like faces or human forms (see Mona Lisa or James Hays) or extremely well defined edges (such as the Kandinsky painting). The problem with faces is that there are areas of low energy (observe the darker side of James Hays face), but the proportions of the faces are important for us humans to recognize them as faces. The algorithm doesn't deal with this, but if we could optimize the energy function to better deal with this situation. In the Kandinsky painting, the geometric primitives such as the lines or circles are corrupted and lose their original shape. The straight black lines look especially bad because after the algorithm has run, those lines are no longer straight. One way to deal with this would be to calculate the forward energy instead of the backward energy as described in section 5 of this paper. The way that that energy function works is by optimizing the resultant image with the seam cut out, rather than just cutting the lowest current energy seam, so it will be optimized against introducing new edges. You can see at the top of this page a version of Wyeth's painting retargeted by 100 pixels in both the x and y directions. This one worked well, since the image has a lot of low detail green and white space, but on other images it is suboptimal.

N.B. All of the images in my results section have been retargeted by 200 pixels in the x direction except for the image of James Hays, which is retargeted by 100 pixels because the source image is only 160 pixels wide to begin with.