CS129 Project 3: Seam Carving

Reese Kuppig (rkuppig)

This algorithm implements the seam-carving method for intelligent image resizing. In standard image resizing, pixels are removed or interpolated between at a fixed interval that reflects the desired scale factor, yielding an image that is uniformly smaller. Image Retargeting seeks to identify and preserve the most interesting and important contents of an image, instead removing less important surrounding pixels to achieve the same desired dimensions. To calculate each pixel's importance, or energy, there are many possible methods, most involving gradient comparisons. As we have previously observed, human visual processing is especially sensitive to intensity gradients, so it follows that the least interesting pixels in an image are likely the pixels that are most similar to their neighbors.


Algorithm

The main steps of the algorithm are laid out below, and optimizations are discussed after.

Dynamic Programming

Finding the minimum-cost path through the energy image could be a complex problem, requiring exponential time to complete using an exhaustive depth-first search search. However, utilizing dynamic programming reduces the search to linear time in the number of image pixels. To minimize image distortion, we require that each pixel in the seam must be adjacent (laterally or diagonally) to the next pixel in the seam. Therefore, for each potential seam pixel, P, there are only three possible locations for the previous seam pixel, as shown below.

dyn.jpg

The dynamic approach proceeds as follows:

For each row of pixels in the image:
	For each pixel of a row:
		Find the minimum-cost path to reach that pixel by taking the minimum of the costs
		to reach each of the three possible preceding pixels and adding that to the cost 
		of the current pixel. The updated cost of the current pixel now reflects 
		the total cost of the minimum-cost path to reach that pixel.
dyn2.jpg

By this approach, each pixel is visited only once, and at each visit, at most three other pixels are examined, yielding linear time performance in the number of image pixels.

To further optimize the calculation of seam costs, the find_min_path() function takes as optional arguments the previous seam-cost image and the x coordinates of the last seam removed. With this extra information, the function only updates the pixels in the seam-cost image that could have changed with the removal of the seam, on average halving the runtime of this operation. The number of pixels that need to be updated also depends on the radius of the energy calculation filter. As the function travels row by row, the updated range is expanded by 1 pixel, to account for the propagation of previous updates. If values at the ends of the range are unchanged, then the range is shrunk to reprocess as few pixels as necessary.

Extra Credit

The baseline implementation of the image energy function used a local gradient filter, simply summing the intensity differences between each pixel and its immediate neighbors. This energy calculation yielded reasonable results. As we have discussed, human visual perception is sensitive to gradient changes. The most extreme gradient changes occur at edges, so the energy calculation was reformulated using edge-emphasizing filters.

First, a simple sobel filter, passed over the image both vertically and horizontally, acted to emphasize edges and gradient inconsistencies in textures, however, the sobel filter is vulnerable to noise. Next, a derivative of gaussian filter was tried, first smoothing the image and then emphasizing edges. The derivative of gaussian filter proved more resistant to noise, and still emphasized edges well. In the final energy calculation method tried, the responses of the derivative of gaussian filter were squared, yielding much stronger edge responses than any of the previous filters. By emphasizing edges, the seam-finding operation tended to avoid whole objects or well-defined regions, preserving what was likely important and noticeable content in the image. The energy images and final retargeted images for each energy calculation method are displayed below in Results.


Results

The results of the algorithm's image-retargeting operation are displayed below. The original image and the image scaled to the target dimensions are displayed first. Below that, the energy image and retargeting results for each energy calculation method are displayed side-by-side. Failure cases are discussed at the end.

boat.jpg

Original and Scaled

boat.jpg boat_200x400scaled.jpg

Gradient

boat_grad_energy.jpg boat_200x400grad.jpg

Sobel

boat_sobel_energy.jpg boat_200x400sobel.jpg

Derivative of Gaussian

boat_dog_energy.jpg boat_200x400dog.jpg

Derivative of Gaussian2

boat_sqdog_energy.jpg boat_200x400sqdog.jpg

wyeth.jpg

Original and Scaled

wyeth.jpg wyeth_150x248scaled.jpg

Gradient

wyeth_grad_energy.jpg wyeth_150x248grad.jpg

Sobel

wyeth_sobel_energy.jpg wyeth_150x248sobel.jpg

Derivative of Gaussian

wyeth_dog_energy.jpg wyeth_150x248dog.jpg

Derivative of Gaussian2

wyeth_sqdog_energy.jpg wyeth_150x248sqdog.jpg

dali.jpg

Original and Scaled

dali.jpg dali_246x580scaled.jpg

Gradient

dali_grad_energy.jpg dali_246x580grad.jpg

Sobel

dali_sobel_energy.jpg dali_246x580sobel.jpg

Derivative of Gaussian

dali_dog_energy.jpg dali_246x580dog.jpg

Derivative of Gaussian2

dali_sqdog_energy.jpg dali_246x580sqdog.jpg

pamir.jpg

Original and Scaled

pamir.jpg pamir_200x272scaled.jpg

Gradient

pamir_grad_energy.jpg pamir_200x272grad.jpg

Sobel

pamir_sobel_energy.jpg pamir_200x272sobel.jpg

Derivative of Gaussian

pamir_dog_energy.jpg pamir_200x272dog.jpg

Derivative of Gaussian2

pamir_sqdog_energy.jpg pamir_200x272sqdog.jpg

target_01.jpg

Original and Scaled

target_01.jpg target_01_300x321scaled.jpg

Gradient

target_01_grad_energy.jpg target_01_300x321grad.jpg

Sobel

target_01_sobel_energy.jpg target_01_300x321sobel.jpg

Derivative of Gaussian

target_01_dog_energy.jpg target_01_300x321dog.jpg

Derivative of Gaussian2

target_01_sqdog_energy.jpg target_01_300x321sqdog.jpg

monument.jpg

Original and Scaled

monument.jpg monument_300x395scaled.jpg

Gradient

monument_grad_energy.jpg owl_400x400grad.jpg

Sobel

monument_sobel_energy.jpg monument_300x395sobel.jpg

Derivative of Gaussian

monument_dog_energy.jpg monument_300x395dog.jpg

Derivative of Gaussian2

monument_sqdog_energy.jpg monument_300x395sqdog.jpg

guineapig.jpg

Original and Scaled

guineapig.jpg guineapig_300x333scaled.jpg

Gradient

guineapig_grad_energy.jpg guineapig_300x333grad.jpg

Sobel

guineapig_sobel_energy.jpg guineapig_300x333sobel.jpg

Derivative of Gaussian

guineapig_dog_energy.jpg guineapig_300x333dog.jpg

Derivative of Gaussian2

guineapig_sqdog_energy.jpg guineapig_300x333sqdog.jpg

This image demonstrates that the retargeting operation may be applied in both the x and y dimensions.

wyeth.jpg

Original and Scaled

wyeth.jpg wyeth_150x148scaled.jpg

Derivative of Gaussian

wyeth_dog_energy.jpg wyeth_150x148dog.jpg

Failure Cases

In this first case, we see the importance of emphasizing edges, as the gradient energy method yields a grotesque face. Human visual perception is also very sensitive to faces, and even small manipulations to facial structure can appear startling. As a first measure, edge emphasis will protect the shapes of faces, given enough contrast between the face and the background, as evidenced with the edge-emphasizing energy methods. Explicit face detection and reweighting of the energy function would be the most reliable method to preserve faces.

monalisa.jpg

Original and Scaled

monalisa.jpg monalisa_200x536scaled.jpg

Gradient

monalisa_grad_energy.jpg monalisa_200x536grad.jpg

Sobel

monalisa_sobel_energy.jpg monalisa_200x536sobel.jpg

Derivative of Gaussian

monalisa_dog_energy.jpg monalisa_200x536dog.jpg

Derivative of Gaussian2

monalisa_sqdog_energy.jpg monalisa_200x536sqdog.jpg

This failure case demonstrates the difficulty of preserving straight lines. Even when lines are emphasized, if they occupy most of the image, they must be cut, and any cut through a slanted line will create a noticeable sheared junction. In this case, retargeting performance could be improved by implementing a "forward-looking" energy calculation that finds the seam that "inserts" the most energy into the image, rather than just the seam that contains the least energy. This augmented method tends to better preserve lines.

kandinksy.jpg

Original and Scaled

kandinsky.jpg kandinsky_200x280scaled.jpg

Gradient

kandinsky_grad_energy.jpg kandinsky_200x280grad.jpg

Sobel

kandinsky_sobel_energy.jpg kandinsky_200x280sobel.jpg

Derivative of Gaussian

kandinsky_dog_energy.jpg kandinsky_200x280dog.jpg

Derivative of Gaussian2

kandinsky_sqdog_energy.jpg kandinsky_200x280sqdog.jpg

This case also demonstrates that parallel lines are difficult to preserve. The violation of parallel lines is especially apparent to our visual perception, as they are often indicators of perspective in an image.

vault.jpg

Original and Scaled

vault.jpg vault_200x533scaled.jpg

Gradient

vault_grad_energy.jpg vault_200x533grad.jpg

Sobel

vault_sobel_energy.jpg vault_200x533sobel.jpg

Derivative of Gaussian

vault_dog_energy.jpg vault_200x533dog.jpg

Derivative of Gaussian2

vault_sqdog_energy.jpg vault_200x533sqdog.jpg

Parallel lines are also disrupted in this image, though we can see that by uniformly de-emphasizing the lines, as in the gradient energy image, they are less distorted in the retargeted image.

owl.jpg

Original and Scaled

owl.jpg owl_400x400scaled.jpg

Gradient

owl_grad_energy.jpg owl_400x400grad.jpg

Sobel

owl_sobel_energy.jpg owl_400x400sobel.jpg

Derivative of Gaussian

owl_dog_energy.jpg owl_400x400dog.jpg

Derivative of Gaussian2

owl_sqdog_energy.jpg owl_400x400sqdog.jpg

This failure case demonstrates the difficulties that strong texture poses. By our energy calculations, the foliage actually appears to be the most important content, and the train is chipped away. This is perhaps a case where user annotation could benefit the algorithm, so that the train could be designated as a high importance area.

train.jpg

Original and Scaled

train.jpg train_300x334scaled.jpg

Gradient

train_grad_energy.jpg train_300x334grad.jpg

Sobel

train_sobel_energy.jpg train_300x334sobel.jpg

Derivative of Gaussian

train_dog_energy.jpg train_300x334dog.jpg

Derivative of Gaussian2

train_sqdog_energy.jpg train_300x334sqdog.jpg