GrabCut Segmentation With Automatic Item Selection

Matt Wilde (mwilde)


After consulting with Professor Hayes, I ended up implementing GrabCut with some minor enhancements based on Sketch2Photo and a recent survey paper.

Thus, my project consists of 3 parts:

  1. Automatic foreground selection
  2. GrabCut Hard Segmentation
  3. Matting using the Gaussian Mixture Models from GrabCut

Automatic foreground selection

The Sketch2Photo paper observes that using saliency filtering to pick "easy" photos makes automatic processing of an image much more reliable.

Were I to extend my pipeline, I would implement something along the lines of the simplicity filter from the photo quality assessment paper (picking images whose edges are concentrated near the center) and use it to choose the input images.

The automatic foreground selection would then take advantage of this. It looks for strong edges and picks a rectangular mask that captures the edges above a fixed threshold.
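A minimal sketch of this kind of edge-based rectangle selection follows. It is an illustration only, not my actual code: the function names, the use of NumPy's finite-difference gradient as the edge detector, and the threshold value are all assumptions made for the example.

```python
import numpy as np

def edge_bounding_box(gray, threshold):
    """Return (top, left, bottom, right) of the tightest rectangle that
    contains every pixel whose gradient magnitude exceeds `threshold`.
    Bottom and right are exclusive. Returns None if no pixel qualifies."""
    gray = np.asarray(gray, dtype=float)
    # Finite-difference gradients along rows (gy) and columns (gx).
    # Assumption: any edge-strength measure could be substituted here.
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    rows, cols = np.nonzero(magnitude > threshold)
    if rows.size == 0:
        return None
    return (int(rows.min()), int(cols.min()),
            int(rows.max()) + 1, int(cols.max()) + 1)

def rectangle_mask(shape, box):
    """Build a boolean mask that is True inside `box` and False elsewhere."""
    top, left, bottom, right = box
    mask = np.zeros(shape, dtype=bool)
    mask[top:bottom, left:right] = True
    return mask

# Toy example: a bright square on a dark background.
image = np.zeros((10, 10))
image[3:7, 4:8] = 100.0
box = edge_bounding_box(image, threshold=10.0)
initial_mask = rectangle_mask(image.shape, box)
```

The resulting rectangle is slightly larger than the square itself because the central-difference gradient also responds one pixel outside the boundary. For an initial mask that is harmless, since the rectangle only needs to contain the foreground.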

GrabCut Hard Segmentation

GrabCut refines an initial mask to segment the image based on a probabilistic model of the colors in the foreground and background of the image.

It models the colors in the foreground and in the background as two separate Gaussian Mixture Models (GMMs). This means that we assume the colors in each region fall into k clusters, and we can compute the probability that a given color belongs to that region's model.
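To make "the probability that a given color is part of that model" concrete, here is a hedged sketch of evaluating a GMM density for a batch of colors. The function name and the parameter layout (a list of weights, means and full covariance matrices) are assumptions for illustration, and fitting the mixture itself is not shown.

```python
import numpy as np

def gmm_density(colors, weights, means, covariances):
    """Evaluate a Gaussian mixture density at each color.

    colors:      (n, d) array of color vectors (d = 3 for RGB)
    weights:     k mixing weights that sum to 1
    means:       k mean vectors of length d
    covariances: k covariance matrices of shape (d, d)
    Returns an (n,) array of density values.
    """
    colors = np.atleast_2d(np.asarray(colors, dtype=float))
    n, d = colors.shape
    density = np.zeros(n)
    for w, mu, cov in zip(weights, means, covariances):
        diff = colors - np.asarray(mu, dtype=float)
        inv = np.linalg.inv(cov)
        # Squared Mahalanobis distance of every color to this component.
        mahal = np.einsum("ni,ij,nj->n", diff, inv, diff)
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
        density += w * np.exp(-0.5 * mahal) / norm
    return density

# Toy example: two equally weighted components, one dark and one bright.
weights = [0.5, 0.5]
means = [np.zeros(3), np.full(3, 10.0)]
covariances = [np.eye(3), np.eye(3)]
samples = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
values = gmm_density(samples, weights, means, covariances)
```

A color sitting on a component mean scores much higher than one halfway between the two clusters, which is the signal the segmentation uses to decide which model explains a pixel better.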

GrabCut then proceeds by repeating the following for N iterations:

  1. Use our existing mask to generate GMMs for the foreground and background.
  2. Generate a new mask by performing a graph cut where:

Matting using the GMMs from GrabCut

The original paper suggests computing a border contour and minimizing an energy function over it using dynamic programming. I take a simpler approach inspired by a survey paper by Wang and Cohen.

Since we have already computed the probabilities that a given pixel is in the foreground or background as part of GrabCut, we can use these probabilities to generate soft segmentation values.

  1. Compute a border region by morphologically dilating the mask and subtracting the morphologically eroded mask.
  2. Find the probabilities that a border pixel's color is a foreground color by querying the foreground and background GMMs.
  3. Assign the arithmetic mean of these probabilities as the alpha value for that pixel.
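The three steps above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not my actual code: the function names, the square structuring element and the border radius are made up for the example, and the two per-pixel foreground probabilities (`p_from_fg`, `p_from_bg`) are taken as precomputed inputs in [0, 1], since how they are normalized from the GMMs is not spelled out here.

```python
import numpy as np

def _neighborhood_stack(mask, radius, fill):
    """Stack every shift of `mask` within a (2*radius+1)^2 square window."""
    padded = np.pad(mask, radius, mode="constant", constant_values=fill)
    h, w = mask.shape
    size = 2 * radius + 1
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def dilate(mask, radius):
    """A pixel is set if any pixel in its window is set."""
    return _neighborhood_stack(mask, radius, False).any(axis=0)

def erode(mask, radius):
    """A pixel stays set only if its whole window is set.
    Pixels outside the image count as set, so the image edge does not erode."""
    return _neighborhood_stack(mask, radius, True).all(axis=0)

def soft_alpha(mask, p_from_fg, p_from_bg, radius=1):
    """Keep the hard mask away from the boundary and, inside the border
    region, use the arithmetic mean of the two probability estimates."""
    mask = np.asarray(mask, dtype=bool)
    border = dilate(mask, radius) & ~erode(mask, radius)
    alpha = mask.astype(float)
    alpha[border] = 0.5 * (p_from_fg[border] + p_from_bg[border])
    return alpha, border

# Toy example: a 3x3 foreground square with constant probability maps.
hard_mask = np.zeros((7, 7), dtype=bool)
hard_mask[2:5, 2:5] = True
p_fg = np.full((7, 7), 0.2)
p_bg = np.full((7, 7), 0.6)
alpha, border = soft_alpha(hard_mask, p_fg, p_bg, radius=1)
```

Only pixels in the border band receive fractional alpha values. Everything deeper inside the mask stays fully opaque and everything farther outside stays fully transparent, so the hard segmentation is preserved away from the boundary.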


This is a result that would have benefitted from some filtering. Had I implemented the simplicity filter, it would have told me that this image would turn out badly due to all of the edges at the bottom and the compression artifacts.


More results: