After consulting with Professor Hayes, I ended up implementing GrabCut with some minor enhancements based on Sketch2Photo and a recent survey paper.

Thus, my project consists of 3 parts:

- Automatic foreground selection
- GrabCut Hard Segmentation
- Matting using the Gaussian Mixture Models from GrabCut

The Sketch2Photo paper observes that using saliency filtering to pick "easy" photos lets us do a better job of automatically processing an image.

Were I to add to my pipeline, I would implement something along the lines of the simplicity filter from the photo quality assessment paper (picking images with central edges) to feed into my pipeline.

The automatic foreground selection would then take advantage of this. It looks for strong edges and picks a rectangular mask that captures the edges above a fixed threshold.
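As a rough illustration of that idea, here is a minimal sketch, not my actual code. It assumes a grayscale image as a NumPy array, uses a simple finite-difference gradient as the edge detector, and the function names `edge_bounding_box` and `rectangle_mask` are made up for this example.

```python
import numpy as np

def edge_bounding_box(gray, threshold):
    """Return (top, left, bottom, right), inclusive, around all strong edges.

    Returns None when no pixel exceeds the threshold.
    """
    gray = np.asarray(gray, dtype=float)
    # Finite-difference gradient as a stand-in for any edge detector.
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    rows, cols = np.nonzero(magnitude > threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

def rectangle_mask(shape, box):
    """Boolean mask that is True inside the inclusive box."""
    top, left, bottom, right = box
    mask = np.zeros(shape, dtype=bool)
    mask[top:bottom + 1, left:right + 1] = True
    return mask
```

The resulting rectangle plays the same role as the box a user would normally drag by hand to initialize GrabCut.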

GrabCut refines an initial mask to segment the image based on a probabilistic model of the colors in the foreground and background of the image.

It models the colors in the foreground and background as Gaussian Mixture Models. This means that we assume that the colors in each region (foreground or background) fall into k clusters, and we can compute the probability that a given color belongs to that region's model.
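To make "the probability that a given color is part of that model" concrete, here is a minimal sketch of evaluating a GMM density for RGB colors. It assumes the mixture weights, means, and full covariance matrices have already been fitted; `gmm_density` is a hypothetical helper name, not something from the GrabCut paper.

```python
import numpy as np

def gmm_density(colors, weights, means, covariances):
    """Evaluate the mixture density for an (N, 3) array of colors."""
    colors = np.atleast_2d(np.asarray(colors, dtype=float))
    dims = colors.shape[1]
    density = np.zeros(len(colors))
    for w, mu, cov in zip(weights, means, covariances):
        diff = colors - np.asarray(mu, dtype=float)
        inv = np.linalg.inv(cov)
        # Squared Mahalanobis distance of each color to this cluster.
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
        norm = np.sqrt((2 * np.pi) ** dims * np.linalg.det(cov))
        density += w * np.exp(-0.5 * maha) / norm
    return density
```

A color close to one of the k cluster means gets a high density; a color far from all of them gets a value near zero.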

GrabCut then proceeds by repeating the following for N iterations:

- Use our existing mask to generate GMMs for the foreground and background.
- Generate a new mask by performing a graph cut where:
  - Each pixel is attached to the source with the negative log of the probability that it is in the foreground.
  - Each pixel is attached to the sink with the negative log of the probability that it is in the background.
  - Each pixel is attached to its 4-way neighbors with edges weighted proportionally to the Euclidean distance between those pixels' colors in RGB space.
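The edge weights listed above can be sketched as follows. This only builds the weights and leaves the min-cut itself to whatever max-flow solver is used. The function names, the `eps` clamp that avoids `log(0)`, and the `scale` factor are assumptions for illustration.

```python
import numpy as np

def terminal_weights(p_fg, p_bg, eps=1e-12):
    """Source and sink link weights as negative log probabilities."""
    source = -np.log(np.maximum(np.asarray(p_fg, dtype=float), eps))
    sink = -np.log(np.maximum(np.asarray(p_bg, dtype=float), eps))
    return source, sink

def neighbor_weights(image, scale=1.0):
    """4-way neighbor link weights for an (H, W, 3) RGB image.

    horizontal[y, x] links pixel (y, x) to (y, x + 1);
    vertical[y, x] links pixel (y, x) to (y + 1, x).
    """
    image = np.asarray(image, dtype=float)
    # Euclidean distance between neighboring colors in RGB space.
    horizontal = scale * np.linalg.norm(image[:, 1:] - image[:, :-1], axis=2)
    vertical = scale * np.linalg.norm(image[1:, :] - image[:-1, :], axis=2)
    return horizontal, vertical
```

This follows the description above literally. The smoothness term in the GrabCut paper instead uses a weight that decays with color difference, of the form $\exp(-\beta \lVert \Delta \rVert^2)$, so swap that in if you want to match the paper.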

In the original paper, the authors suggest computing a border contour and minimizing an energy function over it using dynamic programming. I take a simpler approach inspired by a survey paper by Wang and Cohen.

Since we have computed the probabilities that a given pixel is in the foreground or background as part of GrabCut, we can use these probabilities to generate soft segmentation values.

- Compute a border region by morphologically dilating the mask and subtracting the morphologically eroded mask.
- Find the probabilities that a border pixel's color is a foreground color by querying the foreground and background GMMs.
- Assign the arithmetic mean of these probabilities as the alpha value for that pixel.
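The steps above can be sketched roughly as below. This is an illustration under assumptions, not my exact code: it uses a square structuring element of a hypothetical `radius`, treats pixels beyond the image edge as foreground during erosion, and takes the two per-pixel foreground-probability estimates (`estimate_a`, `estimate_b`, values in [0, 1]) as inputs rather than deriving them from the GMMs.

```python
import numpy as np

def _shifted_views(mask, pad_value, radius):
    """Yield every shift of the mask within a (2r+1) x (2r+1) window."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, radius, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            yield padded[dy:dy + h, dx:dx + w]

def dilate(mask, radius=1):
    out = np.zeros(np.shape(mask), dtype=bool)
    for view in _shifted_views(mask, False, radius):
        out |= view
    return out

def erode(mask, radius=1):
    out = np.ones(np.shape(mask), dtype=bool)
    # Padding with True keeps the image edge from eroding the mask.
    for view in _shifted_views(mask, True, radius):
        out &= view
    return out

def border_region(mask, radius=1):
    """Dilated mask minus eroded mask."""
    return dilate(mask, radius) & ~erode(mask, radius)

def soft_alpha(mask, estimate_a, estimate_b, radius=1):
    """Hard mask everywhere except the border, which gets the mean estimate."""
    mask = np.asarray(mask, dtype=bool)
    border = border_region(mask, radius)
    alpha = mask.astype(float)
    mean = 0.5 * (np.asarray(estimate_a, dtype=float)
                  + np.asarray(estimate_b, dtype=float))
    alpha[border] = mean[border]
    return alpha
```

Pixels well inside the mask keep an alpha of 1, pixels well outside keep 0, and only the thin band around the segmentation boundary receives fractional values.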

This is a result that would have benefitted from some filtering. Had I implemented the simplicity filter, it would have told me that this image would turn out badly due to all of the edges at the bottom and the compression artifacts.

Source | Result |
---|---|

More results:

Source | Result |
---|---|