Project 2: pb-lite: Boundary Detection
CS 143: Introduction to Computer Vision

Overview

In this project, we develop pb-lite, a simplified version of pb (probability of boundary), which the Berkeley Vision group has documented extensively over the years. Our method is loosely based on their latest paper: we combine per-pixel feature strengths into a probability value and use Canny edge detection to thin the contours. This naive algorithm works decently well, beating baselines that use Sobel and Canny edge detection. Evaluation is carried out against human annotations (ground truth) from a subset of the Berkeley Segmentation Data Set 500 (BSDS500).


Details

Texton Map

Traditional edge detection algorithms are sensitive to intensity changes but fail to reason about higher-level groupings of texture. For each image, we cluster filter responses to form a texton image (shown below), and compute texture features from it to avoid placing boundaries inside textured regions.

[Figure: input image and its texton image]

While a texton image does not directly leave an impression of edges, the distribution of different shades of grey in a local neighborhood carries texture information that is useful for boundary detection.

One drawback is that we use a fixed number of clusters (textons) when running k-means; this assumes a uniform complexity across all images, which is undesirable. A better model would be flexible enough to fit however many textons each image actually needs.
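The texton-map construction can be sketched as follows — in Python rather than the project's MATLAB, with a toy Gaussian-derivative filter bank and a plain Lloyd k-means standing in for the real filter bank and clustering; all function and variable names here are my own, not from the project code:

```python
import numpy as np
from scipy import ndimage

def texton_map(im, n_textons=8, n_iters=10, seed=0):
    """Cluster per-pixel filter-bank responses into texton labels.

    The filter bank here is a toy stand-in: Gaussian derivatives
    at two scales and two orientations (x- and y-derivatives).
    """
    responses = []
    for sigma in (1.0, 2.0):
        for order in ((0, 1), (1, 0)):  # derivative along x, then along y
            responses.append(ndimage.gaussian_filter(im, sigma, order=order))
    # one response vector per pixel: shape (num_pixels, num_filters)
    X = np.stack(responses, axis=-1).reshape(-1, len(responses))

    # plain Lloyd k-means on the response vectors
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_textons, replace=False)]
    for _ in range(n_iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(n_textons):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(0)
    return labels.reshape(im.shape)
```

Each pixel of the returned map is a texton id in `[0, n_textons)`; visualizing those ids as shades of grey gives the texton image shown above.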

Computing tg and bg

The gradient computation builds histograms over half-discs at a given orientation and scale. This step can be done efficiently with filtering operations in MATLAB. First we create a binary image in which the pixels we would like to count are set to one and the rest to zero. Convolving with masks shaped like half-discs then aggregates the counts over the whole image in a single statement, so the whole procedure reduces to the three nested for-loops below:

% initialize gradient maps: one per (orientation, scale)
g = cell(num_orient, num_scale);
% loop over each half-disc filter pair
for o = 1:num_orient
    for s = 1:num_scale
        curr_g = zeros(size(im_r));
        % accumulate the chi-square distance bin by bin
        for b = 1:num_bins
            % binary map of pixels falling into bin b
            curr_im = single((im_r >= binvals(b)) & (im_r < binvals(b+1)));
            % count bin membership over each half-disc via filtering
            left  = imfilter(curr_im, masks{o,s,1}, 'symmetric', 'conv');
            right = imfilter(curr_im, masks{o,s,2}, 'symmetric', 'conv');
            curr_g = curr_g + (left - right).^2 ./ (left + right + eps);
        end
        g{o,s} = 0.5 * curr_g;
    end
end

A faster solution is to use integral images, as documented in the appendix of the 2011 PAMI paper.
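The half-disc masks convolved in the loop above are straightforward to generate; here is a Python sketch (function and variable names are my own, not from the project code): the disc of a given radius is split by a diameter at angle theta, and the two resulting binary halves are used as the left/right filters.

```python
import numpy as np

def half_disc_masks(radius, theta):
    """Left/right half-disc binary masks at a given radius and orientation.

    Convolving a per-bin binary image with each half counts that bin's
    membership on each side of the oriented dividing line.
    """
    r = radius
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    disc = (x**2 + y**2) <= r**2
    # which side of the dividing diameter each point falls on
    side = (x * np.sin(theta) - y * np.cos(theta)) > 0
    left = (disc & side).astype(float)
    right = (disc & ~side).astype(float)
    return left, right
```

The two masks are disjoint by construction and together tile the disc, so the `left`/`right` counts at a pixel are exactly the two half-disc histograms compared by the chi-square distance.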

Output pb-lite

As shown below, a simple per-pixel mean of the tg and bg values suppresses textures and highlights boundaries. This suggests that tg and bg are roughly proportional to pb, so a simple mean of the feature values makes a decent boundary-strength indicator. Below are some visualizations of the feature strengths:

[Figure: input image and its feature-strength image]

An interesting thing to note is that naive binning of the brightness gradient yields artifacts in smooth regions (above left). Although brightness values do not change much in magnitude there, they have a decent chance of crossing from one bin to another. This suggests that clustering in intensity space, or softer binning, would be preferable.
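The combination step itself — averaging the gradient maps into a strength value and gating by Canny edges to get thinned contours — might look like this as a Python sketch (hypothetical names; the project code is MATLAB):

```python
import numpy as np

def pb_lite(tg_maps, bg_maps, canny_edges):
    """Combine texture- and brightness-gradient maps into a boundary map.

    tg_maps / bg_maps: lists of per-(orientation, scale) gradient images.
    canny_edges: binary Canny output, used here to thin the contours.
    """
    # simple mean of all feature maps as the boundary strength
    strength = np.mean(np.stack(tg_maps + bg_maps), axis=0)
    # normalize to [0, 1] so values read as probabilities
    lo, hi = strength.min(), strength.max()
    strength = (strength - lo) / (hi - lo + 1e-12)
    # keep strength only along the thinned Canny edges
    return strength * canny_edges
```

Gating by Canny is what makes the naive strength map usable for benchmarking: the chi-square gradients localize boundaries only coarsely, while Canny supplies thin, well-localized candidate contours.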

More Results:

[Figures: ten additional test images, each shown as input image, Canny edge detection, pb-lite, and gpb]

Results

The evaluation is done using all 200 test images from the BSDS500 dataset. Below are the precision-recall curve and the F-measures computed with the BSDS benchmark code.
Precision-Recall curve for 200 test images in BSDS500
Boundary
ODS: F( 0.69, 0.57 ) = 0.62   [th = 0.10]
OIS: F( 0.67, 0.62 ) = 0.64
Area_PR = 0.54
ODS stands for optimal dataset scale: a single threshold is fixed for the entire dataset, and the ODS F-measure corresponds to the best point on the plotted PR curve. OIS stands for optimal image scale: the best threshold is chosen per image, and the per-image counts at those thresholds are aggregated into one F-measure. OIS therefore generally reports better performance than ODS.
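For reference, the F-measures reported above are just the harmonic mean of precision and recall, shown here as a small Python check:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F in F(P, R) above)."""
    return 2 * precision * recall / (precision + recall)

# the ODS and OIS numbers above round to these values
print(round(f_measure(0.69, 0.57), 2))  # 0.62
print(round(f_measure(0.67, 0.62), 2))  # 0.64
```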