pb-Lite Boundary Detection
Kyle Cackett
The basic pb-Lite boundary detection algorithm combines per-pixel texture and brightness gradients with a baseline edge detector such as Canny or Sobel, and outperforms those traditional methods on their own. To coax texture information from an image, pb-Lite convolves the image with a bank of filters designed to respond differently to different textures. The response at each pixel is recorded, creating a vector of responses that describes the texture in the vicinity of that pixel. These high-dimensional vectors are then distilled to a single number known as a texton ID using KMeans clustering (the texton ID is simply the cluster ID). Gradients for the desired property (texture or brightness) are calculated by computing the chi-square distance between histograms in complementary half-disk regions of the image. Results are evaluated on the BSDS500 dataset, which compares human-annotated images to the output of boundary detection algorithms. The results are plotted as precision vs. recall curves, and an F-score is reported to measure the success of each method. My basic pb-Lite achieved an F-score of 0.62 at the optimal dataset scale (ODS) and 0.61 at the optimal image scale (OIS); this outperforms the Canny baseline, which achieved an ODS F-score of 0.58 and an OIS F-score of 0.59. Implementation details are discussed below.
My basic filter bank consists of a set of oriented derivatives of Gaussians at two different scales. Each filter is computed by first creating a Gaussian at the desired scale (size equal to 6*sigma), convolving it with a Sobel filter to obtain the first derivative, and finally rotating it into the desired orientations. A figure showing my basic filter bank (16 orientations, 2 scales) is included below.
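The construction described above can be sketched in Python with NumPy and SciPy (my implementation was written in Matlab, so function names and the exact Sobel kernel here are illustrative):

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def gaussian_2d(sigma):
    # Square Gaussian kernel with side length about 6*sigma, as in the text.
    size = int(6 * sigma) | 1  # force an odd size so the kernel has a center
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def dog_filter_bank(scales=(1.0, 2.0), n_orient=16):
    # Derivative of Gaussian = Gaussian convolved with a Sobel filter,
    # then rotated into each desired orientation.
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    bank = []
    for sigma in scales:
        base = convolve(gaussian_2d(sigma), sobel_x)
        for k in range(n_orient):
            angle = 180.0 * k / n_orient
            bank.append(rotate(base, angle, reshape=False, order=1))
    return bank
```

With the defaults above (2 scales, 16 orientations) the bank contains the 32 filters shown in the figure.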
Half-disk masks were created at various radii by filling an array twice the size of the desired radius with ones at all points whose distance from the origin was less than that radius. This array was then multiplied with an array containing ones in quadrants I and II and zeros in quadrants III and IV, and rotated to the desired orientation. A figure of my basic set of oriented half-disk masks (3 scales, 8 orientations) is included below.
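A minimal Python sketch of this mask construction (radii and orientation counts are placeholders; the original Matlab code may differ in detail). Each mask is paired with its point reflection, which serves as the complementary half disk:

```python
import numpy as np
from scipy.ndimage import rotate

def half_disk_masks(radii=(5, 10, 15), n_orient=8):
    masks = []
    for r in radii:
        size = 2 * r + 1
        ax = np.arange(size) - r
        xx, yy = np.meshgrid(ax, ax)
        disk = (xx**2 + yy**2) <= r**2
        # Keep only the upper half (quadrants I and II), then rotate.
        base = (disk & (yy < 0)).astype(float)
        for k in range(n_orient):
            angle = 180.0 * k / n_orient
            m = rotate(base, angle, reshape=False, order=0) > 0.5
            # 180-degree point reflection gives the complementary half disk.
            masks.append((m, np.flipud(np.fliplr(m))))
    return masks
```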
To generate texton IDs, the image is first convolved with each filter in the filter bank. This builds a vector of filter responses at each pixel, assigning each pixel a coordinate in a multidimensional space (the number of dimensions equals the number of filters in the bank). The hope is that different textures will fall in significantly different regions of this high-dimensional space while similar textures fall close together. This is a reasonable assumption, since identical textures respond identically to the filter bank. Once the high-dimensional representation of the image is constructed, we use KMeans clustering to group pixels within the same region of the high-dimensional texture space into clusters. We then use the cluster ID to represent the texture at a particular pixel (thereby reducing the high-dimensional texture space to a single dimension). After convolving the image with the filter bank above, I grouped the responses into 32 clusters. A graphical representation of the resulting texture map is included below.
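The texton-map step can be sketched as follows, assuming a grayscale image and a filter bank like the one above (scikit-learn's KMeans stands in for whatever clustering routine the original Matlab code used):

```python
import numpy as np
from scipy.ndimage import convolve
from sklearn.cluster import KMeans

def texton_map(image, filter_bank, n_textons=32):
    # One response image per filter; stacking gives each pixel a
    # coordinate in a space with one dimension per filter.
    responses = np.stack([convolve(image, f) for f in filter_bank], axis=-1)
    h, w, d = responses.shape
    # Cluster the per-pixel response vectors; the cluster ID is the texton ID.
    ids = KMeans(n_clusters=n_textons, n_init=4, random_state=0).fit_predict(
        responses.reshape(-1, d))
    return ids.reshape(h, w)
```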
To evaluate the texture and brightness gradient at each pixel, we compare the feature in complementary half-disk masks. The comparison is done by grouping the feature into bins in each half disk and computing the chi-square measure between the resulting histograms. The chi-square measure is given by

χ²(g, h) = ½ Σᵢ (gᵢ − hᵢ)² / (gᵢ + hᵢ)

where g and h are the histograms in the two half-disk regions.
To speed up the computation, each bin is computed separately, which allows the half-disk sections of interest to be extracted via a filtering operation (implemented in C) instead of a loop over pixels in Matlab.
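The per-bin trick works because counting how many pixels of a given bin fall inside a half disk is itself a convolution. A Python sketch of the gradient computation under that scheme (a small epsilon guards against division by zero; mask pairs are complementary half disks as above):

```python
import numpy as np
from scipy.ndimage import convolve

def chi_square_gradient(feature_map, mask_pairs, n_bins):
    """feature_map holds integer bin IDs (texton IDs, or quantized brightness).
    mask_pairs is a list of (left, right) complementary half-disk masks."""
    grads = []
    for left, right in mask_pairs:
        chi = np.zeros(feature_map.shape, float)
        for b in range(n_bins):
            tmp = (feature_map == b).astype(float)
            # Convolving the bin-membership image with each half-disk mask
            # counts that bin's pixels in the half disk around every pixel,
            # so no explicit per-pixel loop is needed.
            g = convolve(tmp, left.astype(float))
            h = convolve(tmp, right.astype(float))
            chi += 0.5 * (g - h) ** 2 / (g + h + 1e-10)
        grads.append(chi)
    return np.stack(grads, axis=-1)  # one gradient map per mask pair
```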
The pb-Lite algorithm assigns a confidence value to each pixel corresponding to how “sure” it is that the pixel in question lies on an edge. The confidence value is computed by combining the texture and brightness features with the baseline method. To compute my confidence value, I first took the max of the brightness and texture gradient vectors at each pixel, reasoning that the max corresponds to the half-disk orientation and size that lies along an edge. I then normalized the brightness and texture measures and computed their mean so that they would be weighted equally. Finally, I multiplied the result with the Canny baseline and renormalized the result. A sample output of my pb-Lite implementation is shown below:
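The combination step described above can be sketched as (a min-max normalization is assumed here; the original Matlab code may normalize differently):

```python
import numpy as np

def pb_lite(tex_grad, bright_grad, canny_baseline):
    """tex_grad, bright_grad: H x W x K gradient stacks (one slice per
    half-disk mask pair); canny_baseline: H x W edge strengths."""
    def norm(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-10)
    # Max over half-disk orientations and scales, then normalize so the
    # texture and brightness cues are weighted equally in the mean.
    t = norm(tex_grad.max(axis=-1))
    b = norm(bright_grad.max(axis=-1))
    # Gate the averaged cue with the baseline detector and renormalize.
    return norm(0.5 * (t + b) * canny_baseline)
```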
Results are evaluated on the BSDS500 dataset by plotting a precision-recall curve for each boundary detection method. The precision-recall curve plots the accuracy of detected edges (as compared with ground-truth edges from the human-annotated set) as a function of the confidence threshold. As the confidence threshold decreases (recall goes up), we expect more erroneous edges, so precision drops. Good curves lie further toward the upper right of the graph. Looking at the graph below, my basic pb-Lite implementation significantly outperforms the baseline Canny method: it achieved an F-score of 0.62 at the optimal dataset scale (ODS) and 0.61 at the optimal image scale (OIS), versus an ODS F-score of 0.58 and an OIS F-score of 0.59 for the Canny baseline.
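The F-score summarizing each curve is the harmonic mean of precision and recall, which is small whenever either quantity is small:

```python
def f_score(precision, recall):
    # Harmonic mean of precision and recall; zero if either cue is absent.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```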