pb-lite Boundary Detection

This project centers on finding boundaries in an input image. At a high level, we detect boundaries and compare them to the human annotations provided by the Berkeley Segmentation Data Set 500 (BSDS500) to assess how well we found the salient boundaries.

This project uses the Canny edge detector as a baseline and builds on its results by attempting to capture and quantify texture. Edge detectors such as Canny often produce false positives that are really just variation in texture or shadow. By quantifying texture and examining its gradient across pixels, we can estimate how likely a pixel is to be a boundary based on texture: if the textures on either side of the pixel are very similar, a boundary is unlikely, whereas if they are very different, a boundary is likely. This allows us to suppress false positives found by the Canny edge detector. More specifically, my pb-lite implementation does this for both intensity and texture.

For the texture portion of my boundary detection, I start by building a filter bank of Gaussian filters convolved with Sobel filters at different scales and orientations. I then convolve each of these filters with the input image to arrive at an m*n*(o*s) matrix, where m and n are the dimensions of the image and o and s are the numbers of orientations and scales in the filter bank (I used 16 orientations and 3 scales). This gives a feature vector of length o*s at each pixel, encoding its response to all of the different filters. In my implementation of pb-lite, I decided to introduce color into the texture quantification as well. To do this, I converted the image to the L*a*b colorspace and appended its "a" and "b" channels as two more elements of each pixel's feature vector. I then use k-means clustering to cluster these feature vectors into 16 distinct "texton ids" (so that pixels with similar filter responses share a texton id).

The next step involves a set of half-disk mask pairs. I used 8 different orientations at 3 different scales, where each half-disk has an exact mirror (i.e., rotated 180 degrees). The texton map is then filtered with each of these mask pairs, building a histogram per mask that counts how often each texton id appears in the half-disk centered on each pixel. We then use the chi-square distance to measure the difference between the histogram produced by the left-hand mask and the one produced by the right-hand mask. The larger the difference between the histograms of a mask pair, the higher the likelihood that there is a boundary at the given pixel.

For the intensity portion of my boundary detector, the implementation was essentially the same as for texture. The primary difference is that k-means clustering was unnecessary: instead, I simply binned the intensities into 32 discrete levels and proceeded with the gradient calculation.
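The binning step amounts to quantizing grayscale values into 32 ids, so the same half-disk chi-square machinery used for textons applies unchanged. A minimal sketch (the function name is mine):

```python
import numpy as np

def intensity_map(gray, n_bins=32):
    """Quantize 8-bit grayscale intensities into n_bins discrete ids,
    analogous to texton ids but with no clustering required."""
    gray = np.asarray(gray, dtype=float)
    bins = np.floor(gray / 256.0 * n_bins).astype(int)
    return np.clip(bins, 0, n_bins - 1)
```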

After completing the above steps, I have an m*n*(o*s) matrix for both texture (including color) and intensity, where m and n are the dimensions of the input image and o and s are the numbers of orientations and scales of the half-disk mask pairs. The next step is combining these results with the Canny baseline. To do this, I take the average of the gradient vector at each pixel and use that as my measure of the likelihood of the pixel being a boundary. I then perform an elementwise multiplication of these values with the Canny baseline to arrive at my result.
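The combination step can be sketched as below. How the texture and intensity averages are weighted and whether the result is normalized are my assumptions; the source only specifies averaging per pixel and multiplying elementwise with the Canny baseline.

```python
import numpy as np

def combine_with_canny(texture_grad, intensity_grad, canny):
    """Combine the texture-gradient and intensity-gradient stacks, each
    m x n x (o*s), with an m x n Canny response. Averages the gradient
    channels per pixel, then gates the Canny baseline elementwise, so
    Canny edges in uniform-texture regions are suppressed."""
    mean_grad = 0.5 * (texture_grad.mean(axis=-1) + intensity_grad.mean(axis=-1))
    # Normalize to [0, 1] so the product stays a probability-like score
    # (normalization choice is an assumption, not from the source).
    mean_grad = mean_grad / (mean_grad.max() + 1e-10)
    return mean_grad * canny
```

Because the result is a product, a pixel survives only if both the Canny baseline and the texture/intensity gradients consider it boundary-like.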

My results were good, beating the Canny baseline (an F-measure ODS score of 0.61 versus 0.58 for Canny) when run on the 10-image test set. Because my method essentially suppresses false positives within the Canny detector, my precision-recall curve does not reach the right-hand side of the graph (i.e., it never achieves perfect recall).