Computer Vision, Project 2: Boundary Detection

Bryce Richards


Project Description

Boundary detection is the task of determining where one object ends and another begins. Classical boundary detection algorithms use intensity discontinuities to locate edges. These methods are generally effective, but can be fooled by intensity variation within a single object, such as streaks in someone's hair. The pb ("probability of boundary") method, developed within the last decade, partially overcomes these problems by using texture and color gradients in addition to intensity. Essentially, pb suppresses regions of uniform texture and/or color from registering as edges. In this project we implement a simplified version of pb edge detection that combines texture and intensity gradients with the classical Canny and Sobel methods. Our "pb-lite" edge detector outperforms Canny and Sobel but falls short of the more sophisticated full pb algorithm.


Algorithm Design

First, we run the Sobel and Canny edge detectors on the input images. These will be combined with the other steps of our algorithm to form pb-lite, and will also be used to assess whether pb-lite beats the classical algorithms. Next, we create a filter bank, which will be used to generate the texture gradient of the image. The filter bank is a collection of filters that are convolved with the image; if there are N filters, each pixel records N different responses, so we can associate each pixel with an N-element vector. We apply k-means clustering to these vectors and, based on the results of the clustering, assign each pixel a number from 1 to k (the index of the cluster that pixel ended up in). A pixel's number represents the "texture" of that pixel. We then use this texture map to build a texture gradient of the image, which records how much the texture is changing at each pixel. We create a similar intensity gradient, and combine these gradients with the results of Canny to produce our pb-lite edge map of the picture. These steps of the algorithm are described in more detail below.
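Our implementation is in MATLAB, but the Sobel baseline from the first step can be sketched in Python with NumPy/SciPy (the toy image and the normalization here are illustrative assumptions, not our actual code):

```python
import numpy as np
from scipy import ndimage

def sobel_edges(img):
    """Sobel gradient magnitude, scaled to [0, 1]."""
    gx = ndimage.sobel(img, axis=1)  # horizontal derivative
    gy = ndimage.sobel(img, axis=0)  # vertical derivative
    mag = np.hypot(gx, gy)
    return mag / mag.max() if mag.max() > 0 else mag

# Toy image: dark left half, bright right half -> one vertical edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
```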

Filter Bank: Our filter bank consists of derivative-of-Gaussian filters at 12 different orientations and two different variances. Qualitatively, these are "slanted" filters that respond strongly at pixels lying on an edge of similar orientation. In addition, we included two center-surround filters, which are differences of two Gaussians of differing variance. The filter bank we used is displayed below.



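Such a bank can be sketched in NumPy/SciPy as below (the 15-pixel filter size and the sigmas of 1 and 2 are assumptions; our MATLAB code may use different values):

```python
import numpy as np
from scipy import ndimage

def gaussian2d(size, sigma):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def dog_filter_bank(size=15, sigmas=(1.0, 2.0), n_orient=12):
    """Oriented derivative-of-Gaussian filters plus two center-surround filters."""
    bank = []
    for sigma in sigmas:
        base = np.gradient(gaussian2d(size, sigma), axis=1)  # x-derivative of a Gaussian
        for i in range(n_orient):
            angle = 180.0 * i / n_orient
            bank.append(ndimage.rotate(base, angle, reshape=False, order=1))
    # Center-surround: difference of two Gaussians with different variances.
    for sigma in sigmas:
        bank.append(gaussian2d(size, sigma) - gaussian2d(size, 2 * sigma))
    return bank

bank = dog_filter_bank()  # 12 orientations x 2 variances + 2 center-surround = 26
```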
Mask Collection: Another preliminary step is to create a collection of "masks." These masks are later applied at each pixel to count how many "textons" (pixels assigned a particular texture ID by the k-means clustering) lie on either side of that pixel. The masks therefore come in pairs: one counts pixels to the left (at some orientation) of the pixel, and one counts pixels to the right. We used semicircular and triangular mask shapes. The mask collection is displayed below. Although it is not apparent due to resizing, the masks come in three different radii: 5, 10, and 20 pixels.



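The semicircular half of the collection can be sketched as follows (the eight orientations are an assumption, and the triangular masks are built analogously):

```python
import numpy as np

def half_disc_masks(radius, n_orient=8):
    """Pairs of half-disc masks: the two halves of a disc split at several angles."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    disc = xx**2 + yy**2 <= radius**2
    pairs = []
    for i in range(n_orient):
        theta = np.pi * i / n_orient
        proj = xx * np.sin(theta) - yy * np.cos(theta)  # signed distance to the dividing line
        pairs.append((disc & (proj > 0), disc & (proj < 0)))
    return pairs

# Masks at the three radii used in the project.
masks = {r: half_disc_masks(r) for r in (5, 10, 20)}
```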
Texton Generation: We apply all 26 filters from the bank to the image. Each pixel then has 26 different responses, so we associate each pixel with a 26-element vector of doubles. We apply k-means clustering (with k set to 32) to these pixel vectors, and use the results of the clustering to assign each pixel a texture ID from 1 to 32.
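The clustering step can be sketched with SciPy's kmeans2; the two toy derivative filters in the demo stand in for the real 26-filter bank:

```python
import numpy as np
from scipy import ndimage
from scipy.cluster.vq import kmeans2

def texton_map(img, bank, k=32):
    """Cluster per-pixel filter-response vectors; return texture IDs 1..k."""
    responses = np.stack([ndimage.convolve(img, f) for f in bank], axis=-1)
    h, w, n = responses.shape
    _, labels = kmeans2(responses.reshape(-1, n), k, minit='points')
    return labels.reshape(h, w) + 1  # cluster indices are 0-based

# Demo on a small random image with two toy filters.
np.random.seed(0)
img = np.random.rand(8, 8)
bank = [np.array([[-1.0, 0.0, 1.0]]), np.array([[-1.0], [0.0], [1.0]])]
tmap = texton_map(img, bank, k=4)
```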

Texture Gradient: For each texture ID t, we apply the masks to the texton map to count, for every mask shape, scale, and orientation, how many pixels of ID t lie on each side of every pixel. Accumulating these counts over all texture IDs gives two histograms at each pixel: one for the pixels on its left and one for the pixels on its right. We then take the chi-squared distance between the left and right histograms, which measures how much the texture is changing at that pixel.
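A sketch of this computation, assuming the counts are obtained by convolving texton indicator images with the masks, and that responses across mask pairs are combined by taking the maximum (that combination rule is an assumption):

```python
import numpy as np
from scipy import ndimage

def texture_gradient(tmap, mask_pairs, k):
    """Chi-squared distance between left/right texton counts at every pixel."""
    tg = np.zeros(tmap.shape)
    for left, right in mask_pairs:
        chi2 = np.zeros(tmap.shape)
        for t in range(1, k + 1):
            ind = (tmap == t).astype(float)                 # indicator image for texton t
            g = ndimage.convolve(ind, left.astype(float))   # t-pixels on one side
            h = ndimage.convolve(ind, right.astype(float))  # t-pixels on the other
            chi2 += 0.5 * (g - h) ** 2 / (g + h + 1e-10)
        tg = np.maximum(tg, chi2)  # keep the strongest mask-pair response
    return tg

# Demo: two flat texture regions; the gradient should peak near their border.
ax = np.arange(-2, 3)
xx, yy = np.meshgrid(ax, ax)
disc = xx**2 + yy**2 <= 4
pair = (disc & (xx < 0), disc & (xx > 0))
tmap = np.ones((10, 10), int)
tmap[:, 5:] = 2
tg = texture_gradient(tmap, [pair], k=2)
```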

Final Edge Map: We create a similar intensity gradient, which measures how much the intensity of the image is changing at each pixel. We then take a weighted average of the texture and intensity gradients, multiply it pixel-by-pixel with the Canny edge detector's edge map, and scale the resulting image to pixel values between 0 and 1. This gives us our edge map, which we interpret as the probability of each pixel lying on an edge: a value near 1 means pb-lite is confident the pixel lies on an edge, while a value near 0 means it is confident the pixel does not.
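The final combination can be sketched as below (the equal 0.5 weighting is an assumption; the weights in our MATLAB code may differ):

```python
import numpy as np

def pb_lite(tg, ig, canny_map, alpha=0.5):
    """Weighted texture/intensity gradient, gated by Canny, rescaled to [0, 1]."""
    combined = (alpha * tg + (1.0 - alpha) * ig) * canny_map
    lo, hi = combined.min(), combined.max()
    return (combined - lo) / (hi - lo) if hi > lo else combined

# Demo with tiny synthetic gradients and a binary Canny map.
tg = np.array([[0.0, 2.0], [4.0, 6.0]])
ig = np.array([[0.0, 2.0], [4.0, 6.0]])
canny_map = np.array([[1.0, 0.0], [1.0, 1.0]])
pb = pb_lite(tg, ig, canny_map)
```

Gating by the Canny map means a pixel can only score highly if the classical detector also fired there, which is why pb-lite inherits Canny's localization while suppressing its texture false positives.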





Results
Below is a chart comparing the results of pb-lite with those of Canny, Sobel, and gPb (the state of the art). The solid lines were generated by our MATLAB testing, while the dotted lines are copied from a figure in a research paper. These results come from running the algorithm on the 10 test images provided in the assignment.





Below is an example of the edge maps of Sobel (top right), Canny (bottom left), and pb-lite (bottom right). Pb-lite smooths out many of the "edges" in the camel's hair and within the brick wall that were detected by Sobel and Canny.



Conclusion
Our pb-lite algorithm met the goal we set: it outperformed Canny and Sobel. However, it did not outperform them by as much as the pb-lite shown in the chart on the project homepage, and over one small segment of the curve (see the chart above), Canny actually outperformed pb-lite. The algorithm was also slower than expected, taking over five minutes per image. Parts of the code could surely have been written more efficiently; the likely cause of the inefficiency is my inexperience with MATLAB. I already feel much more comfortable with MATLAB than I did at the beginning of the class, and I expect my coding to become faster and my code more efficient as time goes on.