Computer Vision, Project 2: Boundary Detection
Bryce Richards
Boundary detection is the task of determining where one object ends and another begins. Classical boundary detection algorithms use intensity discontinuities to find edges. These methods are generally effective, but they can be fooled by intensity variation within a single object, such as streaks in someone's hair. The pb ("probability of boundary") method, developed within the last decade, partially overcomes these problems by using texture and color gradients in addition to intensity. Essentially, pb suppresses regions of uniform texture and/or color from registering as edges. In this project we implement a simplified version of pb edge detection that combines texture and intensity gradients with the classical Canny and Sobel detectors. Our "pb-lite" edge detector outperforms Canny and Sobel but falls short of the more sophisticated full pb algorithm.
Filter Bank: Our filter bank consists of derivative-of-Gaussian filters at 12 different orientations and two different variances. Qualitatively, these are oriented filters that respond strongly at pixels lying on an edge of a similar orientation. In addition, we included two center-surround filters, each the difference of two Gaussians of differing variance. The filter bank we used is displayed below.
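As an illustration, the following Python/NumPy sketch builds a filter bank of this shape: derivative-of-Gaussian filters at 12 orientations and two scales, plus two center-surround (difference-of-Gaussian) filters. The kernel size, the particular sigma values, and the elongation of the oriented Gaussian are assumptions made for the sketch, not the exact parameters of our implementation.

    import numpy as np

    def gaussian_2d(size, sigma):
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))

    def oriented_dog(size, sigma, theta):
        # x-derivative of an elongated Gaussian, rotated to angle theta.
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        xr = xx * np.cos(theta) + yy * np.sin(theta)
        yr = -xx * np.sin(theta) + yy * np.cos(theta)
        g = np.exp(-(xr**2 / (2.0 * sigma**2) + yr**2 / (2.0 * (3.0 * sigma)**2)))
        return (-xr / sigma**2) * g

    def center_surround(size, sigma):
        # Difference of two Gaussians of differing variance.
        return gaussian_2d(size, sigma) - gaussian_2d(size, 2.0 * sigma)

    filters = []
    for sigma in (1.0, 2.0):                                       # two variances (assumed values)
        for theta in np.linspace(0.0, np.pi, 12, endpoint=False):  # 12 orientations
            filters.append(oriented_dog(25, sigma, theta))
    filters += [center_surround(25, 1.0), center_surround(25, 2.0)]
    # 24 oriented filters + 2 center-surround filters = 26 filters total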
Mask Collection: Another preliminary step is to create a collection of "masks." These masks are later applied at each pixel to count how many "textons" (pixels assigned a particular texture ID by the k-means clustering) lie on either side of that pixel. Thus, the masks come in pairs: one that counts pixels to the left (at some orientation) of the pixel, and one that counts pixels to the right. We used semicircular and triangular mask shapes. The mask collection is displayed below. Although it is not apparent due to resizing, the masks come in three different radii: 5, 10, and 20 pixels.
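A minimal sketch of the semicircular ("half-disc") mask pairs at the three radii above follows; the triangular masks are omitted here, and the eight orientations are an assumption for the sketch rather than the exact set we used.

    import numpy as np

    def half_disc_pair(radius, theta):
        # Binary disc split through its center at angle theta; the two halves
        # form the "left" and "right" masks of a pair.
        ax = np.arange(2 * radius + 1) - radius
        xx, yy = np.meshgrid(ax, ax)
        disc = xx**2 + yy**2 <= radius**2
        side = (xx * np.cos(theta) + yy * np.sin(theta)) > 0
        return (disc & side).astype(float), (disc & ~side).astype(float)

    mask_pairs = []
    for radius in (5, 10, 20):                                     # radii from the report
        for theta in np.linspace(0.0, np.pi, 8, endpoint=False):   # orientations (assumed)
            mask_pairs.append(half_disc_pair(radius, theta))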
Texton Generation: We apply all 26 filters from the bank to the image. Each pixel then has 26 filter responses, so we associate each pixel with a 26-dimensional vector of doubles. We apply k-means clustering (with k set to 32) to these pixel vectors, and use the results of the clustering to assign each pixel a texture ID from 1 to 32.
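A sketch of this step, assuming a grayscale image stored as a 2-D float array named image and the filters list from the filter-bank sketch; the k-means here comes from scikit-learn for illustration, not necessarily the implementation we used.

    import numpy as np
    from scipy.ndimage import convolve
    from sklearn.cluster import KMeans

    def texton_map(image, filters, k=32):
        h, w = image.shape
        # One response per filter per pixel: an h x w x 26 stack.
        responses = np.stack([convolve(image, f) for f in filters], axis=-1)
        vectors = responses.reshape(-1, len(filters))   # one 26-dimensional vector per pixel
        labels = KMeans(n_clusters=k, n_init=4).fit_predict(vectors)
        return labels.reshape(h, w) + 1                 # texture IDs 1..k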
Texture Gradient: For each texture ID, say t, we apply the masks to the image to count how many pixels of ID t lie on each side of every pixel, for every mask shape, orientation, and radius. Collecting these counts over all texture IDs gives two histograms at each pixel: one of texton counts to the left of the pixel and one of texton counts to the right. We then take the chi-squared distance between the left and right histograms, which gives us a measure of how much the texture is changing at every pixel.
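This computation can be sketched as follows, using the texton map and mask pairs from the sketches above: for each texture ID, the left and right counts are obtained by convolving a binary indicator image with the two masks of a pair, and the chi-squared distance accumulates over IDs. Combining the responses from different mask pairs by a per-pixel maximum is an assumption for the sketch, not a description of our exact implementation.

    import numpy as np
    from scipy.ndimage import convolve

    def texture_gradient(texton_ids, mask_pairs, k=32, eps=1e-10):
        tg = np.zeros(texton_ids.shape)
        for left, right in mask_pairs:
            chi_sq = np.zeros(texton_ids.shape)
            for t in range(1, k + 1):
                indicator = (texton_ids == t).astype(float)
                g = convolve(indicator, left)    # count of t-pixels on one side
                h = convolve(indicator, right)   # count of t-pixels on the other side
                chi_sq += (g - h) ** 2 / (g + h + eps)
            # Combine across mask pairs by keeping the strongest response (assumed).
            tg = np.maximum(tg, 0.5 * chi_sq)
        return tg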
Final Edge Map: We create a similar intensity gradient, which measures how much the intensity of the image is changing at each pixel. We then take a weighted average of the texture and intensity gradients and multiply it pixel-by-pixel with the Canny edge detector's edge map. Finally, we scale the resulting image to have pixel values between 0 and 1. This gives us our final edge map, which we interpret as the probability of each pixel lying on an edge: a value near 1 means our pb-lite algorithm is confident the pixel lies on an edge, while a value near 0 means it is confident the pixel does not.
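A sketch of this final combination, using OpenCV's Canny detector; the weight w and the Canny thresholds here are placeholder values for illustration, not the ones we actually used.

    import numpy as np
    import cv2

    def pb_lite(texture_grad, intensity_grad, gray_uint8, w=0.5):
        canny = cv2.Canny(gray_uint8, 100, 200).astype(float) / 255.0
        pb = (w * texture_grad + (1.0 - w) * intensity_grad) * canny
        pb -= pb.min()
        if pb.max() > 0:
            pb /= pb.max()      # rescale to [0, 1]: per-pixel boundary probability
        return pb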