"A diamond with a flaw is worth more than a pebble without imperfections."
-Chinese Proverb
In this assignment, we tackle the common problem of boundary detection in an image. There are many well-known, naive approaches that simply measure a pixel's difference from its neighbors and threshold the result, but these methods generate many false positives, especially in textured regions. We try to avoid this "brick wall" problem by using more sophisticated methods.
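To make the failure mode concrete, here is a minimal sketch of the naive approach: mark any pixel whose brightness differs from a neighbor by more than a threshold. The threshold value and the tiny example arrays are illustrative choices, not part of the assignment.

```python
import numpy as np

def naive_boundaries(img, thresh=0.1):
    """Mark pixels whose brightness differs from a neighbor by more than thresh.

    img: 2-D float array in [0, 1]. Returns a boolean boundary map.
    The 0.1 threshold is an arbitrary illustrative choice.
    """
    dy = np.abs(np.diff(img, axis=0, prepend=img[:1]))     # difference from pixel above
    dx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))  # difference from pixel to the left
    return (dx > thresh) | (dy > thresh)

# A single step edge fires once per row, but a striped texture
# fires on every stripe -- the "brick wall" problem:
step = np.zeros((4, 6)); step[:, 3:] = 1.0
stripes = np.tile([0.0, 1.0], (4, 3))
print(naive_boundaries(step).sum(), naive_boundaries(stripes).sum())
```

The striped image produces five times as many "boundary" pixels as the true step edge, even though a human would call the stripes a single textured region.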
The first problem to solve is classifying texture. By definition, textures have repeated elements, and we use this to our advantage. We first generate a set of filters. Each is a derivative operator convolved with a Gaussian, which measures the brightness change in a particular direction over a section of the image. In order to capture sections of different sizes and different directions of change, we use several different radii for the Gaussian function and several different orientations for the derivative direction. The filter bank that I used (with two radii and 16 orientations) is shown below.
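A filter bank like the one described can be sketched as follows: convolve a Gaussian with a horizontal derivative operator, then rotate the result. The two sigmas and 16 orientations match the write-up; the kernel size and the use of a Sobel operator as the derivative are assumed choices for illustration.

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def gaussian_2d(size, sigma):
    """Normalized isotropic 2-D Gaussian on a size x size grid."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def dog_filter_bank(sigmas=(1.0, 2.0), n_orients=16, size=15):
    """Oriented derivative-of-Gaussian filters.

    Two radii and 16 orientations follow the write-up; the 15x15
    kernel size is an assumption.
    """
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    bank = []
    for sigma in sigmas:
        # Derivative operator convolved with a Gaussian
        base = convolve(gaussian_2d(size, sigma), sobel_x)
        for i in range(n_orients):
            angle = 180.0 * i / n_orients
            # reshape=False keeps every filter the same size
            bank.append(rotate(base, angle, reshape=False))
    return bank
```

With the defaults this yields 2 × 16 = 32 filters, one per (scale, orientation) pair.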
Areas with similar texture should respond similarly when convolved with each filter. For example, an area with many vertical lines will respond strongly to the horizontal derivative filters only, while an area with a spotted texture will have a roughly uniform response across all orientations. We convolve the input image with each filter in the bank to get a filter response vector for each pixel in the image. We then run a k-means algorithm (I used k = 64) to group these filter response vectors into discrete "textons." An example texton map is shown below.
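The clustering step can be sketched as below: stack the per-pixel filter responses into a matrix and run a few iterations of Lloyd's algorithm. This is a minimal hand-rolled k-means for illustration (a real implementation might use scikit-learn's `KMeans`); k = 64 matches the write-up, while the iteration count and seeding are assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def texton_map(img, bank, k=64, n_iter=10, seed=0):
    """Assign each pixel a texton id by k-means over its filter responses.

    img: 2-D float image; bank: list of 2-D filter kernels.
    Returns an integer label image of the same shape as img.
    """
    # N x F matrix: one filter-response vector per pixel
    resp = np.stack([convolve(img, f).ravel() for f in bank], axis=1)
    rng = np.random.default_rng(seed)
    centers = resp[rng.choice(len(resp), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance of every pixel's response to every center
        d = ((resp[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            pts = resp[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)   # move center to cluster mean
    return labels.reshape(img.shape)
```

The returned label image is the texton map: pixels with similar filter-response vectors share a texton id.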
We next compute oriented gradients of textons and binned pixel brightness values at each pixel. To do this efficiently, we define half-disc masks that capture only the pixels on one side of an oriented boundary. We take two opposing half discs centered at the same pixel and look at the difference in their texton and brightness responses. This gives us a decent measure of the probability that this pixel is a boundary of a certain orientation. For each pixel, we average these gradients for both brightness and texton images and factor in a Canny baseline to get our final boundary probability. Below are the half discs of different scales and orientations that I used.
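The half-disc comparison can be sketched as follows: build a pair of opposing half-disc masks, then, for each label bin, convolve an indicator image with each mask to count that label on each side of the pixel, and accumulate a chi-square distance between the two histograms. The chi-square distance and the convolution-per-bin trick are standard for this construction, but the exact radii, orientations, and epsilon are assumed choices here.

```python
import numpy as np
from scipy.ndimage import convolve

def half_disc_pair(radius, angle):
    """Boolean masks for the two halves of a disc split by a line at `angle`."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    disc = xx**2 + yy**2 <= radius**2
    proj = np.cos(angle) * xx + np.sin(angle) * yy  # signed side of the split
    return disc & (proj > 0), disc & (proj < 0)

def chi_square_gradient(label_map, n_bins, radius, angle, eps=1e-10):
    """Chi-square distance between label histograms of opposing half discs.

    label_map holds integer bin ids (texton ids, or binned brightness).
    Returns a per-pixel gradient strength for this scale and orientation.
    """
    left, right = half_disc_pair(radius, angle)
    chi = np.zeros(label_map.shape, float)
    for b in range(n_bins):
        ind = (label_map == b).astype(float)
        g = convolve(ind, left.astype(float))   # count of bin b on one side
        h = convolve(ind, right.astype(float))  # count of bin b on the other
        chi += 0.5 * (g - h) ** 2 / (g + h + eps)
    return chi
```

On a label map with a vertical boundary, this gradient peaks at the boundary and is near zero in the uniform interiors.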
Here are example brightness and texton gradient images for the penguin example above.
Here are the results for all 10 test images. The order is: original image, Sobel baseline, Canny baseline, my pb-lite implementation.
Here are the results of my boundary detection as compared to the Canny and Sobel baselines, the state-of-the-art algorithm (gPb), and a human baseline.