CS143
Computer Vision
Project 2 - pb-lite

Scott Newman (senewman)
October 10, 2011

Project Overview

This project was an introduction to image edge detection. The goal was to improve the quality of classic, easy edge detectors (such as Sobel or Canny) by taking into account texture and brightness information in the image to give a better description of edges, as using more traditional methods (e.g., intensity gradients) can give false positives and fail to recognize certain types of edges.

Algorithm

The pb-lite algorithm works by giving a 'score' to each pixel in an image, which abstractly represents the probability of it being a boundary or edge.

The first step of the pb-lite algorithm is the creation of a filter bank. A filter bank is a series of filters, oriented and scaled in different ways. The filter used is the first derivative of the Gaussian, taken by convolving the Gaussian with the Sobel filter. These filters will be used, in a sense, to examine the "rate of change" of texture. The utility of using different orientations and scale factors is that it allows for examining how texture changes at different angles and at varying scales (for example, within a texture, there might not be too much variation, but using a bigger filter might better represent change between one texture and another). I used 6*sigma for my Gaussian kernel size, and scaled the Sobel filter by sigma.
(example filter bank)

Once the filter bank is created, the next step is to create a texton map from the original image. The input image is filtered with all of the images in the filter bank, giving a series of texture *responses* at each pixel. These vectors of responses are then clustered using k-means into k bins (here, k = 64), assigning a unique "texture ID" to each pixel. In short, this step groups the image into a series of textures.
(texton map of a penguin)

The third step is to compute the gradients of the texton image described above, as well as a brightness (intensity) channel image from the original image. This is done by creating a histogram at each pixel of the corresponding information that surrounds it (e.g., number of textons belonging to texture ID 1, texture ID 2, etc.). Instead of calculating this by hand at each pixel, a more efficient method is to create a series of half-disks and use them as filters over the image. These half-disks are comprised of intensity values 1 or 0 only, so if a pixel is filtered with a half disk, the result will be the number of surrounding pixels within some region with some particular value, which is exactly the histogram we desire.

A common metric for computing this is the chi-square distance metric. It is defined as follows:

Where g and h represent complementary half-disks (that is, one is the other rotated 180 degrees, and summed together, they give a full disk). The following is an example of the half-disks used to compute the chi-square metric:

The final result is to aggregate this information and use it to increase the quality of an already edge-detected image. The result of computing texture and brightness gradients using half-masks as described above gives a vector of responses at each pixel. So, we would expect an image edge-detected using Sobel/Canny filtering to correlate with the pb results. The actual pb value at each pixel is computed by multiplying per-pixel the edge value in the original edge detected image by the mean of the texture gradient response vector and the brightness gradient response vector. The resulting image is normalized, and is the final pb-lite image.

Results

My pb-lite implementation outperformed the baseline Sobel/Canny filters. Comparing side-by-side, pb-lite images look much better than Sobel or Canny edge-detected images alone. For very small recall, Canny was slightly more precise, but decreases more quickly than pb-lite as recall increases. The following evaluation was performed on the ten test images:

I used 6*sigma for the Gaussian kernel when computing the filter bank. In addition, I didn't actually use the Sobel baseline at all. I used the Canny edge-detected image as a base and then calculated pb-lite on top of that. I also sped up the gradient computation by pre-computing all the bins for the histograms and then performing the half-disk filtering. This allowed me to perform computations in O(o*s*k) instead of O(o*s*k*m*n), where o = num orientations, s = num sizes, k = num bins, m = image width, n = image height.

The Sobel baseline gave an F-score of 0.45.
The Canny baseline gave an F-score of 0.58.
My pb-lite algorithm gave an F-score of 0.6, which beats the two baseline methods.

The following are the original image, the Sobel image, the Canny image, and the pb-lite image (in that order) for the ten test images:

CS143 Computer Vision Project 2 - pb-lite

Project Overview

Algorithm

Results

CS143
Computer Vision
Project 2 - pb-lite