Name: Chen Xu
login: chenx
In the basic part, I follow the given pipeline and use multiscale features: brightness, texture, and the output of a Canny edge detector. To obtain the texture map, I create a filter bank of odd-symmetric Gaussian derivatives by convolving a Gaussian kernel with a Sobel kernel and rotating the result to different angles; this is repeated for several scales of the Gaussian kernel. Half-disc mask pairs are created for different orientations and radii. Convolving the grayscale image with the filter bank yields a feature vector for each pixel, and k-means clusters these vectors to form the texton map (tmap). The tmap and the brightness image are then compared against the mask pairs using the chi-square distance. Finally, I multiply the simple mean of these features by the Canny output.
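The filter-bank construction described above can be sketched in Python (an illustrative reimplementation, not the code actually used; the function names and the NumPy/SciPy tooling are my own assumptions):

```python
import numpy as np
from scipy import ndimage, signal

def gaussian_kernel(sigma, size):
    # 2-D Gaussian, normalized to sum to 1.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def oriented_odd_filter(sigma, angle_deg, size):
    # Odd-symmetric Gaussian derivative: convolve the Gaussian with a
    # Sobel kernel, then rotate it to the desired orientation.
    sobel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    base = signal.convolve2d(gaussian_kernel(sigma, size), sobel, mode='same')
    return ndimage.rotate(base, angle_deg, reshape=False, order=1)

# 16 orientations x 2 scales, matching the dimensions given in this report.
filter_bank = [[oriented_odd_filter(s, 180.0 * k / 16, int(6 * s + 1))
                for s in (1.0, 2.0)] for k in range(16)]
```

Each filter integrates to roughly zero (the Sobel kernel is antisymmetric), so filtering a constant region gives no response, which is the behavior we want from an edge-sensitive filter.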
Specifically, the filter bank is a 16-by-2 cell matrix, corresponding to 16 orientations and 2 scales, and the masks form an 8-by-3-by-2 cell matrix, corresponding to 8 orientations in [0, 180] and 3 radii [5 10 20]. The chi-square distance is computed with a single for loop, as suggested on the project webpage. The Gaussian kernel size is chosen as 6 * sigma + 1, so the values at the kernel edges are close to zero. The results show that pb-lite beats the other two baselines, Canny and Sobel. Fig. 1 compares the Canny baseline with pb-lite, and Fig. 2 compares pb-lite before and after improvement.
![]()
![]()
![]()
Fig. 1 From left to right: original image, baseline (Canny) output, pb-lite output.
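The chi-square gradient computation with the half-disc mask pairs can be sketched as follows (again an illustrative Python reimplementation with hypothetical names, not the actual submission; the single loop runs over histogram bins, with the per-bin counts obtained by filtering):

```python
import numpy as np
from scipy import signal

def half_disc_masks(radius, angle_deg):
    # A pair of binary half-disc masks at the given radius and orientation.
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    disc = xx**2 + yy**2 <= radius**2
    theta = np.deg2rad(angle_deg)
    side = (np.cos(theta) * xx + np.sin(theta) * yy) > 0
    return (disc & side).astype(float), (disc & ~side).astype(float)

def chi_square_gradient(label_map, n_bins, left, right, eps=1e-10):
    # chi2(x,y) = 0.5 * sum_i (g_i - h_i)^2 / (g_i + h_i), where g_i and h_i
    # count occurrences of bin i inside each half disc. The counts are
    # obtained by convolving the per-bin indicator image with the two masks.
    chi2 = np.zeros(label_map.shape)
    for i in range(n_bins):
        ind = (label_map == i).astype(float)
        g = signal.convolve2d(ind, left, mode='same')
        h = signal.convolve2d(ind, right, mode='same')
        chi2 += (g - h) ** 2 / (g + h + eps)
    return 0.5 * chi2
```

On a label map with a vertical boundary, the gradient computed with vertically split half discs peaks at the boundary, since the two half-disc histograms differ most there.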
![]()
![]()
Fig. 2 Left: basic pb-lite, right: pb-lite after improvement.
Many things can be done to improve on the basic result. To earn extra credit, I improved the algorithm with the following steps:
In the basic part, I use 6 * sigma + 1 as the kernel size, which makes the values at the kernel edges nearly zero. But when I reduce the size to 3 * sigma + 1, performance improves, especially when recall is less than 0.1 (Fig. 3). So I decided to use a kernel size of 3 * sigma + 1.
![]()
![]()
Fig. 3 Left: kernel size = 3 * sigma + 1, right: kernel size = 6 * sigma + 1.
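The effect of the two truncation choices can be illustrated numerically (a small sketch, not code from the project): a kernel of size 6 * sigma + 1 cuts the Gaussian off where it has essentially decayed, while 3 * sigma + 1 truncates it where the tail is still substantial.

```python
import numpy as np

def gaussian_1d(sigma, size):
    # 1-D Gaussian, normalized to sum to 1.
    ax = np.arange(size) - size // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    return g / g.sum()

sigma = 2.0
wide = gaussian_1d(sigma, int(6 * sigma + 1))    # 13 taps
narrow = gaussian_1d(sigma, int(3 * sigma + 1))  # 7 taps

# Edge-to-peak ratios: exp(-4.5) ~ 0.011 for the wide kernel versus
# exp(-1.125) ~ 0.32 for the narrow one, so the narrow kernel is a
# noticeably truncated Gaussian.
edge_ratio_wide = wide[0] / wide.max()
edge_ratio_narrow = narrow[0] / narrow.max()
```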
In the basic part, I simply average all the features for each pixel, which is a linear combination of the features with a constant weight of 1/n. To improve this, I average the features within each orientation and then take the maximum over orientations at each pixel, using the equation mPb(x,y) = max_theta mPb(x,y,theta). This improves the performance considerably (Fig. 4), and now the smaller kernel size gives better performance at lower recalls.
![]()
![]()
Fig. 4 Taking the maximum of the per-orientation means improves the performance.
Left: kernel size = 3 * sigma + 1, right: kernel size = 6 * sigma + 1.
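The two combination rules can be written side by side (a minimal sketch with an assumed array layout, not the code actually used):

```python
import numpy as np

def mpb_from_oriented_gradients(grads):
    # grads: array of shape (n_features, n_orientations, H, W) holding the
    # chi-square gradients for every feature channel and orientation.
    # Basic version: plain mean over all features and orientations.
    mean_all = grads.mean(axis=(0, 1))
    # Improved version: average over features within each orientation, then
    # take the maximum across orientations:
    # mPb(x,y) = max_theta mPb(x,y,theta).
    mpb_theta = grads.mean(axis=0)   # shape (n_orientations, H, W)
    mpb_max = mpb_theta.max(axis=0)  # shape (H, W)
    return mean_all, mpb_max
```

Since the plain mean is the average over orientations of the per-orientation means, the max-over-orientations output is pointwise at least as large, so weak responses at the dominant orientation are no longer diluted by the other orientations.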
In the first step, I enrich the filter bank: 8 odd-symmetric Gaussian derivatives, corresponding to 8 orientations in [0, 180]; 2 even-symmetric Gaussian derivatives at 2 orientations, obtained by convolving the Gaussian kernel with the Sobel filter twice; and one difference-of-Gaussian filter, Gaussian(u, sigma) - Gaussian(u, 0.25 * sigma). Fig. 5 shows the improved filter bank.
Fig. 5 Improved filter banks: 8 odd- and 2 even-symmetric Gaussian derivative filters and one Difference of Gaussian Filter.
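The difference-of-Gaussian filter from this step can be sketched as follows (an illustrative reimplementation following the formula in the text, Gaussian(sigma) - Gaussian(0.25 * sigma); function names are my own):

```python
import numpy as np

def gaussian_2d(sigma, size):
    # 2-D Gaussian, normalized to sum to 1.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def dog_filter(sigma, size):
    # Difference of Gaussians: wide Gaussian minus a Gaussian that is
    # four times narrower, as stated in the report.
    return gaussian_2d(sigma, size) - gaussian_2d(0.25 * sigma, size)

dog = dog_filter(2.0, 13)
```

Because both Gaussians are normalized, the filter sums to zero; the narrow Gaussian dominates at the center, so the filter is negative there and positive in a surrounding ring, giving a rotationally symmetric blob detector that complements the oriented derivatives.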
The performance improvement is substantial; at lower recalls, the performance is even better than gPb (Fig. 6).
![]()
Fig. 6 Richer filter bank: performance at lower recalls is better than gPb (kernel size = 6 * sigma + 1).
In the second step, I use an even richer filter bank, with 8 even-symmetric Gaussian derivative filters instead of 2 (Fig. 7). However, the performance drops (Fig. 8). So a richer filter bank does not always give better performance; the filter bank must be chosen carefully.
Fig. 7 Improved filter banks: 8 odd- and even-symmetric Gaussian derivative filters and one Difference of Gaussian Filter.
![]()
![]()
Fig. 8 Performance with the richer filter bank of Fig. 7.
Left: kernel size = 3 * sigma + 1, right: kernel size = 6 * sigma + 1.
I compare the performance of two color models: (1) RGB and (2) HSV. With RGB, the three channels (red, green, blue) are processed separately as three additional feature channels. With HSV, because the V channel represents brightness (the same as the grayscale image), only the H and S channels are used as additional feature channels.
![]()
![]()
![]()
Fig. 9 Left: HSV; middle: RGB (both with a kernel size of 6 * sigma + 1); right: RGB with the filter bank from "Richer filter bank, step 1".
The improvement from color is not as pronounced as that from the previous methods. Perhaps color information should not simply be used as separate channels; better ways should be investigated.
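Extracting the H and S channels used in the HSV variant can be sketched with a minimal NumPy-only conversion (illustrative only; the actual submission would more likely use a built-in rgb2hsv routine):

```python
import numpy as np

def rgb_to_hs(img):
    # Minimal RGB -> (H, S) conversion for an image in [0, 1]. V is dropped
    # because it duplicates the brightness channel already in the pipeline.
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = img.max(axis=-1)                 # value = max channel
    c = v - img.min(axis=-1)             # chroma
    s = np.where(v > 0, c / np.maximum(v, 1e-12), 0.0)
    # Hue via the standard piecewise definition over the max channel,
    # scaled to [0, 1).
    safe_c = np.maximum(c, 1e-12)
    h = np.zeros_like(v)
    h = np.where(v == r, ((g - b) / safe_c) % 6, h)
    h = np.where(v == g, (b - r) / safe_c + 2, h)
    h = np.where(v == b, (r - g) / safe_c + 4, h)
    h = np.where(c == 0, 0.0, h / 6.0)   # hue undefined for gray; use 0
    return h, s
```

The resulting H and S maps would then be treated exactly like the brightness channel: half-disc chi-square gradients are computed on each and added to the feature set.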