CSCI1430 : Project 2 - pbLite

rdfong

In this project we attempt to improve on the Canny edge detection method. We do so by using a subset of the pb (probability of boundary) method, which tries to determine the likelihood that an edge is actually the boundary of an object and not just part of a texture. In this basic implementation of pbLite we use intensity and texture gradients to determine the strength of an edge. The algorithm is described below.

Creating filters and masks

Given a set of orientations and sigma values, create a filter bank of orientations*sigmas filters. Each filter is created by convolving a Gaussian of a given sigma with a Sobel filter and then rotating the result to the desired orientation.
Next, given a set of orientations and radii, create a bank of orientations*radii half-disc masks. Each mask is generated by creating a half disc (of 1's) with a given radius and then rotating it to the desired orientation.
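The two banks can be sketched in plain Python as follows. This is only an illustration under stated assumptions: instead of convolving with a Sobel filter and rotating, it evaluates the derivative of a Gaussian analytically along each orientation, and it rasterises each half disc directly at the requested angle. The function names, the 3-sigma kernel extent, and the choice to leave pixels on the dividing diameter out of both halves are hypothetical and not taken from the project code.

```python
import math

def oriented_gaussian_derivative(sigma, theta_deg):
    """Derivative of a 2D Gaussian taken along the direction theta_deg."""
    half = int(math.ceil(3 * sigma))        # assumed extent: about +/- 3 sigma
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * c + y * s               # coordinate along the derivative direction
            g = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(-u / (sigma * sigma) * g)
        kernel.append(row)
    return kernel

def half_disc_pair(radius, theta_deg):
    """Return (left, right) half-disc masks of 0/1 split by a diameter."""
    size = 2 * radius + 1
    t = math.radians(theta_deg)
    nx, ny = math.cos(t), math.sin(t)       # normal to the dividing diameter
    left = [[0] * size for _ in range(size)]
    right = [[0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            x, y = col - radius, row - radius
            if x * x + y * y > radius * radius:
                continue                    # outside the disc
            side = x * nx + y * ny
            if side > 1e-9:
                right[row][col] = 1
            elif side < -1e-9:
                left[row][col] = 1          # pixels on the diameter stay 0 in both
    return left, right

def filter_bank(orientations, sigmas):
    return [oriented_gaussian_derivative(s, o) for s in sigmas for o in orientations]

def mask_bank(orientations, radii):
    return [half_disc_pair(r, o) for r in radii for o in orientations]
```

With 4 orientations and 2 sigma values, `filter_bank` returns 8 kernels, matching the orientations*sigmas count described above.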

Binning textures

We next want to build texture descriptors for each pixel of the image by filtering the image with each item in the filter bank. What we end up with is a texture representation of the image where each pixel is described by a vector of dimension d, where d is the number of filters.
Next we want to bin these d-dimensional vectors into texture bins. I used the default value of 64 for the number of texture types/bins. To perform the binning I run the k-means algorithm with the texture descriptor at each pixel as the data points and a k value of 64. The result is a 2-dimensional image where each pixel holds the index of the texture bin it falls into.
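As a rough illustration of the binning step, here is a small pure-Python k-means that turns per-pixel filter responses into a map of bin indices. It is a sketch, not the project's code: the names `kmeans_labels` and `texton_map` are hypothetical, the deterministic farthest-point initialisation is an assumption made for reproducibility, and a real run on a full image with k = 64 would use an optimised library routine instead.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_labels(points, k, iterations=20):
    # Assumed initialisation: start from the first point, then repeatedly
    # add the point farthest from all centers chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def texton_map(responses, k):
    """responses[row][col] is the length-d list of filter responses at that pixel."""
    height, width = len(responses), len(responses[0])
    flat = [tuple(responses[r][c]) for r in range(height) for c in range(width)]
    labels = kmeans_labels(flat, k)
    return [labels[r * width:(r + 1) * width] for r in range(height)]
```

The returned map has the same height and width as the image, with one bin index per pixel.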

As an example, using imagesc to display the binning results we get the following:

Building Histogram and Gradients

The next step is to build the histograms and gradient maps for texture and color using the chi-squared distance. To start, we loop through each of the bins and end up with one binary image per bin, where image i has value 1 at each pixel whose value falls within the range of bin i, and 0 elsewhere.
Now that we have these binary images we need to use them to create an image of chi-squared distance values for each half-disc mask pair. For a single half-disc pair, we do this by convolving each of our binary images with the left disc (call the results gs) and then with the right disc (call the results hs).
To calculate the chi-squared value, take the sum of (gs(i)-hs(i))^2/(gs(i)+hs(i)) over each binary image i and multiply the result by 0.5. Since we do this for each of our half-disc pairs (say we have h pairs), we end up with h gradient values for each pixel of the image.
Do this for both textures and brightness (and whatever other image descriptors you use).
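The chi-squared computation for one half-disc pair can be sketched as below. This is an illustrative version under assumptions: it counts bin members under each mask directly instead of convolving per-bin binary images (for a point-symmetric mask pair this gives the same chi-squared value, since the distance is symmetric in g and h), it treats pixels outside the image as empty, and the small `eps` guarding against division by zero is a hypothetical addition.

```python
def chi_square_gradient(bin_map, left_mask, right_mask, num_bins, eps=1e-10):
    """Per-pixel chi-squared distance between the two half-disc histograms."""
    height, width = len(bin_map), len(bin_map[0])
    radius = len(left_mask) // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            g = [0] * num_bins   # histogram under the left half disc
            h = [0] * num_bins   # histogram under the right half disc
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < height and 0 <= cc < width):
                        continue  # outside the image: contributes nothing
                    b = bin_map[rr][cc]
                    g[b] += left_mask[dr + radius][dc + radius]
                    h[b] += right_mask[dr + radius][dc + radius]
            out[r][c] = 0.5 * sum(
                (gi - hi) ** 2 / (gi + hi + eps) for gi, hi in zip(g, h)
            )
    return out
```

A pixel whose two half discs see the same bins gets a value near 0, while a pixel sitting on a boundary between two bins gets a larger value.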

Below is a sample texture gradient map after combining and averaging the maps for each disc pair. The clusters represent where there is more likely to be an edge based on our texture gradient.

Similar images can be generated for the brightness gradient.

Combining results

The last step is to combine results. To do this I take the mean of all of the gradient values at each pixel, pooled across every representation type (texture, brightness and color), and element-wise multiply this value by the Canny baseline.

I found that this produced better results than first taking the mean of the gradient values for each representation type separately and then averaging the three equally. By treating every chi-squared value with equal weight, texture gradients (which had more bins) were weighted more. This was desirable because texture gradients are a more accurate measurement of edge strength than color/brightness are (areas of varying brightness/color could just belong to the same texture).
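The difference between the two averaging schemes can be made concrete with a small sketch. The function names and the single-pixel numbers in the usage note are hypothetical, and each gradient map is assumed to be a list of rows with the same size as the Canny output.

```python
def combine_pooled(gradient_maps, canny):
    """Mean over every gradient map (all features pooled), times Canny."""
    n = len(gradient_maps)
    height, width = len(canny), len(canny[0])
    return [
        [canny[r][c] * sum(m[r][c] for m in gradient_maps) / n for c in range(width)]
        for r in range(height)
    ]

def combine_per_feature(feature_groups, canny):
    """Mean within each feature first, then an equal-weight mean across features."""
    height, width = len(canny), len(canny[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            feature_means = [
                sum(m[r][c] for m in group) / len(group) for group in feature_groups
            ]
            row.append(canny[r][c] * sum(feature_means) / len(feature_means))
        out.append(row)
    return out
```

For example, with two texture maps valued 0.9 and 0.6 at a pixel, one brightness map valued 0.3, and a Canny response of 0.5, the pooled scheme gives about 0.30 while the per-feature scheme gives about 0.26, because pooling lets the feature with more maps count for more.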

Results (ODS = 0.60, OIS = 0.60)

Original Images

We can compare the Canny baseline to my results here. Qualitatively, pbLite eliminated many of the extraneous edges that could be regarded as texture. Take the penguin for example. There are far fewer rock edges, which is reasonable since in general one would not regard the edges of every single rock as important. The ground as a whole may almost be considered a texture of sorts. We can say, then, that pbLite did a decent job of choosing the most important edges. Of course, the algorithm isn't perfect: it eliminates some important edges and doesn't get rid of all the unnecessary ones.

Canny

pbLite

Looking at the PR curve we can see that pbLite did substantially better than the Sobel and Canny baselines, which supports the qualitative observations above. The F-score was 0.60, which is greater than the score of the Canny baseline (0.59).
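For reference, the F-score of a single operating point is the harmonic mean of precision and recall. The sketch below assumes the reported score is the best such value along the PR curve; the function names and the sample points in the usage note are hypothetical and are not values from this project's curve.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

def best_f_score(pr_points):
    """Largest F-score over a list of (precision, recall) pairs."""
    return max(f_score(p, r) for p, r in pr_points)
```

For instance, `best_f_score([(0.5, 0.5), (0.75, 0.5)])` evaluates to about 0.6, since the second point has the higher harmonic mean.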

Extra Credit

In addition to the basic algorithm I did two things.

First I added center-surround filters to the filter bank. To create these I took the difference of two Gaussian filters at various values of sigma.
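A center-surround filter of this kind can be sketched as a difference of two normalised Gaussians. The function names, the 3-sigma kernel extent, and the particular sigma pair in the usage note are assumptions for illustration only.

```python
import math

def gaussian_kernel(sigma, half):
    """Normalised 2D Gaussian on a (2*half+1) x (2*half+1) grid."""
    k = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def center_surround(sigma_center, sigma_surround):
    """Narrow Gaussian minus wide Gaussian, both on the same grid."""
    half = int(math.ceil(3 * sigma_surround))   # assumed extent: 3 sigma of the wider Gaussian
    center = gaussian_kernel(sigma_center, half)
    surround = gaussian_kernel(sigma_surround, half)
    return [[a - b for a, b in zip(rc, rs)] for rc, rs in zip(center, surround)]
```

Calling `center_surround(1.0, 2.0)` gives a 13 x 13 kernel that is positive at the center, negative toward the edges, and sums to approximately zero, so it responds to blobs and ignores flat regions.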

I also added color information to the results by running k-means on the image, using each pixel's RGB value as a three-dimensional input vector. I grouped the colors into 16 bins and then proceeded as I did with the brightness and textures (getting the gradient, finding the average of all the feature vectors, multiplying the result by the Canny baseline).

Extra Credit Graph Results (ODS = 0.60, OIS = 0.60) (original on the right for comparison)

The F-score doesn't change at all from our previous results (although it is still better than the baseline). The PR curve itself does look better, though: in general it seems to have higher precision at a given recall than the basic pbLite implementation. The lack of change in the score may be explained by the odd dip at the beginning of the PR curve. For very small recalls, the precision seems to drop substantially, bringing down our F-score. Regardless, for most other recalls, the addition of filters and color gradients seems to aid the results.