The goal of this project is to develop a simplified version of pb, a boundary detection algorithm. The pb-lite algorithm uses brightness and texture data to improve upon both the Sobel and Canny boundary detection algorithms. It proceeds as follows. Given an input image, we first create a gray-scale (brightness) copy of the image and generate a filter bank of derivative-of-Gaussian filters at 16 orientations and 2 scales. We also generate a set of half-disk masks at 16 orientations (sorted into 8 left-right pairs) and 3 scales. The second step is to generate a texton map, which groups sections of the image together based on each section's texture. Next, we compute gradients for both the texton map and the gray-scale data. Finally, we use this gradient data to improve the Sobel and Canny boundary detection results.
Generating the derivative-of-Gaussian filter bank is a relatively simple process. We are given a vector of orientation values and a vector of sigma (scale) values. For each sigma, we first create a base derivative-of-Gaussian filter by convolving a Gaussian filter of that sigma with a Sobel filter. We then rotate the base filter to each of the orientation values.
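A minimal NumPy sketch of such a filter bank follows. It is not the code used in this project: instead of convolving with a Sobel filter and rotating the result, it differentiates the Gaussian analytically and steers it to each angle, which gives a comparable oriented filter without an image-rotation routine. The function name, the sigma values, the kernel extent of 3 sigma, and the choice to spread the 16 orientations over the full circle are all assumptions made for illustration.

```python
import numpy as np

def gaussian_derivative_bank(sigmas, n_orient=16):
    """Return a list of oriented first-derivative-of-Gaussian kernels.

    Hypothetical helper: kernels are ordered scale by scale, with
    n_orient orientations per scale.
    """
    bank = []
    for sigma in sigmas:
        half = int(np.ceil(3 * sigma))            # assumed support: +/- 3 sigma
        y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
        g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
        g /= g.sum()                              # unit-sum Gaussian
        gx = -x / sigma**2 * g                    # derivative along x
        gy = -y / sigma**2 * g                    # derivative along y
        for k in range(n_orient):
            theta = 2 * np.pi * k / n_orient
            # Steer the derivative to angle theta.
            bank.append(np.cos(theta) * gx + np.sin(theta) * gy)
    return bank

# Two assumed scales and 16 orientations give the 32 filters described above.
bank = gaussian_derivative_bank(sigmas=[1.0, 2.0])
```

Each kernel sums to approximately zero, so flat image regions produce no response, and the filter at 180 degrees is the negative of the filter at 0 degrees.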
The next step is to create a bank of half-disk masks. This was a very simple operation. Given a vector of orientations and a vector of scales, I produced a 3D cell array of masks. The first dimension of the array was for the orientations, the second was for the scales, and the third was a binary dimension for the left/right (180 degrees apart) orientations. To get the half disk at each scale, I used the built-in disk filter to produce a scaled circular base image. I then applied the ceil function to the image and cut off the right half to get the desired half-disk image. Finally, I rotated the base image according to each orientation value to get the left mask, and rotated it by another 180 degrees to get the right mask.
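The sketch below builds comparable left/right half-disk pairs in NumPy. It swaps the disk-filter, ceil, and rotate steps for a direct geometric test: a pixel belongs to a half disk if it lies inside the circle and on one side of a dividing line through the centre. The function name, the radii, and the small tolerance that leaves pixels exactly on the dividing line out of both halves are assumptions.

```python
import numpy as np

def half_disk_pairs(radii, n_orient=8):
    """Return masks[scale][orientation] = (left, right) boolean half-disks.

    Hypothetical helper: n_orient pairs cover angles in [0, pi), so the
    left and right halves together cover 2 * n_orient orientations.
    """
    masks = []
    for r in radii:
        y, x = np.mgrid[-r:r + 1, -r:r + 1]
        disk = x**2 + y**2 <= r**2
        per_scale = []
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            # Signed distance from the dividing line through the centre.
            side = x * np.cos(theta) + y * np.sin(theta)
            left = disk & (side < -1e-9)
            right = disk & (side > 1e-9)
            per_scale.append((left, right))
        masks.append(per_scale)
    return masks

# Three assumed radii and 8 pairs, i.e. 16 orientations at 3 scales.
masks = half_disk_pairs(radii=[5, 10, 20])
left, right = masks[0][0]
```

With this construction each right mask is the left mask rotated by 180 degrees, which matches the pairing described above.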
The next step is to cluster the image by texture. We do this by convolving each filter in the filter bank with the source image. We then create a texton map from these 32 results using kmeans, a provided utility script. Most of the code converts the 32 convolved images into a format that kmeans can work with. We simply reshape each image into a 1-dimensional list and assign it to a row in the data matrix. Kmeans operates on this data matrix and returns a membership list. We then reshape this list into a matrix with the same dimensions as the original image, so that each pixel is mapped to the index of its cluster.
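The reshaping around kmeans can be sketched as follows. The provided kmeans script is not reproduced here; the small `kmeans` function below is a stand-in (plain Lloyd iterations with random initial centres) so that the example is self-contained. The function names, the `(n_filters, H, W)` layout of the responses, and the iteration count are assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Stand-in k-means: returns one cluster index per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centre.
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):              # skip empty clusters
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def texton_map(responses, k):
    """Cluster per-pixel filter responses into a (H, W) map of cluster ids."""
    n_filters, h, w = responses.shape
    data = responses.reshape(n_filters, h * w)   # one row per filter response
    labels = kmeans(data.T, k)                   # one point per pixel
    return labels.reshape(h, w)                  # back to image dimensions
```

The distance computation holds an array of size pixels by clusters by filters in memory, so this stand-in is only practical for small images.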
Now we have a texton map and a brightness image, and we want to turn them both into gradient images. I did this by creating histograms for each image. For each image, we create a 3D matrix whose first two dimensions are the width and height of the image and whose third dimension is the number of buckets desired for the histogram. For each bucket, we loop through the x,y pairs in the image. If the value at an x,y pair is within the bucket's range, we set the corresponding x,y,bucket entry to 1; otherwise we leave it at zero.
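A short NumPy sketch of this binning step is given below. The function name and the choice of equal-width buckets spanning the image's minimum and maximum values are assumptions; the write-up does not say how bucket ranges were chosen.

```python
import numpy as np

def bin_indicator_stack(image, n_bins):
    """Return a (H, W, n_bins) stack with a 1 where a pixel falls in a bucket."""
    edges = np.linspace(image.min(), image.max(), n_bins + 1)
    # Comparing against the interior edges yields a bucket index in
    # 0..n_bins-1; the maximum value lands in the last bucket.
    idx = np.digitize(image, edges[1:-1])
    stack = np.zeros(image.shape + (n_bins,), dtype=float)
    for b in range(n_bins):
        stack[:, :, b] = (idx == b)
    return stack
```

For a texton map with cluster indices 0 to k-1, calling this with `n_bins` equal to k puts each cluster in its own layer. Every pixel has exactly one nonzero entry along the third dimension.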
Next we convolve each histogram layer with both masks of each half-disk left-right pair. For each left-right pair, we sum over the bins half the chi-square term between the left and right convolution results, that is, 0.5 * (g - h)^2 / (g + h), where g and h are the left and right results for that bin. The returned gradient is the set of these 2D chi-square distance maps, one per pair, so its size is the number of scales multiplied by the number of orientations.
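The chi-square accumulation for one left-right pair can be sketched as below, assuming a binary indicator stack of shape (H, W, bins) and a pair of odd-sized boolean masks. The function names, the zero padding at the image border, and the small `eps` that guards against division by zero are assumptions. The helper slides the mask without flipping it (correlation rather than convolution); because the right mask is the flipped left mask and the chi-square term is symmetric in g and h, the resulting distance is the same either way.

```python
import numpy as np

def masked_sum(layer, mask):
    """Sum of `layer` under `mask` centred at every pixel (zero padded)."""
    r_y, r_x = mask.shape[0] // 2, mask.shape[1] // 2
    padded = np.pad(layer, ((r_y, r_y), (r_x, r_x)))
    h, w = layer.shape
    out = np.zeros((h, w))
    # Add one shifted copy of the layer per active mask pixel.
    for dy, dx in zip(*np.nonzero(mask)):
        out += padded[dy:dy + h, dx:dx + w]
    return out

def chi_square_gradient(stack, left, right, eps=1e-10):
    """Chi-square distance between left and right half-disk histograms."""
    h, w, n_bins = stack.shape
    dist = np.zeros((h, w))
    for b in range(n_bins):
        g = masked_sum(stack[:, :, b], left)
        hh = masked_sum(stack[:, :, b], right)
        dist += 0.5 * (g - hh) ** 2 / (g + hh + eps)
    return dist
```

Calling `chi_square_gradient` once per pair and stacking the results along a third axis gives one 2D map per scale and orientation. The distance is zero where both half disks see the same mix of buckets and grows where they differ, which is what makes it respond at boundaries.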
We have finally reached the part where all of the magic happens. We now improve on the Sobel and Canny results using our brightness and texture gradients. To do this, I first averaged the texture gradient and the brightness gradient, each over its own set of 2D maps. Next I added the Sobel and Canny results and multiplied the sum pixelwise by both the normalized average texture gradient and the normalized average brightness gradient. Finally, I normalized the result and returned it.
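The combination step can be sketched in a few lines of NumPy. The Sobel and Canny outputs are assumed to be precomputed 2D arrays, and the two gradients are assumed to be (H, W, n_maps) stacks. The function names and the use of min-max scaling for "normalize" are assumptions, since the write-up does not specify the normalization.

```python
import numpy as np

def normalize(a, eps=1e-10):
    """Min-max scale an array into [0, 1] (assumed normalization)."""
    a = a - a.min()
    return a / (a.max() + eps)

def pb_lite(sobel, canny, tg, bg):
    """Weight the summed baselines by the mean texture and brightness gradients."""
    tg_mean = normalize(tg.mean(axis=2))   # average over scales/orientations
    bg_mean = normalize(bg.mean(axis=2))
    combined = (sobel + canny) * tg_mean * bg_mean
    return normalize(combined)
```

Because the gradients enter as multiplicative weights, a baseline edge survives only where both the texture and brightness gradients are also strong.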