The goal of this project is to detect edges, or boundaries, in images using pb-lite, a simplified version of the algorithm discussed in Contour Detection and Hierarchical Image Segmentation by Arbelaez, Maire, Fowlkes, and Malik.
In computer vision, boundary detection is an important, well-studied problem because it is closely related to feature detection and recognition. Many researchers have created algorithms to solve it. The classical ones include Canny and Sobel, which only check for intensity discontinuities. The more recent algorithm, pb (probability of boundary), considers texture and color gradients along with intensity, and so detects boundaries much more accurately than its predecessors.
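In overview, the pb-lite algorithm builds a texton map, computes texture gradients (tg) and brightness gradients (bg), and combines them with a Canny baseline into a per-pixel probability of boundary.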
To describe the texture gradient (tg), that is, how quickly the texture is changing at a given pixel, we first need to represent texture as a local distribution of textons. Textons are discrete texture elements generated by clustering filter bank responses. To produce these textons, we used the following filter bank.
Next, we filtered the input image with each element of our filter bank, which resulted in a vector of filter responses centered on each pixel. Since the filter bank has 16 orientations at 2 scales, this gives a 32-dimensional vector at every pixel describing the texture properties. We simplified this representation by replacing each 32-dimensional vector with a discrete texton id. We did this by clustering the filter responses at all pixels in the image into K textons using k-means, where K is in the range of 1 to 64.
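The clustering step can be sketched in a few lines of plain Python. This is only an illustration, not the code used for the results here: the function name `assign_textons`, the evenly spaced initial centres, and the fixed iteration count are all assumptions, and a real run would more likely call a library k-means on the full 32-dimensional responses.

```python
def assign_textons(responses, k, iterations=10):
    """Cluster per-pixel filter-response vectors into k texton ids (plain k-means)."""
    if not 1 <= k <= len(responses):
        raise ValueError("k must be between 1 and the number of pixels")
    # Assumed initialisation: k evenly spaced response vectors (deterministic).
    step = len(responses) // k
    centres = [list(responses[i * step]) for i in range(k)]
    labels = [0] * len(responses)
    for _ in range(iterations):
        # Assignment step: each pixel takes the id of its nearest centre.
        for idx, vec in enumerate(responses):
            labels[idx] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(vec, centres[j])),
            )
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [responses[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy usage with 2-dimensional "responses" standing in for the 32-dimensional ones.
toy = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
texton_ids = assign_textons(toy, k=3)
```

Reshaping the returned list back to the image's height and width gives the texton map.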
Here's an example of a texton map.
To actually compute tg, we compare the distributions in left/right half-disc pairs centered at a pixel.
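One way to build such a half-disc pair is sketched below. The function name `half_disc_pair`, its `radius` and `angle_deg` parameters, and the choice to leave pixels on the dividing line out of both halves are assumptions made for illustration, not details taken from the original implementation.

```python
import math


def half_disc_pair(radius, angle_deg):
    """Return (left, right) binary half-disc masks as lists of rows."""
    size = 2 * radius + 1
    theta = math.radians(angle_deg)
    # Normal to the dividing diameter; the sign of the projection picks the half.
    nx, ny = math.cos(theta), math.sin(theta)
    left = [[0] * size for _ in range(size)]
    right = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            dx, dy = c - radius, r - radius
            if dx * dx + dy * dy > radius * radius:
                continue  # outside the disc
            side = dx * nx + dy * ny
            if side < -1e-9:
                left[r][c] = 1
            elif side > 1e-9:
                right[r][c] = 1
            # Pixels on the dividing line belong to neither half (assumption).
    return left, right


left_mask, right_mask = half_disc_pair(radius=2, angle_deg=0)
```

Calling the function over several radii and angles gives the full set of mask pairs across scales and orientations.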
The gradient increases as the similarity between the two distributions decreases. Because our half-discs span multiple scales and orientations, we end up with a series of local gradient measurements encoding how quickly the texture or brightness distributions are changing at different scales and angles.
We will compare texton distributions with the chi-square measure. It is defined as follows:
$$\chi^2(g, h) = \frac{1}{2} \sum_{i=1}^{K} \frac{(g_i - h_i)^2}{g_i + h_i}$$
where g and h are histograms with the same binning scheme, and i indexes through these bins.
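As a minimal sketch, the measure can be written directly for two histograms stored as lists. The function name `chi_square_distance` is hypothetical, and skipping bins that are empty in both histograms (to avoid dividing by zero) is an assumption rather than part of the definition.

```python
def chi_square_distance(g, h):
    """Chi-square distance between two histograms with the same binning."""
    if len(g) != len(h):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for g_i, h_i in zip(g, h):
        denom = g_i + h_i
        if denom > 0:  # assumed convention: a bin empty in both contributes 0
            total += (g_i - h_i) ** 2 / denom
    return 0.5 * total


same = chi_square_distance([3, 1, 0], [3, 1, 0])      # identical histograms
disjoint = chi_square_distance([1, 0], [0, 1])        # no shared bins
```

Identical histograms give a distance of 0, and the distance grows as the two distributions share less mass.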
The following is the pseudo code to calculate the tg.
// img is the texton map in this case
// n = height of img, m = width of img
// p = number of left-right mask pairs
tg = zeros(n, m, p)
for k = 1:p    // every left-right mask pair
    chi_sqr_dist = zeros(n, m)
    for i = 1:num_bins
        tmp = 1 where img is in bin i and 0 elsewhere
        g_i = convolve tmp with left_mask(k)
        h_i = convolve tmp with right_mask(k)
        // element-wise update; eps is a small constant that avoids division by zero
        chi_sqr_dist = chi_sqr_dist + 0.5 * (g_i - h_i)^2 / (g_i + h_i + eps)
    end
    tg(:, :, k) = chi_sqr_dist
end
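The loop above can be turned into a small runnable sketch in plain Python. The names `masked_count` and `gradient_maps` are hypothetical, the image border is handled by treating outside pixels as empty (an assumption), and the result is returned as a list of p maps rather than one n*m*p matrix. It is written for clarity on small inputs, not speed.

```python
def masked_count(indicator, mask):
    """For each pixel, count the 1s of `indicator` under `mask` centred there.

    Pixels outside the image contribute nothing (zero padding).
    """
    n, m = len(indicator), len(indicator[0])
    k = len(mask) // 2
    out = [[0.0] * m for _ in range(n)]
    for r in range(n):
        for c in range(m):
            total = 0
            for i, mask_row in enumerate(mask):
                rr = r + i - k
                if not 0 <= rr < n:
                    continue
                for j, w in enumerate(mask_row):
                    cc = c + j - k
                    if w and 0 <= cc < m:
                        total += indicator[rr][cc]
            out[r][c] = float(total)
    return out


def gradient_maps(img, num_bins, mask_pairs):
    """Return one chi-square distance map per (left, right) mask pair.

    `img` holds integer bin ids in range(num_bins), e.g. a texton map.
    """
    n, m = len(img), len(img[0])
    maps = []
    for left, right in mask_pairs:
        dist = [[0.0] * m for _ in range(n)]
        for b in range(num_bins):
            indicator = [[1 if v == b else 0 for v in row] for row in img]
            g = masked_count(indicator, left)
            h = masked_count(indicator, right)
            for r in range(n):
                for c in range(m):
                    denom = g[r][c] + h[r][c]
                    if denom > 0:  # skip bins absent from both half-discs
                        dist[r][c] += 0.5 * (g[r][c] - h[r][c]) ** 2 / denom
        maps.append(dist)
    return maps


# Toy texton map with a vertical boundary between columns 2 and 3.
toy_map = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
left = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
right = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
tg_maps = gradient_maps(toy_map, num_bins=2, mask_pairs=[(left, right)])
```

On the toy map the response is highest at the pixels next to the boundary and zero inside each uniform region. `masked_count` sums under the mask without flipping it; because the two half-discs mirror each other and the chi-square measure is symmetric, this gives the same distance as a true convolution.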
The brightness gradient (bg) is produced by the same process as the texture gradient. However, instead of a texton map, we convert the image to grayscale, scale the grayscale image so that pixel values range from 0 to 255, and use the result as the "brightness map".
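A minimal sketch of building the brightness map follows. The function name `brightness_map`, the Rec. 601 luma weights used for the grayscale conversion, and the min-max stretch used for the scaling are assumptions; the write-up does not state which conversion or scaling was actually used.

```python
def brightness_map(rgb):
    """Convert rows of (r, g, b) tuples to integer brightness values in 0..255."""
    # Assumed grayscale conversion: Rec. 601 luma weights.
    gray = [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    span = hi - lo
    if span == 0:
        return [[0 for _ in row] for row in gray]  # flat image: nothing to stretch
    # Assumed scaling: min-max stretch so the darkest pixel is 0 and the brightest 255.
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in gray]


toy_rgb = [[(0, 0, 0), (255, 255, 255)], [(128, 128, 128), (0, 0, 0)]]
bmap = brightness_map(toy_rgb)
```

The integer values can then be grouped into histogram bins and passed through the same half-disc comparison as the texton ids.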
To approximate the per-pixel probability of boundary, I combined the output of the Canny edge detector with tg and bg.
For each pixel, I averaged the stored vector of gradients in tg and in bg over all mask pairs. Then I multiplied the two means by the Canny edge detector value at that pixel:
pb = canny*tg*bg
The last step was to normalize the pb values of all pixels.
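The combination step can be sketched as follows. The names `mean_over_pairs` and `pb_lite` are hypothetical, the gradients are assumed to be stored as a list of per-mask-pair maps, and dividing by the maximum is only one possible reading of "normalize", since the exact normalization is not specified above.

```python
def mean_over_pairs(maps):
    """Average a list of equally sized 2-D maps pixel by pixel."""
    n, m, p = len(maps[0]), len(maps[0][0]), len(maps)
    return [[sum(mp[r][c] for mp in maps) / p for c in range(m)] for r in range(n)]


def pb_lite(canny, tg_maps, bg_maps):
    """Combine a Canny response with mean texture and brightness gradients."""
    tg = mean_over_pairs(tg_maps)
    bg = mean_over_pairs(bg_maps)
    n, m = len(canny), len(canny[0])
    pb = [[canny[r][c] * tg[r][c] * bg[r][c] for c in range(m)] for r in range(n)]
    # Assumed normalization: divide by the maximum so values fall in [0, 1].
    peak = max(max(row) for row in pb)
    if peak > 0:
        pb = [[v / peak for v in row] for row in pb]
    return pb


canny = [[1.0, 0.0], [0.5, 1.0]]
tg_maps = [[[2, 2], [2, 2]], [[4, 0], [2, 2]]]   # two mask pairs
bg_maps = [[[1, 1], [1, 2]]]                     # one mask pair
pb = pb_lite(canny, tg_maps, bg_maps)
```

Because the terms are multiplied, a pixel needs a non-zero Canny response and non-zero texture and brightness gradients to keep a non-zero pb value.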
The last part of this assignment was to evaluate the performance of pb-lite against Canny, Sobel, and gPb (the best edge detection algorithm so far).
Looking at the above chart, pb-lite clearly does not outperform gPb, but it is better than the two baselines (Sobel and Canny) for the most part.