Introduction
Summary
Boundary detection is an important, well-studied computer vision problem. It would clearly be useful to have algorithms that know where one object stops and another starts, but boundary detection from a single image is fundamentally difficult.
In this project, I developed a simplified version of pb, which finds boundaries by examining brightness and texture information across multiple scales. For extra credit, I also examined RGB information. The output of my algorithm is a per-pixel probability of boundary. This simplified boundary detector significantly outperforms the well-regarded Canny edge detector. Evaluation is carried out against human annotations (ground truth) from a subset of the Berkeley Segmentation Data Set 500 (BSDS500).
The Pipeline
The general pipeline for the project is shown here. I elaborate on each individual step later.

The algorithm
The filter bank and texton maps
Filter banks are used to measure texture properties in the image. Each filter is convolved with the image, so every pixel records a local response to each filter. Pixels with similar responses indicate regions with similar texture. Below is a sample filter bank for different orientations and sigmas (orientations of 45, 90, 180, 225, 270, 315, and 360 degrees and sigmas of 0.5, 1, 1.5, and 2).

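For concreteness, here is a minimal Python sketch of how such a bank can be built, assuming oriented first-derivative-of-Gaussian filters with the orientations and sigmas listed above (this is an illustration, not the exact code used in the project):

import numpy as np
from scipy import ndimage

def oriented_dog_filter(sigma, theta_deg, size=None):
    """First derivative of a Gaussian, rotated to the given orientation."""
    if size is None:
        size = int(2 * np.ceil(3 * sigma) + 1)
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    theta = np.deg2rad(theta_deg)
    # Rotate the coordinates so the derivative is taken across the orientation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    kernel = -xr / sigma**2 * g          # derivative of the Gaussian along xr
    return kernel / np.abs(kernel).sum()

def filter_bank_responses(gray, sigmas=(0.5, 1, 1.5, 2),
                          thetas=(45, 90, 180, 225, 270, 315, 360)):
    """Per-pixel responses to every filter, stacked into shape (H, W, n_filters).
    gray is a 2-D float grayscale image."""
    responses = [ndimage.convolve(gray, oriented_dog_filter(s, t), mode='nearest')
                 for s in sigmas for t in thetas]
    return np.stack(responses, axis=-1)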
Given the responses of the pixels to the filters, we can group the pixels into K groups based on how similar their responses are. To group the pixels I used the kmeansML algorithm. I initially picked K to be 64. However, this produced too much segmentation in the texton maps -- similar textures seemed to end up in different groups. I set K equal to 32 instead. A sample texton map is shown below. Areas with similar textures are displayed with the same RGB value.

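A sketch of this texton-map step, using scikit-learn's KMeans as a stand-in for the kmeansML routine used in the project:

import numpy as np
from sklearn.cluster import KMeans

def texton_map(responses, k=32, seed=0):
    """Cluster per-pixel filter responses into k textons.
    responses: (H, W, n_filters) array from the filter bank."""
    h, w, n = responses.shape
    features = responses.reshape(-1, n)
    labels = KMeans(n_clusters=k, n_init=4, random_state=seed).fit_predict(features)
    return labels.reshape(h, w)   # each pixel gets a texton id in [0, k)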
Compute the texture and brightness gradients
To figure out where edges are we need to look at where the image is changing. In this algorithm we focus on changes in texture and in brightness. The local texture gradient (tg) and brightness gradient (bg) encode, for every pixel, how much the texture and brightness distributions are changing in its vicinity. Intuitively, the more things are changing around a pixel, the more likely it is to be part of a boundary. The gradients are key to determining which pixels are more likely to be part of an edge.
To compute the brightness and texture gradients we first need to define a bank of masks. These masks allow us to compare the distribution of brightness or textons on opposite sides of a pixel. If the two distributions differ greatly, the texture or brightness is different on either side, so the pixel is likely to be part of a boundary. The image below shows an example of a set of masks with orientations 0, 45, 90, 180, 225, 270, 315, and 360 degrees for radii 15, 10, and 5.

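Here is a sketch of the mask construction and of one common way to turn the masks into a gradient: compare the histograms on the two half-discs around each pixel with a chi-square distance (the project's exact distance measure may differ):

import numpy as np
from scipy.ndimage import convolve

def half_disc_masks(radius, theta_deg):
    """Pair of boolean half-disc masks split along the given orientation."""
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax)
    disc = x**2 + y**2 <= radius**2
    theta = np.deg2rad(theta_deg)
    side = (x * np.cos(theta) + y * np.sin(theta)) > 0
    return disc & side, disc & ~side

def half_disc_gradient(label_map, n_bins, radius, theta_deg):
    """Chi-square distance between the label histograms on the two sides of
    every pixel, computed with convolutions over binary indicator images."""
    left, right = half_disc_masks(radius, theta_deg)
    grad = np.zeros(label_map.shape, dtype=float)
    for b in range(n_bins):
        indicator = (label_map == b).astype(float)
        g = convolve(indicator, left.astype(float), mode='nearest')
        h = convolve(indicator, right.astype(float), mode='nearest')
        grad += 0.5 * (g - h) ** 2 / (g + h + 1e-10)
    return grad

For the texture gradient the label map is the texton map; for the brightness gradient the intensities are first quantized into a small number of bins. The responses over the different radii and orientations can then be combined, for example by taking the maximum at each pixel.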
The gradients are calculated using the masks above. Sample results are shown below:
Example of a texture gradient:

Example of a brightness gradient:

Assign a probability of boundary score to each pixel location
The gradients encode the likelihood that a pixel is part of an edge. We now need to combine the information from the different gradients, which we do by averaging the tg and bg values.
To get the final edge boundaries, we take the element-wise product of this average with the output of the Canny edge detector. Combining tg + bg with Canny accomplishes two things: we know where edges are without having to set thresholds (they are already set through Canny) and, more importantly, tg and bg help us discard pixels that Canny thought were part of edges but were in fact part of a texture.
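A minimal sketch of this combination step, using scikit-image's Canny as a stand-in for the edge map used in the project (the normalization is an assumption):

import numpy as np
from skimage.feature import canny

def pb_from_gradients(tg, bg, gray):
    """tg, bg: texture and brightness gradient maps, same shape as gray."""
    mean_grad = (tg + bg) / 2.0
    mean_grad = mean_grad / (mean_grad.max() + 1e-10)   # scale to [0, 1]
    edges = canny(gray).astype(float)                   # binary Canny edge map
    return mean_grad * edges                            # element-wise product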
Evaluation and F-score
Below are the results of applying the different algorithms to detect edges. First, we have Sobel and Canny. Both of them are pretty good, but not perfect. Canny performs better but seems to lack any concept of texture, while Sobel misses some of the more obvious edges. When we combine our PB with Canny or Sobel, we get better edge detection than with either of them by themselves. Since Canny seems to need the most help in identifying textures, PB combined with Canny is particularly successful, more so than PB combined with Sobel.
Canny

Sobel

PB

PB and Sobel

PB and Canny

As can be inferred from the images above, our algorithm performs better than the baseline methods. The F-scores obtained are shown below. The first two sets correspond to the Sobel and Canny algorithms, and the last one to Pb combined with Canny.
Sobel
ODS: F( 0.38, 0.55 ) = 0.45
OIS: F( 0.38, 0.55 ) = 0.45
Canny
ODS: F( 0.66, 0.51 ) = 0.58
OIS: F( 0.70, 0.51 ) = 0.59
Pb and Canny
ODS: F( 0.66, 0.56 ) = 0.61
OIS: F( 0.68, 0.55 ) = 0.61
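For reference, each F-score above is the harmonic mean of the precision and recall shown in parentheses; for example, the Pb and Canny ODS entry checks out as:

def f_score(precision, recall):
    # Standard F-measure: harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(round(f_score(0.66, 0.56), 2))   # 0.61, matching the Pb and Canny ODS score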
We can see that the F-score for our algorithm (0.61) is greater than Canny's (0.59) or Sobel's (0.45). We have achieved better performance, as desired. This is also depicted in the performance graph below:

Using the brightness gradient only
I wanted to see how much the brightness gradient by itself improved the performance of edge detection. I expected the texture gradient to have a greater impact than the brightness gradient. Surprisingly, however, the brightness gradient by itself raised the F-score to 0.60.
Extra Credit
RGB gradient
In an attempt to beat the above algorithm, I calculated gradients for the red, green, and blue channels. This was straightforward and very similar to the calculation of the brightness gradient. For the final edge detection calculation I replaced bg with the sum of the three RGB gradients. The results obtained are shown below.
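A sketch of the per-channel idea (gradient_fn here is a placeholder for whatever brightness-gradient computation is used):

def rgb_gradient(rgb, gradient_fn):
    # gradient_fn maps a single 2-D channel to its gradient map,
    # e.g. the same histogram-difference computation used for bg.
    return sum(gradient_fn(rgb[..., c]) for c in range(3))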
RGB edge detection

RGB Performance

Sobel
ODS: F( 0.38, 0.55 ) = 0.45
OIS: F( 0.38, 0.55 ) = 0.45
Area_PR = 0.21
Canny
ODS: F( 0.66, 0.51 ) = 0.58
OIS: F( 0.70, 0.50 ) = 0.59
Texture and RGB gradient
ODS: F( 0.65, 0.56 ) = 0.60
OIS: F( 0.64, 0.56 ) = 0.60
Unfortunately, this tweak lowered the F-score from 0.61 to 0.60. This suggests that brightness by itself encodes the idea of an edge more accurately than the sum of the RGB channels.
Difference of Gaussians
The Difference of Gaussians is generally used to increase the visibility of edges and other detail in an image. First, I filtered the image with a Gaussian filter. Subtracting the filtered image from the original leaves a very high-frequency image. I used this high-frequency image as the starting point for the texton and gradient calculations.
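A sketch of this high-pass step (the sigma value is illustrative, not necessarily the one used):

from scipy.ndimage import gaussian_filter

def high_pass(gray, sigma=2.0):
    blurred = gaussian_filter(gray, sigma=sigma)
    return gray - blurred   # keeps mostly high-frequency detail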
However, a major drawback of this approach is the inherent reduction in overall image contrast produced by the operation. This is probably the reason why the results obtained were so poor. The algorithm performed better than Sobel, but worse than Canny. Since we combine pb with Canny, we essentially degraded what Canny had already found.