Project 4 Report

By Andersen Chen

Algorithm: Face Detection with Sliding Window

First, we extract SIFT descriptors from our cropped faces. For the first stage, we sample random negatives from non-face scenes and train the classifier on them. Then, for a predetermined number of iterations, we mine hard negatives by running the current classifier on the non-face scenes and collecting its false positives, retrain the classifier, and repeat. Afterwards, we evaluate the classifier on our test data set. In terms of parameters, a bin size of 10 and a step size of 6 worked well enough for vl_dsift. I did not alter any other parameters.
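
The training loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes feature vectors are plain lists of floats (standing in for SIFT descriptors), it swaps in a simple perceptron for the actual linear classifier, and the names train_linear, mine_hard_negatives, train_with_mining, n_random and threshold are all hypothetical.

```python
import random

def train_linear(pos, neg, epochs=50, lr=0.1):
    """Perceptron stand-in for the linear classifier; returns (weights, bias)."""
    dim = len(pos[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1) for x in pos] + [(x, -1) for x in neg]
    for _ in range(epochs):
        for x, y in data:
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * s <= 0:  # misclassified: nudge the boundary toward x
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def score(model, x):
    """Signed classifier score; positive means 'face'."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def mine_hard_negatives(model, candidates, threshold=0.0):
    """Non-face features that the current model scores as faces (false positives)."""
    return [x for x in candidates if score(model, x) > threshold]

def train_with_mining(pos, neg_pool, total_stages, n_random, seed=0):
    """Stage 1 uses random negatives; each later stage adds mined false positives."""
    rng = random.Random(seed)
    negatives = rng.sample(neg_pool, n_random)
    model = train_linear(pos, negatives)
    for _ in range(total_stages - 1):
        hard = mine_hard_negatives(model, neg_pool)
        new = [x for x in hard if x not in negatives]
        negatives.extend(new)
        model = train_linear(pos, negatives)
    return model, negatives
```

With total_stages = 1 this reduces to training on random negatives only; total_stages = 2 corresponds to mining hard negatives once.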

Results

For the linear classifier, without mining hard negatives, bin size = 10, step = 6:

For the linear classifier, mining hard negatives once, bin size = 10, step = 6. Mining hard negatives twice (total stages = 3) gives better results:

For the non-linear classifier, without mining hard negatives, bin size = 10, step = 6. The non-linear classifier improves results significantly:

For the non-linear classifier, mining hard negatives just once (total stages = 2), bin size = 10, step = 6, we see a slight decrease in results. This may be because the classifier was trained on a large amount of negative data (1000 images), which left it with an overly conservative threshold, so it underdetected faces. However, I believe that mining hard negatives does reduce false positives, which the PR metric does not penalize. At the highest recall point, we still have a precision of 0.3, which suggests that average precision could be increased by lowering the threshold for a positive detection.
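
To illustrate the last point, here is a small sketch of how average precision responds when the detection threshold is lowered. It assumes a simple non-interpolated average precision (the project's evaluation code may use a different variant), and the scores and labels are made-up toy values, not results from this project.

```python
def precision_recall(scores, labels, n_positives):
    """(recall, precision) after each detection, in order of descending score.

    labels[i] is True when detection i matches a ground-truth face.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / n_positives, tp / (tp + fp)))
    return points

def average_precision(points):
    """Sum of precision times the recall gained at each detection."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap

# Toy example with 4 ground-truth faces.
# Conservative threshold: only the 3 most confident detections are kept.
strict = precision_recall([0.9, 0.8, 0.7], [True, True, False], 4)
# Lower threshold: the same detections plus 3 lower-scoring ones.
loose = precision_recall(
    [0.9, 0.8, 0.7, 0.4, 0.3, 0.2],
    [True, True, False, True, False, False],
    4,
)
ap_strict = average_precision(strict)  # 0.5
ap_loose = average_precision(loose)    # 0.6875
```

In this toy case the conservative detector stops at a recall of 0.5, so half of the recall axis contributes nothing to the area. Keeping the lower-scoring detections recovers one more face and extends the curve, which raises the average precision.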