Face Detection with a Sliding Window

Overview

The sliding window model independently classifies image patches as being object or non-object. In this project I implemented a sliding window face detector. To do so I:

Improved the representation by using the SIFT descriptor rather than raw image patches (the baseline).
Implemented a strategy for iteratively re-training a classifier with mined hard negatives, and compared this to the accuracy achieved when randomly sampling non-face images.
Used both linear and non-linear SVMs and compared the results of each.
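As a rough illustration of the sliding window idea, the sketch below enumerates window positions over a few image scales and scores each one independently. The function names, the 36-pixel window, and the `step`, `scales`, and `threshold` values are assumptions made for this example, not the project's actual code or settings, and `score_patch` stands in for whatever classifier is used.

```python
def sliding_windows(width, height, window=36, step=4, scales=(1.0, 0.5)):
    """Yield (scale, x, y) for the top-left corner of every window.

    Coordinates are in the rescaled image. `window`, `step`, and `scales`
    are illustrative values, not the project's actual settings.
    """
    for scale in scales:
        scaled_w = int(width * scale)
        scaled_h = int(height * scale)
        # Slide a fixed-size window over the rescaled image.
        for y in range(0, scaled_h - window + 1, step):
            for x in range(0, scaled_w - window + 1, step):
                yield scale, x, y


def detect(width, height, score_patch, threshold=0.0, **kwargs):
    """Keep every window whose classifier score exceeds `threshold`."""
    detections = []
    for scale, x, y in sliding_windows(width, height, **kwargs):
        # Each patch is scored on its own, with no context from neighbours.
        score = score_patch(scale, x, y)
        if score > threshold:
            detections.append((score, scale, x, y))
    # Highest-confidence detections first.
    return sorted(detections, reverse=True)
```

Halving the step in this sketch roughly quadruples the number of windows per scale, so a finer step gives the classifier more chances to land on a face at the cost of more patches to score.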

Results

Accuracy varied widely with the method and parameters, from .045 for the raw-patch baseline to an average precision of .74 for a non-linear SVM trained with mined hard negatives.
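For readers unfamiliar with the metric, here is a minimal sketch of one common way to compute average precision from ranked detections. It is an assumption about the general technique, not the evaluation code that produced the numbers in this report. That code may interpolate the curve or treat missed faces differently. The function name and arguments are hypothetical.

```python
def average_precision(scores, labels):
    """Average of the precision values at each rank where a face is found.

    scores: classifier confidence per detection.
    labels: True if the detection matches a ground-truth face, else False.
    Assumes every ground-truth face appears somewhere among the detections.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for _, is_face in ranked if is_face)
    if total_positives == 0:
        return 0.0

    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_face) in enumerate(ranked, start=1):
        if is_face:
            true_positives += 1
            # Precision at the rank where recall increases.
            precision_sum += true_positives / rank
    return precision_sum / total_positives
```

A detector that ranks every true face above every false alarm scores 1.0 under this definition, and false alarms placed high in the ranking pull the value down.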

Representation Improvement

I improved on the baseline's accuracy of .045 by using the SIFT descriptor with a linear SVM, which gave an accuracy of .23. The precision-recall curve looked like this:

I improved on this further by tuning the parameters. Changing step_size from 4 to 2 and start_scale from 3 to 1 gave much better results:

Mining Hard Negatives

Augmenting the negative training set with mined hard negatives improved on this slightly:

Non-linear SVM

First I tested a non-linear SVM without mining hard negatives, only randomly subsampling them. This method outperformed the previous ones with an average precision of .71:

The most successful results, however, were achieved using a non-linear SVM with mined hard negatives. Average precision was .74.
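The iterative re-training strategy can be sketched as a loop: train on a random sample of negatives, run the classifier over the non-face pool, add every non-face it mistakes for a face, and retrain. The code below is a minimal sketch under stated assumptions. A simple perceptron stands in for the SVM so the example stays dependency-free, and all function names, the number of rounds, and the initial sample size are hypothetical rather than taken from the project.

```python
import random


def train_perceptron(positives, negatives, epochs=100):
    """Stand-in linear classifier (NOT an SVM): returns (weights, bias)."""
    dim = len(positives[0])
    weights, bias = [0.0] * dim, 0.0
    data = [(x, 1) for x in positives] + [(x, -1) for x in negatives]
    for _ in range(epochs):
        for x, y in data:
            # Update only on misclassified (or zero-score) examples.
            if y * (sum(w * v for w, v in zip(weights, x)) + bias) <= 0:
                weights = [w + y * v for w, v in zip(weights, x)]
                bias += y
    return weights, bias


def score(model, x):
    """Signed score: positive means 'face' for this stand-in model."""
    weights, bias = model
    return sum(w * v for w, v in zip(weights, x)) + bias


def mine_hard_negatives(positives, negative_pool, rounds=3, initial=10, seed=0):
    """Retrain repeatedly, adding negatives the current model gets wrong."""
    rng = random.Random(seed)
    # Start from a random subsample of the non-face pool.
    negatives = rng.sample(negative_pool, min(initial, len(negative_pool)))
    model = train_perceptron(positives, negatives)
    for _ in range(rounds):
        # Hard negatives: non-face features the model scores as faces.
        hard = [x for x in negative_pool
                if score(model, x) > 0 and x not in negatives]
        if not hard:
            break
        negatives.extend(hard)
        model = train_perceptron(positives, negatives)
    return model, negatives
```

Swapping `train_perceptron` for a linear or kernel SVM trainer leaves the mining loop unchanged, which is what makes the comparison between random subsampling and hard negative mining straightforward to run for both classifier types.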

CS143 Introduction to Computer Vision: Project 4 Face Detection with a Sliding Window
