Face Detection - Kayle Gishen

Overview

We begin by training an SVM on positive and negative examples, using features extracted with a Histogram of Oriented Gradients (HOG) descriptor. For each image in the test set, we run a sliding-window search with the HOG descriptor at several scales of the image, so that faces of different sizes can be detected.
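The multi-scale window search described above can be sketched roughly as follows. This is a minimal illustration, not the code used for this project: the function names, the window size, and the interpretation of a scale as a downsampling factor are all assumptions made for the example.

```python
from typing import Iterator, Sequence, Tuple


def sliding_windows(height: int, width: int, window: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (row, col) of every window that fits fully inside the image."""
    for row in range(0, height - window + 1, step):
        for col in range(0, width - window + 1, step):
            yield row, col


def multiscale_windows(
    height: int, width: int, window: int, step: int, scales: Sequence[float]
) -> Iterator[Tuple[float, int, int]]:
    """Yield (scale, row, col) for each window position at each scale.

    Assumption: a scale s means the image is shrunk by a factor of s, so a
    fixed-size window covers a region s times larger in the original image.
    """
    for scale in scales:
        scaled_h, scaled_w = int(height / scale), int(width / scale)
        for row, col in sliding_windows(scaled_h, scaled_w, window, step):
            yield scale, row, col


# Hypothetical sizes, chosen only to keep the example small.
positions = list(sliding_windows(10, 10, window=6, step=4))
```

Each yielded position would then be cropped, described with HOG, and scored by the SVM.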

Algorithm

A HOG descriptor of 12x12 with a bin size of 8 was used to extract features from the known faces. Several descriptor sizes were tried, and 12x12 gave the best results. The pipeline was set up as a cascade in which each successive level used hard negatives to adjust the SVM. Each test image was converted to patches and passed through the HOG detector one patch at a time, because the HOG implementation performs its own moving window internally.
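The hard-negative step of the cascade can be sketched as a retraining loop: score a pool of non-face features with the current classifier, add the ones it wrongly accepts to the negative set, and train again. The sketch below is an assumption about the general technique, not the project's code. The names `mine_hard_negatives` and `toy_train` are hypothetical, and `toy_train` is a one-dimensional threshold classifier that merely stands in for the SVM.

```python
from typing import Callable, List, Sequence, Tuple

Feature = Sequence[float]
Model = Callable[[Feature], float]  # signed score; > 0 means "face"
Trainer = Callable[[List[Feature], List[Feature]], Model]


def mine_hard_negatives(
    train: Trainer,
    positives: List[Feature],
    negatives: List[Feature],
    negative_pool: List[Feature],
    rounds: int = 2,
) -> Tuple[Model, List[Feature]]:
    """Retrain `rounds` times, each time adding false positives from the pool."""
    negatives = list(negatives)  # copy so the caller's list is not modified
    model = train(positives, negatives)
    for _ in range(rounds):
        # Hard negatives: non-face features the current model scores as faces.
        hard = [f for f in negative_pool if model(f) > 0 and f not in negatives]
        if not hard:
            break
        negatives.extend(hard)
        model = train(positives, negatives)
    return model, negatives


def toy_train(positives: List[Feature], negatives: List[Feature]) -> Model:
    """Stand-in for SVM training: threshold halfway between the two classes (1-D)."""
    threshold = (min(p[0] for p in positives) + max(n[0] for n in negatives)) / 2
    return lambda f: f[0] - threshold


# Toy 1-D data, for illustration only.
model, final_negatives = mine_hard_negatives(
    toy_train,
    positives=[[5.0], [6.0]],
    negatives=[[0.0]],
    negative_pool=[[0.0], [3.0], [4.0]],
)
```

In the toy run, the first model accepts the pool items 3.0 and 4.0, so both are added as negatives and the retrained threshold moves up to reject them.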

Results

Because the HOG pipeline is slow to run, the effects of hard-negative mining and of a non-linear SVM were compared using the faster, low-APR configuration of the detector.
At the baseline, with a step_size of 4 and a start_scale of 3, the APR was 0.354.
When mining for hard negatives was included, the APR dropped to 0.287.
When the SVM was changed to a non-linear one, the APR changed slightly to 0.293, a difference that could be due to the random negatives being used.

However, we were able to reach a significantly higher APR of 0.798 by changing the baseline parameters to a step_size of 2 and a start_scale of 2. This run took a long time (~5 hours) on the test set, because converting the images to patches and then running HOG on each patch is slow.
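Part of the extra running time follows directly from the step size: halving it roughly quadruples the number of windows per scale, since both the row count and the column count about double. The short sketch below illustrates the arithmetic. The image and window sizes are hypothetical, and the function `count_windows` is an illustrative helper, not part of the project's pipeline. The effect of start_scale is left out because it depends on how that parameter is defined.

```python
def count_windows(height: int, width: int, window: int, step: int) -> int:
    """Number of window positions that fit fully inside a height x width image."""
    if height < window or width < window:
        return 0
    rows = (height - window) // step + 1
    cols = (width - window) // step + 1
    return rows * cols


# Hypothetical 100x100 image and 36x36 window, for illustration only.
coarse = count_windows(100, 100, window=36, step=4)  # 17 * 17 positions
fine = count_windows(100, 100, window=36, step=2)    # 33 * 33 positions
ratio = fine / coarse
```

With these example sizes the finer step evaluates 1089 windows instead of 289, a little under four times as many HOG computations at that scale.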

Conclusions

The HOG implementation is one provided on the MATLAB code-sharing site. HOG was chosen because its potential precision is higher than SIFT's: there is no normalization within the algorithm, so the descriptors are more precise. Had the running time not been so long, we could have tried smaller values for the pipeline parameters to increase the APR.