Face Detection with a Sliding Window
by Sam Swarr (sswarr)
CS 143 Fall 2011
[Image: Confidences of 0.3 or higher on the easy picture.]
[Image: Confidences of 0.5 or higher on the easy picture. (It even detects clock faces!)]
[Image: Confidences of 0.3 or higher on the hard picture.]
[Image: Confidences of 0.5 or higher on the hard picture.]
Training the Linear SVM Classifier
In order to train the classifier, I need to feed it features from positive face crops and features from non-face crops. I chose to use SIFT features (via vl_dsift) rather than raw image patches, since they are more robust and invariant. I computed a single SIFT feature for each of the faces in the positive training set, and then obtained an initial collection of 1,000 random negative features by running SIFT on random crops of images containing no faces. With 1,000 of the positive features and the 1,000 negative features, I trained my linear SVM using a lambda value of 50. Here is the performance of the classifier after this first stage of training:
Linear SVM Classifier; lambda = 50.0; Fed 1000 positives and 1000 random negatives
Step-size = 2; Scale-factor = 1.2; Start-scale = 2
    | Stage 1
TPR | 0.489
FPR | 0.009
TNR | 0.491
FNR | 0.011
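For reference, the first-stage pipeline looks roughly like the following MATLAB/VLFeat sketch. The 36x36 crop size, the SIFT bin size, the file and variable names, and the train_svm helper are illustrative placeholders rather than the exact code; only vl_dsift and the lambda of 50 come from the description above.

% One dense SIFT descriptor per (assumed) 36x36 grayscale crop.
im = single(imread('face_crop.jpg'));                % placeholder file name
[~, d] = vl_dsift(im, 'Size', 8, 'Step', 36, 'Fast');
feat = single(d(:, 1))';                             % 128-D SIFT feature for this crop

% Stack 1000 positive and 1000 random negative features into X, with labels
% y (+1 = face, -1 = non-face), then train the linear SVM.
lambda = 50.0;
[w, b] = train_svm(X, y, lambda);                    % placeholder linear SVM solver

% At detection time, a window whose score is positive is called a face.
score = w' * feat' + b;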
In hopes of improving on this, I ran the above SVM over a series of scenes containing no faces. 1,000 of the false positives detected there were converted into SIFT features and added to the pool of negatives, and the SVM was then retrained with these mined hard negatives. Here is the performance of the classifier after two stages of training:
Linear SVM Classifier; lambda = 50.0; Fed 2000 positives and 2000 negatives (half random negatives/half hard negatives)
Step-size = 2; Scale-factor = 1.2; Start-scale = 2
    | Stage 1 | Stage 2
TPR | 0.481   | 0.301
FPR | 0.012   | 0.021
TNR | 0.488   | 0.646
FNR | 0.018   | 0.033
As hoped, mining hard negatives improved precision by over 8%. Note also that the true-negative rate rose dramatically between the two stages, which most likely contributed to the precision improvement.
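Concretely, the mining stage just runs the stage-1 detector over scenes known to contain no faces and keeps whatever it fires on. A rough sketch is below; the 36x36 window, the grayscale conversion, the variable names, and train_svm are placeholders, while the step-size of 2 and scale-factor of 1.2 match the detector settings listed above.

% Slide an (assumed) 36x36 window over every non-face scene at multiple
% scales and collect features that the stage-1 SVM wrongly calls faces.
hard_negs = [];
for i = 1:numel(nonface_scenes)
    im = single(rgb2gray(imread(nonface_scenes{i})));    % assumes RGB input
    scale = 1.0;
    while min(size(im)) * scale >= 36
        scaled = imresize(im, scale);
        for r = 1:2:(size(scaled, 1) - 35)               % step-size = 2
            for c = 1:2:(size(scaled, 2) - 35)
                patch = scaled(r:r+35, c:c+35);
                [~, d] = vl_dsift(patch, 'Size', 8, 'Step', 36, 'Fast');
                f = single(d(:, 1))';
                if w' * f' + b > 0                       % false positive: no faces here
                    hard_negs(end+1, :) = f;             %#ok<AGROW>
                end
            end
        end
        scale = scale / 1.2;                             % scale-factor = 1.2
    end
end

% Retrain on 2000 positives plus 1000 random and 1000 mined hard negatives.
X2 = [X_pos; X_neg_rand; hard_negs(1:1000, :)];
y2 = [ones(size(X_pos, 1), 1); -ones(size(X_neg_rand, 1) + 1000, 1)];
[w, b] = train_svm(X2, y2, 50.0);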
Using a Non-Linear SVM Classifier
In an attempt to improve precision further, I experimented with a non-linear SVM classifier. I first compared the linear and non-linear SVMs using lax detection parameters to speed up testing; the non-linear SVM gave roughly a 10% increase in precision. I then ran the detector with the non-linear SVM using tighter parameters. Here are the results:
Non-Linear SVM Classifier; lambda = 1.0; sigma = 200.0; Fed 2000 positives and 2000 negatives (half random negatives/half hard negatives)
Step-size = 2; Scale-factor = 1.2; Start-scale = 3
*My Personal Best Result*
    | Stage 1 | Stage 2
TPR | 0.498   | 0.494
FPR | 0.001   | 0.001
TNR | 0.499   | 0.499
FNR | 0.002   | 0.005
As you can see, the non-linear SVM outperformed the linear SVM by nearly 13%.
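The sigma above is the kernel bandwidth; assuming a Gaussian (RBF) kernel, training and classification look roughly like the sketch below. train_kernel_svm is a placeholder for the actual solver, and the variable names are illustrative.

% Gaussian (RBF) kernel matrix over the training features (rows of X_train).
sigma  = 200.0;
lambda = 1.0;
sq = sum(X_train.^2, 2);
D2 = bsxfun(@plus, sq, sq') - 2 * (X_train * X_train');  % pairwise squared distances
K  = exp(-D2 / (2 * sigma^2));

[alpha, b] = train_kernel_svm(K, y_train, lambda);        % placeholder kernel SVM solver

% Scoring a new feature f_test (1 x 128) needs kernel values against the
% whole training set instead of a single dot product with a weight vector.
d2 = sq + sum(f_test.^2) - 2 * (X_train * f_test');
k  = exp(-d2 / (2 * sigma^2));
score = sum(alpha .* y_train .* k) + b;                   % positive => face

This is also why the non-linear detector is slower to evaluate: every window costs kernel evaluations against all of the training examples rather than a single dot product, hence the lax parameters used for the quick comparison above.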
Side-by-Side Comparisons
The detector parameters for both were: step-size = 2, scale-factor = 1.2, start-scale = 2
[Image: Linear SVM trained only on random negatives]
[Image: Linear SVM trained on random and mined hard negatives]
Both classifiers were trained on 2000 positives and 2000 negatives (half random and half hard). The detector parameters for both were: step-size = 2, scale-factor = 1.2
[Image: Linear SVM (lambda = 50)]
[Image: Non-linear SVM (lambda = 1; sigma = 200)]
Conclusion
Overall, I'm happy with my results. I was pleased to see that mined hard negatives increased precision over random negatives alone, and that a non-linear SVM increased precision over a linear one. Had I had the time to run the classifier overnight, I would have liked to use more training data and tweak the detector parameters to be even more thorough.