For this assignment, I followed the recommended pipeline, but tried out some different experiments to see if I could improve my results. I primarily tweaked parameters to improve the accuracy of my face detector. As expected, my detector trained with a non-linear SVM outperformed my linear implementation, but only after finding a good combination of lambda/sigma values. In addition, both the linear and non-linear implementations returned a similar number of false positives at low confidence levels.
I started out by using vl_dsift to compute features on image crops, but this process was slow. I ended up using external mex-compiled code, written by our very own psastras, to compute HoG features. Turning crops into features was a simple process: for each row in our crops matrix, I computed one HoG feature, ultimately returning an N x D2 matrix where D2 = (#bins) * windowSize * windowSize. The window size, i.e. the number of subdivisions in the x and y directions, was 6 for both; with 9 orientation bins, this yielded a 1x324 feature vector for each crop (9 * 6 * 6 = 324).
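A rough sketch of that loop is below. The hog function is psastras' mex code, so its exact signature is an assumption here, as is the 36x36 crop size:

```matlab
% Sketch: turn an N x (36*36) crops matrix into an N x 324 feature matrix.
% The hog() call is the mex-compiled function; its signature is assumed.
function features = crops_to_features(crops)
    num_bins = 9;                                % orientation bins
    window_size = 6;                             % subdivisions in x and y
    D = num_bins * window_size * window_size;    % 9 * 6 * 6 = 324
    N = size(crops, 1);
    features = zeros(N, D);
    for i = 1:N
        crop = reshape(crops(i, :), 36, 36);     % assumes 36x36 crops
        features(i, :) = hog(single(crop), window_size);  % assumed signature
    end
end
```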
I implemented this very similarly to the run_detector function. I defined the maximum number of crops per scene internally as ceil(num_crops / number of non-face scenes). Then, for each scene, I first retrieved the detections at each image scale, followed by the bounding boxes for these detections, and lastly the corresponding crops. I then used randperm to pick at most max_crops_per_scene crops from the current scene. After looping over all scenes, I concluded by keeping only num_crops crops, as defined in the proj4.m script. I also used parfor to speed up the mining stage.
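A minimal sketch of that loop, with the helper names (detect_at_all_scales, crops_from_bboxes, scene_paths) standing in for my actual functions and variables:

```matlab
% Sketch of the hard-negative mining loop (helper names are illustrative).
max_crops_per_scene = ceil(num_crops / num_scenes);
mined = cell(num_scenes, 1);
parfor s = 1:num_scenes
    img = im2single(imread(scene_paths{s}));
    [bboxes, confidences] = detect_at_all_scales(img, w, b);  % assumed helper
    crops = crops_from_bboxes(img, bboxes);                   % assumed helper
    n_keep = min(max_crops_per_scene, size(crops, 1));
    keep = randperm(size(crops, 1));
    mined{s} = crops(keep(1:n_keep), :);                      % random subset
end
mined = cell2mat(mined);
mined = mined(1:min(num_crops, size(mined, 1)), :);  % keep only num_crops
```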
For both computational and performance reasons, I decided to mine for hard negatives only once. Mining once led to significant improvements in accuracy compared to simply randomly sampling negatives (for the non-linear SVM in one case, average precision improved by approximately 60%!). I did try mining multiple times with a non-linear SVM, however, and precision seemed to drop below the single-stage result. Perhaps my parameter choices did not play well with multiple mining stages.
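Concretely, the single-round pipeline looks roughly like this (train_svm, get_hard_negatives, and the variable names are illustrative stand-ins, not the actual support-code names):

```matlab
% Stage 1: train on positives + random negatives.
feats  = [pos_feats; rand_neg_feats];
labels = [ones(size(pos_feats, 1), 1); -ones(size(rand_neg_feats, 1), 1)];
[w, b] = train_svm(feats, labels, lambda);          % assumed trainer

% Stage 2: mine hard negatives (false positives) with the stage-1 model,
% then retrain once on the combined negative set.
hard_feats = crops_to_features(get_hard_negatives(scenes, w, b, num_crops));
feats  = [feats; hard_feats];
labels = [labels; -ones(size(hard_feats, 1), 1)];
[w, b] = train_svm(feats, labels, lambda);
```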
*Figure: Linear SVM, 4000 negative crops, lambda=50*
Changing the lambda value here gave a sizeable boost: AP was approximately 10% lower with the stock lambda value of 100. Mining 4000 negative crops also increased performance. Below are detection results for the class photos (a sketch of the training call follows them):
*Figure: Class Easy, detections with at least 0 confidence*

*Figure: Class Easy, detections with at least 0.5 confidence*

*Figure: Class Hard, detections with at least 0 confidence*

*Figure: Class Hard, detections with at least 0.5 confidence*
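The trainer itself comes from the support code, so the exact call isn't reproduced here. As an illustration only, an equivalent linear training step with VLFeat's vl_svmtrain (whose lambda is a regularization weight and may be scaled differently from the starter code's) would look like:

```matlab
% Illustrative linear SVM training with VLFeat (not necessarily the
% trainer used in this project). X is D x N single, Y holds +1/-1 labels.
X = single(feats');                 % vl_svmtrain expects features as columns
Y = double(labels');
lambda = 50;                        % regularization (value from above)
[w, b] = vl_svmtrain(X, Y, lambda);
scores = w' * X + b;                % detection confidences
```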
*Figure: Non-Linear SVM, 4000 negative crops, lambda=6, sigma=3*
Changing the lambda and sigma values here improved performance: with a sigma value of 3 and lambda left at its default value of 1, average precision was approximately 23% worse. Below are detection results for the class photos (a sketch of the kernel computation, where sigma enters, follows them):
*Figure: Class Easy, detections with at least 0 confidence*

*Figure: Class Easy, detections with at least 0.5 confidence*

*Figure: Class Hard, detections with at least 0 confidence*

*Figure: Class Hard, detections with at least 0.5 confidence*
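The sigma parameter enters through the kernel. Assuming the non-linear SVM uses a Gaussian (RBF) kernel, which is what a sigma parameter typically denotes, the kernel matrix is computed as:

```matlab
% Gaussian (RBF) kernel matrix: K(i,j) = exp(-||a_i - b_j||^2 / (2*sigma^2)).
% A is N1 x D, B is N2 x D; vectorized via the squared-distance expansion
% ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a'*b.
function K = rbf_kernel(A, B, sigma)
    sqA = sum(A .^ 2, 2);                                    % N1 x 1
    sqB = sum(B .^ 2, 2);                                    % N2 x 1
    d2 = bsxfun(@plus, sqA, bsxfun(@minus, sqB', 2 * (A * B')));
    d2 = max(d2, 0);                  % guard against negative rounding error
    K = exp(-d2 ./ (2 * sigma ^ 2));
end
```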
We can see from the results on the class photos that at the lowest confidence threshold, both models returned many false positives; however, as the confidence threshold increased, these false positives were quickly pruned. Moreover, the non-linear SVM outperformed the linear implementation by approximately 8%, a gap that could likely be widened with better choices for lambda/sigma, as well as possibly by implementing the cascade architecture.
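That pruning is just a threshold on the returned confidences, along the lines of (variable names illustrative):

```matlab
% Keep only detections at or above a confidence threshold.
min_conf = 0.5;
keep = confidences >= min_conf;
bboxes = bboxes(keep, :);
confidences = confidences(keep);
```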
Overall, increasing the number of negative crops from 1000 to 4000 produced a significant performance boost. For the non-linear SVM, increasing sigma to about 3 and lambda to 6 also produced a performance boost. Both SVMs tended to return a high number of low-confidence false positives, which did not penalize the AP values for either. I suspect that further parameter tuning in the future, as well as a more sophisticated mining strategy (implementing the cascade plus more mining iterations), would improve results.
Shoutout to vmoussav and psastras for helping me reduce the runtimes of my algorithms, as well as to edwallac for helping me better navigate the support code.