In this project, we implemented a face detector based on the sliding window paradigm: it independently classifies every image patch as object or non-object and aggregates these decisions to detect objects in test images. The classifier is trained on a set of positive and negative samples and uses what it learns to decide, at test time, whether each patch contains a face. Here is the general outline of the algorithm:
- Load crops of positive samples and extract positive features using HoG
- Mine hard negatives by finding false positives
- Train the classifier on the above data using a linear and a non-linear SVM
- Evaluate on test set
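The sliding-window step of the outline above can be sketched as follows; the window size and stride here are illustrative values, not the project's actual settings:

```python
import numpy as np

def sliding_windows(image, win_size=36, step=6):
    """Yield (row, col, patch) for every window position in a grayscale image.

    win_size and step are illustrative defaults, not the project's settings.
    """
    h, w = image.shape
    for r in range(0, h - win_size + 1, step):
        for c in range(0, w - win_size + 1, step):
            yield r, c, image[r:r + win_size, c:c + win_size]

# Each yielded patch would be featurized with HoG and scored by the SVM;
# here we just count the window positions in a 100x100 image.
n_windows = sum(1 for _ in sliding_windows(np.zeros((100, 100))))
```

Every patch produced this way is featurized and scored independently, which is what makes the per-patch classifier the core of the detector.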
Results
I ran the algorithm under the following varying parameters, keeping all other variables at their default values:
- Varying window dimensions for HoG.
- Linear and non-linear classification.
I started from an initial window size of 3x3. Increasing it to 6x6 produced a dramatic improvement in average precision. The linear SVMs I trained tended to have precision-recall curves with a fast, roughly exponential falloff that flattens out to linear. Running the algorithm with a non-linear SVM yielded two very different results for the two window sizes: with a window size of 6x6, the precision-recall curve fell off very quickly, with very low recall and low average precision, whereas a window size of 3x3 with the same classifier and the same parameters yielded a result very similar to that of the linear classifier with 6x6 windows.
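To make the window-size effect concrete, here is a simplified HoG-style descriptor (gradient-orientation histograms over non-overlapping cells). It is an illustrative stand-in for the real HoG implementation, with `cell` playing the role of the window size discussed above: smaller cells give a longer, finer-grained feature vector.

```python
import numpy as np

def hog_like_descriptor(patch, cell=6, n_bins=9):
    """Simplified HoG-style descriptor (illustrative, not the project's
    exact code): a gradient-orientation histogram per non-overlapping cell."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    h, w = patch.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            bins = (ang[r:r+cell, c:c+cell] / (180.0 / n_bins)).astype(int) % n_bins
            hist = np.bincount(bins.ravel(),
                               weights=mag[r:r+cell, c:c+cell].ravel(),
                               minlength=n_bins)
            feats.append(hist)
    return np.concatenate(feats)

patch = np.random.default_rng(0).random((36, 36))
feat_3 = hog_like_descriptor(patch, cell=3)  # 12x12 cells of 9 bins each
feat_6 = hog_like_descriptor(patch, cell=6)  # 6x6 cells of 9 bins each
```

On a 36x36 patch, the 3x3 cells produce a 1296-dimensional feature while 6x6 cells produce a 324-dimensional one, so the cell size directly trades spatial resolution against descriptor length and noise sensitivity.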
Linear SVM:

| HoG window size | Histogram bins | Accuracy | TPR | FPR | TNR | FNR | Average precision |
|---|---|---|---|---|---|---|---|
| 3x3 | 9 | 0.935 | 0.461 | 0.025 | 0.475 | 0.04 | 0.165 |
| 6x6 | 9 | 0.985 | 0.488 | 0.004 | 0.496 | 0.011 | 0.350 |
| 12x12 | 9 | 0.998 | 0.498 | 0 | 0.500 | 0.002 | 0.311 |
Non-linear SVM:

| HoG window size | Histogram bins | Accuracy | TPR | FPR | TNR | FNR | Average precision |
|---|---|---|---|---|---|---|---|
| 6x6 | 9 | 1 | 0.5 | 0 | 0.5 | 0 | 0.169 |
| 3x3 | 9 | 0.99 | 0.495 | 0.5 | 0.495 | 0.5 | 0.337 |
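The average precision figures above summarize the precision-recall curve in a single number. A minimal sketch of how AP can be computed from ground-truth labels and classifier scores (not the project's actual evaluation code) is:

```python
import numpy as np

def average_precision(labels, scores):
    """Average precision: the mean of the precision values at each true
    positive, with detections ranked by classifier score (a minimal sketch,
    not the project's actual evaluation code)."""
    order = np.argsort(-np.asarray(scores, dtype=float))  # rank by score, descending
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)                                # true positives so far
    precision = tp / np.arange(1, len(labels) + 1)        # precision at each rank
    return float(precision[labels == 1].mean())

# Toy example: five ranked detections, three of them true faces.
ap = average_precision([1, 0, 1, 1, 0], [0.9, 0.8, 0.7, 0.6, 0.2])
```

Because AP averages precision only at the ranks where true positives occur, a classifier that scores a few negatives above many positives is penalized heavily, which matches the fast falloff seen in the curves described earlier.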
Some Examples: