CSCI1430: Project 4 - Face Detection with Sliding Windows
rdfong
In this project we attempt to identify where faces appear in an arbitrary scene. The main idea is to train a classifier
(either linear or non-linear) with positive training data, negative training data, and then hard negative training data. We then run
a sliding window over the image, applying the contents of each window to the classifier to determine whether or not a face has been found.
Positive Crops
First we need to teach our classifier what faces look like. We take a set of crops that we know are faces and compute
a descriptor for each crop; I chose to use a SIFT descriptor. These positive descriptors are then fed into the SVM.
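A rough sketch of this step, in Python with OpenCV and NumPy rather than the actual project code, might look like the following. It computes one SIFT descriptor from a single keypoint centered on each fixed-size crop, which is a simplification of a denser SIFT description; crop_descriptor and build_positive_features are hypothetical helper names.

    import cv2
    import numpy as np

    def crop_descriptor(crop_gray):
        # One SIFT descriptor from a keypoint centered on the crop; the keypoint
        # size spans the crop so the descriptor summarizes the whole face.
        sift = cv2.SIFT_create()
        h, w = crop_gray.shape
        keypoint = cv2.KeyPoint(w / 2.0, h / 2.0, float(min(h, w)))
        _, desc = sift.compute(crop_gray, [keypoint])
        return desc[0]  # 128-dimensional feature vector

    def build_positive_features(face_crops):
        # face_crops: grayscale face crops, all the same size.
        return np.vstack([crop_descriptor(c) for c in face_crops])

Each row of the resulting matrix gets a +1 label when the SVM is trained.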
Mining Random Negatives
Now we need to teach our classifier what isn't a face. We do the same thing we did with the positive crops, this time with random crops
that we know contain no faces, feeding those descriptors into the SVM as well.
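Under the same assumptions as above, random negative mining could be sketched as below; descriptor_fn stands for whatever descriptor function is used for the positive crops, and the other names are illustrative.

    import random
    import numpy as np

    def sample_random_negatives(non_face_images, descriptor_fn, crop_size=36, per_image=20):
        # non_face_images: grayscale scenes known to contain no faces.
        # Random fixed-size crops from them serve as negative examples.
        features = []
        for img in non_face_images:
            h, w = img.shape
            for _ in range(per_image):
                y = random.randint(0, h - crop_size)
                x = random.randint(0, w - crop_size)
                crop = img[y:y + crop_size, x:x + crop_size]
                features.append(descriptor_fn(crop))
        return np.vstack(features)

These rows get -1 labels; stacking them with the positive features gives the training set for an SVM such as sklearn.svm.SVC (RBF kernel for the non-linear case) or LinearSVC.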
Mining Hard Negatives
Next we find false positives in our data set by feeding our classifier crops from images that we know contain no faces. Any crop that the classifier
labels as positive has its descriptor fed back in as a hard negative, and the classifier is retrained. We do this for some number
of iterations (I did up to 5) in the hope that our classifier becomes more and more accurate.
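A sketch of the hard negative mining loop, assuming a scikit-learn style classifier with fit/predict and +1/-1 labels; mine_hard_negatives and get_candidate_negatives are hypothetical names.

    import numpy as np

    def mine_hard_negatives(clf, X_pos, X_neg, get_candidate_negatives, iterations=5):
        # clf: SVM already trained on the initial positives/negatives.
        # get_candidate_negatives(): returns descriptors of crops drawn from
        # images known to contain no faces.
        for _ in range(iterations):
            candidates = get_candidate_negatives()
            false_positives = candidates[clf.predict(candidates) == 1]
            if len(false_positives) == 0:
                break  # no more mistakes to learn from
            X_neg = np.vstack([X_neg, false_positives])
            X = np.vstack([X_pos, X_neg])
            y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_neg))])
            clf.fit(X, y)  # retrain with the augmented negative set
        return clf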
Sliding Windows
Once we are satisfied with our classifier, we need to run our detector on arbitrary images.
For each image we use a sliding window approach to generate many crops of the image. We compute the SIFT descriptor for each
of these crops and then, using the trained classifier, decide whether or not the crop is a face. Lastly, we can visualize our results
using a precision vs. recall curve (see results below).
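A single-scale sketch of the sliding-window detector, again assuming a scikit-learn style SVM with decision_function; window, step, and thresh are illustrative parameters, and a full detector would also run over an image pyramid to find faces at multiple scales.

    import numpy as np

    def detect_faces(image_gray, clf, descriptor_fn, window=36, step=4, thresh=0.0):
        # Slide a fixed-size window across the image; any crop whose SVM score
        # exceeds thresh is reported as a detection (x, y, score).
        detections = []
        h, w = image_gray.shape
        for y in range(0, h - window + 1, step):
            for x in range(0, w - window + 1, step):
                crop = image_gray[y:y + window, x:x + window]
                feat = descriptor_fn(crop).reshape(1, -1)
                score = clf.decision_function(feat)[0]
                if score > thresh:
                    detections.append((x, y, score))
        return detections

The precision vs. recall curve can then be computed from the scored detections and ground-truth labels, for example with sklearn.metrics.precision_recall_curve.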
Results (using SIFT descriptor)
Linear SVM (.213 average precision)
With just the linear SVM, the results clearly weren't great.
Non-Linear SVM (.372 average precision)
The non-linear SVM did significantly better than the linear one, as expected.
Hard Negatives (number of iterations = 3, average precision = .229)
Interestingly, after 3 iterations of hard negative mining with the non-linear SVM,
the average precision actually decreased. The detector had very high precision at low recall (.96 precision at .15 recall), but recall
never got past .25, resulting in a low average precision.