CS143 Introduction to Computer Vision Project 4 - Face detection with a sliding window
Hung-I Chuang Login:hichuang
 
 
              Introduction
In this project, we use a sliding window model, which independently classifies every image patch as object or non-object, to detect faces in photos. We train a linear or non-linear classifier on strong features extracted from the patches and use a hard negative mining strategy to improve detection accuracy.
 
              Algorithm
Brown University - Computer Science
 
              1. Convert the positive training data (36 by 36 face images) to SIFT features.
2. While ( current_stage <= total_stage ):
    current_stage = 1: Randomly select a number of patches from the non-face training data and compute their SIFT features.
    current_stage > 1: Sample only patches where the classifier returns false positives and add them to the negative features.
3.    Train the classifier using a linear or non-linear SVM (SVM parameters adjusted by experiment).
4.    current_stage += 1
5. Choose the sliding window step size and scale factor, then run the classifier on test patches.
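The staged training loop above can be sketched roughly as follows. This is only an illustration under stated assumptions: features are plain lists of floats rather than SIFT descriptors, and a simple nearest-centroid linear scorer stands in for the SVM so the example has no dependencies. All function names and parameters (`train_with_hard_negatives`, `num_random`, `max_hard`) are hypothetical.

```python
import random

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train_centroid_classifier(positives, negatives):
    """Stand-in for the SVM: linear scorer through the midpoint of the class means."""
    mu_pos, mu_neg = mean_vector(positives), mean_vector(negatives)
    w = [p - q for p, q in zip(mu_pos, mu_neg)]
    b = -sum(wi * (p + q) / 2 for wi, p, q in zip(w, mu_pos, mu_neg))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b

def train_with_hard_negatives(positives, negative_pool, total_stages=2,
                              num_random=50, max_hard=50, seed=0):
    rng = random.Random(seed)
    # Stage 1: start from randomly chosen non-face features.
    negatives = rng.sample(negative_pool, min(num_random, len(negative_pool)))
    score = train_centroid_classifier(positives, negatives)
    for _stage in range(2, total_stages + 1):
        # Later stages: keep only non-face features scored as "face"
        # (false positives). Duplicates of earlier negatives are not removed.
        hard = [x for x in negative_pool if score(x) > 0]
        if len(hard) > max_hard:
            # Too many false positives: randomly subsample them.
            hard = rng.sample(hard, max_hard)
        negatives = negatives + hard
        score = train_centroid_classifier(positives, negatives)
    return score, negatives
```

In the real pipeline the stand-in trainer would be replaced by the linear or non-linear SVM, and `negative_pool` by features extracted from non-face images.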
 
              More Results
 
             
            When mining hard negatives and when testing, we compute SIFT on the whole image instead of on individual crops to speed up the process. In the first few stages we get more false positives than the classifier needs, so we randomly subsample the hard negatives.
Choosing the SVM parameters is very important since they affect the final precision a lot. I tried different sets of parameters for both the linear SVM (lambda) and the non-linear SVM (lambda and sigma).
The figures below show the results of the parameter experiments for the linear and non-linear SVMs. The linear SVM has the highest accuracy when lambda is around 2, and the non-linear SVM gives the best result with (lambda, sigma) = (1.0, 256).
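A parameter sweep like the one described can be written as a small grid search. This is a minimal sketch: `evaluate` is a hypothetical callback that would train the SVM with the given lambda and sigma and return an accuracy or precision score, and it is not part of the original project code.

```python
from itertools import product

def grid_search(evaluate, lambdas, sigmas):
    """Return (best_score, best_lambda, best_sigma) over the parameter grid.

    evaluate(lam, sigma) is assumed to train a classifier with these
    parameters and return a score where higher is better.
    """
    best = None
    for lam, sigma in product(lambdas, sigmas):
        score = evaluate(lam, sigma)
        if best is None or score > best[0]:
            best = (score, lam, sigma)
    return best
```

For a linear SVM, which has no sigma, the same function can be called with a single placeholder value such as `sigmas=[None]` and an `evaluate` that ignores its second argument.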
[Figure: result from the demo on the class photo]
 
             
            The number of hard negative mining rounds also affects the final result. In my experiments, adding hard negatives once works better than adding them more than once. The figures below show the precision-recall curves for random negatives only, hard negatives added once, and hard negatives added twice, with precision of 0.342, 0.356, and 0.344 respectively, all trained with the non-linear SVM.
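For reference, a precision-recall curve and its area can be computed from scored detections as sketched below. This assumes each detection has already been marked as a true or false positive and that the number of ground-truth faces is known. It uses a plain step-function area with no interpolation, so it may not match the course's evaluation script exactly. The function names are hypothetical.

```python
def precision_recall(scores, is_true_positive, num_ground_truth):
    """Precision and recall after each detection, in order of decreasing score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    return precisions, recalls

def average_precision(precisions, recalls):
    """Area under the precision-recall curve, treated as a step function."""
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```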
 
             
             
            To increase average precision, I adjusted the sliding window’s step size and the scale of the test image to increase recall by finding smaller faces. With the step size tuned to 2 and the scale factor to 1.25, the result improved dramatically: average precision reached 0.864 and recall is higher than 90%.
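How the step size and scale factor determine the set of tested windows can be illustrated with a small generator. This is a sketch under assumptions: it models an image pyramid that shrinks by the scale factor at each level and a fixed 36 by 36 window, and the function name and return format are hypothetical. The actual project may have handled scales differently.

```python
def sliding_windows(width, height, window=36, step=2, scale_factor=1.25):
    """Yield (x, y, size) boxes in original-image coordinates.

    At each pyramid level the image is treated as shrunk by `scale`,
    and a fixed-size window is slid over it with the given step.
    """
    scale = 1.0
    while True:
        scaled_w, scaled_h = int(width / scale), int(height / scale)
        if scaled_w < window or scaled_h < window:
            break  # the window no longer fits at this level
        for y in range(0, scaled_h - window + 1, step):
            for x in range(0, scaled_w - window + 1, step):
                # Map the box back to the original image.
                yield (int(x * scale), int(y * scale), int(window * scale))
        scale *= scale_factor
```

A smaller step and a scale factor closer to 1 both produce more windows, which costs more time but gives the classifier more chances to land on a face.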
 
            Last, I increased the amount of training data from 1000 to 1500 to 2000 crops, and the precision increased slightly. The best result I got uses 2000 crops for the positive data and both random and hard negatives. The precision reaches 87.3%.
 
            On the class photo, the detector returns some false positives, but as we increase the confidence threshold, the false positives are filtered out and the true positives with high confidence remain.
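The effect of the confidence threshold can be shown with a one-line filter. This is a sketch that assumes detections are stored as `(box, confidence)` pairs, a representation chosen here for illustration only.

```python
def filter_detections(detections, threshold):
    """Keep only detections whose confidence is at least `threshold`.

    detections: list of (box, confidence) pairs, where box is any object.
    """
    return [(box, conf) for box, conf in detections if conf >= threshold]
```

Raising `threshold` can only shrink the returned list, so low-confidence false positives drop out first while high-confidence detections remain.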
 
            The detector also found all the faces on the Brown CS faculty page. Yeah!