CS143 Introduction to Computer Vision

Project 4: Face Detection with a Sliding Window (by Margaret Kim, mk20)

Objective

To detect faces with a sliding window by training both linear and non-linear classifiers, and to test the trained classifiers on other images.

Sliding Window Model

The sliding window model is conceptually simple: independently classify all image patches as being object or non-object. Sliding window classification is the dominant paradigm in object detection and for one object category in particular -- faces -- it is one of the most noticeable successes of computer vision. For example, modern cameras and photo organization tools have prominent face detection capabilities.
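
As a rough illustration of the idea, here is a minimal Python sketch of sliding window classification. The function names, the default window size of 36 pixels, and the `classify` callback are assumptions made for this example; they are not the project's actual code.

```python
def sliding_windows(height, width, window=36, step=4):
    """Yield the (top, left) corner of every window that fits fully in the image."""
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            yield top, left


def detect(image, classify, window=36, step=4):
    """Classify every patch independently and return the corners flagged as object.

    `image` is a list of rows (a 2-D list of pixel values) and `classify`
    is any function that maps a patch to True (object) or False (non-object).
    """
    height, width = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(height, width, window, step):
        # Cut out the patch that the window currently covers.
        patch = [row[left:left + window] for row in image[top:top + window]]
        if classify(patch):
            hits.append((top, left))
    return hits
```

Each patch is judged on its own, so the detector is only as good as the classifier passed in as `classify`.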

Overview of the Algorithm

The steps of this project are:

  1. Load crops of positive images (ones with just faces).
  2. Generate SIFT features for all of the positive crops.
  3. Load crops of random or hard negative images (ones without any faces).
  4. Generate SIFT features for all of the negative crops.
  5. Train a linear or non-linear SVM classifier using the generated positive and negative features.
  6. Test the trained classifier.
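
The training steps above can be sketched as a staged loop: the first stage uses only random negatives, and each later stage adds hard negatives (negative crops the current classifier wrongly scores as faces) before retraining. This is a minimal sketch under assumptions: `train`, `score`, and the toy one-dimensional threshold classifier are hypothetical stand-ins for feature extraction and SVM training.

```python
def train_with_hard_negatives(positives, random_negatives, negative_pool,
                              train, score, total_stages=2):
    """Stage 1 trains on random negatives; later stages add hard negatives."""
    negatives = list(random_negatives)
    model = train(positives, negatives)
    for _ in range(total_stages - 1):
        # Hard negatives: non-face samples the current model scores as faces.
        hard = [x for x in negative_pool if score(model, x) > 0]
        if not hard:
            break  # nothing left to mine
        negatives.extend(hard)
        model = train(positives, negatives)
    return model


# Toy stand-in classifier (hypothetical): a threshold halfway between class means.
def toy_train(positives, negatives):
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return (mean_pos + mean_neg) / 2


def toy_score(threshold, x):
    return x - threshold  # positive score means "face"


one_stage = train_with_hard_negatives([4, 5, 6], [0, 1], [0, 1, 3, 3.5],
                                      toy_train, toy_score, total_stages=1)
two_stage = train_with_hard_negatives([4, 5, 6], [0, 1], [0, 1, 3, 3.5],
                                      toy_train, toy_score, total_stages=2)
```

In this toy run the second stage moves the threshold toward the positives, because the mined negatives lie close to the boundary.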

Details

Each crop corresponds to the image patch that the sliding window covers at one position.

The features extracted from the images are SIFT features. SIFT stands for scale-invariant feature transform; these features are tolerant to image noise, changes in illumination, uniform scaling, rotation, and minor changes in viewing direction. The features were extracted on a regular grid.

SIFT grid
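
To make the role of the stepsize concrete, here is a small Python sketch that lists the grid points at which descriptors would be computed inside one crop. The 36-pixel crop size and the `margin` parameter are assumptions for illustration; the exact descriptor layout used in the project is not stated here.

```python
def grid_centers(patch_size, step, margin=0):
    """Return the (y, x) points of a regular grid inside a square patch."""
    coords = range(margin, patch_size - margin, step)
    return [(y, x) for y in coords for x in coords]


# Number of grid points per crop for the stepsizes tried in the results.
points_per_step = {step: len(grid_centers(36, step)) for step in (4, 3, 2)}
```

Under these assumptions, stepsizes of 4, 3, and 2 give 81, 144, and 324 grid points per crop, so a smaller stepsize produces a denser and longer feature vector for each patch.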

The SVM classifier for this project treats the features of the positive and negative crops as points in feature space and finds a "line" (more generally, a decision boundary) that divides them. A new patch is then labeled as a face or a non-face according to which side of the "line" its features fall on.
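
The sketch below shows, in plain Python, how such a boundary is evaluated once training is done. For the linear case the score is a weighted sum of the features plus a bias. For the non-linear case an RBF kernel is assumed purely for illustration, since the kernel used in the project is not specified. All names and parameter values are hypothetical.

```python
import math


def linear_score(w, b, x):
    """Linear decision function: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel exp(-gamma * ||u - v||^2) (assumed kernel choice)."""
    return math.exp(-gamma * sum((ui - vi) ** 2 for ui, vi in zip(u, v)))


def kernel_score(support_vectors, coeffs, b, x, gamma=1.0):
    """Non-linear decision function: sum_i coeff_i * K(sv_i, x) + b."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + b


def is_face(score):
    """A patch is labeled a face when it falls on the positive side."""
    return score > 0
```

In both cases only the sign of the score decides the label; the kernel version lets the boundary curve around the training points instead of being a straight line.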

Results

The following results vary the total number of training stages and the stepsize used while generating the SIFT features. We also compare two types of classifier, linear and nonlinear.

Linear SVM vs. Non-linear SVM

Linear, Total stages = 1 (only random-negatives), Stepsize = 4, Accuracy = 0.332
Face detections with total stages of 1 and stepsize of 4

Nonlinear, Total stages = 1 (only random-negatives), Stepsize = 4, Accuracy = 0.478
Face detections with total stages of 1 and stepsize of 4, nonlinear

Linear, Total stages = 2 (random and hard negatives), Stepsize = 4, Accuracy = 0.388
Face detections with total stages of 2 and stepsize of 4

Nonlinear, Total stages = 2 (random and hard negatives), Stepsize = 4, Accuracy = 0.445
Face detections with total stages of 2 and stepsize of 4, nonlinear

Varying Total Stages

Linear, Total stages = 2, Stepsize = 4, Accuracy = 0.388
Face detections with total stages of 2 and stepsize of 4

Linear, Total stages = 5, Stepsize = 4, Accuracy = 0.351
Face detections with total stages of 5 and stepsize of 4

Linear, Total stages = 8, Stepsize = 4, Accuracy = 0.326
Face detections with total stages of 8 and stepsize of 4

Varying Stepsizes

Linear, Total stages = 2, Stepsize = 4, Accuracy = 0.388
Face detections with total stages of 2 and stepsize of 4

Linear, Total stages = 2, Stepsize = 3, Accuracy = 0.416
Face detections with total stages of 2 and stepsize of 3

Linear, Total stages = 2, Stepsize = 2, Accuracy = 0.421
Face detections with total stages of 2 and stepsize of 2

Other Results

Nonlinear, Total stages = 1, Stepsize = 1, Accuracy = 0.471
Face detections with total stages of 1 and stepsize of 1, nonlinear

Nonlinear, Total stages = 1, Stepsize = 2, Accuracy = 0.476
Face detections with total stages of 1 and stepsize of 2, nonlinear

Nonlinear, Total stages = 2, Stepsize = 1, Accuracy = 0.453
Face detections with total stages of 2 and stepsize of 1, nonlinear

Best Result

Nonlinear, Total stages = 1, Stepsize = 4, Accuracy = 0.478
Face detections with total stages of 1 and stepsize of 4, nonlinear