Face Detection with Sliding Window

Charles Yeh, Oct. 2011

Implementation of Single Stage Classifier

Face detection with a sliding window is done by first training a classifier using both positive and negative training images.
Training images are cropped into face-sized patches, then converted into SIFT features which are used in the classifier.

Precision-Recall curve (step=1, 1 stage)

Then, I compared using multiple training stages with using just one. When training in multiple stages, the classifier is adjusted to be more conservative, then it is run on images with no faces to find false positives. These false positives are considered "hard negatives." The previous features and the hard negatives are used to train a new classifier. This process is repeated until the desired number of stages is reached.
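The multi-stage procedure above can be sketched as a short loop. This is only an illustration, not the code behind the results here: the least-squares linear classifier, the function names (`train_linear`, `score`, `train_with_hard_negatives`), and the `threshold` parameter are all assumptions, and features are assumed to be rows of a NumPy array.

```python
import numpy as np

def train_linear(pos, neg):
    """Fit a least-squares linear classifier (assumed stand-in for the real one).

    Returns a weight vector whose last entry is the bias term.
    """
    X = np.vstack([pos, neg])
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def score(w, feats):
    """Signed confidence per feature row; larger means more face-like."""
    return feats @ w[:-1] + w[-1]

def train_with_hard_negatives(pos, neg, non_face_pool, stages, threshold=0.0):
    """Retrain for `stages` rounds, adding false positives as extra negatives.

    `non_face_pool` holds features from images known to contain no faces,
    so anything scoring above `threshold` there is a false positive.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    non_face_pool = np.asarray(non_face_pool, dtype=float)

    w = train_linear(pos, neg)
    for _ in range(stages - 1):
        hard = non_face_pool[score(w, non_face_pool) > threshold]
        if len(hard) == 0:
            break  # no false positives left to learn from
        neg = np.vstack([neg, hard])  # keep the old negatives, add the hard ones
        w = train_linear(pos, neg)
    return w, neg
```

Raising `threshold` makes each mining round pick up fewer negatives, so it mostly controls how many are added per stage.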

Tweaking Step and Scale Size

Decreasing the step size and scale size drastically improved performance. Even with just a single SIFT feature for each patch/crop, accuracy improved as the step size shrank. The largest gain came from going from a step size of 2 to a step size of 1 (compare with the step=1 curve above).
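To see why a smaller step or scale costs more computation while covering more candidate locations, here is a rough sketch that counts the windows a detector would visit. The exact meaning of "step", "start scale", and "scale factor" in the original detector is not stated here, so this sketch assumes the window side is `patch * scale`, that the scale starts at `start_scale` and is multiplied by `scale_factor` until the window no longer fits, and that the stride in image pixels is `step * scale`. The function names and the 36-pixel patch size are hypothetical.

```python
def sliding_windows(img_w, img_h, patch=36, step=4,
                    start_scale=3.0, scale_factor=1.5):
    """Yield (x, y, size) for square windows, in original-image pixels."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    scale = start_scale
    while True:
        size = int(round(patch * scale))
        if size > min(img_w, img_h):
            break  # window no longer fits inside the image
        stride = max(1, int(round(step * scale)))
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                yield x, y, size
        scale *= scale_factor

def count_windows(img_w, img_h, **params):
    """Number of windows the detector would have to classify."""
    return sum(1 for _ in sliding_windows(img_w, img_h, **params))

if __name__ == "__main__":
    for step in (4, 2, 1):
        print("step", step, "->", count_windows(200, 200, step=step), "windows")
```

Under these assumptions, halving the step roughly quadruples the number of windows at each scale. That is consistent with finer steps finding better-aligned matches, at a higher cost.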

Precision-Recall curves

step=4, 1 stage
step=2, 1 stage

Most of the matches for step sizes 2 and 4 were the same, although the matches with step size 2 sometimes fit the ground truth better.

Step = 2 (left), Step = 4 (right)
The match with step=4 actually does better in this case. This may be because the step=2 match on the left scores higher under the classifier, since it resembles the training data more closely, while the step=4 detector on the right stepped over that window.

A lot of the matches were visually (and probably exactly) the same.
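One common way to quantify how well a match "fits the ground truth" is intersection over union (IoU) between the detected box and the annotated box. The sketch below is an assumption about how such a comparison could be made, not necessarily the criterion used for these results. Boxes are taken to be `(x1, y1, x2, y2)` tuples, and the function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two detections that look identical by eye can still differ slightly in IoU. Comparing IoU values would make the step=2 versus step=4 comparison precise.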

Decreasing the "start scale" and "scale factor" improved accuracy even further.

Before

Lowered to start scale=2 and scale factor=1.2

Using multiple stages improved accuracy, though only by a little. The following are precision-recall curves for 1, 3, and 5 stages, all with step=1, scale factor=1.2, start scale=2.

1 Stage, AP = .664

3 Stages, AP = .677

5 Stages, AP = .690
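For reference, the AP (average precision) values summarize each precision-recall curve in a single number. The write-up does not say which AP variant was used, so the sketch below assumes the plain non-interpolated form: average the precision at every rank where a true positive occurs, divided by the number of ground-truth faces. The function name and argument layout are hypothetical.

```python
import numpy as np

def average_precision(scores, is_true_positive, n_ground_truth):
    """Non-interpolated AP plus the precision and recall at each rank.

    scores            confidence of each detection
    is_true_positive  whether each detection matched a ground-truth face
    n_ground_truth    total number of annotated faces
    """
    order = np.argsort(-np.asarray(scores, dtype=float))  # highest score first
    tp = np.asarray(is_true_positive, dtype=bool)[order]

    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(~tp)
    precision = cum_tp / (cum_tp + cum_fp)
    recall = cum_tp / n_ground_truth

    # Sum precision only at ranks that are true positives
    ap = float(np.sum(precision[tp]) / n_ground_truth)
    return ap, precision, recall
```

Plotting `precision` against `recall` gives a curve of the same kind as the ones shown here. Missed faces lower AP because they add to `n_ground_truth` without contributing any precision.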

As expected, the non-linear classifier performed better, even with a larger step size (step=6 versus step=1) and fewer stages.

Final linear result (step=1, scale factor=1.1, start scale=2, 5 stages):

Final non-linear result (step=6, scale factor=1.1, start scale=2, 1 stage):

Linear Matches: