CS 143 / Project 3 / Scene Recognition with Bag of Words

The goal of this project was to implement and compare several image recognition techniques. I implemented two image representations (feature extraction methods) and two classifiers, and evaluated each combination on scene recognition across 15 scene categories and 1500 total images. The specific image representations and classifiers are as follows:

  1. Tiny Images (image representation)
  2. Bag of SIFT (image representation)
  3. 1-Nearest Neighbor (classifier)
  4. Support Vector Machine (classifier)
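As an illustration of item 3, here is a minimal sketch of 1-nearest-neighbor classification. It assumes Euclidean distance and uses made-up two-dimensional features; the function name and toy data are hypothetical and are not taken from the project code.

```python
import math

def nearest_neighbor_classify(train_feats, train_labels, test_feats):
    """Give each test feature the label of its closest training feature (L2 distance)."""
    predictions = []
    for query in test_feats:
        best_label, best_dist = None, math.inf
        for feat, label in zip(train_feats, train_labels):
            dist = math.dist(query, feat)  # Euclidean distance (assumed metric)
            if dist < best_dist:
                best_dist, best_label = dist, label
        predictions.append(best_label)
    return predictions

# Toy 2-D features standing in for real image features.
train_feats = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
train_labels = ["Kitchen", "Kitchen", "Forest"]
print(nearest_neighbor_classify(train_feats, train_labels, [[0.2, 0.1], [4.0, 4.5]]))
```

The classifier has no training step and no parameters beyond the distance metric, which is why no parameter entry for it appears in this report.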

Explanation of Parameters for each Technique

Tiny Image: Resizing each image to 16 x 16 pixels worked best in my tests.
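A rough sketch of how a 16 x 16 tiny image feature could be computed is shown below. Only the 16 x 16 size comes from this report; the block-averaging resize, the zero-mean and unit-length normalization, and the function name are assumptions made for illustration. The sketch also assumes the input is at least 16 pixels in each dimension.

```python
import math

def tiny_image(pixels, size=16):
    """Shrink a grayscale image (list of rows) to size x size by block averaging,
    then flatten and normalize to zero mean and unit length."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for i in range(size):
        r0, r1 = i * height // size, (i + 1) * height // size
        for j in range(size):
            c0, c1 = j * width // size, (j + 1) * width // size
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            small.append(sum(block) / len(block))
    mean = sum(small) / len(small)
    centered = [v - mean for v in small]
    norm = math.sqrt(sum(v * v for v in centered)) or 1.0  # guard against flat images
    return [v / norm for v in centered]

# Synthetic 64 x 64 gradient image standing in for a real photo.
image = [[(r + c) % 256 for c in range(64)] for r in range(64)]
feature = tiny_image(image)
print(len(feature))  # 256 values, one per tiny-image pixel
```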

Bag of SIFT: I used a vocabulary size of 400, built from features sampled from 800 random images. A SIFT step size of 8 worked best for both vocabulary building and bag of SIFT generation.
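The histogram step of a bag of SIFT representation can be sketched as follows. This is a simplified illustration: it assumes the vocabulary is a list of cluster centers (for example from k-means, which this report does not specify), uses toy 2-D descriptors in place of 128-D SIFT descriptors, and uses a hypothetical function name. A vocabulary of 400 words would give a 400-bin histogram per image.

```python
import math

def bag_of_words_histogram(descriptors, vocabulary):
    """Count how many descriptors fall nearest to each visual word, then L1-normalize."""
    counts = [0] * len(vocabulary)
    for desc in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda k: math.dist(desc, vocabulary[k]))
        counts[nearest] += 1
    total = sum(counts) or 1  # avoid dividing by zero when an image yields no descriptors
    return [c / total for c in counts]

# Toy vocabulary of 3 visual words and 4 descriptors from one image.
vocabulary = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
descriptors = [[1.0, 1.0], [9.0, 0.5], [8.0, 1.0], [0.5, 9.0]]
print(bag_of_words_histogram(descriptors, vocabulary))  # [0.25, 0.5, 0.25]
```

Normalizing the histogram keeps images with many descriptors from dominating those with few, which matters because the number of descriptors per image depends on image size and step size.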

SVM: The LAMBDA value that gave the best SVM classification results was 0.00009.
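To show where a value like LAMBDA enters, here is a minimal one-vs-all linear SVM sketch. It assumes LAMBDA is the usual regularization weight in the objective lam/2 * ||w||^2 + mean hinge loss, and it trains with plain full-batch subgradient descent as a stand-in, since this report does not name the solver. All function names, the learning rate, the epoch count, and the toy data are hypothetical.

```python
def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam, epochs=200, lr=0.1):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))) by subgradient descent.
    y holds +1 / -1 labels."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1:  # point violates the margin
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def train_one_vs_all(X, labels, lam):
    """Train one binary SVM per category: that category (+1) against all others (-1)."""
    models = {}
    for cat in sorted(set(labels)):
        y = [1 if label == cat else -1 for label in labels]
        models[cat] = train_linear_svm(X, y, lam)
    return models

def predict(models, x):
    """Pick the category whose SVM gives the highest decision value."""
    return max(models, key=lambda cat: score(*models[cat], x))

# Toy 2-D features; real inputs would be tiny images or bag of SIFT histograms.
X = [[2.0, 0.0], [3.0, 0.5], [-2.0, 0.0], [-3.0, -0.5]]
labels = ["Forest", "Forest", "Office", "Office"]
models = train_one_vs_all(X, labels, lam=0.00009)
print(predict(models, [2.5, 0.2]), predict(models, [-2.5, 0.0]))
```

A small LAMBDA such as 0.00009 puts little weight on the regularization term, so the fit is driven mostly by the hinge loss on the training data.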

Discussion

While testing various parameters I found that passing the 'fast' option to vl_dsift sped things up but also changed accuracy (this is probably expected). For example, with LAMBDA set to 0.00009 for the SVM step, computing the bag of SIFT features with 'fast' yielded 63.9% accuracy, whereas without 'fast' it yielded 68.2%.

Confusion Matrices for 4 different configurations

Below are the confusion matrices for 4 different technique pairs with their related accuracies.
Configuration                      Accuracy
Tiny Image + 1-Nearest Neighbor    20.0%
Tiny Image + SVM                   20.1%
SIFT + 1-Nearest Neighbor          50.5%
SIFT + SVM                         68.2%



Results visualization for final SIFT + SVM recognition pipeline.

Accuracy (mean of diagonal of confusion matrix) is 0.682
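The accuracy figure above can be reproduced with a short sketch. The function below is a hypothetical helper that builds a row-normalized confusion matrix from true and predicted labels and averages its diagonal; the list of 15 values is the per-category accuracy column from the results table in this report.

```python
def mean_diagonal_accuracy(true_labels, predicted_labels, categories):
    """Build a row-normalized confusion matrix and return it with the mean of its diagonal."""
    index = {cat: i for i, cat in enumerate(categories)}
    n = len(categories)
    counts = [[0] * n for _ in range(n)]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1  # rows: true label, columns: predicted label
    matrix = []
    for row in counts:
        total = sum(row) or 1  # avoid dividing by zero for an empty category
        matrix.append([c / total for c in row])
    accuracy = sum(matrix[i][i] for i in range(n)) / n
    return matrix, accuracy

# Toy example with two categories.
matrix, accuracy = mean_diagonal_accuracy(
    ["A", "A", "B", "B"], ["A", "B", "B", "B"], ["A", "B"])
print(matrix, accuracy)  # [[0.5, 0.5], [0.0, 1.0]] 0.75

# Per-category accuracies reported in the results table (Kitchen through Forest).
reported = [0.610, 0.460, 0.570, 0.410, 0.910, 0.580, 0.940, 0.590,
            0.810, 0.630, 0.850, 0.410, 0.710, 0.820, 0.930]
print(round(sum(reported) / len(reported), 3))  # 0.682
```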

[Image columns: Sample training images, Sample true positives]

Category name   Accuracy   False positives (true label)   False negatives (wrong predicted label)
Kitchen         0.610      Store, Store                   Office, Office
Store           0.460      InsideCity, LivingRoom         Kitchen, InsideCity
Bedroom         0.570      LivingRoom, Kitchen            Street, LivingRoom
LivingRoom      0.410      Store, Bedroom                 Bedroom, Bedroom
Office          0.910      Kitchen, Bedroom               Kitchen, Bedroom
Industrial      0.580      InsideCity, Store              LivingRoom, Street
Suburb          0.940      Mountain, Coast                LivingRoom, Store
InsideCity      0.590      Suburb, Store                  Store, Store
TallBuilding    0.810      Store, Street                  Bedroom, InsideCity
Street          0.630      Forest, Bedroom                Store, InsideCity
Highway         0.850      OpenCountry, Industrial        Coast, Mountain
OpenCountry     0.410      Coast, Coast                   Mountain, TallBuilding
Coast           0.710      OpenCountry, OpenCountry       Mountain, OpenCountry
Mountain        0.820      OpenCountry, Bedroom           Highway, OpenCountry
Forest          0.930      OpenCountry, Store             Mountain, Mountain