CS 143 / Project 3 / Scene Recognition with Bag of Words

Tiny images representation and nearest neighbor classifier

The tiny image representation with nearest neighbor classification serves as a baseline against which all subsequent classification pipelines are measured. With tiny images, each image is represented as a scaled-down version of itself (16x16 pixels). For simplicity, the aspect ratios of the original images were not maintained. Each image in the test set is assigned the label of its nearest neighbor in the training set.
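The baseline described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names, the nearest-neighbor pixel sampling used for downscaling, and the Euclidean distance are all assumptions, since the write-up does not state which interpolation or distance metric was used. Images are modeled as plain lists of grayscale rows.

```python
import math


def tiny_image(pixels, size=16):
    """Shrink a grayscale image (list of rows) to size x size, ignoring aspect ratio.

    Nearest-neighbor sampling is an assumption made for this sketch.
    """
    height, width = len(pixels), len(pixels[0])
    feature = []
    for r in range(size):
        src_r = r * height // size  # source row for this output row
        for c in range(size):
            src_c = c * width // size  # source column for this output column
            feature.append(float(pixels[src_r][src_c]))
    return feature  # flattened vector of size * size values


def nearest_neighbor_label(test_feature, train_features, train_labels):
    """Return the label of the training feature closest in Euclidean distance."""
    best_index = min(
        range(len(train_features)),
        key=lambda i: math.dist(test_feature, train_features[i]),
    )
    return train_labels[best_index]


# Toy data: two synthetic training images with different sizes and aspect ratios.
dark = [[10] * 48 for _ in range(32)]
bright = [[240] * 20 for _ in range(40)]
train_features = [tiny_image(dark), tiny_image(bright)]
train_labels = ["Forest", "Coast"]  # labels chosen arbitrarily for the example

query = [[200] * 64 for _ in range(64)]
predicted = nearest_neighbor_label(tiny_image(query), train_features, train_labels)
print(predicted)
```

Because every image is squeezed into the same 16x16 grid, images of different sizes and aspect ratios become directly comparable 256-dimensional vectors, at the cost of discarding most high-frequency detail.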


Accuracy (mean of diagonal of confusion matrix) is 0.225

Category name | Accuracy | False positives with true label | False negatives with wrong predicted label
Kitchen | 0.080 | Suburb, Bedroom | Industrial, OpenCountry
Store | 0.020 | Street, Kitchen | OpenCountry, OpenCountry
Bedroom | 0.180 | LivingRoom, Mountain | Kitchen, Mountain
LivingRoom | 0.100 | Office, Street | Kitchen, Street
Office | 0.180 | Forest, Store | TallBuilding, Coast
Industrial | 0.130 | TallBuilding, Forest | InsideCity, Highway
Suburb | 0.370 | Coast, Office | Industrial, OpenCountry
InsideCity | 0.060 | Kitchen, Office | Highway, Street
TallBuilding | 0.220 | Coast, Forest | Mountain, Mountain
Street | 0.420 | Bedroom, InsideCity | Suburb, Suburb
Highway | 0.560 | Coast, Office | OpenCountry, OpenCountry
OpenCountry | 0.350 | InsideCity, Store | Coast, Highway
Coast | 0.390 | Store, Forest | OpenCountry, Highway
Mountain | 0.190 | InsideCity, Kitchen | Office, Bedroom
Forest | 0.130 | OpenCountry, InsideCity | Industrial, Mountain

Sample training images and sample true positives (image columns) omitted.
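The reported accuracy is the mean of the diagonal of the confusion matrix. The sketch below shows one way that number could be computed, assuming each row of the matrix is normalized by the number of test images in that true category, so that each diagonal entry is a per-category accuracy. The helper names and the two-category toy labels are hypothetical; the fifteen per-category values are the ones listed for the tiny image pipeline.

```python
def confusion_matrix(true_labels, predicted_labels, categories):
    """Row-normalized confusion matrix: rows are true categories, columns are predictions."""
    index = {name: i for i, name in enumerate(categories)}
    counts = [[0] * len(categories) for _ in categories]
    for truth, guess in zip(true_labels, predicted_labels):
        counts[index[truth]][index[guess]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def mean_diagonal(matrix):
    """Average of the per-category accuracies on the diagonal."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


# Toy example with two categories (hypothetical predictions).
categories = ["Kitchen", "Store"]
true_labels = ["Kitchen", "Kitchen", "Kitchen", "Kitchen", "Store", "Store"]
predicted_labels = ["Kitchen", "Store", "Store", "Store", "Store", "Store"]
toy_accuracy = mean_diagonal(confusion_matrix(true_labels, predicted_labels, categories))
print(toy_accuracy)  # 0.625, the mean of 0.25 (Kitchen) and 1.0 (Store)

# Per-category accuracies reported for the tiny image pipeline.
tiny_image_accuracies = [
    0.080, 0.020, 0.180, 0.100, 0.180, 0.130, 0.370, 0.060,
    0.220, 0.420, 0.560, 0.350, 0.390, 0.190, 0.130,
]
overall = sum(tiny_image_accuracies) / len(tiny_image_accuracies)
print(round(overall, 3))  # 0.225
```

Averaging the diagonal weights every category equally, regardless of how many test images it contains.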

Bag of SIFT representation and K-NN classifier

With the bag of SIFT representation, each image is represented as a histogram over a vocabulary of visual words. The vocabulary is generated by extracting SIFT features from the images in the training set and running K-Means on those features; it consists of the centers of the resulting K clusters. The histogram for an image is then built by soft-assigning each of the image's SIFT features to the closest x words in the vocabulary. Finally, the test images are run through a K-NN classifier, and each is classified as the mode of the labels of its k nearest neighbors.
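The histogram construction and the K-NN vote can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: the write-up does not say how the soft assignment is weighted, so here each feature adds an equal share of 1/x to each of its x closest words, and the histogram is normalized to sum to 1. The function names are hypothetical, and short 2-D vectors stand in for real SIFT descriptors. The vocabulary is taken as given (K-Means is not shown).

```python
import math
from collections import Counter


def soft_histogram(descriptors, vocabulary, x=2):
    """Histogram of visual words with soft assignment to the x closest words."""
    histogram = [0.0] * len(vocabulary)
    for descriptor in descriptors:
        ranked = sorted(
            range(len(vocabulary)),
            key=lambda w: math.dist(descriptor, vocabulary[w]),
        )
        for w in ranked[:x]:
            histogram[w] += 1.0 / x  # equal share per word (assumed weighting)
    total = sum(histogram)
    return [v / total for v in histogram] if total else histogram


def knn_label(query, train_histograms, train_labels, k=3):
    """Mode of the labels of the k training histograms closest to the query."""
    ranked = sorted(
        range(len(train_histograms)),
        key=lambda i: math.dist(query, train_histograms[i]),
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy vocabulary of three "visual words" (cluster centers).
vocabulary = [(0, 0), (10, 0), (0, 10)]
example = soft_histogram([(1, 0), (0, 1)], vocabulary, x=2)
print(example)  # [0.5, 0.25, 0.25]

train_histograms = [
    [0.5, 0.25, 0.25],
    [0.6, 0.2, 0.2],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
train_labels = ["Forest", "Forest", "Street", "Coast"]
predicted = knn_label([0.55, 0.25, 0.2], train_histograms, train_labels, k=3)
print(predicted)  # Forest
```

Soft assignment spreads each feature over several nearby words, which makes the histogram less sensitive to features that fall near the boundary between two clusters.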


Accuracy (mean of diagonal of confusion matrix) is 0.556

Category name | Accuracy | False positives with true label | False negatives with wrong predicted label
Kitchen | 0.510 | LivingRoom, Office | Office, Bedroom
Store | 0.450 | InsideCity, InsideCity | LivingRoom, InsideCity
Bedroom | 0.270 | Coast, Kitchen | Kitchen, LivingRoom
LivingRoom | 0.340 | Bedroom, Bedroom | Bedroom, Office
Office | 0.900 | LivingRoom, Kitchen | Kitchen, LivingRoom
Industrial | 0.240 | InsideCity, TallBuilding | InsideCity, TallBuilding
Suburb | 0.920 | Industrial, Street | Street, LivingRoom
InsideCity | 0.560 | Street, Highway | Street, Kitchen
TallBuilding | 0.280 | Street, InsideCity | LivingRoom, Store
Street | 0.610 | Industrial, Store | Store, Suburb
Highway | 0.790 | Coast, Coast | Suburb, OpenCountry
OpenCountry | 0.450 | Coast, Mountain | Forest, Suburb
Coast | 0.510 | OpenCountry, OpenCountry | Bedroom, Highway
Mountain | 0.570 | OpenCountry, Forest | OpenCountry, OpenCountry
Forest | 0.940 | OpenCountry, Mountain | Suburb, Mountain

Sample training images and sample true positives (image columns) omitted.

Bag of SIFT representation and linear SVM classifier

A one-vs-all linear SVM is trained for each category using the training set. A new image is classified by evaluating every SVM on it and choosing the category whose SVM returns the highest value.
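The one-vs-all scheme can be sketched as below. This is only an illustration: the write-up does not say which SVM solver or regularization strength was used, so the trainer here is a simple stand-in that runs subgradient descent on the hinge loss, with a hypothetical `lam` parameter for L2 regularization (set to 0 by default to keep the toy example exact). All function names are assumptions, and one-hot vectors stand in for real bag of SIFT histograms.

```python
def train_linear_svm(features, signs, epochs=20, lr=1.0, lam=0.0):
    """Fit (w, b) by subgradient descent on the hinge loss.

    lam is the L2 regularization weight; lam=0 disables regularization.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, signs):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization step: shrink w (a no-op when lam is 0).
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: push the score of x toward its sign.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def train_one_vs_all(features, labels, categories, **kwargs):
    """Train one binary classifier per category: that category (+1) vs all others (-1)."""
    models = {}
    for category in categories:
        signs = [1.0 if label == category else -1.0 for label in labels]
        models[category] = train_linear_svm(features, signs, **kwargs)
    return models


def predict(models, x):
    """Evaluate every classifier on x and return the category with the highest score."""
    def score(category):
        w, b = models[category]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)


# Toy training set: one idealized histogram per category.
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
labels = ["Forest", "Street", "Coast"]
models = train_one_vs_all(features, labels, ["Forest", "Street", "Coast"])

print(predict(models, [0.7, 0.2, 0.1]))  # Forest
print(predict(models, [0.1, 0.2, 0.7]))  # Coast
```

Taking the maximum raw score, rather than requiring any single classifier to return a positive value, guarantees that every image receives exactly one label even when all classifiers reject it or several accept it.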


Accuracy (mean of diagonal of confusion matrix) is 0.720

Category name | Accuracy | False positives with true label | False negatives with wrong predicted label
Kitchen | 0.630 | Store, Store | Bedroom, Bedroom
Store | 0.650 | LivingRoom, Bedroom | LivingRoom, Highway
Bedroom | 0.530 | LivingRoom, Kitchen | Kitchen, LivingRoom
LivingRoom | 0.380 | Store, Industrial | Bedroom, Office
Office | 0.890 | Kitchen, Industrial | Bedroom, Store
Industrial | 0.590 | LivingRoom, Bedroom | Highway, InsideCity
Suburb | 0.980 | OpenCountry, Mountain | InsideCity, TallBuilding
InsideCity | 0.580 | Street, Suburb | Suburb, TallBuilding
TallBuilding | 0.800 | Kitchen, InsideCity | InsideCity, InsideCity
Street | 0.750 | Highway, Highway | InsideCity, Store
Highway | 0.840 | OpenCountry, Store | Coast, TallBuilding
OpenCountry | 0.580 | Coast, Coast | Highway, Highway
Coast | 0.800 | OpenCountry, InsideCity | Highway, Bedroom
Mountain | 0.870 | Coast, Highway | TallBuilding, OpenCountry
Forest | 0.930 | OpenCountry, Mountain | Mountain, Suburb

Sample training images and sample true positives (image columns) omitted.