CS 143 Project

The goal of this project was to implement several image recognition techniques. We implemented two image representation (feature extraction) techniques and two classification techniques, and evaluated them on scene recognition over 15 scene categories and 1500 total images. The image representations are tiny images and bag of SIFT; the classifiers are 1-nearest neighbor and SVM.

Explanation of Parameters for each Technique

Tiny Image: Tiny images of 16 x 16 pixels worked best in my tests.
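As a rough illustration of the tiny image representation, the sketch below shrinks a grayscale image to 16 x 16 by averaging blocks of pixels and flattens it into a 256-element feature. The function name `tiny_image`, the box-average resizing, and the zero-mean, unit-length normalization are assumptions made for this example; the write-up does not say how the resizing or any normalization was done.

```python
import math

def tiny_image(pixels, size=16):
    """Shrink a 2D grayscale image (list of rows) to size x size by block
    averaging, then return it as a zero-mean, unit-length flat list."""
    height, width = len(pixels), len(pixels[0])
    tiny = []
    for r in range(size):
        # Source rows covered by output row r (always at least one row).
        r0 = r * height // size
        r1 = max(r0 + 1, (r + 1) * height // size)
        for c in range(size):
            c0 = c * width // size
            c1 = max(c0 + 1, (c + 1) * width // size)
            block = [pixels[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            tiny.append(sum(block) / len(block))
    # Assumed normalization step: subtract the mean, then scale to unit length.
    mean = sum(tiny) / len(tiny)
    centered = [v - mean for v in tiny]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered
```

Each image then becomes a single 256-dimensional vector that can be passed directly to a classifier.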

Bag of SIFT: I used a vocabulary size of 400 and sampled 800 random images to extract features from. A SIFT step size of 8 worked best for both vocabulary building and bag of SIFT generation.
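The bag of SIFT step can be sketched as follows: each descriptor from an image is assigned to its nearest vocabulary word, and the image is represented by the histogram of those assignments. This is a minimal sketch, not the code used for the results here. The helper names `nearest_word` and `bag_of_words` are hypothetical, the histogram normalization (dividing by the descriptor count) is an assumption, and the toy 2-dimensional vectors stand in for real 128-dimensional SIFT descriptors and a 400-word vocabulary.

```python
def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor
    (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(vocabulary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def bag_of_words(descriptors, vocabulary):
    """Histogram of nearest-word assignments, normalized to sum to 1."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    # Assumed normalization so images with different descriptor counts compare fairly.
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)

# Toy example: a two-word vocabulary and four descriptors.
vocabulary = [[0, 0], [10, 10]]
descriptors = [[1, 0], [9, 9], [0, 1], [1, 1]]
histogram = bag_of_words(descriptors, vocabulary)  # [0.75, 0.25]
```

A smaller step size produces more descriptors per image, which makes the histogram denser at the cost of more distance computations.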

SVM: The LAMBDA value that I found most successful for SVM classification was 0.00009.
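To show what LAMBDA controls, here is a sketch of the usual regularized hinge-loss objective for a linear SVM. It assumes the solver minimizes LAMBDA/2 times the squared weight norm plus the average hinge loss; the write-up does not name the exact objective, and `svm_objective` is a hypothetical helper written only for this example.

```python
def svm_objective(weights, bias, samples, labels, lam):
    """Regularized hinge loss: lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).
    Labels are +1 or -1."""
    regularizer = 0.5 * lam * sum(w * w for w in weights)
    hinge_total = 0.0
    for x, y in zip(samples, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        hinge_total += max(0.0, 1.0 - y * score)
    return regularizer + hinge_total / len(samples)

# Toy example with the LAMBDA value reported above.
value = svm_objective([1.0, 0.0], 0.0, [[2.0, 0.0], [-0.5, 0.0]], [1, -1], 0.00009)
```

With a LAMBDA this small the regularization term contributes very little, so the fit is driven almost entirely by the margin errors on the training data.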

Discussion

While testing various parameters I found that the 'fast' option for the vl_dsift call speeds up feature extraction but also lowers accuracy (this is probably expected, since 'fast' replaces the Gaussian window with a flat approximation). For example, with LAMBDA set to 0.00009 for the SVM step, using 'fast' for my bag of SIFT calculations yielded 63.9% accuracy, whereas running without 'fast' yielded 68.2%.

Confusion Matrices for 4 different configurations

Below are the confusion matrices for 4 different technique pairs with their related accuracies.

Tiny Image + 1-Nearest Neighbor	Tiny Image + SVM	SIFT + 1-Nearest Neighbor	SIFT + SVM
Accuracy 20.0%	Accuracy 20.1%	Accuracy 50.5%	Accuracy 68.2%
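The 1-nearest neighbor classifier used in two of these configurations can be sketched as below: a test feature receives the label of the closest training feature. The function name is hypothetical and the squared Euclidean distance is an assumption; the write-up does not state which distance was used.

```python
def nearest_neighbor_classify(train_features, train_labels, test_feature):
    """Return the label of the training feature closest to test_feature."""
    best_label, best_dist = None, float("inf")
    for feature, label in zip(train_features, train_labels):
        # Squared Euclidean distance (assumed metric).
        dist = sum((a - b) ** 2 for a, b in zip(feature, test_feature))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Toy example with 2-dimensional features.
train_features = [[0.0, 0.0], [1.0, 1.0]]
train_labels = ["Kitchen", "Forest"]
predicted = nearest_neighbor_classify(train_features, train_labels, [0.9, 0.8])
```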

Results visualization for final SIFT + SVM recognition pipeline.

Accuracy (mean of diagonal of confusion matrix) is 0.682
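The accuracy figure can be reproduced as sketched below: build a confusion matrix whose rows are true categories, normalize each row, and average the diagonal. The function names are hypothetical. The `per_category` list holds the per-category accuracies from the results table, in table order.

```python
def confusion_matrix(true_labels, predicted_labels, categories):
    """Row-normalized confusion matrix: entry [i][j] is the fraction of images
    of category i that were predicted as category j."""
    index = {name: i for i, name in enumerate(categories)}
    n = len(categories)
    counts = [[0] * n for _ in range(n)]
    for truth, prediction in zip(true_labels, predicted_labels):
        counts[index[truth]][index[prediction]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def mean_diagonal(matrix):
    """Average of the per-category accuracies on the diagonal."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)

# Per-category accuracies from the results table (Kitchen ... Forest).
per_category = [0.610, 0.460, 0.570, 0.410, 0.910, 0.580, 0.940, 0.590,
                0.810, 0.630, 0.850, 0.410, 0.710, 0.820, 0.930]
overall = sum(per_category) / len(per_category)  # approximately 0.682
```

Averaging the 15 diagonal entries gives the overall figure of 0.682 reported above.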

Category name	Accuracy	Sample training images	Sample true positives	False positives with true label		False negatives with wrong predicted label
Kitchen	0.610			Store	Store	Office	Office
Store	0.460			InsideCity	LivingRoom	Kitchen	InsideCity
Bedroom	0.570			LivingRoom	Kitchen	Street	LivingRoom
LivingRoom	0.410			Store	Bedroom	Bedroom	Bedroom
Office	0.910			Kitchen	Bedroom	Kitchen	Bedroom
Industrial	0.580			InsideCity	Store	LivingRoom	Street
Suburb	0.940			Mountain	Coast	LivingRoom	Store
InsideCity	0.590			Suburb	Store	Store	Store
TallBuilding	0.810			Store	Street	Bedroom	InsideCity
Street	0.630			Forest	Bedroom	Store	InsideCity
Highway	0.850			OpenCountry	Industrial	Coast	Mountain
OpenCountry	0.410			Coast	Coast	Mountain	TallBuilding
Coast	0.710			OpenCountry	OpenCountry	Mountain	OpenCountry
Mountain	0.820			OpenCountry	Bedroom	Highway	OpenCountry
Forest	0.930			OpenCountry	Store	Mountain	Mountain

Jeff Rasley (jeffra)

CS 143 / Project 3 / Scene Recognition with Bag of Words
