The baseline project was implemented, and with some parameter tuning, an accuracy of 68.2% was achieved.
Implementation of the tiny image features and the nearest neighbor classifier was straightforward. As these were just temporary solutions, no special tricks were used. Still, an accuracy of 20% was attained. (See all results below.)
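The two baselines above can be sketched roughly as follows. This is a minimal numpy illustration, not the actual submission: the block-averaging resize, the crop-to-multiple simplification, and the zero-mean/unit-norm step are assumptions about a typical tiny-image implementation.

```python
import numpy as np

def tiny_image_feature(img, size=16):
    """Downsample a grayscale image to size x size by block averaging,
    then flatten. Zero-mean/unit-norm is a common (optional) tweak.
    Assumes img is a 2-D float array; crops so dims divide evenly."""
    h, w = img.shape
    img = img[: h - h % size, : w - w % size]
    bh, bw = img.shape[0] // size, img.shape[1] // size
    tiny = img.reshape(size, bh, size, bw).mean(axis=(1, 3)).ravel()
    tiny -= tiny.mean()
    norm = np.linalg.norm(tiny)
    return tiny / norm if norm > 0 else tiny

def nearest_neighbor_classify(train_feats, train_labels, test_feats):
    """Assign each test feature the label of its closest training feature
    (1-NN under Euclidean distance)."""
    # Pairwise squared distances, shape (n_test, n_train)
    d2 = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(axis=2)
    return [train_labels[i] for i in d2.argmin(axis=1)]
```

With no learned parameters at all, 1-NN on tiny images is essentially a template matcher, which is consistent with the modest 20% accuracy.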
In sampling SIFT descriptors from images, a step size of 64 pixels was chosen for building the vocabulary, as it gave "tens of thousands of SIFT features." I experimented with normalizing these features, but it actually lowered overall performance.
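To make the role of the step size concrete, here is a hedged sketch of the dense sampling grid that a dense-SIFT extractor (e.g. vl_dsift) walks over; the margin computation is an assumption, since exact boundary handling varies by implementation. The point is only how the step parameter controls feature count.

```python
import numpy as np

def dense_keypoint_grid(height, width, step, bin_size=4):
    """Regular grid of (x, y) sample locations for dense descriptor
    extraction. The margin keeps descriptors fully inside the image;
    its exact value here (2 * bin_size) is illustrative, not canonical."""
    margin = 2 * bin_size
    ys = np.arange(margin, height - margin, step)
    xs = np.arange(margin, width - margin, step)
    return np.array([(x, y) for y in ys for x in xs])
```

Halving the step roughly quadruples the number of descriptors per image, which is why a coarse step of 64 is attractive for vocabulary building and why finer steps quickly get expensive.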
Normalizing the histograms gave a final overall improvement from 61.1% to 68.2%. Changing the step size from 16 to 8 at one point improved accuracy from 40% to 48.5%. I suspect that picking an even smaller step size would further improve performance, but testing this would be slow.
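The histogram normalization that produced the 61.1% to 68.2% jump can be sketched like this. The function and its L2 norm are a plausible reconstruction, not the actual code; L1 normalization would be an equally common choice.

```python
import numpy as np

def bag_of_sifts_histogram(word_indices, vocab_size):
    """Histogram of visual-word assignments for one image, L2-normalized
    so that images with different numbers of sampled descriptors become
    comparable. (L2 norm is an assumption; L1 also works.)"""
    hist, _ = np.histogram(word_indices, bins=np.arange(vocab_size + 1))
    hist = hist.astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

Without normalization, a densely textured image simply has a "bigger" histogram than a smooth one, so distances and SVM scores end up reflecting descriptor count rather than scene content.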
The SVM implementation was straightforward, and a lambda parameter of 0.0001 worked best.
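For reference, the lambda above is the regularization weight in the hinge-loss objective. Below is a self-contained sub-gradient-descent sketch of a binary linear SVM; it is a simple stand-in for whatever solver the project actually used, and the learning rate and epoch count are arbitrary illustrative choices. A one-vs-all wrapper over this gives the multi-class classifier.

```python
import numpy as np

def train_linear_svm(X, y, lam=1e-4, epochs=200, lr=0.05):
    """Binary linear SVM (labels in {+1, -1}) trained by sub-gradient
    descent on:  lam * ||w||^2 + mean(max(0, 1 - y * (X @ w + b))).
    lam plays the role of the lambda parameter mentioned above."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                    # margin-violating examples
        grad_w = 2 * lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

A small lambda like 0.0001 barely shrinks the weights, which fits the bag-of-SIFT setting: the normalized histograms are low-magnitude and fairly clean, so heavy regularization mostly hurts.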
P.S. I sometimes got a rendering glitch in the auto-generated web page of results.