CS 143 / Project 3 / Scene Recognition with Bag of Words

Top Accuracy:   71.5% ± 0.5

Results:

  1. Tiny images representation and nearest neighbor classifier:
    22% (web-results)
  2. Bag of SIFT representation and nearest neighbor classifier:
    55.3% (web-results)
  3. Bag of SIFT representation and linear SVM classifier:
    71.5% (web-results)
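
The write-up does not show its implementation, so the following is only a minimal sketch of how the first pipeline (tiny images with a nearest neighbor classifier) could look. The function names, the 16x16 target size, and the zero-mean, unit-length normalization are assumptions made for illustration, not details taken from the project.

```python
import numpy as np


def tiny_image(image, size=16):
    """Shrink a 2-D grayscale image to size x size and flatten it.

    Assumes the image is at least `size` pixels in each dimension.
    The 16x16 default and the normalization are illustrative choices.
    """
    h, w = image.shape
    # Crop so that both dimensions divide evenly into `size` blocks.
    h_crop, w_crop = (h // size) * size, (w // size) * size
    cropped = image[:h_crop, :w_crop].astype(np.float64)
    # Average each block of pixels to get one value per output pixel.
    blocks = cropped.reshape(size, h_crop // size, size, w_crop // size)
    feature = blocks.mean(axis=(1, 3)).ravel()
    # Zero-mean, unit-length normalization.
    feature -= feature.mean()
    norm = np.linalg.norm(feature)
    return feature / norm if norm > 0 else feature


def nearest_neighbor_predict(train_feats, train_labels, test_feats):
    """Label each test feature with the label of its closest training feature."""
    # Squared Euclidean distance between every test/train pair.
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    d2 = (diffs ** 2).sum(axis=2)
    return [train_labels[i] for i in d2.argmin(axis=1)]
```

The same `nearest_neighbor_predict` sketch would apply unchanged to Bag of SIFT histograms, since it only depends on the feature vectors it is given.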

When I first started the project, the accuracy of the Bag of SIFT representation suffered a drop of about 5%. The cause was a loss of information from downsampling the images: although the workflow was considerably faster, the accuracy decreased noticeably. To improve the performance considerably

Vocabulary Size Performance

The following confusion matrices show the performance of the Bag of SIFT representation with the SVM classifier at these vocabulary sizes: 10, 20, 50, 100, 200, 400, 1000. These tests score lower than the top results because the Bag of SIFT representation here uses a sparser sampling.
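
To make the role of the vocabulary size concrete, here is a minimal sketch of how a visual vocabulary of a given size could be built and how an image's descriptors could be turned into a histogram of that length. It uses a plain NumPy k-means as a stand-in; the function names, iteration count, and seed are hypothetical, and the project's actual clustering and SIFT code are not shown in the write-up.

```python
import numpy as np


def assign_words(descriptors, centers):
    """Return the index of the nearest vocabulary center for each descriptor."""
    diffs = descriptors[:, None, :] - centers[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)


def build_vocabulary(descriptors, vocab_size, iters=10, seed=0):
    """Cluster descriptors into `vocab_size` centers with basic k-means.

    Assumes len(descriptors) >= vocab_size. `iters` and `seed` are
    illustrative defaults, not values from the project.
    """
    rng = np.random.default_rng(seed)
    # Initialize centers from distinct, randomly chosen descriptors.
    idx = rng.choice(len(descriptors), size=vocab_size, replace=False)
    centers = descriptors[idx].astype(np.float64)
    for _ in range(iters):
        labels = assign_words(descriptors, centers)
        for k in range(vocab_size):
            members = descriptors[labels == k]
            if len(members) > 0:  # leave empty clusters where they are
                centers[k] = members.mean(axis=0)
    return centers


def bag_of_words(descriptors, centers):
    """Normalized histogram of visual-word counts for one image."""
    words = assign_words(descriptors, centers)
    hist = np.bincount(words, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()
```

A sweep like the one described above would call `build_vocabulary` once per size (10, 20, 50, 100, 200, 400, 1000) and recompute the histograms each time. The histogram length equals the vocabulary size, so small vocabularies give coarse features and large ones give finer but sparser features.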