Sam Boosalis

Algorithm

Below I describe my algorithm.

Get Features

Densely sample features from each image with a bin size of 4 and a step size of 8.
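
The report does not include code, so here is a minimal sketch of this step in Python. Dense SIFT is my assumption for the feature type (bin size and step size are its usual parameters); OpenCV's SIFT computed on a fixed grid of keypoints stands in for whatever extractor was actually used:

    import cv2

    def dense_features(gray, step=8, bin_size=4):
        # `gray` is a uint8 grayscale image. Place one keypoint every
        # `step` pixels; each descriptor covers 4 spatial bins of
        # `bin_size` pixels, hence the keypoint size of 4 * bin_size.
        sift = cv2.SIFT_create()
        keypoints = [cv2.KeyPoint(float(x), float(y), 4.0 * bin_size)
                     for y in range(0, gray.shape[0], step)
                     for x in range(0, gray.shape[1], step)]
        _, descriptors = sift.compute(gray, keypoints)
        return descriptors  # one 128-dimensional row per grid point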

Clustering

Run k-means on the features pooled from all training images; the resulting cluster centers are the words of the visual vocabulary.
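
A minimal sketch of the clustering step, assuming scikit-learn's KMeans; the default of 200 reflects the vocabulary size that the main results below appear to use:

    from sklearn.cluster import KMeans

    def build_vocabulary(all_descriptors, vocab_size=200):
        # Cluster the pooled descriptors; each of the `vocab_size`
        # cluster centers becomes one visual word.
        kmeans = KMeans(n_clusters=vocab_size, n_init=1, random_state=0)
        kmeans.fit(all_descriptors)
        return kmeans.cluster_centers_  # shape: (vocab_size, 128)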

Create Training Data

Represent each training image as a distribution of words

For each training image, assign each of its features to the nearest word (cluster center) and build a normalized histogram of word counts.
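
A sketch of the representation, assuming the build_vocabulary output above; SciPy's cdist handles the nearest-word assignment:

    import numpy as np
    from scipy.spatial.distance import cdist

    def bag_of_words(descriptors, vocab):
        # Assign each descriptor to its nearest word, count the words,
        # and normalize so the image becomes a distribution over the
        # vocabulary.
        words = np.argmin(cdist(descriptors, vocab), axis=1)
        hist = np.bincount(words, minlength=len(vocab)).astype(float)
        return hist / hist.sum()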

Train SVM

For each class, train a one-versus-all linear SVM that separates that class's histograms (positives) from those of every other class (negatives).
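
A sketch of the training loop, assuming scikit-learn's LinearSVC; the value of C is a placeholder, not a tuned parameter from my experiments:

    from sklearn.svm import LinearSVC

    def train_one_vs_all(train_hists, train_labels, categories, C=1.0):
        # One binary SVM per category: that category's histograms are
        # the positive class, every other image is negative.
        svms = {}
        for cat in categories:
            y = [1 if lbl == cat else -1 for lbl in train_labels]
            svms[cat] = LinearSVC(C=C).fit(train_hists, y)
        return svms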

Classify Test Data

For each test image, build its word histogram, run it through every one-versus-all SVM, and assign the label of the SVM that returns the highest score.
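
A sketch of the aggregation, reusing the names from the sketches above:

    import numpy as np

    def classify(test_hists, svms):
        # Each SVM reports a signed distance to its margin; the image
        # takes the label of the category whose SVM is most confident.
        categories = list(svms)
        scores = np.column_stack([svms[c].decision_function(test_hists)
                                  for c in categories])
        return [categories[i] for i in np.argmax(scores, axis=1)]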

Results

Params

Confusion Matrix

Accuracy

My accuracy, with the above parameters, was 0.6200.

Extra Credit

Different Vocabulary Sizes

I ran my code with the above parameters on vocabulary sizes of 10, 20, 50, 100, 200, and 400.

vocab_size = 10

accuracy =

vocab_size = 20

accuracy =

vocab_size = 50

accuracy = 0.5740

vocab_size = 100

accuracy = 0.6113

vocab_size = 200

accuracy = 0.6200 (this is the vocabulary size used for the main results above)

vocab_size = 400

accuracy = 0.6420

Discussion

As expected, vocabulary size is positively correlated with accuracy: accuracy climbs from 0.5740 with 50 words to 0.6420 with 400 words.
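
The sweep itself is just a loop over the sketch functions above; train_descriptors, test_descriptors, train_labels, test_labels, and categories are hypothetical names for the loaded dataset:

    import numpy as np

    for vocab_size in [10, 20, 50, 100, 200, 400]:
        vocab = build_vocabulary(np.vstack(train_descriptors), vocab_size)
        train_hists = np.array([bag_of_words(d, vocab) for d in train_descriptors])
        test_hists = np.array([bag_of_words(d, vocab) for d in test_descriptors])
        svms = train_one_vs_all(train_hists, train_labels, categories)
        predictions = classify(test_hists, svms)
        accuracy = np.mean([p == t for p, t in zip(predictions, test_labels)])
        print(vocab_size, accuracy)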

Non-linear (RBF) SVM kernel

Params

accuracy = 0.6200

Discussion

While the accuracy happens to be exactly the same as with the linear kernel, the confusion matrix is somewhat different (this can be verified by comparing the two side by side). With more time to tune the parameters, I would have expected a larger gain in accuracy. When I tried more extreme variances (variance = 0.001 and variance = 100), I got dismal results: the SVM ensemble classified every image as the same class. Since SVMs are binary classifiers, the results of the individual SVMs must be aggregated in some way to get multiclass classification. I believe my implementation was correct, but I did not have time to properly tune the parameters, to which SVMs are very sensitive. I therefore attribute the failure to a single one-versus-all SVM (e.g. the bedroom-or-not classifier) returning large values, whose voice then dominated classification. Since these values are not probabilities, there is no principled way to compare the voices of different SVMs.
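
A sketch of the RBF variant, assuming scikit-learn's SVC; translating the variance values above into gamma = 1 / (2 * variance) is my assumption about the kernel parameterization:

    from sklearn.svm import SVC

    def train_rbf_one_vs_all(train_hists, train_labels, categories,
                             variance=1.0):
        # Same one-versus-all scheme as before, but with an RBF kernel.
        # Extreme variances (0.001 or 100) make the kernel nearly a
        # delta or nearly flat, consistent with one SVM's raw scores
        # drowning out the rest of the ensemble.
        svms = {}
        for cat in categories:
            y = [1 if lbl == cat else -1 for lbl in train_labels]
            clf = SVC(kernel="rbf", gamma=1.0 / (2.0 * variance))
            svms[cat] = clf.fit(train_hists, y)
        return svms

The classify sketch above works unchanged with these models, since SVC also exposes decision_function.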