In this project, we implemented a classifier using the bag of words method. The algorithm can be broken down into the following steps:
- Given a collection of 1500 images, collect many local features from them and use k-means clustering to group the features into a vocabulary of visual words.
- For each training image, build a histogram of visual-word frequencies. We do this by taking each feature in the image and finding the cluster center to which it has the smallest Euclidean distance.
- Feed these histograms to an SVM.
- Build a histogram for each test image in the same way and classify it with the SVM we trained.
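
The vocabulary and histogram steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code used in the project: the function names (`nearest_word`, `kmeans`, `word_histogram`) are hypothetical, the features are toy 2-D points (real local descriptors have many more dimensions), and the SVM step is left out.

```python
import math
import random

def nearest_word(feature, vocabulary):
    """Index of the vocabulary centre with the smallest Euclidean distance to `feature`."""
    return min(range(len(vocabulary)), key=lambda k: math.dist(feature, vocabulary[k]))

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; the k returned centres play the role of visual words."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumption: initialise from k random points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_word(p, centres)].append(p)
        # Move each centre to the mean of its members; keep it if the cluster is empty.
        centres = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centres[i]
            for i, members in enumerate(clusters)
        ]
    return centres

def word_histogram(features, vocabulary):
    """Count how many of an image's features are assigned to each visual word."""
    counts = [0] * len(vocabulary)
    for f in features:
        counts[nearest_word(f, vocabulary)] += 1
    return counts

# Toy data: two well-separated groups of "features".
pooled = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
vocabulary = sorted(kmeans(pooled, k=2))
histogram = word_histogram([(0.2, 0.1), (9.0, 9.5), (10.5, 10.5)], vocabulary)
```

The same `word_histogram` call would be used for training and test images, so that both are described in terms of the same vocabulary before being passed to the classifier.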
In my implementation of the algorithm, I discard most of the collected features while building the vocabulary to make the clustering faster: instead of using all of the ~1000 collected features, I initially used a randomly picked set of 50. I also tried running the algorithm with 500 and 1000 random samples, to see how this affects accuracy.
The given default number of words in the built vocabulary is 200. I initially ran the algorithm with this vocabulary size. I also tried vocabulary sizes of 20 and 1000, to see the effect on accuracy and running time. Because execution is slow, I ran the algorithm only once for each parameter value.
Here is the set of runs that I executed. The results were very similar overall, and none of the parameter changes produced a clear improvement over the initial run.
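
The random subsampling described above could look something like the sketch below. The helper name `subsample_features` and its `seed` parameter are hypothetical, and it assumes the features of one image are held in a plain list.

```python
import random

def subsample_features(features, n_samples, seed=None):
    """Return up to n_samples features chosen uniformly at random, without replacement."""
    features = list(features)
    if len(features) <= n_samples:
        # Fewer features than requested: keep them all.
        return features
    return random.Random(seed).sample(features, n_samples)

# Stand-in for ~1000 features collected from one image.
all_features = list(range(1000))
kept = subsample_features(all_features, 50, seed=0)
```

Sampling without replacement avoids feeding duplicate features to the clustering step.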
Runs:
- Initial run with the default vocabulary size and 50 random features used per image.

  | Vocabulary Size | No. Random Features Used Per Image | Accuracy Result |
  | --- | --- | --- |
  | 200 | 50 | 0.6267 |

- This time, I kept the same vocabulary size but sampled 10 times as many feature points as I did in the previous run. I did not observe any improvement.

  | Vocabulary Size | No. Random Features Used Per Image | Accuracy Result |
  | --- | --- | --- |
  | 200 | 500 | 0.6193 |

- In this run, I kept the initial number of random features and increased the vocabulary size to 1000. The result was better than the initial result, but only by a small margin.

  | Vocabulary Size | No. Random Features Used Per Image | Accuracy Result |
  | --- | --- | --- |
  | 1000 | 50 | 0.6420 |

- In this run, I reduced the size of the vocabulary to 20. This is a very small size, and as expected, the results were not satisfactory.

  | Vocabulary Size | No. Random Features Used Per Image | Accuracy Result |
  | --- | --- | --- |
  | 20 | 50 | 0.5513 |
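
To put the differences between runs in perspective, the reported accuracies can be compared against the initial run. The snippet below only restates the four numbers from the runs above; the variable names are illustrative.

```python
# (vocabulary size, random features per image) -> accuracy, copied from the runs above
runs = {
    (200, 50): 0.6267,
    (200, 500): 0.6193,
    (1000, 50): 0.6420,
    (20, 50): 0.5513,
}

baseline = runs[(200, 50)]
# Change in accuracy relative to the initial run, rounded to four decimals.
deltas = {cfg: round(acc - baseline, 4) for cfg, acc in runs.items()}

for (vocab_size, n_features), delta in deltas.items():
    print(f"vocab={vocab_size:4d} features={n_features:3d} change={delta:+.4f}")
```

The changes come out to -0.0074 for 500 features, +0.0153 for a vocabulary of 1000, and -0.0754 for a vocabulary of 20. Since each setting was run only once, differences on the order of 0.01 may well be within run-to-run variation; only the drop for the 20-word vocabulary stands out.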