Background
Bag-of-words models are a popular technique for image classification, inspired by models used in natural language processing. The model ignores or downplays word arrangement (spatial information in the image) and classifies an image using only a histogram of visual-word frequencies. Visual words are identified by clustering a large corpus of example features; each cluster center becomes one word in the vocabulary.
Algorithm
The basic flow of the algorithm:
- Collect a large number of features from a corpus of example images.
- Use k-means to cluster those features into a visual vocabulary.
- For each training image, build a histogram of word frequencies by assigning each feature found in the image to the nearest word in the vocabulary.
- Train an SVM on these histograms.
- Build a histogram for each test image and classify it with the SVM you just trained.
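
The vocabulary and histogram steps above can be sketched in plain Python. This is a minimal illustration, not the implementation behind the results here: the function names (`build_vocabulary`, `word_histogram`, `nearest_word`) are hypothetical, the features are synthetic 2-D points rather than real image descriptors, the k-means is a bare Lloyd's iteration, and normalizing the histogram to sum to 1 is an assumption. The SVM steps are left out; they would normally be handled by a library (scikit-learn's `SVC` is one option).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(feature, vocabulary):
    """Index of the vocabulary word closest to the given feature."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(feature, vocabulary[i]))


def build_vocabulary(features, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers (the visual words)."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in features:
            clusters[nearest_word(f, centers)].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def word_histogram(image_features, vocabulary):
    """Histogram of visual-word frequencies for one image.

    Normalized to sum to 1 (an assumption, so that images with different
    numbers of features are comparable).
    """
    counts = [0] * len(vocabulary)
    for f in image_features:
        counts[nearest_word(f, vocabulary)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


# Synthetic stand-in for real descriptors: two well-separated 2-D blobs.
rng = random.Random(1)
blob_a = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(50)]
blob_b = [(rng.gauss(5, 0.1), rng.gauss(5, 0.1)) for _ in range(50)]

vocabulary = build_vocabulary(blob_a + blob_b, k=2)

# A toy "image" with three features near blob A and one near blob B.
histogram = word_histogram(blob_a[:3] + blob_b[:1], vocabulary)
```

With real data, each feature would be a descriptor vector extracted from an image, and the resulting histograms (one per image) would be the inputs to the classifier.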
Results
Figure: confusion matrix showing the accuracy of the classifier.
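
For reference, a confusion matrix of this kind can be computed from true and predicted labels roughly as follows. This is a small sketch with made-up labels (the class names and predictions are hypothetical and are not the results of this project); rows are true classes and columns are predicted classes.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels, for illustration only.
classes = ["cat", "dog"]
true_labels = ["cat", "cat", "dog", "dog", "dog"]
predicted_labels = ["cat", "dog", "dog", "dog", "cat"]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
overall_accuracy = accuracy(matrix)
```

Off-diagonal entries show which classes are confused with each other, which a single accuracy number hides.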