Miya Schneider (mmschnei)
This project was completed for CSCI1430: Computational Vision, taught by James Hays at Brown University. It attempts to classify scenes using a bag-of-words model and the 15-scene database used in the paper "Spatial Pyramid Matching for Recognizing Natural Scene Categories," published by Lazebnik et al. in 2006.
The project can be broken into five steps:
The resulting SVMs classified scenes with around 61 percent accuracy. The following figure shows the confusion matrix. Red squares mark the strongest correspondence between the SVM for a scene and the scene itself, blue squares mark the weakest, and the colors in between fall somewhere along the spectrum (warm colors indicate higher correspondence than cold colors).
From this matrix, it appears that Tall Building, Mountain, Industrial, and Living Room were the least successful classifiers, while Suburb, Forest, Inside City, Open Country, and Bedroom were the most successful. The lighter blue squares off the diagonal show which scenes were most often confused: Kitchen and Forest, Industrial and Mountain, and Mountain and Office were among them.
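As a rough illustration of how a confusion matrix like this can be read, the sketch below builds one from true and predicted labels, then pulls out the overall accuracy, the per-scene accuracy (the diagonal), and the most-confused scene pairs (the largest off-diagonal entries). It is written in Python for brevity and is not the project's own code. The function names, the row-normalization, and the toy labels are all assumptions made for the example, and the numbers are not the project's results.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Row-normalized matrix: row = true scene, column = predicted scene.

    Row-normalization is an assumption of this sketch; the original figure
    may have been scaled differently.
    """
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = []
    for t in classes:
        row_total = sum(counts[(t, p)] for p in classes)
        matrix.append(
            [counts[(t, p)] / row_total if row_total else 0.0 for p in classes]
        )
    return matrix


def most_confused(matrix, classes, top_k=2):
    """Return the top_k off-diagonal entries as (true, predicted, rate)."""
    pairs = [
        (matrix[i][j], classes[i], classes[j])
        for i in range(len(classes))
        for j in range(len(classes))
        if i != j
    ]
    # sort() is stable, so ties keep their row-major order.
    pairs.sort(key=lambda entry: entry[0], reverse=True)
    return [(t, p, rate) for rate, t, p in pairs[:top_k]]


# Toy labels for three scenes (hypothetical data, for illustration only).
classes = ["forest", "kitchen", "office"]
true_labels = ["forest", "forest", "kitchen", "kitchen", "office", "office"]
predicted = ["forest", "kitchen", "kitchen", "kitchen", "office", "forest"]

matrix = confusion_matrix(true_labels, predicted, classes)

# Overall accuracy: fraction of images whose predicted scene is correct.
accuracy = sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)

# Per-scene accuracy is the diagonal of the row-normalized matrix.
per_class = {name: matrix[i][i] for i, name in enumerate(classes)}

confusions = most_confused(matrix, classes, top_k=2)

print(f"accuracy: {accuracy:.2f}")
print("per-scene accuracy:", per_class)
print("most confused (true, predicted, rate):", confusions)
```

Sorting the diagonal values gives the kind of "most successful" and "least successful" ranking discussed above, while the off-diagonal ranking corresponds to the lighter squares away from the diagonal.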