scene recognition

miya schneider (mmschnei)

overview

This project was completed for CSCI1430: Computational Vision, taught by James Hays at Brown University. It classifies scenes using a bag-of-words model and the 15-scene database used in the paper "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories," published by Lazebnik et al. in 2006.

The project can be broken into five steps:

results

The resulting SVMs classified scenes with around 61 percent accuracy. The following figure shows the confusion matrix. Red squares mark the strongest correspondence between the SVM for a scene and the scene itself, blue squares mark the weakest, and the colors in between fall along the spectrum (warm colors indicate higher correspondence than cold colors).


Confusion Matrix

From this matrix, it appears that Tall Building, Mountain, Industrial, and Living Room were the least successful classifiers, while Suburb, Forest, Inside City, Open Country, and Bedroom were the most successful. It is interesting to note that the lighter blue squares off the diagonal show which scenes were most often confused. Kitchen and Forest, Industrial and Mountain, and Mountain and Office were among these confused scenes.
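For readers who want to reproduce this kind of analysis, the following is a minimal Python sketch of how a confusion matrix, overall accuracy, a per-class ranking, and a list of most-confused pairs can be computed from true and predicted labels. It is not the project's code. The function names, the three-class subset, and the toy labels are all hypothetical and chosen only for illustration, and the numbers it produces are unrelated to the results reported above.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix: entry [i][j] is the fraction of
    images of class i that were predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        counts[index[truth]][index[guess]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def rank_classes(matrix, classes):
    """Classes sorted by their diagonal entry (per-class accuracy), best first."""
    ranked = [(name, matrix[i][i]) for i, name in enumerate(classes)]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def most_confused(matrix, classes, top=3):
    """Largest nonzero off-diagonal entries as (true class, predicted class, rate)."""
    n = len(classes)
    pairs = [
        (classes[i], classes[j], matrix[i][j])
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] > 0
    ]
    return sorted(pairs, key=lambda item: item[2], reverse=True)[:top]


# Toy data (hypothetical, NOT the project's results): 4 test images per class.
classes = ["Forest", "Kitchen", "Office"]
truth = ["Forest"] * 4 + ["Kitchen"] * 4 + ["Office"] * 4
preds = [
    "Forest", "Forest", "Forest", "Forest",
    "Kitchen", "Kitchen", "Forest", "Forest",
    "Office", "Office", "Office", "Kitchen",
]

matrix = confusion_matrix(truth, preds, classes)
accuracy = overall_accuracy(truth, preds)
ranking = rank_classes(matrix, classes)
confused = most_confused(matrix, classes, top=2)

print("accuracy:", accuracy)
print("ranking:", ranking)
print("most confused:", confused)
```

In this sketch the diagonal of the matrix plays the role of the warm squares in the figure, and the entries returned by `most_confused` play the role of the lighter off-diagonal squares.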