The goal of this project is to recognize scenes given a set of training data. The methods used fall into two categories: feature extraction and classification. For feature extraction, I implemented the "tiny images" method and the "bag of words" method built on SIFT descriptors. For classification, I implemented the k-nearest neighbors (KNN) algorithm as well as SVM classification. The steps taken to implement these methods are detailed below.
The "Tiny Images" method of feature extraction takes each image and shrinks it down to a small size. In my program, images are shrunk to 16x16 px. This method of feature extraction accomplishes two things:
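As a rough Python illustration of the 16x16 shrinking step, the sketch below reduces a grayscale image by averaging pixel blocks and flattens the result into a 256-value feature vector. The function name `tiny_image`, the use of NumPy, and block averaging as the resizing method are all assumptions made for illustration; the write-up does not say how the resizing was performed.

```python
import numpy as np

def tiny_image(image, size=16):
    """Shrink a 2-D grayscale image to size x size by averaging pixel blocks.

    Assumes the image is at least `size` pixels along each dimension.
    """
    h, w = image.shape
    # Crop so that both dimensions divide evenly into `size` blocks.
    h_crop, w_crop = (h // size) * size, (w // size) * size
    cropped = image[:h_crop, :w_crop].astype(np.float64)
    # Group pixels into a size x size grid of blocks and average each block.
    blocks = cropped.reshape(size, h_crop // size, size, w_crop // size)
    small = blocks.mean(axis=(1, 3))
    # Flatten the 16x16 thumbnail into a single feature vector.
    return small.ravel()
```

Each image then becomes one fixed-length vector, so images of different sizes can be compared directly by a classifier.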
I first formed a "vocabulary" of visual words by sampling features (SIFT descriptors) from the training set and clustering them via K-means. This resulted in a set of cluster means.
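The clustering step can be sketched in Python with a plain Lloyd's-algorithm K-means. This is a minimal sketch rather than the code used in the project: the function name `build_vocabulary`, the iteration count, the random initialisation, and the use of NumPy are assumptions. The input is taken to be an N x D array of descriptors (D is 128 for standard SIFT).

```python
import numpy as np

def build_vocabulary(descriptors, vocab_size, n_iters=20, seed=0):
    """Cluster descriptors (N x D) with K-means; return vocab_size x D cluster means."""
    rng = np.random.default_rng(seed)
    descriptors = np.asarray(descriptors, dtype=np.float64)
    # Initialise the means with distinct, randomly chosen descriptors.
    idx = rng.choice(len(descriptors), size=vocab_size, replace=False)
    means = descriptors[idx].copy()
    for _ in range(n_iters):
        # Squared Euclidean distance from every descriptor to every mean.
        d2 = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(vocab_size):
            members = descriptors[labels == k]
            if len(members) > 0:  # keep the old mean if a cluster is empty
                means[k] = members.mean(axis=0)
    return means
```

The full distance matrix is built in memory here for clarity; with many sampled descriptors, a library K-means implementation would likely be a better fit.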
When looking at the test images, I proceeded to sample features (SIFT descriptors) from the image. For each feature, I determined which cluster the feature fell into by finding the closest cluster mean to that feature. I did this for each SIFT feature to create a histogram of the number of SIFT features that fell into each cluster. This histogram is the feature representation that gets used during classification.
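The histogram construction described above can be sketched as follows. The function name `bag_of_words_histogram` and the use of NumPy and squared Euclidean distance are assumptions for illustration; `means` stands for the cluster means produced by the K-means step.

```python
import numpy as np

def bag_of_words_histogram(descriptors, means):
    """Count how many descriptors (N x D) fall nearest to each cluster mean (K x D)."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    means = np.asarray(means, dtype=np.float64)
    # Squared Euclidean distance from every descriptor to every cluster mean.
    d2 = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    # Index of the closest cluster mean for each descriptor.
    nearest = d2.argmin(axis=1)
    # One bin per cluster, including clusters that received no descriptors.
    return np.bincount(nearest, minlength=len(means))
```

The returned vector has one entry per visual word, so every image yields a feature of the same length regardless of how many descriptors were sampled from it.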
The KNN method of classification is as follows:
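A generic k-nearest-neighbors classifier can be sketched in Python as shown here. Euclidean distance, majority voting, the default `k=1`, and the function name `knn_classify` are assumptions; the write-up does not state which distance or which value of k was used.

```python
from collections import Counter

import numpy as np

def knn_classify(train_feats, train_labels, test_feats, k=1):
    """Label each test feature by majority vote among its k nearest training features."""
    train = np.asarray(train_feats, dtype=np.float64)
    test = np.asarray(test_feats, dtype=np.float64)
    # Squared Euclidean distance between every test and training feature.
    d2 = ((test[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training features, nearest first.
    nearest = np.argsort(d2, axis=1, kind="stable")[:, :k]
    predictions = []
    for row in nearest:
        votes = Counter(train_labels[i] for i in row)
        # On a tied vote, the label encountered first (the nearer neighbor) wins.
        predictions.append(votes.most_common(1)[0][0])
    return predictions
```

No training step is needed beyond storing the training features and labels, which is why KNN pairs naturally with either feature representation.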
To train 1-vs-all linear SVMs, I determined the parameters w and b of the linear decision function y = wx + b, where x is a training feature vector. For a given class "X", w and b are tuned to best separate the training images that belong to class "X" from the training images of all other classes. This was done for every class "X".
Next, for each test image, I evaluated the decision function of every class and found the one that most strongly classified the image as its class "X". The test image was then assigned to that class.
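The prediction step can be sketched as follows, assuming the per-class parameters have already been trained. Here `W` is a hypothetical C x D array holding one weight vector w per class and `b` a length-C array of offsets; these names, the function name `predict_one_vs_all`, and the use of NumPy are assumptions, and the training itself is not shown.

```python
import numpy as np

def predict_one_vs_all(W, b, feats, class_names):
    """Assign each feature row to the class whose linear score w.x + b is largest."""
    W = np.asarray(W, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    feats = np.asarray(feats, dtype=np.float64)
    # scores[n, c] is the response of class c's classifier on feature n.
    scores = feats @ W.T + b
    # Pick the most confident classifier for every feature.
    best = scores.argmax(axis=1)
    return [class_names[i] for i in best]
```

Taking the largest score, rather than any positive score, guarantees that every test image receives exactly one label even when several classifiers or none of them respond positively.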
The accuracy of my program is as follows:
Tiny Images & KNN: 0.204
Bag of SIFTs & KNN: 0.512
Bag of SIFTs & SVM: 0.639
Bag of SIFTs & SVM
Category name | Accuracy | False positives with true label | False negatives with wrong predicted label |
---|---|---|---|
Kitchen | 0.530 | Office, InsideCity | Industrial, Bedroom |
Store | 0.500 | LivingRoom, TallBuilding | LivingRoom, Industrial |
Bedroom | 0.390 | LivingRoom, Store | Industrial, Office |
LivingRoom | 0.480 | TallBuilding, Industrial | Bedroom, Store |
Office | 0.780 | Store, LivingRoom | LivingRoom, Kitchen |
Industrial | 0.540 | InsideCity, InsideCity | Kitchen, Store |
Suburb | 0.920 | Industrial, Industrial | InsideCity, TallBuilding |
InsideCity | 0.520 | Store, Store | TallBuilding, LivingRoom |
TallBuilding | 0.620 | Highway, Street | LivingRoom, LivingRoom |
Street | 0.700 | Coast, Highway | InsideCity, Industrial |
Highway | 0.790 | OpenCountry, OpenCountry | Store, Mountain |
OpenCountry | 0.430 | Coast, Highway | Store, Mountain |
Coast | 0.720 | OpenCountry, Industrial | Street, Highway |
Mountain | 0.770 | OpenCountry, OpenCountry | Bedroom, TallBuilding |
Forest | 0.900 | TallBuilding, OpenCountry | Store, Mountain |
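As a quick consistency check, the per-category accuracies in the table can be averaged and compared with the overall figure reported for Bag of SIFTs & SVM. The short Python sketch below assumes every category contributes the same number of test images, which the write-up does not state; under that assumption the mean matches the reported 0.639.

```python
# Per-category accuracies copied from the Bag of SIFTs & SVM table.
per_category = {
    "Kitchen": 0.530, "Store": 0.500, "Bedroom": 0.390, "LivingRoom": 0.480,
    "Office": 0.780, "Industrial": 0.540, "Suburb": 0.920, "InsideCity": 0.520,
    "TallBuilding": 0.620, "Street": 0.700, "Highway": 0.790,
    "OpenCountry": 0.430, "Coast": 0.720, "Mountain": 0.770, "Forest": 0.900,
}

# With equal test-set sizes per category (an assumption), the overall
# accuracy is the unweighted mean of the per-category accuracies.
overall = sum(per_category.values()) / len(per_category)
print(f"{overall:.3f}")  # prints 0.639
```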