Project 3: Scene recognition with bag of words
Vibhu Ramani
October 24th, 2011
Extremely brief summary
Starting with a small set of images from the 15-scene database (described in Lazebnik et al. 2006), we sampled features at random and clustered them into a vocabulary of visual words using k-means. After representing each training image in terms of these visual words, we trained an SVM, generating one model per category. Using these models, we then classified each test image into one of the categories and measured our performance.
Base Level Performance (No Optimization)

Accuracy 0.6193
Confusion Matrix
93  1  0  2  2  0  0  0  2  0  0  0  0  0  0
 2 80  0  5  0  3 10  0  0  0  0  0  0  0  0
 1  0 94  0  0  3  1  1  0  0  0  0  0  0  0
 2 10  0 78  3  2  3  0  0  0  1  0  1  0  0
 4  4  1  1 62  0  0  4  4  0  0  0 10  1  9
 8  1  4  2  0 79  3  1  1  0  1  0  0  0  0
 6 22  6  7  0 11 43  2  1  0  2  0  0  0  0
 2  0  0  9 21  3  0 53  3  0  0  2  0  3  4
 0  2  2  0  2  6  0  5 74  1  3  3  1  0  1
 0  0  0  0  0  0  0  0  0 90  2  0  8  0  0
 2  0  0  0  2  1  1  1  2 13 42  2 22 11  1
 6  4  1  8  8  4  3  3  9  2  2 33  4  4  9
 0  0  0  0  9  1  0  0  1 20  5  4 52  2  6
 1  0  0  1  2  4  0  2  3 25 25  1 17  8 11
 1  0  3  2 19  5  0  3  6  5  3  0  4  1 48
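The reported accuracy can be cross-checked against the confusion matrix. The sketch below assumes the 225 values form a 15x15 matrix in row-major order, one row per category (each group of 15 values then sums to 100), and that accuracy is the diagonal sum divided by the total number of test images. The function name is illustrative and not taken from the original project code.

```python
# Base-level confusion matrix from this report, read in row-major order
# (assumption: 15 rows of 15 values, one row per category).
confusion = [
    [93, 1, 0, 2, 2, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
    [2, 80, 0, 5, 0, 3, 10, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 94, 0, 0, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [2, 10, 0, 78, 3, 2, 3, 0, 0, 0, 1, 0, 1, 0, 0],
    [4, 4, 1, 1, 62, 0, 0, 4, 4, 0, 0, 0, 10, 1, 9],
    [8, 1, 4, 2, 0, 79, 3, 1, 1, 0, 1, 0, 0, 0, 0],
    [6, 22, 6, 7, 0, 11, 43, 2, 1, 0, 2, 0, 0, 0, 0],
    [2, 0, 0, 9, 21, 3, 0, 53, 3, 0, 0, 2, 0, 3, 4],
    [0, 2, 2, 0, 2, 6, 0, 5, 74, 1, 3, 3, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 90, 2, 0, 8, 0, 0],
    [2, 0, 0, 0, 2, 1, 1, 1, 2, 13, 42, 2, 22, 11, 1],
    [6, 4, 1, 8, 8, 4, 3, 3, 9, 2, 2, 33, 4, 4, 9],
    [0, 0, 0, 0, 9, 1, 0, 0, 1, 20, 5, 4, 52, 2, 6],
    [1, 0, 0, 1, 2, 4, 0, 2, 3, 25, 25, 1, 17, 8, 11],
    [1, 0, 3, 2, 19, 5, 0, 3, 6, 5, 3, 0, 4, 1, 48],
]


def accuracy_from_confusion(matrix):
    """Fraction of all test images that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


base_accuracy = accuracy_from_confusion(confusion)
print(f"{base_accuracy:.4f}")  # prints 0.6193 (929 of 1500 on the diagonal)
```

The same function applies unchanged to the other confusion matrices in this report.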
(KD Tree, Sample 500 points/image)

Accuracy 0.6093
Confusion Matrix
95  1  0  1  1  0  0  0  0  0  1  0  0  1  0
 2 79  0  6  0  4  9  0  0  0  0  0  0  0  0
 0  0 95  0  0  3  0  2  0  0  0  0  0  0  0
 2 10  1 76  3  3  3  0  0  0  1  0  1  0  0
 4  4  2  1 57  0  0  4  7  0  1  0 10  1  9
 8  0  3  2  0 82  2  2  1  0  0  0  0  0  0
 8 25  9  5  0  5 44  2  0  0  2  0  0  0  0
 2  0  0  8 20  2  1 55  2  0  0  3  0  2  5
 0  3  2  0  3  5  0  8 70  2  1  2  2  0  2
 0  0  0  0  0  0  0  0  0 88  3  0  9  0  0
 2  0  0  0  2  2  1  1  4 12 40  2 23  9  2
 6  4  1 12  8  4  5  3 10  2  1 27  4  3 10
 0  0  0  0  9  1  0  2  0 23  7  2 50  1  5
 2  0  0  2  3  3  0  1  3 22 25  5 19  5 10
 1  0  2  2 15  7  0  2  6  4  4  1  3  2 51
(KD Tree, Sample 50 points/image)

Accuracy 0.6267
Confusion Matrix
96  1  0  0  2  0  0  0  0  0  1  0  0  0  0
 1 79  0  6  0  4 10  0  0  0  0  0  0  0  0
 0  0 94  0  0  3  1  2  0  0  0  0  0  0  0
 0 11  1 78  4  1  3  0  0  0  1  1  0  0  0
 4  4  2  1 61  0  0  3  3  1  1  0 11  1  8
 9  0  3  1  0 81  3  1  1  0  1  0  0  0  0
 3 26  4  4  0 11 48  2  1  0  1  0  0  0  0
 3  0  0  7 21  2  1 49  5  0  0  4  0  3  5
 0  3  2  0  2  3  0  4 75  1  5  3  1  0  1
 0  0  0  0  0  0  0  0  0 88  3  0  9  0  0
 3  0  0  0  1  1  2  0  3 14 47  3 17  6  3
 5  3  0  9  8  5  5  5  9  2  1 32  4  5  7
 1  0  0  0 10  1  0  0  2 20  6  4 52  2  2
 2  0  0  0  2  4  0  3  6 22 24  2 20  7  8
 0  0  2  3 15  6  0  2  6  2  4  1  3  3 53
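Going by their headings, the two runs above differ in how many feature points are sampled per image (500 versus 50). A minimal sketch of that kind of per-image random sampling follows. It assumes sampling without replacement and that images with fewer features than the cap keep all of them. The function name, the seed parameter, and the toy descriptors are hypothetical, not taken from the original project code.

```python
import random


def sample_descriptors(descriptors, max_points, seed=None):
    """Return at most max_points descriptors, chosen uniformly at random.

    Assumption: sampling is without replacement, and an image with
    max_points or fewer descriptors keeps all of them.
    """
    if len(descriptors) <= max_points:
        return list(descriptors)
    rng = random.Random(seed)  # local generator so results are repeatable
    return rng.sample(descriptors, max_points)


# Toy stand-in for one image's features: 500 two-dimensional descriptors.
all_descriptors = [(float(i), float(i % 7)) for i in range(500)]

subset = sample_descriptors(all_descriptors, 50, seed=0)
small_image = sample_descriptors(all_descriptors[:10], 50, seed=0)
print(len(subset), len(small_image))  # prints 50 10
```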
Using Gaussian (3 levels) to get features (KD Tree, Sample 50 points/image)

Accuracy 0.5813
Confusion Matrix
88  2  0  3  1  0  0  1  1  1  1  0  0  0  2
 5 74  0  6  0  5  9  0  0  0  0  0  0  0  1
 0  0 90  0  0  6  2  2  0  0  0  0  0  0  0
 0 15  0 70  4  4  3  1  0  2  0  0  1  0  0
 3  1  1  2 58  0  0  6 10  0  0  2  8  1  8
 5  0  5  2  0 81  2  4  0  1  0  0  0  0  0
12 21  7  7  0 10 39  3  0  0  1  0  0  0  0
 1  0  0 10 23  4  1 51  1  0  0  2  0  3  4
 0  3  2  1  2  7  0  6 66  3  2  1  3  3  1
 0  0  0  0  0  0  0  0  0 93  0  0  7  0  0
 3  0  0  0  2  2  1  3  4 16 40  1 16  8  4
 7  3  1 14  6  9 11  5 12  3  2 13  4  4  6
 0  0  0  0 10  1  0  2  1 25  6  0 51  2  2
 4  0  0  1  2  5  0  3  3 26 17  3 16 15  5
 2  2  3  3 12  7  1  5  6  5  3  1  5  2 43
Using Gaussian (1 level) to get features (KD Tree, Sample 50 points/image)

Accuracy 0.6027
Confusion Matrix
90  1  0  2  2  0  1  1  1  0  1  0  0  0  1
 0 78  1  6  0  5  9  0  0  0  1  0  0  0  0
 0  0 92  0  0  5  1  2  0  0  0  0  0  0  0
 1 15  1 72  1  5  3  1  0  0  0  0  1  0  0
 3  2  1  2 55  0  0  5  7  1  0  2 10  0 12
 8  0  4  3  0 75  3  6  0  0  0  1  0  0  0
 3 23  8  7  1  8 48  2  0  0  0  0  0  0  0
 2  0  0  6 23  4  0 56  2  0  0  2  0  2  3
 0  3  2  0  2  4  0  5 71  1  3  1  2  1  5
 0  0  0  0  0  0  0  0  0 90  2  0  8  0  0
 0  0  0  0  1  3  1  1  5 11 44  3 22  7  2
 6  3  3 12  7  7  4  4  9  3  1 23  6  4  8
 0  0  0  0 10  1  0  0  2 22  6  3 52  1  3
 3  0  0  1  0  5  0  1  8 22 22  2 19  8  9
 3  1  2  3 12  8  0  2  7  4  2  1  4  1 50
Vocab Size 10 words (KD Tree, Sample 50 points/image)

Accuracy 0.4427
Confusion Matrix
17  2  1 26  8 13  7  3  2  0 12  2  1  1  5
 0 87  0  5  0  3  5  0  0  0  0  0  0  0  0
 0  0 95  0  0  4  0  1  0  0  0  0  0  0  0
 0 35  1 51  4  3  3  1  1  0  0  0  0  0  1
 2  3  1  0 46  1  0  4 11  5  1  0 14  2 10
 0  5 11  4  0 58 11  2  7  0  1  1  0  0  0
 0 22  7 16  1 15 35  0  2  0  2  0  0  0  0
 1  2  2  3 33  4  0 19 17  0  6  1  2  3  7
 0  0  3  0 16  2  0  3 62  1  3  1  4  0  5
 0  1  0  1  1  1  0  1  1 88  0  0  5  1  0
 4  0  0  0  6  6  0  8 10 20 24  3 12  2  5
 1  9  3  8  9 17  1  5 16  3  2  9  4  3 10
 0  0  0  1 13  1  0  1  4 32  7  0 40  0  1
 1  0  0  0  3  4  0  6  7 24 17  3 23  6  6
 1  3  4  5 20  6  0  1 19  3  4  3  3  1 27
Vocab Size 20 words (KD Tree, Sample 50 points/image)

Accuracy 0.5147
Confusion Matrix
53  1  0 15  5  6  5  4  1  1  1  3  1  2  2
 0 78  1 11  0  2  7  0  0  1  0  0  0  0  0
 1  0 93  0  0  5  0  1  0  0  0  0  0  0  0
 5 20  0 55  2 10  4  2  0  0  0  0  0  0  2
 2  3  1  1 49  0  0  9 19  1  1  3  9  0  2
 1  0  4  5  0 86  1  0  2  0  0  1  0  0  0
 5 25 10 11  0 13 33  3  0  0  0  0  0  0  0
 4  1  0  8 20  2  0 49  3  2  0  1  1  2  7
 0  2  1  0  1  3  0 12 67  2  2  6  4  0  0
 0  0  0  0  0  0  0  1  0 87  2  0  9  1  0
 0  1  0  2  2  0  0  6  8 19 30  5 18  4  5
 5  7  2  7  8  8  5  7 10  5  6 21  1  2  6
 1  0  0  0 10  1  0  3  7 28 11  1 38  0  0
 4  0  0  0  3  2  0  2  9 25 23  8 16  5  3
 0  5  3  3 16  7  0  4 13  2  4  4  7  4 28
Vocab Size 50 words (KD Tree, Sample 50 points/image)

Accuracy 0.5767
Confusion Matrix
82  2  0  6  2  0  1  1  0  0  1  1  0  1  3
 2 80  0  7  0  2  9  0  0  0  0  0  0  0  0
 1  0 94  0  0  5  0  0  0  0  0  0  0  0  0
 3 14  0 72  0  4  3  0  1  1  0  1  0  0  1
 5  3  0  1 52  0  0 10  5  1  0  1 13  2  7
 8  0  4  4  0 78  4  1  1  0  0  0  0  0  0
 4 31  7 12  0 10 33  2  1  0  0  0  0  0  0
 3  1  0  9 19  1  2 57  1  0  0  3  0  1  3
 0  2  1  0  4  5  0  5 64  1  4  5  5  0  4
 0  0  0  0  0  0  0  0  0 86  3  0 11  0  0
 2  0  0  0  3  1  1  0  4 17 37  5 23  5  2
 9  2  1 11  5  6  4  9 13  4  1 20  6  2  7
 1  0  0  0  9  1  0  1  2 17  7  2 55  3  2
 4  0  0  0  2  3  0  2  7 19 20  6 18 11  8
 4  0  2  3 15  7  2  1  8  3  3  1  6  1 44
Vocab Size 100 words (KD Tree, Sample 50 points/image)

Accuracy 0.6053
Confusion Matrix
88  1  0  2  3  0  2  0  1  0  1  1  0  1  0
 4 75  0  7  2  1 11  0  0  0  0  0  0  0  0
 1  0 94  0  0  4  0  1  0  0  0  0  0  0  0
 2 13  1 70  3  5  3  0  0  0  1  0  0  1  1
 3  2  0  2 62  0  2  5  5  0  0  0  8  1 10
 6  1  4  2  0 81  2  0  2  0  1  1  0  0  0
 3 26  9  8  1 11 39  2  0  0  1  0  0  0  0
 1  0  0  8 20  4  0 56  1  0  1  0  0  3  6
 0  2  1  0  2  4  0  3 80  1  2  3  1  0  1
 0  0  0  0  0  0  0  0  0 87  3  0 10  0  0
 2  0  0  0  0  2  1  0  2 11 46  3 21  9  3
 3  5  1 11  6  7  5  6 12  2  2 26  3  3  8
 1  0  0  0 10  1  0  1  2 23 10  2 44  3  3
 1  0  0  0  2  3  0  3  8 23 20  1 19 10 10
 0  0  2  4 16  5  0  2  7  2  4  0  6  2 50
Vocab Size 400 words (KD Tree, Sample 50 points/image)

Accuracy 0.6107
Confusion Matrix
94  1  0  1  1  0  0  1  1  0  1  0  0  0  0
 0 79  1  8  0  3  9  0  0  0  0  0  0  0  0
 0  0 95  0  0  3  0  2  0  0  0  0  0  0  0
 0  9  1 76  4  3  4  0  0  1  1  1  0  0  0
 4  3  1  1 60  0  0  4  5  1  1  1 12  0  7
 6  0  6  2  0 79  4  1  1  0  1  0  0  0  0
 4 23 16  6  0  7 39  2  1  0  2  0  0  0  0
 0  0  0 12 23  4  0 52  3  0  0  2  0  2  2
 0  3  2  0  1  3  0  8 74  2  3  3  0  0  1
 0  0  0  0  0  0  0  0  0 87  3  0 10  0  0
 1  1  0  0  2  1  2  0  3 14 41  2 23  8  2
 7  2  3  9  5  6  6  3  6  2  1 34  5  3  8
 1  0  0  0 12  1  0  1  2 20  8  1 52  1  1
 0  0  0  2  4  4  0  1  3 29 18  2 22  6  9
 0  1  4  1 17  6  0  3  5  4  3  1  5  2 48
Vocab Size 1000 words (KD Tree, Sample 50 points/image)

Accuracy 0.6120
Confusion Matrix
95  1  0  0  1  0  0  1  1  1  0  0  0  0  0
 3 80  1  4  0  3  9  0  0  0  0  0  0  0  0
 0  0 95  0  0  3  0  2  0  0  0  0  0  0  0
 1  9  0 78  4  3  3  0  0  0  1  0  1  0  0
 4  4  1  1 58  0  0  3  7  2  1  0 11  0  8
 5  2  6  1  0 79  2  4  0  1  0  0  0  0  0
 4 27 23  6  0  7 30  2  1  0  0  0  0  0  0
 0  1  0 12 19  3  1 55  3  0  0  0  0  1  5
 0  3  2  0  1  3  0  5 78  3  2  1  1  0  1
 0  0  0  0  0  0  0  0  0 91  1  0  8  0  0
 1  1  1  0  1  2  2  0  3 14 38  4 23  9  1
 3  4  2  7  8  9  4  5  9  2  1 33  5  2  6
 0  0  0  0 12  1  0  0  5 19  4  3 53  1  2
 0  0  0  0  2  5  0  3  6 33 15  3 18  7  8
 0  1  4  3 16  6  0  2  5  3  4  2  4  2 48
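For context on what the vocabulary size controls: in a bag-of-words representation, each image becomes a vector with one bin per visual word, so a 10-word vocabulary gives a 10-bin vector and a 1000-word vocabulary a 1000-bin one. The sketch below is a minimal illustration under stated assumptions: Euclidean nearest-word assignment, a histogram normalized by the descriptor count, and brute-force search in place of the KD tree named in the run headings. The function names and the toy 2-D data are hypothetical, not taken from the original project code.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to descriptor (Euclidean)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def bag_of_words(descriptors, vocabulary):
    """Normalized histogram of visual-word counts for one image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors) or 1  # avoid dividing by zero for empty input
    return [counts[i] / total for i in range(len(vocabulary))]


# Toy data: three 2-D "visual words" and four descriptors from one image.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]

histogram = bag_of_words(descriptors, vocabulary)
print(histogram)  # prints [0.25, 0.5, 0.25]
```

With only a few words, visually different features are forced into the same bin, which is one plausible reading of the lower accuracy at 10 and 20 words. From 100 words upward the reported accuracies change very little (0.6053, 0.6107, 0.6120).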