CS1290 Final Project: The Design of High-Level Features for Photo Quality Assessment

Name: Chen Xu
login: chenx

Features

The original paper designs seven high-level features intended to indicate photo quality, that is, whether a photo is professional or just a snapshot. The features are as follows:

Spatial Distribution of Edges: the edge spatial distributions of professional images and snapshots are learned as mean distributions, Mp and Ms. Quality ql = ds - dp, where dp and ds are the distances from the image's edge distribution to Mp and Ms;
Area of Bounding Box: the bounding box encloses the middle 96.04% of the edge energy. Quality qb = 1 - wx * wy, where wx and wy are the box width and height as fractions of the image size;
Color Distribution: each image is transformed into a 4096-dimensional color histogram, and quality is defined as the difference between the numbers of professional images and snapshots among its 5 nearest neighbors, qcd = np - ns;
Hue Count: the hue of each image is quantized into 20 bins. Quality qh = 20 - |N|, where N is the set of prominent hues;
Blur: quality is the number of frequencies with high enough energy, normalized by the size of the image, qf = |C|/|Ib|;
Contrast: the contrast quality, qct, is equal to the width of the middle 98% of the mass of the color histogram;
Brightness: the average intensity of the image.
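
Several of these features reduce to a few array operations. The following is a minimal Python sketch, not the code used for this project. It assumes NumPy, images stored as float arrays in [0, 1], and an HSV array and an edge energy map that are computed elsewhere; all function names are hypothetical. The hue filter thresholds (saturation above 0.2, value between 0.15 and 0.95), the prominence factor alpha = 0.05 and the blur threshold theta = 5 are assumed values, and the blur sketch skips any pre-smoothing. The two learned features (spatial distribution of edges and color distribution) are left out because they need the training set.

```python
import numpy as np


def middle_mass_width(profile, mass=0.98):
    """Number of bins overlapped by the middle `mass` of a non-negative 1-D profile."""
    cdf = np.cumsum(profile) / np.sum(profile)
    tail = (1.0 - mass) / 2.0
    lo = int(np.searchsorted(cdf, tail, side="right"))       # first bin with cdf > tail
    hi = int(np.searchsorted(cdf, 1.0 - tail, side="left"))  # first bin with cdf >= 1 - tail
    return hi - lo + 1


def bounding_box_quality(edge_energy, mass=0.98):
    """qb = 1 - wx * wy; keeping 98% per axis gives 0.98^2 = 96.04% overall."""
    e = np.asarray(edge_energy, dtype=float)
    wx = middle_mass_width(e.sum(axis=0), mass) / e.shape[1]  # column projection
    wy = middle_mass_width(e.sum(axis=1), mass) / e.shape[0]  # row projection
    return 1.0 - wx * wy


def hue_count_quality(hsv, bins=20, alpha=0.05):
    """qh = bins - |N|, where N is the set of hue bins above alpha * max bin."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    usable = (s > 0.2) & (v >= 0.15) & (v <= 0.95)  # assumed filter
    if not usable.any():
        return bins  # no usable hue, so N is empty
    hist, _ = np.histogram(h[usable], bins=bins, range=(0.0, 1.0))
    return bins - int(np.sum(hist > alpha * hist.max()))


def blur_quality(gray, theta=5.0):
    """qf = |C| / |Ib|: fraction of frequencies whose magnitude exceeds theta."""
    spectrum = np.abs(np.fft.fft2(np.asarray(gray, dtype=float)))
    return float(np.mean(spectrum > theta))


def contrast_quality(rgb, mass=0.98, levels=256):
    """Width, in gray levels, of the middle 98% of the combined R+G+B histogram."""
    hist, _ = np.histogram(np.asarray(rgb).ravel(), bins=levels, range=(0.0, 1.0))
    return middle_mass_width(hist, mass)


def brightness(image):
    """Average intensity of the image."""
    return float(np.mean(image))
```

For example, an edge map whose energy sits in a 2 x 5 block of a 10 x 10 image gives wx = 0.5 and wy = 0.2, so qb = 0.9.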

Classification

After obtaining the quality values of each feature for both the positive and negative training examples, I use a naive Bayes classifier for classification. For continuous features, such as the spatial distribution of edges and the area of the bounding box, I discretize the values into a reasonable number of bins and build histogram distributions. The quality of an image considering all features is then obtained by combining the per-feature distributions under the naive Bayes independence assumption.
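
As an illustration, assume the combined quality takes the usual naive Bayes form, the posterior ratio

q_all = P(Prof | q1..qn) / P(Snap | q1..qn) = [P(Prof) / P(Snap)] * product over i of P(qi | Prof) / P(qi | Snap).

The sketch below computes the logarithm of that ratio from per-feature histograms. It is only a sketch under stated assumptions: the class name, the equal-width binning, the add-one smoothing and the default of 20 bins are my choices for illustration, not necessarily what was used for the results in this report.

```python
import numpy as np


class HistogramNaiveBayes:
    """Naive Bayes over discretized features (hypothetical helper)."""

    def __init__(self, n_bins=20, prior_ratio=1.0):
        self.n_bins = n_bins            # assumed bin count, should be tuned
        self.prior_ratio = prior_ratio  # P(Prof) / P(Snap)
        self.edges = []
        self.log_ratios = []

    def fit(self, X_prof, X_snap):
        """X_prof, X_snap: arrays of shape (n_images, n_features)."""
        X_prof = np.asarray(X_prof, dtype=float)
        X_snap = np.asarray(X_snap, dtype=float)
        self.edges, self.log_ratios = [], []
        for j in range(X_prof.shape[1]):
            both = np.concatenate([X_prof[:, j], X_snap[:, j]])
            edges = np.linspace(both.min(), both.max(), self.n_bins + 1)
            h_prof, _ = np.histogram(X_prof[:, j], bins=edges)
            h_snap, _ = np.histogram(X_snap[:, j], bins=edges)
            # Add-one smoothing keeps empty bins from producing log(0).
            p_prof = (h_prof + 1) / (h_prof.sum() + self.n_bins)
            p_snap = (h_snap + 1) / (h_snap.sum() + self.n_bins)
            self.edges.append(edges)
            self.log_ratios.append(np.log(p_prof) - np.log(p_snap))
        return self

    def log_quality(self, X):
        """log q_all for each row of X; larger means more professional."""
        X = np.asarray(X, dtype=float)
        total = np.full(X.shape[0], np.log(self.prior_ratio))
        for j, (edges, log_ratio) in enumerate(zip(self.edges, self.log_ratios)):
            # Inner edges map each value to a bin index in 0 .. n_bins - 1;
            # values outside the training range fall into the end bins.
            idx = np.digitize(X[:, j], edges[1:-1])
            total += log_ratio[idx]
        return total
```

Working in log space turns the product into a sum and avoids underflow when many features are combined. Ranking the test images by log_quality gives the same order as ranking them by q_all.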

Dataset

I'm using the database provided by James. The images are from DPChallenge.com. There are 13,412 images in total, divided into 4 parts: 3339 positive training images, 3470 negative training images, 3299 positive test images and 3304 negative test images. Images are labeled only as professional or snapshot; the average rating is not considered.

Feature Performance

Standard precision-recall curves are plotted for each feature. Recall and precision are defined as follows:

recall = # professional photos above threshold / total # professional photos
precision = # professional photos above threshold / # photos above threshold
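
These two definitions can be evaluated at every possible threshold by sorting the test images by quality. The following is a minimal sketch assuming NumPy; the function name is hypothetical, and images with tied scores are kept in their input order rather than grouped under one threshold.

```python
import numpy as np


def precision_recall_curve(scores, is_professional):
    """Precision and recall when the threshold admits the top 1, 2, ..., n images."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(is_professional, dtype=bool)
    order = np.argsort(-scores, kind="stable")      # highest quality first
    hits = np.cumsum(labels[order])                 # professional photos above threshold
    n_above = np.arange(1, len(scores) + 1)         # photos above threshold
    precision = hits / n_above
    recall = hits / labels.sum()                    # divide by total professional photos
    return precision, recall


precision, recall = precision_recall_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

With these four toy scores, lowering the threshold to admit the third image raises recall from 0.5 to 1.0 while precision goes from 0.5 to 2/3.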

Table 1 shows the PR curves of the seven features separately and the PR curve of all features combined. My results are not as good as those reported in the paper. Blur is still the best feature, but it is not dominant. Brightness is the weakest feature, as the paper also reports. The combined feature gives the best results. I suspect my results fall short of the paper's because I have not found the best parameters for feature extraction, such as the bin size. Still, all the features do much better than chance.

Table 2 shows the top-ranked and bottom-ranked images for the combined feature.

Top Ranked (2,4,5,6,8,9,10,18 are negative test examples)
Bottom Ranked (4,8,18 are positive test examples)

Rule of Thirds feature

I created a rule of thirds feature that detects whether an image satisfies the rule of thirds. The basic idea is that, once we find the bounding box that encloses 98% of the edge energy, we treat the contents of the bounding box as the foreground object and expect its center of mass to be close to one of the four one-third points. So the smaller the distance from the center of mass to the nearest one-third point, the better the quality. The following illustrates the algorithm, along with the distributions of the center of mass. The center of mass stays near the center of the image for snapshots, while it spreads out for professional photos.

Panels: algorithm illustration; center of mass distribution, professional; center of mass distribution, snapshots.
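
The rule of thirds distance can be sketched as follows. This is an illustration under assumptions, not the exact code behind the results: the function names are hypothetical, the edge energy map is assumed to be computed elsewhere and to contain some nonzero energy, and the bounding box is found by trimming equal tails from the row and column projections (the axis_mass parameter), which is one plausible reading of the box described above.

```python
import numpy as np


def _middle_span(profile, mass):
    """First and last bin index overlapped by the middle `mass` of a 1-D profile."""
    cdf = np.cumsum(profile) / np.sum(profile)
    tail = (1.0 - mass) / 2.0
    lo = int(np.searchsorted(cdf, tail, side="right"))
    hi = int(np.searchsorted(cdf, 1.0 - tail, side="left"))
    return lo, hi


def rule_of_thirds_distance(edge_energy, axis_mass=0.98):
    """Distance from the foreground center of mass to the nearest one-third point.

    Coordinates are normalized so the image spans [0, 1] x [0, 1];
    a smaller distance means better quality.
    """
    e = np.asarray(edge_energy, dtype=float)
    n_rows, n_cols = e.shape
    top, bottom = _middle_span(e.sum(axis=1), axis_mass)
    left, right = _middle_span(e.sum(axis=0), axis_mass)

    # Treat the contents of the bounding box as the foreground object.
    box = e[top:bottom + 1, left:right + 1]
    ys, xs = np.mgrid[top:bottom + 1, left:right + 1]
    total = box.sum()
    cy = ((ys + 0.5) * box).sum() / total / n_rows  # pixel centers sit at index + 0.5
    cx = ((xs + 0.5) * box).sum() / total / n_cols

    thirds = np.array([(1 / 3, 1 / 3), (1 / 3, 2 / 3), (2 / 3, 1 / 3), (2 / 3, 2 / 3)])
    return float(np.min(np.hypot(thirds[:, 0] - cx, thirds[:, 1] - cy)))
```

An object centered in the frame gets the largest possible distance to the one-third points among centered-to-thirds positions, sqrt(2)/6 (about 0.236), while an object sitting exactly on a one-third point gets 0.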

Table 3 shows the precision-recall curve of the rule of thirds feature. The rule of thirds does not improve the classification performance of the combined feature; however, its individual performance is close to that of hue count.

Table 4 shows the top-ranked images according to the rule of thirds.

Top Ranked