Dec 12, 2011
Sungmin Lee
In this project, I implemented an automatic rating system for human facial attractiveness. The implementation is based on the journal article "Facial Attractiveness: Beauty and the Machine" (2006, Y. Eisenthal et al.). Following this article, I extracted feature points from images in Cambridge University Computer Laboratory's face database and built the training data by calculating Euclidean distances between those points. To verify my result, I used classification techniques (K-nearest neighbor and Support Vector Machine) to discern attractive images from unattractive ones. I also used SVM-based regression to estimate the ratings of test images. The result shows 60% accuracy on classification and 65% accuracy on estimation.
One of the main issues in this project was finding a suitable face database. Because I was going to extract features from the images very precisely, the data had to satisfy some restrictions:
1. The face should be facing the front.
2. Likewise, no part of the face should be hidden (e.g. covered by a hand or sunglasses).
3. The number of images should be reasonably large.
In Eisenthal's article, the authors collected only images of women within a specific age range, to minimize other factors that hinder a fair comparison, such as age differences and different jewelry. In this project, I simply collected all usable data from Cambridge University Computer Laboratory's face database, since assembling a perfect dataset was too difficult for one person. From this database, I collected 39 face images of varying age and gender as a training set.
As a next step, I asked 7 people to rate the attractiveness of the images from 1 to 10 (1: very unattractive, 10: very attractive) to establish a ground truth of human annotation. Since the quality of the dataset was very poor and the people in the images look fairly average, the average ratings were heavily biased toward the center (avg=4.74, min=3, max=7.1).
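The averaging step above can be sketched as follows. This is only an illustration: the ratings below are made-up values for three images, not the project's data, and the function name `ground_truth` is hypothetical.

```python
# Hypothetical ratings: one row per image, one column per rater
# (7 raters, scale 1-10). These values are invented for illustration.
ratings = [
    [4, 5, 3, 6, 4, 5, 4],
    [7, 6, 8, 7, 7, 6, 8],
    [3, 2, 4, 3, 3, 4, 2],
]


def ground_truth(rows):
    """Average each image's ratings across all raters."""
    return [sum(row) / len(row) for row in rows]


scores = ground_truth(ratings)

# Summary statistics of the per-image averages (the kind of avg/min/max
# figures quoted in the text).
summary = {
    "avg": sum(scores) / len(scores),
    "min": min(scores),
    "max": max(scores),
}
print(scores)
print(summary)
```

With only a handful of raters and unremarkable faces, the per-image averages tend to cluster near the middle of the scale, which is the bias noted above.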
            
< Fig 1. 29 points from a face >
After that, I extracted 29 points from each face (Fig 1) and calculated Euclidean distances for features such as face length, face width (between the cheeks and at the chin), eye size, nose size, etc. However, the ratios between those features matter more than the sizes themselves, since sizes vary heavily with picture conditions. So, unlike the paper, I dropped all the size features and used only proportions of those features instead. The details of the feature vector are described in the "measure_feature_set.m" file.
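The idea of replacing sizes with proportions can be sketched as follows. The landmark names, coordinates, and the two ratios are assumptions chosen for illustration; they are not the 29 points or the feature set defined in "measure_feature_set.m".

```python
import math

# Hypothetical landmarks (x, y) in pixels. Names and values are illustrative
# only and do not reproduce the project's 29-point layout.
landmarks = {
    "forehead_top": (50.0, 10.0),
    "chin_bottom": (50.0, 110.0),
    "cheek_left": (10.0, 60.0),
    "cheek_right": (90.0, 60.0),
    "eye_left_outer": (25.0, 45.0),
    "eye_left_inner": (40.0, 45.0),
}


def dist(points, a, b):
    """Euclidean distance between two named landmarks."""
    return math.dist(points[a], points[b])


def ratio_features(points):
    """Build a feature vector from proportions rather than raw sizes."""
    face_length = dist(points, "forehead_top", "chin_bottom")
    face_width = dist(points, "cheek_left", "cheek_right")
    eye_width = dist(points, "eye_left_outer", "eye_left_inner")
    return [face_width / face_length, eye_width / face_width]


def scale(points, s):
    """Simulate the same face photographed at a different resolution."""
    return {name: (x * s, y * s) for name, (x, y) in points.items()}


print(ratio_features(landmarks))
print(ratio_features(scale(landmarks, 2.0)))
```

Both calls give the same feature vector, because a uniform change in image scale cancels out in every ratio. Raw distances would double in the second call.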
I calculated feature vectors for the whole dataset and ran a test with classification techniques to check whether they are distributed sensibly. I wanted to discern attractive faces from unattractive ones, so I took the 10% most attractive and the 10% least attractive images according to the human ratings. I used both a K-nearest neighbor and an SVM classifier, and both showed around 60% accuracy in classification. Since the main goal of this project is to estimate a person's attractiveness automatically, I then employed SVR (Support Vector Regression).
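The selection of the extreme images and the K-nearest neighbor vote can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the project: the function names, the toy feature vectors, the labels (1 = attractive, 0 = unattractive), and k=3 are all hypothetical.

```python
import math
from collections import Counter


def select_extremes(scores, fraction=0.10):
    """Return (top, bottom) index lists for the highest- and lowest-rated images."""
    n = max(1, round(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[-n:], order[:n]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy ratio features for four labeled images (invented values).
train_x = [[0.80, 0.19], [0.82, 0.20], [0.70, 0.15], [0.68, 0.14]]
train_y = [1, 1, 0, 0]

print(knn_predict(train_x, train_y, [0.79, 0.19]))
print(knn_predict(train_x, train_y, [0.69, 0.15]))
```

Keeping only the two tails of the rating distribution gives the classifier clearly separated labels, at the cost of a very small training set.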
I collected another set of images as a test set, and I also ranked those images myself to compare with the computer's estimates. I tried nu-SVR with several kernel types, such as linear and polynomial; the linear kernel generally gave acceptable results.
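A kernel comparison of this kind could look like the sketch below. It uses scikit-learn's `NuSVR` as a stand-in, since the text does not say which SVM library the project used. The data are synthetic, and the feature count, `nu`, `C`, and the error measure are assumptions rather than the project's settings.

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)

# Synthetic stand-in data: 39 training and 10 test images, 5 ratio features each.
# The linear relationship and the noise level are invented for illustration.
weights = np.array([1.0, -0.5, 2.0, 0.0, 0.5])
X_train = rng.uniform(0.1, 1.0, size=(39, 5))
y_train = 4.0 + X_train @ weights + rng.normal(0.0, 0.1, size=39)
X_test = rng.uniform(0.1, 1.0, size=(10, 5))
y_test = 4.0 + X_test @ weights

results = {}
for kernel in ("linear", "poly"):
    model = NuSVR(kernel=kernel, nu=0.5, C=1.0)
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    # Mean absolute error between estimated and reference ratings.
    results[kernel] = float(np.mean(np.abs(predicted - y_test)))

print(results)
```

On real data the comparison would use the human ratings of the test set as the reference values instead of the synthetic `y_test`.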
Even though I didn't use eigenfaces as the original paper did, my result was comparable to the paper's. If the database were large enough and many people ranked it, we could expect more precise results. Besides these techniques, focusing on specific parts of a face (for example, the eyes, lips, and nose) could be an interesting approach too.