@ Author: Copied from the Chimpler Blog
In this tutorial, we describe how to generate and use eigenfaces to recognize people's faces.
Eigenfaces are a set of eigenvectors derived from the covariance matrix of the probability distribution of the high-dimensional vector space of possible human faces. They can be used to identify a face in a picture against a database of people's faces very quickly. In this post, we will not go into much detail on the mathematical aspects, but if you are interested in those, you can look at the excellent post Face Recognition using Eigenfaces and Distance Classifiers: A Tutorial from the Onionesque Reality Blog.
To follow this tutorial, you need the following software installed on your machine:
- Java >= 1.6
You can find the installation instructions in a previous post, Playing with the Mahout recommendation engine on a Hadoop cluster.
Compiling the code
All the source code, training sets and testing sets are in the GitHub repository at https://github.com/fredang/mahout-eigenface-example/
You can fetch the files from this repository by typing:
$ git clone https://github.com/fredang/mahout-eigenface-example.git
This repository is structured as follows:
- src/main/java/com/chimpler/example/eigenface/GenerateCovarianceMatrix.java: code to generate the covariance matrix from the images from the training set
- src/main/java/com/chimpler/example/eigenface/ComputeEigenFaces.java: code to compute the eigenfaces
- src/main/java/com/chimpler/example/eigenface/ComputeDistance.java: code to test the model with the testing set
- src/main/java/com/chimpler/example/eigenface/Helper.java: code used to do some matrix operations and image operations
- images/yalefaces-test: some additional images to add to the yalefaces testing set
- images/cats-train: training set for cat faces (not used in this example)
- images/cats-test: testing set for cat faces (not used in this example)
Once you have fetched the project, you can compile it using Maven:
$ mvn clean package assembly:single
This creates a jar file in the target directory that contains all the dependencies and the compiled classes from the src/main/java directory.
Preparing the data set
You can download the Yale Face Database from this page: http://vision.ucsd.edu/content/yale-face-database
Unzip the file:
$ unzip yalefaces.zip
Now we are going to split these images into two sets: a training set and a testing set:
$ mkdir training-set
$ mv yalefaces/* training-set/
For the testing set, we remove the images with the sad facial expression from the training set and move them to the testing set:
$ mkdir testing-set
$ mv training-set/*.sad testing-set/
We also add to the testing set two non-face images (a hamburger and a cat) and the face of one person who is not in the training set (Bruce Lee):
$ cp [MAHOUT EIGENFACE EXAMPLE DIRECTORY]/images/yalefaces-test/* testing-set
Training the model
Training starts with the class GenerateCovarianceMatrix, which generates the covariance matrix.
The arguments of this class are:
- image width: it is used to scale down the image width so that the computation does not take too much memory
- image height
- training directory: directory containing the training face images
- output directory
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.GenerateCovarianceMatrix 80 60 [TRAINING_SET_DIRECTORY] output
The program performs the following steps:
- read all the n image files from the training directory
- convert each image to greyscale and scale it down
- create a matrix M with each column representing an image. Each column has a length of w x h, and each of its elements represents a shade of grey with a value between 0 (black) and 255 (white)
- compute the mean image and write it to output/mean-image.gif. It is computed by averaging each pixel across the images
- compute the diff matrix DM by subtracting the mean image from each column of M
- compute the covariance matrix transpose(DM) x DM, which has size n x n
- write the diff matrix DM to output/diffmatrix.seq
- write the covariance matrix to output/covariance.seq
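The numeric steps above can be sketched in plain Java on a toy matrix. This is only an illustration under assumptions: the class and method names (CovarianceSketch, meanImage, diffMatrix, smallCovariance) are hypothetical and are not taken from GenerateCovarianceMatrix, and the sketch uses plain arrays rather than Mahout matrices or sequence files.

```java
import java.util.Arrays;

// Hypothetical illustration; names are not taken from the repository.
public class CovarianceSketch {

    /** Averages the columns of m (one image per column) into a mean image. */
    public static double[] meanImage(double[][] m) {
        int pixels = m.length;
        int images = m[0].length;
        double[] mean = new double[pixels];
        for (int p = 0; p < pixels; p++) {
            for (int i = 0; i < images; i++) {
                mean[p] += m[p][i];
            }
            mean[p] /= images;
        }
        return mean;
    }

    /** Subtracts the mean image from every column of m. */
    public static double[][] diffMatrix(double[][] m, double[] mean) {
        int pixels = m.length;
        int images = m[0].length;
        double[][] dm = new double[pixels][images];
        for (int p = 0; p < pixels; p++) {
            for (int i = 0; i < images; i++) {
                dm[p][i] = m[p][i] - mean[p];
            }
        }
        return dm;
    }

    /** Computes transpose(dm) x dm, an n x n matrix for n images. */
    public static double[][] smallCovariance(double[][] dm) {
        int pixels = dm.length;
        int n = dm[0].length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int p = 0; p < pixels; p++) {
                    c[i][j] += dm[p][i] * dm[p][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // Two toy "images" of three pixels each, one image per column.
        double[][] m = {{10, 30}, {20, 40}, {0, 60}};
        double[] mean = meanImage(m);
        double[][] c = smallCovariance(diffMatrix(m, mean));
        System.out.println(Arrays.toString(mean)); // [20.0, 30.0, 30.0]
        System.out.println(Arrays.toString(c[0])); // [1100.0, -1100.0]
    }
}
```

With n images the result is n x n regardless of the number of pixels, which is why the matrix stays small even for larger images.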
Now we need to compute the eigenvectors of the covariance matrix. This can be done using the Mahout Singular Value Decomposition (SVD).
To use it, first copy the file covariance.seq to HDFS:
$ hadoop fs -put output/covariance.seq covariance.seq
Then run the Mahout SVD:
$ mahout svd --input covariance.seq --numRows 150 --numCols 150 --rank 50 --output output
We set --numRows and --numCols to the size of the covariance matrix (150 x 150) and the rank to 50 (we usually set it to one third of the number of images).
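The matrix is only 150 x 150, rather than (w x h) x (w x h), because the program uses transpose(DM) x DM instead of DM x transpose(DM). The standard eigenface argument for why the small matrix is enough is given below. The symbols are notation introduced here and do not come from the code: A stands for the diff matrix DM, v_i and lambda_i for an eigenvector and eigenvalue of the small matrix, and u_i for the resulting eigenface.

```latex
A^{\top} A \, v_i = \lambda_i v_i
\;\Longrightarrow\;
A A^{\top} (A v_i) = \lambda_i \, (A v_i),
\qquad
u_i = \frac{A v_i}{\lVert A v_i \rVert}
```

Each eigenvector of the small matrix therefore maps to an eigenvector of the large pixel-space covariance matrix by multiplying it by the diff matrix. That ComputeEigenFaces builds the eigenfaces exactly this way is an assumption, although it would explain why that class takes diffmatrix.seq as an input.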
The computed eigenvectors might include extra eigenvectors with invalid eigenvalues. To fix this, we can run mahout cleansvd:
$ mahout cleansvd -ci covariance.seq -ei output -o output2
We can now copy the cleaned eigenvectors to the local filesystem:
$ hadoop fs -get output2/cleanEigenvectors output/cleanEigenvectors
Then execute the Java class ComputeEigenFaces to create the eigenfaces.
To run the program:
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeEigenFaces output/cleanEigenvectors output/diffmatrix.seq output/mean-image.gif 80 60 [TRAINING_SET_DIRECTORY] output
The program also tries to reconstruct the faces of the training set using the eigenfaces. To do that, it computes the weight of each eigenface by taking the scalar product of the image pixel column with each eigenface column and then normalizing it. It then sums the eigenfaces pixel by pixel, weighted by those weights. You can think of this process as superposing the eigenfaces as layers, each with its own transparency value (which can be negative), to try to reconstruct the original image.
After reconstructing the image, it computes the distance between the original image and the reconstructed image (using the Euclidean distance between the pixels):
Reconstructed Image distance for subject01.centerlight: 37.395691
Reconstructed Image distance for subject01.glasses: 32.350212
Reconstructed Image distance for subject01.happy: 27.559056
Reconstructed Image distance for subject01.leftlight: 28.008936
Reconstructed Image distance for subject01.noglasses: 47.047757
Reconstructed Image distance for subject01.normal: 32.627928
Reconstructed Image distance for subject01.rightlight: 25.465009
Reconstructed Image distance for subject01.sleepy: 23.635308
Reconstructed Image distance for subject01.surprised: 45.947206
Reconstructed Image distance for subject01.wink: 32.132286
[...]
Min distance = 14.470855648264822
Max distance = 47.047756576566904
These distances are fairly small (at most about 47), which suggests that the eigenfaces represent the training faces efficiently.
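The reconstruction described above can be sketched on toy data as follows. This is a minimal sketch under assumptions, not the code of ComputeEigenFaces. The class and method names are hypothetical. The eigenfaces are assumed to be unit-length and mutually orthogonal, so no extra normalization of the weights is needed. The image is assumed to have the mean image already subtracted.

```java
import java.util.Arrays;

// Hypothetical illustration; names are not taken from the repository.
public class ReconstructionSketch {

    /** Scalar product of two vectors of equal length. */
    public static double dot(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** One weight per eigenface: the scalar product with the mean-subtracted image. */
    public static double[] weights(double[] diffImage, double[][] eigenfaces) {
        double[] w = new double[eigenfaces.length];
        for (int k = 0; k < eigenfaces.length; k++) {
            w[k] = dot(diffImage, eigenfaces[k]);
        }
        return w;
    }

    /** Weighted sum of the eigenfaces, pixel by pixel. */
    public static double[] reconstruct(double[] w, double[][] eigenfaces) {
        double[] image = new double[eigenfaces[0].length];
        for (int k = 0; k < eigenfaces.length; k++) {
            for (int p = 0; p < image.length; p++) {
                image[p] += w[k] * eigenfaces[k][p];
            }
        }
        return image;
    }

    /** Euclidean distance between two pixel vectors. */
    public static double euclideanDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Two toy unit-length eigenfaces of three pixels each (one per row).
        double[][] eigenfaces = {{1, 0, 0}, {0, 1, 0}};
        double[] diffImage = {3, 4, 5};
        double[] w = weights(diffImage, eigenfaces);
        double[] rebuilt = reconstruct(w, eigenfaces);
        System.out.println(Arrays.toString(w));       // [3.0, 4.0]
        System.out.println(Arrays.toString(rebuilt)); // [3.0, 4.0, 0.0]
        System.out.println(euclideanDistance(diffImage, rebuilt)); // 5.0
    }
}
```

In the toy example the third pixel cannot be expressed by the two eigenfaces, so it accounts for the whole reconstruction distance. Adding the mean image back to both vectors would not change that distance.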
Testing the model
Now that we have trained our model, we are going to test it.
In the testing set, we have some of the same people as in the training set but with a different facial expression. We also have two images that are not a person's face (a hamburger and a cat) and one image of a new person (Bruce Lee).
The class ComputeDistance tests whether the images in the testing directory can be recognized as a person's face and finds the most similar image in the training set.
To run the program:
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeDistance output/eigenfaces.seq output/mean-image.gif output/weights.seq 80 60 [TRAINING_SET_DIRECTORY] [TESTING_SET_DIRECTORY] output
For each image of the testing set, it computes the weight that needs to be applied to each eigenface to reconstruct the image and generates the reconstructed image in the output directory.
As expected, the images representing the faces of people from the training set are well reconstructed, but the cat and the hamburger images are not. The reconstructed face of Bruce Lee is not recognizable, but we can see that it is still a face. The program also computes the distance between the original image and the reconstructed image. For each test image, it also tries to find the most similar image in the training set by comparing the eigenface weights using the Euclidean distance:
Reconstructed Image distance for brucelee.gif: 51.404904
Image brucelee.gif is most similar to subject03.surprised: 447.574353
Reconstructed Image distance for cat.gif: 65.154281
Image cat.gif is most similar to subject05.centerlight: 638.072675
Reconstructed Image distance for hamburger.gif: 52.313601
Image hamburger.gif is most similar to subject01.rightlight: 684.214467
Reconstructed Image distance for subject01.sad: 32.473280
Image subject01.sad is most similar to subject01.sleepy: 101.895815
Reconstructed Image distance for subject02.sad: 22.418869
Image subject02.sad is most similar to subject02.noglasses: 104.859642
Reconstructed Image distance for subject03.sad: 35.468822
Image subject03.sad is most similar to subject03.noglasses: 120.972063
Reconstructed Image distance for subject04.sad: 30.370102
Image subject04.sad is most similar to subject04.normal: 0.000000
[...]
These results confirm the visual interpretation we made previously: for the hamburger and the cat, both the distance between the reconstructed image and the original image and the weight distance to the nearest training image are high. For Bruce Lee, the reconstruction distance (51.4) is about as high as for the hamburger, but the weight distance (447.6) is lower than for the cat (638.1) and the hamburger (684.2), while still well above the values for known faces (about 100 to 120).
For the other people's faces, both distances are small, and the program successfully associates each of them with a face of the same person from the training set.
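The search for the most similar training image can be sketched as a nearest-neighbour lookup in weight space. This is a hedged sketch on toy data: the class and method names are hypothetical, and the weight vectors are plain arrays rather than the contents of weights.seq.

```java
// Hypothetical illustration; names are not taken from the repository.
public class NearestFaceSketch {

    /** Euclidean distance between two weight vectors of equal length. */
    public static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Index of the training weight vector closest to testWeights. */
    public static int nearest(double[] testWeights, double[][] trainingWeights) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < trainingWeights.length; i++) {
            double d = distance(testWeights, trainingWeights[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy weight vectors for three training images (one per row).
        double[][] trainingWeights = {{0, 0}, {10, 0}, {3, 4}};
        double[] testWeights = {3, 5};
        int index = nearest(testWeights, trainingWeights);
        System.out.println("Most similar training image: " + index);
    }
}
```

Comparing weight vectors instead of raw pixels keeps each comparison short, since a vector has one entry per eigenface rather than one per pixel.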
Using the weight distance, we can define two thresholds:
- T1: weight distance above which the image does not represent a face
- T2: weight distance below which the image represents a face from the training set
So if the weight distance is above T1, the image does not represent a face. Between T2 and T1, it represents an unknown face. Below T2, it represents a face from the training set. These thresholds are chosen heuristically.
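The two-threshold rule can be sketched as below. The class name, the method name and the threshold values (600 and 300) are hypothetical. The values were picked by eye only to separate the sample output shown earlier and are not taken from the original post.

```java
// Hypothetical illustration; names and threshold values are assumptions.
public class ThresholdSketch {

    /** Classifies an image from its weight distance to the nearest training image. */
    public static String classify(double weightDistance, double t1, double t2) {
        if (weightDistance > t1) {
            return "not a face";
        }
        if (weightDistance > t2) {
            return "unknown face";
        }
        return "known face";
    }

    public static void main(String[] args) {
        double t1 = 600; // illustrative value only
        double t2 = 300; // illustrative value only
        System.out.println(classify(684.214467, t1, t2)); // not a face
        System.out.println(classify(447.574353, t1, t2)); // unknown face
        System.out.println(classify(101.895815, t1, t2)); // known face
    }
}
```

In practice the thresholds would need tuning on a larger set of test images than the handful shown here.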
In this post, we showed how to generate eigenfaces from a training set and then use those eigenfaces to recognize a person's face. We also introduced some metrics to determine whether an image represents a person's face and whether it is similar to a face from the training set.
If you are trying this tutorial with other images, make sure that:
- the faces are in the same position in the image
- the faces have the same scale and rotation angle
- the faces have the same brightness and contrast
Some techniques have been developed to alleviate those constraints. You can find several papers about them on the web.