
Generating EigenFaces with Mahout SVD to recognize person faces


@ Author: Copied from the Chimpler blog

In this tutorial, we describe how to generate and use eigenfaces to recognize people's faces.
Eigenfaces are a set of eigenvectors derived from the covariance matrix of the probability distribution of the high-dimensional vector space of possible human faces. They can be used to identify a face in a picture against a database of faces very quickly. In this post, we will not go into the mathematical details, but if you are interested in those, you can look at the excellent post Face Recognition using Eigenfaces and Distance Classifiers: A Tutorial from the Onionesque Reality Blog.

Requirements

To follow this tutorial, you need the following software installed on your machine:

  • Java >= 1.6
  • Hadoop
  • Mahout
  • Maven

You can find the installation instructions in a previous post, Playing with the Mahout recommendation engine on a Hadoop cluster.

Compiling the code

All the source code, the training sets and the testing sets are in the GitHub repository at https://github.com/fredang/mahout-eigenface-example/

You can fetch the files from this repository by typing:

$ git clone https://github.com/fredang/mahout-eigenface-example.git

This repository is structured as follows:

Once you have fetched the project, you can compile it using Maven:

$ mvn clean package assembly:single

It creates a jar file in the target directory which contains all the dependencies and the compiled classes from the src/main/java directory.

Preparing the data set

You can download the Yale Face Database from this page: http://vision.ucsd.edu/content/yale-face-database

Unzip the file:

$ unzip yalefaces.zip

Now we are going to split these files into two sets, a training set and a testing set:

$ mkdir training-set
$ mv yalefaces/* training-set/

For the testing set, we remove the images with the sad facial expression from the training set and move them to the testing set:

$ mkdir testing-set
$ mv training-set/*.sad testing-set/

We also add to the testing set two non-face images (a hamburger and a cat) and the face of one person who is not in the training set (Bruce Lee):

$ cp [MAHOUT EIGENFACE EXAMPLE DIRECTORY]/images/yalefaces-test/* testing-set

Training the model

The first step of the training is implemented in the class GenerateCovarianceMatrix, which generates the covariance matrix.

The arguments of this class are:

  • image width: it is used to scale down the image width so that the computation does not take too much memory
  • image height
  • training directory: directory containing the training face images
  • output directory

$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.GenerateCovarianceMatrix 80 60 [TRAINING_SET_DIRECTORY] output
This program:

  1. reads all the n image files from the training directory
  2. converts each image to greyscale and scales it down
  3. creates a matrix M in which each column represents an image. Each column has a length of w x h, and each of its elements represents a shade of grey with a value between 0 (black) and 255 (white)
  4. computes the mean image and writes it to output/mean-image.gif. It is computed by averaging each pixel across the images
  5. computes the diff matrix DM by subtracting the mean image from each column of M
  6. computes the covariance matrix transpose(DM) x DM, which is a matrix of size n x n
  7. writes the diff matrix DM to output/diffmatrix.seq
  8. writes the covariance matrix to output/covariance.seq
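
Steps 3 to 6 can be sketched in a few lines. This is a minimal illustration in plain Python, not the code of GenerateCovarianceMatrix; the function name `covariance_of_images` and the tiny 4-pixel images are assumptions made for the example, and each image is stored as a row of a list rather than as a matrix column.

```python
def covariance_of_images(images):
    """images: list of n flattened greyscale images (each a list of w*h pixel values).

    Returns (mean_image, diff_matrix, covariance), where diff_matrix holds one
    mean-subtracted image per entry and covariance is the n x n matrix
    transpose(DM) x DM.
    """
    n = len(images)
    pixel_count = len(images[0])
    # Step 4: mean image, averaging each pixel across all images
    mean_image = [sum(img[p] for img in images) / n for p in range(pixel_count)]
    # Step 5: subtract the mean image from every image
    diff_matrix = [[img[p] - mean_image[p] for p in range(pixel_count)]
                   for img in images]
    # Step 6: entry (i, j) is the dot product of diff image i with diff image j
    covariance = [[sum(a * b for a, b in zip(di, dj)) for dj in diff_matrix]
                  for di in diff_matrix]
    return mean_image, diff_matrix, covariance


# Three hypothetical 2 x 2 images, flattened to 4 pixels each
images = [[0, 0, 255, 255], [0, 255, 0, 255], [255, 255, 255, 255]]
mean_image, diff_matrix, covariance = covariance_of_images(images)
print(len(covariance), len(covariance[0]))  # 3 3: n x n, whatever the pixel count
```

The point of computing transpose(DM) x DM is its size: with 150 training images of 80 x 60 pixels, the matrix is 150 x 150 rather than 4800 x 4800.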

Now we need to compute the eigenvectors of the covariance matrix. This can be done using the Mahout Singular Value Decomposition (SVD).

To use it, first copy the file covariance.seq to HDFS:

$ hadoop fs -put output/covariance.seq covariance.seq

Then run the Mahout SVD:

$ mahout svd --input covariance.seq --numRows 150 --numCols 150 --rank 50 --output output

We set --numRows and --numCols to the size of the covariance matrix (150 x 150) and the rank to 50 (we usually set it to one third of the number of images).

The computed eigenvectors might include extra eigenvectors with invalid eigenvalues. To fix this, we can run mahout cleansvd:

$ mahout cleansvd -ci covariance.seq -ei output -o output2

We can now copy the clean eigenvectors to the local filesystem:

$ hadoop fs -get output2/cleanEigenvectors output/cleanEigenvectors

Then execute the Java class ComputeEigenFaces to create the eigenfaces.

To run the program:

$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeEigenFaces output/cleanEigenvectors output/diffmatrix.seq output/mean-image.gif 80 60 [TRAINING_SET_DIRECTORY] output

It creates the eigenfaces matrix in output/eigenfaces.seq and the images representing those eigenfaces in the output directory.

It also tries to reconstruct the faces of the training set using the eigenfaces. To do that, it computes the weight of each eigenface as the scalar product of the image pixel column with the eigenface column, and then normalizes it. It then sums the eigenfaces pixel by pixel, each scaled by its weight. You can think of this process as superposing the eigenfaces as layers, each with a different transparency value (which can be negative), to try to reconstruct the original image.
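
The reconstruction described above can be sketched as follows. This is a simplified illustration, assuming that the eigenfaces are orthonormal vectors and that the mean image is subtracted before projecting and added back afterwards; the names `project`, `reconstruct` and `euclidean` are hypothetical and are not taken from ComputeEigenFaces.

```python
import math


def project(image, mean_image, eigenfaces):
    """Weight of each eigenface: dot product of the mean-subtracted image with it."""
    diff = [p - m for p, m in zip(image, mean_image)]
    return [sum(d * e for d, e in zip(diff, face)) for face in eigenfaces]


def reconstruct(weights, mean_image, eigenfaces):
    """Mean image plus the eigenfaces, each scaled by its (possibly negative) weight."""
    pixels = list(mean_image)
    for w, face in zip(weights, eigenfaces):
        pixels = [p + w * e for p, e in zip(pixels, face)]
    return pixels


def euclidean(a, b):
    """Euclidean distance between two pixel (or weight) vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 4-pixel example with two orthonormal eigenfaces
mean_image = [100.0, 100.0, 100.0, 100.0]
eigenfaces = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
image = [120.0, 80.0, 120.0, 80.0]

weights = project(image, mean_image, eigenfaces)
rebuilt = reconstruct(weights, mean_image, eigenfaces)
print(weights, euclidean(image, rebuilt))
```

In this toy case the image lies exactly in the span of the two eigenfaces, so the reconstruction distance is zero. An image outside that span leaves a residual, which is what the distances reported by the program measure.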

After reconstructing each image, it computes the distance between the original image and the reconstructed image (using the Euclidean distance between the pixels):

Reconstructed Image distance for subject01.centerlight: 37.395691
Reconstructed Image distance for subject01.glasses: 32.350212
Reconstructed Image distance for subject01.happy: 27.559056
Reconstructed Image distance for subject01.leftlight: 28.008936
Reconstructed Image distance for subject01.noglasses: 47.047757
Reconstructed Image distance for subject01.normal: 32.627928
Reconstructed Image distance for subject01.rightlight: 25.465009
Reconstructed Image distance for subject01.sleepy: 23.635308
Reconstructed Image distance for subject01.surprised: 45.947206
Reconstructed Image distance for subject01.wink: 32.132286
[...]
Min distance = 14.470855648264822
Max distance = 47.047756576566904

These distances are quite small, which means that our eigenfaces represent the faces efficiently.

Testing the Model

Now that we have trained our model, we are going to test it.
In the testing set, we have the same people as in the training set but with a different facial expression. We also have two images that are not faces (a hamburger and a cat) and one image of a new person (Bruce Lee).

The class ComputeDistance tests whether the images in the testing directory can be recognized as faces and finds, for each of them, the most similar image in the training set.

To run the program:

$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeDistance output/eigenfaces.seq output/mean-image.gif output/weights.seq  68 68 [TRAINING_SET_DIRECTORY] [TESTING_SET_DIRECTORY] output

For each image of the testing set, it computes the weight that needs to be applied to each eigenface to reconstruct the image, and generates the reconstructed image in the output directory.

As expected, the images representing the faces of people from the training set are well reconstructed, but not the cat and the hamburger images. The reconstructed face of Bruce Lee is not recognizable, but we can see that it is still a face. The program also computes the distance between the original image and the reconstructed image. In addition, for each test image, it tries to find the most similar image in the training set by comparing the eigenface weights using the Euclidean distance:

Reconstructed Image distance for brucelee.gif: 51.404904
Image brucelee.gif is most similar to subject03.surprised: 447.574353
Reconstructed Image distance for cat.gif: 65.154281
Image cat.gif is most similar to subject05.centerlight: 638.072675
Reconstructed Image distance for hamburger.gif: 52.313601
Image hamburger.gif is most similar to subject01.rightlight: 684.214467
Reconstructed Image distance for subject01.sad: 32.473280
Image subject01.sad is most similar to subject01.sleepy: 101.895815
Reconstructed Image distance for subject02.sad: 22.418869
Image subject02.sad is most similar to subject02.noglasses: 104.859642
Reconstructed Image distance for subject03.sad: 35.468822
Image subject03.sad is most similar to subject03.noglasses: 120.972063
Reconstructed Image distance for subject04.sad: 30.370102
Image subject04.sad is most similar to subject04.normal: 0.000000
[...]

Those results confirm the visual interpretation we made previously: for the hamburger and the cat, both the distance between the reconstructed image and the original image and the weight distance to the closest training image are high. The image of Bruce Lee is also poorly reconstructed (51.4), but its weight distance (447.6) is lower than that of the non-face images (above 600), while remaining well above that of the known faces (around 100 to 120).

For the faces of the other people, both distances are small, and each test image is successfully associated with the face of the same person in the training set.
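
The search for the most similar training image can be sketched as a nearest-neighbour lookup on the eigenface weights. This is an illustration only: the function name `most_similar` and the weight vectors are made up for the example and do not come from ComputeDistance or from output/weights.seq.

```python
import math


def most_similar(test_weights, training_weights):
    """Return (label, distance) of the training image whose eigenface weights
    are closest to test_weights in Euclidean distance."""
    best_label, best_distance = None, float("inf")
    for label, weights in training_weights.items():
        distance = math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(test_weights, weights)))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance


# Hypothetical weight vectors (three eigenfaces) for two training images
training_weights = {
    "subject01.sleepy": [10.0, -4.0, 2.0],
    "subject02.noglasses": [-30.0, 12.0, 8.0],
}
print(most_similar([13.0, 0.0, 2.0], training_weights))
```

Comparing weight vectors rather than raw pixels keeps the comparison cheap: each image is described by one number per eigenface instead of one number per pixel.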

Using the weight distance, we can define two thresholds:

  • T1: threshold above which the image does not represent a face
  • T2: threshold below which the image represents a face from the training set

So if the weight distance is above T1, the image does not represent a face. Between T2 and T1, it represents an unknown face. Below T2, it represents a face from the training set. These thresholds are chosen heuristically.
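
A minimal sketch of this decision rule is shown below. The function name `classify` and the threshold values are assumptions: the values were picked by eye from the weight distances listed earlier and are not part of the original code.

```python
def classify(weight_distance, t1, t2):
    """Classify an image from its weight distance to the closest training image.

    t1: above this distance the image is not considered a face
    t2: below this distance the image is considered a known face (t2 < t1)
    """
    if weight_distance > t1:
        return "not a face"
    if weight_distance > t2:
        return "unknown face"
    return "known face"


# Assumed thresholds, chosen by eye from the distances printed by the program
T1, T2 = 550.0, 200.0
for name, distance in [("hamburger.gif", 684.214467),
                       ("brucelee.gif", 447.574353),
                       ("subject01.sad", 101.895815)]:
    print(name, "->", classify(distance, T1, T2))
```

With other training sets or image sizes the distances change scale, so the thresholds have to be tuned again on known examples.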

Conclusion

In this post, we showed how to generate eigenfaces from a training set and then use those eigenfaces to recognize a person's face. We also introduced some metrics to determine whether an image represents a face and whether it is similar to a face from the training set.

If you are trying this tutorial with other images make sure that:

  • the faces are in the same position in the image
  • the faces have the same scale/rotation angle
  • the faces have the same brightness/contrast

Some techniques were developed to alleviate those constraints. You can find several papers about this on the web.

 
Posted on 21/08/2014 in Mahout

 

Error occurred in deployment step ‘Add Solution’: The solution cannot be deployed.


Issue: The following error message appears when deploying a solution in Visual Studio:

Directory “X” associated with feature ‘GUID1’ in the solution is used by feature ‘GUID2’ installed in the farm. All features must have unique directories to avoid overwriting files.

Resolution:

– Run the SharePoint Management Shell as Administrator and uninstall the conflicting feature:

stsadm -o uninstallfeature -id "GUID2" -force

– Deploy the solution again.

 
Posted on 23/02/2013 in SharePoint

 

Installing the SCIM Vietnamese input method on CentOS 5.x


1. Install scim and the required libraries.

Open a terminal and type:

  yum -y install scim scim-libs scim-tables scim-bridge

2. Download the SCIM input method for Vietnamese

Download it here: http://code.google.com/p/scim-tables-vietnamese-ext/

3. Install

Change to the directory containing the downloaded file, then type:

tar -zxf scim-tables-vietnamese-ext*.tar.gz
cd scim-tables-vietnamese-ext/
make install

4. Configure

Go to System => Preference => => Input Method, choose the custom input method, then choose scim.

Log out so that SCIM takes effect. Then add the Telex input method in SCIM to type with Telex.

 
Posted on 18/02/2013 in Linux

 

System requirements for installing SharePoint Foundation 2010


Hardware requirements

Component: Minimum requirement
Processor: 64-bit, 4 cores
RAM:
  • 4 GB for developer or evaluation use
  • 8 GB for production use in a single server or multiple server farm
Hard disk: 80 GB for system drive

Software requirements

Minimum requirement for each environment:

Database server in a farm
One of the following:

  • The 64-bit edition of Microsoft SQL Server 2008 R2.
  • The 64-bit edition of Microsoft SQL Server 2008 with Service Pack 1 (SP1) and Cumulative Update 2. From the Cumulative update package 2 for SQL Server 2008 Service Pack 1 (http://go.microsoft.com/fwlink/p/?LinkId=165962) page, click the View and request hotfix downloads link and follow the instructions. On the Hotfix Request page, download the SQL_Server_2008_SP1_Cumulative_Update_2 file. When you install Microsoft SQL Server 2008 SP1 on Windows Server 2008 R2, you might receive a compatibility warning. You can disregard this warning and continue with your installation.
Note:
We do not recommend that you use CU3 or CU4, but instead CU2, CU5, or a later CU than CU5. For more information, see Cumulative update package 5 for SQL Server 2008 (http://go.microsoft.com/fwlink/p/?LinkId=196928). Download the SQL_Server_2008_RTM_CU5_SNAC file.

For more information about choosing a version of SQL Server, see SQL Server 2008 R2 and SharePoint 2010 Products: Better Together (white paper) (SharePoint Server 2010).

Single server with built-in database
  • The 64-bit edition of Windows Server 2008 Standard, Enterprise, Data Center, or Web Server with SP2; the 64-bit edition of Windows Server 2008 R2 Standard, Enterprise, Data Center, or Web Server; or the 64-bit edition of Windows Server 2008 R2 Service Pack 1 (SP1) Standard, Enterprise, Data Center, or Web Server. If you are running Windows Server 2008 without SP2, the Microsoft SharePoint Products Preparation Tool installs Windows Server 2008 SP2 automatically.
Note:
You must download an update for Windows Server 2008 and Windows Server 2008 R2 before you run Setup. The update is a hotfix for the .NET Framework 3.5 SP1 that is installed by the Preparation tool. It provides a method to support token authentication without transport security or message encryption in WCF. For more information and links, see the “Access to Applicable Software” section later in this article.

For information, see the related KB article Two issues occur when you deploy an ASP.NET 2.0-based application on a server that is running IIS 7.0 or IIS 7.5 in Integrated mode (http://go.microsoft.com/fwlink/p/?LinkId=192578).

The preparation tool installs the following prerequisites:

  • Web Server (IIS) role
  • Application Server role
  • Microsoft .NET Framework version 3.5 SP1
  • SQL Server 2008 Express with SP1
  • Microsoft Sync Framework Runtime v1.0 (x64)
  • Microsoft Filter Pack 2.0
  • Microsoft Chart Controls for the Microsoft .NET Framework 3.5
  • Windows PowerShell 2.0
  • SQL Server 2008 Native Client
  • Microsoft SQL Server 2008 Analysis Services ADOMD.NET
  • ADO.NET Data Services Update for .NET Framework 3.5 SP1
  • A hotfix for the .NET Framework 3.5 SP1 that provides a method to support token authentication without transport security or message encryption in WCF.
  • Windows Identity Foundation (WIF)
Note:
If you have Microsoft “Geneva” Framework installed, you must uninstall it before you install the Windows Identity Foundation (WIF).
Front-end Web servers and application servers in a farm
  • The 64-bit edition of Windows Server 2008 Standard, Enterprise, Data Center, or Web Server with SP2; the 64-bit edition of Windows Server 2008 R2 Standard, Enterprise, Data Center, or Web Server; or the 64-bit edition of Windows Server 2008 R2 Service Pack 1 (SP1) Standard, Enterprise, Data Center, or Web Server. If you are running Windows Server 2008 with SP1, the Microsoft SharePoint Products Preparation Tool installs Windows Server 2008 SP2 automatically.
Note:
You must download an update for Windows Server 2008 and Windows Server 2008 R2 before you run Setup. The update is a hotfix for the .NET Framework 3.5 SP1 that is installed by the Preparation tool. It provides a method to support token authentication without transport security or message encryption in WCF. For more information and links, see the “Access to Applicable Software” section.

For information, see the related KB article Two issues occur when you deploy an ASP.NET 2.0-based application on a server that is running IIS 7.0 or IIS 7.5 in Integrated mode (http://go.microsoft.com/fwlink/p/?LinkId=192578).

The preparation tool installs the following prerequisites:

  • Web Server (IIS) role
  • Application Server role
  • Microsoft .NET Framework version 3.5 SP1
  • Microsoft Sync Framework Runtime v1.0 (x64)
  • Microsoft Filter Pack 2.0
  • Microsoft Chart Controls for the Microsoft .NET Framework 3.5
  • Windows PowerShell 2.0
  • SQL Server 2008 Native Client
  • Microsoft SQL Server 2008 Analysis Services ADOMD.NET
  • ADO.NET Data Services Update for .NET Framework 3.5 SP1
  • A hotfix for the .NET Framework 3.5 SP1 that provides a method to support token authentication without transport security or message encryption in WCF.
  • Windows Identity Foundation (WIF)
Note:
If you have Microsoft “Geneva” Framework installed, you must uninstall it before you install the Windows Identity Foundation (WIF).
Client computer

Optional software

Optional software for each environment:

Single server with built-in database and front-end Web servers and application servers in a farm

The preparation tool installs the following optional software:

  • Microsoft SQL Server 2008 R2 Reporting Services Add-in for Microsoft SharePoint Technologies 2010 (SSRS) to use Access Services for SharePoint Server 2010. For the download, go to the Download Center(http://go.microsoft.com/fwlink/p/?LinkID=192588).
  • Microsoft Server Speech Platform to make phonetic name matching work correctly for SharePoint Search 2010.
Client computer
 
Posted on 25/11/2012 in SharePoint

 


Fixing the error when users cannot upload .swf files to a library or list in SharePoint 2010


A user with the Contribute permission level who uploads a .swf file, or a file of another blocked type, to a SharePoint library or as a SharePoint list attachment gets an error message.

The blocked file types (WebFileExtensions) include:

  • ASPX
  • Master
  • XAP
  • SWF
  • JAR
  • ASMX
  • Ascx
  • XSN
  • XSF

To fix this:

Open the SharePoint 2010 Management Shell.

Run the following script:

$WebApp = Get-SPWebApplication http://YourSiteCollectionURL/
$Extensions = $WebApp.WebFileExtensions
$Ext = $Extensions.Remove("swf")
$WebApp.Update()

To view the list of WebFileExtensions, use the following script:

$WebApp = Get-SPWebApplication https://YourSiteCollectionURL/
$Extensions = $WebApp.WebFileExtensions
$Extensions | ForEach-Object {Write-Host $_}

To add an extension to the WebFileExtensions list, use .Add() instead of .Remove().

 
Posted on 03/10/2012 in SharePoint

 


SharePoint 2010 Powershell Feature Cmdlets


In this installment it's time to look at the various cmdlets that deal with Features. Of course you can do this through the UI, but it is much easier to do it with PowerShell and, dare I say, more fun.

Keep in mind that this only relates to farm-level features. I will cover Sandbox solutions and features next!

Listing features on Farm, Site Collection and Site

The main cmdlet used within PowerShell to list features is Get-SPFeature. To show all the features on the farm, sorted by display name, use:

Get-SPFeature | Sort -Property DisplayName

To show all the features on the farm grouped by scope in a table, use:

Get-SPFeature | Sort -Property Scope, DisplayName | FT -GroupBy Scope DisplayName

To see all features for a Web Application:

Get-SPFeature -WebApplication http://webapplication

To see all features for a Site Collection:

Get-SPFeature -Site http://sitecollection

To see all features for a Site:

Get-SPFeature -Web http://siteurl

For more information about the features, you can use:

Get-SPFeature -Web http://siteurl | Format-List

To see all the members that a feature definition has, use:

Get-SPFeature -Web http://siteurl | Get-Member

Enabling and Disabling Features

Disabling and enabling features is easy with the Disable-SPFeature and Enable-SPFeature cmdlets, but there is a trick: you need the name of the feature folder that contains the actual feature, not the name displayed in the UI, so be careful:

Enable-SPFeature -Identity "Hold" -URL http://url

You can apply this to any Site and Site Collection scoped features.
To disable a feature, use the same syntax with the Disable-SPFeature cmdlet:

Disable-SPFeature -Identity "Hold" -URL http://url

Remember that -Identity is the DisplayName property of the feature, not the text displayed in the UI, which is retrieved from a resources file.
For example, the feature shown as Document Sets in the SharePoint interface has to be enabled with the following cmdlet:

Enable-SPFeature -Identity DocumentSet -URL http://url

Installing and Uninstalling Features

Once again this is pretty straightforward and involves only two cmdlets: Install-SPFeature and Uninstall-SPFeature.
To install a feature, specify the name of the folder that contains your feature:

Install-SPFeature "FeatureFolderName"

To uninstall it, use the Uninstall-SPFeature cmdlet with the same parameters:

Uninstall-SPFeature "FeatureFolderName"
 
Posted on 03/05/2012 in SharePoint

 

How to: Create a Windows Communication Foundation Client


See the following link:

How to: Create a Windows Communication Foundation Client

Note: SvcUtil.exe can be found at C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin.

 
Posted on 24/04/2012 in WCF