EmguCV: Rotating face images to align eyes

To increase the accuracy of face recognition algorithms, it is sometimes important to make sure the face is upright. For instance, in this image of Arnold, he is tilting his head, which may make it difficult to recognize him:

One way to pre-process this image is to rotate it so the face is upright. A quick way to do that is to find the eyes using a cascade classifier and then compute the angle of the line between them. This method, AlignEyes, takes an image and returns one that is rotated upright:

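public static Image<Gray, byte> AlignEyes(Image<Gray, byte> image)
{
    // minNeighbors is 0, so the raw (possibly overlapping) detections are returned.
    Rectangle[] eyes = EyeClassifier.DetectMultiScale(image, 1.4, 0, new Size(1, 1), new Size(50, 50));
    // Merge overlapping detections, then sort left to right by X.
    var unifiedEyes = CombineOverlappingRectangles(eyes).OrderBy(r => r.X).ToList();
    // Only rotate when exactly two eyes were found.
    if (unifiedEyes.Count == 2)
    {
        // Offsets between the centers of the two eye rectangles.
        var deltaY = (unifiedEyes[1].Y + unifiedEyes[1].Height/2) - (unifiedEyes[0].Y + unifiedEyes[0].Height/2);
        var deltaX = (unifiedEyes[1].X + unifiedEyes[1].Width/2) - (unifiedEyes[0].X + unifiedEyes[0].Width/2);
        double degrees = Math.Atan2(deltaY, deltaX)*180/Math.PI;
        // Skip large angles, which more likely come from false detections than from a real head tilt.
        if (Math.Abs(degrees) < 35)
        {
            image = image.Rotate(-degrees, new Gray(0));
        }
    }
    return image;
}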
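
To make the angle calculation concrete, here is a minimal sketch that isolates it from the detection step. The EyeAngle class and DegreesBetween method are names made up for this illustration; the arithmetic is the same as in AlignEyes, and it only needs System.Drawing.Rectangle, not EmguCV.

```csharp
using System;
using System.Drawing;

public static class EyeAngle
{
    // Hypothetical helper (not part of EmguCV): returns the tilt, in degrees,
    // of the line joining the centers of two eye rectangles.
    // Image Y grows downward, so a positive result means the right-hand eye is lower.
    public static double DegreesBetween(Rectangle leftEye, Rectangle rightEye)
    {
        int deltaY = (rightEye.Y + rightEye.Height / 2) - (leftEye.Y + leftEye.Height / 2);
        int deltaX = (rightEye.X + rightEye.Width / 2) - (leftEye.X + leftEye.Width / 2);
        return Math.Atan2(deltaY, deltaX) * 180 / Math.PI;
    }
}
```

For example, with a left eye at (80, 100) and a right eye at (140, 120), both 40×40, the centers are (100, 120) and (160, 140). That gives deltaY = 20 and deltaX = 60, so the angle is about 18.4 degrees, and AlignEyes would rotate the image by the negated angle.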

EyeClassifier is a cascade classifier using the training file included with EmguCV called “haarcascade_eye_tree_eyeglasses.xml”. You can use whichever training file you find works best.
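
AlignEyes also calls CombineOverlappingRectangles, which is not shown in this post. The following is only a hypothetical stand-in, assuming the helper's job is to merge intersecting detections into one rectangle per eye; the RectangleMerging class name is made up, and the real helper may work differently.

```csharp
using System.Collections.Generic;
using System.Drawing;

public static class RectangleMerging
{
    // Hypothetical stand-in for the helper used by AlignEyes.
    // Repeatedly replaces any two intersecting rectangles with their
    // bounding union until no two rectangles in the list intersect.
    public static List<Rectangle> CombineOverlappingRectangles(IEnumerable<Rectangle> rectangles)
    {
        var merged = new List<Rectangle>(rectangles);
        bool changed = true;
        while (changed)
        {
            changed = false;
            for (int i = 0; i < merged.Count && !changed; i++)
            {
                for (int j = i + 1; j < merged.Count; j++)
                {
                    if (merged[i].IntersectsWith(merged[j]))
                    {
                        // Merge j into i, then restart the scan, because the
                        // larger rectangle may now touch others.
                        merged[i] = Rectangle.Union(merged[i], merged[j]);
                        merged.RemoveAt(j);
                        changed = true;
                        break;
                    }
                }
            }
        }
        return merged;
    }
}
```

Each merge removes one rectangle, so the loop always terminates. The quadratic rescan is fine for the handful of rectangles an eye detector returns, but it would be a poor fit for very large inputs.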

And here is the result (the face has been cropped and masked in this image):

Face Detection for .NET using EmguCV

First of all, let me explain the difference between face “detection” and face “recognition”. There is a lot of misinformation out there about these two terms, and they are not interchangeable. Face detection is when a computer finds all the faces that appear in an image. One of the most widely used algorithms for this is the Viola-Jones method, which uses cascade classifiers. The Viola-Jones method can actually be trained to detect any object, so it isn’t specific to faces. For instance, it could be trained to detect apples in an image.

Face recognition is when a computer gives a name to a face image. There are many different algorithms for recognition, including Eigenfaces, Fisherfaces, and Local Binary Pattern Histograms.

Okay, now that you know the difference between detection and recognition, I will show you how to do simple detection using a CascadeClassifier in EmguCV.

First, we need to construct a classifier using one of the built-in training files. These can be found under the HaarCascades directory in the EmguCV installation directory. We make a new classifier like this:

private static readonly CascadeClassifier Classifier = new CascadeClassifier("haarcascade_frontalface_alt_tree.xml");

Secondly, classifiers only take grayscale images, so we convert our Bgr image to gray:

 Image<Gray, byte> grayImage = image.Convert<Gray, byte>(); 

Finally, we call the DetectMultiScale method on our classifier:

Rectangle[] rectangles = Classifier.DetectMultiScale(grayImage, 1.4, 0, new Size(100,100),new Size(800,800));

Let’s review these parameters because you will probably need to tweak them for your system. The first parameter is the grayscale image. The second parameter is the windowing scale factor. It must be greater than 1.0: the closer it is to 1.0, the longer detection takes, but the better the chance that you will find all the faces. 1.4 is a good place to start with this parameter.

The third parameter is the minimum number of nearest neighbors. The higher this number, the fewer false positives you will get. If this parameter is set to something larger than 0, the algorithm groups intersecting rectangles and only returns a result where the number of overlapping rectangles is greater than or equal to this minimum. If this parameter is set to 0, all rectangles are returned and no grouping happens, which means the results may contain several intersecting rectangles for a single face.

The last two parameters are the min and max sizes in pixels. The algorithm searches for faces with window sizes between the min size of 100×100 and the max size of 800×800, changing the window size by the scale factor of 1.4 at each step. The bigger the range between the min and max size, the longer the algorithm will take to complete.
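
To get a feel for how the scale factor and the size range trade off against each other, here is a rough sketch that lists the window sizes a search would visit. It is a simplified model written for this post, not EmguCV's actual implementation, and the ScaleSteps class and WindowSizes method are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class ScaleSteps
{
    // Simplified model (an assumption, not library code): lists square window
    // sizes from minSize up to maxSize, multiplying by scaleFactor each step.
    public static List<double> WindowSizes(double minSize, double maxSize, double scaleFactor)
    {
        if (scaleFactor <= 1.0)
            throw new ArgumentOutOfRangeException(nameof(scaleFactor), "Must be greater than 1.0.");

        var sizes = new List<double>();
        for (double size = minSize; size <= maxSize; size *= scaleFactor)
            sizes.Add(size);
        return sizes;
    }
}
```

Under this model, WindowSizes(100, 800, 1.4) produces 7 sizes (100, 140, 196, and so on up to about 753), while WindowSizes(100, 800, 1.1) produces 22. That is why a scale factor closer to 1.0 or a wider size range makes detection slower.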

The output of the DetectMultiScale function is a set of rectangles that represent where the faces are relative to the input image.
It’s as easy as that. With just a few lines of code, you can detect where all the faces are in any image.

You can download EmguCV here: http://www.emgu.com/wiki/index.php/Main_Page