Eye detection and tracking has been an active research field in recent years, as it adds convenience to a variety of applications. Eye-gaze tracking is considered a non-traditional method of Human-Computer Interaction, and eye tracking is regarded as one of the most accessible alternative interface methods. Several different approaches have been proposed and used to implement eye detection and tracking algorithms. Generally, an eye tracking and detection system can be divided into four steps: face detection, eye region detection, pupil detection, and eye tracking.
The proposed system has the following requirements: acquiring user images from a webcam by capturing snapshots, processing them, detecting the points essential for eye tracking, and performing a calibration process.
The procedure starts with image acquisition from a web camera, although a prerecorded video can also be used for testing purposes. The webcam is accessed through the MATLAB Support Package for USB Webcams by creating a webcam object. After image acquisition, a calibration block initializes the eye gaze for the pointer position on the screen. This block is decomposed into two stages: i) detection of the pupil position, and ii) calculation of the edge points from the center. The eye tracking block consists of pupil detection and tracking along the image sequence created by the video feed snapshots. With the pupil position and the transformation matrix, the point on the screen where the user is looking is determined. The system is based on the MATLAB Image Processing Toolbox and the Computer Vision Toolbox, which implement the key algorithms used here: the Viola-Jones Haar cascade classifier and the Hough transform. System operation can be divided into two phases: calibration and real-time processing.
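The acquisition step can be sketched as follows; this is a minimal example assuming the default camera detected by the support package:

    % Acquire frames from the default webcam (requires the MATLAB
    % Support Package for USB Webcams).
    cam = webcam;           % connect to the first detected camera
    img = snapshot(cam);    % capture a single RGB frame
    imshow(img);            % display the captured snapshot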
Firstly, face detection is performed with Haar classifiers. These classifiers are based on feature extraction, which, as discussed above, finds the contrast variation within a group of pixels, yielding two distinct areas of darker and lighter shades. The classifiers are trained with two groups of images: positive and negative examples of the specific features. The use of a common web camera imposes some limitations on the quality of the acquired image, particularly regarding lighting conditions. These limitations can cause missed face detections due to a lack of contrast. This problem is mitigated by combining different Haar filters, which makes the face detection step more robust. Locating a face in a photograph refers to finding the coordinates of the face in the image, whereas localization refers to demarcating the extent of the face via a bounding box around it. The bounding box is a rectangular box determined by the x and y coordinates of its upper-left corner and the x and y coordinates of its lower-right corner; the detector that produces it is part of the Computer Vision Toolbox in MATLAB. To minimize detection errors and reduce processing time, relevant regions are cropped for further processing. For eye detection, only the top half of the face image is used. This procedure helps minimize errors and speeds up the eye detection step.
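This step can be sketched with the Viola-Jones detector from the Computer Vision Toolbox; taking the first detected face is an assumption made for illustration:

    % Sketch of face detection and cropping; continues from the acquisition
    % sketch above (uses the frame `img`).
    faceDetector = vision.CascadeObjectDetector();  % default frontal-face Haar cascade
    bboxes = step(faceDetector, img);               % one [x y width height] row per face
    if ~isempty(bboxes)
        faceBox = bboxes(1, :);                     % keep the first detected face
        faceImg = imcrop(img, faceBox);             % crop the face region
        % Only the top half of the face is needed for eye detection
        topHalf = imcrop(faceImg, [1 1 faceBox(3) faceBox(4)/2]);
        imshow(insertShape(img, 'Rectangle', faceBox));
    end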
Once the system has the positions of the eyes, iris detection is performed. Iris detection is one of the most important steps in the calibration block, mainly due to its accuracy requirements. For this task, the circular Hough transform is used, which allows the system to identify the iris. To speed up the process, only the image window containing the eyes is used: a second detector is created for just the eye pair using the cascade object detector. In this step, the image is converted to grayscale and flipped horizontally to create a more precise bounding box.
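A sketch of this step, assuming the 'EyePairBig' cascade model and imfindcircles (the Image Processing Toolbox implementation of the circular Hough transform); the radius range in pixels is an assumed value that would need tuning for a given camera:

    % Sketch of iris localization; continues from the face-detection sketch
    % above (uses the cropped region `topHalf`).
    eyeDetector = vision.CascadeObjectDetector('EyePairBig');  % eye-pair cascade model
    eyeBox = step(eyeDetector, topHalf);
    if ~isempty(eyeBox)
        eyeImg = rgb2gray(imcrop(topHalf, eyeBox(1, :)));      % grayscale eye window
        % The iris is darker than the surrounding sclera, so search for dark circles
        [centers, radii] = imfindcircles(eyeImg, [8 20], ...
            'ObjectPolarity', 'dark', 'Sensitivity', 0.9);     % assumed radius range
    end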
After obtaining the pupil center, the calibration step is finished. The main algorithms of this phase (Haar classifiers and the Hough transform) are computationally demanding and not suitable for the real-time processing mode. For the tracking algorithm, the mean shift procedure is followed.
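One way to realize this in MATLAB is vision.HistogramBasedTracker from the Computer Vision Toolbox, which implements CAMShift, a continuously adaptive variant of mean shift; the sketch below is an assumed implementation, initialized on the eye region found during calibration:

    % Sketch of the real-time tracking loop; assumes the webcam object `cam`
    % and the regions `faceBox` and `eyeBox` from the sketches above.
    tracker = vision.HistogramBasedTracker;          % CAMShift (mean-shift-based) tracker
    % Express the eye region in full-frame coordinates (it was found in the
    % cropped top half of the face).
    fullEyeBox = round([faceBox(1:2) + eyeBox(1, 1:2) - 1, eyeBox(1, 3:4)]);
    hsv = rgb2hsv(img);
    initializeObject(tracker, hsv(:, :, 1), fullEyeBox);  % model the hue histogram
    while true
        frame = snapshot(cam);                       % grab the next frame
        hsv = rgb2hsv(frame);
        trackedBox = step(tracker, hsv(:, :, 1));    % updated eye bounding box
        imshow(insertShape(frame, 'Rectangle', trackedBox));
        drawnow;
    end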
When the iris is detected, a feature set for eye-gaze detection is generated, based on the distance from the eye corner positions to the center of the iris. From eye detection, we extract the iris localization, the eye corner positions, and the gaze direction. All extracted features are used to classify distracted and non-distracted eye behavior. The pupil is detected and taken as the center point. The distance between the right edge and the center point, as well as the distance between the left edge and the center point, is then calculated. If the distance to the left edge is greater than the distance to the right edge, an image saying “right” is displayed, and likewise for the left edge. If the distances are the same, an image saying “straight” is displayed.
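The decision rule reduces to a comparison of two distances. The sketch below uses hypothetical variable names (irisCenter, leftEdge, rightEdge, all x-coordinates in pixels) and adds a small tolerance, which is our assumption, since exact pixel equality is rare in practice:

    % Sketch of the gaze-direction decision rule described above.
    distLeft  = abs(irisCenter - leftEdge);   % iris center to left eye corner
    distRight = abs(rightEdge - irisCenter);  % iris center to right eye corner
    tol = 2;                                  % assumed pixel tolerance for "straight"
    if distLeft - distRight > tol
        disp('right');                        % iris shifted toward the right corner
    elseif distRight - distLeft > tol
        disp('left');                         % iris shifted toward the left corner
    else
        disp('straight');
    end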
Using the Hough transform, we were also able to extract measurements of the eyeball. The calculated radius of the eyeball is 8.4351 mm. Using the ‘regionprops’ function, we extracted the bounding box around the largest detected face as BoundingBox: [167.5000 295.5000 738 403].
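A sketch of this measurement step, assuming a binary mask bw of the segmented image (the mask name is hypothetical):

    % Sketch of extracting the bounding box of the largest detected region.
    stats = regionprops(bw, 'BoundingBox', 'Area');  % per connected component
    [~, idx] = max([stats.Area]);                    % pick the largest region
    faceBounds = stats(idx).BoundingBox;             % [x y width height]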
This project was created and implemented in MATLAB. The additional MATLAB Add-On packages and toolboxes required beyond the default installation are:
- Image Processing Toolbox
- Computer Vision Toolbox
- MATLAB Support Package for USB Webcams
For the project to yield proper output in view of our objective, it is important to have an appropriate video feed from a static camera with sufficient resolution. For this project we used a laptop webcam. For desktop computers, a USB camera can also be set up, provided the relevant packages are installed beforehand.
This project was inspired by the approach of Le Tan Phuc and Rafael Santos to building a Human-Computer Interface.