Speakers
Mrs Elena Sibirtseva (National Research University Higher School of Economics)
Prof. Ivan Gostev (JINR LIT)
Description
Gaze tracking has a wide range of applications, such as human-computer interaction, neurophysiological research, security systems, and car-accident prevention systems. Gaze direction can be tracked in real time, or recorded and then post-processed. This research focuses on real-time tracking, since it gives an immediate response to the user and can be applied to computer control. The most crucial property for real-time performance is processing speed, so CUDA is used to accelerate the developed system.
The developed gaze-tracking system applies a step-by-step approach to gaze detection. First, the video stream is captured, and each frame is converted to greyscale and mirrored. Then the face is detected in the image using Haar cascades. Knowing the face position, the eye region is easy to find: based on the anatomical features of the human face, it is assumed that the eyes are located in the top third of the face, with width and height equal to 35% and 30% of the face's size, respectively.
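A minimal sketch of the capture, preprocessing, and eye-region extraction steps, assuming OpenCV's stock Haar cascade; the cascade file name and the exact placement of the eye regions inside the top third of the face are illustrative assumptions, not the authors' code:

#include <opencv2/objdetect.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <vector>

int main() {
    cv::VideoCapture cap(0);                                 // capture the video stream
    cv::CascadeClassifier face("haarcascade_frontalface_default.xml");
    cv::Mat frame, grey;
    while (cap.read(frame)) {
        cv::cvtColor(frame, grey, cv::COLOR_BGR2GRAY);       // convert to greyscale
        cv::flip(grey, grey, 1);                             // mirror horizontally
        std::vector<cv::Rect> faces;
        face.detectMultiScale(grey, faces);                  // Haar-cascade face detection
        for (const cv::Rect& f : faces) {
            // Anatomical assumption: the eyes lie in the top third of the face,
            // each region being 35% of the face width and 30% of its height.
            int ew = static_cast<int>(0.35 * f.width);
            int eh = static_cast<int>(0.30 * f.height);
            cv::Rect leftEye(f.x, f.y, ew, eh);
            cv::Rect rightEye(f.x + f.width - ew, f.y, ew, eh);
            cv::Mat leftPatch = grey(leftEye);
            cv::Mat rightPatch = grey(rightEye);
            // ... eye-centre localisation runs on these patches ...
        }
    }
    return 0;
}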
After the eye region is detected, the eye centre is localised. This step runs in parallel for the left and right eyes. The vector field of image gradients is computed from the first partial derivatives. To reduce computational complexity, only gradient vectors with significant magnitude are kept; the threshold is calculated dynamically for each frame as 0.3 * standard deviation + mean of the gradient magnitudes. As a result, the pupil centre is detected. In parallel with eye-centre localisation, a similar operation detects the centre of the reflection from an IR diode; the only difference is that it is performed on an inverted image.
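A hypothetical CUDA sketch of the gradient pre-filtering described above, for an 8-bit greyscale eye patch already resident in GPU memory; the kernel names and launch geometry are illustrative, not taken from the authors' implementation:

#include <cuda_runtime.h>
#include <math.h>

// First partial derivatives via central differences, plus gradient magnitude.
__global__ void gradientKernel(const unsigned char* img, float* gx, float* gy,
                               float* mag, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x <= 0 || y <= 0 || x >= w - 1 || y >= h - 1) return;
    int i = y * w + x;
    float dx = 0.5f * (img[i + 1] - img[i - 1]);
    float dy = 0.5f * (img[i + w] - img[i - w]);
    gx[i] = dx;
    gy[i] = dy;
    mag[i] = sqrtf(dx * dx + dy * dy);
}

// Discard gradients below the dynamic threshold
// t = 0.3 * stddev + mean of the gradient magnitudes (computed on the host).
__global__ void thresholdKernel(float* gx, float* gy, const float* mag,
                                float t, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && mag[i] < t) { gx[i] = 0.0f; gy[i] = 0.0f; }
}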
The final step is estimation of the gaze-direction vector. It relies on a preliminary calibration in which the user looks at 5 points on the screen in sequence. The relative position of the pupil centre and the reflection is then converted to a gaze-direction vector.
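The abstract does not specify how the pupil-reflection offset is mapped to screen coordinates; a minimal sketch under the common assumption of an affine map fitted to the 5 calibration samples by least squares (the name fitGazeMap is hypothetical):

#include <opencv2/core.hpp>
#include <vector>

// offsets[i] : pupil centre minus reflection centre at calibration point i
// targets[i] : known screen coordinates of calibration point i
// Returns a 2x3 affine matrix M such that screen ~ M * [dx, dy, 1]^T.
cv::Mat fitGazeMap(const std::vector<cv::Point2f>& offsets,
                   const std::vector<cv::Point2f>& targets) {
    const int n = static_cast<int>(offsets.size());   // 5 in this system
    cv::Mat A(n, 3, CV_32F), bx(n, 1, CV_32F), by(n, 1, CV_32F);
    for (int i = 0; i < n; ++i) {
        A.at<float>(i, 0) = offsets[i].x;
        A.at<float>(i, 1) = offsets[i].y;
        A.at<float>(i, 2) = 1.0f;
        bx.at<float>(i, 0) = targets[i].x;
        by.at<float>(i, 0) = targets[i].y;
    }
    cv::Mat rowX, rowY, M(2, 3, CV_32F);
    cv::solve(A, bx, rowX, cv::DECOMP_SVD);   // least-squares fit, x row
    cv::solve(A, by, rowY, cv::DECOMP_SVD);   // least-squares fit, y row
    rowX.reshape(1, 1).copyTo(M.row(0));
    rowY.reshape(1, 1).copyTo(M.row(1));
    return M;
}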
Processing each frame takes approximately 17 ms, so the system can track gaze at 60 fps. However, asynchronous CUDA streams make it possible to overlap frame processing: while one frame is being processed, the next is being transferred to or from GPU memory. This raises the tracking rate to 120 fps with an insignificant lag of 1-2 frames.
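A minimal sketch of this overlap, assuming two alternating CUDA streams and pinned host buffers; identifiers such as processFrame are illustrative placeholders for the per-frame pipeline:

#include <cuda_runtime.h>

__global__ void processFrame(const unsigned char* in, unsigned char* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 255 - in[i];   // placeholder for the per-frame work
}

void trackLoop(unsigned char* hostFrames[2], int frameBytes, int nFrames) {
    cudaStream_t stream[2];
    unsigned char *dIn[2], *dOut[2];
    for (int s = 0; s < 2; ++s) {
        cudaStreamCreate(&stream[s]);
        cudaMalloc(&dIn[s], frameBytes);
        cudaMalloc(&dOut[s], frameBytes);
    }
    for (int f = 0; f < nFrames; ++f) {
        int s = f & 1;   // alternate between the two streams
        // Upload frame f while the previous frame is still being
        // processed in the other stream.
        cudaMemcpyAsync(dIn[s], hostFrames[s], frameBytes,
                        cudaMemcpyHostToDevice, stream[s]);
        processFrame<<<(frameBytes + 255) / 256, 256, 0, stream[s]>>>(
            dIn[s], dOut[s], frameBytes);
        cudaMemcpyAsync(hostFrames[s], dOut[s], frameBytes,
                        cudaMemcpyDeviceToHost, stream[s]);
    }
    for (int s = 0; s < 2; ++s) {
        cudaStreamSynchronize(stream[s]);
        cudaFree(dIn[s]);
        cudaFree(dOut[s]);
        cudaStreamDestroy(stream[s]);
    }
}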
Primary author
Prof. Ivan Gostev (JINR LIT)
Co-author
Mrs Elena Sibirtseva (National Research University Higher School of Economics)