
Evaluation of a Remote Webcam-Based Eye Tracker

Henrik Skovsgaard
IT University of Copenhagen

Rued Langgaards Vej 7

2300 Copenhagen S

[email protected]

Javier San Agustin
IT University of Copenhagen

Rued Langgaards Vej 7

2300 Copenhagen S

[email protected]

Sune Alstrup Johansen
IT University of Copenhagen

Rued Langgaards Vej 7

2300 Copenhagen S

[email protected]

John Paulin Hansen
IT University of Copenhagen

Rued Langgaards Vej 7

2300 Copenhagen S

[email protected]

Martin Tall
Duke University

2424 Erwin Rd. (Hock Plaza)

Durham, NC 27705

[email protected]

ABSTRACT

In this paper we assess the performance of an open-source gaze tracker in a remote (i.e., table-mounted) setup, and compare it with two commercial eye trackers. An experiment with 5 subjects showed the open-source eye tracker to have a significantly higher level of accuracy than one of the commercial systems, the Mirametrix S1, but also a higher error rate than the other commercial system, a Tobii T60. We conclude that the webcam solution may be viable for people who need a substitute for mouse input but cannot afford a commercial system.

Categories and Subject Descriptors

H.5.2 [Information Interfaces and Presentation]: User interfaces—Evaluation/methodology

General Terms

Human factors, Experimentation, Performance, Measurement

Keywords

Gaze interaction, low-cost gaze tracking, performance evaluation, universal access

1. INTRODUCTION

Gaze tracking systems enable people with severe motor disabilities to communicate using only their eye movements. However, some of these users cannot afford a commercial system, which costs between $5,000 and $30,000. While the quality of the systems has improved dramatically over the years, the price has remained more or less constant. Systems that employ low-cost, off-the-shelf hardware components are becoming increasingly popular as camera technology improves.


The use of off-the-shelf hardware components in gaze tracking represents a growing research field [2].

In 2004, Babcock and Pelz [1] presented a head-mounted eye tracker that uses two small cameras attached to a pair of safety glasses. Li et al. [4] extended their work and built a similar system that worked in real time, called OpenEyes. Being head-mounted, both systems are affected by head movements and are thus not suitable for use in combination with a desktop computer. Although the components used in the systems described above are inexpensive, assembling the hardware requires advanced knowledge of electronics. Zielinski's Opengazer system [10], based on a remote webcam, takes a simpler hardware approach. Its gaze estimation method does not tolerate head movements, however, and therefore the user needs to keep the head still after calibration.

Sewell and Komogortsev [7] developed a neural-network-based eye tracker able to run on a personal computer's built-in webcam under normal lighting conditions (i.e., no infrared light). The aim of their study was to employ eye tracking without any modifications to the hardware. The five participants in the study complained that the marker felt jumpy even during fixations, and that it was unstable during use.

The ITU Gaze Tracker¹ is open-source gaze tracking software that can be used with low-cost, off-the-shelf hardware, such as webcams and video cameras. The software tracks the pupil and one or two corneal reflections produced by infrared light sources. The first version of the system was introduced and evaluated by San Agustin et al. [6]. The results indicated that a low-cost system built with a webcam could have the same performance as expensive commercial systems. However, the system required placing the webcam very close to the user's eye, which was not comfortable. Furthermore, the camera blocked part of the user's view. Being head-mounted, the system also required the user to sit completely still, as head movements affected the cursor position.

The second version of the system enables remote eye tracking by using a camera with a narrow field of view. The same webcam used in [6] can be employed by replacing the standard wide-angle lens with an inexpensive 16 mm zoom lens. Figure 1 shows the hardware configuration for such a remote system with a webcam and two light sources.

¹ http://www.gazegroup.org


Figure 1: Hardware configuration for the webcam-based gaze tracker.

The aim of this study is to investigate whether the performance of the remote, webcam-based ITU Gaze Tracker (costing around $100) can match the performance of two commercial gaze-tracking systems, a Tobii T60 ($25,000) and a Mirametrix S1 ($6,000), in an interaction task.

2. PERFORMANCE METRICS

2.1 Accuracy and Precision

The performance of a sensor is typically measured in accuracy and precision, where accuracy refers to the degree to which the sensor readings represent the true value of what is measured, while precision (also known as spatial resolution) refers to the extent to which successive readings of the same physical phenomenon agree in value [8].

The working copy of the COGAIN report Eye tracker accuracy terms and definitions [5] provides a set of definitions and terminologies for measuring the accuracy and precision of an eye tracking system. Here, accuracy A_deg is defined as the average angular distance θ_i (measured in degrees of visual angle) between the n fixation locations and the corresponding fixation targets (Equation 1).

$A_{deg} = \frac{1}{n} \sum_{i=1}^{n} \theta_i$    (1)

Spatial precision is calculated as the root mean square, RMS, of the angular distances θ_i (measured in degrees of visual angle) between successive samples (x_i, y_i) and (x_{i+1}, y_{i+1}) (Equation 2).

$RMS_{deg} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \theta_i^2}$    (2)
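For illustration, a minimal Python sketch of these two metrics might look as follows (the function names are our own; the paper does not publish code). It assumes the angular distances have already been computed in degrees of visual angle:

    import math

    def accuracy_deg(fixation_errors):
        # Equation 1: mean angular distance (in degrees) between the
        # recorded fixation locations and the corresponding targets.
        return sum(fixation_errors) / len(fixation_errors)

    def precision_rms_deg(successive_distances):
        # Equation 2: RMS of the angular distances (in degrees) between
        # successive gaze samples recorded during a fixation.
        n = len(successive_distances)
        return math.sqrt(sum(d * d for d in successive_distances) / n)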

The working copy of the COGAIN report does not state how the angular distances θ should be calculated. Distances are typically measured in pixels on computers, so for this experiment we used a function that maps a distance in pixels, Δpx, to degrees of visual angle, Δ°. Besides the distance in pixels, the physical size of a pixel S and the distance from the user to the screen D need to be known (Equation 3).

$\Delta^{\circ} = \frac{360}{\pi} \cdot \tan^{-1}\left(\frac{\Delta_{px} \cdot S}{2 \cdot D}\right)$    (3)
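As a concrete example, the conversion in Equation 3 could be implemented as in the sketch below (the function name is our own; the pixel size of roughly 0.026 cm is our estimate for the 17" 1280×1024 monitor, and 60 cm matches the viewing distance described in Section 3):

    import math

    def pixels_to_degrees(delta_px, pixel_size_cm, distance_cm):
        # Equation 3: map a distance in pixels to degrees of visual angle.
        return (360.0 / math.pi) * math.atan(
            (delta_px * pixel_size_cm) / (2.0 * distance_cm))

    # Example: a 100-pixel offset seen from 60 cm on a 17" 1280x1024
    # monitor (about 0.026 cm per pixel) is roughly 2.5 degrees.
    print(pixels_to_degrees(100, 0.026, 60.0))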

2.2 Target Acquisition

In order to evaluate the performance of the different input devices, we followed the methodology described by the ISO 9241-9 standard for non-keyboard input devices [3]. The performance of each device is quantified by its throughput and error rate.

Calculating the throughput is based on the effective target width W_e and the effective distance D_e, which are used to calculate the effective index of difficulty ID_e following Equation 4. Throughput is measured in bits per second (bps) and is calculated as the ratio between the effective index of difficulty ID_e and the movement time MT (Equation 5) [9].

$ID_e = \log_2\left(\frac{D_e}{W_e} + 1\right), \quad W_e = 4.133 \cdot SD_x$    (4)

$Throughput = \frac{ID_e}{MT}$    (5)
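To make Equations 4 and 5 concrete, a sketch along these lines could compute the throughput for one block of trials (function and variable names are our own; per-trial selection endpoints along the task axis and movement times are assumed to be available):

    import math
    import statistics

    def throughput_bps(endpoints, effective_distance, movement_times):
        # W_e = 4.133 times the standard deviation of the selection
        # endpoints (Equation 4).
        w_e = 4.133 * statistics.stdev(endpoints)
        # ID_e = log2(D_e / W_e + 1), in bits (Equation 4).
        id_e = math.log2(effective_distance / w_e + 1)
        # Throughput = ID_e / MT, in bits per second (Equation 5).
        return id_e / statistics.mean(movement_times)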

3. PERFORMANCE EVALUATION

3.1 Participants

A total of five participants, three male and two female, with ages ranging from 29 to 39 years (M = 34 years, SD = 4.3), volunteered to participate in the study. Three of the participants had no previous experience with gaze interaction. One of them wore contact lenses.

3.2 Apparatus

The computer used was a desktop with a 2.6 GHz Intel Dual Core processor and 3 GB RAM, running Windows XP SP3. We used the 17" monitor with a resolution of 1280×1024 that comes with the Tobii T60 system. Three gaze trackers and a Logitech optical mouse (for baseline comparison) were tested as input devices. Two of the three gaze trackers were the commercial systems Tobii T60 and Mirametrix S1. The third was the ITU Gaze Tracker using a Sandberg Nightcam 2 webcam running at 30 fps with a 16 mm lens, and two Sony HVL-IRM infrared light sources; the total cost was around $100. All three gaze trackers used a 9-point calibration procedure. Figure 2 shows the experimental setup.

3.3 Design and Procedure

After calibrating the system, participants completed an accuracy test followed by a 2D target-selection task. Participants sat approximately 60 cm away from the monitor and were asked to sit as still as possible. The experiment employed a within-subjects factorial design. The target-selection task had the following independent variables and levels:

• Device (4): Mouse, Tobii T60, Mirametrix, Webcam

• Amplitude (2): 450, 900 pixels

• Target Width (2): 75, 100 pixels

The dependent variables in the study were accuracy (degrees), precision (degrees), throughput (bps) and error rate (%). Each participant completed 4 blocks of 1 trial (i.e., 4 trials) for the accuracy and precision test, and 16 blocks of 15 trials (i.e., 240 trials) for the target-selection task, where device, amplitude, and target width were fixed within blocks. The order of input devices and tasks was counterbalanced across users to neutralize learning effects. Participants were encouraged to take a comfortable position in front of the computer and remain as still as possible during the test. The total test session lasted approximately 15 minutes.

Figure 2: Experimental setup. The participant is conducting the test using the Mirametrix system.

Immediately after a successful calibration, participants were instructed to gaze at a randomly appearing target in a 4×4 matrix (evenly distributed, with 100 pixels to the borders of the monitor). A new target appeared when a total of 50 samples had been recorded at 30 Hz. Premature samples were avoided with a smooth animated transition between targets plus a reaction delay of 600 ms. Furthermore, samples farther than M ± 3×SD away were considered outliers. To prevent distractions from cursor movements, we hid the cursor throughout the blocks except, of course, for the mouse condition.
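As an illustration of the outlier criterion, the filtering could look like this Python sketch (our own reconstruction, not the authors' code), applied to the 50 samples collected per target:

    import statistics

    def remove_outliers(samples):
        # Discard samples farther than 3 standard deviations from the
        # mean, following the M +/- 3 x SD criterion of the accuracy test.
        m = statistics.mean(samples)
        sd = statistics.stdev(samples)
        return [s for s in samples if abs(s - m) <= 3 * sd]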

Once the accuracy test was completed, the target-selection task started. Participants were presented with 15 circular targets arranged in a circle in the center of the screen. Targets were highlighted one by one, and participants were instructed to select the highlighted target as quickly and as accurately as possible. Selections were performed with the spacebar for the gaze trackers and a left-button click for the mouse condition. Activations outside the target area were regarded as misses and counted toward the error rate. Every selection ended the current trial and started the next one. Based on the amplitudes and target widths, the nominal indices of difficulty were between 2.5 and 3.7 bits.

4. RESULTS

4.1 Accuracy and Precision

Analysis of the accuracy and precision was performed using a one-way ANOVA with device as the independent variable, and accuracy and precision as the dependent variables. Of the 16,000 samples, 228 outliers were removed from the analysis. An LSD post-hoc test was applied after the analysis. Figure 3 shows a plot of the average accuracy and precision per device.

Mean accuracy for the mouse, Tobii, Mirametrix and webcam was 0.14°, 0.67°, 1.34° and 0.88°, respectively (left-side bars in Figure 3). The main effect of device on accuracy was statistically significant, F(3, 12) = 16.03, p < 0.001. The post-hoc test showed a significant difference between the mouse and all of the gaze trackers. Tobii performed significantly better than Mirametrix, t(4) = 3.65, p < 0.05. The webcam also performed significantly better than Mirametrix, t(4) = 4.42, p < 0.05. There was no significant difference between the webcam and Tobii, t(4) = 1.57, p > 0.05.

Figure 3: Accuracy and precision by device. Error bars show ± SD.

Mean precision for the mouse, Tobii, Mirametrix and webcam was 0.05°, 0.08°, 0.43° and 0.31°, respectively (right-side bars in Figure 3). Mauchly's test indicated that the assumption of sphericity had been violated, χ²(5) = 16.60, p < 0.01; therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.47). The results show that there was no significant effect of device on precision, F(1.42, 5.67) = 4.38, p = 0.08.

4.2 Throughput and Error Rate

Analysis of the target-selection task was performed using a 4×2×2 ANOVA with device, amplitude and target width as the independent variables, and throughput and error rate as the dependent variables. An LSD post-hoc test was applied after the analysis. All data were included.

Mean throughput for the mouse, Tobii, Mirametrix and webcam was 4.00, 2.63, 2.00 and 2.31 bps, respectively (left-side bars in Figure 4). The main effect of device on throughput was statistically significant, F(3, 12) = 9.61, p < 0.01. The post-hoc test showed a significant difference between the mouse and all other devices. There was a main effect of amplitude, F(3, 12) = 10.73, p < 0.05, with short amplitudes (M = 2.83 bps) yielding a significantly higher throughput than long amplitudes (M = 2.62 bps), t(4) = 3.30, p < 0.05. No significant effect of target width was found, F(3, 12) = 2.00, p = 0.23.

Mean error rate for the mouse, Tobii, Mirametrix and webcam was 5.34%, 19.21%, 39.29% and 27.50%, respectively (right-side bars in Figure 4). The main effect of device on error rate was statistically significant, F(3, 12) = 9.71, p < 0.01. The post-hoc test showed a significant difference between the mouse and all other devices. Tobii had a significantly lower error rate than the webcam, t(4) = 4.96, p < 0.05. We found no effect of amplitude, F(3, 12) = 0.37, p = 0.58, nor of target width, F(3, 12) = 0.37, p = 0.58.

Figure 4: Overall throughput and error rate by device. Error bars show ± SD.

5. DISCUSSION

Our results suggest that the accuracy of the webcam-based gaze tracker (0.88°) is significantly better than the accuracy of the Mirametrix system (1.34°), while showing no significant difference from the Tobii T60 (0.67°). This indicates that the ITU Gaze Tracker can be used in software applications meant to be controlled by gaze input.

Although we did not find any significant effect of device in the precision study, the data indicate that the mouse and the Tobii system had a higher precision than the Mirametrix S1 and the webcam-based system. It must be noted that precision is calculated after the low-pass filtering that the eye trackers perform on the data samples during fixations. This filtering smooths the signal and prevents a jittery cursor from annoying the user. The ITU Gaze Tracker gives users control over the level of smoothing during fixations, a feature that many commercial systems do not provide.

The results obtained in the target-selection task indicate that the webcam-based eye tracker performs similarly to the two commercial systems in terms of throughput. The error rate of the webcam tracker was, however, significantly higher than that of the Tobii T60. Throughput values were slightly lower than in previous studies [6, 9]. This may be due to the lower control over the hardware setup in our experiment, as well as the lack of experience of novice users, who tended to be rather slow.

6. CONCLUSION

Our performance evaluation shows that a remote, webcam-based eye tracker can have a performance comparable to expensive systems. However, there are other factors crucial to the practical usefulness of an eye tracking device that have not been evaluated in this study, such as the quality of the documentation and API, tolerance to head movements, ease of use, and stability over time.

In future work, we aim to investigate these issues further and implement new algorithms to improve performance. Specifically, we would like to explore how continuous recalibration and repositioning of the participants can improve performance over time. We would also like to test various hardware setups for the ITU Gaze Tracker (e.g., better cameras) and different algorithms for calculating the point of regard. A usability and user-experience study should also be conducted to include subjective measures of the different systems.

Finally, it is our hope that researchers, students and hobbyists will collaborate on the development of the software, and contribute to making the open-source ITU Gaze Tracker a more reliable system.

7. ACKNOWLEDGEMENTS

We would like to thank EYEFACT for supporting the experiment, and the open-source community for their help with improving the ITU Gaze Tracker.

8. REFERENCES

[1] J. S. Babcock and J. B. Pelz. Building a lightweight eyetracking headgear. In Proceedings of the 2004 Symposium on Eye Tracking Research & Applications, pages 109–114, San Antonio, Texas, 2004. ACM.

[2] J. P. Hansen, D. Hansen, and A. Johansen. Bringing gaze-based interaction back to basics. In Universal Access in HCI (UAHCI): Towards an Information Society for All, volume 3, pages 325–329, New Orleans, USA, 2001. Lawrence Erlbaum.

[3] ISO. Ergonomic requirements for office work with visual display terminals (VDTs) - Part 9: Requirements for non-keyboard input devices. International Organization for Standardization, 2000.

[4] D. Li, J. Babcock, and D. J. Parkhurst. openEyes. In Proceedings of the 2006 Symposium on Eye Tracking Research & Applications, pages 95–100, San Diego, California, 2006. ACM.

[5] F. Mulvey. Eye tracker accuracy terms and definitions (working copy). Technical report, COGAIN, 2010.

[6] J. San Agustin, H. Skovsgaard, J. P. Hansen, and D. W. Hansen. Low-cost gaze interaction: ready to deliver the promises. In Proceedings of CHI '09, pages 4453–4458, Boston, MA, USA, 2009. ACM.

[7] W. Sewell and O. Komogortsev. Real-time eye gaze tracking with an unmodified commodity webcam employing a neural network. In Proceedings of the 28th International Conference Extended Abstracts on Human Factors in Computing Systems, pages 3739–3744, New York, USA, 2010. ACM.

[8] A. D. Wilson. Sensor- and recognition-based input for interaction. In The Human-Computer Interaction Handbook, pages 177–199. Lawrence Erlbaum Associates, 2007.

[9] X. Zhang and I. S. MacKenzie. Evaluating eye tracking with ISO 9241 - Part 9. In Proceedings of the 12th International Conference on HCI: Intelligent Multimodal Interaction Environments, pages 779–788, Beijing, China, 2007. Springer.

[10] P. Zielinski. Opengazer: open-source gaze tracker for ordinary webcams. http://www.inference.phy.cam.ac.uk/opengazer/, 2010.