CIKM 2014 - Understanding Within-Content Engagement through Pattern Analysis of Mouse Gestures
Understanding Within-Content Engagement through Pattern Analysis of Mouse Gestures
Ioannis Arapakis, Mounia Lalmas, George Valkanas
Yahoo Labs, Barcelona
Background Information
§ Abundance of multimedia content
§ Availability of large volumes of interaction data
§ Scalable data mining techniques
Part of the efforts have focused on understanding how users interact and engage with web content
Measurement of within-content engagement remains a difficult and unsolved task
personalisation
service quality
ad quality
Recommender algorithms
Challenges
§ Lack of standardised methodologies
§ Absence of well-validated measures
§ Users often don’t provide explicit feedback about their QoE
§ Existing methods don’t form scalable solutions
§ Traditional web analytics (e.g., clicks, dwell time, pageviews) vs. users’ true intentions and motivations
Why Mouse Tracking?
§ The navigation through & interaction with a digital environment involves in most cases the use of a mouse (i.e., selecting, positioning, clicking)
§ Several works have shown that the mouse cursor is a weak proxy of gaze (attention)
§ Low-cost, scalable alternative
§ Can be easily performed in a non-invasive manner, without removing users from their natural setting
News Dataset
§ News corpus of 383 news articles (~300−600 words per article)
  • crime & law
  • entertainment & lifestyle
  • science
§ 24 editors (non-participants) evaluated the titles of 40 randomly selected news articles on a 5-point interestingness scale
§ Pre-ranked the news articles and narrowed down our selection to the three most interesting and three least interesting, per genre, prior to conducting our study
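The pre-ranking step can be illustrated with a small sketch. It assumes titles are ordered by the mean of the editors' 5-point scores, which the slide does not state; the function name `pick_extremes` and the toy ratings are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Dict, List, Tuple


def pick_extremes(ratings: Dict[str, List[int]], n: int = 3) -> Tuple[List[str], List[str]]:
    """Return the n highest- and n lowest-rated titles by mean score.

    Ranking by mean score is an assumption made for this sketch.
    """
    ranked = sorted(ratings, key=lambda title: mean(ratings[title]), reverse=True)
    return ranked[:n], ranked[-n:]


# Toy ratings for one genre: title id -> 5-point scores from several editors.
toy = {
    "t1": [5, 4, 5], "t2": [1, 2, 1], "t3": [3, 3, 4], "t4": [4, 4, 4],
    "t5": [2, 2, 3], "t6": [5, 5, 5], "t7": [1, 1, 1], "t8": [3, 2, 3],
}
most, least = pick_extremes(toy)
print("most interesting:", most)
print("least interesting:", least)
```

Running the same selection once per genre would yield the six titles per genre (18 in total) that participants later rank.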
Experimental Method
§ Two independent variables
  • article genre (“crime and law”, “entertainment and lifestyle”, “science”)
  • article interestingness (two levels: “interesting”, “uninteresting”)
§ 22 participants (female = 9, male = 13)
§ Two news reading tasks:
  • 1 interesting news article + 1 uninteresting news article
  • article interestingness was determined by asking the participants to rank the 6 available news titles per genre (18 in total), from the most interesting to the least interesting
Measures of Engagement
§ User Engagement Scale (UES)
  • Positive affect (PAS)
  • Negative affect (NAS)
  • Perceived usability
  • Felt involvement and focused attention
  • Custom statements, e.g., “I found the news article interesting to read”
§ Gaze (proxy of attention) recorded using
  • Tobii 1750 eye tracker
§ Cursor position
  • Smt2, an open source, client-server architecture mouse tracking tool
Mouse Gestures
[Figure: cursor positions (x0, y0) to (x8, y8) along a time axis t, with resting periods (Δt rest) of 500 ms, 1000 ms, and 1500 ms and a click marked]
176,550 cursor positions
2,913 mouse gestures
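The segmentation suggested by this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the tracker logs a sample only when the cursor moves, so a rest shows up as a time gap between consecutive samples. The `Sample` layout, the function name `split_into_gestures`, and the toy log are hypothetical, and clicks are not handled.

```python
from typing import List, Tuple

# One cursor sample: (timestamp in ms, x, y). This layout is an assumption.
Sample = Tuple[int, float, float]


def split_into_gestures(samples: List[Sample], rest_ms: int = 500) -> List[List[Sample]]:
    """Split a time-ordered cursor log into gestures.

    A new gesture starts whenever the gap between two consecutive samples
    is at least `rest_ms`, i.e. the cursor rested for that long.
    """
    gestures: List[List[Sample]] = []
    current: List[Sample] = []
    for sample in samples:
        if current and sample[0] - current[-1][0] >= rest_ms:
            gestures.append(current)
            current = []
        current.append(sample)
    if current:
        gestures.append(current)
    return gestures


# Toy log with a 600 ms rest and a 1550 ms rest.
log = [(0, 10, 10), (50, 12, 11), (100, 15, 13),
       (700, 15, 13), (750, 20, 18),
       (2300, 20, 18), (2350, 25, 25)]
for threshold in (500, 1000, 1500):
    sizes = [len(g) for g in split_into_gestures(log, rest_ms=threshold)]
    print(threshold, sizes)
```

Raising the rest threshold merges short pauses into one longer gesture, which is why the same log can yield different gesture counts at 500 ms, 1000 ms, and 1500 ms.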
Feature Engineering
§ Time
§ Coverage
§ Type
§ Distance
§ Speed
§ Acceleration
§ Direction
§ Rotations
§ FFT
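A few of the listed feature families (time, distance, speed, direction) can be computed from a single gesture roughly as below. The exact feature definitions used in the study are not given on the slide, so the formulas, the `Sample` layout, and the name `gesture_features` are assumptions made for illustration only.

```python
import math
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]  # (time in ms, x, y); assumed layout


def gesture_features(gesture: List[Sample]) -> Dict[str, float]:
    """Compute a few illustrative features for one gesture (>= 2 samples)."""
    duration = gesture[-1][0] - gesture[0][0]
    # Path length: sum of Euclidean distances between consecutive samples.
    steps = [math.dist(a[1:], b[1:]) for a, b in zip(gesture, gesture[1:])]
    distance = sum(steps)
    # Straight-line displacement from the first to the last sample.
    displacement = math.dist(gesture[0][1:], gesture[-1][1:])
    # Overall direction of travel in degrees, measured from the +x axis.
    dx = gesture[-1][1] - gesture[0][1]
    dy = gesture[-1][2] - gesture[0][2]
    direction = math.degrees(math.atan2(dy, dx))
    return {
        "time_ms": duration,
        "distance_px": distance,
        "displacement_px": displacement,
        "mean_speed_px_per_ms": distance / duration if duration > 0 else 0.0,
        "direction_deg": direction,
    }


features = gesture_features([(0, 0, 0), (100, 3, 4), (200, 6, 8)])
print(features)
```

Acceleration, rotations, and FFT-based features would follow the same pattern, operating on the per-step speeds or on the coordinate series.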
[Figure: two scatter plots with axes x and y]
Clustering Mouse Gestures
§ Perform the clustering for k = 1..40
  • Agglomerative Hierarchical Clustering
  • Cobweb
  • EM
  • K-Means
  • Spectral Clustering
§ Compute cluster validity using a large number of internal criteria; each criterion results in a ranking
§ Perform Rank Aggregation to derive a single ranked list L' that has the minimum distance from a given set of ranked input lists L = {L1, L2, …, Lm}
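The rank aggregation step can be sketched as an exhaustive search for the list with the smallest total distance to all input lists. The slide does not name the distance or the search method; this sketch assumes Kendall tau distance and brute force over permutations, which is only feasible for a handful of items. The configuration labels and input rankings are toy values, not results from the study.

```python
from itertools import combinations, permutations
from typing import List, Sequence, Tuple


def kendall_tau(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of item pairs that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def aggregate(rankings: List[Sequence[str]]) -> Tuple[str, ...]:
    """Brute-force search for the ranking with minimum total distance."""
    items = sorted(rankings[0])
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau(cand, r) for r in rankings),
    )


# Each list ranks the same clustering configurations under one validity criterion.
rankings = [
    ["config_a", "config_b", "config_c"],
    ["config_a", "config_c", "config_b"],
    ["config_b", "config_a", "config_c"],
]
best = aggregate(rankings)
print(best)
```

With 5 algorithms and 40 values of k there are far too many configurations for exhaustive search, so a practical pipeline would need a heuristic or approximate optimiser instead.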
Towards a Taxonomy
§ The top-ranked clustering configuration is the Spectral Clustering for the original dataset, with hyperbolic tangent kernel, for k = 38
Findings: User × Task Interactions
§ 44 news reading tasks
§ 5-point scale where high scores represent stronger agreement and low scores represent weaker agreement with the given statement
§ Significant difference (z = −3.817, p < .001, r = −0.171) in the frequency distribution of mouse gestures between the interesting and uninteresting task
§ Indicates that certain types of mouse gestures occur more or less often, depending on how interesting the news article is perceived to be
Findings: User × Task Interactions
Gaze behaviour variation
Interesting news article
- took significantly less time to perform their first fixation (TFF) on an interesting news article
- performed more fixations (FC)
- fixations lasted for longer periods (FD, TFD)
- looked more times at the body of the article (VC)
- and the duration of each visit lasted longer (VD)
Uninteresting news article
- fixated more times on other elements (FB)
Correlation Analysis
Predicting Interestingness
Classifier performance metrics

Classifier             Precision  Recall  F-Measure  Accuracy
Baseline               .273       .523    .359       .522
1NN                    .664       .659    .659       .659
SMO                    .700       .682    .678       .681
RandomForest           .727       .727    .727       .727
Stacking (1NN + SMO)   .751       .750    .750       .750
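To make the comparison with the baseline explicit, the accuracy gains can be worked out directly from the reported figures. The snippet below only does arithmetic on the numbers in the table; the variable names are illustrative.

```python
# Accuracy values copied from the table on this slide.
accuracy = {
    "Baseline": 0.522,
    "1NN": 0.659,
    "SMO": 0.681,
    "RandomForest": 0.727,
    "Stacking (1NN + SMO)": 0.750,
}

baseline = accuracy["Baseline"]
# Gain over the baseline in percentage points, rounded to one decimal place.
gains = {
    name: round((acc - baseline) * 100, 1)
    for name, acc in accuracy.items()
    if name != "Baseline"
}
for name, gain in gains.items():
    print(f"{name}: +{gain} points")
```

The stacked model's gain comes to roughly 23 percentage points, an absolute difference in accuracy rather than a relative one.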
Conclusions
§ The frequency distribution of mouse gestures varies per user and content (interesting vs. uninteresting)
§ Certain types of mouse gestures occur more or less often, depending on how interesting the news article is perceived to be
§ We report several medium-size correlations between certain mouse gestures and eye metrics, which connects our approach to analysing cursor behaviour with gaze
Conclusions
§ We report for the first time several significant correlations between certain types of mouse gestures and preNAS, prePAS, postNAS, postPAS, affect, and focused attention
§ Correlations indicate that cursor behaviour can go beyond measuring frustration to inform us about the positive and negative valence of an interaction
§ Our model outperforms the baseline, improving accuracy by about 23 percentage points (.750 vs. .522)
Limitations
§ Capturing and analysing cursor behaviour is a low-cost and scalable alternative
§ Recording the cursor position is easy to deploy and can be performed in a non-invasive manner, without removing users from their natural setting
§ However, we need to consider two types of costs:
  • Network cost (large amounts of data transferred)
  • Time cost (injecting the JavaScript into the user’s browser may slow down all interactions)
§ Difficult to generalize; creating a ground truth that can capture the diversity observed in the user population, in a controlled manner, is a challenging task; large numbers of users are required
Questions?
This work was supported by the MULTISENSOR project, partially funded by the European Commission under contract number FP7-610411
iarapakis
http://www.slideshare.net/iarapakis/