CIKM 2014 - Understanding Within-Content Engagement through Pattern Analysis of Mouse Gestures

Ioannis Arapakis, Mounia Lalmas, George Valkanas
Yahoo Labs, Barcelona

Background Information

§  Abundance of multimedia content
§  Availability of large volumes of interaction data
§  Scalable data mining techniques

Part of the research effort has focused on understanding how users interact and engage with web content

Measurement of within-content engagement remains a difficult and unsolved task

[Figure labels: personalisation, service quality, ad quality, recommender algorithms]

Challenges

§  Lack of standardised methodologies
§  Absence of well-validated measures
§  Users often don’t provide explicit feedback about their QoE
§  Existing methods don’t form scalable solutions
§  Traditional web analytics (e.g., clicks, dwell time, pageviews) vs. users’ true intentions and motivations

Why Mouse Tracking?

§  The navigation through & interaction with a digital environment involves in most cases the use of a mouse (i.e., selecting, positioning, clicking)

§  Several works have shown that the mouse cursor is a weak proxy of gaze (attention)

§  Low-cost, scalable alternative
§  Can be easily performed in a non-invasive manner, without removing users from their natural setting

News Dataset

§  News corpus of 383 news articles (~300−600 words per article)
    •  crime & law
    •  entertainment & lifestyle
    •  science

§  24 editors (non-participants) evaluated the titles of 40 randomly selected news articles on a 5-point interestingness scale

§  We pre-ranked the news articles and, prior to conducting our study, narrowed down our selection to the three most interesting and three least interesting per genre

Experimental Method

§  Two independent variables
    •  article genre (“crime and law”, “entertainment and lifestyle”, “science”)
    •  article interestingness (two levels: “interesting”, “uninteresting”)
§  22 participants (female = 9, male = 13)
§  Two news reading tasks:
    •  1 interesting news article + 1 uninteresting news article
    •  article interestingness was determined by asking the participants to rank the 6 available news titles per genre (18 in total), from the most interesting to the least interesting


Measures of Engagement

§  User Engagement Scale (UES)
    •  Positive affect (PAS)
    •  Negative affect (NAS)
    •  Perceived usability
    •  Felt involvement and focused attention
    •  Custom statements, e.g., “I found the news article interesting to read”
§  Gaze (proxy of attention) recorded using
    •  Tobii 1750 eye tracker
§  Cursor position
    •  Smt2, an open source, client-server architecture mouse tracking tool

Mouse Gestures

[Figure: a cursor trail of sampled positions (x0, y0) to (x8, y8) along a time axis t, annotated with resting-cursor intervals (Δt rest, 500 ms) and a click]

§  176,550 cursor positions
§  2,913 mouse gestures
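The figure labels suggest that a cursor trail is cut into gestures at resting-cursor intervals and clicks. A minimal sketch of that idea follows, under stated assumptions: the `Sample` type, the function name, and the rule that a rest shows up as a gap of at least 500 ms between logged samples are all hypothetical, and the slides do not give the authors' exact segmentation procedure.

```python
from dataclasses import dataclass
from typing import List

REST_MS = 500  # resting-cursor threshold shown on the slide


@dataclass
class Sample:
    """One logged cursor position (hypothetical record layout)."""
    t_ms: int           # timestamp in milliseconds
    x: float
    y: float
    click: bool = False


def split_into_gestures(samples: List[Sample], rest_ms: int = REST_MS) -> List[List[Sample]]:
    """Cut a time-ordered cursor trail into gestures at rests and clicks."""
    gestures: List[List[Sample]] = []
    current: List[Sample] = []
    for s in samples:
        # Assumption: the tracker logs only on movement, so a rest appears
        # as a gap of at least rest_ms since the previous sample.
        if current and s.t_ms - current[-1].t_ms >= rest_ms:
            gestures.append(current)
            current = []
        current.append(s)
        # Assumption: a click ends the gesture and stays inside it.
        if s.click:
            gestures.append(current)
            current = []
    if current:
        gestures.append(current)
    return gestures


trail = [
    Sample(0, 10, 10), Sample(50, 20, 15), Sample(100, 35, 18),
    Sample(700, 36, 18), Sample(750, 60, 40), Sample(800, 80, 55, click=True),
    Sample(850, 82, 56), Sample(900, 90, 70),
]
gestures = split_into_gestures(trail)
print([len(g) for g in gestures])
```

In this toy trail the 600 ms gap and the click each close a gesture, so the eight samples fall into three gestures.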

Feature Engineering

§  Time
§  Coverage
§  Type
§  Distance
§  Speed
§  Acceleration
§  Direction
§  Rotations
§  FFT

[Figure: two example mouse gestures plotted as cursor trajectories, axes x and y]
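A few of the listed feature families (time, coverage, distance, speed, acceleration, direction) can be illustrated for a single gesture. The sketch below is an illustration only: the slides name the feature families but not their formulas, so every definition here (bounding-box area for coverage, net-displacement heading for direction, and so on) is an assumption, as is the `(t_ms, x, y)` tuple layout.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (t_ms, x, y), hypothetical layout


def gesture_features(points: List[Point]) -> Dict[str, float]:
    """Compute illustrative features for one gesture (assumed definitions)."""
    if len(points) < 2:
        raise ValueError("need at least two cursor positions")
    steps = list(zip(points, points[1:]))
    seg_len = [math.hypot(b[1] - a[1], b[2] - a[2]) for a, b in steps]
    seg_dt = [(b[0] - a[0]) / 1000.0 for a, b in steps]
    # Per-segment speeds in px/s, skipping zero-length time steps.
    speeds = [d / dt for d, dt in zip(seg_len, seg_dt) if dt > 0]
    duration_s = (points[-1][0] - points[0][0]) / 1000.0
    distance = sum(seg_len)
    xs = [p[1] for p in points]
    ys = [p[2] for p in points]
    have_speed = duration_s > 0 and bool(speeds)
    return {
        "time_s": duration_s,
        "distance_px": distance,
        # Coverage taken here as the bounding-box area of the gesture.
        "coverage_px2": (max(xs) - min(xs)) * (max(ys) - min(ys)),
        "mean_speed_px_s": distance / duration_s if duration_s > 0 else 0.0,
        # Crude acceleration: change from first to last segment speed.
        "accel_px_s2": (speeds[-1] - speeds[0]) / duration_s if have_speed else 0.0,
        # Heading of the net displacement in degrees; screen y grows downward.
        "direction_deg": math.degrees(
            math.atan2(points[-1][2] - points[0][2], points[-1][1] - points[0][1])
        ),
    }


features = gesture_features([(0, 0, 0), (100, 30, 40), (200, 60, 80)])
print(features)
```

Rotations and FFT-based features are left out to keep the sketch short; they would need a definition of turning angle and a resampled trajectory, neither of which the slides specify.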

Clustering Mouse Gestures

§  Perform the clustering for k = 1..40
    •  Agglomerative Hierarchical Clustering
    •  Cobweb
    •  EM
    •  K-Means
    •  Spectral Clustering
§  Compute cluster validity using a large number of internal criteria; each criterion results in a ranking
§  Perform Rank Aggregation to derive a single ranked list L' that has the minimum distance from a given set of ranked input lists L = {L1, L2, …, Lm}
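To make the rank aggregation step concrete, the sketch below merges several per-criterion rankings into one consensus list. It uses a Borda count, which is a simple positional heuristic swapped in for illustration; it is not guaranteed to minimise the distance to the input lists, and the slides do not say which aggregation method or distance the authors used. The configuration names are hypothetical.

```python
from typing import Dict, List


def borda_aggregate(rankings: List[List[str]]) -> List[str]:
    """Merge ranked lists with a Borda count (best item first)."""
    scores: Dict[str, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # The top item of a list of n earns n - 1 points, the last earns 0.
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    # Higher score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# One ranking per internal validity criterion (hypothetical configurations).
rankings = [
    ["config_a", "config_b", "config_c"],
    ["config_b", "config_a", "config_c"],
    ["config_a", "config_c", "config_b"],
]
consensus = borda_aggregate(rankings)
print(consensus)
```

Here `config_a` collects 5 points, `config_b` 3, and `config_c` 1, so the consensus order follows that ranking.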

Towards a Taxonomy

§  The top-ranked clustering configuration is the Spectral Clustering for the original dataset, with hyperbolic tangent kernel, for k = 38

Findings: User × Task Interactions

§  44 news reading tasks
§  5-point scale where high scores represent stronger agreement and low scores represent weaker agreement with the given statement
§  Significant difference (z = −3.817, p < .001, r = −0.171) in the frequency distribution of mouse gestures between the interesting and uninteresting task

§  Indicates that certain types of mouse gestures occur more or less often, depending on how interesting the news article is perceived to be

Findings: User × Task Interactions

Gaze behaviour variation

Interesting news article:
-  took significantly less time to perform their first fixation (TFF) on an interesting news article
-  performed more fixations (FC)
-  fixations lasted for longer periods (FD, TFD)
-  looked more times at the body of the article (VC)
-  and the duration of each visit lasted longer (VD)

Uninteresting news article:
-  fixated more times on other elements (FB)

Correlation Analysis

Predicting Interestingness

Classifier performance metrics

Classifier             Precision  Recall  F-Measure  Accuracy
Baseline               .273       .523    .359       .522
1NN                    .664       .659    .659       .659
SMO                    .700       .682    .678       .681
RandomForest           .727       .727    .727       .727
Stacking (1NN + SMO)   .751       .750    .750       .750
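One way to read a baseline row in a table like this is as a constant predictor scored with support-weighted precision, recall, and F-measure. The sketch below shows how such metrics behave for a majority-class predictor. The slides do not state how the baseline was built, and the 23/21 class split of the 44 tasks is a hypothetical choice for illustration; under that assumption the weighted values happen to land close to the Baseline row.

```python
from collections import Counter
from typing import List, Tuple


def weighted_metrics(y_true: List[str], y_pred: List[str]) -> Tuple[float, float, float, float]:
    """Support-weighted precision, recall, F-measure, plus accuracy."""
    n = len(y_true)
    precision = recall = f_measure = 0.0
    for label, support in Counter(y_true).items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0  # precision is 0 if never predicted
        r_c = tp / support
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support / n
        precision += weight * p_c
        recall += weight * r_c
        f_measure += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return precision, recall, f_measure, accuracy


# Hypothetical label split: 23 tasks in one class, 21 in the other.
y_true = ["majority"] * 23 + ["minority"] * 21
y_pred = ["majority"] * len(y_true)  # constant majority-class predictor

precision, recall, f_measure, accuracy = weighted_metrics(y_true, y_pred)
print(f"{precision:.3f} {recall:.3f} {f_measure:.3f} {accuracy:.3f}")
```

The weighted precision falls well below the accuracy because the never-predicted class contributes a precision of zero.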

Conclusions

§  The frequency distribution of mouse gestures varies per user and content (interesting vs. uninteresting)
§  Certain types of mouse gestures occur more or less often, depending on how interesting the news article is perceived to be
§  We report several medium-size correlations between certain mouse gestures and eye metrics, which connects our approach to analysing cursor behaviour with gaze

Conclusions

§  We report for the first time several significant correlations between certain types of mouse gestures and preNAS, prePAS, postNAS, postPAS, affect, and focused attention
§  Correlations indicate that cursor behaviour can go beyond measuring frustration to inform us about the positive and negative valence of an interaction
§  Our model outperforms the baseline and improves accuracy by about 23 percentage points (.750 vs. .522)
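The size of the improvement can be checked against the accuracy column of the results table (Baseline .522, Stacking .750). The short calculation below separates the absolute gain, which is what matches the 23 figure, from the relative gain; the variable names are illustrative only.

```python
baseline_accuracy = 0.522   # Baseline row of the results table
stacking_accuracy = 0.750   # Stacking (1NN + SMO) row

# Absolute gain: difference in accuracy, expressed in percentage points.
absolute_gain = stacking_accuracy - baseline_accuracy
# Relative gain: the same difference as a fraction of the baseline accuracy.
relative_gain = absolute_gain / baseline_accuracy

print(f"absolute: {absolute_gain * 100:.1f} percentage points")
print(f"relative: {relative_gain * 100:.1f}% over the baseline")
```

The absolute gain is about 22.8 percentage points, which rounds to the reported 23; read as a relative change it would be roughly 43.7%.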

Limitations

§  Capturing and analysing cursor behaviour emerges as a low-cost and scalable alternative
§  Recording the cursor position is easy to deploy and can be performed in a non-invasive manner, without removing the users from their natural setting
§  However, we need to consider two types of costs:
    •  Network cost (large amounts of data transferred)
    •  Time cost (injecting the JavaScript in the user’s browser may slow down all interactions)
§  Difficult to generalise: creating a ground truth that can capture the diversity observed in the user population, in a controlled manner, is a challenging task that requires large numbers of users

Questions?

This work was supported by the MULTISENSOR project, partially funded by the European Commission under contract number FP7-610411


http://www.slideshare.net/iarapakis/