Post on 26-Dec-2015


EUBA: The Emory User Behavior Analysis System

Eugene Agichtein, Qi Guo and Ryan Kelly
Intelligent Information Access Lab, http://ir.mathcs.emory.edu
Math & Computer Science Department

Arthur Murphy, Selden Deemer, Kyle Fenton
Emory Libraries


Goals/Motivation

- Evaluate effectiveness of search and discovery with automatic behavioral metrics
  - Perform aggregate and longitudinal studies
- Develop tools for usability studies “in the wild”
  - Scale (hundreds/thousands of “participants”)
  - Realistic behavior and tasks
  - On-demand playback of “interesting” sessions
- Unified analysis/query framework for internal and external resource access and usage statistics
  - Web-based query and statistics interface
  - Access auditing, privacy, anonymity enforced


Approach: Client-side instrumentation

- Implemented on top of the Emory installation of the LibX toolbar (http://www.libx.org)
- Extended LibX to track UI events:
  - JavaScript patch samples mouse movements and other events on pre-specified web search pages
  - Events are encoded into a string, buffered, and periodically sent to the server (on the internal library network)
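
The encode, buffer, and send steps described on this slide can be sketched as follows. This is a minimal illustration in Python rather than the JavaScript patch itself; the `EventBuffer` class, the comma/semicolon encoding, and the `max_events` threshold are all assumptions made for the example, and the flush here is triggered by buffer size, whereas the slide describes periodic sending.

```python
from typing import Callable, List


class EventBuffer:
    """Collects encoded UI events and hands them to `send` in batches."""

    def __init__(self, send: Callable[[str], None], max_events: int = 50) -> None:
        self.send = send              # hypothetical transport, e.g. an HTTP POST
        self.max_events = max_events  # hypothetical flush threshold
        self.pending: List[str] = []

    def record(self, kind: str, timestamp_ms: int, *fields: object) -> None:
        # Encode one event as a compact comma-separated string.
        encoded = ",".join([kind, str(timestamp_ms), *map(str, fields)])
        self.pending.append(encoded)
        if len(self.pending) >= self.max_events:
            self.flush()

    def flush(self) -> None:
        # Send everything buffered so far as one payload, then reset.
        if self.pending:
            self.send(";".join(self.pending))
            self.pending = []


# Usage: collect payloads in a list instead of sending them over the network.
sent: List[str] = []
buf = EventBuffer(send=sent.append, max_events=2)
buf.record("move", 0, 120, 340)
buf.record("click", 15, "search_button")
```

A timer calling `flush()` at a fixed interval would give the periodic behavior; batching keeps the number of requests to the server small compared with sending every event individually.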


Events captured (v0.5, Aug. 2008)

- Button/link clicks, URL changes
  - Name of the button, link, other meta-info
- Mouse movements
  - (x,y) coordinates sampled ~every 10ms
- Scrolling
  - Start, stop position, ~every 10ms
- Text entry, keypress (ctrl-c, ctrl-v)
  - Query text, options changes
- Menu item events
  - Print, bookmark, save (all of them)
- Hover over important elements
- Mouse-in/out of browser
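
To illustrate the "~every 10ms" sampling of mouse coordinates, here is a small Python sketch that thins a stream of timestamped positions so that kept samples are at least 10 ms apart. The `downsample` function and the sample layout are assumptions for illustration only; the slides do not say how the sampling is actually implemented.

```python
from typing import List, Tuple

Sample = Tuple[int, int, int]  # (timestamp_ms, x, y)


def downsample(samples: List[Sample], min_interval_ms: int = 10) -> List[Sample]:
    """Keep a sample only if min_interval_ms has passed since the last kept one."""
    kept: List[Sample] = []
    for sample in samples:
        if not kept or sample[0] - kept[-1][0] >= min_interval_ms:
            kept.append(sample)
    return kept


raw = [(0, 10, 10), (4, 11, 10), (9, 12, 11), (10, 13, 11), (25, 20, 15)]
thinned = downsample(raw)
```

With this input, the samples at 4 ms and 9 ms are dropped because they arrive less than 10 ms after the first kept sample.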


How it works

- On login to Learning Commons, Firefox is started with http://irlib.library.emory.edu/consent.cgi?user=USERID
  - If previously opted in (or out), go to homepage
  - Else show consent form
- Store user choice in database; if opted in, also store salted hash string for user login
  - Can track opted-in user behavior over “lifetime”
  - No way to recover login id by dictionary attack
  - Can be removed at any time by deleting mapping
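
The consent check and salted-hash pseudonym described above might look roughly like the following Python sketch. The slides do not specify the hash function, the salt handling, or the database layout, so SHA-256, the `SALT` constant, and the in-memory `consent_db` dictionary are all assumptions. As a simplification, this sketch keys both opt-in and opt-out choices by the hash.

```python
import hashlib
from typing import Dict

# Hypothetical secret salt; resistance to dictionary attacks depends on
# this value staying private on the server.
SALT = b"replace-with-a-secret-random-value"

# Stand-in for the database: pseudonym -> True (opted in) / False (opted out).
consent_db: Dict[str, bool] = {}


def pseudonym(login: str, salt: bytes = SALT) -> str:
    """Map a login id to a stable pseudonym without storing the id itself."""
    return hashlib.sha256(salt + login.encode("utf-8")).hexdigest()


def next_page(login: str) -> str:
    """Go to the homepage if a choice is on record, else show the consent form."""
    return "homepage" if pseudonym(login) in consent_db else "consent_form"


def store_choice(login: str, opted_in: bool) -> None:
    consent_db[pseudonym(login)] = opted_in


def forget(login: str) -> None:
    """Remove the user by deleting the mapping."""
    consent_db.pop(pseudonym(login), None)
```

Because the same login always yields the same pseudonym, sessions from one opted-in user can be linked over time, and deleting the entry removes that link.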


How it works (2 of 3): Consent

Request for Logging of Internet Use

To improve our web services, Emory Libraries are evaluating the use of our discovery tools (EUCLID, Databases, eJournals, Research Guides, Reserves Direct, Google Scholar, etc.).

We would like to capture the web traffic of your browser session to enable us to log and evaluate our patrons’ success in finding scholarly resources within the Learning Commons.

All data logged will be anonymous so that specific internet use will not be connected to a specific individual. (Details of Research Protocol)

Despite the data capture safeguards, you may wish to “opt out” of this log file recording process. Please select a choice:

This study is being undertaken by the University Libraries under the auspices of Emory University’s Institutional Review Board. To contact the Principal investigators of this study, please send email to: [email protected] or [email protected].

Log my Internet use during this semester

Do not log my Internet use during this semester

Continue Logon

http://irlib.library.emory.edu/


How it works (3 of 3): which URLs?

- For all visited URLs, LibX notifies the server; information varies by type of site:
  - Black list (known private sites): only domain name is saved
    - All “https://” and “mail.*” URLs
  - White list (known search/discovery sites): all events captured
    - EUCLID, Primo, Google, Google Scholar, Yahoo and Live search engines, Wikipedia
  - Gray list (search results and important public sites): mouse moves and clicks (no keypress/text)
  - The rest: only URL, button clicks, and menu items
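
The four-tier URL policy above can be expressed as a small classifier. The Python sketch below is an assumption-laden illustration: the host sets are hypothetical placeholders (the slides name sites, not hostnames, and give no gray-list examples), and "mail.*" is interpreted here as any hostname beginning with `mail.`.

```python
from urllib.parse import urlsplit

# Hypothetical host lists; the real lists are not given in the slides.
WHITE_HOSTS = {"www.google.com", "scholar.google.com", "en.wikipedia.org"}
GRAY_HOSTS = {"news.example.org"}

# What is recorded for each tier, following the slide.
CAPTURE = {
    "black": ["domain name"],
    "white": ["all events"],
    "gray": ["mouse moves", "clicks"],
    "rest": ["URL", "button clicks", "menu items"],
}


def classify(url: str) -> str:
    """Return the tier for a URL; the black list is checked first."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme == "https" or host.startswith("mail."):
        return "black"
    if host in WHITE_HOSTS:
        return "white"
    if host in GRAY_HOSTS:
        return "gray"
    return "rest"
```

Checking the black list first means an "https://" page on an otherwise white-listed host is still treated as private, which is the more conservative reading of the slide.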


Emory User Behavior Analysis System

- Combines client-side instrumentation, server-side caching, log management, querying, and analysis
  - Client-side instrumentation, data mining/machine learning (Qi Guo)
  - Log DB parsing, indexing, web-based interface for querying, playback, annotation (Ryan Kelly)
- Plan: to release the system to research/library community (2009?)


EUBA Web-based analysis interface

Prototype: http://ir.mathcs.emory.edu/library/private/index.pl

user: test
password: notsafe


Future Plans

- Incorporate log data for ranking, discovery, query suggestion, collaborative filtering
- Richer statistics and visualization
- Streamline usability studies

Comments and suggestions welcome!