(big data analytics for everyone)

Big Data Visual Analytics: A User-Centric Approach. Remco Chang, Assistant Professor, Department of Computer Science, Tufts University.

Post on 02-Jan-2016

TRANSCRIPT

Page 1: (Big Data Analytics for Everyone)

(Big Data Analytics for Everyone)

Remco Chang

Assistant Professor, Department of Computer Science

Tufts University

Big Data Visual Analytics: A User-Centric Approach

Page 2: (Big Data Analytics for Everyone)

“The computer is incredibly fast, accurate, and stupid. Man is unbelievably slow, inaccurate, and brilliant. The marriage of the two is a force beyond calculation.”

-Leo Cherne, 1977 (often attributed to Albert Einstein)

Page 3: (Big Data Analytics for Everyone)

Work Distribution

Crouser et al., Balancing Human and Machine Contributions in Human Computation Systems. Human Computation Handbook, 2013.

Crouser et al., An affordance-based framework for human computation and human-computer collaboration. IEEE VAST, 2012.

Creativity

Perception

Domain Knowledge

Data Manipulation

Storage and Retrieval

Bias-Free Analysis

Logic

Prediction

Page 4: (Big Data Analytics for Everyone)

Visual Analytics = Human + Computer

• Visual analytics is “the science of analytical reasoning facilitated by visual interactive interfaces.”

1. Thomas and Cook, “Illuminating the Path”, 2005.

2. Keim et al., Visual Analytics: Definition, Process, and Challenges. Information Visualization, 2008.

Interactive Data Exploration

Automated Data Analysis

Feedback Loop

Page 5: (Big Data Analytics for Everyone)

Visual Analytics Systems

• Political Simulation
  – Agent-based analysis
  – With DARPA

• Wire Fraud Detection
  – With Bank of America

• Bridge Maintenance
  – With US DOT
  – Exploring inspection reports

• Biomechanical Motion
  – Interactive motion comparison

Crouser et al., Two Visualization Tools for Analysis of Agent-Based Simulations in Political Science. IEEE CG&A, 2012.

Page 6: (Big Data Analytics for Everyone)

Visual Analytics Systems

R. Chang et al., WireVis: Visualization of Categorical, Time-Varying Data From Financial Transactions, VAST 2008.

Page 7: (Big Data Analytics for Everyone)

Visual Analytics Systems

R. Chang et al., An Interactive Visual Analytics System for Bridge Management, Journal of Computer Graphics Forum, 2010.

Page 8: (Big Data Analytics for Everyone)

Visual Analytics Systems

R. Chang et al., Interactive Coordinated Multiple-View Visualization of Biomechanical Motion Data, IEEE Vis (TVCG) 2009.

Page 9: (Big Data Analytics for Everyone)

Current Big Data Practice

Page 10: (Big Data Analytics for Everyone)

Human+Computer in Big Data Analytics

• Goal: Allow an analyst (user) to fluidly explore and analyze a large remote data warehouse from commodity hardware

Page 11: (Big Data Analytics for Everyone)

Problem: Big Data is BIG and Far Away

Visualization on Commodity Hardware

Large Data in a Data Warehouse

Page 12: (Big Data Analytics for Everyone)

Approach: Predictive Prefetching
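
The slide names the approach but not the prediction model, so the following is only a minimal sketch of what predictive prefetching could look like. It assumes a first-order Markov chain over the user's last navigation move, a tiled data layout, and a hypothetical `fetch_tile` callable standing in for the slow round trip to the remote warehouse. The move names, the class `MarkovPrefetcher`, and the parameter `k` are all illustrative assumptions, not taken from the talk.

```python
from collections import Counter, defaultdict

# Hypothetical move vocabulary: each move shifts the (x, y) tile being viewed.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


class MarkovPrefetcher:
    """Serve tiles from a local cache and prefetch the likely next ones."""

    def __init__(self, fetch_tile, k=2):
        self.fetch_tile = fetch_tile              # assumed slow remote call
        self.k = k                                # how many predicted moves to prefetch
        self.transitions = defaultdict(Counter)   # last move -> counts of the next move
        self.cache = {}
        self.last_move = None
        self.hits = 0
        self.misses = 0

    def predict(self):
        """Return up to k most frequent moves observed after the last move."""
        counts = self.transitions[self.last_move]
        return [move for move, _ in counts.most_common(self.k)]

    def request(self, position, move):
        """Apply a user move, serve the tile, then prefetch for the next step."""
        dx, dy = MOVES[move]
        position = (position[0] + dx, position[1] + dy)
        if position in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[position] = self.fetch_tile(position)
        tile = self.cache[position]

        # Update the model with the move that was just observed.
        self.transitions[self.last_move][move] += 1
        self.last_move = move

        # Fetch the tiles behind the predicted moves before they are requested.
        for guess in self.predict():
            gx, gy = MOVES[guess]
            target = (position[0] + gx, position[1] + gy)
            if target not in self.cache:
                self.cache[target] = self.fetch_tile(target)
        return position, tile


def fake_fetch(position):
    """Stand-in for a query against the remote data warehouse."""
    return f"tile{position}"


prefetcher = MarkovPrefetcher(fake_fetch, k=1)
pos = (0, 0)
for move in ["right"] * 5:
    pos, tile = prefetcher.request(pos, move)
print(prefetcher.hits, prefetcher.misses)
```

In this toy run the user pans right five times. The first two requests miss because the model has not yet seen a repeated transition; from the third request on, the tile is already cached when the user asks for it. A real system would also need cache eviction and a prediction model trained on actual interaction logs.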

Page 13: (Big Data Analytics for Everyone)

Predict User Behavior from User Interactions?

Page 14: (Big Data Analytics for Everyone)

Experiment: Finding Waldo

Page 15: (Big Data Analytics for Everyone)

Predicting a User’s Completion Time

Fast completion time

Slow completion time

Page 16: (Big Data Analytics for Everyone)

Analysis Results: Performance

Biometric (low-level mouse data)

Accuracy: ~70%

Interaction pattern (high-level button clicks)

Accuracy: ~80%
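
The slides report the accuracies but not the features or the classifier behind them. As a rough illustration of how high-level button clicks could be turned into a fast/slow prediction, the sketch below assumes bigram frequencies of click names as features and a nearest-centroid classifier. The button names and the four training sessions are made up for illustration and do not reproduce the reported figures.

```python
import math
from collections import Counter


def ngram_features(clicks, n=2):
    """Relative frequency of each n-gram of button clicks in one session."""
    grams = Counter(tuple(clicks[i:i + n]) for i in range(len(clicks) - n + 1))
    total = sum(grams.values()) or 1
    return {gram: count / total for gram, count in grams.items()}


def centroid(feature_dicts):
    """Average a list of sparse feature vectors."""
    summed = Counter()
    for features in feature_dicts:
        summed.update(features)
    return {gram: value / len(feature_dicts) for gram, value in summed.items()}


def distance(a, b):
    """Euclidean distance between two sparse feature vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def train(sessions):
    """sessions: list of (click sequence, label). Returns label -> centroid."""
    by_label = {}
    for clicks, label in sessions:
        by_label.setdefault(label, []).append(ngram_features(clicks))
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def classify(model, clicks):
    """Return the label whose centroid is closest to this session."""
    features = ngram_features(clicks)
    return min(model, key=lambda label: distance(model[label], features))


# Made-up sessions, for illustration only (not data from the experiment).
training = [
    (["zoom_in", "zoom_in", "right", "zoom_in"], "fast"),
    (["zoom_in", "right", "zoom_in", "zoom_in"], "fast"),
    (["left", "right", "left", "right"], "slow"),
    (["right", "left", "right", "left", "right"], "slow"),
]
model = train(training)
prediction = classify(model, ["left", "right", "left", "left"])
print(prediction)
```

The low-level mouse data on the slide would need different features (for example cursor speed or pause durations), which this sketch does not attempt.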

Page 17: (Big Data Analytics for Everyone)

Predicting a User’s Personality

External Locus of Control

Internal Locus of Control

Ottley et al., How locus of control influences compatibility with visualization style. IEEE VAST, 2011.

Ottley et al., Understanding visualization by understanding individual users. IEEE CG&A, 2012.

Page 18: (Big Data Analytics for Everyone)

Analysis Results: Personality Traits

• Noisy data, but can detect the users’ individual traits “Extraversion”, “Neuroticism”, and “Locus of Control” at ~60% accuracy by analyzing the user’s interactions alone.

Predicting user’s “Extraversion”

Accuracy: ~60%

Page 19: (Big Data Analytics for Everyone)

Wrap Up: Theory Into Practice

• Developed a prototype system (ForeCache) in collaboration with the Big Data Center at MIT and researchers at Brown

• Evaluated system with domain scientists using the NASA MODIS dataset (multi-sensory satellite imagery)

• Remote analysis on commodity hardware shows (near) real-time interactive analysis

Page 20: (Big Data Analytics for Everyone)

Questions?

Remco Chang ([email protected])