techniques for visualizing massive data sets
TECHNIQUES FOR VISUALIZING MASSIVE DATA SETS
Leilani Battle, Mike Stonebraker
Context
[Diagram: the Visualization System sends a query to the Database; the Database returns a result.]
Problem
• Performance
  • Vis systems don’t scale well for big data
  • Or are turning into databases
• Over-plotting
  • Makes visualizations unreadable
  • Waste of time/resources
Solution: Resolution Reduction
[Diagram: a Resolution Reduction Layer sits between the Visualization System and the Database. Arrow labels: query, query plan query, query plan result, modified query, reduced result.]
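The labels on this slide suggest that the layer checks the expected result size before running the query and, if it is too large, asks for a smaller result instead. The slides do not say which reduction is applied, so the following is only a minimal sketch under assumptions: it uses block averaging over a 2-D grid, runs locally instead of rewriting a database query, and the names `choose_block_size`, `reduce_resolution`, `estimated_cells` and `max_cells` are hypothetical.

```python
import math


def choose_block_size(estimated_cells, max_cells):
    """Pick a block edge so the reduced 2-D result has roughly max_cells cells.

    estimated_cells stands in for the size estimate a query plan would give
    (an assumption for this sketch).
    """
    if estimated_cells <= max_cells:
        return 1  # small enough: no reduction needed
    return math.ceil(math.sqrt(estimated_cells / max_cells))


def reduce_resolution(grid, block):
    """Average non-overlapping block x block windows of a 2-D list."""
    rows, cols = len(grid), len(grid[0])
    reduced = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            # Windows at the right and bottom edges may be smaller than block.
            window = [grid[i][j]
                      for i in range(r, min(r + block, rows))
                      for j in range(c, min(c + block, cols))]
            out_row.append(sum(window) / len(window))
        reduced.append(out_row)
    return reduced


# A 4 x 4 result (16 cells) when the front-end can only draw 4 cells.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
block = choose_block_size(estimated_cells=16, max_cells=4)
small = reduce_resolution(grid, block)
print(block, small)
```

Averaging is only one choice; sampling or filtering would fit the same "modified query, reduced result" flow.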
ScalaR
• Scalable vis system for data exploration
• Web front-end
• Uses SciDB (www.scidb.org)
• Visualizes query results
• Performs Resolution Reduction
Demo of ScalaR
Array Browser
• Collaboration with:
  • Brown: Justin DeBrabant, Stan Zdonik, Ugur Cetintemel
  • Stanford: Zhicheng Liu, Jeff Heer
• Google Maps-style exploration experience
• Fetches subsets of the data (aka data tiles)
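To make the idea of data tiles concrete, here is a small sketch of how a viewport could be mapped to the tiles it overlaps. The fixed square tile size, the `(col, row)` key scheme and the function name `tiles_for_viewport` are assumptions for illustration, not details taken from the Array Browser.

```python
def tiles_for_viewport(x0, y0, x1, y1, tile_size):
    """Return (col, row) keys of the tiles overlapping a viewport.

    The viewport is the half-open box [x0, x1) x [y0, y1) in array
    coordinates, and tiles are tile_size x tile_size squares.
    """
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 500 x 256 viewport starting at x = 100 touches three 256-wide tiles.
tiles = tiles_for_viewport(100, 0, 600, 256, tile_size=256)
print(tiles)
```

Only these tiles need to be fetched; panning changes the key list by one row or column at a time.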
Array Browser Example
Array Browser Architecture
Demo of Array Browser
Future Work: Prefetching
• Goal: Reduce user-wait time by prefetching tiles
• Cache tiles in the tile buffer
• Need algorithms to decide what to pre-fetch
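A tile buffer with prefetching could look roughly like the sketch below. It assumes least-recently-used eviction purely as a placeholder (the slides do not choose an eviction policy), and `TileBuffer` and its `fetch` callback are hypothetical names.

```python
from collections import OrderedDict


class TileBuffer:
    """Bounded tile cache with LRU eviction (placeholder policy)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch          # hypothetical callable: tile key -> tile data
        self.tiles = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0

    def _store(self, key, tile):
        self.tiles[key] = tile
        self.tiles.move_to_end(key)
        while len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)  # evict the least recently used tile

    def prefetch(self, keys):
        """Load predicted tiles before the user asks for them."""
        for key in keys:
            if key not in self.tiles:
                self._store(key, self.fetch(key))

    def get(self, key):
        """Return a tile, fetching it on demand if it was not buffered."""
        if key in self.tiles:
            self.hits += 1
            self.tiles.move_to_end(key)
            return self.tiles[key]
        self.misses += 1
        tile = self.fetch(key)
        self._store(key, tile)
        return tile


buffer = TileBuffer(capacity=2, fetch=lambda key: f"tile{key}")
buffer.prefetch([(1, 0)])
first = buffer.get((1, 0))   # already in the buffer: no wait
second = buffer.get((5, 5))  # not prefetched: fetched on demand
print(buffer.hits, buffer.misses)
```

The hit and miss counters give a simple way to compare prefetching strategies against each other.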
User Behavior Predictor (Seer)
• Learn common query sequences from user traces
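One simple way to learn common sequences from traces is a first-order transition count: record which move most often follows each move. This is a sketch of that general technique, not a description of how Seer works; the move names and the functions `learn_transitions` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def learn_transitions(traces):
    """Count how often each move is followed by each other move."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, last_move):
    """Most frequent follow-up to last_move, or None if it was never seen."""
    followers = counts.get(last_move)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up traces for illustration only.
traces = [
    ["zoom_in", "pan_right", "pan_right", "zoom_out"],
    ["zoom_in", "pan_right", "pan_down"],
    ["pan_right", "pan_right", "zoom_in"],
]
model = learn_transitions(traces)
guess = predict_next(model, "pan_right")
print(guess)
```

The predicted move can then be turned into tile keys to prefetch.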
Statistical Analysis Predictor
• Look for statistical similarities in tiles
• Try to guess what’s important based on patterns
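The slides do not specify which statistics are compared. As one possible reading, the sketch below summarises each tile by its mean and standard deviation and ranks candidate tiles by how close they are to a tile the user has already looked at. The signature choice, the tile contents and the names `signature` and `rank_by_similarity` are all assumptions.

```python
import math
from statistics import mean, pstdev


def signature(tile):
    """Summarise a tile (flat list of numbers) as (mean, std dev)."""
    return (mean(tile), pstdev(tile))


def rank_by_similarity(reference, candidates):
    """Order candidate tile keys from most to least similar to reference."""
    ref = signature(reference)

    def distance(key):
        return math.dist(ref, signature(candidates[key]))

    return sorted(candidates, key=distance)


# Made-up tile contents for illustration only.
reference = [10, 12, 11, 13]
candidates = {
    "north": [0, 0, 1, 0],
    "east": [11, 12, 12, 13],
    "south": [50, 90, 10, 70],
}
ranking = rank_by_similarity(reference, candidates)
print(ranking)
```

Tiles at the front of the ranking would be the first candidates to prefetch.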
Using Multiple Predictors
• Run multiple predictors (or experts) in parallel
• Compare predictions to user’s actual behavior
• Use predictions from best performing expert
  • May change over time based on user’s goals
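The selection loop described here can be sketched as follows. Scores are decayed on every observation so that the best expert can change as the user's behavior shifts; the decay factor, the two toy experts and the class name `ExpertPool` are assumptions made for this example.

```python
class ExpertPool:
    """Run several predictors and follow whichever has been most accurate."""

    def __init__(self, experts, decay=0.5):
        self.experts = experts  # name -> callable(history) -> predicted move
        self.decay = decay      # weight given to older results
        self.score = {name: 0.0 for name in experts}
        self.pending = {}

    def best(self):
        """Name of the expert with the highest decayed score."""
        return max(self.score, key=self.score.get)

    def predict(self, history):
        """Ask every expert, remember the answers, return the leader's."""
        self.pending = {name: expert(history)
                        for name, expert in self.experts.items()}
        return self.pending[self.best()]

    def observe(self, actual):
        """Score each expert's last prediction against what the user did."""
        for name, guess in self.pending.items():
            hit = 1.0 if guess == actual else 0.0
            self.score[name] = self.decay * self.score[name] + hit


# Two toy experts, for illustration only.
experts = {
    "repeat_last": lambda history: history[-1],
    "always_zoom_in": lambda history: "zoom_in",
}
pool = ExpertPool(experts, decay=0.5)
history = ["pan_right"]
leaders = []
for actual in ["pan_right", "pan_right", "zoom_in", "zoom_in", "zoom_in"]:
    pool.predict(history)
    pool.observe(actual)
    history.append(actual)
    leaders.append(pool.best())
print(leaders)
```

In the example the leader switches once the user stops panning and starts zooming, which is the kind of change in goals the slide mentions.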
Other Challenges
• Lots of interesting problems left to address
  • Best eviction policy for the tile buffer?
  • How to share data between multiple users?
  • More predictors?
Questions?
Prefetching Experts
• User behavior predictor (Seer)
  • Learn common query sequences from user traces
• Stats analysis predictor
  • Look for statistical similarities in tiles
  • Try to guess what’s important based on patterns