visual analytics sandbox - information technologyvvr3254/cmps598/notes/... · 2018. 1. 27. · hue:...

Visual Analytics Sandbox Satya Katragadda January 25, 2018

Page 1: Visual Analytics Sandbox - Information Technologyvvr3254/CMPS598/Notes/... · 2018. 1. 27. · HUE: Hadoop User Experience An open-source Web interface that supports Apache Hadoop

Visual Analytics Sandbox
Satya Katragadda
January 25, 2018

Agenda

• Why Big Data?
• Goals
• Visual Analytics Sandbox
• Traditional Workflow in a Big Data Environment
• VA Sandbox: Software Stack
• VA Sandbox: Execution Examples

Why Big Data?

• Reports, e.g.,
  § Track business processes, transactions

• Diagnosis, e.g.,
  § Why is user engagement dropping?
  § Why is the system slow?
  § Detect spam, worms, viruses, DDoS attacks

• Decisions, e.g.,
  § Decide what feature to add
  § Decide what ad to show
  § Block worms, viruses, …

Goals

• Low latency (interactive) queries on historical data: enable faster decisions
  • E.g., identify why a site is slow and fix it

• Low latency queries on live data (streaming): enable decisions on real-time data
  • E.g., detect & block worms in real time (a worm may infect 1 million hosts in 1.3 sec)

• Sophisticated data processing: enable “better” decisions
  • E.g., anomaly detection, trend analysis
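As one illustration of the "anomaly detection" goal, the sketch below flags values that sit far from the mean of a trailing window. It is a minimal example under stated assumptions, not a method taken from the slides: the function name `detect_anomalies`, the window size, the threshold, and the sample latencies are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices of values more than `threshold` standard deviations
    away from the mean of the preceding `window` values.

    Hypothetical helper for illustration; the parameters are assumptions.
    """
    recent = deque(maxlen=window)  # trailing window of earlier values
    anomalies = []
    for i, value in enumerate(values):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Skip the check when the window is constant (sigma == 0).
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(i)
        recent.append(value)
    return anomalies


# Made-up page-load latencies in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 500, 101, 99]
print(detect_anomalies(latencies_ms))
```

On a live stream the same loop could consume values as they arrive, since it only keeps the last `window` observations in memory. Note that a flagged value still enters the window, so it widens the spread used to judge the next few points.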

Visual Analytics Sandbox

Big Data Workflow

• Data Ingestion
• Data Management
• Data Processing
• Visualization

Resource Management
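The workflow stages can be sketched as a chain of small functions, each feeding the next. This is a toy, single-machine illustration in plain Python, not the sandbox's actual stack: the function names, the `RAW_LOG` sample data, and the text-bar output are assumptions, and resource management (scheduling work across a cluster) is not modeled at all.

```python
import csv
import io
from collections import defaultdict

# Made-up sample input: page name and load time in milliseconds.
RAW_LOG = """page,load_ms
/home,120
/search,480
/home,100
/search,520
/about,not-a-number
"""


def ingest(raw_text):
    """Data ingestion: read raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def manage(rows):
    """Data management: keep only well-formed rows, with typed fields."""
    clean = []
    for row in rows:
        try:
            clean.append({"page": row["page"], "load_ms": int(row["load_ms"])})
        except (KeyError, TypeError, ValueError):
            continue  # drop rows with missing or non-numeric fields
    return clean


def process(records):
    """Data processing: average load time per page."""
    by_page = defaultdict(list)
    for rec in records:
        by_page[rec["page"]].append(rec["load_ms"])
    return {page: sum(v) / len(v) for page, v in by_page.items()}


def visualize(summary, ms_per_char=40):
    """Visualization: one text bar per page, slowest page first."""
    ordered = sorted(summary.items(), key=lambda kv: kv[1], reverse=True)
    return [
        f"{page:<8} {'#' * round(avg / ms_per_char)} {avg:.0f} ms"
        for page, avg in ordered
    ]


summary = process(manage(ingest(RAW_LOG)))
for line in visualize(summary):
    print(line)
```

Keeping each stage as a separate function mirrors the separation in the workflow: any one stage could be swapped for a distributed tool without changing the others' inputs and outputs.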

VA Sandbox: Software Stack

VA Sandbox: Resource Manager

VA Sandbox: Data Ingestion

VA Sandbox: Data Storage

VA Sandbox: Processing and Visualization

VA Sandbox

• Stephens Hall
• Accessible through university network

VA Sandbox: Access

VA Sandbox: Execution

VA Sandbox: Input

VA Sandbox: Spark Script

VA Sandbox: Spark Output

Alternative Execution Environment

HUE: Hadoop User Experience
An open-source Web interface that supports Apache Hadoop and its ecosystem

Component    Applications
Editor       SQL, Pig, Spark
Browsers     YARN, Oozie, Impala, HBase, Livy
Scheduler    Oozie
Dashboard    Solr, SQL (Impala, Hive...)

HUE: File Browser

HUE: Job Execution

HUE: Output

HUE: Editors

HUE: Schedulers

HUE: Dashboards

Questions?

Satya Katragadda
RM 118, Abdalla [email protected]