Spark Summit Europe 2016 Keynote - Databricks CEO
Post on 06-Jan-2017
Democratizing AI with Apache Spark
Ali Ghodsi, Co-Founder and CEO
AI is changing the world
Why now?
AlphaGo
Siri/assistants
Self-driving cars
Data is the catalyst
AI hasn’t been democratized
Better training, tuning, validation
More data
Clickstreams
Sensor data (IoT)
Video
Speech
Handwriting
…
The hardest part of AI isn’t AI
“Hidden Technical Debt in Machine Learning Systems”, Google, NIPS 2015
How do we democratize AI?
“Hidden Technical Debt in Machine Learning Systems”, Google, NIPS 2015
+ AI
FLEXIBLE FAST BIG DATA
Some gaps remain
1. Manage Data Infrastructure
• Create, configure, monitor resilient big data clusters.
• Securely access silos of disparate data sources.
• Enforce proper data governance.
2. Empower Teams to Be Productive
• Interactively explore data and prototype ideas.
• Securely share big data clusters among analysts.
• Debug, troubleshoot, version-control big data applications.
3. Establish Production-Ready Applications
• Set up robust ML data pipelines for ETL/ELT.
• Productionize real-time applications with high availability (HA) and fault tolerance (FT).
• Build, serve, maintain advanced machine learning models.
Databricks: Closing the gap
1. Just-in-Time Data Platform
Agile + Low TCO
• Separate compute & storage
• Integrate existing data stores
• Efficient cache on first access
2. Integrated Workspace
Accelerate Time to Value
• Interactive notebooks, dashboards, reports
• Real-time exploration, machine learning, graph use cases
3. Automated Spark Management
Performance
• Workflow scheduler for ML, streaming, SQL, ETL
• Performance-optimized, high availability, fault-tolerant
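The "cache on first access" bullet on the just-in-time data platform slide can be illustrated with a generic read-through cache. This is only a minimal sketch of the general pattern, not Databricks' implementation: the `ReadThroughCache` class, the `remote_store` dictionary, and the file key are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Keeps a local copy of an object the first time it is read remotely."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                  # hypothetical loader for the remote store
        self._local: Dict[str, bytes] = {}   # local copies, keyed by object name
        self.remote_reads = 0                # counts trips to remote storage

    def read(self, key: str) -> bytes:
        if key not in self._local:
            # First access: pull from remote storage and keep a local copy.
            self._local[key] = self._fetch(key)
            self.remote_reads += 1
        return self._local[key]


# Stand-in for an existing data store (an in-memory dict, for illustration only).
remote_store = {"events/2016-10.csv": b"user,clicks\na,3\nb,5\n"}

cache = ReadThroughCache(remote_store.__getitem__)
first = cache.read("events/2016-10.csv")   # fetched from the remote store
second = cache.read("events/2016-10.csv")  # served from the local copy
```

Because the data stays in its original store and is copied only when a job first touches it, compute can be started and stopped independently of storage, which is the separation the slide describes.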
Enterprise AI use-cases
Predict credit score, credit limit, anomalies
Predict energy demand based on massive weather data
Natural language processing to extract author graph
Predict player churn, predict network outages
Predict machine equipment failure
New Frontier of AI: Deep Learning
Detect cancer
Understand speech
Infer location
Identify landmarks in photos
Recognize Mandarin and English
Improve cancer detection
Faster and easier deep learning with Databricks
GPUs
• TensorFlow: The most popular deep learning framework.
• TensorFrames: Makes TensorFlow computations faster and easier to program on Spark.
TensorFlow on
TensorFrames and GPU support out of the box
Massive parallelism
Deep Learning on Databricks
Data Ingest
Feature extraction
Model Training
Productionize
Clusters
Jobs & Workflows
TensorFrames + GPUs
Interactive exploration
Just-in-time data platform
Automated management
Thank you.