Post on 20-May-2020
Cognitive Blueprints with Spectrum Conductor
Daniel Reiberg | reiberg@de.ibm.com
Agenda
o Cognitive cross industries
o Cognitive definition
o Challenges
o IBM Systems Blueprint
o Deep Learning
o Workload Management
o Native Service Management
o Resource Management & Orchestration
o Summary
Cognitive Cross Industries
IBM Systems
Deep Learning, Cognitive and AI Overview:
o Automotive and Transportation
• Autonomous driving
• Pedestrian detection
• Accident avoidance
Customers: Automotive, trucking, heavy equipment, Tier 1 suppliers
o Financial Services
• Sentiment analysis
• Market prediction
• Fraud/Risk
Customers: Retail and investment banks, capital markets firms
o Broadcast, Media and Entertainment
• Captioning
• Search
• Recommendations
• Real-time translation
Customers: Consumer-facing companies with media streaming or real-time content
o Consumer Web, Mobile, Retail
• Image tagging
• Speech recognition
• Natural language
• Sentiment analysis
Customers: Hyperscale web companies, large retailers
o Security and Public Safety
• Video surveillance
• Image analysis
• Facial recognition and detection
Customers: Local and national police, public and private safety/security
o Medicine and Biology
• Drug discovery
• Diagnostic assistance
• Cancer cell detection
Customers: Pharmaceutical, medical equipment, diagnostic labs
Cognitive Definition
AI & Cognitive Definitions
o Artificial Intelligence (AI)
• Intelligence exhibited by machines or software
o Cognitive Computing
• Computing systems exhibiting acts of perception, memory, judgment or reasoning
o Machine Learning (ML)
• Type of AI that enables computers to learn without being explicitly programmed
o Deep Learning (DL)
• Type of ML, based on neural networks loosely modeled after the brain
• Learns features and representations of data
o Training
• Neural “inspired”, fed by millions of data points
• Repetition drives weighting and connections
o Inference
• Classifying a new data point by inferring its similarity to the trained model
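The difference between training and inference can be illustrated with a toy example. This is only a sketch: it uses a single perceptron rather than a deep network, and the data set (the logical AND function) and the function names are assumptions chosen for illustration, not anything from the slides.

```python
# Toy illustration only: a single perceptron, not a deep neural network.
# The AND data set and all names here are hypothetical.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, point):
    """Inference: classify a point using the weights learned so far."""
    activation = weights[0] * point[0] + weights[1] * point[1] + bias
    return 1 if activation > 0 else 0

def train(samples, epochs=20):
    """Training: repeated passes over the data adjust the weights and bias."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for point, label in samples:
            error = label - predict(weights, bias, point)
            weights[0] += error * point[0]
            weights[1] += error * point[1]
            bias += error
    return weights, bias

weights, bias = train(AND_SAMPLES)
print([predict(weights, bias, p) for p, _ in AND_SAMPLES])  # prints [0, 0, 0, 1]
```

Training is the loop that repeatedly corrects the weights; inference is the single call to predict once the weights are fixed.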
Cognitive Challenges
1. Challenge: Build the environment
Cognitive environments consist of many (open source) components
Time to value matters
Just because you can build a cognitive environment from scratch doesn’t mean you should!
Many changes / new versions
[Chart: commits within projects]
2. Challenge: Managing cognitive applications
Creating infrastructure silos to accommodate applications is inefficient
Many new solution workloads in addition to existing apps
Leads to costly, complex, siloed, under-utilized infrastructure and replicated data
[Diagram: separate infrastructure silos for Compliance, Trade Surveillance, Counterparty Credit Risk Modeling, and Distributed ETL / Sentiment Analysis]
Low utilization = higher cost
• Different LOBs
• Multiple Spark versions
• Different notebooks and versions
• Security, governance
• Application SLA
• DEV, UAT, PROD
• Different data sources, e.g., HDFS, Cassandra
• Existing applications
• Siloed organization
• Limited by technology
• It is new!
• …
Cognitive IBM Blueprint
AI Frameworks + Spark + Notebook + Scheduler + Resource Management + Distributed File System
Client Options for Spark Adoption in the Enterprise
o Hadoop: “Data Lake First”
[Diagram of Hadoop ecosystem components: Ambari, Flume, HDFS, YARN, HBase, Hive, Knox, Impala, Oozie, Pig, Slider, Navigator, Spark, Sqoop, ZooKeeper, Falcon, Accumulo, Kafka, Atlas, Phoenix, Ranger, Titan, Tez, Hue, Storm]
• Install base Hadoop vendors promoting this approach
• Management complexities and storage-rich server costs are driving customers to look at alternative approaches
• Spark life cycle management is very difficult
o Spectrum Conductor with Spark: “Analytics First”
• Provide alternative approach from IBM, focused on a Spark-centric future
• Leverage existing HDFS data as a data store, but not dependent on it
• Easily manage Spark life cycle
• Efficiently using / training AI models
Why Spark?
1. Unified Analytics Platform
• Widely understood programming logic
2. Performance (faster than Hadoop MR)
• 100x faster in memory, 10x on disk
3. Data Agnostic
• Can access diverse data sources
4. Rich Set of Certifications & Expanding Ecosystem
5. Fast evolution
• Most active Apache open source project
Spark is a fast and general engine for large-scale data processing
Common Use Cases for Spark
Spark is implemented inside many types of products, across a multitude of industries and organizations
Source: 2015 Databricks Spark Survey
IBM turnkey solution for cognitive
Business Problem
• Organizations are looking to introduce Deep Learning applications that leverage their existing data to enhance their business and applications.
• Data scientists are moving from initial individual experiments to needing an enterprise Deep Learning platform as production use cases are deployed.
Solution
• Integrated solution combining open source frameworks, Deep Learning tools, and a foundational big data platform on Cognitive systems optimized for DL workloads
• Combination of Conductor with Spark, Deep Learning Module, PowerAI and Power Minsky
Our Value
• High-performance, Spark-centric analytic environment with disaggregated storage and compute architecture.
• Marriage of the best of Spark and HPC to achieve accuracy for ML/DL applications with highest performance and at lowest cost
• Enterprise-class multi-tenant platform that provides ease of use to data scientists and efficiency to admins.
• Extensible to new DL frameworks and preferred Data Science user tools.
[Architecture diagram, components as labeled on the slide:]
• User Application and Data Science Tools
• Conductor with Spark DL Module: TensorFlow, Caffe, Theano, Torch7
• Spectrum Conductor with Spark: MLlib, GraphX
• Spectrum Scale
• Spectrum Cluster Foundation
• x86 / Power (optimized) + GPU
Make scalable and efficient Big Data DL platform accessible to enterprises
Enterprise Class Spark Solution
[Architecture diagram, components as labeled on the slide:]
• DL Module: TensorFlow, Caffe, Theano, Torch7
• Spark SQL, Spark Streaming, GraphX, MLlib / DL
• IBM Spectrum Conductor with Spark: Spark Workload Management, Native Services Management, Resource Management & Orchestration
• Red Hat Linux
• … x86
Deep Learning Model & Data Management
Data Preparation, Model Management & Training
Import data from different formats
Transform, split and shuffle data
Training / Hyper-parameter Search & Tuning
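A hyper-parameter search can be sketched as an exhaustive grid search. This is an illustration under stated assumptions, not the product's method: validation_loss is a hypothetical stand-in for training a model and measuring its validation loss, and the parameter names and grid values are invented for the example.

```python
from itertools import product

def validation_loss(learning_rate, batch_size):
    """Hypothetical stand-in for training a model and measuring validation loss."""
    return (learning_rate - 0.01) ** 2 + (batch_size - 64) ** 2 / 1e6

def grid_search(grid):
    """Try every combination in the grid and return the one with the lowest loss."""
    names = sorted(grid)
    best_params, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

# Hypothetical search space.
GRID = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64, 128]}
best, loss = grid_search(GRID)
print(best, loss)
```

Each combination is independent of the others, which is why this kind of search parallelizes well across a shared cluster.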
Data Preparation for Deep Learning
Tumor Proliferation Assessment – mitosis detection
Images from electron microscope
Size of image: 70K * 60K
Framework    Format     Input Size (Faster R-CNN)
Caffe        LMDB       1K * 1K
TensorFlow   TFRecord   1K * 1K
Data Transformation
Data Distribution among training, validation and testing
Data Shuffle
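The transformation, shuffle and distribution steps can be sketched as follows. This is a minimal sketch with several assumptions: "K" is taken to mean 1,000 pixels, 70K is taken as the width, tiles do not overlap, and the 80/10/10 split is chosen only for illustration. The function names are hypothetical.

```python
import random

def tile_origins(width, height, tile):
    """Top-left corners of non-overlapping tile x tile patches covering the image."""
    return [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]

def shuffle_and_split(items, percents=(80, 10, 10), seed=0):
    """Shuffle items, then split them into training, validation and testing sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * percents[0] // 100
    n_val = len(shuffled) * percents[1] // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Assumption: 70K * 60K means 70,000 x 60,000 pixels, cut into 1,000 x 1,000 tiles.
tiles = tile_origins(70_000, 60_000, 1_000)
train_set, val_set, test_set = shuffle_and_split(tiles)
print(len(tiles), len(train_set), len(val_set), len(test_set))  # prints 4200 3360 420 420
```

Under these assumptions a single image yields 4,200 input tiles, which shows why data preparation is a workload in its own right.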
Workload Management & Monitoring
Spark Shared Services Model – On-Premise Spark Cloud
Physical view: Spectrum Conductor with Spark installed on each Linux server
Logical view: Users (groups) have their own Spark cluster; clusters are isolated, protected and secured by Spark instance groups and managed by SLA
[Diagram: users (LOB, data scientist, researcher, IT or data warehouse) each see a virtual Spark cluster (PaaS). Workloads shown: customer behavior, trend analysis, HPC, marketing, fraud detection, ETL / batch, IoT. These run in Spark instance groups #1 to #5 on a management nodes pool and a compute nodes pool of Linux servers, with Spectrum Scale as shared storage. An administrator creates Spark instance groups through the web console.]
Workload Management Features Summary
• Lifecycle management for Spark, AI frameworks and Data Science Tools
(Downloads for different versions from FixCentral and DevOps pages)
• Fast scheduling with Spark Session Scheduler (see benchmark)
• GPU aware (utilization of NVLink on Power)
• Shared RDD
• Multi-tenancy
• Fine-grained control of the life cycle: Spark binary, notebook update, deployment, resource plan, reporting, monitoring, log retrieval, and execution of either notebook or batch submission
• Runtime isolation with Spark instance groups
(Driver/executor processes are owned by submitter)
• Data at Rest isolation
(Data/log can only be accessed by owner)
• Encryption
• SSL with daemon authentication for all communications within IBM Spectrum Conductor
• Storage encryption when combined with IBM Spectrum Scale
Why Conductor: Performance – most recent STAC Benchmark
Key Points
• Conductor continues to demonstrate performance and throughput leadership, driving ROI for clients
UC1 Throughput
• 56% higher than YARN
• 57% higher than Mesos
UC2 Throughput
• 30% higher than YARN
• 62% higher than Mesos
UC3 Throughput
• 55% higher than YARN
• 88% higher than Mesos
UC4 Throughput
• 224% higher than YARN
• 25% higher than Mesos
Competitive advantage through faster analytics
41% greater throughput than Spark with YARN
57% greater than Spark with Mesos
                      Platform Conductor for Spark   Spark / YARN    Spark / Mesos
When minutes count    10 minutes                     14.1 minutes    15.7 minutes
At quarter-end        80 hours                       112.8 hours     125.6 hours
Product development   26 weeks                       36.7 weeks      40.8 weeks
Source: STAC Report: Spark Resource Managers, Phase 1 (March 28, 2016)
Note: IBM is an active contributor in the Mesos community, helping to advance its capabilities and integration with IBM solutions
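The durations in the comparison follow from the throughput figures: if Conductor delivers 41% more throughput, the slower stack needs 1.41 times as long for the same work. A short sketch of that arithmetic, where the function and variable names are hypothetical and the figures are the ones quoted on the slide:

```python
# Throughput advantage of Conductor over each stack, in percent (from the slide).
THROUGHPUT_GAIN_PCT = {"Spark / YARN": 41, "Spark / Mesos": 57}

def equivalent_duration(conductor_duration, gain_pct):
    """Time the slower stack needs for work Conductor finishes in conductor_duration."""
    return round(conductor_duration * (100 + gain_pct) / 100, 1)

SCENARIOS = [("When minutes count", 10, "minutes"),
             ("At quarter-end", 80, "hours"),
             ("Product development", 26, "weeks")]

for label, duration, unit in SCENARIOS:
    row = [equivalent_duration(duration, pct) for pct in THROUGHPUT_GAIN_PCT.values()]
    print(label, duration, *row, unit)
```

For example, 80 hours at quarter-end becomes 80 × 1.41 = 112.8 hours on YARN and 80 × 1.57 = 125.6 hours on Mesos.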
Monitoring and Reporting
• Integrated Elasticsearch, Logstash, & Rave for customizable monitoring
• Built-in monitoring metrics
• Cross Spark Instance Groups
• Cross Spark Applications within Spark Instance Group
• Within Spark Application
• Built-in monitoring inside Zeppelin Notebook
• Basis for chargeback models
Service Management
Application / Service composition
✓ Service and application definition
✓ Service life cycle management
✓ Integration with OS containers (cgroups, Docker)
✓ Complex service dependency
✓ HA, Persistency, virtual IP mgmt
✓ Elastic service pool
✓ Stateful vs. stateless services
✓ API & scriptable interface
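Handling complex service dependencies comes down to starting each service only after the services it depends on. A minimal sketch of that idea using a topological sort from the Python standard library (Python 3.9 or later); the service names and the dependency map are hypothetical and do not reflect how the product defines services:

```python
from graphlib import TopologicalSorter

# Hypothetical services: each one maps to the set of services it depends on.
DEPENDENCIES = {
    "notebook": {"spark-master"},
    "monitoring": {"spark-master"},
    "spark-master": {"shared-filesystem"},
    "shared-filesystem": set(),
}

def start_order(dependencies):
    """Return a start-up order in which every service follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())

order = start_order(DEPENDENCIES)
print(order)
```

Several valid orders can exist; the only guarantee is that a dependency always appears before the services that need it.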
Application management & monitoring across the whole life cycle
✓ Easily monitor and manage multiple application instances from a single interface
✓ Elastic service pools
✓ Multiple triggers for “grow/shrink”
✓ Dynamic services deployment
✓ Resource sharing among “long running” services and short “tasks/jobs”
✓ Applications organized hierarchically, mapping to the organization hierarchy
✓ Complex services can be managed by personnel not familiar with details of each service
✓ System services as well as complex application frameworks
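An elastic service pool with more than one grow/shrink trigger can be sketched as a simple decision function. Everything here is an assumption for illustration: the two triggers (utilization and pending requests), the thresholds, and the pool limits are hypothetical and are not taken from the product.

```python
def desired_instances(current, utilization, pending_requests,
                      minimum=1, maximum=8):
    """Return the instance count after applying one grow/shrink decision.

    Hypothetical triggers: grow when either utilization or the request
    backlog is high; shrink only when the pool is idle on both measures.
    """
    grow = utilization > 0.80 or pending_requests > 100
    shrink = utilization < 0.30 and pending_requests == 0
    if grow:
        return min(current + 1, maximum)
    if shrink:
        return max(current - 1, minimum)
    return current

print(desired_instances(2, 0.90, 0))    # prints 3 (high utilization)
print(desired_instances(2, 0.50, 500))  # prints 3 (large backlog)
print(desired_instances(2, 0.10, 0))    # prints 1 (idle pool)
```

The minimum and maximum bounds keep the pool from shrinking to nothing or growing past its share of the cluster.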
Resource Management & Orchestration
Resource Management
Manage, monitor and report on infrastructure usage through a single interface
✓ Monitor how assets are being used across one or more facilities
✓ Partition resources to support multiple application instances or lines of business
✓ Enforce security policies enabling secure multitenant deployments
✓ Track resource usage over time for show-back or chargeback accounting purposes
✓ Integration in enterprise monitoring via SNMP, SOAP or REST API
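Chargeback accounting amounts to aggregating tracked usage per tenant and applying a rate. A minimal sketch, in which the usage records, tenant names and rates are all hypothetical and the record layout is an assumption, not the product's reporting format:

```python
from collections import defaultdict

# Hypothetical usage records: (tenant, core_hours, gpu_hours).
USAGE = [
    ("risk", 120, 0),
    ("marketing", 40, 0),
    ("risk", 30, 8),
    ("research", 10, 16),
]

# Hypothetical rates in cents per hour.
RATES = {"core_hour": 5, "gpu_hour": 50}

def chargeback(records, rates):
    """Sum the cost of all usage records per tenant, in cents."""
    totals = defaultdict(int)
    for tenant, core_hours, gpu_hours in records:
        totals[tenant] += (core_hours * rates["core_hour"]
                           + gpu_hours * rates["gpu_hour"])
    return dict(totals)

print(chargeback(USAGE, RATES))
# prints {'risk': 1150, 'marketing': 200, 'research': 850}
```

With a rate of zero the same aggregation gives a show-back report: usage is visible per tenant without being billed.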
Summary
Software Defined Cognitive Infrastructure Benefits
o Eliminate Silos
• Multi-tenant integrated application and data fabric
o Simplified Administration
• Single-pane-of-glass management and monitoring of shared services
o Improve Data Availability
• Global shared access to ensure data is available right when it’s needed
o Minimize Deployment Time
• Automated deployment of physical and virtualized resources
o Intelligent Resourcing
• Run application workload in the most flexible & efficient ways possible
IBM Software Defined Infrastructure
[Diagram, components as labeled on the slide:]
• Workloads: High Performance Analytics (Risk Analytics); High Performance Computing (Design / Simulation / Modeling); ‘New-gen Workloads’ (Hadoop, Spark, Containers)
• Workload Aware Scheduling
• Shared Resource Management
• Global Data Management
• Heterogeneous Infrastructure Support: Disk, Flash, Tape, Power, x86, Linux on z, Docker, VM, ARM, SPARC, GPU
• On-premises, On-cloud, Hybrid Infrastructure
Thank you.
ibm.com/systems