Overview of the Slider project
Posted on 06-May-2015
© Hortonworks Inc. 2013
Deploying & Managing distributed apps on YARN
Hadoop as Next-Gen Platform
HADOOP 1.0 – Single Use System: Batch Apps
• HDFS (redundant, reliable storage)
• MapReduce (cluster resource management & data processing)
HADOOP 2.0 – Multi Purpose Platform: Batch, Interactive, Online, Streaming, …
• HDFS2 (redundant, reliable storage)
• YARN (cluster resource management)
• MapReduce & others (data processing)
Slider
Applications Run Natively IN Hadoop
• HDFS2 (Redundant, Reliable Storage)
• YARN (Cluster Resource Management)
• On top of YARN: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), OTHER (Search, Weave, Samza, HBase, …)
Availability (always on), Flexibility (dynamic scaling), Resource Management (optimization)
Initial work: Hoya – on-demand HBase
JSON config in HDFS drives cluster setup and config
1. Small HBase cluster in large YARN cluster
2. Dynamic, self-healing
3. Freeze & Thaw
4. Custom versions & configurations
5. More efficient utilization/sharing of cluster
YARN manages the cluster
[diagram: a YARN Resource Manager plus YARN Node Managers co-located with HDFS on each server]
• Servers run YARN Node Managers
• NMs heartbeat to the Resource Manager
• RM schedules work over the cluster
• RM allocates containers to apps
• NMs start containers
• NMs report container health
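The heartbeat-driven lifecycle above can be sketched as a toy bookkeeping loop. This is purely illustrative – the class and method names are invented, and the real ResourceManager scheduler (reached through YARN's AMRMClient/NMClient protocols) is far more involved:

```java
import java.util.*;

// Toy model of the RM-side bookkeeping described above (illustrative only).
class MiniScheduler {
    private final Map<String, Integer> freeVcoresPerNode = new HashMap<>();
    private final List<String> allocations = new ArrayList<>();

    // A Node Manager heartbeat reports its currently free capacity.
    void onNodeHeartbeat(String node, int freeVcores) {
        freeVcoresPerNode.put(node, freeVcores);
    }

    // The RM allocates a container to an app on any node with capacity;
    // the chosen NM is then told to start the container.
    Optional<String> allocate(String app, int vcores) {
        for (Map.Entry<String, Integer> e : freeVcoresPerNode.entrySet()) {
            if (e.getValue() >= vcores) {
                e.setValue(e.getValue() - vcores);
                String container = app + "@" + e.getKey();
                allocations.add(container);
                return Optional.of(container);
            }
        }
        return Optional.empty(); // no capacity: the request stays pending
    }

    List<String> allocations() { return allocations; }
}
```

The point of the sketch is the division of labour: NMs only report state, the RM owns placement decisions.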
Client creates App Master
[diagram: the CLI submits to the YARN Resource Manager, which starts the Application Master on a Node Manager]
AM deploys HBase with YARN
[diagram: the Application Master asks YARN for containers; Node Managers start the HBase Master and Region Servers]
HBase & clients bind via Zookeeper
[diagram: HBase Master, Region Servers and HBase clients find each other via ZooKeeper]
YARN notifies AM of failures
[diagram: YARN reports a failed container to the Application Master, which requests a replacement Region Server]
Flexing/failure handling is the same code
boolean flexCluster(ConfTree updated) {
  appState.updateResourceDefinitions(updated);
  return reviewRequestAndReleaseNodes();
}

void onContainersCompleted(List<ContainerStatus> completed) {
  for (ContainerStatus status : completed) {
    appState.onCompletedNode(status);
  }
  reviewRequestAndReleaseNodes();
}
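Both paths converge on reviewRequestAndReleaseNodes(), which is essentially a reconcile step: compare desired instance counts against actual ones, then request or release the difference. A minimal sketch of that idea, with invented names (this is not Slider's actual implementation):

```java
import java.util.*;

// Sketch of the reconcile step both code paths converge on:
// desired counts come from the (possibly just-flexed) spec,
// actual counts from the live containers.
class Reconciler {
    // Returns component -> delta: positive = containers to request,
    // negative = containers to release.
    static Map<String, Integer> review(Map<String, Integer> desired,
                                       Map<String, Integer> actual) {
        Map<String, Integer> deltas = new HashMap<>();
        for (String component : desired.keySet()) {
            int want = desired.get(component);
            int have = actual.getOrDefault(component, 0);
            if (want != have) {
                deltas.put(component, want - have);
            }
        }
        return deltas;
    }
}
```

Because flexing just edits the desired counts and a failure just shrinks the actual counts, one reconcile routine serves both cases – which is the point of the slide.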
Limitations
• Needs an app with a built-in discovery/binding protocol – HBase, Accumulo
• Static configuration – no dynamic information
• Kill-without-warning is the sole shutdown mechanism
• Custom Java in client & App Master for each service
• Client code assumed a CLI – but embedded/PaaS use was popular
Slider
“Imagine starting a farm of Tomcat servers hooked up to an HBase cluster you just deployed – servers processing requests forwarded by a load-balancing service”
Slider: evolution of & successor to Hoya
1. Packaging format for deployable applications
2. Service registration and discovery
3. Propagation of dynamic config information back to clients
4. Client API for embedding – CLI is only one use case
Goal: no code changes to deploy applications in a YARN cluster
Packaging
• Packaging format for deployable applications
• Metadata, template configurations
• Template logic to go from config to scripts
• Simple .py scripts for starting different components in different YARN containers
• Future: snapshot, stop, decommission
Slider AppMaster
[diagram: AM Main hosts the Slider Engine (model + controller), YARN integration and a launcher that builds containers via the YARN RM/NM; pluggable providers expose REST services; a web UI and REST API provide the view, driven by events]
Model
• NodeMap – model of the YARN cluster
• ComponentHistory – persistent history of component placements
• Specification – resources.json &c
• Container Queues – requested, starting, releasing
• Component Map – container ID -> component instance
• Event History – application history
(each piece is persisted, rebuilt on restart, or transient)
Slider as a Service
[diagram: Slider AM and component containers under the YARN RM/NM, driven by the resources & appconf specs]
1. builds
2. queues
3. launches
4. reads
5. starts containers
6. heartbeats / commands
7. REST commands
8. events
Bonded: app owns agents
[diagram: same flow as the previous slide; here the application owns its agents]
Configuration Management
• JSON specs with two-level (global, component) values
• Resolution: inherit cluster, global and component values with override (no x-refs, deep inheritance, undefinition, schemas, …)
• For PaaS deployment: list needed options
• Python scripts in packages get component K-Vs & choose how to interpret them: templates, actions, generated CM specs…
• Leads to a need: publishing of dynamic configurations
Goal: avoid grand-Hadoop-only CM infrastructures
{
  "schema" : "http://example.org/specification/v2.0.0",
  "metadata" : { },
  "global" : {
    "yarn.vcores" : "1",
    "yarn.memory" : "256"
  },
  "components" : {
    "rs" : {
      "yarn.memory" : "512",
      "yarn.priority" : "2",
      "yarn.instances" : "4"
    },
    "slider-appmaster" : {
      "yarn.instances" : "1"
    },
    "master" : {
      "yarn.priority" : "1",
      "yarn.instances" : "1"
    }
  }
}
JSON Specification
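Resolving the spec above for a component – component values overriding the inherited global ones – can be sketched in a few lines. This is illustrative only, not Slider's actual resolver:

```java
import java.util.*;

// Two-level resolution as described on the Configuration Management slide:
// a component inherits every global value and may override any of them.
class ConfResolver {
    static Map<String, String> resolve(Map<String, String> global,
                                       Map<String, String> component) {
        Map<String, String> resolved = new HashMap<>(global);
        resolved.putAll(component); // component values win
        return resolved;
    }
}
```

With the JSON above, the "rs" component resolves to yarn.memory = 512 (its override) and yarn.vcores = 1 (inherited from global).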
Service Registration and Discovery
• Extends beyond YARN to the whole cluster – why can't I bootstrap -site.xml configs from ZK info?
• LDAP story?
• Curator: flat service list; REST view
• Helix: separate internal and external views
• …others
recurrent need - YARN-913
Help Needed!
Service registry thoughts
hadoop/services/$user/$service
• $service/externalView, $service/internalView, $service/remoteView
• components underneath with the same structure: hadoop/services/$user/$service/components/$id
• ports & URLs with metadata (protocol, description)
For enumeration, UIs, Knox, …
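The proposed path scheme is simple enough to sketch as plain string building (the layout is from the slide; the class and method names are invented, and this is a proposal, not a final API):

```java
// Sketch of the proposed registry layout: hadoop/services/$user/$service,
// with per-component children under .../components/$id.
class RegistryPaths {
    static String servicePath(String user, String service) {
        return String.format("hadoop/services/%s/%s", user, service);
    }

    static String componentPath(String user, String service, String id) {
        return servicePath(user, service) + "/components/" + id;
    }
}
```

Keeping components under the service node means a single recursive listing enumerates a whole deployment – which is what UIs and gateways like Knox need.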
How to publish configuration docs?
• K-V pairs costly for ZK
• LDAP integration? ApacheDS™ & enterprise
• String Template generation of documents
• HTTP serving

Initial plan:
• Serve content and service in the Slider AM
• ZK entries to list URLs
Hard problem
YARN-896: long-lived services
1. Container reconnect on AM restart
2. YARN Token renewal on long-lived apps
3. Containers: signalling, >1 process sequence
4. AM/RM managed gang scheduling
5. Anti-affinity hint in container requests
6. Service Registry (YARN-913)
7. Logging
SLAs & co-existence with analytics
1. Make IO bandwidth/IOPS schedulable resources
2. Need to monitor IO & network load from containers/apps/queues
3. Dynamic adaptation of cgroup HDD, network, RAM limits
4. Could we throttle MR jobs' file & HDFS IO bandwidth?
YARN Proxy doesn’t like REST APIs
• Proxy service downgrades all actions to GET
• AmIpFilter (in the AM's web filters) redirects all operations to the proxy with a 302 (GET) if origin != proxy IP

Workaround: SliderAmIPFilter is selective about redirects

Long term:
– Proxy to forward
– Filter to issue 307 redirects
– httpclient callers to enable 307 handling
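The selective-redirect workaround boils down to a predicate: only requests that are safe to downgrade, arriving from outside the proxy, get the 302 treatment. A sketch of that decision logic (hypothetical names; the real filter is a servlet filter with more context than this):

```java
// Decision sketched from the slide: the stock AmIpFilter 302-redirects every
// non-proxy request, turning PUT/POST/DELETE into GETs; a selective filter
// redirects only GETs and lets the other REST verbs through.
class RedirectPolicy {
    static boolean shouldRedirect(String method, boolean fromProxy) {
        if (fromProxy) {
            return false;            // already came via the YARN proxy
        }
        return method.equals("GET"); // safe to downgrade; REST verbs pass
    }
}
```

A 307 redirect (the long-term fix) preserves the method and body, which is why it is the right status code for REST clients that handle it.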
Status as of April 2014
• Initial agent, REST API, container launch
• Package-based deployments of HBase, Ambari, Storm
• Hierarchical configuration JSON evolving
• Service registry - work in progress
• Incubator: proposal submitted