Running Services on YARN
TRANSCRIPT
Running Services on YARN
Munich, April 2017
Varun Vasudev
About myself
⬢ Apache Hadoop contributor since 2014
⬢ Apache Hadoop committer and PMC member
⬢ Currently working for Hortonworks
Introduction to Apache Hadoop YARN
⬢ Architectural center of big data workloads
⬢ Enterprise adoption
– Secure mode is popular
– Multi-tenant support
⬢ SLAs
– Tolerance for slow-running jobs is decreasing
– Consistent performance is desired
⬢ Diverse workloads increasing
– LLAP on Slider
Introduction to Apache Hadoop YARN
[Diagram: YARN as the "Data Operating System" for cluster resource management, layered on HDFS (Hadoop Distributed File System). Batch, interactive, and real-time engines run side by side: Script (Pig) and SQL (Hive) on Tez, Java/Scala (Cascading) on Tez, In-Memory (Spark), Stream (Storm), Search (Solr), and NoSQL (HBase, Accumulo) on Slider, plus other ISV engines.]

YARN: The Architectural Center of Hadoop
• Common data platform, many applications
• Support multi-tenant access & processing
• Batch, interactive & real-time use cases
YARN and Other Platform Services
Several important trends in the age of Hadoop 3.0+:

[Diagram: YARN's resource management alongside other platform services (storage, security, service discovery, management, monitoring/alerts, governance), supporting an IoT assembly (Kafka, Storm, HBase, Solr), the established engines (MR, Tez, Spark, …), and innovating frameworks such as Flink and deep learning (TensorFlow), across various environments: on premise, private cloud, and public cloud.]
Service workloads becoming more popular
⬢ Users are running more and more long-running services like LLAP, HiveServer, HBase, etc.
⬢ Service workloads are gaining more importance
– Need a webserver to serve results from an MR job
– The new YARN UI can be run in its own container
– ATSv2 would like to launch ATS reader containers as well
– Applications want to run their own shuffle service
Application Lifecycle

[Diagram: on Node 1, a NodeManager (128G, 16 vcores) launches the Application 1 AM process via a ContainerExecutor (DCE, LCE, or WSCE), monitoring and isolating memory and CPU. The AM requests containers from the active ResourceManager, which allocates them; the NM then launches the Container 1 and Container 2 processes the same way. A history server (ATS on leveldb, JHS on HDFS) and log aggregation to HDFS round out the picture.]
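For reference, a minimal sketch of the allocation half of this flow, using the public AMRMClientAsync API. The host name, container size, and empty callback bodies are illustrative; a real AM would launch its processes on the allocated containers via NMClient and relies on the AM credentials YARN places in its environment.

import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

public class MiniServiceAM {
  public static void main(String[] args) throws Exception {
    // Callbacks fire when the RM responds to our allocate heartbeats.
    AMRMClientAsync.CallbackHandler handler = new AMRMClientAsync.CallbackHandler() {
      public void onContainersAllocated(List<Container> containers) {
        // A real AM would launch its service processes here via NMClient.
      }
      public void onContainersCompleted(List<ContainerStatus> statuses) { }
      public void onShutdownRequest() { }
      public void onNodesUpdated(List<NodeReport> nodes) { }
      public void onError(Throwable e) { }
      public float getProgress() { return 0f; }
    };

    AMRMClientAsync<ContainerRequest> rm =
        AMRMClientAsync.createAMRMClientAsync(1000, handler);
    rm.init(new Configuration());
    rm.start();
    rm.registerApplicationMaster("am-host", 0, ""); // host, port, tracking URL

    // Ask for one 4 GB, 2 vcore container; the allocation arrives asynchronously.
    rm.addContainerRequest(new ContainerRequest(
        Resource.newInstance(4096, 2), null, null, Priority.newInstance(0)));
  }
}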
Application Lifecycle
⬢ Designed for batch jobs
– Jobs run for hours or days
– Jobs use frameworks (like MR, Tez, Spark) which are aware of YARN
– Container failure is bad, but frameworks have logic to handle it
• Local container state loss is handled
– Jobs are chained/pipelined using application ids
– Debugging is an offline event
⬢ Doesn't carry over cleanly for services
– Services run for longer periods of time
– Services may or may not be aware of YARN
– Container loss is a bigger problem and can have really bad consequences
– Services would like to discover other services
– Debugging is an online event
Enabling Services on YARN
Enabling Services on YARN
⬢ AM to manage services
⬢ Service discovery
⬢ Container lifecycle
⬢ Scheduler changes
⬢ YARN UI
⬢ Application upgrades
⬢ Other issues
– Log collection
– Support for monitoring
AM to manage services
⬢ Any service/job on YARN requires an AM
– AMs are hard to write
– Different services will re-implement the same functionality
– The AM has to keep up with changes in Apache Hadoop
⬢ Native YARN framework layer for services (YARN-5079)
– Provides an AM that ships as part of Apache Hadoop and can be used to manage services
– The AM comes from the Apache Slider project
– The AM provides REST APIs to manage applications
– Has support for functionality such as port scheduling and flexing the number of containers
– Maintained by the Apache Hadoop developers, so it's kept up to date with the rest of YARN
– New YARN REST APIs to launch services
YARN REST API to launch services
{ "name": "vvasudev-druid-2017-03-16","resource": {
"cpus": 16, "memory": "51200"
}, "components" : [
{ "name": "vvasudev-druid", "dependencies": [ ], "artifact": { "id": ”druid-image:0.1.0.0-25", "type": "DOCKER"
}, "configuration": { "properties": {
"env.CUSTOM_SERVICE_PROPERTIES": "true", "env.ZK_HOSTS": ”zkhost1:2181,zkhost2:2181,zkhost3:2181"
} }
} ],
"number_of_containers": 5, "launch_command": "/opt/druid/start-druid.sh", "queue" : ”yarn-services”
}
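The spec above is submitted to the RM over HTTP. Below is a minimal Java sketch of that submission; the RM address, the endpoint path /app/v1/services, and the file name druid-service.json are assumptions for illustration.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SubmitService {
  public static void main(String[] args) throws Exception {
    // druid-service.json holds the spec shown above. The RM address and the
    // endpoint path are assumptions for illustration.
    byte[] spec = Files.readAllBytes(Paths.get("druid-service.json"));
    URL url = new URL("http://rm-host:8088/app/v1/services");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(spec);
    }
    System.out.println("RM responded: " + conn.getResponseCode());
  }
}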
Service discovery
⬢ Long-running services require a way to discover them
– Application ids are constant for the lifetime of the application
– Container ids are constant for the lifetime of the container, but containers will come up and go down
⬢ Add support for discovery of long-running services using DNS and the Registry Service (YARN-4757)
– DNS is well understood; a client can find a service with an ordinary lookup (see the sketch after this list)
– The registry service will keep a record mapping the application to a DNS name
– YARN has a DNS server, but currently this is for testing and experimentation only
– YARN will need to add support for DNS updates to fit into existing DNS solutions
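Here is the client-side sketch referenced above: with registry DNS in place, discovery is an ordinary name lookup. The name below assumes a <component>.<service>.<user>.<domain> convention, and the zone ycluster.example.com is made up for illustration.

import java.net.InetAddress;

public class DiscoverService {
  public static void main(String[] args) throws Exception {
    // Hypothetical registry DNS name following the
    // <component>.<service>.<user>.<domain> convention.
    String name =
        "vvasudev-druid.vvasudev-druid-2017-03-16.vvasudev.ycluster.example.com";
    for (InetAddress addr : InetAddress.getAllByName(name)) {
      System.out.println(name + " -> " + addr.getHostAddress());
    }
  }
}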
Service Discovery
NodeManagerNodeManager
NodeManager
ResourceManager
DNS Server Registry Service
ApplicationManager
ZookeeperZookeeper
Zookeeper
User
Container lifecycle
⬢ When a container exits, the NodeManager (NM) reclaims all the resources immediately
– The NM also cleans up any local state that the container maintained
⬢ The AM may or may not be able to get a container back on the same node
– The NM has to download any private resources again for the container, leading to delays in restarts
⬢ Added support for first-class container retries (YARN-4725); see the sketch after this list
– The AM can specify a retry policy when starting the container
– On process exit, the NM will not clean up any state or resources
– Instead it will attempt to retry the container
– The AM can specify limits on the number of retries as well as the delay between retries
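The sketch referenced above: a launch context with a YARN-4725 retry policy attached. The command /opt/app/run.sh and the retry numbers are illustrative.

import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.ContainerRetryContext;
import org.apache.hadoop.yarn.api.records.ContainerRetryPolicy;

public class RetryableLaunch {
  static ContainerLaunchContext buildLaunchContext() {
    // Retry the container in place up to 3 times, 5 seconds apart, on any
    // exit code; the NM keeps local state and resources between attempts.
    ContainerRetryContext retry = ContainerRetryContext.newInstance(
        ContainerRetryPolicy.RETRY_ON_ALL_ERRORS, null, 3, 5000);

    ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
        Collections.emptyMap(),                        // local resources
        Collections.emptyMap(),                        // environment
        Collections.singletonList("/opt/app/run.sh"),  // launch command
        null, null, null);                             // service data, tokens, ACLs
    ctx.setContainerRetryContext(retry);
    return ctx;  // passed to NMClient#startContainer as usual
  }
}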
Container Lifecycle
NodeManager Container process
Disk 1 Disk 2Disk 3
HDFS
ApplicationContainer
Data
Scheduler improvements
⬢ In the case of services, affinity and anti-affinity become important
– Affinity and anti-affinity apply at a container and an application level, e.g. don't schedule two HBase region servers on the same node, but do schedule the Spark containers on the same nodes as the region servers
⬢ Support is being added for affinity and anti-affinity in the RM (YARN-5907)
– The Slider AM already has some basic support for container affinity and anti-affinity via retries
– The RM can do a better job of container placement if it has first-class support
– AMs can specify affinity and anti-affinity policies to get the placement they need
Scheduler improvements - Affinity and Anti-affinity
⬢ Anti-affinity
– Some services don't want their daemons to run on the same host/rack, for better fault recovery or performance
– For example, don't run more than one HBase region server in the same fault zone
Scheduler improvements - Affinity and Anti-affinity
⬢ Affinity
– Some services want to run their daemons on the same host/rack, etc. for performance
– For example, run Storm workers as close together as possible for better data-exchange performance; a code sketch of both policies follows
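The sketch for both policies, written against the placement-constraint API that eventually shipped from this work (Hadoop 3.1); the allocation tags hbase-rs and storm-worker are illustrative.

import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.NODE;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.targetIn;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.targetNotIn;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets.allocationTag;

import org.apache.hadoop.yarn.api.resource.PlacementConstraint;

public class PlacementExamples {
  public static void main(String[] args) {
    // Anti-affinity: never place two containers tagged hbase-rs on one node.
    PlacementConstraint antiAffinity =
        targetNotIn(NODE, allocationTag("hbase-rs")).build();

    // Affinity: place containers on nodes that already run a storm-worker.
    PlacementConstraint affinity =
        targetIn(NODE, allocationTag("storm-worker")).build();

    System.out.println(antiAffinity + "\n" + affinity);
  }
}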
YARN UI (YARN-3368)
YARN UI - Services
Application upgrades
⬢ YARN has no support for container or application upgrades
– Container upgrade support needs to be added in the NM
– Application upgrade support has to be added in the RM
⬢ Support added for container upgrade and rollback (YARN-4726); see the sketch below
– Application upgrade support still to be carried out
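The sketch referenced above, using the container re-initialization calls NMClient gained under YARN-4726; the helper and its arguments are illustrative.

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;

public class RollingUpgrade {
  // Re-launch a running container with a new launch context (e.g. pointing at
  // a new artifact version). autoCommit=false keeps the old context around so
  // the AM can roll back if the new version misbehaves.
  static void upgrade(NMClient nm, ContainerId id, ContainerLaunchContext newCtx)
      throws Exception {
    nm.reInitializeContainer(id, newCtx, false);
    // Later, depending on health checks:
    //   nm.rollbackLastReInitialization(id);  // revert to the old version
    //   nm.commitLastReInitialization(id);    // make the upgrade permanent
  }
}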
Other issues
⬢ Log rotation
– Log rotation used to run only on application completion
– Support has been added to fetch the logs of running containers
⬢ Support for container monitoring/health checks
In Conclusion
⬢ Service workloads are becoming more and more popular on YARN
⬢ The fundamental pieces needed to support services are in place, but a few additional pieces remain
Thank you!