field notes: yarn meetup at linkedin
DESCRIPTION
Notes from a variety of speakers at the YARN Meetup at LinkedIn on Sept 27, 2013
TRANSCRIPT
YARN Meetup, Sep 2013 @ LinkedIn
By (lots of speakers)
(Editable) Agenda
• Hadoop 2.0 beta – Vinod Kumar Vavilapalli
  – YARN APIs stability
  – Existing applications
• Application History Server – Mayank Bansal
• RM reliability – Bikas Saha, Jian He, Karthik Kambatla
  – RM restartability
  – RM fail-over
• Apache Tez – Hitesh Shah, Siddharth Seth
• Apache Samza
• Apache Giraph
• Apache Helix
• gohadoop: Go YARN application – Arun
• Llama – Alejandro
Hadoop 2.0 beta
Vinod Kumar Vavilapalli
Hadoop 2.0 beta
• Stable YARN APIs
• MR binary compatibility
• Testing with the whole stack
• Ready for prime time!
YARN API stability
• YARN-386
• Broke APIs for one last time (hopefully)
• Exceptions/method names
• Security: tokens used irrespective of Kerberos
• Read-only IDs, factories for creating records
• Protocols renamed!
• Client libraries
Compatibility for existing applications
• MAPREDUCE-5108
• Old mapred APIs are binary compatible (see the sketch below)
• New mapreduce APIs are source compatible
• Pig, Hive, Oozie, etc. work with the latest versions; no need to rewrite your scripts.
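To make the compatibility point concrete, here is a minimal sketch of a driver written purely against the old org.apache.hadoop.mapred API; per MAPREDUCE-5108, a jar like this compiled against Hadoop 1.x should run unchanged on a 2.0 YARN cluster. The class name and paths are placeholders, and the default IdentityMapper/IdentityReducer are assumed so no user classes are needed.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

// Hypothetical legacy driver: everything here is the old mapred API, and the
// default IdentityMapper/IdentityReducer are used so no user classes appear.
public class LegacyMapredDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(LegacyMapredDriver.class);
    conf.setJobName("legacy-identity-job");
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path("/tmp/in"));   // placeholder input
    FileOutputFormat.setOutputPath(conf, new Path("/tmp/out")); // placeholder output
    JobClient.runJob(conf); // same call as on Hadoop 1.x; now lands on YARN
  }
}
```

The new org.apache.hadoop.mapreduce API, in contrast, is only source compatible, so code using it may need a recompile against the 2.0 jars.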
Application History Server
Mayank Bansal
Contributions
Mayank Bansal
Zhijie Shen
Devaraj K
Vinod Kumar Vavilapalli
YARN-321
Why do we need an AHS?
• The Job History Server is MR-specific
• Jobs which are not MR
• RM restart
• Hard-coded limits on the number of jobs
• Longer-running jobs
AHS
• Different process, or embedded in the RM
• Contains generic application data
  • Application
  • Application attempts
  • Containers
• Client interfaces
  • Web UI
  • Client interface
  • Web services

AHS History Store
• Pluggable history store (see the sketch below)
• Storage format is PB (protobuf)
  • Backward compatible
  • Much easier to evolve the storage
• HDFS implementation
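The slides do not show the store API itself; the following is only a hypothetical sketch of what a pluggable history store could look like (interface and method names are invented for illustration, not the actual YARN-321 interface). A protobuf/HDFS-backed implementation would serialize each record to files, and other back-ends could plug in behind the same interface.

```java
// Purely illustrative: a hypothetical shape for a pluggable history store.
// The actual YARN-321 interface and record types may differ; a PB/HDFS
// implementation would serialize each record to files under a history dir.
public interface AppHistoryStore {
  // Write path, driven by the RM as events happen
  void applicationStarted(String appId, String queue, long startTime);
  void applicationFinished(String appId, String finalStatus, long finishTime);
  void attemptStarted(String appId, String attemptId, String amContainerId);
  void attemptFinished(String appId, String attemptId, String trackingUrl);
  void containerFinished(String containerId, int exitStatus, String diagnostics);

  // Read path, driven by the AHS web UI / RPC / web services
  String getApplicationReport(String appId);
}
```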
[Architecture diagram: on app submission and app completion the RM writes generic application data to the history store through a store write interface; the AHS reads it back through a store reading interface and serves it via the web app, RPC, and web services.]
Remaining Work
• Security
• Command Line Interface
Next Phase
• Application-specific data ???
• Long-running services
DEMO
RM reliability
Bikas Saha, Jian He, Karthik Kambatla
RM reliability
• Restartability
• High availability
RM restartability
Jian He, Bikas Saha
Design and Work Plan
• YARN-128 – RM Restart
  – Creates the framework to store and load state information; forms the basis of failover and HA. Work is close to completion and being actively tested.
• YARN-149 – RM HA
  – Adds an HA service to the RM in order to fail over between instances. Work in active progress.
• YARN-556 – Work-preserving RM Restart
  – Supports loss-less recovery of cluster state when the RM restarts or fails over
  – Design proposal is up
• All the work is being done in a carefully planned manner directly on trunk. Code is always stable and ready.
RM Restart (YARN-128)
• Current state of the implementation
• Internal details
• Impact on applications/frameworks
• How to use
RM Restart
• Supports ZooKeeper, HDFS, and the local FileSystem as the underlying store.
• ClientRMProxy – all clients of the RM (NMs, AMs, clients) have the same retry behavior while the RM is down.
• RM restart now works in a secure environment!
Internal details
RMStateStore
• Two types of state info:
  – Application-related state info (stored asynchronously):
    • ApplicationState – ApplicationSubmissionContext (AM ContainerLaunchContext, queue, etc.)
    • ApplicationAttemptState – AM container, AMRMToken, ClientTokenMasterKey, etc.
  – RMDelegationTokenSecretManager state (not application specific; stored synchronously):
    • RMDelegationToken
    • RMDelegationToken master key
    • RMDelegationToken sequence number
RM Recovery Workflow
• Save the app on app submission
  – User-provided credentials (HDFSDelegationToken)
• Save the attempt on AM attempt launch
  – AMRMToken, ClientToken
• RMDelegationTokenSecretManager
  – Save the token and sequence number on token generation
  – Save the master key when it rolls
• RM crashes….
What happens after the RM restarts?
• Instruct the old AM to shut down
• Load the ApplicationSubmissionContext
  – Submit the application
• Load the earlier attempts
  – Load the attempt credentials (AMRMToken, ClientToken)
• Launch a new attempt
Impact on applications/frameworks
Consistency between downstream consumers of the AM and YARN
• The AM should notify its consumers that the job is done only after YARN reports it is done
  – FinishApplicationMasterResponse.getIsUnregistered()
  – The user is expected to retry this API until it becomes true (see the sketch below)
  – Similarly for kill-application (fix in progress)
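As a rough sketch (assuming an already-created ApplicationMasterProtocol proxy to the RM; the retry interval is arbitrary), the AM-side retry could look like this:

```java
import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;

public class UnregisterHelper {
  // Keep calling finishApplicationMaster until the RM confirms the app has
  // been unregistered (i.e. removed from its state store); only then should
  // the AM tell downstream consumers that the job is done.
  public static void unregister(ApplicationMasterProtocol rm) throws Exception {
    FinishApplicationMasterRequest req = FinishApplicationMasterRequest
        .newInstance(FinalApplicationStatus.SUCCEEDED,
            "" /* diagnostics */, "" /* tracking URL */);
    while (!rm.finishApplicationMaster(req).getIsUnregistered()) {
      Thread.sleep(1000); // arbitrary back-off before retrying
    }
  }
}
```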
For the MR AM
• Races:
  – JobClient: the AM crashes after the JobClient sees FINISHED, but before the RM removes the app when the app finishes
    • Bug: relaunching a FINISHED application (succeeded, failed, or killed)
  – HistoryServer: history files are flushed before the RM removes the app when the app finishes
How to use?
How to use: 3 steps (a programmatic sketch follows the list)
1. Enable RM restart
  – yarn.resourcemanager.recovery.enabled
2. Choose the underlying store you want (HDFS, ZooKeeper, or local FileSystem)
  – yarn.resourcemanager.store.class
  – FileSystemRMStateStore / ZKRMStateStore
3. Configure the address of the store
  – yarn.resourcemanager.fs.state-store.uri
  – hdfs://localhost:9000/rmstore
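A minimal programmatic sketch of the three steps, using the FileSystem store and the HDFS URI from the slide; the fully qualified store class name is an assumption, and the same keys can equally go straight into yarn-site.xml.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RMRestartConfigSketch {
  public static YarnConfiguration build() {
    YarnConfiguration conf = new YarnConfiguration();
    // 1. Enable RM restart
    conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
    // 2. Choose the underlying store (here the FileSystem-backed store)
    conf.set("yarn.resourcemanager.store.class",
        "org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore");
    // 3. Point the store at its storage location
    conf.set("yarn.resourcemanager.fs.state-store.uri",
        "hdfs://localhost:9000/rmstore");
    return conf;
  }
}
```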
YARN – Failover
Karthik Kambatla
RM HA (YARN-149)
● Architecture
● Failover / Admin
● Fencing
● Config changes
● FailoverProxyProvider
Architecture
● Active / Standby
  ○ Standby is powered up, but doesn’t have any state
● Restructure RM services (YARN-1098)
  ○ Always-on services
  ○ Active services (e.g. Client <-> RM, AM <-> RM)
● RMHAService (YARN-1027)
Failover / Admin
● CLI: yarn rmhaadmin
● Manual failover
● Automatic failover (YARN-1177)
  ○ Use ZKFC
  ○ Start it as an RM service instead of a separate daemon
  ○ Re-visit and strip out unnecessary parts
Fencing (YARN-1222)
● Implicit fencing through the ZK RM StateStore
● ACL-based fencing on store.load() during the transition to active (see the sketch below)
  ○ Shared read-write-admin access to the store
  ○ Claim exclusive create-delete access
  ○ All store operations create and delete a fencing node
  ○ The other RM can’t write to the store anymore
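A rough sketch of the ACL scheme described above, expressed with plain ZooKeeper ACL objects; the digest strings are placeholders, and the real ZKRMStateStore wiring is more involved than this.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.zookeeper.ZooDefs.Perms;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class FencingAclsSketch {
  // Both RMs share read/write/admin on the store root, but only the currently
  // active RM holds the credentials behind the create/delete ACL. Since every
  // store operation creates and deletes a fencing node, a fenced-out RM fails
  // on its next write instead of silently corrupting state.
  public static List<ACL> storeAcls(String sharedDigest, String exclusiveDigest) {
    Id shared = new Id("digest", sharedDigest);       // known to both RMs
    Id exclusive = new Id("digest", exclusiveDigest); // known only to this RM
    return Arrays.asList(
        new ACL(Perms.READ | Perms.WRITE | Perms.ADMIN, shared),
        new ACL(Perms.CREATE | Perms.DELETE, exclusive));
  }
}
```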
Config changes (YARN-1232, YARN-986)
1. <name>yarn.resourcemanager.address</name><value>clusterid</value>
2. <name>yarn.resourcemanager.ha.nodes.clusterid</name><value>rm1,rm2</value>
3. <name>yarn.resourcemanager.ha.id</name><value>rm1</value>
4. <name>yarn.resourcemanager.address.clusterid.rm1</name><value>host1:23140</value>
5. <name>yarn.resourcemanager.address.clusterid.rm2</name><value>host2:23140</value>
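For illustration only, the five properties above expressed programmatically, using the proposed key names exactly as listed; the cluster id, RM ids, hosts, and ports are the slide's placeholder values.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RMHAConfigSketch {
  public static YarnConfiguration build() {
    YarnConfiguration conf = new YarnConfiguration();
    conf.set("yarn.resourcemanager.address", "clusterid");
    conf.set("yarn.resourcemanager.ha.nodes.clusterid", "rm1,rm2");
    conf.set("yarn.resourcemanager.ha.id", "rm1"); // which RM this process is
    conf.set("yarn.resourcemanager.address.clusterid.rm1", "host1:23140");
    conf.set("yarn.resourcemanager.address.clusterid.rm2", "host2:23140");
    return conf;
  }
}
```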
FailoverProxyProvider
● ConfiguredFailoverProxyProvider (YARN-1028)
  ○ Uses alternate RMs from the config during retry (see the sketch below)
  ○ ClientRMProxy
    ■ addresses client-facing RPC addresses
  ○ ServerRMProxy
    ■ addresses server-facing RPC addresses
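As a rough client-side sketch: a YARN client obtains its RM proxy through ClientRMProxy, and with a failover proxy provider configured the same proxy is expected to retry against the alternate RM transparently. The cluster-metrics call below is just there to exercise the proxy.

```java
import org.apache.hadoop.yarn.api.ApplicationClientProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.GetClusterMetricsRequest;
import org.apache.hadoop.yarn.client.ClientRMProxy;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RMClientSketch {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    // ClientRMProxy resolves the client-facing RM address(es) from the config;
    // the configured failover proxy provider decides how to retry across RMs.
    ApplicationClientProtocol rm =
        ClientRMProxy.createRMProxy(conf, ApplicationClientProtocol.class);
    int nodes = rm.getClusterMetrics(GetClusterMetricsRequest.newInstance())
        .getClusterMetrics().getNumNodeManagers();
    System.out.println("Active NodeManagers: " + nodes);
  }
}
```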
Apache TEZ
Hitesh Shah
What is Tez?
• A data processing framework that can execute a complex DAG of tasks.
Tez DAG and Tasks
TEZ as a YARN Application
• No deployment of TEZ jars required on all nodes in the cluster
  – Everything is pushed from either the client or from HDFS to the cluster using YARN’s LocalResource functionality
  – Ability to run multiple different versions
• TEZ Sessions
  – A single AM can handle multiple DAGs (“jobs”)
  – Amortize and hide platform latency
• Exciting new features
  – Support for complex DAGs – broadcast joins (Hive map joins)
  – Support for lower latency – container reuse and shared objects
  – Support for dynamic concurrency control – determine reduce parallelism at runtime
TEZ: Community
• Early adopters and contributors welcome
  – Adopters to drive more scenarios; contributors to make them happen.
  – The Hive and Pig communities are on board and making great progress – see HIVE-4660 and PIG-3446 for Hive-on-Tez and Pig-on-Tez
• Meetup group – please sign up to know more
  – http://www.meetup.com/Apache-Tez-User-Group
• Useful links:
  – Website: http://tez.incubator.apache.org/
  – Code: http://git-wip-us.apache.org/repos/asf/incubator-tez.git
  – JIRA: https://issues.apache.org/jira/browse/TEZ
  – https://issues.apache.org/jira/browse/TEZ-65
  – Mailing lists:
    – [email protected]
    – [email protected]
Apache Tez, Apache Samza, Apache Giraph, Apache Helix
YARN usage @LinkedIn
YARN Go demo
https://github.com/hortonworks/gohadoop
Arun C Murthy
Llama
Alejandro Abdelnur