Field Notes: YARN Meetup at LinkedIn

Download Field Notes: YARN Meetup at LinkedIn

Post on 26-Jan-2015




1 download

Embed Size (px)


Notes from a variety of speakers at the YARN Meetup at LinkedIn on Sept 27, 2013


  • 1. YARN Meet Up Sep 2013 @LinkedIn By (lots of speakers)

2. (Editable) Agenda Hadoop 2.0 beta Vinod Kumar Vavilapalli YARN APIs stability Existing applications Application History Server Mayank Bansal RM reliability Bikas Saha, Jian He, Karthik Kambatla RM restartability RM fail-over Apache Tez Hitesh Shah, Siddharth Seth Apache Samza Apache Giraph Apache Helix gohadoop: go YARN application - Arun Llama - Alejandro 3. Hadoop 2.0 beta Vinod Kumar Vavilapalli 4. Hadoop 2.0 beta Stable YARN APIs MR binary compatibility Testing with the whole stack Ready for prime-time! 5. YARN API stability YARN-386 Broke APIs for one last time Hopefully Exceptions/method-names Security: Tokens used irrespective of kerberos Read-only IDs, factories for creating records Protocols renamed! Client libraries 6. Compatibility for existing applications MAPREDUCE-5108 Old mapred APIs binary compatible New mapreduce APIs source compatible Pig, Hive, Oozie etc work with latest versions. No need to rewrite your scripts. 7. Application History Server Mayank Bansal 8. ContributionsMayank Bansal Zhije Shen Devaraj K Vinod Kumar VavilapalliYARN-321 9. Why we need AHS ? Job History Server is MR specificJobs which are not MRRM RestartHard coded Limits for number of jobsLonger running jobs 10. AHS Different process or Embedded in RMContains generic application Data Application Application Attempts ContainerClient Interfaces WEB UI Client Interface Web Services 11. AHS History Store Pluggable History Store Storage Format is PB Backward Compatible Much easier to evolve the storageHDFS implementation 12. AHS Store Write InterfaceStoreRMStore Reading InterfaceApp FinishedWEB APPWS AHSRPC 13. Remaining Work SecurityCommand Line Interface 14. Next PhaseApplication Specific Data ???Long running services 15. DEMO 16. RM reliability Bikas Saha Jian He Karthik Kambatla 17. RM reliability Restartability High availabilty 18. RM restartability Jian He Bikas Saha 19. Design and Work Plan YARN-128 RM Restart Creates framework to store and load state information. Forms the basis of failover and HA. Work close to completion and being actively tested. YARN-149 RM HA Adds HA service to RM in order to failover between instances. Work in active progress. YARN-556 Work preserving RM Restart Support loss-less recovery of cluster state when RM restarts or fails over Design proposal up All the work is being done in a carefully planned manner directly on trunk. Code is always stable and ready. 20. RM Restart (YARN-128) Current state of the impl Internal details Impact on applications/frameworks How to use 21. RM Restart Supports ZooKeeper, HDFS, and local FileSystem as the underlying store. ClientRMProxy all Clients (NM, AM, clients) of RM have the same retry behavior while RM is down. RM restart is working in secure environment now! 22. Internal details 23. RMStateStore Two types of State Info: Application related state info: asynchronously ApplicationState ApplicationSubmissionContext ( AM ContainerLaunchContext, Queue, etc.) ApplicationAttemptState AM container, AMRMToken, ClientTokenMasterKey, etc. RMDelegationTokenSecretManager State(not application specific) : synchronously RMDelegationToken RMDelegationToken MasterKey RMDelegationToken Sequence Number 24. RM Recovery Workflow Save the app on app submission User Provided credentials (HDFSDelegationToken) Save the attempt on AM attempt launch AMRMToken, ClientToken RMDelegationTokenSecretManager Save the token and sequence number on token generation Save master key when it rolls RM crashes. 25. What happens after RM restarts? Instruct the old AM to shutdown Load the ApplicationSubmissionContext Submit the application Load the earlier attempts Loads the attempt credentials (AMRMToken, ClientToken) Launch a new attempt 26. Impact on applications/frameworks 27. Consistency between Downstream consumers of AM and YARN AM should notify its consumers that the job is done only after YARN reports its done FinishApplicationMasterResponse.getIsUnregister ed() User is expected to retry this API until it becomes true. Similarly, kill-application (fix in progress) 28. For MR AM Races: JobClient: AM crashes after JobClient sees FINISHED but before RM removes the app when app finishes Bugs: relaunch FINISHED application(succeeded, failed, killed) HistoryServer: History files flushed before RM removes the app when app finishes 29. How to use? 30. How to use: 3 steps 1. Enable RM restart yarn.resourcemanager.recovery.enabled 2. Choose the underlying store you want (HDFS, ZooKeeper, local FileSystem) FileSystemRMStateStore / ZKRMStateStore 3. Configure the address of the store yarn.resourcemanager.fs.state-store.uri hdfs://localhost:9000/rmstore 31. YARN Fail over Karthik Kambatla 32. RM HA (YARN-149) Architecture Failover / Admin Fencing Config changes FailoverProxyProvider 33. Architecture Active / Standby Standby is powered up, but doesnt have any state Restructure RM services (YARN-1098) Always On servicesActive Services (e.g. Client RM, AM RM) RMHAService (YARN-1027) 34. Failover / Admin CLI: yarn rmhaadmin Manual failover Automatic failover (YARN-1177) Use ZKFCStart it as an RM service instead of a separate daemon.Re-visit and strip out unnecessary parts. 35. Fencing (YARN-1222) Implicit fencing through ZK RM StateStore ACL-based fencing on store.load() during transition to active Shared read-write-admin access to the storeClaim exclusive create-delete accessAll store operations create-delete a fencing nodeThe other RM cant write to the store anymore 36. Config changes (YARN-1232, YARN-986) 1. yarn.resourcemanager.addressclusterid 2. yarn.resourcemanager.ha.nodes.clusteridrm1,rm2 3. yarn.resourcemanager.ha.idrm1 4. yarn.resourcemanager.address.clusterid.rm1host1:23140 5. yarn.resourcemanager.address.clusterid.rm2host2:23140 37. FailoverProxyProvider ConfiguredFailoverProxyProvider (YARN1028) Use alternate RMs from the config during retryClientRMProxy addresses client-based RPC addressesServerRMProxy addresses server-based RPC addresses 38. Apache TEZ Hitesh Shah 39. What is Tez? A data processing framework that can execute a complex DAG of tasks.Architecting the Future of Big Data Hortonworks Inc. 2011Page 44 40. Tez DAG and TasksArchitecting the Future of Big Data Hortonworks Inc. 2011Page 45 41. TEZ as a Yarn Application No deployment of TEZ jars required on all nodes in the Cluster Everything is pushed from either the client or from HDFS to the Cluster using YARNs LocalResource functionality Ability to run multiple different versions TEZ Sessions A single AM can handle multiple DAGs (jobs) Amortize and hide platform latency Exciting new features Support for complex DAGs broadcast joins (Hive map joins) Support for lower latency container reuse and shared objects Support for dynamic concurrency control determine reduce parallelism at runtimeArchitecting the Future of Big Data Hortonworks Inc. 2011Page 46 42. TEZ: Community Early adopters and contributors welcome Adopters to drive more scenarios. Contributors to make them happen. Hive and Pig communities are on-board and making great progress - HIVE-4660 and PIG-3446 for Hive-on-Tez and Pig-on-Tez Meetup group Please sign up to know more Useful Links: Website: Code: JIRA: Mailing Lists: user-subscribe@tez.incubator.apache.org the Future of Big Data Hortonworks Inc. 2011Page 47 43. Hortonworks Inc. 2011 44. Apache Tez Apache Samza Apache Giraph Apache Helix YARN usage @LinkedIn 45. YARN Go demo Arun C Murthy 46. Llama Alejandro Abdelnur