A Bird's-Eye View of Pig and Scalding Jobs with hRaven
DESCRIPTION
As Twitter's use of MapReduce rapidly expands, tracking usage on our clusters grows correspondingly more difficult. With an ever-increasing job load and a reliance on higher-level abstractions such as Pig and Scalding, the utility of existing tools for viewing job history decreases rapidly, and extracting insights becomes a challenge. At Twitter, we created hRaven to fill this gap. hRaven archives the full history and metrics from all MapReduce jobs on our clusters, and strings together each job from a Pig or Scalding script execution into a combined flow. From this archive, we can easily derive aggregate resource utilization by user, pool, or application, while the historical trending of an individual application allows us to optimize resource scheduling at runtime. We will cover how hRaven provides a rich historical archive of MapReduce job execution, and how the data is structured into higher-level flows representing the job sequence for frameworks such as Pig, Scalding, and Hive. We will then explore how we mine hRaven data to account for Hadoop resource utilization, to optimize runtime scheduling, and to identify common anti-patterns in user jobs. Finally, we will look at the end user experience, including Ambrose integration for flow visualization.
TRANSCRIPT
![Page 1: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/1.jpg)
A Bird’s-Eye View of Pig and Scalding with hRaven
a tale by @gario and @joep
Hadoop Summit 2013
v1.2
![Page 2: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/2.jpg)
@Twitter#HadoopSummit2013 2
About the authors
• Apache HBase PMC member and Committer
• Software Engineer @ Twitter
• Core Storage Team - Hadoop/HBase

• Software Engineer @ Twitter
• Engineering Manager, Hadoop/HBase team @ Twitter
![Page 3: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/3.jpg)
Table of Contents
• Chapter 1: The Problem
• Chapter 2: Why hRaven?
• Chapter 3: How Does it Work?
  • 3a: Loading
  • 3b: Table structure / querying
• Chapter 4: Current Uses
• Appendix: Future Work
![Page 4: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/4.jpg)
Chapter 1: The Problem
Illustration by Sirxlem (CC BY-NC-ND 3.0)
![Page 5: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/5.jpg)
Chapter 1: Mismatched Abstractions
• Most users run Pig and Scalding scripts, not straight map reduce
• JobTracker UI shows jobs, not DAGs of jobs generated by Pig and Scalding
![Page 6: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/6.jpg)
Chapter 1: A Problem of Scale
![Page 7: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/7.jpg)
Chapter 1: Questions
• How many Pig versus Scalding jobs do we run?
• What cluster capacity do jobs in my pool take?
• How many jobs do we run each day?
• What % of jobs have > 30k tasks?
• Why do I need to hand-tune these (hundreds of) jobs? Can’t the cluster learn?
![Page 8: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/8.jpg)
Chapter 1: Questions
#Nevermore
![Page 9: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/9.jpg)
Chapter 2: Why hRaven?
Photo by DAVID ILIFF. License: CC-BY-SA 3.0
![Page 10: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/10.jpg)
Chapter 2: Why hRaven?
• Stores stats, configuration and timing for every map reduce job on every cluster
• Structured around the full DAG of jobs from a Pig or Scalding application
• Easily queryable for historical trending
• Allows for Pig reducer optimization based on historical run stats
• Keep data online forever (12.6M jobs, 4.5B tasks + attempts)
![Page 11: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/11.jpg)
Chapter 2: Key Concepts
• cluster - each cluster has a unique name mapping to the Job Tracker
• user - map reduce jobs are run as a given user
• application - a Pig or Scalding script (or plain map reduce job)
• flow - the combined DAG of jobs executed from a single run of an application
• version - changes impacting the DAG are recorded as a new version of the same application
![Page 12: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/12.jpg)
Chapter 2: Application Flows
Edgar
![Page 13: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/13.jpg)
![Page 14: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/14.jpg)
Chapter 2: Flow Storage
• All jobs in a flow are ordered together
![Page 15: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/15.jpg)
Chapter 2: Flow Storage
• Most recent flow is ordered first
![Page 16: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/16.jpg)
Chapter 2: Key Features
• All jobs in a flow are ordered together
• Per-job metrics stored
  • Total map and reduce tasks
  • HDFS bytes read / written
  • File bytes read / written
  • Total map and reduce slot milliseconds
• Easy to aggregate stats for an entire flow
• Easy to scan the timeseries of each application’s flows
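Because all jobs of a flow are stored next to each other, flow-level totals amount to summing the per-job metrics over one contiguous range of rows. The following is a minimal sketch of that aggregation, assuming the per-job metrics have already been read into memory. The `JobMetrics` class, its field names, and `aggregate_flow` are hypothetical and not taken from hRaven's API.

```python
from dataclasses import dataclass, fields


@dataclass
class JobMetrics:
    """Hypothetical container for the per-job metrics listed on the slide."""
    total_maps: int
    total_reduces: int
    hdfs_bytes_read: int
    hdfs_bytes_written: int
    file_bytes_read: int
    file_bytes_written: int
    map_slot_millis: int
    reduce_slot_millis: int


def aggregate_flow(jobs):
    """Sum every metric across the jobs that make up one flow."""
    totals = {f.name: 0 for f in fields(JobMetrics)}
    for job in jobs:
        for name in totals:
            totals[name] += getattr(job, name)
    return JobMetrics(**totals)


# Two jobs from the same (made-up) flow.
flow = [
    JobMetrics(10, 2, 100, 50, 20, 10, 1000, 400),
    JobMetrics(5, 1, 200, 25, 0, 0, 500, 100),
]
flow_totals = aggregate_flow(flow)
print(flow_totals)
```

The same summation applied to successive flows of one application would give the per-run timeseries mentioned in the last bullet.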
![Page 17: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/17.jpg)
Chapter 3: How Does it Work?
![Page 18: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/18.jpg)
Chapter 3: ETL - Step 1: JobFilePreprocessor
![Page 19: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/19.jpg)
Chapter 3: ETL - Step 2: JobFileRawLoader
![Page 20: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/20.jpg)
Chapter 3: ETL - Step 3: JobFileProcessor
![Page 21: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/21.jpg)
Chapter 3: ETL - Step 3: JobFileProcessor
Jobs finish out of order with respect to job_id
![Page 22: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/22.jpg)
Chapter 3: Tables
• job_history_raw
• job_history
• job_history_task
• job_history_app_version
![Page 23: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/23.jpg)
Chapter 3: job_history_raw
Row key: cluster!jobID
Columns:
• jobconf - stores serialized raw job_*_conf.xml file
• jobhistory - stores serialized raw job history log file
• job_processed_success - indicates whether job has been processed
![Page 24: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/24.jpg)
Chapter 3: job_history
Row key: cluster!user!application!timestamp!jobID
• cluster - unique cluster name (e.g. “cluster1@dc1”)
• user - user running the application (“edgar”)
• application - application ID derived from job configuration:
  • uses “batch.desc” property if set
  • otherwise parses a consistent ID from “mapred.job.name”
• timestamp - submission time, inverted (Long.MAX_VALUE - value)
• jobID - stored as Job Tracker start time (long), concatenated with job sequence number
  • job_201306271100_0001 -> [1372352073732L][1L]
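The key layout above can be sketched in a few lines to show why the inverted timestamp places the most recent run first in a byte-ordered scan. This is an illustration only: the 8-byte big-endian encoding, the helper names, and the sample submission times are assumptions, not hRaven's actual serialization. `LONG_MAX` stands in for Java's `Long.MAX_VALUE`.

```python
import struct

LONG_MAX = 2**63 - 1  # value of Java's Long.MAX_VALUE
SEP = b"!"            # separator shown in the slide's row key


def inverted_timestamp(submit_millis: int) -> int:
    """Later submission times map to smaller values, so they sort first."""
    return LONG_MAX - submit_millis


def encode_job_id(jt_start: int, sequence: int) -> bytes:
    """Assumed layout: two fixed-width big-endian longs, start time then sequence."""
    return struct.pack(">qq", jt_start, sequence)


def job_history_key(cluster, user, app, submit_millis, jt_start, sequence) -> bytes:
    """Assemble cluster!user!application!timestamp!jobID as bytes."""
    return SEP.join([
        cluster.encode(),
        user.encode(),
        app.encode(),
        struct.pack(">q", inverted_timestamp(submit_millis)),
        encode_job_id(jt_start, sequence),
    ])


# Two runs of the same application; the submission times are made up.
older = job_history_key("cluster1@dc1", "edgar", "wordcount",
                        1372352075000, 1372352073732, 1)
newer = job_history_key("cluster1@dc1", "edgar", "wordcount",
                        1372352080000, 1372352073732, 2)

# Byte-wise comparison puts the newer run ahead of the older one.
print(newer < older)
```

A real implementation would also have to make sure the separator byte cannot be confused with bytes inside the binary fields; the sketch ignores that.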
![Page 25: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/25.jpg)
Chapter 3: job_history_task
Row key: cluster!user!application!timestamp!jobID!taskID
• same components as job_history key (same ordering)
• taskID - (e.g. “m_00001”) uniquely identifies individual task/attempt in job
Two row types:
• Task - “meta” row
  cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001
• Task Attempt - individual execution on a Task Tracker
  cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001_1
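The two row types can be told apart from the taskID suffix alone. A small sketch, assuming taskIDs always have the shape shown in the examples (`type_number` for a task, `type_number_attempt` for an attempt); `classify_task_row` is a hypothetical helper, not part of hRaven.

```python
def classify_task_row(task_id: str):
    """Return (row_type, task_type, task_number, attempt_number)."""
    parts = task_id.split("_")
    if len(parts) == 2:
        task_type, number = parts
        return ("task", task_type, int(number), None)
    if len(parts) == 3:
        task_type, number, attempt = parts
        return ("attempt", task_type, int(number), int(attempt))
    raise ValueError(f"unexpected taskID shape: {task_id!r}")


print(classify_task_row("m_00001"))
print(classify_task_row("m_00001_1"))

# The task ID is a prefix of its attempt IDs, so in a lexically sorted
# scan the "meta" row comes immediately before its attempts.
print("m_00001" < "m_00001_1")
```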
![Page 26: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/26.jpg)
Chapter 3: job_history_app_version
Row key: cluster!user!application
Example: cluster1@dc1!edgar!wordcount
Columns:
v1=1369585634000
v2=1372263813000
![Page 27: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/27.jpg)
Chapter 3: Querying hRaven
• Using Pig’s HBaseStorage (or direct HBase APIs)
• Through Client API
• Through REST API
![Page 28: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/28.jpg)
Chapter 4: Current Uses
![Page 29: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/29.jpg)
Chapter 4: Current Uses
• Pig reducer optimizations
• Cluster utilization / capacity planning
• Application performance trending over time
• Identifying common job anti-patterns
• Ad-hoc analysis troubleshooting cluster problems
![Page 30: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/30.jpg)
Chapter 4: Cluster reads-writes
![Page 31: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/31.jpg)
Chapter 4: Pool / Application reads/writes
• Pool view
  • Spike in File size read
  • Indicates jobs spilling
• Application view
  • Spike in HDFS size read
  • Indicates spiking input
![Page 32: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/32.jpg)
Chapter 4: Pool usage: Used vs. Allocated
![Page 33: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/33.jpg)
Chapter 4: Compute cost
![Page 34: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/34.jpg)
Appendix: Future Work
![Page 35: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/35.jpg)
Appendix: Future Work
• Real-time data loading from Job Tracker / Application Master
• Full flow-centric UI (Job Tracker UI replacement)
• Hadoop 2.0 compatibility (in-progress)
• Ambrose integration
![Page 36: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/36.jpg)
Additional Resources
• hRaven on GitHub: https://github.com/twitter/hraven
• hRaven Mailing List: [email protected]
![Page 37: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/37.jpg)
Afterword
Now will thou drop your job data on the floor?
Quoth the hRaven, 'Nevermore.'
![Page 38: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/38.jpg)
#TheEnd
@gario and @joep
Come visit us at booth #26 to continue the story
![Page 39: A Birds-Eye View of Pig and Scalding Jobs with hRaven](https://reader035.vdocuments.us/reader035/viewer/2022070315/55510281b4c9057b478b4eb3/html5/thumbnails/39.jpg)
Sort order
Variable length job_id
• Lexical order
  job_201306271100_10000
  job_201306271100_100000
  job_201306271100_1000000
  job_201306271100_9999
  job_201306271100_99999
  job_201306271100_999999
• Desired order
  job_201306271100_9999
  job_201306271100_10000
  ...
  job_201306271100_99999
  job_201306271100_100000
  ...
  job_201306271100_999999
  job_201306271100_1000000
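The two orderings on this slide can be reproduced in a few lines. This sketch sorts the same job IDs first as plain strings and then by a fixed-width binary key. The 8-byte big-endian layout and the `fixed_width_key` helper are assumptions for illustration, and the timestamp portion of the ID is parsed as a plain integer for simplicity rather than converted to epoch milliseconds.

```python
import struct

job_ids = [
    "job_201306271100_9999",
    "job_201306271100_10000",
    "job_201306271100_99999",
    "job_201306271100_100000",
    "job_201306271100_999999",
    "job_201306271100_1000000",
]


def fixed_width_key(job_id: str) -> bytes:
    """Encode both numeric parts as 8-byte big-endian longs (assumed layout)."""
    _, start, sequence = job_id.split("_")
    return struct.pack(">qq", int(start), int(sequence))


# Plain string sort: the variable-length sequence number breaks the order.
lexical_order = sorted(job_ids)

# Sorting on the fixed-width bytes matches numeric order of the sequence.
desired_order = sorted(job_ids, key=fixed_width_key)

for job_id in lexical_order:
    print("lexical:", job_id)
for job_id in desired_order:
    print("desired:", job_id)
```

Fixed-width encoding works here because every key has the same length, so byte-wise comparison agrees with numeric comparison for non-negative values.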