![Page 1: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/1.jpg)
Tailored for Spark
Hadoop Summit Dublin 2016
Petr Igrevski
John Scheibmeir
eBay
![Page 2: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/2.jpg)
eBay - Tailored for Spark 2
How to tailor Spark for maximum impact
1. Optimal infrastructure
2. Customized user experience
![Page 3: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/3.jpg)
Outline
1. eBay, Analytics, Hadoop, and Spark
2. Spark Opportunities at eBay
3. Q/A
![Page 4: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/4.jpg)
BACKGROUND
![Page 5: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/5.jpg)
eBay
Q4 2015
![Page 6: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/6.jpg)
Analytics at eBay
• Analytics
• BI
  – Kylin
  – MicroStrategy
  – Tableau
  – R / SAS
• ETL
  – Ab Initio
• Data Platform
  – Hadoop
  – Teradata
• Streaming, Spark
![Page 7: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/7.jpg)
Hadoop at eBay
1. Search Index
2. Log Management
3. Operation Metric Management
4. Analytics
![Page 8: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/8.jpg)
Hadoop Hardware
Multiple Generations
12-18 Cores
72-128GB RAM
24-72TB Storage
Provisioned by cabinet
![Page 9: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/9.jpg)
Spark at eBay
• Uses
  – Spark 1.4 to Spark 1.6
• Methods
  – YARN
• Current utilization
  – 20% analytic clusters
• Use Cases
  – Purchase Suggestions
  – Marketing Optimization
  – Customer Interests, Consistency, and Similarity
  – Kylin Cube Building
![Page 10: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/10.jpg)
Spark Challenges
• Capacity Management and Efficiency
  – MapReduce => YARN
  – Job Sizing
• Support
  – Missing vendor support
  – Missing expertise
• Deployment
  – Library conflicts
  – Configuration challenges
  – Distribution sprawl
• Integration
  – Configuration
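The "Job Sizing" challenge above can be illustrated with a little arithmetic. The sketch below is not from the talk: the executor shape (4 cores, 16 GB), the cores and memory reserved for the OS and Hadoop daemons, and the 10% / 384 MB memory overhead rule are all assumptions chosen for illustration. It estimates how many executors of one shape fit on the smallest and largest node sizes listed on the Hadoop Hardware slide.

```python
def executors_per_node(node_cores, node_mem_gb, executor_cores, executor_mem_gb,
                       reserved_cores=1, reserved_mem_gb=8,
                       overhead_fraction=0.10, min_overhead_gb=0.375):
    """Estimate how many executors of a given shape fit on one node.

    All defaults are illustrative assumptions, not values from the slides:
    - reserved_cores / reserved_mem_gb: headroom left for the OS and daemons
    - overhead_fraction / min_overhead_gb: off-heap overhead added on top of
      the executor heap when sizing the YARN container (10%, at least 384 MB)
    """
    overhead_gb = max(min_overhead_gb, overhead_fraction * executor_mem_gb)
    container_gb = executor_mem_gb + overhead_gb

    usable_cores = node_cores - reserved_cores
    usable_mem_gb = node_mem_gb - reserved_mem_gb

    by_cores = usable_cores // executor_cores
    by_memory = int(usable_mem_gb // container_gb)
    # The scarcer resource decides how many executors fit.
    return max(0, min(by_cores, by_memory))


# Node sizes taken from the hardware ranges on the slide (12-18 cores, 72-128 GB).
for cores, mem_gb in [(12, 72), (18, 128)]:
    n = executors_per_node(cores, mem_gb, executor_cores=4, executor_mem_gb=16)
    print(f"{cores} cores / {mem_gb} GB node: {n} executors of 4 cores / 16 GB")
```

Under these assumptions the same executor shape packs differently on each hardware generation, which is one reason sizing a job for a mixed cluster takes care.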
![Page 11: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/11.jpg)
TAILORING SPARK
Simple things should be simple. Complex things should be possible.
Alan Kay
![Page 12: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/12.jpg)
We can
• Copy
• Test
• Run
![Page 13: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/13.jpg)
Opportunities for Spark
• Flexibility
• Usability
• Simplicity
• Speed
• Transparency
![Page 14: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/14.jpg)
On YARN
• Security
• Multitenancy
• Reliability
• Experience
• Performance
[Diagram: Spark on YARN; storage: HDFS, HDFS, SWIFT, NFS; Kerberos]
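As a rough illustration of running Spark on a Kerberos-secured YARN cluster, the sketch below assembles a `spark-submit` command line. It is not taken from the talk: the jar path, class name, principal, keytab, and resource sizes are hypothetical placeholders, and the helper `build_spark_submit` is invented for this example. The flags themselves (`--master yarn`, `--deploy-mode`, `--num-executors`, `--executor-cores`, `--executor-memory`, `--principal`, `--keytab`) are standard `spark-submit` options for YARN.

```python
import shlex


def build_spark_submit(app_jar, main_class, num_executors, executor_cores,
                       executor_memory, principal=None, keytab=None):
    """Build a spark-submit argument list for a YARN cluster-mode job.

    Hypothetical helper for illustration; it only assembles arguments and
    does not launch anything.
    """
    cmd = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
    ]
    # On a Kerberized cluster, a principal and keytab let a long-running job
    # renew its credentials; both values here are placeholders.
    if principal and keytab:
        cmd += ["--principal", principal, "--keytab", keytab]
    cmd.append(app_jar)
    return cmd


example = build_spark_submit(
    app_jar="my-job.jar",                  # hypothetical artifact
    main_class="com.example.MyJob",        # hypothetical class
    num_executors=10,
    executor_cores=4,
    executor_memory="16g",
    principal="analyst@EXAMPLE.COM",       # hypothetical principal
    keytab="/path/to/analyst.keytab",      # hypothetical keytab path
)
print(" ".join(shlex.quote(arg) for arg in example))
```

Keeping the command as a list rather than a single string avoids quoting mistakes if it is later handed to a process launcher.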
![Page 15: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/15.jpg)
Does it fit?
• Compute
• Storage
• Network
• Provisioning
Shared Compute resources
Independently scalable storage
Flat Network
![Page 16: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/16.jpg)
Can we make it feel better?
• Standard ADLC
• Test to your level of comfort
• Single click deployment
• Watch every step
• Certify your job
• Let it run
• Did you say UI?
Development
Test
Packaging
Certification
Runtime
Register
Repos
CI
Metadata DB
Provisioning
Runtime farm
Orchestrator
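The stage sequence on this slide (Development, Test, Packaging, Certification, Runtime) can be sketched as a simple linear lifecycle. The following is only an illustration under assumptions, not eBay's implementation: the `JobLifecycle` class, its methods, and the job name are hypothetical, and it assumes a job may run only once it has passed every earlier stage.

```python
# Stage names are taken from the slide; everything else is hypothetical.
STAGES = ["Development", "Test", "Packaging", "Certification", "Runtime"]


class JobLifecycle:
    """Track a job as it is promoted through the stages, one step at a time."""

    def __init__(self, name):
        self.name = name
        # Keeping the full history makes every step observable afterwards.
        self.history = [STAGES[0]]

    @property
    def stage(self):
        return self.history[-1]

    def promote(self):
        """Move the job to the next stage; stages cannot be skipped."""
        idx = STAGES.index(self.stage)
        if idx == len(STAGES) - 1:
            raise ValueError(f"{self.name} is already in the final stage")
        self.history.append(STAGES[idx + 1])
        return self.stage

    def can_run(self):
        """Assumption: a job runs only after it has reached Runtime."""
        return self.stage == "Runtime"


job = JobLifecycle("purchase-suggestions")  # hypothetical job name
while not job.can_run():
    print(f"{job.name}: {job.stage} -> {job.promote()}")
print("history:", " > ".join(job.history))
```

Forcing promotions to go one stage at a time is what makes a certification gate meaningful: there is no path to Runtime that bypasses Certification.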
![Page 17: Tailored for Spark](https://reader035.vdocuments.us/reader035/viewer/2022070603/58713b971a28abf0568b6e73/html5/thumbnails/17.jpg)
Q/A