Alluxio: The Missing Piece of On-Demand Clusters at Alluxio Meetup 2016
TRANSCRIPT
About Me
• Calvin Jia
• Software Engineer @ Alluxio, Inc.
• Alluxio PMC
• #1 Alluxio Contributor
• Twitter: @JiaCalvin
Alluxio Inc.
• Founded by Alluxio creators and top committers
• Formerly Tachyon Nexus, Inc.
• $7.5 million Series A by Andreessen Horowitz
• Committed to the Alluxio Open Source Project
• Company Website: http://www.alluxio.com
• We are hiring!
Cloud Architectures – Overview
• Mostly service based, from providers
  – Amazon Web Services
  – Google Cloud Platform
• Separate compute and storage clusters
• Compute clusters are ephemeral
Cloud Architectures – Pros & Cons
Pros
• Low maintenance
• Pay as you go
• Elastic and scalable
• Cost effective storage
Cons
• Lower performance
Alluxio in Cloud Architectures – Overview
• Deployed in compute clusters
  – Memory speed data access
  – Transparent data access to any storage
• Simple to deploy
  – Mount storage systems similar to local disks
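The idea of mounting storage systems "similar to local disks" can be pictured as a table that maps paths in one namespace onto storage URIs. The following is a minimal toy sketch of that idea, not the Alluxio API: the class `MountTable`, its methods, and the bucket names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MountTable:
    """Toy model of a unified namespace (hypothetical, not the Alluxio API)."""
    mounts: dict = field(default_factory=dict)

    def mount(self, mount_point: str, storage_uri: str) -> None:
        # Normalize so "/s3/" and "/s3" refer to the same mount point.
        self.mounts[mount_point.rstrip("/") or "/"] = storage_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        # Pick the longest mount point that is a prefix of the path.
        best = None
        for point in self.mounts:
            prefix = "" if point == "/" else point
            if path == point or path.startswith(prefix + "/"):
                if best is None or len(point) > len(best):
                    best = point
        if best is None:
            raise KeyError(f"no mount covers {path}")
        suffix = path if best == "/" else path[len(best):]
        return self.mounts[best] + suffix


if __name__ == "__main__":
    table = MountTable()
    # Hypothetical buckets; any storage URI would work the same way.
    table.mount("/s3", "s3a://example-bucket/data")
    table.mount("/gcs", "gs://example-bucket")
    print(table.resolve("/s3/logs/day1.txt"))
```

An application only sees paths such as `/s3/logs/day1.txt`; which storage system sits behind each path is decided by the mount table, which is the sense in which access is transparent.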
Alluxio in Cloud Architectures – Benefits
• Remedies the performance drawback
• Acceleration due to memory-speed I/O
• Designed to improve the affinity of compute and storage
Alluxio in Cloud Architectures – Data Path
[Figure: data path. Speed labels: FAST (10^4 - 10^5 MB/s), MODERATE (10^3 - 10^4 MB/s), SLOW (10^2 - 10^3 MB/s). Access frequency labels: Often, Limited, Only when necessary. Storage media: Mem, SSD, HDD.]
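To get a feel for what the three speed bands mean in practice, the sketch below works out how long a single sequential read would take in each band. It assumes the figure's labels are powers of ten (10^4 to 10^5, 10^3 to 10^4, and 10^2 to 10^3 MB/s), and the 100,000 MB dataset size is a hypothetical example; latency, parallelism, and network effects are ignored.

```python
# Bandwidth ranges in MB/s, reading the slide's labels as powers of ten.
TIERS = {
    "FAST": (10**4, 10**5),
    "MODERATE": (10**3, 10**4),
    "SLOW": (10**2, 10**3),
}


def read_seconds(size_mb: float, bandwidth_mb_s: float) -> float:
    """Time for one sequential read at a constant bandwidth."""
    return size_mb / bandwidth_mb_s


def read_time_ranges(size_mb: float) -> dict:
    """Map each tier to (best_case_seconds, worst_case_seconds)."""
    return {
        name: (read_seconds(size_mb, high), read_seconds(size_mb, low))
        for name, (low, high) in TIERS.items()
    }


if __name__ == "__main__":
    # Hypothetical dataset of 100,000 MB (about 100 GB).
    for name, (best, worst) in read_time_ranges(100_000).items():
        print(f"{name}: {best:g} to {worst:g} s")
```

Each band is ten times slower than the one before it, so a read that falls through from the fastest band to the slowest can take up to a thousand times longer under these assumptions.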
Takeaways – Experiment Results
[Figure: bar chart of run time (seconds, axis 0 to 600) for Initial Read, Subsequent Read, and Read from Separate Job, comparing Spark - No Persist, Spark - Persist, and Alluxio.]
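The three read scenarios in the chart (initial read, subsequent read, read from a separate job) can be reasoned about with a toy model that counts how often remote storage is hit. This sketch does not reproduce the measured run times. It assumes that a persisted Spark dataset is cached only within the job that persisted it, while data held in Alluxio is shared across jobs; the classes `UnderStore` and `Job` are hypothetical stand-ins, not Spark or Alluxio APIs.

```python
class UnderStore:
    """Stands in for remote storage and counts how often it is read."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.data[key]


class Job:
    """A compute job with an optional private cache and optional shared cache."""

    def __init__(self, store, private_cache=False, shared_cache=None):
        self.store = store
        self.private = {} if private_cache else None
        self.shared = shared_cache

    def read(self, key):
        # Serve from any cache that already holds the key.
        for cache in (self.private, self.shared):
            if cache is not None and key in cache:
                return cache[key]
        # Otherwise go to remote storage and fill the available caches.
        value = self.store.read(key)
        for cache in (self.private, self.shared):
            if cache is not None:
                cache[key] = value
        return value


def remote_reads(private_cache: bool, use_shared: bool) -> int:
    """Run the three scenarios and return the number of remote reads."""
    store = UnderStore({"dataset": b"rows"})
    shared = {} if use_shared else None
    first = Job(store, private_cache, shared)
    first.read("dataset")    # initial read
    first.read("dataset")    # subsequent read in the same job
    second = Job(store, private_cache, shared)
    second.read("dataset")   # read from a separate job
    return store.reads


if __name__ == "__main__":
    print("no persist:", remote_reads(False, False))
    print("persist:", remote_reads(True, False))
    print("shared cache:", remote_reads(False, True))
```

In this model, only the shared cache avoids going back to remote storage when a separate job reads the same data, which is the scenario the third group of bars in the chart compares.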
Takeaways – Alluxio & Cloud Architectures
• Cloud architectures have significant upsides
• Alluxio alleviates the major downsides