Evaluating GPU Passthrough in Xen for High Performance Cloud Computing
Andrew J. Younge¹, John Paul Walters², Stephen P. Crago², and Geoffrey C. Fox¹
¹ Indiana University
² USC / Information Sciences Institute
Where are we in the Cloud?
• Cloud computing spans many areas of expertise
• Today, focus only on IaaS and the underlying hardware
• Things we do here affect the entire pyramid!
http://futuregrid.org
Motivation
• Need for GPUs on Clouds
– GPUs are becoming commonplace in scientific computing
– Great performance-per-watt
• Different competing methods for virtualizing GPUs
– Remote API for CUDA calls
– Direct GPU usage within VM
• Advantages and disadvantages to both solutions
Front-end GPU API
• Translate all CUDA calls into remote method invocations
• Users share GPUs across a node or cluster
• Can run within a VM, as no hardware is needed, only a remote API
• Many implementations for CUDA
– rCUDA, gVirtus, vCUDA, GViM, etc.
• Many desktop virtualization technologies do the same for OpenGL & DirectX
Front-end API Limitations
• Can use remote GPUs, but all data goes over the network
– Can be very inefficient for applications with non-trivial memory movement
• Usually doesn’t support CUDA extensions in C
– Have to separate CPU and GPU code
– Requires special decouple mechanism
• Cannot be dropped directly into existing solutions
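To make the data-movement cost concrete, here is a back-of-envelope sketch in Python. The bandwidth figures are assumed theoretical peaks chosen for illustration (10 Gbit/s Ethernet and PCIe 2.0 x16), and the 4 GB payload is hypothetical; none of these are measurements from this talk.

```python
def transfer_seconds(num_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Time to move num_bytes at a sustained bandwidth given in GB/s (1e9 bytes/s)."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return num_bytes / (bandwidth_gb_per_s * 1e9)


# Assumed, illustrative peak figures (not results from the slides):
NETWORK_10GBE_GB_S = 1.25  # 10 Gbit/s Ethernet is 1.25 GB/s
PCIE2_X16_GB_S = 8.0       # PCIe 2.0 x16, theoretical peak per direction

payload = 4e9  # hypothetical 4 GB working set
remote = transfer_seconds(payload, NETWORK_10GBE_GB_S)
local = transfer_seconds(payload, PCIE2_X16_GB_S)
print(f"remote API: {remote:.2f} s, local PCIe: {local:.2f} s, ratio {remote / local:.1f}x")
```

Real links rarely sustain their peak rate, so the absolute times are optimistic; the point is only that the slower link dominates once memory movement is non-trivial.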
Direct GPU Passthrough
• Allow VMs to directly access GPU hardware
• Enables CUDA and OpenCL code
• Utilizes PCI-passthrough of device to guest VM
– Uses hardware-directed I/O virtualization (VT-d or IOMMU)
– Provides direct isolation and security of device
– Removes host overhead entirely
• Similar to what Amazon EC2 uses
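A minimal sketch of how a device might be named in a Xen guest configuration: the Python helper below validates PCI addresses and emits a `pci = [...]` line in the style of a Xen guest config file. The helper name `xen_pci_line` and the address are hypothetical, and this is not the configuration used in the talk. The device also has to be made assignable on the host first (for example via the pciback driver), and the exact steps depend on the Xen version.

```python
import re

# Accepts "BB:DD.F" or "DDDD:BB:DD.F" with hex fields and a function of 0-7.
# This is a loose syntax check, not a check that the device exists.
_BDF = re.compile(r"^(?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def xen_pci_line(bdfs):
    """Return a `pci = [...]` line for a Xen guest config from PCI addresses."""
    for bdf in bdfs:
        if not _BDF.match(bdf):
            raise ValueError(f"not a PCI address: {bdf!r}")
    quoted = ", ".join(f"'{bdf}'" for bdf in bdfs)
    return f"pci = [ {quoted} ]"


# Hypothetical address; look up the real one with `lspci` on the host.
print(xen_pci_line(["0000:03:00.0"]))  # pci = [ '0000:03:00.0' ]
```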
Hardware Setup

| | Sandy Bridge + Kepler | Westmere + Fermi |
|---|---|---|
| CPU (cores) | 2x E5-2670 (16) | 2x X5660 (12) |
| Clock Speed | 2.6 GHz | 2.6 GHz |
| RAM | 48 GB | 192 GB |
| NUMA Nodes | 2 | 2 |
| GPU | 1x Nvidia Tesla K20m | 2x Nvidia Tesla C2075 |

| Type | Linux Kernel | Linux Distro |
|---|---|---|
| Native Host | 2.6.32-279 | CentOS 6.4 |
| Xen Dom0 4.2.22 | 3.4.53-8 | CentOS 6.4 |
| DomU Guest VM | 2.6.32-279 | CentOS 6.4 |
SHOC Benchmark Suite
• Developed by Future Technologies Group @ Oak Ridge National Laboratory
• Provides 70 benchmarks
– Synthetic micro benchmarks
– 3rd party applications
– OpenCL and CUDA implementations
• Gives a well-rounded view of GPU performance
Initial Thoughts
• Raw GPU computational abilities impacted less than 1% in VMs compared to base system
– Excellent sign for supporting GPUs in the Cloud
• However, overhead occurs during large transfers between CPU & GPU
– Much higher overhead for Westmere/Fermi test architecture
– Around 15% overhead in worst-case benchmark
– Sandy-bridge/Kepler overhead lower
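The overhead percentages quoted here are relative slowdowns of the VM against the native host. A minimal Python sketch of that calculation follows, assuming a higher-is-better metric such as GFLOPS or GB/s. The function name and the sample values are hypothetical, not results from the benchmark runs.

```python
def overhead_percent(native: float, virtualized: float) -> float:
    """Relative slowdown of the VM versus native for higher-is-better metrics.

    A positive result means the VM is slower than the native host.
    """
    if native <= 0:
        raise ValueError("native result must be positive")
    return (native - virtualized) / native * 100.0


# Hypothetical sample values, chosen only to show the arithmetic.
compute = overhead_percent(native=1000.0, virtualized=995.0)  # e.g. GFLOPS
bandwidth = overhead_percent(native=6.0, virtualized=5.1)     # e.g. GB/s
print(f"compute overhead: {compute:.1f}%, transfer overhead: {bandwidth:.1f}%")
```

For lower-is-better metrics such as runtime, the roles of the two arguments would be swapped.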
Discussion
• GPU Passthrough possible in Xen!
– Results show high performance GPU computation a reality with Xen
• Overhead is minimal for GPU computation
– Sandy-Bridge/Kepler has < 1.2% overall overhead
– Westmere/Fermi has < 1% computational overhead, 7-25% PCIe overhead
• PCIe overhead not likely due to VT-d mechanisms
– More likely due to NUMA configuration in Westmere CPU architecture
• GPU PCI Passthrough performs better than other front-end remote API solutions
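Since the slide points at NUMA placement rather than VT-d, one simple check on a Linux host is which NUMA node a GPU's PCIe slot is attached to. The Python sketch below reads the `numa_node` attribute that Linux exposes in sysfs for PCI devices. The function name and the address are hypothetical, this is not the diagnostic method used in the talk, and inside a guest VM the value may simply be reported as unknown (-1).

```python
from pathlib import Path


def pci_numa_node(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> int:
    """Return the NUMA node Linux reports for a PCI device, or -1 if unknown."""
    path = Path(sysfs_root) / bdf / "numa_node"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing device, unreadable file, or unexpected content.
        return -1


# Hypothetical address; substitute the GPU's address from `lspci -D`.
print(pci_numa_node("0000:03:00.0"))
```

If the process driving the GPU runs on the other socket, host-to-device transfers have to cross the inter-socket link, which is one plausible source of extra transfer overhead.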
Future Work
• Support PCI Passthrough in Cloud IaaS Framework
– OpenStack Nova
– Work for both GPUs and other PCI devices
– Show performance better than EC2
• Resolve NUMA issues with Westmere architecture and Fermi GPUs
• Evaluate other hypervisor GPU possibilities
• Support large scale distributed CPU+GPU computation in the Cloud
Conclusion
• GPUs are here to stay in scientific computing
– Many Petascale systems use GPUs
– Expected GPU Exascale machine (2020-ish)
• Providing HPC in the Cloud is key to the viability of scientific cloud computing.
• OpenStack provides an ideal architecture to enable HPC in clouds.
Thanks!
Acknowledgements:
• NSF FutureGrid project
– GPU cluster hardware
– FutureGrid team @ IU
• USC/ISI APEX research group
• Persistent Systems Graduate Fellowship
• Xen open source community
About Me:
Andrew J. Younge
Ph.D. Candidate
Indiana University
Bloomington, IN USA
Email – [email protected] – http://ajyounge.com
http://portal.futuregrid.org
Extra Slides
FutureGrid: a Distributed Testbed
Figure legend: Private, Public, FG Network; NID: Network Impairment Device
OpenStack GPU Cloud Prototype
Chart annotations (charts not reproduced): ~1.25%; ~0.64%; ~3.62%
Overhead in Bandwidth