Evaluating GPU Passthrough in Xen for High Performance Cloud Computing


Evaluating GPU Passthrough in Xen for High Performance Cloud Computing

Andrew J. Younge¹, John Paul Walters², Stephen P. Crago², and Geoffrey C. Fox¹

¹ Indiana University
² USC / Information Sciences Institute

Where are we in the Cloud?

• Cloud computing spans many areas of expertise

• Today, we focus only on IaaS and the underlying hardware

• Things we do here affect the entire pyramid!


Motivation

• Need for GPUs on Clouds
  – GPUs are becoming commonplace in scientific computing
  – Great performance-per-watt

• Different competing methods for virtualizing GPUs
  – Remote API for CUDA calls
  – Direct GPU usage within a VM

• Advantages and disadvantages to both solutions


Front-end GPU API

• Translate all CUDA calls into remote method invocations (sketched below)
• Users share GPUs across a node or cluster
• Can run within a VM, as no hardware is needed, only a remote API
• Many implementations for CUDA
  – rCUDA, gVirtus, vCUDA, GViM, etc.
• Many desktop virtualization technologies do the same for OpenGL & DirectX
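As an illustration only (not from the slides), the front-end approach usually works by interposing on the CUDA runtime library and forwarding every call to a server that owns a physical GPU. The library path, variable name, and application below are hypothetical placeholders:

  # Hypothetical client-side setup: an interposer library catches CUDA runtime
  # calls (cudaMalloc, cudaMemcpy, kernel launches) and ships them over the
  # network to a remote GPU server.
  LD_PRELOAD=/opt/gpu-remoting/libcudart-forward.so \
  GPU_SERVER=gpu-node01:9999 \
  ./my_cuda_app

Implementations such as rCUDA follow this general pattern, with the client configured to point at one or more remote GPU servers.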


Front-end GPU API (diagram)

Front-end API Limitations

• Can use remote GPUs, but all data goes over the network
  – Can be very inefficient for applications with non-trivial memory movement

• Usually doesn't support CUDA extensions in C
  – Have to separate CPU and GPU code
  – Requires a special decoupling mechanism

• Not a drop-in solution for existing applications


Direct GPU Passthrough

• Allow VMs to directly access GPU hardware
• Enables CUDA and OpenCL code
• Utilizes PCI passthrough of the device to the guest VM (example setup below)
  – Uses hardware-directed I/O virtualization (VT-d or IOMMU)
  – Provides direct isolation and security of the device
  – Removes host overhead entirely
• Similar to what Amazon EC2 uses
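As a rough sketch, not taken from the slides, assigning a GPU to a Xen guest with the xl toolstack typically looks like the following; the PCI address and guest name are placeholders, and the exact boot parameters depend on the Xen and kernel versions:

  # On the host: enable the IOMMU, e.g. "iommu=1" on the Xen command line
  # and "intel_iommu=on" for the dom0 kernel (plus VT-d enabled in the BIOS).

  # Mark the GPU's PCI function as assignable to guests
  xl pci-assignable-add 0000:0f:00.0

  # In the domU configuration file, pass the device through to the guest
  pci = [ '0000:0f:00.0' ]

  # Or hot-attach it to a running guest named "gpu-vm"
  xl pci-attach gpu-vm 0000:0f:00.0

Inside the guest, the regular NVIDIA driver and CUDA/OpenCL toolkits are then installed exactly as on bare metal.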


Direct GPU Passthrough (diagram)


Hardware Setup

                       Sandy Bridge + Kepler     Westmere + Fermi
CPU (cores)            2x E5-2670 (16)           2x X5660 (12)
Clock Speed            2.6 GHz                   2.6 GHz
RAM                    48 GB                     192 GB
NUMA Nodes             2                         2
GPU                    1x Nvidia Tesla K20m      2x Nvidia Tesla C2075

Type                   Linux Kernel              Linux Distro
Native Host            2.6.32-279                CentOS 6.4
Xen Dom0 (Xen 4.2.2)   3.4.53-8                  CentOS 6.4
DomU Guest VM          2.6.32-279                CentOS 6.4

SHOC Benchmark Suite

• Developed by the Future Technologies Group @ Oak Ridge National Laboratory
• Provides 70 benchmarks
  – Synthetic micro-benchmarks
  – 3rd-party applications
  – OpenCL and CUDA implementations
• Represents a well-rounded view of GPU performance (invocation sketched below)
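For reference only (not part of the slides), SHOC is typically built with its autoconf setup and then driven by its bundled Perl script; flag names can differ between SHOC versions, so treat this invocation as a sketch:

  # Build SHOC, then run the CUDA benchmark set at the largest
  # problem size class (-s 4) from inside the guest VM.
  ./configure
  make
  perl tools/driver.pl -cuda -s 4

Running the same suite on the native host and inside the Xen domU gives the overhead comparison discussed in the following slides.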


Initial Thoughts

• Raw GPU computational performance in VMs is impacted by less than 1% compared to the base system
  – An excellent sign for supporting GPUs in the Cloud

• However, overhead occurs during large transfers between CPU & GPU
  – Much higher overhead for the Westmere/Fermi test architecture
  – Around 15% overhead in the worst-case benchmark
  – Sandy Bridge/Kepler overhead is lower


Discussion

• GPU passthrough is possible in Xen!
  – Results show high-performance GPU computation is a reality with Xen

• Overhead is minimal for GPU computation
  – Sandy Bridge/Kepler has < 1.2% overall overhead
  – Westmere/Fermi has < 1% computational overhead, 7-25% PCIe overhead

• PCIe overhead is not likely due to VT-d mechanisms
  – More likely the NUMA configuration in the Westmere CPU architecture

• GPU PCI passthrough performs better than other front-end remote API solutions


Future Work

• Support PCI passthrough in a Cloud IaaS framework (a possible configuration is sketched below)
  – OpenStack Nova
  – Work for both GPUs and other PCI devices
  – Show performance better than EC2

• Resolve NUMA issues with the Westmere architecture and Fermi GPUs

• Evaluate GPU possibilities in other hypervisors

• Support large-scale distributed CPU+GPU computation in the Cloud
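As a hedged sketch of that direction (not from the slides, and option names have shifted across OpenStack releases), Nova-era PCI passthrough is configured by whitelisting the device on the compute node, giving it an alias, and requesting it through a flavor; the vendor/product IDs and flavor name here are placeholders:

  # nova.conf on the compute node
  pci_passthrough_whitelist = { "vendor_id": "10de", "product_id": "1028" }
  pci_alias = { "vendor_id": "10de", "product_id": "1028", "name": "gpu" }

  # Request one passthrough GPU via a flavor extra spec
  nova flavor-key m1.gpu set "pci_passthrough:alias"="gpu:1"

Instances booted from that flavor are then scheduled only onto hosts that can supply the whitelisted device.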

Conclusion

• GPUs are here to stay in scientific computing
  – Many petascale systems use GPUs
  – A GPU-based exascale machine is expected (2020-ish)

• Providing HPC in the Cloud is key to the viability of scientific cloud computing

• OpenStack provides an ideal architecture to enable HPC in clouds


Thanks!

Acknowledgements:
• NSF FutureGrid project
  – GPU cluster hardware
  – FutureGrid team @ IU

• USC/ISI APEX research group

• Persistent Systems Graduate Fellowship

• Xen open source community

About Me: Andrew J. Younge

Ph.D. Candidate, Indiana University, Bloomington, IN USA
Email – ajyounge@indiana.edu
Website – http://ajyounge.com
http://portal.futuregrid.org


Extra Slides


FutureGrid: a Distributed Testbed

(Testbed map; legend: Private / Public FG Network, NID = Network Impairment Device)

OpenStack GPU Cloud Prototype (diagram)

(Benchmark charts with annotated overheads of roughly 1.25%, 0.64%, and 3.62%)

Overhead in Bandwidth (chart)
