TRANSCRIPT
Slide 1
© 2009 VMware Inc. All rights reserved
HPC Performance in the Cloud:
Status and Future Prospects
ISC Cloud 2012
Josh Simons, Office of the CTO, VMware
Slide 2
Cloud
“Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
National Institute of Standards and Technology, U.S. Department of Commerce
Slide 3
VMware vCloud API
[Architecture diagram: research groups 1..m (users) and IT reach a secure private cloud through user portals and programmatic control and integrations built on the VMware vCloud API; VMware vCloud Director, with catalogs and VMware vShield security, manages research clusters 1..n, each running VMware vCenter Server over VMware vSphere hosts, with connections out to public clouds.]
Slide 4
Run Any Software Stacks
[Diagram: several hosts, each with a virtualization layer on its hardware; VMs run different software stacks side by side (application a on operating system a, application b on operating system b).]
Slide 5
Separate workloads
[Diagram: each application/operating-system stack runs in its own VM on a virtualization layer over hardware.]
Secure multi-tenancy
Fault isolation
Slide 6
Separate workloads
[Same diagram as Slide 5.]
Secure multi-tenancy
Fault isolation
…and sometimes Performance
Slide 7
Use Resources More Efficiently
[Diagram: multiple VMs (application a on operating system a, application b on operating system b, application c on operating systems a and b) consolidated onto a smaller set of virtualized hosts.]
Slide 8
Use Resources More Efficiently
[Same diagram as Slide 7, animation step.]
Slide 9
Protect Applications from Hardware Failures
Reactive Fault Tolerance: “Fail and Recover”
[Diagram: a VM (application a on an operating system) running on one of several virtualized hosts.]
Slide 10
Protect Applications from Hardware Failures
Reactive Fault Tolerance: “Fail and Recover”
[Same diagram, animation step: one host’s virtualization layer and hardware have failed, and the VM restarts on another host.]
Slide 11
Protect Applications from Hardware Failures
Proactive Fault Tolerance: “Move and Continue”
[Diagram: MPI ranks 0, 1, and 2, each in its own VM with its own operating system, spread across three virtualized hosts.]
Slide 12
Protect Applications from Hardware Failures
Proactive Fault Tolerance: “Move and Continue”
[Same diagram, animation step: a VM is live-migrated off a host whose hardware shows signs of failing, and the MPI job continues.]
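The “move and continue” idea can be sketched in a few lines. This is a toy model, not the vSphere API: the host-health flags and the in-place list moves below stand in for real failure prediction and live migration (vMotion).

```python
# Toy sketch of proactive fault tolerance ("move and continue"):
# a monitor evacuates VMs from a host whose hardware is predicted
# to fail, so the MPI ranks inside them keep running elsewhere.

def select_target(hosts, failing):
    """Pick the healthy host with the fewest VMs as the migration target."""
    healthy = [h for h in hosts if h is not failing and h["healthy"]]
    return min(healthy, key=lambda h: len(h["vms"]))

def proactive_migrate(hosts):
    """Move every VM off any host reporting degraded hardware health."""
    moves = []
    for host in hosts:
        if not host["healthy"]:
            for vm in list(host["vms"]):
                target = select_target(hosts, host)
                host["vms"].remove(vm)       # stand-in for live migration
                target["vms"].append(vm)
                moves.append((vm, host["name"], target["name"]))
    return moves

hosts = [
    {"name": "esx1", "healthy": True,  "vms": ["mpi-rank-0"]},
    {"name": "esx2", "healthy": False, "vms": ["mpi-rank-1"]},  # failure predicted
    {"name": "esx3", "healthy": True,  "vms": ["mpi-rank-2"]},
]
print(proactive_migrate(hosts))  # → [('mpi-rank-1', 'esx2', 'esx1')]
```

Because the migration is transparent to the guest, the MPI job never sees a rank disappear; contrast this with the reactive “fail and recover” model, where the application must restart from a checkpoint.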
Slide 13
Virtualization Performance
Slide 14
When a Picture Is NOT Worth a Thousand Words
[Diagram: the application / operating system / virtualization layer / hardware stack.]
Slide 15
Linear Algebra
L. Youseff, K. Seymour, H. You, J. Dongarra, and R. Wolski. The impact of paravirtualized memory
hierarchy on linear algebra computational kernels and software. In HPDC, pages 141-152. ACM, 2008.
Slide 16
Biosequence Analysis: BLAST
C. Macdonell and P. Lu, "Pragmatics of Virtual Machines for High-Performance Computing: A Quantitative Study of Basic Overheads," in Proc. of the High Perf. Computing & Simulation Conf., 2007.
Slide 17
Biosequence Analysis: HMMer
C. Macdonell and P. Lu, "Pragmatics of Virtual Machines for High-Performance Computing: A Quantitative Study of Basic Overheads," in Proc. of the High Perf. Computing & Simulation Conf., 2007.
Slide 18
Molecular Dynamics: GROMACS
C. Macdonell and P. Lu, "Pragmatics of Virtual Machines for High-Performance Computing: A Quantitative Study of Basic Overheads," in Proc. of the High Perf. Computing & Simulation Conf., 2007.
Slide 19
Other Single-process Workloads
Also generally within 1–5% of native performance
• Electronic Design Automation (EDA)
• Digital Content Creation (DCC)
GPGPU
• About 98% of native performance, based on simple CUDA testing
Slide 20
vHadoop Benchmarking Study
Seven-node Hadoop cluster (AMAX ClusterMax)
Standard tests: PI, DFSIO, Teragen / Terasort
Configurations: native, one VM per host, two VMs per host
Details:
• Two-socket Intel Xeon X5650, 96 GB RAM, Mellanox 10 GbE, 12x 7200 RPM SATA disks
• RHEL 6.1, 6- or 12-vCPU VMs, vmxnet3 virtual NIC
• Cloudera CDH3U0, replication=2, max 40 map and 10 reduce tasks per host
• Each physical host treated as a “rack” in Hadoop’s topology description
• ESXi 5.0 with a development Mellanox driver; disks passed to VMs via raw device mappings (RDMs)
A Benchmarking Case Study of Virtualized Hadoop Performance on
VMware vSphere 5, J. Buell
http://www.vmware.com/files/pdf/VMW-Hadoop-Performance-vSphere5.pdf
Slide 21
Ratio to Native, Lower is Better
[Bar chart: Hadoop benchmark run times as a ratio to native (y-axis 0 to 1.2) for the 1-VM and 2-VM per-host configurations.]
Slide 22
Intel 2009 Experiments
Hardware
• Eight two-socket 2.93 GHz Xeon X5570 (Nehalem-EP) nodes, 24 GB RAM each
• Dual-ported Mellanox DDR InfiniBand adapter
• Mellanox 36-port switch
Software
• vSphere 4.0 (current version is 5.1)
• Platform Open Cluster Stack (OCS) 5 (native and guest)
• Intel compilers 11.1
• HPCC 1.3.1
• STAR-CD V4.10.008_x86
Slide 23
HPCC Virtual to Native Run-time Ratios (Lower is Better)
[Bar chart: per-benchmark ratios (y-axis 0 to 2.5) at three cluster sizes: 2n16p, 4n32p, 8n64p.]
Data courtesy of Marco Righini, Intel Italy
Slide 24
Point-to-point Message Size Distribution: STAR-CD
Source: http://www.hpcadvisorycouncil.com/pdf/CD_adapco_applications.pdf
Slide 25
Collective Message Size Distribution: STAR-CD
Source: http://www.hpcadvisorycouncil.com/pdf/CD_adapco_applications.pdf
Slide 26
STAR-CD Virtual to Native Run-time Ratios (Lower is Better)
STAR-CD A-Class model (on 8n32p):

  Physical           1.00
  ESX4 (1 socket)    1.19
  ESX4 (2 socket)    1.15

Data courtesy of Marco Righini, Intel Italy
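The run-time ratios translate directly into slowdown percentages, which is where the 15% figure for an industrial-scale CFD code in the summary comes from. A quick sanity check:

```python
# Convert virtual-to-native run-time ratios into percent slowdown.
# Values taken from the STAR-CD A-Class results above.
ratios = {"Physical": 1.00, "ESX4 (1 socket)": 1.19, "ESX4 (2 socket)": 1.15}

for config, ratio in ratios.items():
    slowdown = (ratio - 1.0) * 100
    print(f"{config}: {slowdown:.0f}% slower than native")
# → Physical: 0%, ESX4 (1 socket): 19%, ESX4 (2 socket): 15%
```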
Slide 27
InfiniBand Bandwidth with Passthrough (VM DirectPath I/O)
[Line chart: bandwidth (MB/s, 0 to 3500) versus message size (2 bytes to 8 MB) for four series: Send (Native), Send (ESXi), RDMA Read (Native), RDMA Read (ESXi).]
Slide 28
Latency with VM DirectPath I/O (RDMA Read, Polling)
[Line chart: half round-trip latency (µs, log scale from 1 to 4096) versus message size (2 bytes to 8 MB), Native vs. ESXi ExpA.]

MsgSize (bytes)   Native   ESXi ExpA
2                 2.28     2.98
4                 2.28     2.98
8                 2.28     2.98
16                2.27     2.96
32                2.28     2.98
64                2.28     2.97
128               2.32     3.02
256               2.5      3.19
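One way to read the table: subtract the native latency from the ESXi latency at each message size. A short script over the values above (copied verbatim) shows the hypervisor adds a roughly constant amount rather than a constant percentage:

```python
# Absolute and relative latency overhead of ESXi vs. native,
# using the RDMA Read (polling) numbers from the table above.
sizes  = [2, 4, 8, 16, 32, 64, 128, 256]
native = [2.28, 2.28, 2.28, 2.27, 2.28, 2.28, 2.32, 2.5]
esxi   = [2.98, 2.98, 2.98, 2.96, 2.98, 2.97, 3.02, 3.19]

for size, n, e in zip(sizes, native, esxi):
    print(f"{size:>4} bytes: +{e - n:.2f} us ({(e / n - 1) * 100:.0f}% overhead)")
# The added latency stays near 0.7 us across all of these message sizes.
```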
Slide 29
Latency with VM DirectPath I/O (Send/Receive, Polling)
[Line chart: half round-trip latency (µs, log scale from 1 to 4096) versus message size (2 bytes to 8 MB), Native vs. ESXi ExpA.]

MsgSize (bytes)   Native   ESXi ExpA
2                 1.35     1.75
4                 1.35     1.75
8                 1.38     1.78
16                1.37     2.05
32                1.38     2.35
64                1.39     2.9
128               1.5      4.13
256               2.3      2.31
Slide 30
Summary and Conclusions
Virtualization can enable new capabilities not available in traditional bare-metal HPC environments.
Throughput-oriented applications generally run very well in virtual environments, with typical slowdowns of 1–5% depending on I/O characteristics.
Latency-sensitive MPI applications will generally experience noticeable slowdowns, dependent on the application's messaging profile; a 15% slowdown was seen for an industrial-scale CFD code in 2009.
Further latency reductions will improve MPI performance over time.
Both the value of the benefits and the performance cost will vary by organization.
Today’s cloud is not tomorrow’s cloud – we are at the very beginning.