Amazon Web Services: Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud Summarized by: Michael Riera 9/17/2011 University of Central Florida – CDA5532



Agenda

• Purpose

• Benchmarks used

• Machine Setups (including EC2)

• Experiment Setup

• Results

• Conclusions

Introduction

• The purpose of this paper is to compare Amazon EC2 service performance against industry-standard benchmarks for High Performance Computing data centers.

• This paper draws comparisons between known supercomputers, an HP data center, and AWS EC2

Benchmarks

• NERSC Framework

– Workload includes:

• Areas of climate

• Materials science

• Fusion

• Accelerator modeling

• Astrophysics

• Quantum Chromodynamics

• Integrated Performance Monitoring

– Used to quantify computation and communication through MPI interfaces.

Machine Setup

• Carver

– National Energy Research Scientific Computing Center at Lawrence Berkeley National Labs.

– 400 nodes

• Quad-core Intel Nehalem 2.67 GHz

• Dual socket nodes and a single Quad Data Rate (QDR)

• Each node has 24 GB of RAM (3 GB per core)

Machine Setup

• Franklin

– National Energy Research Scientific Computing (NERSC) Center at Lawrence Berkeley National Labs.

– 9660 nodes

• Cray XT4 supercomputers

• Single quad-core 2.3 GHz AMD Opteron “Budapest” processor

• 6.4 Gb interconnects (node interconnect)

• Each node has 8 GB of RAM (2 GB per core)

Machine Setup

• Lawrencium

– Information Technology Division at Berkeley

– 198 nodes (1584 cores)

• Dell PowerEdge 1950 server

• Two Intel Xeon quad-core 64-bit, 2.66 GHz Harpertown processors

• DDR InfiniBand network

• Each node has 16 GB of RAM (2 GB per core)

Machine Setup

• Amazon EC2

– Virtual configuration

• CPU capacity is defined in terms of an abstract Amazon EC2 Compute Unit (CU).

• One EC2 CU is approximately equivalent to a 1.0 – 1.2 GHz processor

• The large instance has:

– 4 EC2 Compute Units

– 2 Virtual Cores

– 7.5 GB of memory

– Interconnect: Gigabit ethernet
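The per-core memory figures quoted for each machine can be cross-checked with a little arithmetic. The Python sketch below is only an illustration: the core counts for Carver and Lawrencium are derived from the socket and core descriptions on the slides, counting an EC2 virtual core as a "core" is an assumption, and multiplying Compute Units by the 1.0 – 1.2 GHz range gives only a rough aggregate figure, not a measured clock speed. The names `NODES`, `mem_per_core`, and `ecu_ghz_range` are hypothetical.

```python
# Node figures as quoted on the slides; the dictionary layout is illustrative.
NODES = {
    "Carver": {"mem_gb": 24.0, "cores": 8},      # dual socket x quad-core
    "Franklin": {"mem_gb": 8.0, "cores": 4},     # single quad-core
    "Lawrencium": {"mem_gb": 16.0, "cores": 8},  # two quad-core Xeons
    "EC2 large": {"mem_gb": 7.5, "cores": 2},    # assumption: virtual cores counted as cores
}


def mem_per_core(node):
    """Return memory per core in GB for one node description."""
    return node["mem_gb"] / node["cores"]


def ecu_ghz_range(compute_units, low_ghz=1.0, high_ghz=1.2):
    """Rough aggregate GHz-equivalent range for a number of EC2 Compute Units."""
    return compute_units * low_ghz, compute_units * high_ghz


if __name__ == "__main__":
    for name, node in NODES.items():
        print(f"{name}: {mem_per_core(node):.2f} GB per core")
    low, high = ecu_ghz_range(4)
    print(f"EC2 large: roughly {low:.1f} to {high:.1f} GHz-equivalent in total")
```

Under these assumptions the EC2 large instance has more memory per virtual core than Franklin or Lawrencium, so the differences reported later are more likely tied to the processors and the Gigabit Ethernet interconnect than to memory capacity.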

Machine Setup

• Underlying hardware identified from /proc/cpuinfo

• Different combinations (no control over assignment)

– Intel Xeon E5430 2.66 GHz quad-core processor

– AMD Opteron 270 2.0 GHz dual-core

– AMD Opteron 2218 HE 2.6 GHz dual-core
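Since the instance type does not determine the physical processor, the hardware has to be discovered from inside the running instance. The following is a minimal Python sketch of one way to do that, assuming a Linux guest whose /proc/cpuinfo exposes a `model name` field (as x86 Linux kernels do). The function names `cpu_models` and `read_cpu_models` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cpu_models(cpuinfo_text):
    """Count logical processors per 'model name' in /proc/cpuinfo-style text."""
    models = Counter()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "model name : <description>" are of interest.
        if sep and key.strip() == "model name":
            models[value.strip()] += 1
    return models


def read_cpu_models(path="/proc/cpuinfo"):
    """Read the given file (Linux only) and summarize the processor models."""
    with open(path) as handle:
        return cpu_models(handle.read())


if __name__ == "__main__":
    for model, count in read_cpu_models().items():
        print(f"{count} x {model}")
```

Running this on each instance of a virtual cluster and comparing the results would show whether the cluster is homogeneous or a mix of processor types.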

Experiment Setup

• CAM

– The Community Atmosphere Model (CAM) is the atmospheric component of the Community Climate System Model (CCSM)

• GAMESS

– Uses sockets communication

– Considered stride-1 memory access, which stresses memory bandwidth and interconnect collective performance

Experiment Setup

• GTC

– Fully self-consistent, gyrokinetic 3-D particle-in-cell (PIC) code with a non-spectral Poisson solver

• IMPACT-T

– Integrated Map and Particle Accelerator Tracking Time

– Uses Hockney's FFT

• MAESTRO

– Used to simulate astrophysical flows such as those leading up to ignition in Type Ia supernovae

• MILC

– Represents lattice computation that is used to study Quantum Chromodynamics.

• Paratec

– Performs Density Functional Theory quantum-mechanical total energy calculations using pseudo-potentials

Results

In GAMESS, Franklin, Lawrencium, and EC2 are 1.4x, 2.6x, and 2.7x slower than Carver, respectively.

In the worst case, PARATEC, EC2 is more than 50x slower than Carver. PARATEC performs a 3-D FFT, and EC2 ran it 52x slower than Carver.
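The "Nx slower" figures above are runtimes normalized to Carver. A minimal Python sketch of that normalization follows. The runtimes in the usage example are hypothetical placeholders chosen only to show the arithmetic; they are not measurements from the paper, and the function name `slowdown` is an assumption.

```python
def slowdown(runtime_s, baseline_s):
    """Return how many times slower a run is than the baseline run."""
    if baseline_s <= 0:
        raise ValueError("baseline runtime must be positive")
    return runtime_s / baseline_s


if __name__ == "__main__":
    # Hypothetical runtimes in seconds, for illustration only.
    baseline = 10.0
    candidate = 27.0
    print(f"{slowdown(candidate, baseline):.1f}x slower than the baseline")
```

A value of 1.0 means parity with the baseline machine, and values above 1.0 mean the machine is slower.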

Results: AWS Cloud HW Variance

Conclusions

• Cannot control type of hardware in the cloud

• Near-supercomputer speeds available to every household