(bdt202) hpc now means 'high personal computing' | aws re:invent 2014

November 13, 2014 | Las Vegas, NV

Sérgio Mafra – ONS (Operador Nacional do Sistema Elétrico)

Ricardo Geh – AWS Enterprise Solutions Architect

Shifting the Paradigm

Flexibility: How HPC can be used as a utility


Pay As You Go Model

Use only what you need

Multiple pricing models

On-Premises

Capital Expense Model

High upfront capital cost

High cost of ongoing support

HPC as utility

[Figure: Elastic Cloud-Based Resources (Actual Demand, Resources scaled to demand) compared with Rigid On-Premises Resources (Predicted Demand, Actual Demand; the gaps are labeled Waste and Customer Dissatisfaction)]

Scalability on AWS

Scale using Elastic Capacity

[Chart labels: <10 cores, >600 cores, >1500 cores]

Making Production Cloud HPC easy from 64 cores to …

Pharma: Johnson & Johnson
Manufacturing: HGST, a Western Digital Company
Financial Services: Pacific Life Insurance
Genomics: Life Technologies
Research: The Aerospace Corporation

… 156,314 cores for better solar panel materials for $33k, not $68M

Amazon EC2: 16,788 Spot Instances
Amazon S3: 4TB Processed
Spot Instances on all 8 Regions
1.21 PetaFLOPS
Intel SandyBridge on CC2



Cost optimization: It’s about new cost models and new ways to enable your business to do more.


On-Demand
• Pay for compute capacity by the hour with no long-term commitments
• For spiky workloads, or to define needs

Reserved
• Make a low, one-time payment and receive a significant discount on the hourly charge
• For committed utilization

Spot
• Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand
• For time-insensitive or transient workloads
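The trade-off between On-Demand and Reserved pricing can be sketched as simple arithmetic. This is a minimal illustration, not AWS pricing: the function names and all prices below are hypothetical and only show how a one-time payment plus a discounted hourly charge compares with a plain hourly rate.

```python
def on_demand_cost(hours, hourly_rate):
    """Cost of running one instance On-Demand for `hours` hours."""
    return hours * hourly_rate


def reserved_cost(hours, upfront, discounted_hourly_rate):
    """Cost of one Reserved instance: one-time payment plus discounted hourly charge."""
    return upfront + hours * discounted_hourly_rate


def break_even_hours(hourly_rate, upfront, discounted_hourly_rate):
    """Hours of use beyond which Reserved becomes cheaper than On-Demand."""
    saving_per_hour = hourly_rate - discounted_hourly_rate
    if saving_per_hour <= 0:
        raise ValueError("reserved hourly rate must be lower than the on-demand rate")
    return upfront / saving_per_hour


# Hypothetical prices, for illustration only (not AWS list prices).
ON_DEMAND_RATE = 2.00      # USD per hour
RESERVED_UPFRONT = 4000.0  # USD, one-time payment
RESERVED_RATE = 1.00       # USD per hour after the discount

if __name__ == "__main__":
    hours = break_even_hours(ON_DEMAND_RATE, RESERVED_UPFRONT, RESERVED_RATE)
    print("Reserved pays off after", hours, "hours of use")
```

With these made-up numbers, an instance used less than the break-even hours is cheaper On-Demand, which matches the slide's guidance of On-Demand for spiky workloads and Reserved for committed utilization.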

[Chart: Percentage of Peak Requirements Over Time (0% to 100%). Layers: Heavy Utilization Reserved Instances, Light RI, On-Demand, Spot and On-Demand]
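The layering idea in this chart (a reserved baseline, with On-Demand or Spot capacity covering the peaks) can be sketched as follows. The function name and the demand series are hypothetical; the sketch only splits an hourly demand curve into the part served by a fixed reserved baseline and the burst above it.

```python
def split_demand(demand, reserved_count):
    """Split hourly demand into reserved usage and burst usage.

    demand: list with the number of instances needed in each hour.
    reserved_count: number of instances kept as a reserved baseline.
    Returns (reserved_hours_used, burst_hours) in instance-hours.
    """
    # Reserved capacity serves demand up to the baseline in every hour.
    reserved_hours_used = sum(min(d, reserved_count) for d in demand)
    # Anything above the baseline would go to On-Demand or Spot capacity.
    burst_hours = sum(max(d - reserved_count, 0) for d in demand)
    return reserved_hours_used, burst_hours


if __name__ == "__main__":
    hourly_demand = [2, 2, 10, 4, 2]  # hypothetical instance counts per hour
    print(split_demand(hourly_demand, reserved_count=2))
```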

[Chart: instance families plotted by Memory (GB), 1 to 256, against EC2 Compute Units (HP), 1 to 128. Families: Micro, Standard, High CPU, High Memory, Cluster Compute & High I/O, Cluster High Memory & High Storage]




Performance and power: From embarrassingly parallel to tightly coupled


CLI, API, and console

Scripted configurations

Automation & control

Automatic re-sizing of compute clusters based upon demand and policies
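A re-sizing policy of this kind can be sketched as a small function. This is an assumed, simplified policy (queue length divided by jobs per node, clamped between a minimum and maximum size); the names and limits are hypothetical and do not describe how any AWS tool computes cluster size.

```python
import math


def desired_nodes(queued_jobs, jobs_per_node, min_nodes, max_nodes):
    """Return the cluster size a simple demand-based policy would ask for."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    # Demand: how many nodes are needed to run every queued job at once.
    needed = math.ceil(queued_jobs / jobs_per_node)
    # Policy: never shrink below min_nodes or grow beyond max_nodes.
    return max(min_nodes, min(needed, max_nodes))


if __name__ == "__main__":
    for queue_length in (0, 10, 1000):
        print(queue_length, "->", desired_nodes(queue_length, 4, 1, 16))
```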

cfncluster (“CloudFormation cluster”)

Command Line Interface Tool

Deploy and demo an HPC cluster

For more info: aws.amazon.com/hpc/resources

Try our HPC AWS CloudFormation-based demo

Cluster compute instances

Implement HVM process execution

Intel® Xeon® processors

10 Gigabit Ethernet – Enhanced networking, SR-IOV

Performance for tightly-coupled workloads

c3.8xlarge: 32 vCPUs, 2.8 GHz Intel Xeon E5-2680v2 Ivy Bridge, 60 GB RAM, 2 x 320 GB Local SSD
r3.8xlarge: 32 vCPUs, 2.5 GHz Intel Xeon, 244 GB RAM, 2 x 320 GB Local SSD
i2.8xlarge: 32 vCPUs, 2.5 GHz Intel Xeon, 244 GB RAM, 8 x 800 GB Local SSD
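As a small illustration of choosing among these three instance types, the sketch below filters them by memory and local SSD capacity. The figures are the ones listed on the slide; the `InstanceSpec` type and the `candidates` helper are hypothetical names introduced here.

```python
from collections import namedtuple

# Specs as listed on the slide.
InstanceSpec = namedtuple("InstanceSpec", "name vcpus ram_gb ssd_count ssd_gb")

SPECS = [
    InstanceSpec("c3.8xlarge", 32, 60, 2, 320),
    InstanceSpec("r3.8xlarge", 32, 244, 2, 320),
    InstanceSpec("i2.8xlarge", 32, 244, 8, 800),
]


def candidates(min_ram_gb=0, min_local_ssd_gb=0):
    """Names of instance types meeting the RAM and total local SSD requirements."""
    return [
        spec.name
        for spec in SPECS
        if spec.ram_gb >= min_ram_gb
        and spec.ssd_count * spec.ssd_gb >= min_local_ssd_gb
    ]


if __name__ == "__main__":
    print(candidates(min_ram_gb=100))
    print(candidates(min_local_ssd_gb=1000))
```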

Network placement groups

Cluster instances deployed in a Placement Group enjoy low latency, full bisection 10 Gbps bandwidth
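Launching instances into a placement group comes down to two requests: create the group, then reference it at launch. The sketch below only builds the request parameters as plain dictionaries and makes no AWS call. The keys mirror the keyword arguments of the boto3 EC2 client methods `create_placement_group` and `run_instances`; using boto3 is an assumption, since the talk does not name an SDK, and the group name and image ID are placeholders.

```python
def placement_group_request(group_name):
    """Parameters for creating a cluster placement group."""
    return {"GroupName": group_name, "Strategy": "cluster"}


def run_instances_request(image_id, instance_type, count, group_name):
    """Parameters for launching `count` instances inside the placement group."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        # Placing all nodes in one group is what gives the low-latency network.
        "Placement": {"GroupName": group_name},
    }


if __name__ == "__main__":
    # Placeholder values; a real launch needs a valid AMI ID and credentials.
    print(placement_group_request("hpc-demo-group"))
    print(run_instances_request("ami-00000000", "c3.8xlarge", 4, "hpc-demo-group"))
```

With boto3 installed and credentials configured, these dictionaries could be passed as `ec2.create_placement_group(**request)` and `ec2.run_instances(**request)`.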


GPU compute instances

cg1.8xlarge: 33.5 EC2 Compute Units, 20 GB RAM, 2 x NVIDIA GPU (448 cores, 3 GB memory)
g2.2xlarge: 26 EC2 Compute Units, 16 GB RAM, 1 x NVIDIA GPU (1536 cores, 4 GB memory)

G2 instances
• Intel® Xeon® E5-2670
• 1 NVIDIA Kepler GK104 GPU
• I/O Performance: Very High

CG1 instances
• Intel® Xeon® X5570 processors
• 2 x NVIDIA Tesla “Fermi” M2050 GPUs
• I/O Performance: Very High




Achieve more: Perform bigger, more complex jobs in a much reduced time

Performance and power: Flexibility to choose platforms


Customers are using AWS for more and more HPC workloads

Oil and Gas
• Seismic Data Processing
• Reservoir Simulations, Modeling
• Geospatial applications
• Predictive Maintenance

Manufacturing & Engineering
• Computational Fluid Dynamics (CFD)
• Finite Element Analysis (FEA)
• Wind Simulation

Life Sciences
• Genome Analysis
• Molecular Modeling
• Protein Docking

Media & Entertainment
• Transcoding and Encoding
• DRM, Encryption
• Rendering

Energy & Scientific Computing
• Computational Chemistry
• High Energy Physics
• Stochastic Modeling
• Quantum Analysis
• Energy Models
• Climate Models

Financial
• Monte Carlo Simulations
• Wealth Management Simulations
• Portfolio, Credit Risk Analytics
• High Frequency Trading Analytics

ONS’s Journey to the Cloud

Who is ONS?

[Map labels]
North: Isolated
Brasilia: Main CC
Recife: N/NE Branch
Rio de Janeiro: Southeast Branch
Florianopolis: South Branch

The Challenge

Math Models

Medium Term (NEWAVE)
• Horizon: 5 years
• Stage: month
• More uncertainty and fewer details

Short Term (DECOMP)
• Horizon: 1 to 6 months
• Stage: week
• Less uncertainty and more details

Updating of operating conditions

[Figure: Decision tree]
• Use Hydro: OK, or Energy Deficit (load shedding)
• Use Thermal to supplement Hydro: Spillage (waste), or OK

[Chart: Immediate Cost, Future Cost, Total Cost]
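The trade-off behind this decision can be sketched numerically, assuming the total cost is the sum of the immediate cost and the future cost, as the chart labels suggest. The cost curves below are hypothetical stand-ins, not the NEWAVE or DECOMP models: using more hydro now lowers the immediate (thermal fuel) cost but raises the future cost because less water is left in storage.

```python
def total_cost(hydro_use, immediate_cost, future_cost):
    """Total cost of a decision, assumed to be immediate cost plus future cost."""
    return immediate_cost(hydro_use) + future_cost(hydro_use)


def best_decision(options, immediate_cost, future_cost):
    """Pick the hydro usage level with the lowest total cost."""
    return min(options, key=lambda use: total_cost(use, immediate_cost, future_cost))


def example_immediate_cost(hydro_use):
    # Hypothetical curve: less hydro now means more thermal fuel burned now.
    return (10 - hydro_use) ** 2


def example_future_cost(hydro_use):
    # Hypothetical curve: more hydro now leaves less stored water for later.
    return hydro_use ** 2


if __name__ == "__main__":
    levels = range(11)  # hypothetical hydro usage levels 0..10
    choice = best_decision(levels, example_immediate_cost, example_future_cost)
    print(choice, total_cost(choice, example_immediate_cost, example_future_cost))
```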

Decomp

“We need more power”

Newave

Weather Forecast

Parallel Processing

“Sorry, we don’t have any power left”

“My job first… pleease!”

1. Elastic Environment
• Unlimited processing power
• Ideal for unexpected load

2. Low data transfer
• Input and output of small files
• Ideal for internet connection

3. Variable and right cost
• Pay per use
• Don’t need to buy huge servers

http://star.mit.edu/cluster/


[Diagram: Config; Cluster 1, Cluster 2, Cluster 3; Work A, Work B, Work C]
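One way to read this diagram is that a single configuration file describes several clusters, one per piece of work. The sketch below generates StarCluster-style `[cluster ...]` sections with Python's `configparser`. The option names follow the StarCluster configuration format; the cluster names, key name, image ID, and sizes are placeholders, and nothing here is taken from the actual ONS configuration.

```python
import configparser
import io


def cluster_config(clusters):
    """Render StarCluster-style [cluster NAME] sections as config text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names upper-case as written
    for name, settings in clusters.items():
        parser["cluster " + name] = {
            "KEYNAME": settings["keyname"],
            "CLUSTER_SIZE": str(settings["size"]),
            "NODE_IMAGE_ID": settings["image_id"],
            "NODE_INSTANCE_TYPE": settings["instance_type"],
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    base = {"keyname": "mykey", "image_id": "ami-00000000", "instance_type": "c3.8xlarge"}
    print(cluster_config({
        "cluster1": dict(base, size=4),
        "cluster2": dict(base, size=8),
        "cluster3": dict(base, size=2),
    }))
```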


The Journey


Simply put... SHOW me the results...

Engineers would get lost with the AWS Management Console

They needed an easy, task-oriented portal

1. Self Service
2. Usage Control
3. Accountability

[Portal columns: Cluster Name, # Nodes, On/Off]

[Architecture diagram labels]
• Services: Amazon EC2, Amazon EBS, MIT StarCluster
• Virtual Private Cloud: Private Subnet, Public Subnet, Internet Gateway
• HPC Cluster Controller
• HPC Cluster: Master Node (c3.8xlarge); Computing Nodes Node1, Node2, Node3, ... Node N (c3.8xlarge, 2 reserved)
• Work (EBS): 1 Tb
• Network: 1 Gbps, 10 Gbps
• Access: Chrome, Internet/AWS, VPN site-site
• Flows: Data, Control



Lessons Learned

Infrastructure as code

HPC gets personal

http://bit.ly/awsevals
