CS 744: DATACENTER AS A COMPUTER Shivaram Venkataraman Fall 2019 With slides from Mosharaf Chowdhury and Ion Stoica

Page 1: CS 744: DATACENTER AS A COMPUTER

CS 744: DATACENTER AS A COMPUTER

Shivaram Venkataraman Fall 2019

With slides from Mosharaf Chowdhury and Ion Stoica

Page 2: CS 744: DATACENTER AS A COMPUTER

ANNOUNCEMENTS

-  Assignments
   -  Assignment zero is due!
   -  Form groups for Assignment 1 on Piazza

-  Class format
   -  Lecture
   -  Review
   -  Discussion

Page 3: CS 744: DATACENTER AS A COMPUTER

Scalable Storage Systems

Datacenter Architecture

Resource Management

Computational Engines

Machine Learning, SQL, Streaming, Graph

Applications

Page 4: CS 744: DATACENTER AS A COMPUTER

OUTLINE

-  Hardware Trends
-  Datacenter design
-  WSC workloads
-  Discussion

Page 5: CS 744: DATACENTER AS A COMPUTER

Why is One Machine Not Enough?

Page 6: CS 744: DATACENTER AS A COMPUTER

What’s in a Machine?

Interconnected compute and storage

Newer Hardware
-  GPUs, FPGAs
-  RDMA, NVLink

[Diagram labels: Memory Bus, Ethernet, SATA, PCIe v4]

Page 7: CS 744: DATACENTER AS A COMPUTER

Scale Up: Make More Powerful Machines

Moore’s law
–  Stated 52 years ago by Intel co-founder Gordon Moore
–  Number of transistors on a microchip doubles every 2 years
–  Today “closer to 2.5 years” (Intel CEO Brian Krzanich)
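The gap between a 2-year and a 2.5-year doubling period compounds quickly. A minimal sketch of that arithmetic follows; the function name and the 10-year horizon are illustrative assumptions, not from the slides.

```python
def transistor_growth(years: float, doubling_period_years: float) -> float:
    """Multiplicative growth in transistor count after `years`,
    assuming the count doubles once every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Compare the classic 2-year period with the slower 2.5-year period
# over an (assumed) 10-year horizon.
for period in (2.0, 2.5):
    growth = transistor_growth(10, period)
    print(f"doubling every {period} years: {growth:.0f}x after 10 years")
```

Under these assumptions the slower cadence yields half as many transistors after a decade (16x rather than 32x).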

Page 8: CS 744: DATACENTER AS A COMPUTER

Dennard Scaling is the Problem

Suggested that the power requirement of a transistor is proportional to its area
–  Both voltage and current being proportional to length
–  Stated in 1974 by Robert H. Dennard (DRAM inventor)

Broken since 2005

“Adapting to Thrive in a New Economy of Memory Abundance,” Bresniker et al.
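The scaling argument can be written out in one line. The symbols are assumed notation that does not appear on the slide: L is the linear transistor dimension, V the voltage, I the current, P the power, and A the transistor area.

```latex
V \propto L, \quad I \propto L
\;\Longrightarrow\;
P = V I \propto L^{2} \propto A
\;\Longrightarrow\;
\frac{P}{A} \approx \text{constant}
```

Read this way, shrinking transistors keeps power density roughly flat only while voltage and current keep shrinking with L. Once that stops holding, packing more transistors into the same area raises power density.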

Page 9: CS 744: DATACENTER AS A COMPUTER

Dennard Scaling is the Problem

Per-core performance is stalled
Number of cores is increasing

“Adapting to Thrive in a New Economy of Memory Abundance,” Bresniker et al

Page 10: CS 744: DATACENTER AS A COMPUTER

MEMORY TRENDS

Page 11: CS 744: DATACENTER AS A COMPUTER

MEMORY TAKEAWAY

Growing +15% per year
Data access from memory is getting more expensive!

Page 12: CS 744: DATACENTER AS A COMPUTER

HDD CAPACITY

Page 13: CS 744: DATACENTER AS A COMPUTER

HDD BANDWIDTH

Disk bandwidth is not growing

Page 14: CS 744: DATACENTER AS A COMPUTER

SSDs

Performance:
–  Reads: 25 us latency
–  Write: 200 us latency
–  Erase: 1.5 ms

Steady state, when SSD full
–  One erase every 64 or 128 reads (depending on page size)

Lifetime: 100,000-1 million writes per page
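To get a feel for the steady-state numbers, here is a rough sketch that spreads the cost of one erase evenly over the 64 or 128 operations between erases. Even amortization is a simplifying assumption, and the constant and function names are illustrative.

```python
# Latencies taken from the slide, in microseconds.
READ_US = 25.0
WRITE_US = 200.0
ERASE_US = 1500.0  # 1.5 ms


def amortized_erase_us(ops_per_erase: int) -> float:
    """Erase cost per operation if one erase is spread evenly
    over `ops_per_erase` operations (a simplifying assumption)."""
    return ERASE_US / ops_per_erase


for ops in (64, 128):
    extra = amortized_erase_us(ops)
    print(f"one erase per {ops} ops: +{extra:.1f} us per op "
          f"(read latency alone is {READ_US:.0f} us)")
```

With one erase per 64 operations, the amortized erase cost (about 23 us) is close to the read latency itself.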

Page 15: CS 744: DATACENTER AS A COMPUTER

SSD VS HDD COST

Page 16: CS 744: DATACENTER AS A COMPUTER

Amazon EC2 (2014)

Machine       Memory (GB)   Compute Units (ECU)   Local Storage (GB)   Cost / hour
t1.micro      0.615         1                     0                    $0.02
m1.xlarge     15            8                     1680                 $0.48
cc2.8xlarge   60.5          88 (Xeon 2670)        3360                 $2.40

1 ECU = CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
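One way to read the 2014 table is to normalize price by compute and by memory. The sketch below copies the three rows from the table; the dictionary layout and function names are illustrative assumptions.

```python
# Rows copied from the EC2 (2014) table:
# memory in GB, compute in ECU, price in USD per hour.
EC2_2014 = {
    "t1.micro":    {"memory_gb": 0.615, "ecu": 1,  "usd_per_hour": 0.02},
    "m1.xlarge":   {"memory_gb": 15.0,  "ecu": 8,  "usd_per_hour": 0.48},
    "cc2.8xlarge": {"memory_gb": 60.5,  "ecu": 88, "usd_per_hour": 2.40},
}


def usd_per_ecu_hour(name: str) -> float:
    """Hourly price divided by the instance's compute units."""
    row = EC2_2014[name]
    return row["usd_per_hour"] / row["ecu"]


def usd_per_gb_hour(name: str) -> float:
    """Hourly price divided by the instance's memory in GB."""
    row = EC2_2014[name]
    return row["usd_per_hour"] / row["memory_gb"]


for name in EC2_2014:
    print(f"{name:12s} ${usd_per_ecu_hour(name):.4f} per ECU-hour, "
          f"${usd_per_gb_hour(name):.4f} per GB-hour")
```

On these numbers the largest instance is cheaper per ECU than m1.xlarge but more expensive per GB of memory, which is one input to the scale-up vs scale-out discussion.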

Page 17: CS 744: DATACENTER AS A COMPUTER

Amazon EC2 (2018)

Machine        Memory (GB)   Compute Units (ECU)        Local Storage (GB)   Cost / hour
t2.nano        0.5           1                          0                    $0.0058
r5d.24xlarge   768           96                         4x900 NVMe           $6.912
x1.32xlarge    2 TB          4 * Xeon E7                3.4 TB (SSD)         $13.338
p3.16xlarge    488 GB        8 Nvidia Tesla V100 GPUs   0                    $24.48

Page 18: CS 744: DATACENTER AS A COMPUTER

Amazon EC2 (2019)

Machine         Memory (GB)   Compute Units (ECU)        Local Storage (GB)   Cost / hour
t2.nano         0.5           1                          0                    $0.0058
r5d.24xlarge    768           96                         4x900 NVMe           $6.912
x1e.32xlarge    4 TB          4 * Xeon E7                3.4 TB (SSD)         $26.68
p3dn.24xlarge   768 GB        8 Nvidia Tesla V100 GPUs   0                    $31.21

Page 19: CS 744: DATACENTER AS A COMPUTER

Ethernet Bandwidth

[Figure: Ethernet bandwidth over time; labeled years: 1995, 1998, 2002, 2017]

Growing 33-40% per year!
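A compound annual growth rate converts directly into a doubling time. The sketch below applies that standard formula to the 33-40% range quoted above; the function name is an illustrative assumption.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a fixed compound annual growth rate.

    `annual_growth` is a fraction, so 0.33 means 33% per year.
    """
    return math.log(2) / math.log(1 + annual_growth)


for rate in (0.33, 0.40):
    years = doubling_time_years(rate)
    print(f"{rate:.0%} per year: doubles in about {years:.1f} years")
```

At these rates Ethernet bandwidth doubles roughly every 2.1 to 2.4 years.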

Page 20: CS 744: DATACENTER AS A COMPUTER
Page 21: CS 744: DATACENTER AS A COMPUTER

TRENDS SUMMARY

CPU speed per core is flat
Memory bandwidth growing slower than capacity
SSD, NVMe replacing HDDs
Ethernet bandwidth growing
Scale up vs Scale out? (Discussion)

Page 22: CS 744: DATACENTER AS A COMPUTER

DATACENTER ARCHITECTURE

[Diagram labels: Memory Bus, Ethernet, SATA, PCIe; Server, Server]

Page 23: CS 744: DATACENTER AS A COMPUTER

Datacenter Networks

Traditional hierarchical topology
–  Expensive
–  Difficult to scale
–  High oversubscription
–  Smaller path diversity
–  …

[Diagram layers: Core, Agg., Edge]
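Oversubscription at a switch is commonly expressed as the ratio of server-facing bandwidth to uplink bandwidth. The sketch below computes it for a purely hypothetical edge switch; the port counts, link speeds, and function name are assumptions for illustration and do not come from the slides.

```python
def oversubscription_ratio(down_ports: int, down_gbps: float,
                           up_ports: int, up_gbps: float) -> float:
    """Ratio of total downlink (server-facing) bandwidth to total
    uplink bandwidth; 1.0 means no oversubscription."""
    return (down_ports * down_gbps) / (up_ports * up_gbps)


# Hypothetical edge switch: 48 x 10 Gbps server ports, 4 x 40 Gbps uplinks.
ratio = oversubscription_ratio(48, 10.0, 4, 40.0)
print(f"oversubscription: {ratio:.0f}:1")
```

In this hypothetical case, servers under one edge switch could offer three times more traffic than its uplinks can carry, and the ratio usually compounds at each layer of the hierarchy.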

Page 24: CS 744: DATACENTER AS A COMPUTER

STORAGE HIERARCHY (v2)

Page 25: CS 744: DATACENTER AS A COMPUTER

Scale Out: Warehouse-Scale Computers

Single organization
Homogeneity (to some extent)
Cost efficiency at scale
–  Multiplexing across applications and services
–  Rent it out!

Many concerns
–  Infrastructure
–  Networking
–  Storage
–  Software
–  Power/Energy
–  Failure/Recovery
–  …

Page 26: CS 744: DATACENTER AS A COMPUTER

MAIN COMPONENTS OF WSC

Page 27: CS 744: DATACENTER AS A COMPUTER

SOFTWARE IMPLICATIONS

Workload Diversity

Reliability

Single organization

Storage Hierarchy

Page 28: CS 744: DATACENTER AS A COMPUTER

Three Categories of Software

1.  Platform-level
    –  Software and firmware present in every machine

2.  Cluster-level
    –  Distributed systems to enable everything

3.  Application-level
    –  User-facing applications built on top

Page 29: CS 744: DATACENTER AS A COMPUTER

BigData

WORKLOAD: Partition-Aggregate

Top-level Aggregator

Mid-level Aggregators

Workers
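The three-level tree above can be sketched as a top-k query over sharded data: each worker answers from its own partition, mid-level aggregators merge groups of workers, and the top-level aggregator merges the rest. This is a single-process illustration; the top-k merge, the `fan_in` parameter, and all function names are assumptions rather than anything specified on the slide.

```python
import heapq
from typing import List, Tuple

Result = Tuple[float, str]  # (score, document id)


def worker(shard: List[Result], k: int) -> List[Result]:
    """Each worker returns the top-k results from its own partition."""
    return heapq.nlargest(k, shard)


def aggregate(partials: List[List[Result]], k: int) -> List[Result]:
    """An aggregator merges partial results and keeps the overall top-k."""
    merged = [result for part in partials for result in part]
    return heapq.nlargest(k, merged)


def partition_aggregate(shards: List[List[Result]], k: int,
                        fan_in: int) -> List[Result]:
    """Fan the query out to all workers, then merge in two levels:
    mid-level aggregators each cover `fan_in` workers, and the
    top-level aggregator merges the mid-level outputs."""
    worker_results = [worker(shard, k) for shard in shards]
    mid_results = [
        aggregate(worker_results[i:i + fan_in], k)
        for i in range(0, len(worker_results), fan_in)
    ]
    return aggregate(mid_results, k)
```

In a real deployment the workers run in parallel on different machines, so the response time is set by the slowest worker on the path.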

Page 30: CS 744: DATACENTER AS A COMPUTER

WORKLOAD: Map-Reduce

Map Stage

Reduce Stage
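The two stages can be illustrated with the usual word-count example. This is a single-process sketch, not a distributed implementation; the function names and the explicit shuffle step between the stages are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_stage(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, so that each key can be
    handled by a single reducer."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_stage(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents: List[str]) -> Dict[str, int]:
    """Run map over every document, then shuffle, then reduce."""
    mapped = [pair for doc in documents for pair in map_stage(doc)]
    return reduce_stage(shuffle(mapped))
```

Every map output may be needed by any reducer, so the shuffle between the two stages is an all-to-all data transfer. That differs from the tree-shaped traffic of partition-aggregate.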

Page 31: CS 744: DATACENTER AS A COMPUTER

VIDEO ENCODING

Page 32: CS 744: DATACENTER AS A COMPUTER

MACHINE LEARNING

Page 33: CS 744: DATACENTER AS A COMPUTER

WORKLOAD PATTERNS

Page 34: CS 744: DATACENTER AS A COMPUTER

DATACENTER VS DESKTOP

Parallelism Available
Workload churn
Platform homogeneity
Fault-free operation

Page 35: CS 744: DATACENTER AS A COMPUTER

DISCUSSION

Form groups of 4 students
Pick up a discussion form per group
Fill out responses at https://forms.gle/hhuKktMb5pKkotwc9

Page 36: CS 744: DATACENTER AS A COMPUTER

Discussion

Scale-up vs Scale-out

Page 37: CS 744: DATACENTER AS A COMPUTER

DISCUSSION

Differences between web-search and MapReduce

Page 38: CS 744: DATACENTER AS A COMPUTER

DISCUSSION

Microsoft Word vs. online document editor like Google Docs

Page 39: CS 744: DATACENTER AS A COMPUTER

DISCUSSION

Page 40: CS 744: DATACENTER AS A COMPUTER

NEXT STEPS

9/12 class on Storage Systems
Assignment 1 out Thursday