RAMCloud: Concept and Challenges

John Ousterhout
Stanford University

April 10, 2009 RAMCloud Slide 2

Introduction

● RAMCloud: massive datacenter storage entirely in DRAM

● New research project: Kozyrakis, Mazieres, McKeown, Mitra, Ousterhout, Parulkar, Prabhakar, Rosenblum

● Goal for today: feedback
    ○ What are we missing?
    ○ Related work
    ○ Pitfalls

Outline

● Overview of RAMCloud

● Motivation

● Challenges in making RAMCloud a reality

RAMCloud Overview

● Storage for datacenters

● 1000-10000 commodity servers

● 64 GB DRAM/server

● All data always in RAM

● Low-latency access: 5-10 µs RPC

● High throughput: 1,000,000 ops/sec/server

● High level of durability & availability

[Figure: application servers and storage servers within a datacenter]

Example Configurations

                     Today      5-10 years
# servers            1000       1000
GB/server            64 GB      1024 GB
Total capacity       64 TB      1 PB
Total server cost    $4M        $4M
$/GB                 $60        $4
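
The totals in the table follow from the per-server figures. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB); the function name `configuration` is illustrative and not from the talk:

```python
def configuration(servers, gb_per_server, total_cost_dollars):
    """Derive total capacity and cost per GB from per-server figures."""
    total_gb = servers * gb_per_server
    return {
        "total_tb": total_gb / 1000,                      # decimal TB (assumption)
        "dollars_per_gb": total_cost_dollars / total_gb,
    }

today = configuration(servers=1000, gb_per_server=64, total_cost_dollars=4_000_000)
future = configuration(servers=1000, gb_per_server=1024, total_cost_dollars=4_000_000)

print(today)   # 64 TB total, about $60/GB as on the slide
print(future)  # about 1 PB total, about $4/GB as on the slide
```

The slide's $60 and $4 figures appear to be these values rounded.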

RAMCloud Motivation

● Relational databases don’t scale

● Every large-scale Web application has problems:
    ○ Facebook: 4000 MySQL instances + 2000 memcached servers

● New forms of storage starting to appear: Google BigTable

● Many apps don’t need all RDBMS features, can’t afford them

RAMCloud Motivation, cont’d

Disk access rate not keeping up with capacity:

● Disks must become more archival

● Can’t afford small random accesses

● RAM cost today = disk cost 10 years ago

                                     Mid-1980’s   Today      Change
Disk capacity                        30 MB        200 GB     6667x
Max. transfer rate                   2 MB/s       100 MB/s   50x
Latency (seek & rotate)              20 ms        10 ms      2x
Capacity/bandwidth (large blocks)    15 s         2000 s     133x
Capacity/bandwidth (200B blocks)     3000 s       115 days   3333x
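
The two capacity/bandwidth rows can be reproduced from the first three rows. The sketch below assumes decimal units and, for the 200-byte case, counts only seek-and-rotate latency per access; that assumption is what matches the slide's numbers, and the function names are illustrative:

```python
MB, GB = 10**6, 10**9  # decimal units (assumption)

def scan_seconds(capacity_bytes, bytes_per_second):
    """Time to read the whole disk sequentially in large blocks."""
    return capacity_bytes / bytes_per_second

def random_read_seconds(capacity_bytes, block_bytes, latency_seconds):
    """Time to read the whole disk in small random blocks.

    Counts one seek-and-rotate latency per block and ignores transfer time
    (an assumption made to match the slide's figures).
    """
    return (capacity_bytes // block_bytes) * latency_seconds

disks = {
    "mid-1980s": {"capacity": 30 * MB, "rate": 2 * MB, "latency": 0.020},
    "today": {"capacity": 200 * GB, "rate": 100 * MB, "latency": 0.010},
}

for era, d in disks.items():
    scan = scan_seconds(d["capacity"], d["rate"])
    rand = random_read_seconds(d["capacity"], 200, d["latency"])
    print(f"{era}: full scan {scan:.0f} s, "
          f"200-byte random reads {rand:.0f} s ({rand / 86400:.1f} days)")
```

Reading today's disk in 200-byte pieces takes about 10,000,000 seconds, which is the 115 days in the table.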

Can We Make RAMCloud Work?

● Is capacity sufficient?

● Data durability/availability

● Low-latency RPCs

● Data model

● Concurrency model

● Data distribution, scaling

● Multi-tenancy

● Functional distribution

● Automated management

Is RAMCloud Capacity Sufficient?

● Target applications: online data, not media

● Facebook: 200 TB of (non-image) data today

● Amazon:
    ○ Revenues/year: $16B
    ○ Orders/year: 400M? ($40/order?)
    ○ Bytes/order: 10000?
    ○ Order data/year: 4 TB?

● United Airlines:
    ○ Total flights/day: 4000? (30,000 for all airlines in U.S.)
    ○ Passenger flights/year: 200M?
    ○ Data/passenger-flight: 10000 bytes?
    ○ Order data/year: 2 TB?
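
These back-of-envelope estimates can be checked against the 64 GB/server figure from the overview. A rough sketch, assuming decimal units and ignoring replication and storage overhead; `yearly_bytes` and `servers_needed` are illustrative names:

```python
TB, GB = 10**12, 10**9  # decimal units (assumption)

def yearly_bytes(records_per_year, bytes_per_record):
    """Data generated per year for a given record count and record size."""
    return records_per_year * bytes_per_record

def servers_needed(data_bytes, dram_per_server=64 * GB):
    """Servers needed to hold the data in DRAM (ceiling division, no overhead)."""
    return -(-data_bytes // dram_per_server)

amazon_orders = 16_000_000_000 // 40            # $16B revenue at $40/order
amazon = yearly_bytes(amazon_orders, 10_000)    # 4 TB
united = yearly_bytes(200_000_000, 10_000)      # 2 TB
facebook = 200 * TB                             # non-image data, from the slide

for name, size in [("Amazon/year", amazon), ("United/year", united), ("Facebook", facebook)]:
    print(f"{name}: {size / TB:.0f} TB, {servers_needed(size)} servers at 64 GB each")
```

Under these assumptions a year of Amazon or United order data fits in well under 100 servers, while Facebook's 200 TB needs 3125, which is within the 1000-10000 server range of the overview.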

Data Durability/Availability

● Data must be durable when write RPC returns.

● Non-starters:
    ○ Synchronous disk write (100-1000x too slow)
    ○ Replicate in other memories (too expensive)

● One possibility:
    ○ Log operations in RAM of other server(s)
    ○ Log to disk in batches
    ○ Occasional disk checkpoints to truncate logs

● Additional issues:
    ○ Fast recovery
    ○ Role of flash memory
    ○ Datacenter disaster recovery
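
The "one possibility" above can be sketched as a toy, in-process model. This is only one reading of the slide, not the project's design: the class names `Backup` and `Master`, the `batch_size` parameter, and the `recover` helper are all hypothetical, a Python list stands in for the on-disk log, and checkpoints are omitted.

```python
class Backup:
    """Hypothetical backup server: buffers log entries in RAM, writes disk in batches."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.ram_log = []     # entries logged in RAM, not yet on disk
        self.disk_log = []    # stand-in for an on-disk log
        self.disk_writes = 0  # number of batched disk writes so far

    def append(self, entry):
        self.ram_log.append(entry)
        if len(self.ram_log) >= self.batch_size:
            self.flush()

    def flush(self):
        """Move all buffered entries to disk in one batched write."""
        if self.ram_log:
            self.disk_log.extend(self.ram_log)
            self.ram_log.clear()
            self.disk_writes += 1

    def entries(self):
        """All logged entries in order: those on disk, then those still in RAM."""
        return self.disk_log + self.ram_log


class Master:
    """Hypothetical storage server holding the primary copy in RAM."""

    def __init__(self, backups):
        self.data = {}
        self.backups = backups

    def write(self, key, value):
        self.data[key] = value
        for backup in self.backups:
            backup.append((key, value))
        # The write returns once every backup holds the entry in RAM;
        # no synchronous disk write is on this path.


def recover(backup):
    """Rebuild a master's contents by replaying one backup's log."""
    data = {}
    for key, value in backup.entries():
        data[key] = value
    return data


backups = [Backup(), Backup()]
master = Master(backups)
for i in range(7):
    master.write(f"key{i % 4}", i)
print(backups[0].disk_writes, "disk writes for 7 logged operations")
print(recover(backups[0]) == master.data)
```

The point of the design choice is that disk writes are amortized over a batch while the write path touches only RAM; the cost is that entries still buffered in a backup's RAM depend on that backup staying up until the next flush.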

Low-Latency RPCs

Achieving 5-10µs will impact every layer of the system:

● Network: cut-through routing, flow control

● OS:
    ○ Dedicated cores
    ○ No interrupts?
    ○ No virtual memory?

● Protocol stack:
    ○ TCP too slow (especially with packet loss)
    ○ Must avoid copies

● Storage service

Could we someday get to 1µs??
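
To put these latency targets in perspective, the sketch below counts how many strictly sequential (dependent) accesses fit into one second at a given latency. The 10 ms disk figure is the seek-and-rotate latency quoted earlier in the talk; the function name is illustrative.

```python
def dependent_accesses_per_second(latency_us):
    """Sequential accesses per second when each must wait for the previous one."""
    return 1_000_000 / latency_us

for label, latency_us in [
    ("disk access (10 ms)", 10_000),
    ("RPC at 10 µs", 10),
    ("RPC at 5 µs", 5),
    ("RPC at 1 µs", 1),
]:
    rate = dependent_accesses_per_second(latency_us)
    print(f"{label}: {rate:,.0f} dependent accesses/sec")
```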

Data Model

● Probably not relations:
    ○ Too rigid
    ○ Too much fragmentation
    ○ Better to collect larger objects?

● Unstructured blobs?

● Hierarchical hash tables (e.g. JSON)?

● What is the role of indexing?
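
As an illustration of the fragmentation concern, the sketch below stores the same order two ways: split across two relation-like tables, and as one hierarchical object of the kind JSON can express. The order data and all names are invented for the example.

```python
import json

# Relational style: one logical order is fragmented across two tables.
orders = [{"order_id": 1, "customer": "alice"}]
order_items = [
    {"order_id": 1, "sku": "A1", "qty": 2},
    {"order_id": 1, "sku": "B7", "qty": 1},
]

def assemble(order_id):
    """Reassemble an order from its fragments (a join done by hand)."""
    order = next(o for o in orders if o["order_id"] == order_id)
    items = [
        {"sku": i["sku"], "qty": i["qty"]}
        for i in order_items
        if i["order_id"] == order_id
    ]
    return {**order, "items": items}

# Hierarchical style: the whole order is one object, fetched in one access.
order_object = {
    "order_id": 1,
    "customer": "alice",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

print(assemble(1) == order_object)
print(json.dumps(order_object))
```

In a distributed store, each fragment lookup could be a separate RPC, which is one reason collecting larger objects may be attractive.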

Concurrency Model

● Locking

● Transactions?

● What level of consistency for operations on distributed data?

Distribution and Scaling

● How to distribute data among storage servers?
    ○ Concentrate to minimize RPCs, simplify consistency
    ○ Distribute to reduce hot spots

● How to achieve locality of access? (Is there any locality for Web apps?)

● Smooth scaling: Automatic redistribution of data
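
The concentrate-versus-distribute trade-off can be sketched with two hypothetical placement functions: one hashes the full object key, spreading a user's objects across servers, and the other hashes only the owner, keeping them together. Hash-based placement is an assumption made for illustration; the slides do not commit to any scheme.

```python
import hashlib

def server_for(key, num_servers):
    """Map a string key to a server index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers

def spread_placement(user, object_id, num_servers):
    """Distribute: each object is placed independently (fewer hot spots)."""
    return server_for(f"{user}/{object_id}", num_servers)

def concentrated_placement(user, object_id, num_servers):
    """Concentrate: all of a user's objects share one server (fewer RPCs)."""
    return server_for(user, num_servers)

NUM_SERVERS = 8
spread = {spread_placement("alice", i, NUM_SERVERS) for i in range(100)}
together = {concentrated_placement("alice", i, NUM_SERVERS) for i in range(100)}
print(f"spread across {len(spread)} servers, concentrated on {len(together)}")
```

Note that simple modulo placement like this moves most keys whenever the number of servers changes, so smooth scaling would need a different scheme.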

Multi-Tenancy

● Design for cloud computing environments:
    ○ Multiple users/applications sharing a RAMCloud

● Security/access control issues

● Scale down as well as up:
    ○ Small apps should get RAMCloud benefits at proportional cost

Functional Distribution

● RAMCloud will include software on both application servers and storage servers

● What is the right distribution?

● Application servers:
    ○ Offload storage system, increase scalability, flexibility
    ○ Example: aggregation operators

● Storage servers:
    ○ More trusted
    ○ May benefit from knowledge of internals

[Figure: RAMCloud software spanning application servers and storage servers]

Automated Management

● Manual management a non-starter: system too complex

● Things that must be automatic:
    ○ Recovery from failures
    ○ Distribution of data among servers
    ○ Reorganization as system scales

Discussion

● Why hasn’t this been done before? Or has it?

● If this is a good idea, what happened to main-memory databases?

● What are the biggest issues that might keep RAMCloud from working?

● What’s the most important database work we should be aware of?

● Would any of you like to participate in RAMCloud?