© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Sailesh Krishnamurthy Senior Engineering Manager Amazon Web Services Amazon Aurora: A Database for the Cloud


Page 1: Amazon Aurora, CloudDM 2016, University of Michigan (salat0.eecs.umich.edu/clouddm2016/files/amzn.pdf)

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Sailesh Krishnamurthy, Senior Engineering Manager

Amazon Web Services

Amazon Aurora: A Database for the Cloud

Page 2

Rapidly growing global footprint

• Over 1 million active customers across 190 countries
• 800+ government agencies
• 3,000+ educational institutions
• 12 regions
• 33 availability zones
• 52 edge locations

Every day, AWS adds enough new server capacity to support Amazon.com when it was a $7 billion global enterprise.

Page 3

How is the Cloud changing the World?

Moving from a world where we design for sharing scarce system resources to one where the central challenge is taking advantage of their abundance

Page 4

Designing Databases for the Cloud

• Scarcity to Abundance
• Monolithic to Service Oriented
• Single cluster to a Fleet of clusters

Page 5

What is Amazon Aurora?

• MySQL-compatible relational database
• Performance and availability of commercial databases
• Simplicity and cost-effectiveness of open source databases
• Delivered as a managed service

Page 6

Agenda

• Motivation
• Aurora Architecture
• Performance
• Availability
• Usability
• Beyond Benchmarks

Page 7

Current DB Architectures are Monolithic

[Diagram: an application on top of a single stack of SQL, Transactions, Caching and Logging layers over Storage; the scaled-out variants show several applications, each with its own copy of the same SQL / Transactions / Caching / Logging stack and its own storage.]

Even when you scale it out, you're still replicating the same stack

Page 8

Aurora Architecture: Re-imagined for the Cloud

• Moved the logging and storage layer into a multitenant, scale-out storage service optimized for OLTP database workloads
• Leverage existing AWS services: Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon S3
• Maintain compatibility with MySQL: customers can migrate their MySQL applications as-is and use all MySQL tools.

[Diagram: Data Plane with SQL, Transactions and Caching layers over a Logging + Storage layer and Amazon S3; Control Plane with Amazon DynamoDB, Amazon SWF and Amazon Route 53; numbered callouts 1 to 3.]

Page 9

Aurora Storage: A Service-Oriented Architecture

• Scale-out, multi-tenant, SSD storage
  • Seamless storage scalability
  • Up to 64 TB database size
  • Only pay for what you use
• Log-structured storage
  • Many small segments, each with their own redo logs
  • Redo logs used to generate data pages on demand
  • Eliminates chatter between database and storage
• Highly available/durable by default
  • 6-way replication across 3 AZs
  • 4 of 6 write quorum
    • Automatic fallback to 3 of 4 if an Availability Zone (AZ) is unavailable
  • 3 of 6 read quorum
  • Continuous backup to S3

[Diagram: storage nodes spread across AZ 1, AZ 2 and AZ 3, with backup to Amazon S3.]
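The quorum sizes on this slide can be sanity-checked with a little arithmetic. The following is only an illustrative sketch: the function names are hypothetical (not an Aurora API), and the layout of two copies per AZ is an assumption that the slide does not spell out.

```python
from itertools import combinations

COPIES = 6          # 6-way replication across 3 AZs
COPIES_PER_AZ = 2   # assumption: copies are spread evenly over the AZs
WRITE_QUORUM = 4    # 4 of 6
READ_QUORUM = 3     # 3 of 6


def quorums_overlap(n: int, write_q: int, read_q: int) -> bool:
    """True if every read quorum meets every write quorum,
    and any two write quorums share at least one copy."""
    return write_q + read_q > n and 2 * write_q > n


def brute_force_overlap(n: int, write_q: int, read_q: int) -> bool:
    """Check read/write intersection by enumerating every possible quorum."""
    nodes = range(n)
    return all(set(w) & set(r)
               for w in combinations(nodes, write_q)
               for r in combinations(nodes, read_q))


def quorum_possible_after_az_loss(n: int, copies_per_az: int, quorum: int) -> bool:
    """True if a quorum of the given size can still form with one AZ gone."""
    return n - copies_per_az >= quorum
```

With these numbers a reader always sees at least one copy that holds the latest acknowledged write, and both quorums remain reachable after losing a whole AZ (though a 4 of 6 write quorum then needs every surviving copy, which is presumably why the slide mentions falling back to 3 of 4).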

Page 10

Aurora: Database/Storage Interaction

• New-API model
  • Reads are block-based (read pages)
  • Writes are delta-based (write redo logs)
• Distributed quorum-based writes
  • Ordered log-stream in a single LSN space
  • Database writes log-stream to 6 nodes in 3 AZs
  • Transaction commit only after write quorum established
• Continuous state exchange protocol
  • Segments can have holes (lost log records)
  • Read at an LSN directed to the right segment
  • Storage segments know when to coalesce redo logs

[Diagram: database instance (SQL, Transactions, Caching) over storage nodes in AZ 1, AZ 2 and AZ 3, with backup to Amazon S3.]

Page 11

Agenda

• Motivation
• Aurora Architecture
• Performance
• Availability
• Usability
• Beyond Benchmarks

Page 12

5x faster than RDS MySQL 5.6 & 5.7

[Charts: WRITE PERFORMANCE and READ PERFORMANCE. MySQL SysBench results. R3.8XL: 32 cores / 244 GB RAM]

Five times higher throughput than stock MySQL, based on industry standard benchmarks.

Page 13

How did we achieve this?

DATABASES ARE ALL ABOUT I/O

NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND

HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES

DO LESS WORK

• Do fewer IOs
• Minimize network packets
• Cache prior results
• Offload the database engine

BE MORE EFFICIENT

• Process asynchronously
• Reduce latency path
• Use lock-free data structures
• Batch operations together

Page 14

IO Traffic in RDS MySQL

MYSQL WITH REPLICA

[Diagram: primary instance in AZ 1 and replica instance in AZ 2, each writing to Amazon Elastic Block Store (EBS) with an EBS mirror, plus Amazon S3; numbered steps 1 to 5. Legend, type of write: binlog, data, double-write, redo log, FRM files.]

IO FLOW

• Issue write to EBS; EBS issues to mirror, ack when both done
• Stage write to standby instance through DRBD
• Issue write to EBS on standby instance

OBSERVATIONS

• Steps 1, 3, 5 are sequential and synchronous
• This amplifies both latency and jitter
• Many types of writes for each user operation
• Have to write data blocks twice to avoid torn writes

PERFORMANCE

• 780K transactions
• 7,388K I/Os per million txns (excludes mirroring, standby)
• Average 7.4 I/Os per transaction

30 minute SysBench write-only workload, 100GB dataset, RDS Multi-AZ, 30K PIOPS

Page 15

IO Traffic in Aurora

AMAZON AURORA

[Diagram: primary instance and replica instances across AZ 1, AZ 2 and AZ 3 over shared storage; async 4/6 quorum, distributed writes; Amazon S3. Legend, type of write: binlog, data, double-write, redo log, FRM files.]

IO FLOW

• Boxcar redo log records: fully ordered by LSN
• Shuffle to appropriate segments: partially ordered
• Boxcar to storage nodes and issue writes

OBSERVATIONS

• Only write redo log records; all steps asynchronous
• No data block writes (checkpoint, cache replacement)
• 6X more log writes, but 9X less network traffic
• Tolerant of network and storage outlier latency

PERFORMANCE

• 27,378K transactions (35X MORE)
• 950K I/Os per 1M txns (6X amplification)
• Average 0.95 I/O per txn (7.7X LESS)

30 minute SysBench write-only workload, 100GB dataset
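The "35X MORE" and "7.7X LESS" callouts follow directly from the figures printed on the RDS MySQL and Aurora IO traffic slides. A short worked check, using only those published numbers (the variable names are ours):

```python
# Figures as printed on the two "IO Traffic" slides
# (30 minute SysBench write-only workload, 100GB dataset).
mysql_txns_k = 780               # thousand transactions, RDS MySQL
aurora_txns_k = 27_378           # thousand transactions, Aurora
mysql_ios_k_per_m_txns = 7_388   # thousand I/Os per million transactions
aurora_ios_k_per_m_txns = 950    # thousand I/Os per million transactions

# Throughput ratio over the same 30 minute window.
throughput_ratio = aurora_txns_k / mysql_txns_k

# I/Os per transaction: (thousand I/Os * 1000) / one million transactions.
mysql_ios_per_txn = mysql_ios_k_per_m_txns * 1_000 / 1_000_000
aurora_ios_per_txn = aurora_ios_k_per_m_txns * 1_000 / 1_000_000
io_ratio = mysql_ios_per_txn / aurora_ios_per_txn

print(f"Aurora completed {throughput_ratio:.1f}x as many transactions")
print(f"RDS MySQL: {mysql_ios_per_txn:.2f} I/Os per txn")
print(f"Aurora:    {aurora_ios_per_txn:.2f} I/Os per txn")
print(f"I/O per txn ratio: {io_ratio:.2f}x")
```

The I/O ratio comes out a little under 7.8, which the slide reports as 7.7X.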

Page 16

IO Traffic in Aurora (Storage Node)

[Diagram: log records flow from the primary instance into the storage node's incoming queue, then the update queue and hot log (with ACK back to the primary), then sort/group, coalesce into data blocks, point-in-time snapshot and S3 backup, with GC and scrub in the background and peer-to-peer gossip with peer storage nodes; numbered steps 1 to 8.]

IO FLOW

① Receive record and add to in-memory queue
② Persist record and ACK
③ Organize records and identify gaps in log
④ Gossip with peers to fill in holes
⑤ Coalesce log records into new data block versions
⑥ Periodically stage log and new block versions to S3
⑦ Periodically garbage collect old versions
⑧ Periodically validate CRC codes on blocks

OBSERVATIONS

• All steps are asynchronous
• Only steps 1 and 2 are in foreground latency path
• Input queue is 46X less than MySQL (unamplified, per node)
• Favor latency-sensitive operations
• Use disk space to buffer against spikes in activity
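Steps ① to ⑤ of the storage node flow can be pictured with a toy model. This is a minimal sketch under loud assumptions, not Aurora's implementation: the class and method names are hypothetical, LSNs are treated as dense integers, and a "page" is a single integer that redo records increment.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Toy storage node; every name here is hypothetical."""
    hot_log: dict = field(default_factory=dict)  # lsn -> (page_id, delta)
    pages: dict = field(default_factory=dict)    # page_id -> materialized value
    applied_lsn: int = 0                         # highest LSN coalesced into pages

    def receive(self, lsn, page_id, delta):
        """Steps 1-2: accept a record, persist it in the hot log, then ACK."""
        self.hot_log[lsn] = (page_id, delta)
        return ("ACK", lsn)

    def gaps(self, up_to_lsn):
        """Step 3: LSNs after applied_lsn, up to up_to_lsn, never received here."""
        return [lsn for lsn in range(self.applied_lsn + 1, up_to_lsn + 1)
                if lsn not in self.hot_log]

    def gossip_fill(self, peer, up_to_lsn):
        """Step 4: copy missing records from a peer that has them."""
        for lsn in self.gaps(up_to_lsn):
            if lsn in peer.hot_log:
                self.hot_log[lsn] = peer.hot_log[lsn]

    def coalesce(self):
        """Step 5: apply the gap-free prefix of the log to the data pages."""
        while self.applied_lsn + 1 in self.hot_log:
            self.applied_lsn += 1
            page_id, delta = self.hot_log[self.applied_lsn]
            self.pages[page_id] = self.pages.get(page_id, 0) + delta
```

Only `receive` sits on the foreground path in this model; hole filling and coalescing can run later, which mirrors the slide's observation that only steps 1 and 2 affect write latency. Steps ⑥ to ⑧ (S3 staging, garbage collection, CRC scrubbing) are left out.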

Page 17

IO Traffic in Aurora Read Replicas

MYSQL READ SCALING

[Diagram: MySQL master (30% read, 70% write) and MySQL replica (30% new reads, 70% write), each with its own data volume; single-threaded binlog apply.]

• Logical: Ship SQL statements to Replica
• Write workload similar on both instances
• Independent storage
• Can result in data drift between Master and Replica

AMAZON AURORA READ SCALING

[Diagram: Aurora master (30% read, 70% write) and Aurora replica (100% new reads) over shared Multi-AZ storage; page cache update, selective log apply.]

• Physical: Ship redo from Master to Replica
• Replica shares storage. No writes performed
• Cached pages have redo applied
• Advance read view when all commits seen
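The Aurora side of this comparison ("selective log apply") can be sketched as follows. This is an illustrative model only: the class and its fields are hypothetical, pages are plain integers, and redo records are simple increments.

```python
class ReplicaCache:
    """Toy read replica: applies redo only to pages it already caches."""

    def __init__(self):
        self.cache = {}         # page_id -> cached value
        self.read_view_lsn = 0  # reads on the replica see state as of this LSN

    def apply_redo(self, records, commit_lsn):
        """Apply a batch of (lsn, page_id, delta) records shipped by the master.

        Records for uncached pages are skipped: the replica performs no
        writes, and shared storage serves the current page on a later read.
        """
        for _lsn, page_id, delta in records:
            if page_id in self.cache:
                self.cache[page_id] += delta
        # Advance the read view once the whole batch of commits has been seen.
        self.read_view_lsn = commit_lsn
```

Because uncached pages are never touched, the replica spends its I/O budget on new reads rather than on replaying the master's write workload.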

Page 18

Adaptive Thread Pool

MYSQL THREAD MODEL

• Standard MySQL: one thread per connection
  • Doesn't scale with connection count
• MySQL EE: connections assigned to thread group
  • Requires careful stall threshold tuning

AURORA THREAD MODEL

• Re-entrant connections multiplexed to active threads
• Kernel-space epoll() inserts into latch-free event queue
• Dynamically size thread pool
• Gracefully handles 5000+ concurrent client sessions on r3.8xl

[Diagram: client connections feed epoll(), which inserts into a latch-free task queue.]

Page 19

Asynchronous Group Commits

TRADITIONAL APPROACH

• Maintain a buffer of log records to write out to disk
• Issue write when buffer full or time out waiting for writes
• First writer has latency penalty when write rate is low

AMAZON AURORA

• Request I/O with first write, fill buffer till write picked up
• Individual write durable when 4 of 6 storage nodes ACK
• Advance DB Durable point up to earliest pending ACK

[Diagram: transactions T1 ... Tn (read, write, commit) plotted over time. LSN GROWTH: durable LSN at head-node. COMMIT QUEUE: pending commits in LSN order, Commit (T1) through Commit (T8), with LSNs 10, 12, 20, 22, 30, 34, 41, 47, 49 and 50. GROUP COMMIT.]
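The Aurora bullets describe a durable point that trails the earliest write still waiting for its quorum, with commits released as that point passes them. A minimal sketch of that bookkeeping, assuming writes are issued in ascending LSN order; the `CommitTracker` class and its methods are hypothetical names, not Aurora code.

```python
import heapq
from collections import deque

WRITE_QUORUM = 4  # a write is durable once 4 of 6 storage nodes ACK


class CommitTracker:
    def __init__(self):
        self.acks = {}            # lsn -> set of node ids that have ACKed
        self.pending = deque()    # issued LSNs not yet durable, ascending
        self.durable_lsn = 0      # every write at or below this is durable
        self.commit_queue = []    # min-heap of (commit_lsn, txn_name)

    def issue_write(self, lsn):
        """Record a log write that has been sent to the storage nodes."""
        self.acks[lsn] = set()
        self.pending.append(lsn)

    def request_commit(self, txn, commit_lsn):
        """Queue a commit; the worker thread does not block waiting for it."""
        heapq.heappush(self.commit_queue, (commit_lsn, txn))

    def on_ack(self, lsn, node_id):
        """Handle one storage-node ACK; return transactions now committed."""
        self.acks[lsn].add(node_id)
        # Advance the durable point up to the earliest write still pending.
        while self.pending and len(self.acks[self.pending[0]]) >= WRITE_QUORUM:
            self.durable_lsn = self.pending.popleft()
        done = []
        while self.commit_queue and self.commit_queue[0][0] <= self.durable_lsn:
            done.append(heapq.heappop(self.commit_queue)[1])
        return done
```

Note that a later write reaching quorum first does not release its commit: the durable point waits for every earlier LSN, and one ACK can then release a whole group of queued commits at once.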

Page 20

Agenda

• Motivation
• Aurora Architecture
• Performance
• Availability
• Usability
• Benchmarks

Page 21

What about availability?

“Performance only matters if your database is up”

Page 22

More Replicas

• Aurora cluster contains primary node and up to fifteen secondary nodes
• Failing database nodes are automatically detected and replaced
• Failing database processes are automatically detected and recycled
• Secondary nodes automatically promoted on persistent outage, no single point of failure
• Customer application may scale-out read traffic across secondary nodes
• Customer specifiable fail-over order
• Read balancing across read replicas

[Diagram: a primary node and secondary nodes spread across AZ 1, AZ 2 and AZ 3.]

Page 23

Storage Durability

• Storage volume automatically grows up to 64 TB
• Quorum system for read/write; latency tolerant
• Peer to peer gossip replication to fill in holes
• Continuous backup to S3 (built for 11 9s durability)
• Continuous monitoring of nodes and disks for repair
• 10GB segments as unit of repair or hotspot rebalance
• Quorum membership changes do not stall writes

[Diagram: storage nodes across AZ 1, AZ 2 and AZ 3, with backup to Amazon S3.]

Page 24

Continuous Backup

[Diagram: timelines for Segment 1, Segment 2 and Segment 3 showing segment snapshots, log records and the recovery point over time.]

• Take periodic snapshot of each segment in parallel; stream the redo logs to Amazon S3
• Backup happens continuously without performance or availability impact
• At restore, retrieve the appropriate segment snapshots and log streams to storage nodes
• Apply log streams to segment snapshots in parallel and asynchronously

Page 25

Survivable Buffer Caches

• We moved the cache out of the database process
• Cache remains warm in the event of database restart
• Lets you resume fully loaded operations much faster
• Instant crash recovery + survivable cache = quick and easy recovery from DB failures

[Diagram: SQL / Transactions / Caching stacks, with the caching layer drawn outside the database process.]

Caching process is outside the DB process and remains warm across a database restart

Page 26

Instant Crash Recovery

Traditional Databases

• Have to replay logs since the last checkpoint
• Typically 5 minutes between checkpoints
• Single-threaded in MySQL; requires a large number of disk accesses

[Diagram: checkpointed data followed by redo log, ending at T0.] Crash at T0 requires a re-application of the SQL in the redo log since last checkpoint

Amazon Aurora

• Underlying storage replays redo records on demand as part of a disk read
• Parallel, distributed, asynchronous
• No replay for startup

[Diagram: per-segment logs ending at T0.] Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously
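The difference between replaying the log at startup and applying redo "on demand as part of a disk read" can be shown with a small model. This is a sketch under stated assumptions: `Segment` and its methods are hypothetical, a page is an integer, and each redo record is an increment tagged with its LSN.

```python
class Segment:
    """Toy storage segment: base page versions plus outstanding redo."""

    def __init__(self, base_pages):
        # page_id -> (value, lsn of the last change folded into value)
        self.base = dict(base_pages)
        self.redo = {}  # page_id -> list of (lsn, delta)

    def append_redo(self, lsn, page_id, delta):
        """Store a redo record; the page itself is not rewritten."""
        self.redo.setdefault(page_id, []).append((lsn, delta))

    def read_page(self, page_id, read_lsn):
        """Materialize the page as of read_lsn by applying redo during the read."""
        value, base_lsn = self.base[page_id]
        for lsn, delta in sorted(self.redo.get(page_id, [])):
            if base_lsn < lsn <= read_lsn:
                value += delta
        return value
```

After a crash nothing needs to run before the database accepts requests: each page is brought up to date the first time it is read, and different segments can do so independently.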

Page 27

Fast Fail-Over

[Timeline, MYSQL: DB failure, failure detection, DNS propagation, recovery, recovery, app running.]

[Timeline, AURORA WITH MARIADB DRIVER: DB failure, failure detection, DNS propagation, recovery, app running.]

Durations marked on the timelines: 15-20 sec and 3-20 sec

Page 28

Real-life data - fail-over time

“In RDS MySQL, it took minutes or sometimes tens of minutes to failover. It’s pretty awesome that you can failover/restart within less than a minute.”

Page 29

Simulate failures using SQL

• To cause the failure of a component at the database node:

ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]

• To simulate the failure of disks:

ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN [DISK index | NODE index] FOR INTERVAL interval

• To simulate the failure of networking:

ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type [TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval

Page 30

Agenda

• Motivation
• Aurora Architecture
• Performance
• Availability
• Usability
• Beyond Benchmarks

Page 31

What about usability?

“Focus on the application and not managing the system”

Page 32

Simplify Database Management

• Create a database in minutes
• Automated patching
• Push-button scale compute
• Continuous backups to Amazon S3
• Automatic failure detection and failover

Amazon RDS

Page 33

Simplify Storage Management

• Read replicas are available as failover targets: no data loss
• Instantly create user snapshots: no performance impact
• Continuous, incremental backups to Amazon S3
• Automatic storage scaling up to 64 TB: no performance or availability impact
• Automatic restriping, mirror repair, hot spot management, encryption

Page 34

Simplify Data Security

• Encryption to secure data at rest
  • AES-256; hardware accelerated
  • All blocks on disk and in Amazon S3 are encrypted
  • Key management via AWS KMS
• SSL to secure data in transit
• Network isolation via Amazon VPC by default
• No direct access to nodes
• Supports industry standard security and data protection certifications

[Diagram: application over SQL, Transactions and Caching layers, over Storage and Amazon S3.]

Page 35

What about tools and eco-system?

All MySQL tools work as is …

Page 36

Well established MySQL ecosystem

Business Intelligence, Data Integration, Query and Monitoring, SI and Consulting

Source: Amazon

“We ran our compatibility test suites against Amazon Aurora and everything just worked.” - Dan Jewett, Vice President of Product Management at Tableau

Page 37

Integration with 3rd Party Tools

Page 38

Monitor Aurora with Datadog

• Just add read-only AWS credentials and select the services you wish to monitor (e.g. RDS)

Page 39

Ready to move?

We made it easy to migrate ..

Page 40

Simplify migration from RDS MySQL

• 1. Establish baseline
  a. RDS MySQL to Aurora DB snapshot migration
  b. MySQL dump/import
• 2. Catch-up changes

[Diagram: application users connect over the network to MySQL and Aurora.]

Page 41

Migration from EC2 & on-premise MySQL

• Data migration service
  • Logical data replication from on-premise or EC2
  • Code & schema conversion across engines
• S3 integration
  • Load partial datasets directly from / to S3
  • Ingest large database snapshots (>2TB)
• Snowball integration
  • Ingest huge database snapshots (>10TB)
  • Send us your data in a suitcase!

Page 42

Migration from non-MySQL Databases

AWS Database Migration Service

• Move data to the same or different database engine
• Keep your apps running during the migration
• Start your first migration in 10 minutes or less
• Replicate within, to, or from Amazon EC2 or RDS

Page 43

Agenda

• Motivation
• Aurora Architecture
• Performance
• Availability
• Usability
• Beyond Benchmarks

Page 44

Beyond Benchmarks

If only real world applications saw benchmark performance

POSSIBLE DISTORTIONS

• Real world requests contend with each other
• Real world metadata rarely fits in data dictionary cache
• Real world data rarely fits in buffer cache
• Real world production databases need to run HA

Page 45

Scaling User Connections

• SysBench OLTP Workload
• 250 tables

Connections   Amazon Aurora   RDS MySQL w/ 30K IOPS
50            40,000          10,000
500           71,000          21,000
5,000         110,000         13,000

UP TO 8x FASTER

Page 46

Scaling Table Count

• SysBench write-only workload
• Measuring writes per second
• 1,000 connections

Tables   Amazon Aurora   MySQL I2.8XL local SSD   MySQL I2.8XL RAM disk   RDS MySQL w/ 30K IOPS (single AZ)
10       60,000          18,000                   22,000                  25,000
100      66,000          19,000                   24,000                  23,000
1,000    64,000          7,000                    18,000                  8,000
10,000   54,000          4,000                    8,000                   5,000

UP TO 11x FASTER

Page 47

Scaling Data Set

SYSBENCH WRITE-ONLY

DB Size   Amazon Aurora   RDS MySQL w/ 30K IOPS
1GB       107,000         8,400
10GB      107,000         2,400
100GB     101,000         1,500
1TB       26,000          1,200

UP TO 67x FASTER

CLOUDHARMONY TPC-C

DB Size   Amazon Aurora   RDS MySQL w/ 30K IOPS
80GB      12,582          585
800GB     9,406           69

UP TO 136x FASTER
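The "up to" multipliers on the data-set slide are the largest per-row ratios between the two columns. A quick check using only the table values (the dictionary and function names are ours):

```python
# Values copied from the two "Scaling Data Set" tables.
sysbench = {            # DB size -> (Amazon Aurora, RDS MySQL w/ 30K IOPS)
    "1GB": (107_000, 8_400),
    "10GB": (107_000, 2_400),
    "100GB": (101_000, 1_500),
    "1TB": (26_000, 1_200),
}
tpcc = {
    "80GB": (12_582, 585),
    "800GB": (9_406, 69),
}


def speedups(table):
    """Return {db_size: aurora / mysql} for each row of a table."""
    return {size: aurora / mysql for size, (aurora, mysql) in table.items()}


def best(table):
    """Return (db_size, ratio) for the row with the largest speedup."""
    ratios = speedups(table)
    size = max(ratios, key=ratios.get)
    return size, ratios[size]


for name, table in (("SysBench write-only", sysbench), ("CloudHarmony TPC-C", tpcc)):
    size, ratio = best(table)
    print(f"{name}: up to {ratio:.0f}x faster (at {size})")
```

The peak SysBench ratio occurs at 100GB rather than at 1TB, because Aurora's own throughput also drops at the largest data set.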

Page 48

Real-life data: gaming workload

Aurora vs RDS MySQL: r3.4XL, MAZ

Aurora 3X faster

Page 49

Scaling With Replicas

• SysBench write-only workload
• 250 tables

Updates per second   Amazon Aurora   RDS MySQL 30K IOPS (single AZ)
1,000                2.62 ms         0 s
2,000                3.42 ms         1 s
5,000                3.94 ms         60 s
10,000               5.38 ms         300 s

UP TO 500x LOWER LAG

Page 50

Real-life data - read replica latency

“In RDS MySQL, we saw replica lag spike to almost 12 minutes which is almost absurd from an application’s perspective. The maximum read replica lag across 4 replicas never exceeded beyond 20 ms.”

Page 51

Questions?

Thank you!

P.S. We’re hiring! Email me at: [email protected]

http://aws.amazon.com/rds/aurora