Running Databases on AWS
Hardware provisioning
Data sharding
Data caching
Cluster management
Fault management
Requirement: predictable,
consistent performance
Reality: performance
degrades with scale
Chart axes: Scalability, Performance
The scalability challenge
Massive and Seamless Scalability
No table size or throughput limits
Live repartitioning for changes to storage and throughput
Example: Short Term Event
Massive scale, short duration
100,000 writes/second over 4 hours
Up to 1.4 billion writes
Estimated throughput cost: $400
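The figures on this slide can be checked with a little arithmetic. The sketch below is a rough check, not a pricing calculator. It assumes a price of $0.01 per hour for every 10 units of write capacity and one capacity unit per write per second; neither figure appears on the slide.

```python
# Back-of-the-envelope check of the short-term event example.
WRITES_PER_SECOND = 100_000
DURATION_HOURS = 4

# Assumption (not on the slide): 1 cent per hour for every 10 write
# capacity units, and each write consumes exactly one unit.
CENTS_PER_10_UNITS_PER_HOUR = 1

# Total writes over the whole event.
total_writes = WRITES_PER_SECOND * DURATION_HOURS * 3600

# Work in integer cents to avoid floating-point rounding.
cost_cents = (WRITES_PER_SECOND // 10) * CENTS_PER_10_UNITS_PER_HOUR * DURATION_HOURS
cost_dollars = cost_cents / 100

print(f"total writes: {total_writes:,}")
print(f"estimated throughput cost: ${cost_dollars:,.2f}")
```

Under these assumptions the totals agree with the slide: roughly 1.4 billion writes and a throughput cost of $400.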
Attributes
[ title => “Introduction to DynamoDB”,
date => “20120320” ]
[ title => “Disaster Recovery with AWS”,
date => “20120320”,
format => “online seminar”,
presenter => “Jeff Barr” ]
Items
[ title => “Introduction to DynamoDB”,
date => “20120320” ]
[ title => “Disaster Recovery with AWS”,
date => “20120320”,
format => “webinar”,
presenter => “Jeff Barr” ]
Items: limited to 64 kB!
Table
[ title => “Introduction to DynamoDB”,
date => “20120320” ]
[ title => “Disaster Recovery with AWS”,
date => “20120328”,
format => “webinar”,
presenter => “Jeff Barr” ]
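The attribute, item, and table slides above can be sketched in plain Python: an item is a set of name/value attributes, items need not share the same attributes, and a table is a collection of items. The size check is only an approximation (UTF-8 bytes of attribute names plus values), and the helper names `item_size` and `fits` are made up for this illustration.

```python
MAX_ITEM_BYTES = 64 * 1024  # the 64 kB item limit from the slide

def item_size(item):
    """Approximate size: UTF-8 bytes of every attribute name and value.

    Assumption: this mirrors the general idea of the limit, not the
    service's exact accounting.
    """
    return sum(
        len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
        for name, value in item.items()
    )

def fits(item):
    """True if the item is within the 64 kB limit."""
    return item_size(item) <= MAX_ITEM_BYTES

# A table is a collection of items; each item has its own attributes.
table = [
    {"title": "Introduction to DynamoDB", "date": "20120320"},
    {
        "title": "Disaster Recovery with AWS",
        "date": "20120328",
        "format": "webinar",
        "presenter": "Jeff Barr",
    },
]

for item in table:
    print(item["title"], item_size(item), fits(item))
```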
Item 1: “UserID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
Item 2: “UserID” = “2”, “Date” = “20100916”, “Title” = “ferrari”, “Tags” = “car”, “italian”
Item 3: “UserID” = “3”, “Date” = “20100917”, “Title” = “coffee”, “Tags” = “drink”, “delicious”
“ImageID” = “1” (primary or hash key)
“Date” = “20100915” (composite or range key)
“Title” = “flower”
“Tags” = “flower”, “jasmine”, “white” (sets of strings or numbers)
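One way to picture the key structure on this slide is a dictionary keyed by the pair (hash key, range key), with the multi-valued attribute held as a set. This is a toy in-memory model, not the service's API; the `put` helper and the second and third items are hypothetical.

```python
# Toy model: a table keyed by (hash key, range key).
images = {}

def put(item):
    """Store an item under its composite key, replacing any existing item."""
    key = (item["ImageID"], item["Date"])
    images[key] = item

# The item from the slide; "Tags" is a set of strings.
put({"ImageID": "1", "Date": "20100915", "Title": "flower",
     "Tags": {"flower", "jasmine", "white"}})

# Hypothetical: same hash key, different range key, so a separate item.
put({"ImageID": "1", "Date": "20100916", "Title": "flower again",
     "Tags": {"flower"}})

# Hypothetical: same hash key and range key, so the first item is replaced.
put({"ImageID": "1", "Date": "20100915", "Title": "flower (replaced)",
     "Tags": {"flower"}})

print(len(images))
print(images[("1", "20100915")]["Title"])
```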
Table APIs
1. CreateTable adds a new table to your account
2. UpdateTable updates the provisioned throughput
3. DeleteTable deletes a table and all of its items
4. DescribeTable returns table status, key schema, etc.
5. ListTables returns tables associated with the
current account and endpoint
Item APIs
1. PutItem creates a new item, or replaces an old
item with a new item (including all the
attributes). Can perform conditional Put.
2. GetItem returns a set of Attributes for an item that
matches the primary key. Consistent or
eventually consistent reads.
3. UpdateItem puts, adds, or deletes attributes. Can also
create a new item with attributes. Can
perform conditional update.
4. DeleteItem deletes a single item in a table by
primary key. Can perform conditional
delete.
Query & Scan APIs
1. Query gets the values of one or more items and
their attributes by primary key
– Applicable only to tables with a composite
hash and range key
GetItem – Returns a set of attributes for an item that matches
the primary key
Query – Only works on a table with a composite hash and range
key
– Hash key = ‘xxxxx’ and a range key condition (EQ, GT, LT, GE, LE, BEGINS_WITH, BETWEEN)
– Hash key = ‘xxxxx’ and no range key condition
– Count of items (that match a hash key value or hash key + range condition)
– Top N / Bottom N items ( via ScanIndexForward = T/F & Limit N)
– Paging via Limit N
BatchGetItem – returns the attributes for multiple items from
multiple tables using their primary keys
Scan – Scans a table from beginning to end and
applies filters
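The Query options above can be illustrated with a small in-memory model. This is a sketch of the semantics only, not the service's client API: the `query` function, its parameter names, and the sample items are assumptions made for the example, with "ImageID" as the hash key and "Date" as the range key.

```python
# Range key conditions from the slide, modeled as Python predicates.
CONDITIONS = {
    "EQ": lambda v, a: v == a,
    "GT": lambda v, a: v > a,
    "LT": lambda v, a: v < a,
    "GE": lambda v, a: v >= a,
    "LE": lambda v, a: v <= a,
    "BEGINS_WITH": lambda v, a: v.startswith(a),
    "BETWEEN": lambda v, lo, hi: lo <= v <= hi,
}

def query(items, hash_value, condition=None, args=(),
          scan_index_forward=True, limit=None):
    """Return items with the given hash key, optionally filtered by a
    range key condition, ordered by range key, and capped at `limit`."""
    matches = [i for i in items if i["ImageID"] == hash_value]
    if condition is not None:
        test = CONDITIONS[condition]
        matches = [i for i in matches if test(i["Date"], *args)]
    # scan_index_forward=False reverses the range key order.
    matches.sort(key=lambda i: i["Date"], reverse=not scan_index_forward)
    return matches if limit is None else matches[:limit]

items = [
    {"ImageID": "1", "Date": "20100915"},
    {"ImageID": "1", "Date": "20100916"},
    {"ImageID": "1", "Date": "20100917"},
    {"ImageID": "2", "Date": "20100915"},
]

# Hash key only, no range key condition.
all_for_1 = query(items, "1")
# Hash key plus a range key condition.
mid_range = query(items, "1", "BETWEEN", ("20100916", "20100917"))
# "Top N": newest item first, limited to one result.
latest = query(items, "1", scan_index_forward=False, limit=1)
# Count of items matching a hash key value.
count = len(all_for_1)
```

Scan differs in that it has no key to narrow the search: it would walk every item in `items` and apply its filters to each one.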
PutItem – Add a new item, replace an item with a new item
– Can return all old values of an item's attributes on update
– Conditional: Insert a new item only if the PK does not exist
UpdateItem – Add, update or delete an attribute (other than the
PK)
– Increment an attribute (X = X + 10): atomic increment and get
– Insert a new item and attributes
– Conditional: Insert a new attribute if it does not exist
DeleteItem – Delete an item
– Return ALL_OLD (optional)
– Conditional: Delete an item if it exists or if it
has an expected attribute value
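The conditional behaviours listed for PutItem, UpdateItem, and DeleteItem can be sketched against a plain dictionary. This is a single-threaded toy model of the semantics, not the service's API; `store`, `put_item`, `increment`, `delete_item`, and `ConditionFailed` are names invented for this example.

```python
class ConditionFailed(Exception):
    """Raised when a conditional write's precondition does not hold."""

store = {}  # primary key -> item (dict of attributes)

def put_item(key, item, only_if_absent=False):
    """Insert or replace an item; optionally insert only if the key is new.
    Returns the old item (or None), similar to asking for ALL_OLD."""
    if only_if_absent and key in store:
        raise ConditionFailed(key)
    old = store.get(key)
    store[key] = dict(item)
    return old

def increment(key, attr, amount):
    """Add `amount` to a numeric attribute and return the new value.
    Creates the item or attribute if missing. In this single-threaded
    toy the read-modify-write is trivially atomic."""
    item = store.setdefault(key, {})
    item[attr] = item.get(attr, 0) + amount
    return item[attr]

def delete_item(key, expected=None):
    """Delete an item, optionally only if it has the expected attribute
    values. Returns the deleted item (or None if there was none)."""
    item = store.get(key)
    if expected is not None:
        if item is None or any(item.get(k) != v for k, v in expected.items()):
            raise ConditionFailed(key)
    return store.pop(key, None)

put_item("user-1", {"views": 0})
try:
    put_item("user-1", {"views": 99}, only_if_absent=True)
    second_put_succeeded = True
except ConditionFailed:
    second_put_succeeded = False  # key already exists, so the put is rejected

views = increment("user-1", "views", 10)
deleted = delete_item("user-1", expected={"views": 10})
```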