aws re:invent 2016: deep dive on amazon dynamodb (dat304)
TRANSCRIPT
![Page 1: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/1.jpg)
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
December 1, 2016
DAT304
Deep Dive on Amazon DynamoDB
Rick Houlihan, Principal TPM, DBS NoSQL
![Page 2: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/2.jpg)
What to Expect from the Session
• Brief history of data processing (Why NoSQL?)
• Overview of Amazon DynamoDB
• Scaling characteristics, hot keys, and design considerations
• NoSQL data modeling
• Normalized versus unstructured schema
• Common NoSQL design patterns
• Time Series, Write Sharding, MVCC, etc.
• Amazon DynamoDB in the serverless ecosystem
![Page 3: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/3.jpg)
Timeline of Database TechnologyD
ata
Pre
ssu
re
![Page 4: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/4.jpg)
Data volume since 2011
Data
Vo
lum
e
Historical Current
90 percent of stored data
generated in last 2 years
1 terabyte of data in 2011
equals 6.5 petabytes today
Linear correlation between
data pressure and technical
innovation
No reason these trends will
not continue over time
![Page 5: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/5.jpg)
Technology adoption and the hype curve
![Page 6: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/6.jpg)
Why NoSQL?
Optimized for storage Optimized for compute
Normalized/relational Denormalized/hierarchical
Ad hoc queries Instantiated views
Scale vertically Scale horizontally
Good for OLAP Built for OLTP at scale
SQL NoSQL
![Page 7: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/7.jpg)
Amazon DynamoDB
Document or key-value Scales to any workloadFully managed NoSQL
Access control Event driven programmingFast and consistent
![Page 8: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/8.jpg)
Table
Table
Items
Attributes
PartitionKey
Sort Key
Mandatory
Key-value access pattern
Determines data distribution
Optional
Model 1:N relationships
Enables rich query capabilities
All items for key==, <, >, >=, <=“begins with”“between”“contains”“in”sorted resultscountstop/bottom N values
![Page 9: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/9.jpg)
00 55 A954 FFAA00 FF
Partition Keys
Id = 1
Name = Jim
Hash (1) = 7B
Id = 2
Name = Andy
Dept = Eng
Hash (2) = 48
Id = 3
Name = Kim
Dept = Ops
Hash (3) = CD
Key Space
Partition key uniquely identifies an item
Partition key is used for building an unordered hash index
Allows table to be partitioned for scale
![Page 10: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/10.jpg)
Scaling
![Page 11: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/11.jpg)
Scaling
Throughput
Provision any amount of throughput to a table
Size
Add any number of items to a table
- Max item size is 400 KB
- LSIs limit the number of range keys due to 10 GB limit
Scaling is achieved through partitioning
![Page 12: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/12.jpg)
Throughput
Provisioned at the table level
Write capacity units (WCUs) are measured in 1 KB per second
Read capacity units (RCUs) are measured in 4 KB per second
- RCUs measure strictly consistent reads
- Eventually consistent reads cost 1/2 of consistent reads
Read and write throughput limits are independent
WCURCU
![Page 13: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/13.jpg)
Partitioning Math
In the future, these details might change…
Number of Partitions
By Capacity (Total RCU / 3000) + (Total WCU / 1000)
By Size Total Size / 10 GB
Total Partitions CEILING(MAX (Capacity, Size))
![Page 14: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/14.jpg)
Partitioning Example
Table size = 8 GB, RCUs = 5000, WCUs = 500
RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 10/3 = 3.33 GB
RCUs and WCUs are uniformly
spread across partitions
Number of Partitions
By Capacity (5000 / 3000) + (500 / 1000) = 2.17
By Size 8 / 10 = 0.8
Total Partitions CEILING(MAX (2.17, 0.8)) = 3
![Page 15: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/15.jpg)
What causes throttling?
If sustained throughput goes beyond provisioned throughput per partition
Non-uniform workloads
Hot keys/hot partitions
Very large items
Mixing hot data with cold data
Use a table per time period
From the example before:
Table created with 5000 RCUs, 500 WCUs
RCUs per partition = 1666.67
WCUs per partition = 166.67
If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB
may throttle requests
- Solution: Increase provisioned throughput
![Page 16: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/16.jpg)
What bad NoSQL looks like…
Part
itio
n
Time
Heat
![Page 17: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/17.jpg)
Getting the most out of DynamoDB throughput
“To get the most out of DynamoDB throughput, create tables where the
partition key element has a large number of distinct values, and values
are requested fairly uniformly, as randomly as possible.”
—DynamoDB Developer Guide
Space: access is evenly spread over the key-space
Time: requests arrive evenly spaced in time
![Page 18: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/18.jpg)
Much better picture…
![Page 19: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/19.jpg)
Data Modeling in NoSQL
![Page 20: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/20.jpg)
It’s all about aggregations…
Document management Process controlSocial network
Data treesIT monitoring
![Page 21: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/21.jpg)
SQL vs. NoSQL Design Pattern
![Page 22: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/22.jpg)
Hierarchical dataTiered relational data structures
![Page 23: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/23.jpg)
Hierarchical data structures as items
Use composite sort key to define a hierarchy
Highly selective result sets with sort queries
Index anything, scales to any size
Primary KeyAttributes
ProductID type
Item
s
1 bookIDtitle author genre publisher datePublished ISBN
Some Book John Smith Science Fiction Ballantine Oct-70 0-345-02046-4
2 albumIDtitle artist genre label studio released producer
Some Album Some Band Progressive Rock Harvest Abbey Road 3/1/73 Somebody
2 albumID:trackIDtitle length music vocals
Track 1 1:30 Mason Instrumental
2 albumID:trackIDtitle length music vocals
Track 2 2:43 Mason Mason
2 albumID:trackIDtitle length music vocals
Track 3 3:30 Smith Johnson
3 movieIDtitle genre writer producer
Some Movie Scifi Comedy Joe Smith 20th Century Fox
3 movieID:actorIDname character image
Some Actor Joe img2.jpg
3 movieID:actorIDname character image
Some Actress Rita img3.jpg
3 movieID:actorIDname character image
Some Actor Frito img1.jpg
![Page 24: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/24.jpg)
… or as documents (JSON)
JSON data types (M, L, BOOL, NULL)
Document SDKs available
Indexing only by using DynamoDB Streams or AWS Lambda
400 KB maximum item size (limits hierarchical data structure)
Primary KeyAttributes
ProductID
Item
s
1id title author genre publisher datePublished ISBN
bookID Some Book Some Guy Science Fiction Ballantine Oct-70 0-345-02046-4
2
id title artist genre Attributes
albumID Some Album Some Band Progressive Rock
{ label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals:
"Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour, Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music: ”Gilmour,
Waters", vocals: "Instrumental"}]}
3
id title genre writer Attributes
movieID Some Movie Scifi Comedy Joe Smith
{ producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob: "9/21/71", character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya Rudolph", dob:
"7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax Shepard", dob: "1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
![Page 25: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/25.jpg)
Design patterns and best practices
![Page 26: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/26.jpg)
DynamoDB Streams and AWS Lambda
![Page 27: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/27.jpg)
Triggers
Lambda functionNotify change
Derivative tables
Amazon CloudSearch
Amazon ElastiCache
![Page 28: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/28.jpg)
Real-time voting
Write-heavy items
![Page 29: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/29.jpg)
Partition 1
1000 WCUs
Partition K
1000 WCUs
Partition M
1000 WCUs
Partition N
1000 WCUs
Votes Table
Candidate A Candidate B
Scaling bottlenecks
Voters
Provision 200,000 WCUs
![Page 30: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/30.jpg)
Write sharding
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4Candidate A_7 Candidate B_8
Candidate A_6 Candidate A_8
Candidate A_5
Voter
Votes Table
![Page 31: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/31.jpg)
Write sharding
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4Candidate A_7 Candidate B_8
UpdateItem: “CandidateA_” + rand(0, 10)
ADD 1 to Votes
Candidate A_6 Candidate A_8
Candidate A_5
Voter
Votes Table
![Page 32: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/32.jpg)
Votes Table
Shard aggregation
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_5
Candidate A_6 Candidate A_8
Candidate A_7 Candidate B_8
Periodic
process
Candidate A
Total: 2.5M
1. Sum
2. Store Voter
![Page 33: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/33.jpg)
Trade off read cost for write scalability
Consider throughput per partition key
Shard write-heavy partition keys
Your write workload is not horizontally
scalable
![Page 34: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/34.jpg)
Event loggingStoring time series data
![Page 35: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/35.jpg)
Time Series Tables (Static TTL Timestamps)
Events_table_2015_April
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
Events_table_2015_March
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
Events_table_2015_Feburary
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
Events_table_2015_January
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
RCUs = 1000
WCUs = 1
RCUs = 10000
WCUs = 10000
RCUs = 100
WCUs = 1
RCUs = 10
WCUs = 1
Current table
Older tables
Hot
data
Cold
data
Don’t mix hot and cold data; archive cold data to Amazon S3
![Page 36: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/36.jpg)
Use a table per time period
Precreate daily, weekly, monthly tables
Provision required throughput for current table
Writes go to the current table
Turn off (or reduce) throughput for older tables
Dealing with static time series data
![Page 37: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/37.jpg)
Time Series Tables (Dynamic TTL Timestamps)
Active Tickets_Table
Event_id
(Partition)
Timestamp
(Sort)
GSIKey
Rand(0-N)
… Attribute N
Expired Tickets GSI
GSIKey
(Partition)
Timestamp
(Sort)
Archive Table
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
RCUs = 10000
WCUs = 10000
RCUs = 100
WCUs = 1
Current table
Hot
data
Cold
data
Scatter Query GSI for Expired Tickets and use Streams to archive
![Page 38: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/38.jpg)
Use Lambda to purge expired items
Use a write sharded GSI to selectively query expired items
Create a Lambda “stored procedure” to delete items
Migrate data between tables with Streams/Lambda trigger
Dealing with dynamic time series data
![Page 39: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/39.jpg)
Product catalogPopular items (read)
![Page 40: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/40.jpg)
Partition 1
2000 RCUs
Partition K
2000 RCUs
Partition M
2000 RCUs
Partition 50
2000 RCU
Scaling bottlenecks
Product A Product B
Shoppers
ProductCatalog Table
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
![Page 41: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/41.jpg)
![Page 42: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/42.jpg)
Partition 1 Partition 2
ProductCatalog Table
User
DynamoDB
User
Cache
popular items
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
![Page 43: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/43.jpg)
![Page 44: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/44.jpg)
Multiversion ConcurrencyTransactions in NoSQL
![Page 45: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/45.jpg)
How OLTP Apps Use Data
Mostly hierarchical
structures
Entity-driven workflows
Data spread across tables
Requires complex queries
Primary driver for ACID
![Page 46: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/46.jpg)
Reading item partitions
Get all items for a product
assembly
SELECT *
FROM Assemblies
WHERE ID=1
Order
ItemPicker
ID PartID Vendor Bin
1
1 ACME 27Z19
2 ACME 39M97
3 ACME 75B25
(Many more assemblies)
Assemblies table
Partition key Sort key
![Page 47: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/47.jpg)
ID PartID Vendor Bin
1
1 ACME 27Z19
2 ACME 39M97
3 ACME 75B25
4 UAC 53G56
5 UAC 64B17
6 UAC 48J19
Updates are problematic
(Many more assemblies)
Assemblies table
Old items
SELECT *
FROM Assemblies
WHERE ID=1
Order Item Picker
New items
![Page 48: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/48.jpg)
ID PartID Vendor Bin
1
1 ACME 27Z19
2 ACME 39M97
3 ACME 75B25
4 UAC 53G56
5 UAC 64B17
6 UAC 48J19
Creating partition “locks”
(Many more assemblies)
Assemblies table
• Use metadata item for versioning
SET #ItemList.i = list append(:newList, #ItemList.i ), #Lock =
1
• Obtain locks with conditional writes
--condition-expression “Lock = 0”
• Remove old items or add version info to Sort Key
Query composite keys for specific versions
• Readers can determine how to handle “fat” reads
• Add additional metadata to manage transactional
workflow as needed
Current State, Timeout, etc.
ID PartID ItemList Lock
1 0 {i:[[1,2,3],[4,5,6]]} 0
![Page 49: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/49.jpg)
Use item partitions to manage transactional
workflows
Manage versioning across items with metadata
Tag attributes to maintain multiple versions
Code your app to recognize when updates are in progress
Implement app layer error handling and recovery logic
Transactional writes across items is
required
![Page 50: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/50.jpg)
Messaging appLarge items
Filters vs. indexes
M:N Modeling—inbox and outbox
![Page 51: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/51.jpg)
Messages
table
Messages app
David
SELECT *
FROM Messages
WHERE Recipient='David'
LIMIT 50
ORDER BY Date DESC
Inbox
SELECT *
FROM Messages
WHERE Sender ='David'
LIMIT 50
ORDER BY Date DESC
Outbox
![Page 52: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/52.jpg)
Recipient Date Sender Message
David 2014-10-02 Bob …
… 48 more messages for David …
David 2014-10-03 Alice …
Alice 2014-09-28 Bob …
Alice 2014-10-01 Carol …
Large and small attributes mixed
(Many more messages)
David
Messages table
50 items × 256 KB each
Partition key Sort key
Large message bodies
Attachments
SELECT *
FROM Messages
WHERE Recipient='David'
LIMIT 50
ORDER BY Date DESC
Inbox
![Page 53: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/53.jpg)
Computing inbox query cost
Items evaluated by query
Average item size
Conversion ratio
Eventually consistent reads
50 * 256KB * (1 RCU / 4 KB) * (1 / 2) = 1600 RCU
![Page 54: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/54.jpg)
Recipient Date Sender Subject MsgId
David 2014-10-02 Bob Hi!… afed
David 2014-10-03 Alice RE: The… 3kf8
Alice 2014-09-28 Bob FW: Ok… 9d2b
Alice 2014-10-01 Carol Hi!... ct7r
Separate the bulk data
Inbox-GSI Messages table
MsgId Body
9d2b …
3kf8 …
ct7r …
afed …
David1. Query inbox-GSI: 1 RCU
2. BatchGetItem messages: 1600 RCU
(50 separate items at 256 KB)
(50 sequential items at 128 bytes)
Uniformly distributes large item reads
![Page 55: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/55.jpg)
Messaging app
Messages
table
David
Inbox
global secondary
index
Inbox
Outbox
global secondary
index
Outbox
![Page 56: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/56.jpg)
Reduce one-to-many item sizes
Configure secondary index projections
Use GSIs to model M:N relationship
between sender and recipient
Distribute large items
Querying many large items at once
InboxMessagesOutbox
![Page 57: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/57.jpg)
Multiplayer online gaming
Query filters vs.
composite key indexes
![Page 58: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/58.jpg)
Secondary index
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
BobPartition key Sort key
Multivalue sorts and filters
![Page 59: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/59.jpg)
Secondary Index
Approach 1: Query filter
Bob
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Game
WHERE Opponent='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
(filtered out)
![Page 60: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/60.jpg)
Needle in a haystack
Bob
![Page 61: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/61.jpg)
Send back less data “on the wire”
Simplify application code
Simple SQL-like expressions
• AND, OR, NOT, ()
Use query filter
Your index isn’t entirely selective
![Page 62: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/62.jpg)
Approach 2: Composite key
StatusDate
DONE_2014-10-02
IN_PROGRESS_2014-10-08
IN_PROGRESS_2014-10-03
PENDING_2014-09-30
PENDING_2014-10-03
Status
DONE
IN_PROGRESS
IN_PROGRESS
PENDING
PENDING
Date
2014-10-02
2014-10-08
2014-10-03
2014-10-03
2014-09-30
+ =
![Page 63: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/63.jpg)
Secondary Index
Approach 2: Composite key
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Partition key Sort key
![Page 64: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/64.jpg)
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Secondary index
Approach 2: Composite key
Bob
SELECT * FROM Game
WHERE Opponent='Bob'
AND StatusDate BEGINS_WITH 'PENDING'
![Page 65: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/65.jpg)
Needle in a sorted haystack
Bob
![Page 66: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/66.jpg)
Sparse indexes
Id (Partition)
User Game Score Date Award
1 Bob G1 1300 2012-12-23
2 Bob G1 1450 2012-12-233 Jay G1 1600 2012-12-24
4 Mary G1 2000 2012-10-24 Champ5 Ryan G2 123 2012-03-10
6 Jones G2 345 2012-03-20
Game-scores-table
Award (Partition)
Id User Score
Champ 4 Mary 2000
Award-GSI
Scan sparse GSIs
![Page 67: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/67.jpg)
Concatenate attributes to form useful
secondary index keys
Take advantage of sparse indexes
Replace filter with indexes
You want to optimize a query as much
as possible
Status + Date
![Page 68: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/68.jpg)
Reference architecture
AmazonKinesis
consumer
Amazon
EMR
![Page 69: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/69.jpg)
Elastic Serverless Applications
![Page 70: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/70.jpg)
Thank you!
![Page 71: AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)](https://reader034.vdocuments.us/reader034/viewer/2022042611/587125d31a28abe4448b6185/html5/thumbnails/71.jpg)
Remember to complete
your evaluations!