google megastore & google spanner - harvard...
Google Megastore & Google Spanner
The Problem
I have a great app idea, but I don’t want to learn much about databases. Can’t it just be easy and scalable and reliable?
Because that’s impossible!
Consistency, Availability, Scalability
NoSQL, RDBMS
Google Megastore
Highly Scalable
Rapid Development
Low Latency
Consistent
Highly Available
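BigTable, Google’s distributed KV store
[Figure: example BigTable row with column keys “contents:”, “anchor:cnnsi.com”, and “anchor:my.look.ca”; cell values “<html>...”, “CNN”, and “CNN.com”; versions labeled with timestamps t3, t4, t5, t8, t9]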
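As a rough illustration of the data model on this slide, a BigTable cell can be thought of as addressed by row key, column key, and timestamp. The sketch below is a toy in-memory stand-in, not BigTable's API: the class `ToyTable`, its methods, the row key `com.cnn.www`, and the pairing of timestamps with values are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyTable:
    """Toy sparse map: (row, column, timestamp) -> value. Not BigTable's real API."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        if not eligible:
            return None
        return versions[max(eligible)]


table = ToyTable()
row = "com.cnn.www"  # assumed row key, for illustration only
table.put(row, "contents:", 3, "<html>...v1")
table.put(row, "contents:", 5, "<html>...v2")
table.put(row, "anchor:cnnsi.com", 9, "CNN")
table.put(row, "anchor:my.look.ca", 8, "CNN.com")

print(table.get(row, "contents:"))               # newest version of the page
print(table.get(row, "contents:", timestamp=4))  # version visible at time 4
```

Keeping several timestamped versions per cell is what lets a reader ask for the value as of an earlier time.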
Megastore
BigTable
GFS
[Figure: data split into key ranges (2, 14, 37...; 58, 90, 102...; 700, 706...), with the same ranges shown in three copies]
Entity Groups
Datacenters
Paxos Basics - Read and Write
Read, Phase 1 - Reading replica polls other replicas
Read, Phase 2 - Available replicas respond with their values
[Figure: replica diagram, every replica labeled 0]
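The two read phases above can be sketched as a majority read. This is a simplified model rather than Megastore's implementation: the `Replica` class, its `version` counter, and the rule of returning the highest-versioned response are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Replica:
    version: int            # hypothetical: number of writes this replica has applied
    value: str
    available: bool = True

    def respond(self) -> Optional[Tuple[int, str]]:
        """Phase 2: an available replica answers with (version, value)."""
        return (self.version, self.value) if self.available else None


def quorum_read(replicas):
    """Phase 1: poll every replica; succeed only if a majority answers."""
    responses = [r.respond() for r in replicas]
    responses = [resp for resp in responses if resp is not None]
    if len(responses) <= len(replicas) // 2:
        raise RuntimeError("no majority of replicas available")
    # Return the value carried by the highest version among the responders.
    return max(responses, key=lambda resp: resp[0])[1]


replicas = [
    Replica(2, "b"),
    Replica(1, "a"),
    Replica(2, "b", available=False),
    Replica(2, "b"),
    Replica(1, "a", available=False),
]
print(quorum_read(replicas))
```

Because any two majorities overlap in at least one replica, a read that hears from a majority is guaranteed to see the latest majority-committed write.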
Write, Prepare - Writing replica asks other replicas to conduct vote
[Figure: replica diagram, every replica labeled 0]
Write, Promise -- Available replicas promise to ignore lower proposals
[Figure: replica diagram, every replica labeled “0, 3”]
Write, Accept - Writing replica proposes its value
Write, Commit - Available replicas accept the value
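The four write steps (Prepare, Promise, Accept, Commit) can be sketched as one round of single-value Paxos. This is a minimal in-process model under stated assumptions: the names `Acceptor` and `propose` are hypothetical, every replica is treated as available, and messages are plain method calls rather than network requests.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest proposal number promised so far
        self.accepted_n = -1        # proposal number of the accepted value, if any
        self.accepted_value = None

    def prepare(self, n):
        """Promise to ignore proposals numbered below n; report any accepted value."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        """Accept the value unless a higher-numbered proposal was promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run one round; return the chosen value, or None if the round fails."""
    majority = len(acceptors) // 2 + 1

    # Prepare / Promise
    promises = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in promises if ok]
    if len(granted) < majority:
        return None

    # If some acceptor already accepted a value, that value must be proposed.
    prior_n, prior_value = max(granted, key=lambda p: p[0])
    if prior_n >= 0:
        value = prior_value

    # Accept / Commit
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, n=3, value="b"))
```

Once a value has been chosen, a later proposer with a higher number learns it during Prepare and re-proposes it, so the chosen value never changes.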
Paxos Basics -- Write Conflicts
[Figure: replica diagram, replicas labeled 0 or 1]
Prepare - 2 writing replicas want to make proposals
[Figure: replica diagram, replicas labeled 0 or 1]
Promise
[Figure: replica diagram, every replica labeled 1]
More prepares - Lower numbered proposals get rejected / replaced
[Figure: replica diagram, replicas labeled 0 or 1]
More Promises
[Figure: replica diagram, replicas labeled 1]
Accept - Only 1 replica gets the majority of promises required
[Figure: replica diagram, every replica labeled “1, 3”]
Commit
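The conflict walk-through above can be replayed in a few lines. This is a simplified sketch, not the presenters' code: the `Acceptor` class is hypothetical, it tracks only promise numbers (it does not report previously accepted values), and the split of which replicas hear which writer first is an assumed interleaving.

```python
class Acceptor:
    """Simplified acceptor: tracks only the highest proposal number promised."""

    def __init__(self):
        self.promised = -1
        self.accepted = None  # (proposal number, value) once accepted

    def prepare(self, n):
        # Promise only if n beats every proposal number seen so far.
        if n > self.promised:
            self.promised = n
            return True
        return False

    def accept(self, n, value):
        # Reject proposals numbered below the current promise.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


acceptors = [Acceptor() for _ in range(5)]
majority = len(acceptors) // 2 + 1

# Prepare: writer A (proposal 0) reaches two replicas, writer B (proposal 1) three.
promises = {0: sum(a.prepare(0) for a in acceptors[:2]),
            1: sum(a.prepare(1) for a in acceptors[2:])}

# More prepares: proposal 0 is rejected where 1 was promised,
# while proposal 1 replaces the earlier promises made to 0.
promises[0] += sum(a.prepare(0) for a in acceptors[2:])
promises[1] += sum(a.prepare(1) for a in acceptors[:2])

# Accept: only a writer holding a majority of promises may propose its value.
values = {0: "x", 1: "y"}
for n, count in promises.items():
    if count >= majority:
        accepted = sum(a.accept(n, values[n]) for a in acceptors)
        print(f"proposal {n}: {count} promises, accepted by {accepted} replicas")
    else:
        print(f"proposal {n}: {count} promises, no majority")
```

The losing writer would normally retry with a higher proposal number; that retry loop is left out to keep the sketch short.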
How could Paxos be made more efficient?
Local Reads via a “coordinator”
Local replica up-to-date
Local replica out-of-date
Local replica out-of-date - normal Paxos read
Local replica out-of-date - update replica and coordinator
[Figure: replica diagram, value “b” and replicas labeled “0, 3, b”]
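The local-read cases above can be sketched as follows. This is a toy model, not Megastore's implementation: the `Coordinator` class, the `read` function, and the `majority_read` callback (standing in for the normal Paxos read) are hypothetical names introduced for illustration.

```python
class Coordinator:
    """Tracks which entity groups the local replica is known to be current on."""

    def __init__(self):
        self._up_to_date = set()

    def is_up_to_date(self, group):
        return group in self._up_to_date

    def validate(self, group):
        self._up_to_date.add(group)

    def invalidate(self, group):
        # Called when a write commits without reaching the local replica.
        self._up_to_date.discard(group)


def read(group, local, coordinator, majority_read):
    """Serve locally when the coordinator vouches for the replica, else catch up."""
    if coordinator.is_up_to_date(group):
        return local[group], "local"
    value = majority_read(group)   # normal Paxos read across replicas
    local[group] = value           # update the local replica...
    coordinator.validate(group)    # ...and the coordinator
    return value, "paxos"


local = {"g1": "a"}  # stale local copy
coordinator = Coordinator()
first = read("g1", local, coordinator, lambda g: "b")
second = read("g1", local, coordinator, lambda g: "unused")
print(first, second)
```

The payoff is that the common case, an up-to-date local replica, needs no cross-datacenter round trip at all.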
Faster Writes: Accept & Prepare in 1 communication
Writing replica requests to be proposal 0
Immediate Accept & Prepare for next round
[Figure: replica diagram, every replica labeled “0, 8, c”]
Megastore Performance Analysis
Performance Analysis
Our experimental setup is that we’ve used Megastore for 100+ applications for several years and it works!
What are some drawbacks to this solution?
Google Megastore
Highly Scalable
Rapid Development
Low Latency
Consistent
Highly Available
Google Spanner
Highly Scalable
Rapid Development
Low Latency
Consistent
Highly Available
What time is it?
TT.now() → TTinterval: [lower bound, upper bound]
[Figure: timeline showing absolute time between a lower bound and an upper bound, with example times from 2016-02-10, 4:35:30 to 2016-02-10, 4:35:33]
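The idea that "what time is it?" is answered with an interval rather than a single instant can be sketched as below. This is a toy stand-in, not Spanner's TrueTime implementation: the class `TrueTimeSketch`, the `epsilon_seconds` parameter, its default value, and the injected `clock` are assumptions made for illustration. The `after` and `before` helpers follow the descriptions in the Spanner paper's TrueTime API.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on absolute time
    latest: float    # upper bound on absolute time


class TrueTimeSketch:
    """Toy TrueTime: a local clock reading widened by an assumed error bound."""

    def __init__(self, epsilon_seconds=0.007, clock=time.time):
        self.epsilon = epsilon_seconds  # assumed uncertainty, illustration only
        self._clock = clock

    def now(self) -> TTInterval:
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t) -> bool:
        """True if time t has definitely passed."""
        return self.now().earliest > t

    def before(self, t) -> bool:
        """True if time t has definitely not arrived yet."""
        return self.now().latest < t


# A fixed fake clock keeps the example deterministic.
tt = TrueTimeSketch(epsilon_seconds=1.5, clock=lambda: 100.0)
interval = tt.now()
print(interval.earliest, interval.latest)
print(tt.after(98.0), tt.after(100.0), tt.before(102.0))
```

A timestamp inside the interval is neither definitely past nor definitely future, which is why both `after(100.0)` and `before(100.0)` are false here.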
Improved Paxos: Serialization with Locking
[Figure: replica diagram, every replica labeled timestamp_0]
Improved Paxos: Long Leader Leases
[Figure: replica diagram, replicas labeled ts_0, then ts_1]
Spanner Performance Analysis
Setup
Next steps?
Supporting more complex SQL queries with an underlying key-value structure
[Figure: a key-value layout (key, value, ...) shown alongside a relational row (ID, name, age, address, ...)]
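One common way to layer rows on top of a key-value store is to give each column of each row its own key. The sketch below shows that idea only. It is not how Spanner or Megastore encode rows: the key format `table/primary_key/column`, the function names, and the sample data are all made up for illustration.

```python
def row_to_kv(table, primary_key, row):
    """Flatten one relational row into key-value pairs, one per column."""
    return {f"{table}/{primary_key}/{column}": value for column, value in row.items()}


def kv_to_row(table, primary_key, store):
    """Rebuild the row by collecting keys that share the row's prefix."""
    prefix = f"{table}/{primary_key}/"
    return {k[len(prefix):]: v for k, v in store.items() if k.startswith(prefix)}


store = {}  # stand-in for the underlying key-value store
store.update(row_to_kv("users", 1, {"name": "Ada", "age": 36, "address": "12 Main St"}))
store.update(row_to_kv("users", 10, {"name": "Bob", "age": 41, "address": "9 Elm Rd"}))

print(sorted(store))
print(kv_to_row("users", 1, store))
```

In a store that keeps keys sorted, all columns of a row sit next to each other, so fetching a row becomes a short range scan rather than the full pass used here.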
The Future
SQL NoSQL
NewSQL