DATABASE REPLICATION
A TALE OF RESEARCH ACROSS COMMUNITIES

Bettina Kemme, Dept. of Computer Science, McGill University, Montreal, Canada
Gustavo Alonso, Systems Group, Dept. of Computer Science, ETH Zurich, Switzerland

VLDB 2010, Singapore
© ETH, Zurich and McGill University, Montreal


Page 2

Across communities

Database person
• Everything inside the database
• Only performance matters

Distributed systems person
• Everything fails
• Only correctness matters

Postgres‐R (Dragon project)
• Protocols [Kemme, Alonso, ICDCS’98]
• Implementation [Kemme, Alonso, VLDB 2000]

Page 3

Postgres‐R

Intro to Replication
Postgres‐R
In perspective
Systems Today
The next 10 years

Page 4

A brief introduction to database replication

Page 5

[Figure: several DBS nodes, each with its own database copy]

Scalability
Fault‐tolerance
Fast access
Special purpose copies

Page 6
Page 7
Page 8

Primary Copy vs. Update Everywhere

Page 9

Eager (synchr.) vs. Lazy (asynchr.)

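The distinction can be illustrated with a minimal Python sketch. It assumes the usual definitions: eager propagation updates all copies before commit returns, lazy propagation commits locally and ships the change afterwards. The names `Copy`, `ReplicatedDB`, `commit_write` and `propagate` are hypothetical and not taken from the slides.

```python
class Copy:
    """One database copy, reduced to a key-value dictionary."""
    def __init__(self):
        self.data = {}


class ReplicatedDB:
    def __init__(self, n_copies, eager):
        self.copies = [Copy() for _ in range(n_copies)]
        self.eager = eager
        self.pending = []  # changes committed locally but not yet shipped (lazy only)

    def commit_write(self, key, value):
        """Commit a write at copy 0 and propagate it according to the strategy."""
        self.copies[0].data[key] = value
        if self.eager:
            # Eager: every other copy is updated before commit returns.
            for c in self.copies[1:]:
                c.data[key] = value
        else:
            # Lazy: commit returns immediately; propagation happens later.
            self.pending.append((key, value))

    def propagate(self):
        """Ship pending changes to the other copies (only relevant when lazy)."""
        for key, value in self.pending:
            for c in self.copies[1:]:
                c.data[key] = value
        self.pending.clear()

    def read(self, copy_index, key):
        return self.copies[copy_index].data.get(key)


eager_db = ReplicatedDB(3, eager=True)
eager_db.commit_write("x", 1)

lazy_db = ReplicatedDB(3, eager=False)
lazy_db.commit_write("x", 1)
stale = lazy_db.read(2, "x")   # copy 2 has not seen the change yet
lazy_db.propagate()
fresh = lazy_db.read(2, "x")   # visible after propagation
```

The lazy variant answers faster but lets a remote read return stale data until propagation has run.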

Page 10

Theory of replication 10 years ago

[Figure: 2×2 matrix. Columns: PRIMARY COPY, UPDATE EVERYWHERE. Rows: EAGER, LAZY]

Page 11

Replication in practice 10 years ago

[Figure: 2×2 matrix. Columns: PRIMARY COPY, UPDATE EVERYWHERE. Rows: EAGER, LAZY]

Page 12

Replication in 2000

Page 13

Read‐one / Write All

Distributed locking (writes)

2 Phase Commit
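A minimal Python sketch of how these three ingredients fit together, under simplifying assumptions: reads go to any single copy, writes take a write lock at every copy, and a two-phase commit decides the outcome. A copy that finds a conflicting lock simply votes no here, whereas a real lock manager would typically block. The names `Replica`, `prepare`, `finish` and `write_all` are hypothetical.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.locks = {}    # key -> id of the transaction holding the write lock
        self.staged = {}   # txn id -> writes prepared but not yet committed

    def read(self, key):
        # Read-one: any single copy can answer a read.
        return self.data.get(key)

    def prepare(self, txn, writes):
        """2PC phase 1: take write locks and stage the writes, or vote no."""
        if any(self.locks.get(k, txn) != txn for k in writes):
            return False
        for k in writes:
            self.locks[k] = txn
        self.staged[txn] = dict(writes)
        return True

    def finish(self, txn, commit):
        """2PC phase 2: apply or discard the staged writes, then release locks."""
        writes = self.staged.pop(txn, {})
        if commit:
            self.data.update(writes)
        for k in [k for k, holder in self.locks.items() if holder == txn]:
            del self.locks[k]


def write_all(replicas, txn, writes):
    """Write-all: the update commits only if every copy votes yes."""
    votes = [r.prepare(txn, writes) for r in replicas]
    decision = all(votes)
    for r in replicas:
        r.finish(txn, decision)
    return decision


replicas = [Replica(n) for n in "ABC"]
ok = write_all(replicas, "T1", {"x": 1})

replicas[1].locks["y"] = "T2"            # simulate T2 holding a write lock on y at B
blocked = write_all(replicas, "T3", {"y": 5})
x_values = [r.read("x") for r in replicas]
```

Every update touches every copy twice (prepare and finish), which is the cost the following slides examine.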

Page 14

[Figure: transaction timeline starting at BOT, with Read and Write operations at the copies and a final 2PC]

Page 15

The Dangers of Replication
Gray et al., SIGMOD 1996

Page 16

Response Time and Messages

[Figure: response time T of a centralized database compared with T of a replicated database]
update: 2N messages
2PC

Page 17

… and that’s not all

Network becomes an issue
• Messages = copies × write operations

Quorums?
• Reads must be local for complex SQL operations
• Different in key value stores (e.g., Cassandra)
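The message formula above can be turned into a small worked example. This is only arithmetic on the slide's formula; the function name `per_operation_messages` and the choice of 10 write operations per transaction are assumptions made for illustration.

```python
def per_operation_messages(copies, write_ops):
    """Messages = copies x write operations, as stated on the slide."""
    return copies * write_ops


# Messages for one transaction with 10 write operations as copies are added.
table = {n: per_operation_messages(n, write_ops=10) for n in (1, 2, 5, 10)}
for copies, messages in table.items():
    print(f"{copies:2d} copies -> {messages:3d} messages")
```

The count grows linearly in both factors, so adding copies to a write-heavy workload multiplies network traffic.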

Page 18

Our goal 

Can we get scalability and consistency when replicating a database?

Page 19

Postgres‐R in detail

Page 20

Fundamentals

Exploitation of group communication systems
• Ordering semantics
  Affect isolation / concurrency control
• Delivery semantics
  Affect atomicity

Page 21

Key insight in Postgres‐R

[Figure: BEFORE and AFTER panels showing the schedulers and a Pre‐Ordering Mechanism]

Page 22

The devil is in the details

Total order to serialization order
Provide various levels of isolation / atomicity degrees
Read operations always local
Propagate changes on transaction basis
No 2PC
• Rely on delivery guarantees
• Return to user once local replica commits
Determinism
Propagate changes vs. SQL statements
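A minimal Python sketch of the first points on this slide, assuming a total-order broadcast is available. The `Sequencer` class is a stand-in for a group communication system, and all names are hypothetical. The sketch leaves out conflict handling with local transactions and only shows that applying per-transaction write sets in delivery order keeps the copies identical without 2PC.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.log = []   # transactions in the order they were applied

    def read(self, key):
        # Reads are always local: no messages are exchanged.
        return self.data.get(key)

    def deliver(self, txn, writes):
        # The delivery order is the serialization order: apply as delivered.
        self.data.update(writes)
        self.log.append(txn)


class Sequencer:
    """Stand-in for a group communication system with total-order delivery."""
    def __init__(self, replicas):
        self.replicas = replicas

    def broadcast(self, txn, writes):
        # One message per transaction carries the whole write set,
        # and every replica receives the write sets in the same order.
        for r in self.replicas:
            r.deliver(txn, writes)


replicas = [Replica() for _ in range(3)]
gcs = Sequencer(replicas)

# T1 and T2 ran at different replicas and both wrote x; the broadcast orders them.
gcs.broadcast("T1", {"x": 1, "y": 1})
gcs.broadcast("T2", {"x": 2})

logs = [r.log for r in replicas]
states = [r.data for r in replicas]
```

Because every replica sees the same sequence, no voting round is needed to agree on the outcome.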

Page 23

Did we really avoid the dangers of replication?

[Figure: Distributed locking. Response time in ms (0 to 800) vs. number of servers (1 to 5)]
[Figure: Postgres‐R. Response time in ms (0 to 200) vs. number of servers (1 to 5)]

Postgres‐R removed a lot of the overhead of replication, providing scalability while maintaining strong consistency

Page 24

In perspective

Page 25

What worked

Ordered and guaranteed propagation of changes through an agreement protocol external to the engine

The implementation was crucial to prove the point

Thinking through the optimizations / real system issues

Levels of consistency

Page 26

What did not work

Modify the engine
• Today middleware based solutions

Enforce serializability
• Today SI and session consistency
• Data warehousing less demanding
• Cloud computing has lowered the bar

Page 27

Systems today

Page 28

Very rich design space

Applications (OLAP vs. OLTP)
Data layer (DB vs. others)
Throughput/Response time
Staleness
Availability guarantees
Partial vs. full replication
Granularity of changes, operations

Page 29

A Suite of Replication Protocols

Serializability
• Correctness: 1‐copy serializability
• Local decisions: write locks, abort conflicting local read locks
• Mssgs/txn: 2 mssgs (write set, confirm commit)
• Problems: very high abort rate

Cursor Stability
• Correctness: no lost updates, no dirty reads
• Local decisions: write locks, abort conflicting local read locks
• Mssgs/txn: 2 mssgs (write set, confirm commit)
• Problems: inconsistent reads

Snapshot isolation
• Correctness: allows write skew
• Local decisions: first writer to commit wins (deterministic)
• Mssgs/txn: 1 mssg (write set)
• Problems: high abort rate for update hot‐spots

Hybrid
• Correctness: 1‐copy serializability
• Local decisions: as serializability for update transactions
• Mssgs/txn: 2 mssgs (write set, confirm commit)
• Problems: requires identifying the queries

6 versions of each protocol depending on delivery guarantees and whether deferred updates or shadow copies are used (22 protocols) [Kemme and Alonso, ICDCS’98]
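The snapshot isolation entry ("first writer to commit wins", 1 message, allows write skew) can be sketched in Python as follows. This is a simplified illustration and not the protocol from the paper: snapshots are identified by a position in the total order, write sets are delivered synchronously to all replicas, and the names `Replica`, `snapshot`, `deliver` and `broadcast` are hypothetical.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.last_write_pos = {}   # key -> total-order position of the last committed write
        self.position = 0          # number of write sets delivered so far

    def snapshot(self):
        """A transaction remembers how many write sets had been delivered when it began."""
        return self.position

    def deliver(self, snapshot, writes):
        """Certify one write set in delivery order; returns True on commit."""
        self.position += 1
        # First writer to commit wins: abort if a key was written after our snapshot.
        if any(self.last_write_pos.get(k, 0) > snapshot for k in writes):
            return False
        for k, v in writes.items():
            self.data[k] = v
            self.last_write_pos[k] = self.position
        return True


def broadcast(replicas, snapshot, writes):
    """One message per transaction: the write set, delivered in the same order everywhere."""
    outcomes = [r.deliver(snapshot, writes) for r in replicas]
    assert len(set(outcomes)) == 1   # the decision is deterministic, so no vote is needed
    return outcomes[0]


replicas = [Replica() for _ in range(3)]
broadcast(replicas, replicas[0].snapshot(), {"a": 1, "b": 1})

# Two concurrent writers of x: the one delivered first commits, the other aborts.
s1, s2 = replicas[0].snapshot(), replicas[1].snapshot()
first = broadcast(replicas, s1, {"x": 1})
second = broadcast(replicas, s2, {"x": 2})

# Write skew: T3 read b and writes a, T4 read a and writes b. Only write sets
# are checked, so both commit even though each invalidated the other's read.
s3, s4 = replicas[0].snapshot(), replicas[1].snapshot()
t3 = broadcast(replicas, s3, {"a": 0})
t4 = broadcast(replicas, s4, {"b": 0})
```

Since every replica reaches the same verdict from the same delivery order, a second "confirm commit" message is not required in this variant.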

Page 30

Architecture

Eventually all this moved to a middleware layer above the database

Middle‐R
All systems out there today

Page 31

Consistency variations

Middle‐R
GORDA project
Tashkent
Pronto/Sprint
C‐JDBC
C. Amza et al.

Snapshot isolation, optimistic cc, conflict detection, relaxed consistency

Page 32

Architectural variations

Ganymed
DBFarm
SQL Azure
Zimory (virtualized satellites)

Primary copy
Specialized satellites

Page 33

Engine variations

Xkoto (Teradata)
Galera
Continuent
SQL Azure
Ganymed

MySQL, DB2, SQL Server, Oracle, Heterogeneous

Page 34

Change unit variations

Xkoto (Teradata)
• SQL statement propagation

Continuent
• Log capture

Determinism!

SQL statement
Log entries
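Why determinism matters for the choice of change unit can be shown with a small Python sketch. It assumes a statement that depends on replica-local state, here a per-replica clock standing in for a call such as `now()`. The `Replica` class and its methods are hypothetical, and the sketch does not describe how any of the products listed above handle the issue.

```python
class Replica:
    def __init__(self, local_clock):
        self.local_clock = local_clock   # differs from replica to replica
        self.rows = {}

    def execute_statement(self, row_id):
        """Stand-in for: UPDATE t SET updated_at = now() WHERE id = row_id."""
        self.rows[row_id] = self.local_clock
        return {row_id: self.rows[row_id]}   # the resulting change

    def apply_changes(self, changes):
        """Install changes computed elsewhere without re-executing anything."""
        self.rows.update(changes)


# Statement propagation: every replica re-executes the statement itself.
stmt_replicas = [Replica(local_clock=t) for t in (100, 101, 103)]
for r in stmt_replicas:
    r.execute_statement(row_id=7)
stmt_values = [r.rows[7] for r in stmt_replicas]

# Change propagation: one replica executes, the others apply its changes.
chg_replicas = [Replica(local_clock=t) for t in (100, 101, 103)]
changes = chg_replicas[0].execute_statement(row_id=7)
for r in chg_replicas[1:]:
    r.apply_changes(changes)
chg_values = [r.rows[7] for r in chg_replicas]
```

With statement propagation the copies diverge unless the statement is deterministic; shipping the resulting changes sidesteps the problem at the cost of larger messages.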

Page 35

Cloud solutions

PAXOS
Key value stores, files, tables

Cassandra
PNUTS
Big Table
Cloudy
…

Page 36

With a lot of related topics

Consistency (Khuzaima et al., Cahill et al.)
Applications (Vandiver et al.)
Agreement protocols (Lamport et al.)
Determinism (Thomson et al.)
Recovery (Kemme et al., Jimenez et al.)

Page 37

The next 10 years

Page 38

Understanding the full picture

Paxos‐Group communication protocols differ in the exact properties they provide
• Often difficult to understand for outsiders

Subtleties in implementation and efficiency
• Complex implementations

Adjusting the agreement protocols to the needs of databases
• Properties that suffice
• Efficient implementation

Plenty of use cases (one size does not fit all)

Page 39

Database and distributed systems

Databases and distributed systems have converged in practice
• Many similar concepts

Research
• Work still done in separate communities

Teaching
• Dire need for joint courses (thanks to Amr El Abbadi and Divy Agrawal from UCSB!)

Page 40

Thanks

Andre Schiper, Fernando Pedone, Matthias Wiesmann (EPFL)
Marta Patiño, Ricardo Jiménez (UPM)
The PostgreSQL community
PhD and master students at ETH and McGill who have worked and are working on related ideas
Many colleagues and friends ...