adaptive replication and partitioning in data systemsmtabebe/resources/... · hierarchical p2p file...

329
Adaptive Replication and Partitioning in Data Systems Brad Glasbergen, Michael Abebe, Khuzaima Daudjee Middleware 2018 Data Systems Group

Upload: others

Post on 18-Aug-2020

2 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Replication and Partitioning

in Data Systems

Brad Glasbergen, Michael Abebe,

Khuzaima DaudjeeMiddleware 2018

Data Systems Group

Page 2: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

BA .APAGE 2

Page 3: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 3

Page 4: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 4

Page 5: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 5

Page 6: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Single Node Architecture

BOverloaded

PAGE 6

Page 7: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Single Node Architecture

How to scale beyond a single node?

Replicate and partition

PAGE 7

Page 8: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Single Node Architecture

How to scale beyond a single node?

Replicate and partition

PAGE 8

Page 9: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Replicated Architecture

B

S2 S3

A B A B

Handle more requests PAGE 9

Page 10: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Replicated Architecture

B

S2 S3

A B A B

Cost of coordinationPAGE 10

Page 11: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Replicated Architecture

S2 S3

A B B

How many replicas?PAGE 11

Page 12: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Replicated Architecture

S2 S3

A B B

W[ A ] W[ A ]

PAGE 12

Page 13: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Replicated Architecture

S2 S3

AB B

W[ A ] W[ A ]

Where to place replicas?PAGE 13

Page 14: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S1

A

Replicated Architecture

S2 S3

AB B

How to propagate updates?

W[ A ]

PAGE 14

● (A)synchronous● Consistency

Page 15: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?

● Where to place replicas?

● How to propagate updates?

Replication Decisions

PAGE 15

Page 16: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Single Node Architecture

How to scale beyond a single node?

Replicate and partition

PAGE 16

Page 17: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioned Architecture

Distributes requests S1

A

S2

B

PAGE 17

Page 18: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioned Architecture

S1 S2

BA

PAGE 18

Page 19: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioned Architecture

How to form partitions? S1

A1

S2

BA2

PAGE 19

Page 20: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioned Architecture

Where to place partitions? S1

A1

S2

BA2

PAGE 20

Page 21: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioned Architecture

How to execute multi-partition operations?

S1

A

S2

B

PAGE 21

Page 22: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioning Decisions

PAGE 22

● How to form partitions? ● Where to place partitions?

● How to execute multi-partition operations?

Page 23: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Where to place partitions?

S1

A1

S3

BA2

PAGE 23

Page 24: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How to make a partitioning or replication decision when access patterns change?

Static Decisions

Why do access patterns change?

PAGE 24

Page 25: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Why do accesses change?

Humans have follow-the-sun cycles

PAGE 25

Page 26: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Why do accesses change?

Load bursts

PAGE 26

Page 27: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Why do accesses change?

Shifting hot-spotsPAGE 27

Page 28: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How to make a partitioning or replication decision when access patterns change?

Static Decisions

Adaptively replicate and partition

PAGE 28

Page 29: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Where to place partitions?

S1

A1

S2

BA2

PAGE 29

Page 30: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Where to place partitions?

S1

A1

S2

A2B

PAGE 30

Page 31: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How many replicas?

S1

A1

S2

A2B

R[ B ]

PAGE 31

Page 32: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How many replicas?

S1

A1

S2

A2B

R[ B ]

B

PAGE 32

Page 33: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● Adaptive Replication

● Adaptive Partitioning

● Outlook

Road Map

PAGE 33

Page 34: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Replication

PAGE 34

Page 35: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?

● Where to place replicas?

● How to propagate updates?

Replication Decisions

PAGE 35

Page 36: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Replication ● Decentralized

● Geo-Distributed

● Caching

● Availability

PAGE 36

Page 37: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

Adaptive Replication (ADR)

(Wolfson et al., TODS 1997)

S1 S3

S4 S5 S6

PAGE 37

Page 38: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Replication (ADR)

(Wolfson et al., TODS 1997)

S2 S1 S3

PAGE 38

Page 39: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Replication (ADR)

(Wolfson et al., TODS 1997)

A

S2 S1 S3

PAGE 39

Page 40: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

Local Read

(Wolfson et al., TODS 1997)

A

R[A] S1 S3

PAGE 40

Page 41: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Remote Read

(Wolfson et al., TODS 1997)

A

S2 S1 S3

PAGE 41

Page 42: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

R[A] S2

Remote Read

(Wolfson et al., TODS 1997)

A

S1 S3

PAGE 42

Page 43: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

R[A] S2

Remote Read

(Wolfson et al., TODS 1997)

A

S1 S3

More Messages!

PAGE 43

Page 44: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

R[A] S2

Replication for Reads

(Wolfson et al., TODS 1997)

A

S1 S3

A’

PAGE 44

Page 45: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

W[A] S2

No Free Lunch

(Wolfson et al., TODS 1997)

A

S1 S3

A’

PAGE 45

Page 46: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

W[A] S2

No Free Lunch

(Wolfson et al., TODS 1997)

A

S1 S3

A’

Reduces read cost,Increases write cost

PAGE 46

Page 47: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

When to Replicate?

(Wolfson et al., TODS 1997)

A

S2 S1 S3

PAGE 47

Page 48: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

When to Replicate?

(Wolfson et al., TODS 1997)

A

Reads: 15

Writes: 5

Writes: 5

S1 S3

PAGE 48

Page 49: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

When to Replicate?

(Wolfson et al., TODS 1997)

A

Reads: 0

Writes: 5

Writes: 5

A’

Writes: 10

S2 S1 S3

PAGE 49

Page 50: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

When to Replicate?

(Wolfson et al., TODS 1997)

A

Reads: 0

Writes: 5

Writes: 5

A’

Writes: 10

5 Fewer Messages!

S2 S1 S3

PAGE 50

Page 51: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

When to Stop Replicating?

(Wolfson et al., TODS 1997)

AA’

S2 S1 S3

PAGE 51

Page 52: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

When to Stop Replicating?

(Wolfson et al., TODS 1997)

AA’

Writes: 20Reads: 5

Reads: 5

S1 S3

PAGE 52

Page 53: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

When to Stop Replicating?

(Wolfson et al., TODS 1997)

A

Writes: 20Reads: 5

Reads: 5

10 Fewer Messages!

S1 S3

PAGE 53

Page 54: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Decentralized Decisions

(Wolfson et al., TODS 1997)

A

S2 S1 S3

Replicate?

PAGE 54

Page 55: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adapts to Changing Workloads

(Wolfson et al., TODS 1997)

A

Replicate?

S2 S1 S3

More Reads

A’

PAGE 55

Page 56: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

Extensions (Network Topology)

(Wolfson et al., TODS 1997)

S1 S3

S4 S5 S6

PAGE 56

Page 57: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

Extensions (Network Topology)

(Wolfson et al., TODS 1997)

S1 S3

S4 S5 S6

PAGE 57

Page 58: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

S2

Extensions (Network Topology)

(Wolfson et al., TODS 1997)

S1 S3

S4 S5 S6

Form Tree

PAGE 58

Page 59: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Extensions (Blocks)

(Wolfson et al., TODS 1997)

S2 S1 S3

A

B

C

⅓ R[A] + ⅓ R[B] + ⅓ R[C] ⅓ W[A] + ⅓ W[B] + ⅓ W[C]

A’

B’

C’

PAGE 59

Page 60: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Peer-to-Peer File Systems

PAGE 60

Page 61: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hash-Based P2P File System

h(s) = 275

PAGE 61

Page 62: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hash-Based P2P File System

PAGE 62

Page 63: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hash-Based P2P File System

PAGE 63

Page 64: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hash-Based P2P File System

PAGE 64

Page 65: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hash-Based P2P FSBalanced load, but poor access locality!

PAGE 65

Page 66: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hierarchical P2P File Systems

PAGE 66

Page 67: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hierarchical P2P File Systems

PAGE 67

Page 68: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hierarchical P2P File Systems

PAGE 68

Page 69: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hierarchical P2P File Systems

PAGE 69

Page 70: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hierarchical P2P File SystemGood locality,but poor balance

PAGE 70

Page 71: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Gopalakrishnan et al., ICDCS’04)

Overloaded

Moderate Load

Light Load

Locality and Load Balance

PAGE 71

Page 72: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Gopalakrishnan et al., ICDCS’04)

Overloaded

Moderate Load

Light Load

Locality and Load Balance

PAGE 72

Page 73: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Gopalakrishnan et al., ICDCS’04)

Locality and Load Balance

PAGE 73

No replication

Page 74: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Gopalakrishnan et al., ICDCS’04)

Replicate to balance load!

Locality and Load Balance

PAGE 74

Page 75: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Gopalakrishnan et al., ICDCS’04)

Locality and Load Balance

PAGE 75

Page 76: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Locality and Load Balance

(Gopalakrishnan et al., ICDCS’04)

Replicate to balance load!

PAGE 76

Page 77: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Gopalakrishnan et al., ICDCS’04)

Locality and Load Balance

PAGE 77

Page 78: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Decentralized Replicas

(Gopalakrishnan et al., ICDCS’04)

Writes?

PAGE 78

Page 79: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?Decentralized decision

● Where to place replicas?At the requester

● How to propagate updates?ADR: Synchronous, P2P: read-only

Replication Decisions

(Gopalakrishnan et al., ICDCS’04)(Wolfson et al., TODS 1997)PAGE 79

Page 80: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 80

Page 81: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Scale Replication

022 73

Average Latency31 ms

PAGE 81

Page 82: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 82

Page 83: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Scale Replication

185 159

142189

136022 73

Average Latency113 ms

PAGE 83

Page 84: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 84

Page 85: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Scale Replication

136

185 159

142180

0022 73

Average Latency 95 ms

PAGE 85

Page 86: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Scale Replication

301136

159

185 0

169

0022 73

Average Latency64 ms

133

PAGE 86

Page 87: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Minimize cost of access

Take me to your leader!(Sharov et al., VLDB 2015)

Global Scale Replication

GPlacer (Zakhary et al., EDBT 2018)

Place data around the world

PAGE 87

Page 88: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Data Distribution Model

(Sharov et al., VLDB 2015)

S1

A B

S4

A B

S2

A B

S3

A B

Replicas

PAGE 88

Page 89: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Data Distribution Model

(Sharov et al., VLDB 2015)

S1

A B

S4

a b

S2

A B

S3

A B

Read-Write Replicas

Read ReplicasState

changesPAGE 89

Page 90: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Data Distribution Model

(Sharov et al., VLDB 2015)

S1

A B

S4

a b

S2

A B

S3

A B

Read-Write Replicas

Read ReplicasLeader State

changesCoordinates

PAGE 90

RWWL

Page 91: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Replication Problem

(Sharov et al., VLDB 2015)

● Select replicas

● Assign replica roles (read or read-write)

● Assign leader

PAGE 91

Page 92: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Replication Problem

(Sharov et al., VLDB 2015)

● Select replicas

● Assign replica roles (read or read-write)

● Assign leader

PAGE 92

Page 93: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Assign Leader

(Sharov et al., VLDB 2015)

Leader: site that minimizes access costs

PAGE 93

Page 94: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

+ median RTT( S1, {S1, S2, S3})

Write transaction cost

(Sharov et al., VLDB 2015)

S1

A B

S4

a b

S2

A B

S3

A B

Send write to leader

Quorum writes cost = RTT(S3, S1)

PAGE 94

RWWL

Page 95: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Assign Leader

(Sharov et al., VLDB 2015)

Leader: site that minimizes access costs

Client cost: RTT(client, replica) + cost( transaction )

PAGE 95

Page 96: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Weighting Client Cost

(Sharov et al., VLDB 2015)PAGE 96

Page 97: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Weighting Client Cost

(Sharov et al., VLDB 2015)PAGE 97

2 writes2 reads

10 writes20 reads

5 writes5 reads

Page 98: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Assign Leader

(Sharov et al., VLDB 2015)

Leader: site that minimizes access costs

Client cost: RTT(client, replica) + cost( transaction )

Cost: Weighted average of client costs

PAGE 98

Page 99: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Global Replication Problem

(Sharov et al., VLDB 2015)

● Select replicas

● Assign replica roles (read or read-write)

● Assign leader

PAGE 99

Page 100: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Assign Replica RolesLeader: minimizes median RTT to read-write replicas

(Sharov et al., VLDB 2015)

Read-write replicas:

PAGE 100

Page 101: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

+ median RTT( S1, {S1, S2, S3})

Write transaction cost

(Sharov et al., VLDB 2015)

S1

A B

S4

a b

S2

A B

S3

A B

Send write to leader

Quorum writes cost = RTT(S3, S1)

PAGE 101

RWWL

Page 102: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Assign Replica RolesLeader: minimizes median RTT to read-write replicas

(Sharov et al., VLDB 2015)

Read-write replicas: Lowest RTT to leader

PAGE 102

Page 103: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● Select replicas

● Assign replica roles (read or read-write)

● Assign leader

Global Replication Problem

(Sharov et al., VLDB 2015)PAGE 103

Page 104: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replica selection

(Sharov et al., VLDB 2015)

Read-write replicas: Lowest RTT to leader

Read replicas:

Leader: minimizes median RTT to read-write replicas

PAGE 104

Page 105: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replica selection

(Sharov et al., VLDB 2015)PAGE 105

Client cost: RTT(client, replica) + cost( transaction )

Page 106: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replica selection

(Sharov et al., VLDB 2015)PAGE 106

Client cost: RTT(client, replica) + cost( transaction )

Read replicas: Lowest RTT to clients

Page 107: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replica selection

(Sharov et al., VLDB 2015)

Read-write replicas: Lowest RTT to leader

Read replicas: Lowest RTT to clients

Leader: minimizes median RTT to read-write replicas

PAGE 107

Page 108: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

K-Means Replica selection

(Sharov et al., VLDB 2015)PAGE 108

Page 109: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

K-Means Replica selection

(Sharov et al., VLDB 2015)

97

121

7365

88 0

0

0

220

Average Latency55 ms

PAGE 109

219

169

Page 110: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

K-Means Replica selectionSelect replicas

(Sharov et al., VLDB 2015)

Assign leader and read-write replicas

PAGE 110

Page 111: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Leaderless Protocols

(Zakhary et al., EDBT 2018)

S1

A B

S4

a b

S2

A B

S3

A B

Any quorum member can coordinate

PAGE 111

Page 112: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hinted Hand off

301136

159(Zakhary et al., EDBT 2018)

PAGE 112

Page 113: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hinted Hand off

301136

159

295 < 301

(Zakhary et al., EDBT 2018)PAGE 113

Page 114: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Hinted Hand off

(Zakhary et al., EDBT 2018)

Hand off request from S1 to S2 if:

cost( S1 ) > RTT( S1, S2) + cost( S2 )

PAGE 114

cost( S ) = cost of executing request at S

Page 115: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?Centralized, given client workload

● Where to place replicas?Heuristic (clustering)

● How to propagate updates?Quorums / Leader-based (Sharov)

Replication Decisions

(Zakhary et al., EDBT 2018)(Sharov et al., VLDB 2015)PAGE 115

Page 116: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Intra-Region Latency

(Sivasubramanian et al., WWW 2005)

100+ ms

PAGE 116

Page 117: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Edge Nodes

(Sivasubramanian et al., WWW 2005)PAGE 117

Supports Static Data

Dynamic Data?

Page 118: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

GlobeDB

(Sivasubramanian et al., WWW 2005)PAGE 118

Page 119: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replication Granularity

(Sivasubramanian et al., WWW 2005)

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

PAGE 119

Page 120: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replication Granularity

(Sivasubramanian et al., WWW 2005)

Per-Record?High Overhead

PAGE 120

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 121: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replication Granularity

(Sivasubramanian et al., WWW 2005)

Per-Table?Inflexible

PAGE 121

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 122: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)

When would these be replicated together?

PAGE 122

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 123: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)

Abryan = <r1,...rn,w1,...wn>

PAGE 123

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 124: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)

Sim(Abryan,Ajustin) ≥τ?

PAGE 124

Ajustin = <r1,...rn,w1,...wn>

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 125: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)

Shared Replication Scheme

PAGE 125

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 126: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)

Ap1 = <r1,...rn,w1,...wn>

PAGE 126

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 127: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)

Aavril = <r1,...rn,w1,...wn>

PAGE 127

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Sim(Ap1,Aavril) ≥τ?

Page 128: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)PAGE 128

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

Page 129: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access-Driven Replicas

(Sivasubramanian et al., WWW 2005)PAGE 129

ID ARTIST

1 Bryan Adams

2 Justin Bieber

3 Avril Lavigne

4 Kanye West

5 Drake

6 David Guetta

7 Ed Sheeran

Page 130: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Transaction Processing

(Sivasubramanian et al., WWW 2005)

Origin Server:Decide Partitions, Place Replicas, Place Master

PAGE 130

Page 131: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Transaction Processing

(Sivasubramanian et al., WWW 2005)

W[B]

PAGE 131

M

Page 132: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Transaction Processing

(Sivasubramanian et al., WWW 2005)

Push updates

PAGE 132

M

Page 133: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Transaction Processing

(Sivasubramanian et al., WWW 2005)

R[B]

PAGE 133

M

Page 134: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Transaction Processing

(Sivasubramanian et al., WWW 2005)

Push updates R[B]

PAGE 134

M

Page 135: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Transaction Processing

(Sivasubramanian et al., WWW 2005)PAGE 135

M

M

Page 136: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replica and Master Placement

(Sivasubramanian et al., WWW 2005)PAGE 136

Page 137: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Read Latency vs. Bandwidth

(Sivasubramanian et al., WWW 2005)PAGE 137

M

Page 138: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Read Latency vs. Bandwidth

(Sivasubramanian et al., WWW 2005)PAGE 138

M

Read Latency

Bandwidth

𝜶

𝜷

Master?

Page 139: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Master Partition Placement

(Sivasubramanian et al., WWW 2005)PAGE 139

Page 140: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Master Partition Placement

(Sivasubramanian et al., WWW 2005)PAGE 140

Page 141: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Placement Heuristic

(Sivasubramanian et al., WWW 2005)PAGE 141

M

100% Threshold

Page 142: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Placement Heuristic

(Sivasubramanian et al., WWW 2005)PAGE 142

M

95% Threshold

Page 143: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Placement Heuristic

(Sivasubramanian et al., WWW 2005)PAGE 143

M

30% Threshold

Page 144: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Placement Heuristic

(Sivasubramanian et al., WWW 2005)PAGE 144

M

5% Threshold

Page 145: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Placement Heuristic

(Sivasubramanian et al., WWW 2005)

Minimize:𝛼 r + 𝛽 b

PAGE 145

M

0% Threshold

Page 146: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?Cost-based given requests

● Where to place replicas?Cost-based given requests

● How to propagate updates?Single-master, eventual consistency

Replication Decisions

(Sivasubramanian et al., WWW 2005) PAGE 146

Page 147: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Web Workload Characteristics

(Glasbergen et al., EDBT 2018)

Read-heavy

Cache misses are very painful

PAGE 147

Page 148: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Web Workload Characteristics

(Glasbergen et al., EDBT 2018)PAGE 148

Page 149: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Web Workload Characteristics

(Glasbergen et al., EDBT 2018)PAGE 149

Page 150: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Web Workload Characteristics

(Glasbergen et al., EDBT 2018)PAGE 150

Page 151: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predict(Q2)

Predictive Caching

(Glasbergen et al., EDBT 2018)

Q1 Forward(Q1)Result(Q1)

Cache(Q2)Q2

PAGE 151

Page 152: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Query Patterns (TPC-W)

(Glasbergen et al., EDBT 2018)PAGE 152

Page 153: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 153

Page 154: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building a Predictive Model

(Glasbergen et al., EDBT 2018)PAGE 154

Time

Q1 Q2 Q3 Q1 Q2 Q3

Q1

Q2 Q3

1 1

Page 155: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building a Predictive Model

(Glasbergen et al., EDBT 2018)PAGE 155

Time

Q1 Q2 Q3 Q1 Q2 Q3

Q1

Q2 Q3

1 1

1

Page 156: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building a Predictive Model

(Glasbergen et al., EDBT 2018)PAGE 156

Time

Q1 Q2 Q3 Q1 Q2 Q3

Q1

Q2 Q3

1 1

1

1

Page 157: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building a Predictive Model

(Glasbergen et al., EDBT 2018)PAGE 157

Q1

Q2 Q3

5 3

5

1All executed 5 times

100% probability Q2 follows Q1

20% probability Q1 follows Q3

Page 158: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Finding Parameter Mappings

(Glasbergen et al., EDBT 2018)PAGE 158

Page 159: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Finding Parameter Mappings

(Glasbergen et al., EDBT 2018)PAGE 159

Page 160: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Finding Parameter Mappings

(Glasbergen et al., EDBT 2018)PAGE 160

Page 161: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Finding Parameter Mappings

(Glasbergen et al., EDBT 2018)PAGE 161

Page 162: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predictive Caching

(Glasbergen et al., EDBT 2018)PAGE 162

Q1

Q2 Q3

Page 163: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predictive Caching

(Glasbergen et al., EDBT 2018)PAGE 163

Q1

Q2 Q3

Predictively Cache Q2

Page 164: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predictive Caching

(Glasbergen et al., EDBT 2018)PAGE 164

Q1

Q2 Q3

Predictively Cache Q3

Page 165: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Apollo Deployment

(Glasbergen et al., EDBT 2018)

A

PAGE 165

A

AA

A

A A

AR[B]

Page 166: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Apollo Deployment

(Glasbergen et al., EDBT 2018)PAGE 166

AR[B]

R[C]

Page 167: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Invalidations

(Glasbergen et al., EDBT 2018)

W[B]

PAGE 167

Page 168: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Invalidations

(Glasbergen et al., EDBT 2018)

Invalidations Limit Cache Effectiveness

PAGE 168

Page 169: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Session Semantics

(Glasbergen et al., EDBT 2018)

R[B]

W[B]

PAGE 169

Page 170: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Session Semantics

(Glasbergen et al., EDBT 2018)

Good fit for web data!

R[B]

PAGE 170

Page 171: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?Predictively based on requests

● Where to place replicas?Client edge cache, predictively

● How to propagate updates?Cache updates with sessions

Replication Decisions

(Glasbergen et al., EDBT 2018) PAGE 171

Page 172: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 172

Page 173: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 173

Page 174: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 174

Page 175: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replication for Availability

Failures are common

Data systems must remain available

PAGE 175

Page 176: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Replication for Availability

S1

C

S3 S2 S4 S5

DA

CA B

CDB

DB

CA

Tolerating r faults requires r + 1 replicas

B B B

PAGE 176

Page 177: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Lower Overhead

PAGE 177

B1

ReplicasData

XOR

B2 xor (B1 xor B2) = B1

B2

B1

B2

B1

B2

B1 xor B2

B1

B2

Page 178: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Erasure Coding

12...k

k + 1

...

k + r

12...k

Tolerating r faults requires (k+r)/k space

k partitionsr parity

partitions

Data

PAGE 178

Page 179: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Erasure Coding

S1 S3 S2 S4 S5

A1A2 A3 A4

W [ A ] Encode and storek = 2, r = 2

PAGE 179

Page 180: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Erasure Coding

S1 S3 S2 S4 S5

A1A2 A3 A4

R [ A ] Read and decodek = 2, r = 2

PAGE 180

Page 181: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Erasure CodingReduces storage overhead

Requires parallel retrieval

PAGE 181

Page 182: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Erasure Coded StorageWhere to place data

How to access dataEC-Store (Abebe, ICDCS 2018)

PAGE 182

Page 183: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

EC-Store Data Access

(Abebe et al., ICDCS 2018)

S1 S3 S2 S4

A1 A2 A3B1 B2 B3 C1

R[ A, B ]

R[ A, B ]Load aware

S5

R[ C ]k = 2, r = 1

PAGE 183

Page 184: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

EC-Store Data Access

(Abebe et al., ICDCS 2018)

Access Strategy: Minimize cost of access

Cost of site access: load at site + I/O at site

PAGE 184

Page 185: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

EC-Store Data Movement

(Abebe et al., ICDCS 2018)

S1 S3 S2 S4

A1 A2 A3B1 B2 B3 C1

R[ A, B ] R[ C ]

R[ A, B ]Load aware

S5

A3

k = 2, r = 1

PAGE 185

Page 186: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

EC-Store Data Movement

(Abebe et al., ICDCS 2018)

Move data to minimize cost of future accesses and balance system load

Model access patterns to predict future accesses

PAGE 186

Page 187: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?Fault tolerance requirements

● Where to place replicas?Dynamic movement, using access costs

● How to propagate updates?Synchronous updates

Replication Decisions

(Abebe et al., ICDCS 2018) PAGE 187

Page 188: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● Adaptive Replication

● Adaptive Partitioning

● Outlook

Road Map

PAGE 188

Page 189: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Partitioning

PAGE 189

Page 190: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions? ● Where to place partitions?

● How to execute multi-partition operations?

Partitioning Decisions

PAGE 190

Page 191: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Partitioning ● Iterative improvements

● Partitioning per request

● Considering the overall workload ○ Heuristics

PAGE 191

Page 192: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 192

Page 193: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 193

Page 194: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Physical Database Design

A B C D

1 2 4 2

2 4 6 8

3 6 7 5

PAGE 194

Page 195: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

A B C D

1 2 4 8

2 4 6 3

3 6 7 10

1248

Physical Database Design

PAGE 195

Page 196: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

A B C D

1 2 4 8

2 4 6 3

3 6 7 10

12482463

Physical Database Design

PAGE 196

Page 197: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

A B C D

1 2 4 8

2 4 6 3

3 6 7 10

1248246336710

Physical Database Design

PAGE 197

Page 198: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

A B C D

1 2 4 82 4 6 3

3 6 7 10

SELECT AVERAGE(C) FROM R WHERE R.D > 5;

Physical Database Design

PAGE 198

1248246336710

Page 199: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

A B C D

1 2 4 82 4 6 3

3 6 7 10

SELECT AVERAGE(C) FROM R WHERE R.D > 5;

AAA

...

...

CCC

DDD

Scan

Analytic Database Design

PAGE 199

Page 200: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

SELECT AVERAGE(C) FROM R WHERE R.D > 5;

AAA...

...

CCC

DDD

Need to know what to index upfront

Index on D

Analytic Database Design

PAGE 200

Page 201: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Range IndexingSELECT ... FROM R WHERER.D > 5;SELECT … FROM R WHERE R.D > 5 AND R.D < 10;SELECT … FROM R WHERE R.D > 10 AND R.D < 20;

PAGE 201

Page 202: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Database Cracking83

1000425122215

174221

Column D83

1000425122215

174221

Cracked Column D

Copy

(Idreos et al., CIDR 2007)PAGE 202

Page 203: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning83

1000425122215

174221

Cracked Column D

SELECT … WHERE R.D > 5

(Idreos et al., CIDR 2007)PAGE 203

Page 204: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

83

1000425122215

174221

SELECT … WHERE R.D > 5

Cracked Column D

PAGE 204

Page 205: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

83

1000425122215

174221

SELECT … WHERE R.D > 5

Cracked Column D

PAGE 205

Page 206: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

83

1000425122215

174221

SwapSELECT … WHERE R.D > 5

Cracked Column D

PAGE 206

Page 207: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

53

1000425122218

174221

SwapSELECT … WHERE R.D > 5

Cracked Column D

PAGE 207

Page 208: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

53

1000425122218

174221

SwapSELECT … WHERE R.D > 5

Cracked Column D

PAGE 208

Page 209: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

5314

251222108

174221

SwapSELECT … WHERE R.D > 5

Cracked Column D

PAGE 209

Page 210: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

5314

251222108

174221

SELECT … WHERE R.D > 5

Cracked Column D

PAGE 210

Page 211: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

5314

251222108

174221

SELECT … WHERE R.D > 5 AND R.D < 10

Only need to consider these

Cracked Column D

PAGE 211

Page 212: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

53148

10171225224221

SELECT … WHERE R.D > 5 AND R.D < 10

Cracked Column D

PAGE 212

Page 213: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

53148

10171225224221

Only need to consider these

SELECT … WHERE R.D > 10 AND R.D < 20

Cracked Column D

PAGE 213

Page 214: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Indexes via Partitioning

(Idreos et al., CIDR 2007)

53148

10171225224221

SELECT … WHERE R.D > 10 AND R.D < 20

Cracked Column D

Iterative Partitioning forIndexing

PAGE 214

Page 215: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Cracked Column D53148

10171225224221

Advanced Cracking Methods

Distribution

(Idreos et al., CIDR 2007)

Cracking: Extensions

PAGE 215

Page 216: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions?Iteratively, based on queries

● Where to place partitions?Sorted in memory

● How to execute multi-partition operations?N/A

Partitioning Decisions

(Idreos et al., CIDR 2007) PAGE 216

Page 217: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Exploratory Workloads?

App Usage Time

PAGE 217

Page 218: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Exploratory Workloads?

Usage By Device

PAGE 218

Page 219: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Exploratory Workloads?

Revenue By Device

PAGE 219

Page 220: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Exploratory Workloads?

Devices By Country

No upfront information, need generic partitioning!

PAGE 220

Page 221: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Initial Partitioning (KD-Tree)Depth Limits Division

64 MB

128 MB

256 MB

512 MB

(Shanbhag et al., SoCC 2017)PAGE 221

Page 222: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Heterogeneous Tree

(Shanbhag et al., SoCC 2017)

Contains more attributes!

PAGE 222

Page 223: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building the Partitioning

(Shanbhag et al., SoCC 2017)

A: 0.0B: 0.0C: 0.0D: 0.0

PAGE 223

Page 224: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building the Partitioning

(Shanbhag et al., SoCC 2017)

A: 1.0B: 0.0C: 0.0D: 0.0

PAGE 224

Page 225: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building the Partitioning

(Shanbhag et al., SoCC 2017)

A: 1.0B: 0.5C: 0.0D: 0.0

PAGE 225

Page 226: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building the Partitioning

(Shanbhag et al., SoCC 2017)

A: 1.0B: 0.5C: 0.5D: 0.0

PAGE 226

Page 227: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building the Partitioning

(Shanbhag et al., SoCC 2017)

A: 1.0B: 0.5C: 0.5D: 0.5

PAGE 227

Page 228: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Building the Partitioning

(Shanbhag et al., SoCC 2017)

A: 1.0B: 0.75C: 0.5D: 0.5

PAGE 228

Page 229: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Q1 =σD≤45

Adaptive Partitioning

(Shanbhag et al., SoCC 2017)PAGE 229

Page 230: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Partitioning

(Shanbhag et al., SoCC 2017)

Q2 =σA≧125Refine partitioning per the workload!

PAGE 230

Page 231: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adaptive Partitioning: When?

(Shanbhag et al., SoCC 2017)

Q1 =σD≤45Q1, Q2, Q3, Q1, ...

PAGE 231

Page 232: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Swap Operation

(Shanbhag et al., SoCC 2017)

Q1 =σD≤45Q1, Q2, Q3, Q1, ...

PAGE 232

Rewrite Tree

Page 233: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Push Up Operation

(Shanbhag et al., SoCC 2017)PAGE 233

Q1 =σD≤45Q1, Q2, Q3, Q1, ...

Push up

Page 234: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Push Up Operation

(Shanbhag et al., SoCC 2017)PAGE 234

Q1 =σD≤45Q1, Q2, Q3, Q1, ...

Logical Movement

Page 235: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Divide and ConquerQ1 =σD≤45

Get Best Subtree

Get Best Subtree

PAGE 235(Shanbhag et al., SoCC 2017)

Page 236: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions?Upfront then iteratively, based on queries

● Where to place partitions?Rely on HDFS

● How to execute multi-partition operations?Rely on HDFS

Partitioning Decisions

(Shanbhag et al., SoCC 2017) PAGE 236

Page 237: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

PAGE 237

Page 238: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Exploiting Workloads● Known ahead of time

● Parameterized

● Repetitive

PAGE 238

Page 239: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Exploiting Workloads - OLTP

PAGE 239

Warehouse

District

Customer

Page 240: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioning OLTPWrite [ W1, D1, C1 ]

S1

W1C1

D1

S2

W2C2

D2

PAGE 240

Page 241: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Partitioning OLTP

S1

W1C1

D1

S2

W2C2

D2

Write [ W1, D1, C2 ]

prepare to commit

commit

PAGE 241

Two phase commit

Page 242: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Per transaction partitioning

Workload based repartitioning

Partitioning OLTP

G-Store(Das et al., SoCC 2010)

L-Store (Lin et al., SIGMOD 2016)

Later in the tutorial

PAGE 242

Page 243: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Create group

Key Grouping

S1

C1

D1

S2

W2C2

D2

(Das et al., SoCC 2010)

W1

PAGE 243

Write[ W1, D1, C2 ]

Page 244: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Create group

Key Grouping

S1

C1

D1

S2

W2C2

D2

(Das et al., SoCC 2010)

W1

Join request

PAGE 244

Write[ W1, D1, C2 ]

Page 245: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Create group

Key Grouping

S1

C1

D1

S2

W2C2

D2

(Das et al., SoCC 2010)

W1

Join request

JoinedC2 Propagate

Txn ops

PAGE 245

Write[ W1, D1, C2 ]

Page 246: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Create group

Key Grouping

S1

C1

D1

S2

W2C2

D2

(Das et al., SoCC 2010)

W1

Join request

JoinedC2 Propagate

Txn ops

Delete group

PAGE 246

Write[ W1, D1, C2 ]

Page 247: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Create group

Key Grouping

S1

C1

D1

S2

W2C2

D2

(Das et al., SoCC 2010)

W1

Join request

JoinedC2 Propagate

Txn ops

Delete group

FreePAGE 247

Write[ W1, D1, C2 ]

Page 248: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Create group

Key Grouping

S1

C1

D1

S2

W2C2

D2

(Das et al., SoCC 2010)

W1

Join request

Joined

Propagate

Txn ops

Delete group

FreePAGE 248

Write[ W1, D1, C2 ]

Page 249: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

On demand transactional partitioning

Key Grouping

Works best when groups are small and transactions contain multiple operations

(Das et al., SoCC 2010)

But groups are transient

PAGE 249

Page 250: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

Repartition data via localization for single site execution

Dynamic partitioning based on transaction patterns

PAGE 250

Page 251: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2C2

D2

Ownership information

D2 S2

C1 S1

C2 S2

W1 S1

W2 S2

D1 S1

PAGE 251

Page 252: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2C2

D2

Ownership information

D2 S2

C1 S1

C2 S2

W1 S1

W2 S2

D1 S1

PAGE 252

Page 253: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2C2

D2

Write [ W1, D1, C1 ]

W1 S1

W2 S2

D1 S1

D2 S2

C1 S1

C2 S2

PAGE 253

Page 254: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2C2

D2

Owner request

W1 S1

W2 S2

D1 S1

D2 S2

C1 S1

C2 S2

PAGE 254

Write [ W1, D1, C2 ]

Page 255: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2C2

D2TransferW1 S1

W2 S2

D1 S1

D2 S2

C1 S1

C2 S2

PAGE 255

Owner request

Write [ W1, D1, C2 ]

Page 256: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2 D2Transfer

ResponseC2

Txn ops

C2

W1 S1

W2 S2

D1 S1

D2 S2

C1 S1

C2 S1

PAGE 256

Owner request

Write [ W1, D1, C2 ]

Page 257: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

S1

W1C1

D1

S2

W2 D2C2

Txn ops

W1 S1

W2 S2

D1 S1

D2 S2

C1 S1

C2 S1

PAGE 257

Write [ W1, D1, C2 ]

Page 258: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Execution

(Lin et al., SIGMOD 2016)

Dynamic partitioning based on per transaction patterns

Does not consider workload overall

PAGE 258

Page 259: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions?Transaction localization

● Where to place partitions?At requester

● How to execute multi-partition operations?L-Store protocol

Partitioning Decisions

(Lin et al., SIGMOD 2016) PAGE 259

Page 260: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions?Key groups, temporarily

● Where to place partitions?Key group leader

● How to execute multi-partition operations?Key group protocol

Partitioning Decisions

(Das et al., SoCC 2010) PAGE 260

Page 261: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Localizing Transactions

A, B

W[A,B]

Commit

Commit Locally Without Synchronization!

PAGE 261

C, D

Page 262: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Constructing the Graph

ID Name

A Alice

B Bob

C Carol

From a workload trace

(Curino et al., VLDB 2010)PAGE 262

Page 263: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Constructing the Graph

ID Name

A Alice

B Bob

C Carol

Add traced transactions: R[A,B], 3x W[A,C]

(Curino et al., VLDB 2010)PAGE 263

Page 264: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Constructing the Graph

ID Name

A Alice

B Bob

C Carol

Add node weights (size, load)

(Curino et al., VLDB 2010)PAGE 264

Page 265: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Constructing the Graph

ID Name

A Alice

B Bob

C Carol

Min-cut edges subject to weight imbalance

k=2(Curino et al., VLDB 2010)

PAGE 265

High Edge Cuts!

Page 266: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Constructing the Graph

ID Name

A Alice

B Bob

C Carol

Min-cut edges subject to weight imbalance

k=2(Curino et al., VLDB 2010)

PAGE 266

Imbalanced!

Page 267: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Constructing the Graph

ID Name

A Alice

B Bob

C Carol

Min-cut edges subject to weight imbalance

k=2(Curino et al., VLDB 2010)

PAGE 267

Page 268: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adding Replica SupportR[A,B] 3x W[A,C]

(Curino et al., VLDB 2010)PAGE 268

Page 269: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Adding Replica SupportR[A,B] 3x W[A,C]

(Curino et al., VLDB 2010)

Holistic Partitioning/Replication

Offline and Periodic

PAGE 269

Page 270: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Access Patterns Change!

(Nicoara et al., EDBT 2015)

W[A,B], 3x W[A,C], W[A,D], W[C,D] 3x R[B,C]

PAGE 270

W(P1)=4, W(P2)=10, EC=8

Page 271: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Phases

(Nicoara et al., EDBT 2015)

Phase 1

PAGE 271

Page 272: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Phases

(Nicoara et al., EDBT 2015)

Phase 2

Rule:- Movement doesn’t overload- Move best-gain candidates- If overloaded, must move!

PAGE 272

Logical Movement, then Migrate

Page 273: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Phases

(Nicoara et al., EDBT 2015)

Already Overloaded

Phase 1

W(P1)=4, W(P2)=10, EC=8, Bounds: (6,8)

PAGE 273

Page 274: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Phases

(Nicoara et al., EDBT 2015)

Phase 2

W(P1)=4, W(P2)=10, EC=8, Bounds: (6,8)

Gain=0

PAGE 274

Page 275: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Phases

(Nicoara et al., EDBT 2015)

Already Overloaded

Phase 1

W(P1)=5, W(P2)=9, EC=8, Bounds: (6,8)

PAGE 275

Page 276: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Phases

(Nicoara et al., EDBT 2015)

Phase 2

W(P1)=5, W(P2)=9, EC=8, Bounds: (6,8)

Gain=-1

PAGE 276

Page 277: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Convergence

(Nicoara et al., EDBT 2015)

W(P1)=6, W(P2)=8, EC=9, Bounds: (6,8)

PAGE 277

Stable

Page 278: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions?Graph partitioning

● Where to place partitions?Based on partitioning

● How to execute multi-partition operations?2PC

Partitioning Decisions

PAGE 278(Curino et al., VLDB 2010) (Nicoara et al., EDBT 2015)

Page 279: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Graph Partitioning

X

A

B C

DMinimizes total number of distributed transactions

Ignores per node involvement

PAGE 279

Page 280: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Balance load and minimize distributed transactions

Adaptive Database Partitioning

PAGE 280

Database elasticity

Clay (Serafini et al., VLDB 2015)

P-Store (Taft et al., SIGMOD 2018)

E-Store (Taft et al., VLDB 2014)

Page 281: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Considering Distributed Cost

(Serafini et al., VLDB 2016)

General Graph Partitioning:

Clay:

minimize # edge cuts

load(Si) < (1 + ε) avg load( S )

load(Si) = ∑ w(v)

load(Si) = ∑ w(v) + k ∑ w( uv ) (v at Si)(u not at Si)

(v at Si)General:

such that:

Distributed cost

load balanced

PAGE 281

Page 282: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Repartitioning Cost

(Serafini et al., VLDB 2016)

General Graph Partitioning: minimize # edge cuts

Clay: minimize # edge cuts

cost of repartitioning

and# of vertices mapped to new partitions

PAGE 282

Page 283: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

F

E

D

C

B

A

S1 S2

S3

MedLow HighClumping

PAGE 283

Page 284: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

F

E

D

C

B

A

S1 S2

S3

G

MedLow HighClumping

PAGE 284

Page 285: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

F

E

D

C

B

A

S1 S2

S3

G Form clump

MedLow HighClumping

PAGE 285

Page 286: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

F

E

D

C

B

A

S1 S2

S3

G Migrate clump

Increases cost

MedLow HighClumping

PAGE 286

Page 287: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

F

E

D

C

B

A

S1 S2

S3

G

MedLow HighClumping

PAGE 287

Page 288: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

F

E

D

C

B

A

S1 S2

S3

G Expand clump

MedLow HighClumping

PAGE 288

Page 289: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

EF

D

C

B

A

S1

S2

S3

GMigrate clump

MedLow HighClumping

PAGE 289

Page 290: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

FE

D

C

B

A

S1

S2

S3

GExpand clump

MedLow HighClumping

PAGE 290

Page 291: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

FE

D C

B

A

S1

S2

S3

GMigrate clump

MedLow HighClumping

PAGE 291

Page 292: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

FE

D

C

B

A

S1

S2

S3

G

MedLow HighClumping

PAGE 292

Termination

Page 293: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Clumping

(Serafini et al., VLDB 2016)

Expands clumps to frequently accessed neighbours

Consider moving clump to lightly loaded sites

Considers both re-partitioning and load costs

PAGE 293

Page 294: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity

(Taft et al., VLDB 2015)

F

E

D

C

B

A

S1 S2

S3

MedLow High

PAGE 294

Page 295: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity

(Taft et al., VLDB 2015)

F

E

D

C

B

A

S1 S2

S3

MedLow High

PAGE 295

Page 296: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity

(Taft et al., VLDB 2015)

F

E

D

C

B

A

S1 S2

S3

MedLow High

S4

PAGE 296

Page 297: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity

(Taft et al., VLDB 2015)

Repartition to elastically add or remove nodes

PAGE 297

Page 298: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity

(Taft et al., VLDB 2015)

F

E

D

C

B

A

S1 S2

S3

MedLow High

S4

PAGE 298

Page 299: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity

(Taft et al., VLDB 2015)

F

E

D

C

B

A

S1 S2

MedLow High

S4

PAGE 299

Page 300: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Elasticity Decisions

(Taft et al., VLDB 2015)

When the average load:

increases: add nodes

decreases: remove nodes

PAGE 300

Page 301: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Two Tier Data Placement

(Taft et al., VLDB 2015)

Identify hot data

Evenly distribute hot data

Distribute cold data over remaining capacity

PAGE 301

Page 302: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Identifying Hot Data

(Taft et al., VLDB 2015)

Monitor partition level access frequency

If hot partition enable tuple level monitoring

Reacts to changes in load

PAGE 302

Page 303: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 303

Page 304: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 304

Page 305: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 305

Page 306: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 306

Page 307: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 307

Page 308: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 308

Page 309: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Reactive Elasticity

(Taft et al., VLDB 2015)

Load

Capacity

Time

PAGE 309

Page 310: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Ideal Elasticity

(Taft et al., SIGMOD 2018)

Load

Capacity

Time

predict the function

PAGE 310

Page 311: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Periodic Workloads

(Taft et al., SIGMOD 2018)

Daily load variations

Seasonal load spikesPAGE 311

Page 312: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How to Predict Load

(Taft et al., SIGMOD 2018)

load(t) = avg_load( t - pi) + change_in_load( t - ji)

= +Load Periodicity Trend

SPAR: Sparse Periodic Auto-Regression

PAGE 312

Page 313: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

decide the # of nodes

Ideal Elasticity

(Taft et al., SIGMOD 2018)

Load

Capacity

Time

PAGE 313

Page 314: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Number of Nodes

(Taft et al., SIGMOD 2018)

# of nodes Predicted Load

Load per Server=

Assuming partitionable

PAGE 314

Page 315: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

(Serafini et al., VLDB 2016)

● How to form partitions?Heuristically (Clumping versus 2 Tier)

● Where to place partitions?React or predict based on load

● How to execute multi-partition operations?2PC

Partitioning Decisions

(Taft et al., VLDB 2015)(Taft et al., SIGMOD 2018) PAGE 315

Page 316: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● Adaptive Replication

● Adaptive Partitioning

● Outlook

Road Map

PAGE 316

Page 317: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Outlook

PAGE 317

Page 318: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How to make a partitioning or replication decision when access patterns change?

Adaptive Systems

Adaptively replicate and partition

PAGE 318

Page 319: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How to form partitions?

● Where to place partitions?

● How to execute multi-partition operations?

Partitioning Decisions

Iterative, Temporarily, Graph partitioning, Heuristic

Sorted, Leader, At requester, Graph partitioning, Reactively, Predictively

Novel protocols, 2PCPAGE 319

Page 320: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?

● Where to place replicas?

● How to propagate updates?

Replication Decisions Decentralized, Client workload, Cost-based, Predictive, Fault tolerance

At requester, Heuristic, Cost-based, Predictive, Dynamic

Synchronous, Quorums, Single-master, Cache PAGE 320

Page 321: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

● How many replicas?

● Where to place replicas?

● Where to place partitions?

Decisions Predictively

Predictively

Predictively

PAGE 321

Page 322: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

How to make a partitioning or replication decision when access patterns change?

Adaptive & Predictive Systems

Adaptively and predictively replicate and partition

PAGE 322

Page 323: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predicting the FutureHow can your system predict its future workload?

Apollo: Predict future queries (Markov Model)

P-Store: Predict future load (SPAR)

PAGE 323

Page 324: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predicting the Future QB5000When, how many, and what queries will arrive?

(Ma et al., SIGMOD 2018)

Pre-process: remove parameters, creating templates

SELECT * FROM C WHERE id = “C1”

SELECT * FROM C WHERE id = $

PAGE 324

Page 325: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predicting the Future QB5000When, how many, and what queries will arrive?

(Ma et al., SIGMOD 2018)

Cluster: group templates by arrival rate

T1 T2 T3

PAGE 325

Page 326: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predicting the Future QB5000When, how many, and what queries will arrive?

(Ma et al., SIGMOD 2018)

Forecast: Predict clusters arrival rate (Ensemble of RNN, LR, KR)

PAGE 326

Page 327: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predicting the Future QB5000When, how many, and what queries will arrive?

(Ma et al., SIGMOD 2018)

Pre-process: remove parameters, creating templatesCluster: group templates by arrival rate

Forecast: Predict clusters arrival rate (Ensemble of RNN, LR, KR)

PAGE 327

Page 328: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

Predicting the FutureHow can your system predict its future workload?

Apollo: Predict future queries (Markov Model)

P-Store: Predict future load (SPAR)

QB5000: Predict query workloads (Ensemble of RNN, LR, KR)

PAGE 328

Page 329: Adaptive Replication and Partitioning in Data Systemsmtabebe/resources/... · Hierarchical P2P File Systems PAGE 67. Hierarchical P2P File Systems PAGE 68. Hierarchical P2P File Systems

If your system knew the future workload, how could it partition and replicate data?

Predicting the FutureHow can your system predict its future workload?

PAGE 329