Coterie availability in sites

Flavio Junqueira and Keith Marzullo, University of California, San Diego. DISC, Krakow, Poland, September 2005


Page 1: Coterie availability in sites

Coterie availability in sites

Flavio Junqueira and Keith Marzullo

University of California, San Diego

DISC, Krakow, Poland, September 2005

Page 2: Coterie availability in sites

Multi-site systems

Emerging class of distributed systems
  Collection of sites across a WAN
  Multiple nodes in each site
  Share resources
    Data sets
    Computational power
  E.g. BIRN, Geon, TeraGrid, PlanetLab
Site failure
  All the nodes in a site simultaneously unavailable

Page 3: Coterie availability in sites

Site availability: BIRN

10 sites experience at least one outage

One site under 97%

Page 4: Coterie availability in sites

Improving availability

Better availability through replication
Coteries
  Set system of processes: a set of subsets of processes
  Each subset is called a quorum
  Minimal sets, pairwise intersect
Coteries are useful
  Distributed mutual exclusion
  Distributed registers
  Consensus through Paxos
Coterie availability in multi-site systems
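The coterie definition on this slide (quorums are minimal sets that pairwise intersect) can be checked mechanically. The following is a minimal sketch, not code from the talk; the function name `is_coterie` and the process names are hypothetical.

```python
from itertools import combinations

def is_coterie(quorums):
    """Return True if `quorums` is a coterie: nonempty quorums that
    pairwise intersect and none of which contains another (minimality)."""
    qs = [frozenset(q) for q in quorums]
    if not qs or any(not q for q in qs):
        return False
    for a, b in combinations(qs, 2):
        if not (a & b):        # intersection property violated
            return False
        if a <= b or b <= a:   # minimality violated: one quorum contains the other
            return False
    return True

# Majority quorums over three hypothetical processes p1, p2, p3.
majority = [{"p1", "p2"}, {"p1", "p3"}, {"p2", "p3"}]
print(is_coterie(majority))            # all pairs share a process
print(is_coterie([{"p1"}, {"p2"}]))    # disjoint quorums are rejected
```

The pairwise intersection is what makes coteries usable for mutual exclusion: two operations that each contact a full quorum always share at least one process.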

Page 5: Coterie availability in sites

Roadmap

System model
Availability metrics
  Previous deterministic metrics not necessarily good
  A new metric
Failure model
  Characterize failures using survivor sets
  Survivor sets: more expressive
Quorum construction
  Multi-site hierarchical construction
Practical issues
  Failure model in practice
  PlanetLab experiment
Conclusions

Page 6: Coterie availability in sites

System model

Set P of processes
  Pairwise connected by quasi-reliable asynchronous channels
  Process failure: crash
  Processes can recover
Set B of sites
  Partition of the set of processes
  Site failure: simultaneous failure of all the processes in the site
  Process failures are not independent
Execution
  Sequence of steps of processes
  $\mathcal{E}$: set of all executions
In a step s
  $F(s) = \{p : (p \in P) \wedge (p \text{ is faulty in } s)\}$
  $NF(s) = P \setminus F(s)$
  Available process in s: $p \in P$ is available if $p \notin F(s)$

Page 7: Coterie availability in sites

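Survivor sets

A set $S \subseteq P$ is a survivor set iff
  $\exists E \in \mathcal{E} : \exists s \in E : S = NF(s)$
  $\forall p \in S : \forall E \in \mathcal{E} : \forall s \in E : S \setminus \{p\} \neq NF(s)$

Example
  [Figure: processes grouped into sites; executions $\mathcal{E} = \{E_1, E_2, E_3, E_4\}$, where $E_1$ and $E_2$ have steps $s_1$, $s_2$ and $E_3$ and $E_4$ have step $s_1$; the sets $NF(s_i)$ and the resulting survivor sets]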
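The survivor-set definition can be applied directly to a finite collection of executions. The sketch below assumes each execution is given as a list of NF(s) sets, one per step; the function name `survivor_sets` and the executions used are hypothetical and are not the ones from the slide's figure.

```python
def survivor_sets(executions):
    """executions: iterable of executions, each a list of NF(s) sets (one per step).

    A set S qualifies if S = NF(s) for some step of some execution, and
    removing any single process p from S never yields another NF(s) set.
    """
    nf_sets = {frozenset(nf) for execution in executions for nf in execution}
    return {
        s for s in nf_sets
        if all(s - {p} not in nf_sets for p in s)
    }

# Three hypothetical executions over processes a, b, c.
executions = [
    [{"a", "b", "c"}, {"a", "b"}],   # c fails in the second step
    [{"a", "b", "c"}, {"b", "c"}],   # a fails in the second step
    [{"a", "b", "c"}],               # no failure
]
sp = survivor_sets(executions)
print(sorted(sorted(s) for s in sp))
```

Here {a, b, c} is not a survivor set because removing c gives {a, b}, which is itself the non-faulty set of some step.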

Page 8: Coterie availability in sites

Availability metrics

Traditional deterministic metrics
  Undirected graph: nodes = processes, edges = comm. links
  Node vulnerability: Minimal number of nodes
  Edge vulnerability: Minimal number of edges
Majority is optimal [Barbara and Garcia-Molina’86]
  Complete graphs

Page 9: Coterie availability in sites

A counterexample

[Figure: processes, sites, and survivor sets]

Majority
  Quorum: 5 processes
  In some step, no quorum can be formed
Using $S_p$ as quorums
  In every step, at least one quorum can be formed
Majority is not optimal

Page 10: Coterie availability in sites

Availability metrics

Traditional deterministic metrics
  Undirected graph: nodes = processes, edges = comm. links
  Node vulnerability: Minimal number of nodes
  Edge vulnerability: Minimal number of edges
Majority is optimal [Barbara and Garcia-Molina’86]
  Complete graphs
A new metric
  $A(\mathcal{Q})$, $\mathcal{Q}$ is a coterie
  Number of covered survivor sets in $\mathcal{Q}$
  A survivor set S is covered in $\mathcal{Q}$ if: $\exists Q \in \mathcal{Q} : Q \subseteq S$
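The metric A counts the survivor sets that contain at least one quorum. A minimal sketch of that count follows; the function name `availability` and the four-process system are hypothetical and serve only to show how two coteries compare.

```python
def availability(coterie, survivor_sets):
    """A(Q): number of survivor sets S for which some quorum Q satisfies Q <= S."""
    quorums = [frozenset(q) for q in coterie]
    return sum(
        1 for s in map(frozenset, survivor_sets)
        if any(q <= s for q in quorums)
    )

# Hypothetical system with processes a, b, c, d and three survivor sets.
survivors = [{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]

# A coterie built around the survivor sets.
coterie_1 = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
# Majority quorums: every subset of three out of four processes.
coterie_2 = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}, {"b", "c", "d"}]

print(availability(coterie_1, survivors))
print(availability(coterie_2, survivors))
```

In this toy system the first coterie covers every survivor set, while the majority coterie covers only {a, c, d}, because the two smaller survivor sets cannot contain a three-process quorum.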

Page 11: Coterie availability in sites

Failure model

Multi-site hierarchical model
  A set $F_s$ of subsets of B
    Subsets of simultaneously faulty sites
  An array $F_p$
    One entry per site
    Each entry: subsets of processes in the site
    Subsets of simultaneously faulty processes at a site

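  A survivor set S: $\exists F_S \in F_s$ such that
    $\forall B_i \notin F_S : \exists F_P \in F_p[i] : B_i \setminus F_P \subseteq S$
    $\forall B_i \in F_S : B_i \cap S = \emptyset$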

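Example
  [Figure: processes (P) grouped into sites (B): three sites $B_1$, $B_2$, $B_3$, each with three processes numbered 1, 2, 3]
  $F_s = \{\{B_1\}, \{B_2\}, \{B_3\}\}$
  $F_p[1]$, $F_p[2]$, $F_p[3]$: each is $\{\{x_i\} : i \in \{1,2,3\}\}$, where $x_i$ is process i of that site
  $S_p$: sets $\{x_i, x_j, y_k, y_l\}$ with $i, j, k, l \in \{1,2,3\}$, $i \neq j$, $k \neq l$, where x and y are processes of two different sites; one such family for each pair of sites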
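The survivor sets of the multi-site hierarchical model can be enumerated from Fs and Fp. The sketch below reads the survivor-set condition as follows, which is an interpretation based on the slide's example: pick one set of faulty sites from Fs, then for every remaining site pick one failure subset from Fp[i] and keep the other processes of that site. The function name and the process names of the form `B1.2` are hypothetical, and no minimality filtering is applied.

```python
from itertools import product

def survivor_sets(sites, Fs, Fp):
    """sites: dict mapping site name -> set of processes.
    Fs: iterable of sets of site names that can be faulty together.
    Fp: dict mapping site name -> list of sets of processes that can be
        faulty together at that site.
    """
    result = set()
    for faulty_sites in Fs:
        up = [b for b in sites if b not in faulty_sites]
        # Choose one failure subset F_P for each surviving site.
        for choice in product(*(Fp[b] for b in up)):
            alive = set()
            for b, faulty in zip(up, choice):
                alive |= set(sites[b]) - set(faulty)
            result.add(frozenset(alive))
    return result

# Three sites with three processes each, as in the slide's example.
sites = {f"B{i}": {f"B{i}.{k}" for k in (1, 2, 3)} for i in (1, 2, 3)}
Fs = [{"B1"}, {"B2"}, {"B3"}]                          # one site fails at a time
Fp = {b: [{p} for p in procs] for b, procs in sites.items()}  # one process per site fails

Sp = survivor_sets(sites, Fs, Fp)
print(len(Sp), {len(s) for s in Sp})   # 27 {4}
```

Every survivor set here has four processes, fewer than the five needed for a majority of nine, which is consistent with the counterexample to majority quorums.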

Page 12: Coterie availability in sites

Quorum construction

Optimal availability with respect to A
  Coterie $\mathcal{Q}$: $S_p = \mathcal{Q}$ OR $\mathcal{Q}$ dominates $S_p$
  Survivor sets in $S_p$ pairwise intersect
    If not, then optimally discarding survivor sets is NP-Complete
A special case: Qsite
  All subsets of B of size $f_s$ in $F_s$
  All subsets of size t of $B_i$ in $F_p[i]$, for every i
  E.g.: $f_s = 1$, t = 1
  [Figure: quorums across Site 1, Site 2, Site 3]
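For the threshold special case Qsite, whether a set of available processes contains a quorum can be decided by counting. The sketch below assumes a quorum consists of all but t processes from each of all but fs sites, which is one reading of the slide; the function name `contains_qsite_quorum` and the process names are hypothetical.

```python
def contains_qsite_quorum(alive, sites, fs, t):
    """Return True if `alive` contains a Qsite quorum.

    sites: dict mapping site name -> collection of processes.
    fs: threshold on the number of faulty sites.
    t: threshold on the number of faulty processes per site.
    """
    alive = set(alive)
    # A site is usable if at most t of its processes are missing from `alive`.
    usable_sites = sum(
        1 for procs in sites.values()
        if len(set(procs) & alive) >= len(procs) - t
    )
    return usable_sites >= len(sites) - fs

sites = {f"B{i}": [f"B{i}.{k}" for k in (1, 2, 3)] for i in (1, 2, 3)}

# Four processes, two from each of two sites: a quorum for fs = 1, t = 1.
print(contains_qsite_quorum({"B1.1", "B1.2", "B2.2", "B2.3"}, sites, fs=1, t=1))
# Five processes, but only one site has two or more available.
print(contains_qsite_quorum({"B1.1", "B1.2", "B1.3", "B2.1", "B3.1"}, sites, fs=1, t=1))
```

The second call shows that the construction is tied to the failure model: a set of five processes is a simple majority, yet it is not a Qsite quorum because it does not match any failure pattern the model allows.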

Page 13: Coterie availability in sites

Model in practice

Qsite
  $f_s$: Threshold on site failures
    Data on site availability
  t: Threshold on process failures
    Markov chains
    One Markov chain for each site
Transitions
  Failure transitions: same probability, homogeneous processes
  Repair transitions: variable probability, amount of resources used
[Figure: Markov chain with failure transitions and repair transitions]
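One way to turn a per-site Markov chain into a threshold t is to compute its stationary distribution and look at the probability of more than t simultaneous process failures. The sketch below assumes a discrete-time birth-death chain whose state is the number of faulty processes in the site, with the same failure probability in every state and a state-dependent repair probability. The chain structure, the function names, and all numbers are assumptions for illustration; the slides do not give them.

```python
def stationary_distribution(n, fail_prob, repair_probs):
    """Stationary distribution of a birth-death chain on states 0..n.

    State k is the number of faulty processes in the site.
    fail_prob: probability of moving k -> k+1, the same for every k < n.
    repair_probs[k]: probability of moving k -> k-1, for k = 1..n
        (index 0 is unused).
    """
    weights = [1.0]
    for k in range(1, n + 1):
        # Detailed balance: pi[k-1] * fail_prob == pi[k] * repair_probs[k]
        weights.append(weights[-1] * fail_prob / repair_probs[k])
    total = sum(weights)
    return [w / total for w in weights]

def prob_more_than_t_faulty(pi, t):
    """Long-run probability that more than t processes are faulty."""
    return sum(pi[t + 1:])

# Hypothetical site with three processes; repairs slow down as failures pile up.
pi = stationary_distribution(3, 0.01, [None, 0.5, 0.4, 0.3])
for t in range(3):
    print(t, prob_more_than_t_faulty(pi, t))
```

A threshold t could then be chosen as the smallest value for which this probability falls below a target, again under the assumptions stated above.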

Page 14: Coterie availability in sites

PlanetLab experiment

Toy application
  Paxos: quorums of acceptors
  Client accessing quorums
Hosts used
  Three sites: three from each site
  One UCSD host: proposer, learner
Three settings
  3Sites: One acceptor per site
    Quorum: two hosts
  3SitesMaj: All hosts
    Quorum: four hosts, majority from each of two sites
  SimpleMaj: All hosts
    Quorum: any five processes
[Figure: host locations at UC Davis, UT Austin, Duke, and UC San Diego]
SimpleMaj has worse availability
3SitesMaj has better availability
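The three quorum settings can be written as predicates over the set of reachable acceptor hosts. This is a sketch of the quorum rules as stated on the slide, not the code used in the experiment; the host names are hypothetical, and for 3Sites the first host of each site is assumed to be the single acceptor.

```python
SITES = {
    "UC Davis": ["davis-1", "davis-2", "davis-3"],
    "UT Austin": ["austin-1", "austin-2", "austin-3"],
    "Duke": ["duke-1", "duke-2", "duke-3"],
}

def alive_per_site(alive):
    """Number of reachable acceptor hosts at each site."""
    return {site: sum(h in alive for h in hosts) for site, hosts in SITES.items()}

def quorum_3sites(alive):
    # One acceptor per site (assumed: the first host); a quorum is two of them.
    return sum(hosts[0] in alive for hosts in SITES.values()) >= 2

def quorum_3sitesmaj(alive):
    # Four hosts: a majority (two of three) from each of two sites.
    return sum(count >= 2 for count in alive_per_site(alive).values()) >= 2

def quorum_simplemaj(alive):
    # Any five of the nine hosts.
    return sum(alive_per_site(alive).values()) >= 5

# Duke is unreachable and one host is down at each remaining site.
alive = {"davis-1", "davis-2", "austin-2", "austin-3"}
print(quorum_3sites(alive), quorum_3sitesmaj(alive), quorum_simplemaj(alive))
```

With a whole site unreachable and one further host down at each other site, only the site-aware 3SitesMaj rule still finds a quorum among the four remaining hosts.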

Page 15: Coterie availability in sites

The Bimodal model

Sites are survivor sets
  $S_p$ is not a coterie
“Throw out” survivor sets
  In general, optimal solution is NP-Complete
  Simple solution for this model
Practical issues
  Practical for two sites
  More than two sites: open problem
[Figure: grid of labels 00, 01, ..., 0t, ..., 0n; 10, 11, ..., 1t, ..., 1n; ...; t0, t1, ..., tt, ..., tn; ...; n0, n1, ..., nt, ..., nn]

Page 16: Coterie availability in sites

Conclusions

Coteries for multi-site systems
  Site failures: process failures not independent
A new metric
  Counts covered survivor sets
Multi-site hierarchical construction
  Practical
  Illustrated with Markov model
  Experiment shows better availability
Using majority quorums is not a good idea
  Not optimal
  Poor performance
Future work
  More experiments, more constructions, real deployment

Page 17: Coterie availability in sites

END

Page 18: Coterie availability in sites

Backup Slides

Page 19: Coterie availability in sites

Failure models

The multi-site hierarchical model
  A set $F_s$ of subsets of B
  An array $F_p$
    One entry per site
    Each entry: subsets of processes in the site

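  A survivor set S: $\exists F_S \in F_s$ such that
    $\forall B_i \notin F_S : \exists F_P \in F_p[i] : B_i \setminus F_P \subseteq S$
    $\forall B_i \in F_S : B_i \cap S = \emptyset$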

The bimodal model
  A set $F_s$ of subsets of B
    There is one site that is in no element of $F_s$
  An array $F_p$
  A survivor set S
    As in the previous model OR
    $\exists B_i \in B : S = B_i$

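Example
  [Figure: processes grouped into two sites $B_1$ and $B_2$, each with three processes numbered 1, 2, 3; the value of $F_s$ is not recoverable from the transcript]
  $F_p[1]$, $F_p[2]$: each is $\{\{x_i\} : i \in \{1,2,3\}\}$, where $x_i$ is process i of that site
  MSH: $S_p = \{\{x_i, x_j, y_k, y_l\} : i, j, k, l \in \{1,2,3\} \wedge i \neq j \wedge k \neq l\}$, where x and y are the processes of $B_1$ and $B_2$
  B: $S_p = \{\{x_i, x_j, y_k, y_l\} : i, j, k, l \in \{1,2,3\} \wedge i \neq j \wedge k \neq l\} \cup B$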

Page 20: Coterie availability in sites

Bimodal construction

Bimodal model
  By construction: Not all pairs of survivor sets intersect
Discard survivor sets until the remaining ones pairwise intersect
  Selecting optimally is NP-Complete
Solution: Remove |B|-1 survivor sets
  Survivor sets containing processes from multiple sites pairwise intersect
  Construction is also optimal with respect to metric A
A special case: Bsite
  All elements of $F_s$ have size $f_s$
  All elements of $F_p[i]$ have the same size t, for every i
  E.g.: $f_s = 1$, t = 1
  [Figure: quorums across sites $B_1$ and $B_2$]

Page 21: Coterie availability in sites

Site availability

Goals
  Show that sites are unavailable frequently enough
BIRN - Biomedical Informatics Research Network
  Test bed projects centered around brain imaging
  Currently: 19 universities, 26 research groups
Availability
  Monthly basis
  Pings (BIRN-CC)
  Storage broker logs
Site availability Jan/04-Aug/04
  Availability under 100%
    On average in 5 out of the 8 months

$\text{Availability} = \dfrac{\text{Total hours} - \text{Unplanned outages}}{\text{Total hours}} \times 100$
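The availability formula is a plain percentage and can be computed per site and per month. The sketch below uses made-up numbers, not BIRN measurements, and the function name `availability_percent` is hypothetical.

```python
def availability_percent(total_hours, unplanned_outage_hours):
    """Availability = (total hours - unplanned outages) / total hours * 100."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return (total_hours - unplanned_outage_hours) / total_hours * 100

# Hypothetical 31-day month with one day of unplanned outages.
total = 31 * 24
print(round(availability_percent(total, 24), 2))
```

Planned maintenance is excluded by construction, since only unplanned outage hours are subtracted.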

Page 22: Coterie availability in sites

Causes of site failures

Misconfigured software
Shared resources
  1. Storage
  2. Power circuits
  3. Cooling pipes
  4. Air conditioning
  5. Network