eecs 491 introduction to distributed systems · 2019. 9. 26. · case study: gfs google file system...
EECS 491
Introduction to Distributed Systems
Fall 2019
Harsha V. Madhyastha
Case study: GFS
● Google File System
◆ Distributed storage system tailored to Google’s workload
● Workload characteristics and setting:
◆ Multi-GB files
◆ Files are mostly appended to
◆ Failures are extremely common
September 24, 2019 EECS 491 – Lecture 7 2
High-level Design
● Files are split into 64 MB chunks
● Every chunk is replicated on three randomly selected machines
● A central chunkmaster server picks replicas of every chunk
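A minimal sketch of the design above, following the slide's simplification that replicas are picked uniformly at random. The helper names (`chunk_index`, `num_chunks`, `pick_replicas`) and the server names are hypothetical, not part of GFS.

```python
import random

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as stated on the slide
REPLICATION_FACTOR = 3         # every chunk is stored on three machines


def chunk_index(byte_offset):
    """Map a byte offset within a file to the index of the chunk holding it."""
    return byte_offset // CHUNK_SIZE


def num_chunks(file_size):
    """Number of chunks needed for a file of file_size bytes (ceiling division)."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


def pick_replicas(servers, rng):
    """Hypothetical chunkmaster helper: pick distinct servers uniformly at random."""
    return rng.sample(servers, REPLICATION_FACTOR)


if __name__ == "__main__":
    servers = ["cs%d" % i for i in range(10)]  # made-up chunkserver names
    rng = random.Random(491)
    for c in range(num_chunks(3 * CHUNK_SIZE + 1)):
        print(c, pick_replicas(servers, rng))
```

A client that wants byte offset `b` of a file would ask the chunkmaster which machines hold chunk `chunk_index(b)`.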
GFS Overview
Replication in GFS
[Figure: Client, Primary, two Backups, Chunkmaster]
Replication in GFS
● High latency to distant primary
◆ In data center, bandwidth degrades with distance
[Figure: Client, Primary, two Backups, Chunkmaster]
Replication in GFS
● High latency to distant primary
● Submitting write to nearest replica will compromise total ordering of writes
[Figure: Client, Client2, Primary, two Backups, Chunkmaster]
Replication in GFS
● High latency to distant primary
● Writing to nearest replica compromises total ordering
● Optimize performance without violating consistency?
[Figure: Client, Client2, Primary, two Backups, Chunkmaster]
Data flow vs. Control flow
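One way to read this split, sketched under assumptions: the bulky data can be pushed to replicas in any order (for example, via the nearest replica), while the small write request still goes to the primary, which alone decides the order in which buffered data is applied. The classes `Replica` and `Primary` and their methods are hypothetical; network transfer, failures, and leases are not modeled.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.buffer = {}  # data_id -> bytes that were pushed but not yet applied
        self.chunk = []   # applied mutations, in the order they were applied

    def push_data(self, data_id, data):
        """Data flow: stash the bytes; nothing is applied to the chunk yet."""
        self.buffer[data_id] = data

    def apply(self, serial, data_id):
        """Control flow: apply buffered data at the position the primary chose."""
        assert serial == len(self.chunk), "mutations must be applied in serial order"
        self.chunk.append(self.buffer.pop(data_id))


class Primary(Replica):
    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups

    def write(self, data_id):
        """Assign the next serial number, then apply locally and at every backup."""
        serial = len(self.chunk)
        self.apply(serial, data_id)
        for backup in self.backups:
            backup.apply(serial, data_id)
        return serial


if __name__ == "__main__":
    b1, b2 = Replica("b1"), Replica("b2")
    p = Primary("p", [b1, b2])
    # Two clients push their data starting from different (nearest) replicas.
    for r in (b1, p, b2):
        r.push_data("x", b"X")
    for r in (b2, b1, p):
        r.push_data("y", b"Y")
    # The write requests reach the primary in the order y, then x.
    p.write("y")
    p.write("x")
    print(p.chunk, b1.chunk, b2.chunk)  # every replica holds the same order
```

Because only the small control message must travel to the possibly distant primary, the expensive transfer can follow whatever path is cheapest without affecting the total order of writes.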
GFS Performance Benchmark
Implementing RSMs
● Logical clock based ordering of requests
◆ Cannot serve requests if any one replica is down
● Primary-backup replication
◆ Replace primary/backup upon failure
Availability of P/B-based RSM
● When is RSM unavailable to serve requests?
● Temporarily:
◆ While primary is bootstrapping new backup
◆ Replica is down but viewservice has yet to detect it
● Permanently:
◆ Primary↔backup link is down but both can talk to viewservice
◆ Primary fails while bootstrapping backup
How to …
● … make RSM tolerant to network partitions?
● … ensure that operations don’t block even if some machines are unavailable?
RSM via Consensus
● Idea: Apply update if majority of replicas commit
● If 2f+1 replicas, need f+1 to commit
● Why majority? Why not fewer or more?
● Once a majority commits, the remaining replicas are too few to commit some other update
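The "why majority?" question can be checked by brute force. The sketch below (function names are illustrative) enumerates every quorum of a given size and tests whether any two quorums can be disjoint: with 2f+1 replicas, quorums of size f+1 always overlap, while quorums of size f do not.

```python
from itertools import combinations


def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def quorums_always_intersect(n, quorum_size):
    """True if every pair of quorums of this size shares at least one replica."""
    quorums = [set(q) for q in combinations(range(n), quorum_size)]
    return all(a & b for a in quorums for b in quorums)


if __name__ == "__main__":
    for f in (1, 2, 3):
        n = 2 * f + 1
        print(n, majority(n),
              quorums_always_intersect(n, f + 1),  # majority-sized quorums
              quorums_always_intersect(n, f))      # one replica fewer
```

Requiring more than f+1 replicas would also guarantee overlap, but the system would then stall with fewer than f failures; f+1 is the smallest size that keeps two conflicting updates from both committing.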
Context for Today’s Lecture
● Say all replicas are in sync with each other
● First: Among several concurrent new updates, how to pick next update to apply?
● Later: How to apply all updates in a consistent order at all replicas?
Let’s plan a camping trip!
● Before going away on internships, three friends plan on going camping
● Can only coordinate via unreliable text messages
● How to decide on a camp to meet at?
[Figure: Alice, Bob, Sam]
Strawman Approaches
● Every user sends their proposal to everyone
● Every user accepts first proposal received
● Proposal accepted by majority is chosen
● Why might this not work?

● Every user tags proposal with seq number
● Every user collects proposals and accepts highest seq number proposal
● Why might this not work?
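A small sketch of one way the first strawman can fail: if every user happens to see a different proposal first (for instance, their own), each proposal gets one vote and none reaches a majority. The function name and the camp names are made up for illustration.

```python
from collections import Counter


def first_proposal_wins(arrival_order):
    """Strawman 1: each user accepts the first proposal it receives.

    arrival_order maps each user to the proposals in the order they arrived.
    Returns the proposal accepted by a majority, or None if there is none.
    """
    accepted = Counter(order[0] for order in arrival_order.values())
    proposal, votes = accepted.most_common(1)[0]
    return proposal if votes > len(arrival_order) // 2 else None


if __name__ == "__main__":
    # Each user sees their own proposal first: a three-way split vote.
    split = {
        "Alice": ["Pinckney", "Waterloo", "Holly"],
        "Bob": ["Waterloo", "Holly", "Pinckney"],
        "Sam": ["Holly", "Pinckney", "Waterloo"],
    }
    print(first_proposal_wins(split))  # no majority, so nothing is chosen

    # A luckier delivery order: two users see the same proposal first.
    lucky = dict(split, Sam=["Waterloo", "Holly", "Pinckney"])
    print(first_proposal_wins(lucky))
```

Since acceptances are final in this strawman, a split vote leaves the group stuck, which is the motivation for a protocol that runs over multiple rounds.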
Paxos
● Original paper submitted in 1990
◆ Tells mythical story of Greek island of Paxos with “legislators” and “current law” passed through parliamentary voting protocol
● Widely used in industry today
Desirable Properties
● Safety
◆ “No bad things happen”
◆ System never reaches an undesirable state
● Liveness
◆ “Good things eventually happen”
◆ System makes progress eventually
● Tradeoff between consistency and latency
Desired Properties of Solution
● Safety:
◆ Choose a proposal only if accepted by a majority
◆ Choose from proposals made
● Liveness:
◆ If proposals exist, one will eventually be chosen
◆ If a proposal is chosen, all replicas will eventually discover that it was chosen
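The two safety conditions can be written as a predicate over a snapshot of the system. This is only an illustrative check with hypothetical names; liveness is a statement about eventual behavior and cannot be verified from a single snapshot like this.

```python
def is_safe_choice(chosen, proposed, accepted_by):
    """Check both safety conditions for a value that was declared chosen.

    proposed:    set of values that some proposer actually proposed
    accepted_by: dict mapping each acceptor to the value it accepted (or None)
    """
    votes = sum(1 for value in accepted_by.values() if value == chosen)
    accepted_by_majority = votes > len(accepted_by) // 2
    was_proposed = chosen in proposed
    return accepted_by_majority and was_proposed


if __name__ == "__main__":
    proposed = {"A", "B"}
    print(is_safe_choice("A", proposed, {"r1": "A", "r2": "A", "r3": "B"}))
    print(is_safe_choice("B", proposed, {"r1": "A", "r2": "A", "r3": "B"}))
```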
Project 2
● View service:
◆ Draw state diagram
◆ What events can cause view to change?
◆ What state determines how to change view?
◆ Think about cases not covered by unit tests
● Primary backup service:
◆ Carefully think about implications of every failed RPC
◆ Sleep for PingInterval before retrying RPC
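The retry advice could look roughly like the loop below. This is a language-neutral sketch in Python, not the project's actual API: `PING_INTERVAL` is a stand-in for the project's `PingInterval`, and `rpc` is a hypothetical callable returning `(ok, reply)`.

```python
import time

PING_INTERVAL = 0.1  # seconds; assumed value standing in for PingInterval


def call_with_retry(rpc, max_attempts=5, sleep=time.sleep):
    """Call rpc() until it reports success, sleeping between attempts.

    rpc is a hypothetical zero-argument callable returning (ok, reply).
    The sleep function is injectable so the loop can be tested without waiting.
    """
    for attempt in range(max_attempts):
        ok, reply = rpc()
        if ok:
            return reply
        if attempt < max_attempts - 1:
            sleep(PING_INTERVAL)  # back off before retrying
    raise TimeoutError("RPC did not succeed after %d attempts" % max_attempts)
```

Note that a failed RPC does not mean the request was not executed: the reply may simply have been lost. A retried request can therefore reach the server twice, so the handler needs to tolerate duplicates.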
Roles of a Process in Paxos
● Three conceptual roles
◆ Proposers propose values
◆ Acceptors accept values; chosen if majority accept
◆ Learners learn the outcome (chosen value)
● In reality, a process can play any/all roles
● Roles in bank account example?
● Roles in camping trip example?
● Roles if Paxos used to pass laws?
Paxos: High-level Intuition
● May be unable to reach consensus in one round
● So, protocol runs over multiple rounds
● In each round:
◆ Elect a leader
◆ Proposal by current leader accepted by majority
● Once a value is accepted by a majority, a different value won’t be proposed subsequently
Paxos Overview
● Three phases within each round
● Prepare phase (elect a leader):
◆ Proposer sends unique proposal no. to all acceptors
◆ Waits to get commitment from majority of acceptors
● Accept phase (get majority to accept):
◆ Proposer sends proposed value to all acceptors
◆ Waits to get proposal accepted by majority
● Learn phase (disseminate chosen value):
◆ Learners discover value accepted by majority
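The phases above can be sketched as a single-value Paxos round. The sketch fills in details the overview leaves out, following the standard protocol: an acceptor's commitment reports any value it already accepted, and the proposer adopts the highest-numbered such value instead of its own. The names `Acceptor`, `propose`, and `reachable` are illustrative; messages are modeled as direct method calls, and unreachable acceptors are simply skipped.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest proposal number committed to
        self.accepted_n = -1        # number of the accepted proposal, if any
        self.accepted_value = None  # value of the accepted proposal, if any

    def prepare(self, n):
        """Prepare phase: commit to ignoring proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, -1, None

    def accept(self, n, value):
        """Accept phase: accept unless committed to a higher-numbered proposal."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value, reachable=None):
    """Run one round with proposal number n.

    Returns the value accepted by a majority, or None if the round failed.
    reachable (hypothetical) is the subset of acceptors that can be contacted.
    """
    reachable = acceptors if reachable is None else reachable
    majority = len(acceptors) // 2 + 1

    # Prepare phase: collect commitments from the reachable acceptors.
    promises = [p for p in (a.prepare(n) for a in reachable) if p[0]]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prev_n, prev_value = max(((pn, pv) for _, pn, pv in promises),
                             key=lambda t: t[0])
    if prev_n >= 0:
        value = prev_value

    # Accept phase: ask the reachable acceptors to accept the value.
    acks = sum(1 for a in reachable if a.accept(n, value))

    # Learn phase (simplified): the return value is what learners would discover.
    return value if acks >= majority else None


if __name__ == "__main__":
    accs = [Acceptor() for _ in range(3)]
    print(propose(accs, 1, "A", reachable=accs[:2]))  # 2 of 3 is a majority
    print(propose(accs, 2, "B"))  # a later proposer adopts the chosen value
    print(propose(accs, 1, "C"))  # a stale proposal number gets no commitments
```

The second call illustrates the earlier intuition that once a value is accepted by a majority, a different value won't be proposed subsequently: any later majority overlaps the earlier one, so the new proposer learns the already-accepted value during the prepare phase.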