distributed consensus

2
Technical Reading How is your data stored, or, The Laws of the Imaginary Greek by Yonatan Zunger Aarthi Raghavendra This article talks about “Distributed Consensus”, in simple words, how do a group of distant computers connected by unreliable links working on a shared resource come to an agreement about what’s in it. The author explained it rather vividly by comparing it to the various law systems on the islands of ancient Greece. It makes sense as with both laws and files, many people want to read it, makes changes to it and do not want to spend a lot of time doing it. The two basic problems that need to be tackled are Simultaneous reading and Simultaneous writing. They can be handled in 4 different ways, each of which corresponds to the method implemented on each island. 1. The island of Pseudemoxos: Single Data Center – A single computer owned the resource and all the other systems who wished to read the contents of the shared resource made their own copies. In order to perform a write operation they had to take turns. The problem occurred when the system crashed halfway through a write leaving the data corrupted. This could however be solved by Journaling i.e., never over-write anything. This in turn made reading difficult due to the increased length of the journal which was taken care of by creating a digest of the information. Though this system had few obvious flaws such as its susceptibleness to a single crash, it ensured consistency and simplicity. 2. The island of Fotas: Eventual Consistency – Each system in the distributed network maintained its own copy of the resource and they made changes to their own copies. The changes were reflected to the other systems by sending their updates and receiving the other updates from all the other nodes on the network. The unique feature of the system was that consistency was not ensured immediately but rather eventually i.e., all the nodes were allowed to write at the same time without talking to each other however no one knew the exact state of the law at that instant but knew how it was a short time ago. This system is useful where successive writes were not performed once the data is written, such as uploading images onto a distributed network which will never be changed once written. 3. The island of Paxos: Quorum Decision – In order to perform a write operation on a resource a majority of the participants had to agree for the change to be made. It ensured strong consistency by guaranteeing: Read-after-write consistency, once the consensus condition had occurred for a proposed write/read, then every future read attempt would see that consensus Read-modify-write consistency, by not allowing other writes to intervene once a change to the resource had begun else the change will fail and everyone will know to try again. This system can be slow when the nodes are distributed across a large region. 4. The island of Siranos: Master Election – A node is selected as a “master”, who is responsible for all the changes related to a task. Strong consistency can be easily achieved as long as all the other nodes know whom to contact for any particular task and the master node is not overloaded with requests. Any node wishing to make changes related to a task or know the latest information on a task would contact the master. A central registry of masters is nothing more than another strongly-consistent store and then the responsible parties use their own, smaller, strongly

Upload: aarthi-raghavendra

Post on 20-Aug-2015

62 views

Category:

Software


3 download

TRANSCRIPT

Page 1: Distributed consensus

Technical Reading

How is your data stored, or, The Laws of the Imaginary Greek by Yonatan Zunger Aarthi Raghavendra

This article talks about “Distributed Consensus”, in simple words, how do a group of distant computers

connected by unreliable links working on a shared resource come to an agreement about what’s in it. The

author explained it rather vividly by comparing it to the various law systems on the islands of ancient

Greece. It makes sense as with both laws and files, many people want to read it, makes changes to it and

do not want to spend a lot of time doing it.

The two basic problems that need to be tackled are Simultaneous reading and Simultaneous writing. They can be handled in 4 different ways, each of which corresponds to the method implemented on each island.

1. The island of Pseudemoxos: Single Data Center – A single computer owned the resource and all

the other systems who wished to read the contents of the shared resource made their own copies.

In order to perform a write operation they had to take turns. The problem occurred when the

system crashed halfway through a write leaving the data corrupted. This could however be solved

by Journaling i.e., never over-write anything. This in turn made reading difficult due to the

increased length of the journal which was taken care of by creating a digest of the information.

Though this system had few obvious flaws such as its susceptibleness to a single crash, it ensured

consistency and simplicity.

2. The island of Fotas: Eventual Consistency – Each system in the distributed network maintained

its own copy of the resource and they made changes to their own copies. The changes were

reflected to the other systems by sending their updates and receiving the other updates from all

the other nodes on the network. The unique feature of the system was that consistency was not

ensured immediately but rather eventually i.e., all the nodes were allowed to write at the same

time without talking to each other however no one knew the exact state of the law at that instant

but knew how it was a short time ago. This system is useful where successive writes were not

performed once the data is written, such as uploading images onto a distributed network which

will never be changed once written.

3. The island of Paxos: Quorum Decision – In order to perform a write operation on a resource a

majority of the participants had to agree for the change to be made. It ensured strong consistency

by guaranteeing:

• Read-after-write consistency, once the consensus condition had occurred for a proposed

write/read, then every future read attempt would see that consensus

• Read-modify-write consistency, by not allowing other writes to intervene once a change to

the resource had begun else the change will fail and everyone will know to try again. This

system can be slow when the nodes are distributed across a large region.

4. The island of Siranos: Master Election – A node is selected as a “master”, who is responsible for

all the changes related to a task. Strong consistency can be easily achieved as long as all the other

nodes know whom to contact for any particular task and the master node is not overloaded with

requests. Any node wishing to make changes related to a task or know the latest information on

a task would contact the master. A central registry of masters is nothing more than another

strongly-consistent store and then the responsible parties use their own, smaller, strongly

Page 2: Distributed consensus

consistent store to maintain the task, but for reasons of scale and reliability it’s better to use the

Paxos method to build the central registry.

Therefore each of the systems have their own trade-offs and the selection depends on the client needs to

be satisfied and the guarantees to be provided.