
Midterm Examination ECE 419S 2016: Distributed Systems

Date: March 10, 2016

Instructor: Cristiana Amza

Department of Electrical and Computer Engineering

University of Toronto

Problem number   Maximum Score   Your Score
1                7
2                50
3                25
4                10
5                8
total            100

This exam is closed textbook and closed lecture notes. You have two hours to complete the exam. Use of computing and/or communicating devices is NOT permitted. You should not need any such devices. You can use a basic calculator if you feel it is absolutely necessary.

Work independently. Do not remove any sheets from this test book. Answer all questions in the space provided. No additional sheets are permitted. Scratch space is available at the end of the exam.

Write your name and student number in the space below. Do the same on the top of each sheet of this exam book.

Your Student Number

Your First Name

Your Last Name


Student Number: Name:

Problem 1. Physical Clocks (7 points)

a) (4 points) Using the following diagram for orientation, we are trying to derive an algorithm by which a process running on machine A estimates the offset of its clock relative to that of the process running on machine B. Please describe the algorithm, in terms of who sends what message and what timestamps should be put in each message sent. Assume that dTreq and dTres are the symbols we use for the network propagation delays. Then derive a formula by which A will compute the desired offset based on the timestamps used by the algorithm. Unless otherwise specified, you are allowed to make the same assumptions as Cristian’s algorithm.



b) (3 points) Suppose that in the previous figure, we cannot assume that dTreq and dTres are equal, as is often the case in wide-area networks. How does this affect the accuracy of A’s estimation?



Problem 2. Logical Clocks, Causal and Total Order Multicast (50 points)

Imagine a distributed implementation of a war game in which tanks fire at each other. Each participant uses a separate computer that communicates with others over the network, and selected local events are multicast to all game participants as follows. The implementation is such that, when a player A fires at B, A multicasts the fire event with a projectile carrying a certain number of charge units. Player B, upon receiving the fire event, evaluates the damage. Each direct hit received for a local tank typically causes a deterioration to the functionality of the tank proportional to the number of charge units of the projectile. For example, a tank could have 80% functionality remaining if hit once with a projectile charge of 1, and then 40% functionality remaining if hit again with a projectile charge of 2.

If a tank that is hit has no remaining functionality, the player who received the (last) direct hit to one of its tanks multicasts the event destroyed. Assume that no partial deterioration is disclosed to other players (only the destroyed event is multicast).

This implementation may result in a temporal anomaly due to network delays. As illustrated in the figure, assuming that a single hit can cause 100% deterioration, player C may observe the damage (“destroyed”) on B before the firing from A.

The following questions are about designing and explaining a solution to avoid temporal anomalies, which enables displaying the events in a consistent order, respecting causality, on all nodes.

Unless otherwise specified, assume that there are no node failures and that network channels are reliable, but messages may not pass on the network with full parallelism between them.

Further assume that we have three alternative solutions in our design to avoid this temporal anomaly: i) using vector timestamps (VTS) with clock increments on both send and receive events (VTS-1), ii) using vector timestamps (VTS) with clock increments only on multicast events (VTS-2) and iii) using total order multicast (TO-MCAST) based on Lamport clocks. Based on your designs, answer the following questions. For each question, design and explain the most efficient algorithm you can come up with.
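The damage rule described above can be sketched in a few lines. This is only an illustration: the 20% loss per charge unit is inferred from the example (charge 1 takes a tank from 100% to 80%, charge 2 from 80% to 40%), and the names `apply_hit` and `DAMAGE_PER_CHARGE` are hypothetical.

```python
DAMAGE_PER_CHARGE = 20  # percent lost per charge unit; inferred from the example


def apply_hit(functionality, charge):
    """Return (remaining functionality in percent, events to multicast)."""
    remaining = max(0, functionality - DAMAGE_PER_CHARGE * charge)
    # Partial deterioration stays private; only total loss is announced.
    events = ["destroyed"] if remaining == 0 else []
    return remaining, events


state, events = apply_hit(100, 1)    # first hit, charge 1
state, events = apply_hit(state, 2)  # second hit, charge 2
```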
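As a reminder of how the two vector-timestamp variants differ, here is a minimal sketch of the clock update rules only. It assumes the standard vector clock merge (element-wise maximum) on receipt; the function names and the `count_receive` flag are hypothetical, and the rule for when an event may be displayed is deliberately left out.

```python
def on_send(clock, me):
    """Both variants count a send (multicast) event in the sender's own slot."""
    clock = list(clock)
    clock[me] += 1
    return clock  # also used as the timestamp attached to the message


def on_receive(clock, msg_ts, me, count_receive):
    """Merge the message timestamp into the local clock.

    count_receive=True models VTS-1 (the receive is itself a counted event);
    count_receive=False models VTS-2 (only multicasts are counted).
    """
    merged = [max(a, b) for a, b in zip(clock, msg_ts)]
    if count_receive:
        merged[me] += 1
    return merged


ts = on_send([0, 0, 0], me=0)                           # node 0 multicasts
vts1 = on_receive([0, 0, 0], ts, me=1, count_receive=True)
vts2 = on_receive([0, 0, 0], ts, me=1, count_receive=False)
```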



a) (19 points) Describe your full solution for the design using TO multicast, with Lamport clocks, for the two cases below. As one part of your answer, make sure to specify the rule for when an event is displayed at a recipient. For each answer, indicate the number of total messages per displayed event. This means the total number of messages system-wide sent on behalf of each displayed event, starting from the multicast until the event is displayed on all nodes.

a1) (6 points) Solution and nr. of messages if the network is reliable and FIFO (from lecture notes).



a2) (7 points) Derive an efficient multicast-based algorithm that uses Lamport clocks, but has significantly fewer total messages than the solution from the lecture notes. For full marks, your solution should introduce as little extra latency as possible on the critical path of displaying each event. Your solution should work even if the network is reliable, but not FIFO, and we do not know an upper bound on the network delay. Clearly state the nr. of messages for your solution.



a3) (6 points) Give solution and nr. of messages if the network is reliable, but not FIFO, and the known upper bound on network delay on any link is DELTA (be precise in the use of DELTA towards the most efficient algorithm possible). You can assume full network parallelism for this part.



b) (12 points) Assume a working solution based on VTS-1 Vector timestamps to avoid the anomaly, i.e., with VTS count/increment at both send and receive events. When is a post displayed and what is the total number of messages system-wide sent on behalf of each displayed event? Explain briefly.

b1) (6 points) Solution and nr. of messages if the network is reliable and FIFO.

b2) (6 points) Solution and nr. of messages if the network is reliable, but not FIFO, and we do not know an upper bound on the network delay.



c) (6 points) Assume a working solution based on VTS-2 Vector timestamps where we count/increment only on multicast events. When is a post displayed and what is the total number of messages system-wide on behalf of each post? Explain briefly.

c1) Solution and nr. of messages if the network is reliable and FIFO.

c2) Solution and nr. of messages if the network is reliable, but not FIFO, and we do not know an upper bound on the network delay.



d) (5 points) Now assume that you are allowed to use a sequencer process as a lightweight point of centralization from which each process asks for and receives a sequence number just before each post. Briefly explain how this algorithm would work to solve the problem, and its number of messages system-wide, on behalf of each post.



e) (8 points) Finally, assume that we can use a Token Ring logical structure to propagate the posts tagged with Lamport clocks to all participants. Instead of sending N point-to-point messages for each multicast, we use a token which carries messages around the ring. Each node adds its post message to the token when it gets the token. Each node displays all messages that are carried on the token upon receipt of the token (in the order they were added). When the token comes all the way around, each node dequeues/deletes its own post from the token. Explain which algorithm would be the best and which would be the worst in terms of posting speed at large scales of millions of nodes, from all of the above: Lamport TO-Mcast, VTS-1, VTS-2, Token Ring, Sequencer-based, in each type of environment below. State the overheads that each post event generates in terms of CPU, network and memory, as a function (in big-O notation) of the number of nodes in the group, N, for each algorithm. Then choose the best and worst for each environment and justify your answer briefly.

e1) high bandwidth, high parallelism in network, high latency, high-end CPUs on all nodes in environment:

e2) high network bandwidth, very low parallelism in network, very low latency, low-end CPUs on all nodes in environment:
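The token-based propagation described in part e) can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reference solution: the function name `run_token_ring` is hypothetical, each node adds at most one post per token visit, and a sender is assumed to display its own post as soon as it adds it to the token.

```python
def run_token_ring(pending, hops):
    """Simulate `hops` token transfers around a ring of len(pending) nodes.

    pending[i] lists the posts node i wants to send, in order.
    Returns displayed[i]: the posts node i has displayed, in display order.
    """
    pending = [list(p) for p in pending]  # do not mutate the caller's lists
    n = len(pending)
    clocks = [0] * n                      # one Lamport clock per node
    displayed = [[] for _ in range(n)]
    token = []  # entries are (lamport_ts, origin, text), in the order added
    for hop in range(hops):
        i = hop % n
        # The node's own post has gone all the way around: delete it.
        token = [m for m in token if m[1] != i]
        # Display everything else carried on the token, in the order added.
        for ts, origin, text in token:
            clocks[i] = max(clocks[i], ts)
            displayed[i].append(text)
        # Add one pending post, tagged with the local Lamport clock.
        if pending[i]:
            clocks[i] += 1
            text = pending[i].pop(0)
            token.append((clocks[i], i, text))
            displayed[i].append(text)  # assumption: sender displays at once
    return displayed


logs = run_token_ring([["a"], ["b"], []], hops=6)
```

With three nodes, each post travels exactly one full circle, so two token rounds are enough for every node to have displayed both posts.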



Problem 3. Mutual Exclusion (25 points)

Part A. Mutual Exclusion with Lamport Clocks (12 points)

We design a version of the distributed Mutual Exclusion algorithm called Lamport-ME, based on ideas from TO-multicast based on Lamport clocks. Lamport-ME works as follows:

Each process adds each Acquire Request message received to a local priority queue sorted by Lamport clocks. Then, each process multicasts its Acquire Request message timestamped with the current Lamport clock to all processes, including itself. Upon receipt of an Acquire Request from process j, after adding this request to the local queue, the receiving process i sends a reply to j immediately.

A process enters the critical section if its own request is at the head of its queue and it has received all replies for its request from all other processes.

When leaving the critical section, a process pops the head of its queue and multicasts a Release message to all other processes. Each process pops the head of its queue upon receiving a Release message.

You are asked to compare Lamport-ME with the Ricart and Agrawala algorithm in terms of the following criteria:

a) (4 points) Correctness, fairness and deadlock freedom. Specify what the case is for each algorithm as they are stated, e.g., are they both correct, fair and deadlock free? Briefly justify your answer:
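The Lamport-ME rules above can be sketched as a small in-memory simulation. Several simplifications are assumptions of this sketch rather than part of the stated algorithm: messages are delivered synchronously by direct method calls, ties between equal Lamport timestamps are broken by process id, replies carry no timestamp, and all class and function names are hypothetical.

```python
import heapq


class Process:
    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.clock = 0
        self.queue = []          # min-heap of (lamport_ts, pid) requests
        self.replies = 0
        self.my_request = None

    def make_request(self):
        self.clock += 1
        self.my_request = (self.clock, self.pid)
        self.replies = 0
        return self.my_request   # multicast to all processes, including self

    def on_request(self, req):
        ts, sender = req
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.queue, req)
        return sender != self.pid  # True means: reply to the sender now

    def on_reply(self):
        self.replies += 1

    def can_enter(self):
        # Own request at the head of the queue and replies from all others.
        return (bool(self.queue) and self.queue[0] == self.my_request
                and self.replies == self.n - 1)

    def release(self):
        heapq.heappop(self.queue)
        self.my_request = None

    def on_release(self):
        heapq.heappop(self.queue)


procs = [Process(pid, 3) for pid in range(3)]


def multicast_request(p):
    req = p.make_request()
    for q in procs:              # delivered to everyone, including p itself
        if q.on_request(req):    # q replies immediately
            p.on_reply()


def multicast_release(p):
    p.release()
    for q in procs:
        if q is not p:
            q.on_release()


multicast_request(procs[0])
multicast_request(procs[1])
first = [p.pid for p in procs if p.can_enter()]
multicast_release(procs[0])
second = [p.pid for p in procs if p.can_enter()]
```

Because delivery is synchronous here, the sketch does not exercise the interleavings that the comparison questions below ask about; it only shows the bookkeeping each process performs.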



b) (4 points) Performance as latency to enter CS and number of messages (briefly justify your answer, including specifying exactly how many messages each protocol exchanges):

b1) Lamport-ME messages:

b2) R&A messages:

b3) Comparison:



c) (4 points) Each of the two algorithms uses a queue per node. For each of the two algorithms, does it matter for correctness and/or performance if that queue is kept in FIFO order or in Lamport clock sorted order for the events kept in the queue? Please show your reasons.

c1) Lamport-ME queue:

c2) R&A queue:

c3) Comparison:



Part B. Mutual Exclusion Using Token Ring (13 points)

A version of the Token Ring algorithm for mutual exclusion called Token Ring v1 is as follows:

The token contains the time t of the earliest known outstanding Acquire Request. Initially, time t carried by the token is not set (or set to NULL). You can assume that the “time” refers to either Logical or Physical clocks (a question on this will follow).

To enter the critical section, a node:

1. Timestamps its Acquire Request with the current time.

2. When a node gets the token with time t while waiting with its own Acquire Request from time Tr, it compares Tr to time t on the token:

- If Tr == t, hold token and enter critical section
- If Tr > t, pass token
- If t is not set or Tr < t, set token time to Tr, pass token and wait for token.

When leaving the critical section, a node sets the token time to null (i.e., unsets t on the token) and passes the token.
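The token-handling rules of Token Ring v1 can be written down as a short sketch. The function names are hypothetical, `None` stands for an unset token time, and the sketch assumes all request timestamps are distinct.

```python
def on_token(token_time, tr):
    """Decide what a waiting node does when the token arrives.

    token_time: time t on the token, or None if unset.
    tr: timestamp Tr of this node's own outstanding Acquire Request.
    Returns (action, new_token_time) with action "enter" or "pass".
    """
    if token_time is not None and tr == token_time:
        return "enter", token_time   # Tr == t: hold token, enter CS
    if token_time is None or tr < token_time:
        return "pass", tr            # record the earlier request, keep waiting
    return "pass", token_time        # Tr > t: an earlier request goes first


def on_leave():
    """Leaving the critical section: unset t and pass the token."""
    return "pass", None


def first_to_enter(requests, n, start=0, max_hops=100):
    """Circulate the token; requests maps node index -> Tr for waiting nodes."""
    token_time, node = None, start
    for hops in range(max_hops):
        if node in requests:
            action, token_time = on_token(token_time, requests[node])
            if action == "enter":
                return node, hops
        node = (node + 1) % n        # nodes without a request just pass
    return None, max_hops


winner, hops = first_to_enter({1: 5, 2: 3}, n=3)
```

In this run node 2 holds the earlier request (Tr = 3), so the token has to circle once to record that time and again to reach node 2 before it can enter.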

a) (4 points) State one advantage of Token Ring v1 compared to basic Token Ring. Explain briefly.



b) (4 points) State one disadvantage of Token Ring v1 compared to basic Token Ring.

c) (5 points) Does using Logical clocks for the timestamps in Token Ring v1 have any advantages or disadvantages compared to using Physical clocks, and does this depend on the clock skew bound?



Problem 4. Replication and Fault Tolerance (10 points)

a) (2 points) Quorum-based replica deployments normally consist of an odd number of nodes (replicas) in total, e.g., 3 or 5 replicas. Does it make sense to use an even number of replicas instead, e.g., 4? Justify your answer.

b) (8 points) Assume that a quorum-based replication system is implementing a data processing service with strong consistency (sequential consistency) on a WAN. Also assume that the workload running on this system is composed of 90% Read and 10% Write operations on shared objects, that each Read and Write operation takes the same CPU time, and that once we choose the sizes of Read and Write quorums, these sizes never change.

b1) How does the aggregate performance of the replicated system depend on the sizes of Read and Write quorums we choose? Show your reasoning and discuss with some concrete examples approximately what size Read and Write quorums you would choose to optimize performance.



b2) What factors does the availability of the replicated system depend on? As part of the answer, discuss with concrete examples how it depends or why it does not depend on the sizes of Read and Write quorums we choose. Show your reasoning.



Problem 5. Sequential Consistency (8 points)

Is the following sequence of events allowed with a sequentially-consistent Distributed System global state? Either way, please explain your answer for credit. If yes, please show the sequential order. W(x)1 means that we write 1 into x, R(x)1 means that a read of variable x returns 1.

Initially x = y = z = 0.

P1   W(x)1   W(z)1
P2   W(y)1
P3   R(x)1   R(z)0   R(y)0
P4   R(y)1   R(z)0   R(x)0
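A history like the one above can also be checked mechanically. The following is a brute-force sketch, assuming the usual definition of sequential consistency (a single global interleaving that preserves each process's program order and in which every read returns the most recent write, or the initial value). The function name and the tuple encoding of operations are hypothetical.

```python
def sequentially_consistent(programs, initial=0):
    """Search for a legal global order of the given per-process histories.

    programs: one list per process; each op is (kind, var, value) with kind
    "W" or "R". Returns a list of (process_index, op) or None if none exists.
    """
    def search(positions, memory, order):
        if all(pos == len(p) for pos, p in zip(positions, programs)):
            return order
        for i, prog in enumerate(programs):
            pos = positions[i]
            if pos == len(prog):
                continue
            kind, var, value = prog[pos]
            if kind == "R" and memory.get(var, initial) != value:
                continue  # this read cannot be scheduled next
            new_memory = dict(memory)
            if kind == "W":
                new_memory[var] = value
            nxt = positions[:i] + (pos + 1,) + positions[i + 1:]
            result = search(nxt, new_memory, order + [(i, prog[pos])])
            if result is not None:
                return result
        return None

    return search((0,) * len(programs), {}, [])


history = [
    [("W", "x", 1), ("W", "z", 1)],                # P1
    [("W", "y", 1)],                               # P2
    [("R", "x", 1), ("R", "z", 0), ("R", "y", 0)],  # P3
    [("R", "y", 1), ("R", "z", 0), ("R", "x", 0)],  # P4
]
result = sequentially_consistent(history)
```

The search is exponential in the number of operations, which is acceptable for a history of this size but not for long traces.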
