CHORD: A Scalable Peer-to-Peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
MIT Laboratory for Computer Science
Adapted for presentation in 527-07, UBC
Outline
- Background
- Chord
  - Naming
  - Searching
  - Storing data
  - Node join
  - Node leave
  - Fault tolerance
  - Load balancing
  - Simulation and experimental results
- Summary
P2P Background
What is P2P?
An application-level Internet on top of the Internet: every participating node acts as both a client and a server (a “servent”), and every node “pays” for its participation by providing access to (some of) its resources.
Properties:
- No central coordination
- No central database
- No peer has a global view of the system; global behavior emerges from local interactions
- All existing data and services are accessible from any peer
- Dynamic: peers and connections are unreliable
P2P applications
- File sharing and storage systems
- Peer information retrieval and P2P web search
- Peer data management, peer query reformulation
P2P Network Requirements
- Efficient data location and routing (scalability)
- Adaptable to changes (data and query)
- Self-organizing, ad hoc participation
- Robust and fault tolerant
![Page 7: CHORD: A Scalable Peer-to-Peer Lookup Service for …cs527/Lectures2010/9c-Chord-527-10.pdf · CHORD: A Scalable Peer-to ... data structure that maps “keys” to “buckets](https://reader034.vdocuments.us/reader034/viewer/2022051722/5a9da6e17f8b9a28388cc7a6/html5/thumbnails/7.jpg)
Question: how do we find the data? With an efficient index mechanism: the DHT.

Hash table
- A data structure that maps “keys” to “buckets”
- An essential building block in software systems

Distributed hash table (DHT)
- Similar, but spread across the Internet
- Relies on hashing to achieve load balancing

Interface -- every DHT node supports two operations:
- lookup(key): given a key, route messages toward the node holding that key
- insert(key, value)
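The two operations above can be sketched with a toy, single-process stand-in (the class and names are illustrative, not from the Chord paper; a real DHT routes each call across the network instead of touching a local dict):

```python
class ToyDHT:
    """A local stand-in for the DHT interface: lookup(key), insert(key, value)."""

    def __init__(self):
        self._table = {}          # key -> value; normally spread over many nodes

    def insert(self, key, value):
        self._table[key] = value  # in a real DHT: route to the node holding key

    def lookup(self, key):
        return self._table.get(key)  # in a real DHT: route messages toward key

d = ToyDHT()
d.insert("song.mp3", "peer-42")
print(d.lookup("song.mp3"))  # -> peer-42
```

The interesting part of a real DHT is entirely hidden inside those two calls: how the routing finds the responsible node in few hops, which is what Chord solves.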
![Page 8: CHORD: A Scalable Peer-to-Peer Lookup Service for …cs527/Lectures2010/9c-Chord-527-10.pdf · CHORD: A Scalable Peer-to ... data structure that maps “keys” to “buckets](https://reader034.vdocuments.us/reader034/viewer/2022051722/5a9da6e17f8b9a28388cc7a6/html5/thumbnails/8.jpg)
DHT Design Goals
- Low diameter: a theoretical bound on search cost
- Low degree: a limited number of neighbours
- Local routing decisions: greedy; nodes incrementally calculate a path to the target
- Load balance: a distributed hash function spreads keys evenly over the nodes
- Robustness: remains operational even under partial failure
- Availability: the system automatically adjusts its internal tables so that the node responsible for a key can always be found
![Page 9: CHORD: A Scalable Peer-to-Peer Lookup Service for …cs527/Lectures2010/9c-Chord-527-10.pdf · CHORD: A Scalable Peer-to ... data structure that maps “keys” to “buckets](https://reader034.vdocuments.us/reader034/viewer/2022051722/5a9da6e17f8b9a28388cc7a6/html5/thumbnails/9.jpg)
An Example DHT: Chord
![Page 10: CHORD: A Scalable Peer-to-Peer Lookup Service for …cs527/Lectures2010/9c-Chord-527-10.pdf · CHORD: A Scalable Peer-to ... data structure that maps “keys” to “buckets](https://reader034.vdocuments.us/reader034/viewer/2022051722/5a9da6e17f8b9a28388cc7a6/html5/thumbnails/10.jpg)
Consistent Hashing
A scheme that provides hash-table functionality such that the addition or removal of one bucket does not significantly change the mapping of keys to buckets.
A consistent hash function:
- Maps both the buckets and the keys into the same range; the consistent hash of a key is the bucket whose image is closest* to the key’s
- When buckets join or leave, only nearby buckets are affected
CHORD is a distributed consistent hash table.
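The join/leave property can be demonstrated in a few lines. This is a minimal sketch (function names and the ring size M are illustrative): buckets and keys hash onto the same circular range, and a key belongs to the first bucket at or after it on the ring.

```python
import hashlib

M = 2 ** 32  # size of the circular hash space (an illustrative choice)

def ring_hash(name):
    """Hash a bucket or key name onto the ring [0, M)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % M

def bucket_for(key, buckets):
    """Consistent hash: the first bucket at or after the key on the ring."""
    kid = ring_hash(key)
    ids = sorted((ring_hash(b), b) for b in buckets)
    for bid, b in ids:
        if bid >= kid:
            return b
    return ids[0][1]  # wrap around the ring

buckets = ["node-a", "node-b", "node-c"]
keys = ["file-%d" % i for i in range(8)]
before = {k: bucket_for(k, buckets) for k in keys}
# Remove one bucket: only the keys that bucket held get remapped.
after = {k: bucket_for(k, ["node-a", "node-b"]) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

By construction, every key in `moved` was on the removed bucket; keys on the surviving buckets keep their assignment, which is exactly the property the slide states.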
Chord Naming
- Key identifier = SHA-1(key)
- Node identifier = SHA-1(IP address)
- Both are uniformly distributed
- Both exist in the same ID space
How do we map key IDs to node IDs?
(Figure: a circular 7-bit ID space holding nodes N32, N90, N105 and keys K5, K20, K80; the “N” prefix marks a node ID, the “K” prefix a key ID.)
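The naming scheme can be written down directly. A sketch (real Chord keeps the full 160-bit SHA-1 output; truncating to m bits as below is just the slides' readability convention):

```python
import hashlib

m = 7  # bits in the identifier space, matching the 7-bit ring above

def chord_id(s, bits=m):
    """SHA-1 of the string, truncated to the m-bit identifier space."""
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % (2 ** bits)

# Node IDs hash the IP address; key IDs hash the key itself.
# Both land in the same space 0 .. 2^m - 1 and are directly comparable.
node = chord_id("10.0.0.1")
key = chord_id("song.mp3")
```

Because both identifiers live in one space, "which node owns this key" reduces to arithmetic on the ring, which the successor rule on the next slide makes precise.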
Successor Nodes

(Figure: an identifier circle for m = 3 with nodes 0, 1, 3 and keys 1, 2, 6.)

- successor(1) = 1
- successor(2) = 3
- successor(6) = 0

A key is stored at its successor: the node with the same ID or the next higher ID.
Note: successor(key) is different from successor(node)!
CHORD Search: Basic lookup
(Figure: nodes N10, N32, N60, N90, N105, N120 on the ring. N10 asks “Where is key 80?”; the query follows successor pointers around the ring until it reaches the successor of K80, which answers “N90 has K80”.)
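With successor pointers only, the basic lookup is a walk around the ring. A sketch using the node IDs from the figure (helper names are illustrative):

```python
def in_half_open(x, a, b):
    """True if x lies in the interval (a, b] on the circular ID space."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def basic_lookup(start, key, successor):
    """Follow successor pointers until key falls between a node and its
    successor; that successor is the node holding the key."""
    n, hops = start, 0
    while not in_half_open(key, n, successor[n]):
        n = successor[n]
        hops += 1
    return successor[n], hops

ring = [10, 32, 60, 90, 105, 120]
successor = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
owner, hops = basic_lookup(10, 80, successor)
# owner == 90: N90 holds K80, as in the figure.
```

This is correct but costs O(N) hops in the worst case, which motivates the finger tables on the next slide.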
CHORD Search: Acceleration of Lookups
Each node maintains a routing table with (at most) m entries, called the finger table:
- finger[i].start = (n + 2^(i-1)) mod 2^m
- finger[i].interval = [finger[i].start, finger[i+1].start)
- finger[i].node: the first node >= finger[i].start
The successor is the next node on the identifier ring, i.e. finger[1].node; the predecessor is the previous node on the identifier ring.
(Figure: the m = 3 ring with nodes 0, 1, 3 and keys 1, 2, 6, annotated with each node’s finger table.)

Node 0 (stores key 6):   start 1, 2, 4   interval [1,2), [2,4), [4,0)   finger 1, 3, 0
Node 1 (stores key 1):   start 2, 3, 5   interval [2,3), [3,5), [5,1)   finger 3, 3, 0
Node 3 (stores key 2):   start 4, 5, 7   interval [4,5), [5,7), [7,3)   finger 0, 0, 0
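The finger tables above follow mechanically from the start/successor definitions. A sketch for the m = 3 example (helper names are illustrative):

```python
m = 3
nodes = [0, 1, 3]

def successor_of(x):
    """First node with ID >= x, wrapping around the ring."""
    for cand in sorted(nodes):
        if cand >= x:
            return cand
    return min(nodes)

def finger_table(n):
    """List of (start, finger) pairs, start = (n + 2^(i-1)) mod 2^m."""
    starts = [(n + 2 ** (i - 1)) % 2 ** m for i in range(1, m + 1)]
    return [(s, successor_of(s)) for s in starts]

# finger_table(0) -> [(1, 1), (2, 3), (4, 0)], matching the table above.
```

Running it for nodes 1 and 3 reproduces their rows as well, which is a quick sanity check when working through ring examples by hand.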
CHORD Search: Acceleration of Lookups
(Figure: the same m = 3 ring with the finger tables of nodes 0, 1, and 3 as on the previous slide.)
- Each node stores information about only a small number of other nodes, and knows more about nodes closely following it than about nodes farther away.
- A node’s finger table generally does not contain enough information to determine the successor of an arbitrary key k; for example, node 3 cannot resolve a lookup for key 1 on its own.
- Repeated queries to nodes that immediately precede the given key will eventually lead to the key’s successor.
The algorithm
CHORD Search: An example
(Figure: a 7-bit identifier space with nodes N5, N10, N20, N32, N60, N80, N99, N110. N32 issues Lookup(K19). N32’s finger table (…, start 96, interval [96,32), finger 99) sends the query to N99; N99’s (…, starts 3, 35; intervals [3,35), [35,99); fingers 5, 60) sends it to N5; N5’s (…, starts 9, 13; intervals [9,13), [13,21); fingers 10, 20) sends it to N10. Key 19 belongs to (10, 20], so N10 is the predecessor of key 19 and its successor N20 holds K19.)

Lookups take O(log N) hops.
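The lookup just traced can be reproduced end to end. This is a simplified, in-process sketch of the paper's find_successor / closest_preceding_finger pair (global node list and helper names are illustrative), using the node set from the figure:

```python
m = 7
nodes = sorted([5, 10, 20, 32, 60, 80, 99, 110])

def succ_of(x):
    """First node ID >= x on the ring, wrapping around."""
    for c in nodes:
        if c >= x:
            return c
    return nodes[0]

def succ_ptr(n):
    """The node immediately after n on the ring."""
    return nodes[(nodes.index(n) + 1) % len(nodes)]

def in_open(x, a, b):       # x in (a, b) on the circle
    return (a < x < b) if a < b else (x > a or x < b)

def in_half_open(x, a, b):  # x in (a, b] on the circle
    return (a < x <= b) if a < b else (x > a or x <= b)

def fingers(n):
    return [succ_of((n + 2 ** (i - 1)) % 2 ** m) for i in range(1, m + 1)]

def closest_preceding_finger(n, key):
    for f in reversed(fingers(n)):  # largest finger that precedes the key
        if in_open(f, n, key):
            return f
    return n

def find_successor(n, key):
    path = [n]
    while not in_half_open(key, n, succ_ptr(n)):
        nxt = closest_preceding_finger(n, key)
        n = nxt if nxt != n else succ_ptr(n)  # fall back to the successor
        path.append(n)
    return succ_ptr(n), path

# find_successor(32, 19) -> (20, [32, 99, 5, 10]): N20 holds K19,
# via the same hops as in the figure.
```

Each hop at least halves the remaining distance to the key, which is where the O(log N) bound comes from.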
CHORD: Storing Data
- Search for the node responsible for holding the key
- Insert the key at that node
CHORD Node Joins
(Figure: node 6 joins the m = 3 ring {0, 1, 3}. Node 6 builds its own finger table: start 7, 0, 2; interval [7,0), [0,2), [2,6); finger 0, 0, 3. Finger entries at existing nodes that should now point to node 6 are updated, and key 6 moves from node 0 to node 6.)

Node 6 wants to join:
1. Initialize its fingers and predecessor: O(log^2 N)
2. Transfer keys: O(1)
3. Update the fingers of existing nodes: O(log^2 N)
Total cost: O(log^2 N)

Each node keeps a predecessor pointer.
CHORD Node Departures
(Figure: node 1 leaves the ring {0, 1, 3, 6}. Key 1 moves to node 1’s successor, node 3, and the finger and predecessor entries that referred to node 1 are updated.)

Node 1 wants to leave:
1. Transfer keys: O(1)
2. Remove the node: O(1)
3. Update the fingers and predecessors of the affected nodes: O(log^2 N)
Total cost: O(log^2 N)
CHORD: Fault Tolerance
- A basic “stabilization” protocol keeps nodes’ successor pointers up to date; this is sufficient to guarantee correctness of lookups.
- Every node runs stabilize periodically to find newly joined nodes.
- Fingers are also refreshed: finger[i].node = find_successor(finger[i].start).
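The stabilize/notify pair can be sketched on toy in-process node objects (simplified from the paper's pseudocode; fingers are omitted and names are illustrative):

```python
class Node:
    def __init__(self, nid):
        self.id = nid
        self.successor = self     # a lone node is its own successor
        self.predecessor = None

def between(x, a, b):
    """True if x lies strictly between a and b on the circular ID space."""
    return (a < x < b) if a < b else (x > a or x < b)

def notify(n, candidate):
    """candidate believes it may be n's predecessor."""
    if n.predecessor is None or between(candidate.id, n.predecessor.id, n.id):
        n.predecessor = candidate

def stabilize(n):
    """Check whether a node slipped in between n and its successor."""
    x = n.successor.predecessor
    if x is not None and between(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)

n1, n2 = Node(1), Node(6)
n2.successor = n1   # n2 joins, having found n1 via some lookup
stabilize(n2)       # n1 learns that n2 is its predecessor
stabilize(n1)       # n1 adopts n2 as successor; n2 learns its predecessor
# The two-node ring 1 -> 6 -> 1 is now consistent.
```

Run periodically at every node, these two small routines are what repairs the ring after the join sequence walked through on the next slide.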
CHORD: Fault Tolerance -- Stabilization after Join

(Figure: node n joins between np and ns. Before: succ(np) = ns, pred(ns) = np. After: succ(np) = n, pred(ns) = n.)

- n joins: its predecessor is nil; n acquires ns as its successor via some node n’
- n notifies ns that n is the new predecessor; ns acquires n as its predecessor
- np runs stabilize: np asks ns for its predecessor (now n) and acquires n as its successor
- np notifies n; n acquires np as its predecessor
- All predecessor and successor pointers are now correct
CHORD: Fault Tolerance -- Failure Recovery
- The key step in failure recovery is maintaining correct successor pointers (otherwise the worst case degenerates to the simple lookup).
- To help achieve this, each node maintains a successor list of its r nearest successors on the ring.
- If node n notices that its successor has failed, it replaces it with the first live entry in the list.
- stabilize will correct finger-table entries and successor-list entries that point to the failed node.
- Performance is sensitive to the frequency of node joins and leaves relative to the frequency at which the stabilization protocol is invoked.
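The "first live entry" rule is a one-liner. A sketch (function names and the liveness check are illustrative; a real node would probe its successors over the network):

```python
def repair_successor(successor_list, is_alive):
    """Replace a failed successor with the first live entry in the
    r-entry successor list; if all r have failed, the lookup cannot proceed."""
    for s in successor_list:
        if is_alive(s):
            return s
    raise RuntimeError("all r successors failed")

succ_list = [20, 32, 60]   # r = 3 nearest successors of some node
alive = {32, 60, 80}       # node 20 has just failed
new_succ = repair_successor(succ_list, lambda n: n in alive)
# new_succ == 32: the first live entry becomes the new successor.
```

The ring only breaks if all r entries fail before stabilize refreshes the list, which is why the list length r trades memory against tolerance of simultaneous failures.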
CHORD: Fault Tolerance -- Replication
Chord can store replicas of the data associated with a key at the k nodes succeeding the key.
(+) Increases data availability
(-) Enlarges the <key, value> database
CHORD: Load Balancing
Even though hashing is used, the number of keys stored on each node may vary considerably, because the node identifiers do not uniformly cover the entire identifier space.
Solution:
- Associate keys with virtual nodes
- Map multiple virtual nodes (e.g. log N, as suggested in the consistent hashing paper) to each real node
- This does not affect the worst-case performance: O(log(N log N)) = O(log N)
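The virtual-node idea can be sketched directly (ring size M, the per-node count V, and all names are illustrative): each real node is hashed onto the ring at V points, so it owns several small arcs of the identifier space instead of one arbitrary-sized arc.

```python
import hashlib
from collections import Counter

M = 2 ** 32
V = 8  # virtual nodes per real node; the paper suggests O(log N)

def ring_hash(s):
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % M

def positions(real_nodes, v=V):
    """Place each real node at v points on the ring (virtual ID -> real node)."""
    pos = {}
    for node in real_nodes:
        for i in range(v):
            pos[ring_hash(f"{node}#{i}")] = node
    return pos

def owner(key, pos):
    """The real node behind the first virtual ID at or after the key."""
    kid = ring_hash(key)
    ids = sorted(pos)
    for pid in ids:
        if pid >= kid:
            return pos[pid]
    return pos[ids[0]]  # wrap around

pos = positions(["node-a", "node-b", "node-c"])
load = Counter(owner(f"key-{i}", pos) for i in range(300))
# load: how many of the 300 keys each real node serves
```

Averaging over V arcs per node concentrates the per-node key count around its mean, which is the effect the experimental figures on the next slide quantify.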
Experimental Results -- Load Balance

Figure 1: The mean and the 1st and 99th percentiles of the number of keys stored per node in a 10^4-node network.
Figure 2: The 1st and 99th percentiles of the number of keys per node as a function of the number of virtual nodes mapped to each real node.

Number of nodes: N = 10^4
Number of keys: K = 10^5 to 10^6
Experimental Results -- Path Length
Number of nodes: N = 2^k
Number of keys: K = 100 * 2^k
k ranges from 3 to 14
Experimental Results -- Simultaneous Node Failures
Number of nodes: N = 10^4
Number of keys: K = 10^5 to 10^6
CHORD Summary
- Scalability
  - Search: O(log n) w.h.p.
  - Update requires a search, thus O(log n) w.h.p.
  - Construction: O(log^2 n) when a new node joins
- Robustness
  - Replication can be provided by storing replicas at successor nodes
- Global knowledge
  - Mapping of IP addresses and data keys into a common key space
- Autonomy
  - Storage and routing: none; nodes have a specific role by virtue of their IP address
- Search types
  - Equality only
Q & A