Post on 20-Dec-2015
Chord and CFS
Philip Skov Knudsen ([email protected])
Niels Teglsbo Jensen ([email protected])
Mads Lundemann ([email protected])
Distributed hash table
• Stores values at nodes
• Hash function
• Name -> Hash key, name can be any string or byte array
• Article mixes up key and ID
• Chord
• CFS
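As a minimal sketch of the name-to-key step above: a name (string or byte array) is hashed to a fixed-width identifier. SHA-1 is used here because it is the base hash the Chord paper names; the function name `chord_id` and the optional truncation to `m` bits are illustrative assumptions, not part of the slides.

```python
import hashlib
from typing import Union

M = 160  # SHA-1 produces 160-bit identifiers


def chord_id(name: Union[str, bytes], m: int = M) -> int:
    """Hash a name to an m-bit identifier on the ring (hypothetical helper)."""
    if isinstance(name, str):
        name = name.encode("utf-8")
    digest = hashlib.sha1(name).digest()
    # Reduce modulo 2^m so small rings can be used in examples.
    return int.from_bytes(digest, "big") % (2 ** m)


if __name__ == "__main__":
    print(hex(chord_id("abc")))      # full 160-bit identifier
    print(chord_id("abc", m=6))      # same name on a tiny 6-bit ring
```

Node identifiers can be produced the same way, for example by hashing a node's IP address, so that keys and nodes share one identifier space.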
Chord
A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Chord purpose
• Map keys to nodes
• (Compared to Freenet: No anonymity)
Goals
• Load balance
• Decentralization
• Scalability
• Availability
• Flexible naming
Consistent hashing
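A minimal sketch of the consistent hashing rule, assuming a small ring of integer node IDs chosen only for illustration: each key is assigned to its successor, the first node whose ID is equal to or follows the key on the ring. The helper name `successor` is hypothetical.

```python
import bisect


def successor(node_ids, key):
    """Return the first node ID >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key)
    return ring[i % len(ring)]  # index len(ring) wraps to the smallest ID


if __name__ == "__main__":
    nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # illustrative IDs
    for key in (10, 24, 30, 54, 60):
        print(key, "->", successor(nodes, key))
    # When node 26 joins, only keys in (21, 26] change owner.
    print(24, "->", successor(nodes + [26], 24))
```

The last line shows why joins are cheap: the new node takes over keys only from its immediate successor, and every other assignment is unchanged.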
Simple network topology
Efficient network topology
Lookup algorithm
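The lookup can be sketched as follows, assuming a 6-bit identifier space and finger tables built from global knowledge of the ring (a real node only learns them through the protocol). Finger k of node n points at the successor of n + 2^k; a lookup repeatedly jumps to the closest finger preceding the key. All function names and the node IDs are illustrative.

```python
import bisect

M = 6  # identifier bits (illustrative)


def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if requested."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    # The interval wraps past zero (a == b means the whole ring).
    return x > a or x < b or (inclusive_right and x == b)


def ring_successor(ring, ident):
    i = bisect.bisect_left(ring, ident % 2 ** M)
    return ring[i % len(ring)]


def build_fingers(ring):
    """fingers[n][k] = successor of n + 2^k; fingers[n][0] is n's successor."""
    return {n: [ring_successor(ring, n + 2 ** k) for k in range(M)] for n in ring}


def find_successor(fingers, start, key):
    """Return (node responsible for key, number of forwarding hops)."""
    n, hops = start, 0
    while True:
        succ = fingers[n][0]
        if in_interval(key, n, succ, inclusive_right=True):
            return succ, hops
        # Forward to the closest finger that precedes the key.
        n = next(f for f in reversed(fingers[n]) if in_interval(f, n, key))
        hops += 1


RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
FINGERS = build_fingers(RING)

if __name__ == "__main__":
    print(find_successor(FINGERS, 8, 54))  # forwarded 8 -> 42 -> 51, answer 56
```

Each hop at least halves the remaining distance to the key, which is where the logarithmic path length comes from.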
Node joining
26.join(friend) -> 26.successor = 32
26.stabilize -> 32.notify(26)
21.stabilize -> 21.successor = 26 -> 26.notify(21)
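The three steps above can be replayed with a small in-memory sketch, assuming a ring that initially holds only nodes 21 and 32 and using direct object references in place of network calls. The `Node` class and the `between` helper are illustrative, modeled on the join, stabilize, and notify operations named on the slide.

```python
def between(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if requested."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    return x > a or x < b or (inclusive_right and x == b)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def find_successor(self, key):
        # Linear walk along successor pointers; enough for a tiny ring.
        n = self
        while not between(key, n.id, n.successor.id, inclusive_right=True):
            n = n.successor
        return n.successor

    def join(self, friend):
        self.predecessor = None
        self.successor = friend.find_successor(self.id)

    def stabilize(self):
        # Adopt the successor's predecessor if it sits between us.
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or between(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate


# Initial two-node ring: 21 <-> 32.
n21, n26, n32 = Node(21), Node(26), Node(32)
n21.successor, n21.predecessor = n32, n32
n32.successor, n32.predecessor = n21, n21

n26.join(n21)      # 26.successor = 32
n26.stabilize()    # 32.notify(26): 32.predecessor becomes 26
n21.stabilize()    # 21.successor = 26, then 26.notify(21)

if __name__ == "__main__":
    for n in (n21, n26, n32):
        print(n.id, "succ", n.successor.id, "pred", n.predecessor.id)
```

Note that the ring stays usable between the steps: until 21 stabilizes, lookups through 21 still reach 32, just skipping the new node.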
Preventing lookup failure
• Successor list length r
• Disregarding network failures
• Assuming each node failing within one stabilization period with probability p:
• Connectivity loss for a node with probability: p^r
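As a small worked example of the bound above: a node is cut off only if all r entries of its successor list fail within the same stabilization period, so under the assumption that failures are independent the probability is p^r. The function name and the sample values of p and r are illustrative.

```python
def connectivity_loss_probability(p: float, r: int) -> float:
    """Probability that all r successors fail in one stabilization period,
    assuming each fails independently with probability p."""
    return p ** r


if __name__ == "__main__":
    for r in (1, 2, 4, 8, 16):
        print(f"p=0.5, r={r:2d}: {connectivity_loss_probability(0.5, r):.6f}")
```

Even with an unrealistically high failure probability of 0.5, the loss probability halves with every extra successor list entry.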
Path lengths from simulation
Probability density function for path length in a network of 2^12 nodes.
Path lengths with varying N
Load balance
Nodes: 10^4, keys: 5*10^5
Virtual servers
10^4 nodes and 10^6 keys
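A rough sketch of the virtual server idea, assuming each physical node places several points on the ring by hashing a label such as `node-3/v7` (the labels, function names, and the scaled-down sizes are illustrative, not the simulation setup from the slides). More ring points per node typically narrows the spread in keys per node.

```python
import bisect
import hashlib


def h(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def key_load(num_nodes: int, num_keys: int, virtual_per_node: int) -> dict:
    """Count keys per physical node when each node runs several virtual servers."""
    # Each ring point remembers which physical node owns it.
    points = sorted(
        (h(f"node-{n}/v{v}"), n)
        for n in range(num_nodes)
        for v in range(virtual_per_node)
    )
    ids = [p for p, _ in points]
    load = {n: 0 for n in range(num_nodes)}
    for k in range(num_keys):
        i = bisect.bisect_left(ids, h(f"key-{k}")) % len(ids)
        load[points[i][1]] += 1
    return load


if __name__ == "__main__":
    for v in (1, 4, 16):
        load = key_load(num_nodes=100, num_keys=10_000, virtual_per_node=v)
        print(f"virtual servers per node: {v:2d}  "
              f"min {min(load.values())}  max {max(load.values())}")
```

The cost is that every physical node maintains routing state for each of its virtual servers.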
Resilience to failed nodes
In a network of 1000 nodes
Latency stretch
In a network of 2^16 nodes
c = Chord latency
i = IP latency
stretch = c / i
CFS
Wide-area cooperative storage
Purpose
• Distributed cooperative file system
System design
File system using DHash
Block placement
Tick mark: Block ID
Square: Server responsible for ID (in Chord)
Circles: Servers holding replicas
Triangle: Servers receiving a copy of the block to cache
Availability
• r servers holding replicas of a block
• The server responsible for ID is responsible for detecting failed replica servers
• If the server responsible for ID fails, the new server in charge will be the first replica server
• Replica server detects this when Chord stabilizes
• Replica nodes are found in the successor list
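The replica rules above can be sketched as follows, assuming replicas sit on the r servers that immediately follow the responsible server on the ring (which is what "found in the successor list" suggests). The function name, ring IDs, and r = 3 are illustrative.

```python
import bisect


def placement(ring, block_id, r):
    """Return (responsible server, list of r replica servers) for a block ID.

    Assumes r is smaller than the number of servers on the ring.
    """
    ring = sorted(ring)
    i = bisect.bisect_left(ring, block_id) % len(ring)
    replicas = [ring[(i + j) % len(ring)] for j in range(1, r + 1)]
    return ring[i], replicas


if __name__ == "__main__":
    servers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
    print(placement(servers, 24, 3))
    # Server 32 fails: the first replica holder becomes responsible,
    # and one more successor is needed to restore r replicas.
    survivors = [s for s in servers if s != 32]
    print(placement(survivors, 24, 3))
```

Because the first replica holder already has the block, the handover after a failure needs no data transfer before the block can be served again.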
Persistence
• Each server promises to keep a copy of a block available for at least an agreed-on interval
• Publishers can ask for extensions
• This does not apply to cached copies, but to replicas
• The server responsible for the ID is also responsible for relaying extension requests to servers holding replicas
Load balancing
• Consistent hashing
• Virtual servers
• Caching
Preventing flooding
• Each CFS server limits any one IP address to using a certain percentage of its storage
• Percentage might be lowered as more nodes enter the network
• Can be circumvented by clients with dynamic IP addresses
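A minimal sketch of the per-IP quota, assuming a server that tracks stored bytes per source address and rejects a block once that address would exceed a fixed fraction of capacity. The class `QuotaStore` and its parameters are hypothetical and do not reflect the actual CFS interfaces.

```python
class QuotaStore:
    """Toy block store limiting each IP to a fraction of total capacity."""

    def __init__(self, capacity_bytes: int, max_fraction: float):
        self.capacity = capacity_bytes
        self.max_fraction = max_fraction
        self.total = 0
        self.used_by_ip = {}

    def store(self, ip: str, block: bytes) -> bool:
        used = self.used_by_ip.get(ip, 0)
        over_quota = used + len(block) > self.capacity * self.max_fraction
        over_capacity = self.total + len(block) > self.capacity
        if over_quota or over_capacity:
            return False  # reject: this IP (or the server) is out of room
        self.used_by_ip[ip] = used + len(block)
        self.total += len(block)
        return True


if __name__ == "__main__":
    s = QuotaStore(capacity_bytes=1000, max_fraction=0.1)
    print(s.store("10.0.0.1", b"x" * 60))  # accepted
    print(s.store("10.0.0.1", b"x" * 60))  # rejected, would exceed 100 bytes
    print(s.store("10.0.0.2", b"x" * 60))  # accepted: a new address, new quota
```

The last call illustrates the weakness noted above: a client that can change its IP address gets a fresh quota each time.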
Efficiency
• Efficient lookups using Chord
• Prefetching
• Server selection
Conclusion
• Efficient
• Scalable
• Available
• Load-balanced
• Decentralized
• Persistent
• Prevents flooding