scalable content-addressable network lintao liu 2001.11.19


Scalable Content-Addressable Network

Lintao Liu 2001.11.19


System Goals

CAN: A distributed infrastructure that provides hash table-like functionality on Internet-like scales.

Scalable
Fault-tolerant
Self-organizing


Basic Design

Basic idea: a virtual d-dimensional coordinate space.

Each node owns a zone in the virtual space.
Data is stored as (key, value) pairs.
Hash(key) --> a point P in the virtual space.
The (key, value) pair is stored on the node whose zone contains the point P.
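The mapping from key to point to owning node can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the hash function CAN prescribes: the SHA-256 based `key_to_point`, the unit-square coordinate space, and the four-quadrant `zones` layout are all hypothetical choices made for the example.

```python
import hashlib


def key_to_point(key, d=2):
    """Map a key to a point in [0, 1)^d (assumed mapping, d <= 8)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use 4 bytes of the digest per coordinate.
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32
        for i in range(d)
    )


def contains(zone, point):
    """A zone is a tuple of half-open (low, high) intervals, one per dimension."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))


def owner_of(zones, point):
    """Return the name of the node whose zone contains the point."""
    for node, zone in zones.items():
        if contains(zone, point):
            return node
    raise LookupError("zones do not cover the space")


# Assumed layout: four nodes that split the unit square into quadrants.
zones = {
    "A": ((0.0, 0.5), (0.0, 0.5)),
    "B": ((0.5, 1.0), (0.0, 0.5)),
    "C": ((0.0, 0.5), (0.5, 1.0)),
    "D": ((0.5, 1.0), (0.5, 1.0)),
}

p = key_to_point("song.mp3")
print(p, "is stored on node", owner_of(zones, p))
```

Because the zones tile the whole space, every hashed point has exactly one owner.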


Basic Design

For routing purposes, each node only needs to maintain information about the nodes whose coordinate zones adjoin its own zone (its neighbors).


Basic Design

Routing: greedy algorithm

if P is within the zone of the current node,
    return (key, value), or failure if there is no such key
else
    forward the query to the neighbor with coordinates closest to P
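A minimal sketch of the greedy forwarding rule, assuming a simplified layout that is not from the slides: a k x k grid of unit zones in 2-d, no wrap-around at the edges, and "closest" measured from each neighbor's zone center. The names `make_grid`, `route`, and `center` are hypothetical.

```python
def make_grid(k):
    """k x k unit zones; node (i, j) owns [i, i+1) x [j, j+1)."""
    nodes = {}
    for i in range(k):
        for j in range(k):
            nodes[(i, j)] = [
                (i + di, j + dj)
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= i + di < k and 0 <= j + dj < k
            ]
    return nodes


def in_zone(node, p):
    return node[0] <= p[0] < node[0] + 1 and node[1] <= p[1] < node[1] + 1


def center(node):
    return (node[0] + 0.5, node[1] + 0.5)


def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def route(nodes, start, p):
    """Greedy routing: hop to the neighbor closest to P until P is local.

    Assumes P lies inside the grid, so a neighbor toward P always exists.
    """
    path = [start]
    while not in_zone(path[-1], p):
        nxt = min(nodes[path[-1]], key=lambda n: dist(center(n), p))
        path.append(nxt)
    return path


path = route(make_grid(4), (0, 0), (3.5, 2.5))
print(" -> ".join(str(n) for n in path))
```

Each node consults only its own neighbor list, which is the point of the design: no node needs a global view of the space.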


Example: Routing

[Figure: routing example in a 2-d coordinate space with corners (0, 0), (4, 0), (0, 4), and (4, 4).]


Basic Design

Node Insertion

A new node N1 is going to join the network:
1. Find a node N2 already in the CAN.
2. Randomly choose a point P in the space.
3. Send a JOIN request destined for P (P resides in the zone of N3).
4. N3 splits its zone, assigns half of it to N1, and sends the (key, value) pairs from that half to N1.


Basic Design

5. N1 also gets the neighbor information from N3.
6. N3 notifies all its neighbors of the reallocation of space.
7. The neighbors update their corresponding data.
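The zone split and data hand-off during a join can be sketched as follows. The slides only say that N3 splits its zone in half; splitting along the longest side is an assumption made here, and `split_zone`, `join`, and the point-keyed store are hypothetical names for the illustration.

```python
def contains(zone, point):
    """A zone is a tuple of half-open (low, high) intervals, one per dimension."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))


def split_zone(zone):
    """Halve a zone along its longest side (assumed rule); returns (kept, given)."""
    sides = [hi - lo for lo, hi in zone]
    axis = sides.index(max(sides))
    lo, hi = zone[axis]
    mid = (lo + hi) / 2
    kept = zone[:axis] + ((lo, mid),) + zone[axis + 1:]
    given = zone[:axis] + ((mid, hi),) + zone[axis + 1:]
    return kept, given


def join(zone_n3, store_n3):
    """N3 keeps one half; the new node N1 gets the other half and its pairs.

    store_n3 maps a hashed point to its (key, value) pair.
    """
    kept, given = split_zone(zone_n3)
    store_n1 = {p: kv for p, kv in store_n3.items() if contains(given, p)}
    remaining = {p: kv for p, kv in store_n3.items() if contains(kept, p)}
    return (kept, remaining), (given, store_n1)


zone = ((0.0, 1.0), (0.0, 0.5))
store = {(0.2, 0.1): ("a", 1), (0.8, 0.3): ("b", 2)}
n3, n1 = join(zone, store)
print("N3:", n3)
print("N1:", n1)
```

Only N3 and its immediate neighbors are involved, so a join touches a small, local part of the network.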


Example: Insertion


Basic Design

Node departure

1. Explicit departure
The departing node hands over its zone to a neighbor: one whose zone merges with it to produce a valid single zone or, failing that, the neighbor with the smallest zone.


Basic Design

2. Node failure
Neighbors exchange periodic update messages.
Prolonged absence of an update message from a neighbor indicates its failure.
A takeover mechanism merges the failed node's zone with the smallest adjacent zone.

There is also a background zone-reassignment algorithm to smooth the zone allocation.
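Failure detection and the choice of takeover node can be sketched as below. This is an illustration only: the timeout value, the timestamp bookkeeping, and the names `failed_neighbors` and `takeover_node` are assumptions, and the sketch leaves out the message exchange a real takeover would need.

```python
def volume(zone):
    """Volume of a zone given as (low, high) intervals per dimension."""
    v = 1.0
    for lo, hi in zone:
        v *= hi - lo
    return v


def failed_neighbors(last_update, now, timeout):
    """Neighbors whose last update message is older than the timeout."""
    return [n for n, t in last_update.items() if now - t > timeout]


def takeover_node(adjacent_zones):
    """Pick the adjacent node with the smallest zone to absorb the failed zone."""
    return min(adjacent_zones, key=lambda n: volume(adjacent_zones[n]))


# Assumed example: times in seconds, 30 s without an update means failure.
last_update = {"n1": 100.0, "n2": 70.0}
print(failed_neighbors(last_update, now=105.0, timeout=30.0))

adjacent = {"a": ((0.0, 0.5), (0.0, 0.5)), "b": ((0.5, 1.0), (0.0, 1.0))}
print(takeover_node(adjacent))
```

Preferring the smallest adjacent zone keeps zone sizes, and so the load per node, from drifting too far apart.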


Example: Node Departure


Problems in Basic Design

Scalable?
Fault-tolerant?
Self-organizing?
Data durable?
Efficient?
Some others?


Design Improvements

Guidelines:
Reduce path latency
Increase fault tolerance
Increase data availability
Do so without adding much complexity


Design Improvements

Multi-dimensional coordinate spaces
-- reduce average path length

Increasing the dimensions of the virtual space => shorter routing paths => lower path latency
(the routing table grows, because each node has more neighbors)
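The trade-off can be made concrete with a little arithmetic. The sketch below uses the average path length (d/4)(n^(1/d)) quoted on the conclusions slide, and assumes equally sized zones, in which case each node has 2d neighbors. The network size of 2^20 nodes is an arbitrary example value.

```python
def avg_path_length(n, d):
    """Average routing path length in hops for n nodes in d dimensions."""
    return (d / 4) * n ** (1 / d)


def neighbors_per_node(d):
    """Neighbor count per node, assuming the space is split into equal zones."""
    return 2 * d


n = 2**20  # assumed network size for the example
for d in (2, 3, 4, 8):
    hops = avg_path_length(n, d)
    print(f"d={d}: about {hops:.1f} hops, {neighbors_per_node(d)} neighbors")
```

For a fixed n, the hop count falls quickly as d grows while the routing table grows only linearly in d.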


Design Improvements

Multiple realities: multiple coordinate spaces
-- improve data availability
-- improve routing fault tolerance
-- reduce the average path length

Multiple coordinate spaces exist at the same time.
Each space is called a “reality”.
Each node occupies a zone in each reality.


Design Improvements

Better CAN routing metrics
-- reduce per-hop latency

When there is more than one choice for forwarding, choose the neighbor with the lowest RTT.
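One way to read this rule is sketched below: among the neighbors that make progress toward P, pick the one with the lowest measured RTT. The function name `next_hop`, the neighbor table layout, and the use of Euclidean distance to define "progress" are assumptions for the illustration.

```python
def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def next_hop(current_pos, neighbors, p):
    """Choose the lowest-RTT neighbor among those closer to P than we are.

    neighbors maps a name to (coordinates, rtt_ms). Returns None if no
    neighbor makes progress toward P.
    """
    here = dist(current_pos, p)
    candidates = {
        name: rtt
        for name, (pos, rtt) in neighbors.items()
        if dist(pos, p) < here
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


neighbors = {
    "a": ((1.0, 0.0), 80.0),   # makes progress, slow link
    "b": ((0.0, 1.0), 20.0),   # makes progress, fast link
    "c": ((-1.0, 0.0), 5.0),   # fastest link, but moves away from P
}
print(next_hop((0.0, 0.0), neighbors, (3.0, 3.0)))
```

Restricting the choice to neighbors that are closer to P keeps the route from looping while still favoring fast links.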


Design Improvements

Overload coordinate zones
-- reduce average path length
-- reduce the per-hop latency
-- improve fault tolerance

Assign more than one node to share the same zone.


Design Improvements

Topologically sensitive construction of CAN
-- reduce the path latency

A set of machines acts as landmarks on the Internet.
Each node measures its RTT to each of these landmarks and sorts them by RTT.
Physically close nodes are likely to have the same ordering and consequently will reside in adjacent zones of the coordinate space.
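The landmark ordering can be sketched as a simple binning step. The landmark names, RTT values, and the functions `landmark_ordering` and `bin_nodes` are hypothetical. How each ordering is then mapped to a region of the coordinate space is not shown here.

```python
def landmark_ordering(rtts):
    """Landmarks sorted by increasing RTT; rtts maps landmark -> RTT in ms."""
    return tuple(sorted(rtts, key=rtts.get))


def bin_nodes(measurements):
    """Group nodes that see the landmarks in the same RTT order."""
    bins = {}
    for node, rtts in measurements.items():
        bins.setdefault(landmark_ordering(rtts), []).append(node)
    return bins


# Assumed measurements: n1 and n2 are near landmark L1, n3 is near L3.
measurements = {
    "n1": {"L1": 10, "L2": 50, "L3": 90},
    "n2": {"L1": 12, "L2": 48, "L3": 95},
    "n3": {"L1": 80, "L2": 40, "L3": 15},
}
for ordering, nodes in bin_nodes(measurements).items():
    print(ordering, nodes)
```

Nodes that land in the same bin are the ones the scheme would place in nearby zones.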


Other Design Improvements

Multiple hash functions
-- increase data availability
-- reduce the query latency

Uniform partitioning
-- achieve load balance

Caching and replication
-- increase data availability
-- reduce query latency
-- achieve load balance
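The multiple-hash-function idea can be sketched as follows: one key is mapped to k points, so its (key, value) pair lives on up to k different nodes. Deriving the k functions by salting SHA-256 with an index is an assumption made for the example, as is the name `replica_points`.

```python
import hashlib


def replica_points(key, k=3, d=2):
    """Map one key to k points in [0, 1)^d using k salted hashes (d <= 8)."""
    points = []
    for i in range(k):
        # The index prefix acts as the salt that makes each hash function differ.
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        points.append(tuple(
            int.from_bytes(digest[4 * j:4 * j + 4], "big") / 2**32
            for j in range(d)
        ))
    return points


for point in replica_points("song.mp3"):
    print(point)
```

A query can then be sent toward whichever of the k points is closest to the requester, and the data survives as long as one of the owning nodes is reachable.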


Comparisons

Scalability?
Construction & storage overhead?
Query efficiency?
Fault tolerance?
Complexity?

Gnutella, FreeNet, Past, Chord


Conclusions

CAN provides scalable routing and efficient indexing.
Given a key, it can return the (key, value) pair within an average path length of (d/4)(n^{1/d}) hops, or return failure if there is no such (key, value) pair.

CAN is completely self-organizing, fault-tolerant, and resistant to DoS attacks.