distributed hash tables

Post on 06-Jan-2016


3-1

Distributed Hash Tables

CS653, Fall 2010

3-2

Implementing insert/retrieve: distributed hash table (DHT)

Hash table: a data structure that maps “keys” to “values”; an essential building block in software systems.

Distributed Hash Table (DHT): similar, but spread across the Internet.

DHT interface: insert(key, value) and lookup(key).

DHT: an overlay (on top of the Internet) of nodes. Each DHT node supports a single operation: given an input key, route messages toward the node holding that key.
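The insert/lookup interface can be sketched in a few lines of Python. This is a minimal single-process sketch, not a real overlay: the class name ToyDHT, the node names, and the hash-then-modulo placement are assumptions made for illustration, whereas an actual DHT routes each request hop by hop toward the responsible node.

```python
import hashlib


class ToyDHT:
    """Single-process stand-in for a DHT: every 'node' is just a dict."""

    def __init__(self, node_names):
        # Each node holds its own slice of the key/value pairs.
        self.nodes = {name: {} for name in sorted(node_names)}

    def _responsible_node(self, key):
        # Assumption: hash the key and pick a node by modulo.
        # A real DHT would route through the overlay instead.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        names = list(self.nodes)
        return names[int.from_bytes(digest, "big") % len(names)]

    def insert(self, key, value):
        self.nodes[self._responsible_node(key)][key] = value

    def lookup(self, key):
        # Returns None when the key was never inserted.
        return self.nodes[self._responsible_node(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.insert("K1", "V1")
print(dht.lookup("K1"))
```

The caller never names a node: both operations take only the key, which is the property the slides emphasize.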

3-3

DHT in action

[Figure: an overlay of eleven DHT nodes, each holding its own key/value (K, V) table.]


3-5

DHT in action

Operation: take key as input; route messages to the node holding the key.

3-6

DHT in action: insert()

[Figure: insert(K1, V1) is issued at one node of the overlay.]


3-8

DHT in action: insert()

[Figure: the pair (K1, V1) is now stored at the node responsible for K1.]

3-9

DHT in action: lookup()

[Figure: lookup(K1) is routed through the overlay to the node holding (K1, V1).]

DHT Design Goals

An “overlay” network with: flexible mapping of keys to physical nodes; small network diameter; small degree; local routing decisions.

A “storage” or “memory” mechanism with best-effort persistence (soft state)

3-11

Example DHT: Chord

Each key, k, maps to a node on the ring. The number of key values may be greater than the number of nodes, so key k maps to the node on the ring with the next largest ID (the first node whose ID equals or follows k): successor(k).

m-bit identifier space: 2^m IDs. Identifiers are ordered on an identifier ring (the Chord ring) modulo 2^m.

Example: a 10-node Chord ring with m = 6 storing 5 keys.
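The key-to-node mapping can be sketched as follows. The slide's figure is not reproduced in this transcript, so the ten node IDs and five key IDs below are illustrative assumptions, as is the helper name successor; only m = 6 and the node and key counts come from the slide.

```python
from bisect import bisect_left

M = 6                      # bits in the identifier space (from the slide)
RING = 2 ** M              # 64 identifiers, 0..63
# Assumed node IDs: ten nodes, as in the slide's example.
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
# Assumed key IDs: five keys, as in the slide's example.
KEYS = [10, 24, 30, 38, 54]


def successor(k, nodes=NODES, ring=RING):
    """Return the first node whose ID is >= k, wrapping around the ring."""
    i = bisect_left(nodes, k % ring)
    return nodes[i % len(nodes)]   # index len(nodes) wraps to the smallest ID


print({k: successor(k) for k in KEYS})
# -> {10: 14, 24: 32, 30: 32, 38: 38, 54: 56}
```

A key whose ID is larger than every node ID wraps around to the smallest node, which is what makes the identifier space a ring.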

3-12

Basic lookup

Lookup: node i forwards to node i+1, until the node holding k is reached.

Memory: O(1); each node need only know one neighbor.

Routing time: O(N)
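A minimal sketch of this successor-only lookup, counting hops to make the O(N) cost visible. The function names and the node IDs are assumptions for illustration; the ring is simulated in one process rather than across a network.

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b     # interval wraps past zero


def basic_lookup(start, key, nodes, ring):
    """Walk successor pointers from `start` until the owner of `key` is found.

    Returns (owner, hops), where hops counts node-to-node forwards.
    """
    nodes = sorted(nodes)
    # Each node knows exactly one neighbor: its successor.
    succ = {n: nodes[(i + 1) % len(nodes)] for i, n in enumerate(nodes)}
    key %= ring
    current, hops = start, 0
    while not in_half_open(key, current, succ[current]):
        current = succ[current]
        hops += 1
    return succ[current], hops


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # assumed node IDs
print(basic_lookup(8, 54, nodes, 64))
# -> (56, 7)
```

In the worst case the request visits nearly every node on the ring, which is what motivates the extra routing state introduced next.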

3-13

Accelerating Lookups

Lookups accelerated by maintaining additional routing information

Each node maintains a routing table with (at most) m entries (where N = 2^m), called the finger table.

The ith entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the Chord ring:

s = successor(n + 2^(i-1)) (all arithmetic mod 2^m)
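The finger-table definition translates directly into code. This is a sketch under assumed node IDs (the same illustrative ten-node ring with m = 6); the function names are not from the slides.

```python
from bisect import bisect_left


def successor(k, nodes, ring):
    """First node whose ID is >= k on the ring (nodes must be sorted)."""
    i = bisect_left(nodes, k % ring)
    return nodes[i % len(nodes)]


def finger_table(n, nodes, m):
    """Entries i = 1..m, where entry i is successor(n + 2^(i-1)) mod 2^m."""
    ring = 2 ** m
    nodes = sorted(nodes)
    return [successor(n + 2 ** (i - 1), nodes, ring) for i in range(1, m + 1)]


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # assumed node IDs
print(finger_table(8, nodes, 6))
# -> [14, 14, 14, 21, 32, 42]
```

The first entry is always the node's immediate successor, and later entries reach exponentially farther around the ring.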

3-14

Routing by halving distance

Node uses finger tables to halve the ID space distance to destination in each step
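A sketch of finger-based routing: at each step the request jumps to the farthest finger that still precedes the key. Node IDs and function names are assumptions, and the whole ring is simulated locally rather than with real messages.

```python
from bisect import bisect_left


def successor(k, nodes, ring):
    i = bisect_left(nodes, k % ring)
    return nodes[i % len(nodes)]


def in_open(x, a, b):
    """x in ring interval (a, b), both ends excluded."""
    if a < b:
        return a < x < b
    return x > a or x < b


def in_half_open(x, a, b):
    """x in ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def finger_lookup(start, key, nodes, m):
    """Return the list of nodes visited, ending at the owner of `key`."""
    ring = 2 ** m
    nodes = sorted(nodes)
    fingers = {n: [successor(n + 2 ** (i - 1), nodes, ring)
                   for i in range(1, m + 1)] for n in nodes}
    key %= ring
    current, path = start, [start]
    # fingers[n][0] is n's immediate successor.
    while not in_half_open(key, current, fingers[current][0]):
        # Farthest finger that does not overshoot the key.
        current = next(f for f in reversed(fingers[current])
                       if in_open(f, current, key))
        path.append(current)
    path.append(fingers[current][0])
    return path


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # assumed node IDs
print(finger_lookup(8, 54, nodes, 6))
# -> [8, 42, 51, 56]
```

On this assumed ring the lookup for key 54 from node 8 takes three forwards, compared with walking eight successor links one at a time.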

3-15

Churn

Churn: nodes join and leave the system. Stored keys must be moved to (on leave) or moved from (on join) neighboring nodes, and finger tables must be updated. Too much churn means too much overhead.

Joins: node n joins by asking any node m for n’s successor.

Voluntary leaves: transfer keys to the successor and notify the predecessor.

Failure handling: maintain r backup successors, with r tuned to the mean time to failure of nodes.
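The key movement on join and voluntary leave can be sketched as below. This is a simplified single-process model under stated assumptions: the class name Ring and its methods are hypothetical, node IDs are illustrative, and finger-table updates, predecessor notification, and backup successors are left out.

```python
from bisect import bisect_left


class Ring:
    """Toy Chord-like ring that tracks only which node stores which key."""

    def __init__(self, m):
        self.size = 2 ** m
        self.store = {}            # node ID -> {key ID: value}

    def owner(self, k):
        """successor(k) among the nodes currently in the ring."""
        nodes = sorted(self.store)
        i = bisect_left(nodes, k % self.size)
        return nodes[i % len(nodes)]

    def put(self, k, v):
        self.store[self.owner(k)][k] = v

    def join(self, n):
        if not self.store:         # first node owns everything
            self.store[n] = {}
            return
        old_owner = self.owner(n)  # n's successor holds the keys n will take over
        self.store[n] = {}
        for k in list(self.store[old_owner]):
            if self.owner(k) == n:
                self.store[n][k] = self.store[old_owner].pop(k)

    def leave(self, n):
        keys = self.store.pop(n)
        if self.store:             # hand all keys to the successor
            self.store[self.owner(n)].update(keys)


ring = Ring(6)
for node in (8, 32, 56):
    ring.join(node)
ring.put(10, "a")
ring.put(24, "b")
ring.join(21)                      # key 10 moves from node 32 to node 21
print(ring.store[21], ring.store[32])
```

Only the joining or leaving node and its successor exchange keys, which is why moderate churn is affordable but heavy churn is not.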

3-16

DHTs discussion

What systems can you build using DHTs?

How to reduce stretch? [Plaxton97]

How to support range requests or partial matches between request and key?

What real applications use DHTs today? Why or why not?

Pros and cons of a centralized index?

3-17

Indirection

3-18

Indirection

Indirection: rather than reference an entity directly, reference it (“indirectly”) via another entity, which in turn can or will access the original entity

"Every problem in computer science can be solved by adding another level of indirection" -- Butler Lampson


3-19

Multicast: one sender to many receivers

Multicast: the act of sending a datagram to multiple receivers with a single “transmit” operation.

Question: how to achieve multicast?

Network multicast: routers actively participate in multicast, making copies of packets as needed and forwarding them towards multicast receivers.

[Figure: multicast routers (red) duplicate and forward multicast datagrams.]

3-20

Internet Multicast Service Model

Multicast group concept: use of indirection. A host addresses its IP datagram to the multicast group; routers forward multicast datagrams to hosts that have “joined” that multicast group.

[Figure: hosts 128.119.40.186, 128.59.16.12, 128.34.108.63 and 128.34.108.60, and multicast group 226.17.30.197.]

3-21

Multicast via Indirection: why?

Naming and forwarding in IP are tailored for point-to-point communication.

Indirection: provides flexible naming; decouples sender from receivers (and their joins and leaves).

3-22

Mobility and Indirection

How do you contact a mobile friend? Consider a friend who frequently changes addresses: how do you find her?

search all phone books?

call her parents?

expect her to let you know where she is?

[Figure: “I wonder where Alice moved to?”]

3-23

Mobility and indirection:

A mobile node moves from network to network; correspondents want to send packets to the mobile node.

Two approaches:

indirect routing: communication from correspondent to mobile goes through the home agent, then is forwarded to the remote mobile

direct routing: correspondent gets the foreign address of the mobile, sends directly to the mobile

3-24

Mobility: Vocabulary

home network: permanent “home” of mobile (e.g., 128.119.40/24)

permanent address: address in home network, can always be used to reach mobile (e.g., 128.119.40.186)

home agent: entity that will perform mobility functions on behalf of mobile, when mobile is remote

wide area network

3-25

Mobility: more vocabulary

care-of-address: address in visited network (e.g., 79.129.13.2)

wide area network

visited network: network in which mobile currently resides (e.g., 79.129.13/24)

permanent address: remains constant (e.g., 128.119.40.186)

foreign agent: entity in visited network that performs mobility functions on behalf of mobile.

correspondent: wants to communicate with mobile

3-26

Mobility: registration

[Figure: home network and visited network connected across the wide area network.]

1. mobile contacts foreign agent on entering visited network

2. foreign agent contacts home agent: “this mobile is resident in my network”

End result: foreign agent knows about mobile; home agent knows location of mobile

3-27

Mobility via Indirect Routing

[Figure: home network and visited network connected across the wide area network.]

1. correspondent addresses packets using home address of mobile

2. home agent intercepts packets, forwards to foreign agent

3. foreign agent receives packets, forwards to mobile

4. mobile replies directly to correspondent
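The forwarding steps of indirect routing can be sketched with plain dictionaries standing in for packets. This is only an illustration: the class and function names, the home agent address 128.119.40.1, and the correspondent address 203.0.113.7 are assumptions, while the permanent and care-of addresses are the examples used in the slides.

```python
PERMANENT = "128.119.40.186"     # permanent address (from the slides)
CARE_OF = "79.129.13.2"          # care-of-address (from the slides)
CORRESPONDENT = "203.0.113.7"    # hypothetical correspondent address


class HomeAgent:
    def __init__(self, address):
        self.address = address
        self.bindings = {}       # permanent address -> care-of-address

    def register(self, permanent, care_of):
        """Called on behalf of the mobile after it enters a visited network."""
        self.bindings[permanent] = care_of

    def intercept(self, packet):
        """Encapsulate packets for a remote mobile (packet-within-a-packet)."""
        care_of = self.bindings.get(packet["dst"])
        if care_of is None:
            return packet        # no binding: mobile is at home, deliver as-is
        return {"src": self.address, "dst": care_of, "payload": packet}


def foreign_agent_deliver(outer):
    """Strip the outer header and hand the original packet to the mobile."""
    return outer["payload"]


home_agent = HomeAgent("128.119.40.1")           # hypothetical agent address
home_agent.register(PERMANENT, CARE_OF)

original = {"src": CORRESPONDENT, "dst": PERMANENT, "payload": "hello"}
tunneled = home_agent.intercept(original)        # step 2
received = foreign_agent_deliver(tunneled)       # step 3
reply = {"src": PERMANENT, "dst": received["src"], "payload": "hi"}  # step 4
print(tunneled["dst"], received["dst"], reply["dst"])
```

The correspondent only ever uses the permanent address; the care-of-address appears solely on the outer header added by the home agent.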

3-28

Indirect Routing: comments

mobile uses two addresses:

permanent address: used by correspondent (hence mobile location is transparent to correspondent)

care-of-address: used by home agent to forward datagrams to mobile

foreign agent functions may be done by mobile itself

triangle routing: correspondent-home-network-mobile; inefficient when correspondent and mobile are in the same network

3-29

Indirect Routing: moving between networks

Suppose the mobile user moves to another network: it registers with the new foreign agent; the new foreign agent registers with the home agent; the home agent updates the care-of-address for the mobile; packets continue to be forwarded to the mobile (but with the new care-of-address).

Mobility, changing foreign networks: transparent, so ongoing connections can be maintained!

3-30

Mobility via Direct Routing

[Figure: home network and visited network connected across the wide area network, with numbered message steps.]

correspondent requests, receives foreign address of mobile

correspondent forwards to foreign agent

foreign agent receives packets, forwards to mobile

mobile replies directly to correspondent

3-31

Mobility via Direct Routing: comments

overcomes triangle routing problem

non-transparent to correspondent: correspondent must get care-of-address from home agent

what happens if mobile changes networks?

3-32

Mobile IP

RFC 3220

Has many features we’ve seen: home agents, foreign agents, foreign-agent registration, care-of-addresses, encapsulation (packet-within-a-packet)

Three components to standard: agent discovery; registration with home agent; indirect routing of datagrams

3-33

Mobility via indirection: why indirection?

Transparency to correspondent

“Mostly” transparent to mobile (except that mobile must register with foreign agent)

Transparent to routers, rest of infrastructure

Practical concern: what if egress filtering is in place in foreign networks? Since the source IP address of the mobile is its home address, its packets may look spoofed.

3-34

An Internet Indirection Infrastructure

Motivation:

Today’s Internet is built around a point-to-point communication abstraction: send packet “p” from host “A” to host “B”. One sender, one receiver, at fixed and well-known locations.

… not appropriate for applications that require other communication primitives: multicast (one to many), mobility (one to anywhere), anycast (one to any).

We’ve seen indirection used to provide these services. Idea: make indirection a “first-class object”.

3-35

Internet Indirection Infrastructure (i3)

Change the communication abstraction: instead of point-to-point, exchange packets by name. Each packet has an identifier ID. To receive packets with identifier ID, receiver R stores trigger (ID, R) in the network. Triggers are stored in network overlay nodes.

[Figure: the sender issues send(ID, data); the overlay node holding trigger (ID, R) forwards it as send(R, data) to receiver R.]

3-36

Service Model

API: sendPacket(p); insertTrigger(t); removeTrigger(t); // optional

Best-effort service model (like IP)

Triggers periodically refreshed by end-hosts. Q: what is this approach called?

Reliability, congestion control, flow-control implemented at end hosts, and trigger-storing overlay nodes

packet p: (ID, data); trigger t: (ID, addr)
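The three API calls can be sketched with an in-memory trigger table. This is an assumption-laden toy, not the i3 implementation: the class name I3Overlay, the Python method names, and the delivered log (standing in for IP delivery to an address) are hypothetical, and trigger refresh and expiry are omitted.

```python
from collections import defaultdict


class I3Overlay:
    """Toy rendezvous point: maps packet IDs to receiver addresses."""

    def __init__(self):
        self.triggers = defaultdict(set)   # ID -> set of receiver addresses
        self.delivered = []                # (addr, data) pairs, in send order

    def insert_trigger(self, ident, addr):
        self.triggers[ident].add(addr)

    def remove_trigger(self, ident, addr):
        self.triggers[ident].discard(addr)

    def send_packet(self, ident, data):
        # Best effort: with no matching trigger the packet is silently dropped.
        for addr in sorted(self.triggers.get(ident, ())):
            self.delivered.append((addr, data))


i3 = I3Overlay()
i3.insert_trigger("ID", "R1")
i3.send_packet("ID", "frame-1")            # unicast to R1

i3.insert_trigger("ID", "R2")
i3.send_packet("ID", "frame-2")            # same ID, two triggers: multicast

i3.remove_trigger("ID", "R1")
i3.send_packet("ID", "frame-3")            # only R2 remains
print(i3.delivered)
# -> [('R1', 'frame-1'), ('R1', 'frame-2'), ('R2', 'frame-2'), ('R2', 'frame-3')]
```

The sender's call never changes: whether a packet reaches one receiver, several, or a receiver at a new address depends only on which triggers the receivers have installed.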

3-37

Discussion

Trigger is similar to routing table entry

Essentially: application layer publish-subscribe infrastructure

Unlike IP, end hosts control triggers, i.e., end hosts are responsible for setting and maintaining “routing tables”

Provides support for: mobility, multicast, anycast, composable services

3-38

Mobility

Receiver updates its trigger as it moves from one subnet to another: mobility transparent to sender; location privacy.

[Figure: the sender issues send(ID, data); trigger (ID, R1) forwards it as send(R1, data) to the receiver at R1.]

3-39

Mobility (cont’d)

[Figure: after the receiver moves to R2, the new trigger (ID, R2) forwards send(ID, data) as send(R2, data); the sender's call is unchanged.]

3-40

Multicast

Unifies multicast and unicast abstractions. Multicast: receivers insert triggers with the same ID.

Application naturally moves between multicast and unicast, as needed; “impossible” in current IP model.

[Figure: the sender issues send(ID, data); triggers (ID, R1) and (ID, R2) forward it as send(R1, data) and send(R2, data).]

3-41

Anycast (cont’d)

Route to any one in a set of receivers. Receiver i in the anycast group inserts the same ID, with anycast qualification s_i. Route to the receiver with the best match between a and s_i.

[Figure: the sender issues send(ID|a, data); among triggers (ID|s1, R1), (ID|s2, R2) and (ID|s3, R3), the best match is s1, so the packet is forwarded as send(R1, data).]
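One way to read “best match” is longest common prefix between the sender's suffix a and each receiver's qualification s_i. The sketch below assumes that interpretation, along with bit-string suffixes and hypothetical function names; the slide itself does not fix the matching rule.

```python
def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def anycast(triggers, ident, a):
    """Pick one receiver among triggers (ID, s_i, addr) with matching ID.

    Assumption: the best match is the s_i sharing the longest prefix with a.
    Returns None when no trigger carries the requested ID.
    """
    candidates = [(s, addr) for (tid, s, addr) in triggers if tid == ident]
    if not candidates:
        return None
    return max(candidates, key=lambda c: common_prefix_len(a, c[0]))[1]


# Hypothetical qualifications s1, s2, s3 for receivers R1, R2, R3.
triggers = [("ID", "0011", "R1"), ("ID", "0101", "R2"), ("ID", "1100", "R3")]
print(anycast(triggers, "ID", "0010"))
# -> R1
```

Receivers can steer traffic by choosing their qualification, for example encoding location or load in s_i, without the sender knowing who is in the group.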

3-42

Composable Services

Use a stack of IDs to encode successive operations to be performed on data (e.g., transcoding)

Don’t need to configure path between services

[Figure: the MPEG sender issues send((ID_MPEG/JPEG, ID), data); trigger (ID_MPEG/JPEG, S_MPEG/JPEG) delivers the packet to the transcoder S_MPEG/JPEG, which issues send(ID, data); trigger (ID, R) forwards it as send(R, data) to the JPEG receiver R.]

3-43

Composable Services (cont’d)

Both receivers and senders can specify operations to be performed on data

[Figure: the MPEG sender issues send(ID, data); the receiver's trigger (ID, (ID_MPEG/JPEG, R)) rewrites it to send((ID_MPEG/JPEG, R), data); trigger (ID_MPEG/JPEG, S_MPEG/JPEG) delivers it to the transcoder, which issues send(R, data) to the JPEG receiver R.]
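Both composition styles, sender-specified and receiver-specified, can be sketched with one small routine that resolves an identifier stack. Everything here is an illustrative assumption: the function name route, the dictionary layout, and the stand-in transcoder, which just relabels a string rather than converting MPEG to JPEG.

```python
def route(triggers, services, id_stack, data):
    """Resolve an identifier stack until an end-host address is reached.

    triggers: ID -> address, or ID -> tuple (a further stack)
    services: address -> function applied to the data at that service
    Returns (final_address, data).
    """
    stack = list(id_stack)
    while stack:
        head = stack.pop(0)
        if head in triggers:
            target = triggers[head]
            # A trigger may point to one address or to a stack of entries.
            expanded = list(target) if isinstance(target, tuple) else [target]
            stack = expanded + stack
        elif head in services:
            data = services[head](data)    # service transforms, then forwards
        else:
            return head, data              # plain address: deliver
    raise ValueError("identifier stack exhausted before reaching a host")


def transcode(d):
    # Stand-in for an MPEG-to-JPEG transcoder.
    return d.replace("MPEG", "JPEG")


services = {"S_MPEG/JPEG": transcode}

# Sender-specified: the sender pushes the transcoder ID ahead of the receiver ID.
sender_side = {"ID_MPEG/JPEG": "S_MPEG/JPEG", "ID": "R"}
print(route(sender_side, services, ["ID_MPEG/JPEG", "ID"], "MPEG data"))
# -> ('R', 'JPEG data')

# Receiver-specified: the receiver's trigger expands ID into (transcoder ID, R).
receiver_side = {"ID_MPEG/JPEG": "S_MPEG/JPEG", "ID": ("ID_MPEG/JPEG", "R")}
print(route(receiver_side, services, ["ID"], "MPEG data"))
# -> ('R', 'JPEG data')
```

In both cases the path through the transcoder is assembled from triggers at send time, so no one has to configure a route between services in advance.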

3-44

Heterogeneous Multicast

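Both receivers and senders can specify operations to be performed on data

[Figure: the MPEG sender issues send(ID, data). Trigger (ID, (ID_MPEG/JPEG, R1)) rewrites one copy to send((ID_MPEG/JPEG, R1), data), which passes through the transcoder S_MPEG/JPEG and reaches receiver R1 (JPEG) as send(R1, data); trigger (ID, R2) forwards another copy directly to receiver R2 (JPEG) as send(R2, data).]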

3-45

Discussion of I3

How would receiver signal ACK to sender? what is needed?

Does many-to-one fit well in this paradigm?

Security, snooping, information gathering: what are the issues?

In-network storage to handle disconnection?

3-46

Indirection: Summary

We’ve seen indirection used in many ways: multicast, mobility, Internet indirection

Uses of indirection:

sender does not need to know receiver ID; do not want sender to know intermediary identities

elegant

transparency of indirection is important

performance: is it more efficient?

security: important open issue for I3
