OpenDHT: A Public DHT Service
Sean C. Rhea
UC Berkeley
June 2, 2005
Joint work with: Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy,
Scott Shenker, Ion Stoica, and Harlan Yu
Sean C. Rhea OpenDHT: A Public DHT Service June 2, 2005
Peer-to-Peer File Sharing
• Very simple insight
  – Most computers unused most of the time
• Idea: harness this spare capacity to
  – Quickly download music files [Napster, Gnutella]
  – Search for aliens [SETI@Home]
  – Make free long-distance phone calls [Skype]
• Question: how to find desired resource(s)?
  – Early approaches: scoped flooding
  – Downsides: scalability, accuracy

A Better Search Facility: The Distributed Hash Table (DHT)
• Same interface as a programmatic hash table:
  – put(key, value) stores value under key
  – get(key) returns the value(s) stored under key
• But shared across many machines
• Implemented via an overlay network
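
A minimal sketch of the put/get interface listed above, using one in-memory Python dictionary in place of the overlay network. The class name LocalHashTable is hypothetical and this is not the OpenDHT client API; it only illustrates that get returns every value stored under a key.

```python
from collections import defaultdict


class LocalHashTable:
    """Single-machine stand-in for a DHT (hypothetical, for illustration)."""

    def __init__(self):
        # Each key maps to a list because several values may share one key.
        self._values = defaultdict(list)

    def put(self, key, value):
        """Store value under key, keeping any values already there."""
        self._values[key].append(value)

    def get(self, key):
        """Return a copy of the value(s) stored under key (empty if none)."""
        return list(self._values.get(key, []))


table = LocalHashTable()
table.put("k1", "v1")
table.put("k1", "v2")
print(table.get("k1"))       # both values stored under k1
print(table.get("missing"))  # nothing stored under this key
```

In a real DHT the dictionary would be partitioned across many machines, and put and get would be routed through the overlay to the machine responsible for the key.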

[Figure: each machine holds part of the key/value table. put(k1, v1) is routed to the machine that stores (k1, v1); get(k1) is routed to that machine, which returns v1.]

DHTs and File Sharing: DHT Stores Pointers to Files
[Figure: a host that has a file stores a pointer to it with put(file, IP). A client calls get(file), receives the IP, and transfers the file over HTTP.]
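
The pointer scheme in this figure might look like the sketch below. A plain dictionary stands in for the DHT, and the helper download_url, the example address, and the URL layout are all assumptions made for illustration; the slides only say that the value is an IP and that the transfer happens over HTTP.

```python
pointers = {}  # stand-in for the DHT: file name -> set of IP addresses


def put(file_name, ip):
    """Advertise that the host at ip holds a copy of file_name."""
    pointers.setdefault(file_name, set()).add(ip)


def get(file_name):
    """Return the IP addresses advertised for file_name, sorted."""
    return sorted(pointers.get(file_name, set()))


def download_url(file_name):
    """Pick one advertised host and build an HTTP URL (layout is assumed)."""
    hosts = get(file_name)
    if not hosts:
        return None
    return f"http://{hosts[0]}/{file_name}"


put("song.mp3", "192.0.2.10")
print(download_url("song.mp3"))  # http://192.0.2.10/song.mp3
```

The DHT holds only the small pointer; the file itself never passes through it.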

DHTs and Spam Detection: Detecting Similar Messages
[Figure: hosts that receive the message “I love you!” each call put(hash(msg), IP). As a second and then a third copy arrives, the entries accumulate under the same key and the message is flagged: “Something’s fishy!”]
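
A rough sketch of the idea in these slides, again with a dictionary standing in for the DHT. The function names and the threshold of three reports are assumptions, not values from the talk. Note that hashing the whole message with SHA-1, as done here, only catches identical messages; detecting merely similar ones would need a different fingerprint.

```python
import hashlib
from collections import defaultdict

table = defaultdict(set)  # stand-in for the DHT: key -> set of values


def put(key, value):
    table[key].add(value)


def get(key):
    return set(table.get(key, set()))


def report(msg, recipient_ip):
    """Record that recipient_ip received msg; return the number of distinct reporters."""
    key = hashlib.sha1(msg.encode("utf-8")).hexdigest()
    put(key, recipient_ip)
    return len(get(key))


def is_fishy(msg, recipient_ip, threshold=3):
    """Flag msg once at least `threshold` hosts have reported it (threshold is assumed)."""
    return report(msg, recipient_ip) >= threshold


ips = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
print([is_fishy("I love you!", ip) for ip in ips])  # [False, False, True]
```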

More DHT Applications
• Distributed Storage Systems
  – CFS, HiveCache, PAST, Pastiche
  – OceanStore / Pond
• Content Distribution Networks / Web Caches
  – Bslash, Coral, Squirrel
• Indexing / Naming Systems
  – Chord-DNS, CoDoNS, DOA, SFR
• Internet Query Processors
  – Catalogs, PIER
• Communication Systems
  – Bayeux, i3, MCAN, SplitStream

Some Areas of DHT Research
• Better routing protocols
  – One-hop, degree-optimal
• Load balancing
  – Non-uniform key distributions
• Security
  – Byzantine fault-tolerant routing
• Data redundancy and fault tolerance
  – Replication, erasure-coding
• Stronger semantics
  – Supporting read-modify-write

How Many DHTs Will There Be?
[Figure: machines participating in two DHTs, File Sharing and Spam Detection. Callouts across the builds: “Company Machine: Can’t Share Files”, “Owns Stock in Spam Company”, “Redundant Link”, “Unshared Links”.]

Benefits of Sharing a DHT
• Amortizes costs across applications
  – Maintenance bandwidth, connection state, etc.
• Facilitates “bootstrapping” of new applications
  – Working infrastructure already in place
• Allows for statistical multiplexing of resources
  – Takes advantage of spare storage and bandwidth
• Facilitates upgrading existing applications
  – “Share” DHT between application versions

Challenges in Sharing a DHT
• Robustness
  – Must be available 24/7
• Shared Interface Design
  – Should be general, yet easy to use
• Resource Allocation
  – Must protect against malicious/over-eager users
• Economics
  – What incentives are there to provide resources?

The DHT as a Service
[Figure: the DHT nodes are drawn as a single service labeled OpenDHT, with OpenDHT clients outside it.]
What is this interface?

The Traditional Interface: lookup
[Figure: nodes numbered 1 through 12 arranged in a ring; lookup(k) is routed toward key k.]
On reaching the successor of k, message passed to an “upcall”
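
The lookup-plus-upcall pattern can be sketched as follows. This is a toy model under stated assumptions: the Ring class, the spam_upcall handler, and the node identifiers 1, 4, 8, 12 are hypothetical, and hop-by-hop routing through the overlay is not modeled; the code jumps straight to the successor of k and runs the application's upcall there.

```python
import bisect


class Ring:
    """Toy ring of node identifiers; per-hop routing is not modeled."""

    def __init__(self, node_ids, upcall):
        self.node_ids = sorted(node_ids)
        self.upcall = upcall

    def successor(self, k):
        """Return the first node id >= k, wrapping past the largest id."""
        i = bisect.bisect_left(self.node_ids, k)
        return self.node_ids[i % len(self.node_ids)]

    def lookup(self, k, message):
        """Deliver message to the successor of k by invoking the upcall there."""
        return self.upcall(self.successor(k), k, message)


seen = {}  # node id -> set of keys that node has already handled


def spam_upcall(node, k, message):
    """Application code run at the successor: has this key been seen before?"""
    keys = seen.setdefault(node, set())
    repeated = k in keys
    keys.add(k)
    return node, repeated


ring = Ring([1, 4, 8, 12], spam_upcall)
print(ring.lookup(5, "I love you!"))  # handled at node 8, first sighting
print(ring.lookup(5, "I love you!"))  # handled at node 8, repeat
```

The upcall runs on whichever node owns the key, which is why every node would need the application's code.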

DHTs and Spam Detection: Detecting Similar Messages
[Figure: put(hash(msg), IP) for a second copy of “I love you!” reaches the node responsible for that key, where the upcall reports “I’ve seen this message before!” The message is flagged: “Something’s fishy!”]

Upcall Challenges
• Distribution
  – How do we get new upcall code to all nodes?
  – Active networking experience is a warning…
• Security
  – How do we safely run untrusted clients’ upcalls?
[Figure: lookup(k) arriving at the successor of k, annotated “How did the upcall code get here?”]

What about Put/Get?
• Works great for some applications
  – File sharing, for example
• What about applications with upcalls?
  – Our spam detection application, for example
• Idea: let application nodes run the upcalls
  – Each node only runs upcalls for the applications that it’s participating in

Upcall Example
[Figure: File Sharing and Spam Detection clients reach OpenDHT through put/get. Each spam detection node announces “I can handle spam detection messages”. A node that receives “I love you!” asks OpenDHT “Who’s handling hash(message)?” and passes the message to that node; when a second copy arrives the same way, the handling node concludes “Something’s fishy!”]
DHT keeps track of which nodes support which upcalls via Recursive Distributed Rendezvous (ReDiR)

ReDiR
• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)
[Figure: a tree with levels L0, L1, and L2, labeled H(namespace). Nodes A, B, C, D, and E join in turn at positions H(A) through H(E); by the last build the tree’s lists read “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, and “E”.]
• Join cost:
  – Worst case: O(log n) puts and gets
  – Average case: O(1) puts and gets
[Figure: lookups for H(k1), H(k2), and H(k3) in the same tree. The first finds a “successor” at once; the second meets one “no successor” before finding a successor; the third meets two.]
• Lookup cost:
  – Worst case: O(log n) gets
  – Average case: O(1) gets
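
To make join and lookup concrete, here is a deliberately simplified sketch built only on put and get. It is not the ReDiR algorithm as specified by the authors: the slides do not give the rules for which levels a node registers at, so this version registers every node at all three levels and has lookup walk from L2 up to L0 until it finds a successor. The 16-bit key space, the binary branching, the key format, and the LocalDHT class are all assumptions.

```python
import hashlib


class LocalDHT:
    """In-memory stand-in for a put/get DHT (not the OpenDHT client API)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        values = self._table.setdefault(key, [])
        if value not in values:
            values.append(value)

    def get(self, key):
        return list(self._table.get(key, []))


SPACE = 1 << 16  # assumed 16-bit identifier space
LEVELS = 3       # L0, L1, L2 as in the figure
BRANCH = 2       # assumed binary tree: level i has 2**i intervals


def h(name):
    """Hash a name to a 16-bit identifier (first two bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest()[:2], "big")


def interval_key(namespace, level, ident):
    """DHT key of the interval at `level` that contains `ident`."""
    width = SPACE // (BRANCH ** level)
    return f"{namespace}/{level}/{ident // width}"


def join(dht, namespace, node):
    """Simplified join: register node in its interval at every level."""
    ident = h(node)
    for level in range(LEVELS):
        dht.put(interval_key(namespace, level, ident), (ident, node))


def lookup(dht, namespace, ident):
    """Return the node whose identifier is the successor of ident."""
    for level in reversed(range(LEVELS)):  # start at L2, walk up toward L0
        members = dht.get(interval_key(namespace, level, ident))
        successors = [m for m in members if m[0] >= ident]
        if successors:
            return min(successors)[1]
    # No successor even at the root: wrap around to the smallest identifier.
    everyone = dht.get(interval_key(namespace, 0, 0))
    return min(everyone)[1] if everyone else None


dht = LocalDHT()
nodes = ["A", "B", "C", "D", "E"]
for name in nodes:
    join(dht, "spam-detection", name)
print(lookup(dht, "spam-detection", h("I love you!")))
```

Registering at every level costs a fixed three puts per join and up to three gets per lookup here. The O(1) average costs quoted on the slides come from the real protocol's rules for when to stop, which this sketch does not attempt to reproduce.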

ReDiR Performance (On PlanetLab)

OpenDHT Design Summary
• OpenDHT is a common infrastructure for
  – Storage of values, pointers, etc.
  – Organizing clients that handle application upcalls
• Benefits:
  – Amortizes maintenance costs across applications
  – Facilitates “bootstrapping” of new applications
  – Allows for statistical multiplexing of resources

Impact
Application             Uses OpenDHT for                Interface
Croquet Media Manager   replica location                put/get
DOA                     indexing                        put/get
HIP                     name resolution                 put/get
Tetherless Computing    host mobility                   put/get
Place Lab               range queries                   ReDiR
QStream                 multicast tree construction     ReDiR
VPN Index               indexing                        put/get
FreeDB                  storage                         put/get
Instant Messaging       rendezvous                      put/get
CFS                     storage                         put/get
i3                      redirection                     ReDiR

Future Work
• OpenDHT makes a great common substrate for:
  – Soft-state storage
  – Naming and rendezvous
• Many P2P applications also need to:
  – Traverse NATs
  – Redirect packets within the infrastructure (as in i3)
  – Refresh puts while intermittently connected
• All of these can be implemented with upcalls
  – Who provides the machines that run the upcalls?

Future Work
• We don’t want to add upcalls to the core DHT
  – Keep the main service simple, fast, and robust
• Can we build a separate upcall service?
  – Some other set of machines organized with ReDiR
  – Security: can only accept incoming connections, can’t write to local storage, etc.
• This should be enough to implement
  – NAT traversal, reput service
  – Some (most?) packet redirection
• What about more expressive security policies?
For more information, see http://opendht.org/