
Peer-to-Peer Networking

Credit slides from J. Pang, B. Richardson, I. Stoica

2

Why Study P2P

Huge fraction of traffic on networks today: >= 50%!
Exciting new applications
Next level of resource sharing
Vs. timesharing, client-server
E.g., access 10's-100's of TB at low cost

3

Share of Internet Traffic

4

Number of Users

Others include BitTorrent, eDonkey, iMesh, Overnet, Gnutella

BitTorrent (and others) gaining share from FastTrack (Kazaa).

5

What is P2P used for?

Use resources of end-hosts to accomplish a shared task:
Typically share files
Play games
Search for patterns in data (SETI@home)

6

What’s new?

Taking advantage of resources at the edge of the network
Fundamental shift in computing capability
Increase in absolute bandwidth over WAN

7

Peer to Peer

Systems: Napster, Gnutella, KaZaA, BitTorrent, Chord

8

Key issues for P2P systems

Join/leave: How do nodes join/leave? Who is allowed?

Search and retrieval: How to find content? How are metadata indexes built, stored, and distributed?

Content distribution: Where is content stored? How is it downloaded and retrieved?

9

4 Key Primitives

• Join – how to enter/leave the P2P system?

• Publish – how to advertise a file?

• Search – how to find a file?

• Fetch – how to download a file?
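
To make the four primitives concrete, here is a minimal sketch of them as an abstract interface; the class and method names are illustrative, not taken from any real client:

```python
# Hypothetical interface for the four primitives (names are illustrative only).
from abc import ABC, abstractmethod

class P2PNode(ABC):
    @abstractmethod
    def join(self, bootstrap_addr):
        """Enter the system by contacting a known node or server."""

    @abstractmethod
    def leave(self):
        """Leave the system, gracefully or by simply disappearing."""

    @abstractmethod
    def publish(self, filename, local_path):
        """Advertise that this node can serve `filename`."""

    @abstractmethod
    def search(self, filename):
        """Return a list of peer addresses that hold `filename`."""

    @abstractmethod
    def fetch(self, peer_addr, filename):
        """Download `filename` directly from `peer_addr`."""
```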

10

Publish and Search

Basic strategies:
Centralized (Napster)
Flood the query (Gnutella)
Route the query (Chord)

Different tradeoffs depending on application: robustness, scalability, legal issues

11

Napster: History

In 1999, S. Fanning launches Napster
Peaked at 1.5 million simultaneous users
July 2001, Napster shuts down

12

Napster: Overview

Centralized database:
Join: on startup, client contacts the central server
Publish: client reports its list of files to the central server
Search: query the server => server returns someone that stores the requested file
Fetch: get the file directly from that peer
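
A toy sketch of the centralized-index idea (a single in-memory index, not Napster's actual wire protocol):

```python
from collections import defaultdict

class CentralIndexServer:
    """Toy central index: filename -> set of peer addresses that hold it."""

    def __init__(self):
        self.index = defaultdict(set)

    def publish(self, peer_addr, filenames):
        # Join + Publish: on startup a peer reports its file list.
        for name in filenames:
            self.index[name].add(peer_addr)

    def search(self, filename):
        # Search: the server answers from its index with peers holding the file.
        return sorted(self.index.get(filename, ()))

# Usage: peers register, then query; the fetch itself happens peer-to-peer.
server = CentralIndexServer()
server.publish("123.2.21.23", ["X", "Y", "Z"])
print(server.search("X"))   # ['123.2.21.23'] -> fetch X directly from that peer
```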

13

Napster: Publish

[Diagram: a peer at 123.2.21.23 announces "I have X, Y, and Z!" and publishes insert(X, 123.2.21.23), ... to the central server]

14

Napster: Search

[Diagram: a peer asks the central server "Where is file A?"; the query reply is search(A) --> 123.2.0.18, and the peer then fetches file A directly from 123.2.0.18]

15

Napster: Discussion

Pros:
Simple
Search scope is O(1)
Controllable (pro or con?)

Cons:
Server maintains O(N) state
Server does all processing
Single point of failure

16

Gnutella: History

In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella

Soon many other clients: Bearshare, Morpheus, LimeWire, etc.

In 2001, many protocol enhancements including “ultrapeers”

17

Gnutella: Overview

Query flooding:
Join: on startup, client contacts a few other nodes; these become its "neighbors"
Publish: no need
Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender
Fetch: get the file directly from the peer
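
A simplified sketch of query flooding over a neighbor graph, with a TTL to bound propagation; node and message formats are invented for illustration, and replies here go straight back to the querier rather than retracing the query path:

```python
class GnutellaNode:
    """Toy flooding search: forward the query to neighbors until the TTL expires."""

    def __init__(self, addr, files):
        self.addr = addr
        self.files = set(files)
        self.neighbors = []      # other GnutellaNode objects
        self.seen = set()        # query ids already handled, to avoid loops

    def search(self, query_id, filename, reply_to, ttl=4):
        if query_id in self.seen:
            return
        self.seen.add(query_id)
        if filename in self.files:
            reply_to.on_reply(filename, self.addr)
        if ttl > 0:
            for nbr in self.neighbors:
                nbr.search(query_id, filename, reply_to, ttl - 1)

    def on_reply(self, filename, holder_addr):
        print(f"{self.addr}: {filename} is at {holder_addr}; fetch it directly")

# Usage sketch: A -> B -> C, where only C has the file.
a, b, c = GnutellaNode("A", []), GnutellaNode("B", []), GnutellaNode("C", ["fileA"])
a.neighbors, b.neighbors = [b], [c]
a.search(query_id=1, filename="fileA", reply_to=a)
```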

18

Gnutella: Search

[Diagram: a peer floods the query "Where is file A?" to its neighbors; nodes that have file A ("I have file A.") send replies back toward the querier]

19

Gnutella: Discussion

Pros:
Fully decentralized
Search cost distributed

Cons:
Search scope is O(N)
Search time is O(???)
Nodes leave often, network unstable

20

Aside: Search Time?

21

Aside: All Peers Equal?

[Diagram: peers with very different access links: 56 kbps modems, 1.5 Mbps DSL, 10 Mbps LAN]

22

Aside: Network Resilience

[Figures: partial topology after node failures (random: 30% die; targeted: 4% die), from Saroiu et al., MMCN 2002]

23

KaZaA: History

In 2001, KaZaA created by Dutch company Kazaa BV

Single network called FastTrack used by other clients as well: Morpheus, giFT, etc.

Eventually protocol changed so other clients could no longer talk to it

Most popular file sharing network today with >10 million users (number varies)

24

KaZaA: Overview

“Smart” query flooding:
Join: on startup, client contacts a “supernode”; may at some point become one itself
Publish: send list of files to supernode
Search: send query to supernode; supernodes flood the query amongst themselves
Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers
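
A sketch of the supernode idea under these assumptions: ordinary peers publish to one supernode, and supernodes flood queries only among themselves (the classes and the TTL are illustrative):

```python
class Supernode:
    """Toy supernode: indexes its clients' files and floods queries to peer supernodes."""

    def __init__(self):
        self.index = {}            # filename -> list of client addresses
        self.peer_supernodes = []

    def publish(self, client_addr, filenames):
        # Ordinary peers send their file lists here on join.
        for name in filenames:
            self.index.setdefault(name, []).append(client_addr)

    def search(self, filename, ttl=2):
        # The client can fetch simultaneously from several of the returned peers.
        hits = list(self.index.get(filename, []))
        if ttl > 0:
            for sn in self.peer_supernodes:
                hits += sn.search(filename, ttl - 1)
        # Dedupe: with cycles in the supernode graph the same hit can arrive twice.
        return list(dict.fromkeys(hits))
```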

25

KaZaA: Network Design

[Diagram: two-tier overlay in which ordinary peers attach to “super nodes”]

26

KaZaA: File Insert

[Diagram: a peer at 123.2.21.23 announces "I have X!" and publishes insert(X, 123.2.21.23), ... to its supernode]

27

KaZaA: File Search

[Diagram: the query "Where is file A?" goes to a supernode; supernodes flood it among themselves, and replies search(A) --> 123.2.0.18 and search(A) --> 123.2.22.50 come back]

28

KaZaA: Discussion

Pros:
Tries to take into account node heterogeneity:
• Bandwidth
• Host computational resources
• Host availability (?)
Rumored to take into account network locality

Cons:
Mechanisms easy to circumvent
Still no real guarantees on search scope or search time

29

P2P systems

Napster: launched P2P; centralized index

Gnutella: focus is simple sharing, using simple flooding

Kazaa: more intelligent query routing

BitTorrent: focus on download speed, fairness in sharing

30

BitTorrent: History

In 2002, B. Cohen debuted BitTorrent
Key motivation:
Popularity exhibits temporal locality (flash crowds)
E.g., Slashdot effect, CNN on 9/11, new movie/game release
Focused on efficient fetching, not searching:
Distribute the same file to all peers
Single publisher, multiple downloaders
Has some "real" publishers: Blizzard Entertainment using it to distribute the beta of their new game

31

BitTorrent: Overview

Swarming:
Join: contact centralized "tracker" server, get a list of peers
Publish: run a tracker server
Search: out-of-band, e.g., use Google to find a tracker for the file you want
Fetch: download chunks of the file from your peers; upload chunks you have to them
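
A highly simplified sketch of swarming: announce to a tracker to learn about peers, then repeatedly grab chunks you are missing from peers that have them while serving your own chunks; the classes below are illustrative, not the real protocol:

```python
import random

class Tracker:
    """Toy tracker: remembers which peers are in the swarm for one file."""

    def __init__(self):
        self.swarm = []

    def announce(self, peer):
        self.swarm.append(peer)
        return [p for p in self.swarm if p is not peer]   # the other peers

class Peer:
    def __init__(self, name, num_chunks, has_all=False):
        self.name = name
        self.num_chunks = num_chunks
        self.chunks = set(range(num_chunks)) if has_all else set()

    def download(self, tracker):
        peers = tracker.announce(self)
        while len(self.chunks) < self.num_chunks:
            missing = set(range(self.num_chunks)) - self.chunks
            candidates = [(p, c) for p in peers for c in missing if c in p.chunks]
            if not candidates:
                break                                   # nobody has what we need yet
            peer, chunk = random.choice(candidates)     # real clients prefer rarest-first
            self.chunks.add(chunk)                      # uploading is implicit here: other
                                                        # peers read our chunk set the same way

# Usage: one seed with every chunk, one leecher that joins and downloads.
tracker = Tracker()
tracker.announce(Peer("seed", num_chunks=4, has_all=True))
leecher = Peer("leecher", num_chunks=4)
leecher.download(tracker)
print(sorted(leecher.chunks))   # [0, 1, 2, 3]
```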

32

33

BitTorrent: Sharing Strategy

Employ “tit-for-tat” sharing strategy
"I'll share with you if you share with me"
Be optimistic: occasionally let freeloaders download
• Otherwise no one would ever start!
• Also allows you to discover better peers to download from when they reciprocate
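
A sketch of the choking decision implied by tit-for-tat: keep uploading to the peers that have recently uploaded the most to you, plus one optimistic slot for a random other peer (the slot count and names are assumptions):

```python
import random

def choose_unchoked(peers, upload_rate_to_me, regular_slots=3):
    """Tit-for-tat choking sketch.

    peers: list of peer ids; upload_rate_to_me: dict peer id -> observed upload rate.
    Unchoke the best recent uploaders, plus one random optimistic slot so that
    newcomers (and freeloaders) occasionally get data and better partners can be
    discovered when they reciprocate.
    """
    best = sorted(peers, key=lambda p: upload_rate_to_me.get(p, 0.0), reverse=True)
    unchoked = set(best[:regular_slots])
    others = [p for p in peers if p not in unchoked]
    if others:
        unchoked.add(random.choice(others))   # optimistic unchoke
    return unchoked

# Example: peers "a" and "b" reciprocate, the rest do not.
print(choose_unchoked(["a", "b", "c", "d", "e"], {"a": 50.0, "b": 20.0}))
```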

34

BitTorrent: Summary

Pros:
Works reasonably well in practice
Gives peers incentive to share resources; avoids freeloaders

Cons:
Central tracker server needed to bootstrap swarm (is this really necessary?)

35

DHT: History

In 2000-2001, academic researchers said "we want to play too!"
Motivation:
Frustrated by popularity of all these "half-baked" P2P apps :)
We can do better! (so we said)
Guaranteed lookup success for files in the system
Provable bounds on search time
Provable scalability to millions of nodes

Hot Topic in networking ever since

36

DHT: Overview

Abstraction: a distributed "hash table" (DHT) data structure:
put(id, item); item = get(id);

Implementation: nodes in the system form a distributed data structure
Can be a ring, tree, hypercube, skip list, butterfly network, ...
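
A minimal sketch of the put/get abstraction, with the node responsible for a key chosen by hashing onto a ring; this toy keeps all nodes in one process and skips the O(log n) routing a real DHT would do:

```python
import hashlib
from bisect import bisect_left

def ring_id(key, bits=32):
    """Hash a string onto a 2^bits identifier ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** bits)

class ToyDHT:
    """All nodes known locally; shows put/get placement, not O(log n) routing."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.store = {n: {} for n in node_names}

    def _owner(self, key):
        ids = [nid for nid, _ in self.nodes]
        # First node clockwise from the key's id, wrapping around the ring.
        i = bisect_left(ids, ring_id(key)) % len(self.nodes)
        return self.nodes[i][1]

    def put(self, key, item):
        self.store[self._owner(key)][key] = item

    def get(self, key):
        return self.store[self._owner(key)].get(key)

dht = ToyDHT(["node-A", "node-B", "node-C"])
dht.put("song.mp3", "<file bytes>")
print(dht.get("song.mp3"))   # '<file bytes>', served by whichever node owns the key
```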

37

DHT: Overview (2)

Structured overlay routing:
Join: on startup, contact a "bootstrap" node and integrate yourself into the distributed data structure; get a node id
Publish: route the publication for a file id toward a close node id along the data structure
Search: route a query for the file id toward a close node id; the data structure guarantees that the query will meet the publication
Fetch: P2P fetching

Tapestry: a DHT-based P2P system focusing on fault-tolerance

Danfeng (Daphne) Yao

39

A very big picture – P2P networking

Observe:
Dynamic nature of the computing environment
Network expansion

Require:
Robustness, or fault tolerance
Scalability
Self-administration, or self-organization

40

Goals to achieve in Tapestry

Decentralization: routing uses local data
Efficiency, scalability
Robust routing mechanism: resilient to node failures
An easy way to locate objects

41

IDs in Tapestry

NodeID, ObjectID
IDs are computed using a hash function
The hash function is a system-wide parameter
Use IDs for routing:
Locating an object
Finding a node
A node stores its neighbors' IDs
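
A sketch of deriving fixed-length octal IDs from names with a single system-wide hash function; SHA-1, the digit count, and the base are assumptions for illustration:

```python
import hashlib

def tapestry_id(name, digits=8, base=8):
    """Hash a node address or object name to a fixed-length base-`base` ID string."""
    h = int(hashlib.sha1(name.encode()).hexdigest(), 16)
    out = []
    for _ in range(digits):
        out.append(str(h % base))
        h //= base
    return "".join(reversed(out))

print(tapestry_id("node-123.2.21.23"))   # an 8-digit octal node ID
print(tapestry_id("some-object"))        # object IDs come from the same hash function
```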

42

Tapestry routing example: routing toward ID 03254598

Suffix-based routing: each hop resolves one more trailing digit of the destination ID (...8, then ...98, then ...598, and so on)
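
A sketch of one suffix-routing hop, assuming a neighbor map laid out as one row per suffix length: at each hop the message moves to a neighbor that matches one more trailing digit of the destination ID:

```python
def next_hop(current_id, dest_id, neighbor_map):
    """One suffix-routing hop; current_id and dest_id are equal-length digit strings.

    neighbor_map[level][digit] is the ID of a neighbor that already shares `level`
    trailing digits with current_id and has `digit` in the next position
    (None if no such neighbor is known).
    """
    # Count how many trailing digits already agree with the destination.
    level = 0
    while level < len(dest_id) and current_id[-(level + 1)] == dest_id[-(level + 1)]:
        level += 1
    if level == len(dest_id):
        return None                            # this node is the destination
    wanted_digit = dest_id[-(level + 1)]
    return neighbor_map[level][wanted_digit]   # fixes one more suffix digit per hop
```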

43

Each object is assigned a root node

Root node: the unique node for object O with the longest matching suffix (ideally rootId = objectId); the root knows where O is

Publish: a node S holding a copy of O routes a msg <objectID(O), nodeID(S)> to the root of O
If multiple copies exist, the closest location is stored
Location query: a msg for O is routed towards O's root
How to choose the root node for an object?

44

Surrogate routing: unique mapping without global knowledge

Problem: how to deterministically choose a unique root node for an object

Surrogate routing:
Choose as the object's root (surrogate) node the node whose nodeId is closest to the objectId

Small overhead, but no need for global knowledge
The number of additional hops is small
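
A sketch of surrogate selection when the exact neighbor-map slot for the next digit is empty; the deterministic probing order (next higher digit value, wrapping around) is an assumption, but the point is that every node probes in the same order, so all routes for an objectId converge on the same surrogate root:

```python
def surrogate_next_hop(level, wanted_digit, neighbor_map, base=8):
    """Pick a next hop when the exact slot for `wanted_digit` may be empty.

    Probe digit values in a fixed order (wanted digit, then the next higher values,
    wrapping around) and take the first filled slot. Because every node probes in
    the same order, all routes for a given objectId end at the same surrogate root.
    """
    for offset in range(base):
        digit = str((int(wanted_digit) + offset) % base)
        candidate = neighbor_map[level][digit]
        if candidate is not None:
            return candidate
    return None   # no neighbors at this level: the current node is the surrogate root
```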

45

Surrogate example: find the root node for objectId = 6534

[Diagram: the message is routed toward 6534 through nodes 2234, 1114, and 2534; the neighbor-map slots for 6534 are empty, so routing continues to node 7534, which becomes the object's surrogate root]

46

Tapestry: adaptable, fault-resilient, self-managing

Basic routing and location:
Backpointer list in the neighbor map
Multiple <objectId(O), nodeId(S)> mappings are stored
More semantic flexibility: a selection operator for choosing among returned objects

47

A single Tapestry node

Neighbor map for node “1732” (octal), one routing level per suffix digit:
Level 1: slots xxx0 ... xxx7 (the slot for suffix 2 holds 1732 itself)
Level 2: slots xx02 ... xx72 (the slot for suffix 32 holds 1732 itself)
Level 3: slots x032 ... x732 (the slot for suffix 732 holds 1732 itself)
Level 4: slots 0732 ... 7732 (the slot for 1732 is the node itself)

Each node also keeps:
Object location pointers: <objectId, nodeId>
Hotspot monitor: <objId, nodeId, freq>
Object store

48

Fault-tolerant routing: detect, operate, and recover

Expected faults:
Server outages (high load, hw/sw faults)
Link failures (router hw/sw faults)
Neighbor table corruption at the server

Detect: TCP timeouts, heartbeats to nodes on the backpointer list
Operate under faults: backup neighbors
Second chance for a failed server: probing msgs
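
A sketch of operating under faults with backup neighbors: each neighbor-map slot keeps a primary plus backups and routes to the first candidate that has been heard from recently (the timeout and the "second chance" handling are simplified assumptions):

```python
import time

class NeighborSlot:
    """One neighbor-map slot holding a primary route plus backup neighbors."""

    HEARTBEAT_TIMEOUT = 10.0   # seconds without a heartbeat before suspecting failure

    def __init__(self, primary, backups):
        self.candidates = [primary] + list(backups)
        self.last_heartbeat = {n: time.monotonic() for n in self.candidates}

    def heard_from(self, node):
        # Called when a heartbeat (or any traffic) arrives from this neighbor.
        self.last_heartbeat[node] = time.monotonic()

    def route_target(self):
        now = time.monotonic()
        for node in self.candidates:           # primary first, then the backups
            if now - self.last_heartbeat[node] < self.HEARTBEAT_TIMEOUT:
                return node
        return None                            # every candidate is suspected down
```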

49

Fault-tolerant location: avoid a single point of failure

Multiple roots for each object: rootId = f(objectId + salt)
Redundant, but reliable

Storage servers periodically republish location info
Cached location info on routers times out
New and recovered objects periodically advertise their location info
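
A sketch of deriving multiple salted roots for one object (rootId = f(objectId + salt)); the hash function and number of roots are assumptions:

```python
import hashlib

def root_ids(object_id, num_roots=3, digits=8, base=8):
    """Derive several root IDs for one object by salting the hash of its ID."""
    roots = []
    for salt in range(num_roots):
        h = int(hashlib.sha1(f"{object_id}:{salt}".encode()).hexdigest(), 16)
        out = []
        for _ in range(digits):
            out.append(str(h % base))
            h //= base
        roots.append("".join(reversed(out)))
    return roots

print(root_ids("6534"))   # publish the object's location toward each of these roots
```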

50

Dynamic node insertion

Populate the new node's neighbor maps:
Route messages to its own nodeId
Copy and optimize neighbor maps

Inform relevant nodes of its entries so they update their neighbor maps

51

Dynamic deletion

Inform relevant nodes using backpointers
Or rely on soft state (stop sending heartbeats)
In general, Tapestry expects a small number of dynamic insertions/deletions

52

Fault handling outline

Design choice: soft state for graceful fault recovery
Soft-state caches: updated by periodic refresh msgs, or purged if no such msgs arrive
Faults are expected in Tapestry

Fault-tolerant routing
Fault-tolerant location
Surrogate routing
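
A sketch of a soft-state cache: entries stay alive only while periodic refresh messages keep arriving, and are purged otherwise (the TTL value is an assumption):

```python
import time

class SoftStateCache:
    """Entries live only while refresh messages keep arriving; stale entries are purged."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self.entries = {}            # key -> (value, last refresh time)

    def refresh(self, key, value):
        # Called whenever a periodic republish/refresh message arrives.
        self.entries[key] = (value, time.monotonic())

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        value, stamp = item
        if time.monotonic() - stamp > self.ttl:
            del self.entries[key]    # no refresh seen recently: assume the publisher is gone
            return None
        return value
```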

53

Tapestry Conclusions

Decentralized location and routing
Distributed algorithms for object-root mapping, node insertion/deletion
Fault handling with redundancy
Per-node routing table size: b * log_b(N)
• N = size of the namespace
Find an object in log_b(n) overlay hops
• n = # of physical nodes
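
As a back-of-the-envelope check of these bounds, assuming octal digits (b = 8), 8-digit IDs, and an assumed 100,000 physical nodes:

```python
import math

b = 8                      # octal digits, as in the node 1732 example
N = b ** 8                 # size of the ID namespace (8-digit octal IDs)
n = 100_000                # assumed number of physical nodes

table_entries = b * math.log(N, b)   # b * log_b(N) = 8 * 8 = 64 slots per node
lookup_hops = math.log(n, b)         # log_b(n) ~= 5.5 overlay hops to find an object
print(table_entries, lookup_hops)
```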