peer to peer network schemes and finding algorithms

Post on 11-Apr-2017

Searching in P2P networks

Mohamed Elsharnouby - Istanbul Sehir University

P2P networks

Structured:

- CAN
- Chord
- Tapestry
- Pastry
- Viceroy

Unstructured:

- Freenet
- Gnutella
- BitTorrent

Structured

Pros:

- Can find any resource, even a rare one
- Search is more efficient because it exploits the structure

Cons:

- Not as robust or resilient as unstructured networks
- Overhead of maintaining the structure as peers join and leave

Unstructured

Pros:

- More resilient to failures
- Better handling of joining/leaving peers
- Allows better optimization of routing by changing the overlay structure

Cons:

- Rare resources are harder to find, if they are found at all
- Searching can flood and overload the whole network

Search in Structured Networks

Content Addressable Network (CAN)

Multidimensional Cartesian coordinate space on a multi-torus

Each peer has a neighbour list

Routing performance is O(d × N^(1/d)), where d is the number of dimensions and N is the number of peers
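To make the routing idea concrete, here is a minimal sketch of greedy forwarding on a torus. It assumes a simplified CAN in which every peer owns exactly one unit cell of a grid with `size` cells per axis, so each peer has 2 × d neighbours; the function names are illustrative and not taken from the CAN design.

```python
def torus_delta(a, b, size):
    """Shortest distance between coordinates a and b on a ring of `size` cells."""
    diff = abs(a - b) % size
    return min(diff, size - diff)


def distance(p, q, size):
    """Manhattan distance between two cells on the torus."""
    return sum(torus_delta(a, b, size) for a, b in zip(p, q))


def neighbours(p, size):
    """The 2*d neighbours of cell p: one step up or down each axis, with wrap-around."""
    result = []
    for axis in range(len(p)):
        for step in (-1, 1):
            q = list(p)
            q[axis] = (q[axis] + step) % size
            result.append(tuple(q))
    return result


def greedy_route(src, dst, size):
    """Forward hop by hop to the neighbour that is closest to the destination."""
    path = [src]
    current = src
    while current != dst:
        current = min(neighbours(current, size),
                      key=lambda n: distance(n, dst, size))
        path.append(current)
    return path


if __name__ == "__main__":
    # The wrap-around makes (7, 7) only two hops away from (0, 0) on an 8 x 8 torus.
    print(greedy_route((0, 0), (7, 7), 8))
```

In this simplified grid the longest path per axis is about half the axis length, which is where the N^(1/d) factor comes from.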

Joining: by splitting an existing peer’s zone in half

Neighbour list: transferred from the old peer - updated for all neighbouring peers

Leaving: a neighboring peer takes over its space and the neighbour lists are updated
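The join step can be sketched as a zone split. This is only an illustration: it assumes a zone is a tuple of `(low, high)` intervals, one per dimension, and that the caller picks the axis to split; `split_zone` and `contains` are hypothetical helper names.

```python
def split_zone(zone, axis):
    """Split a zone in half along `axis`; return (zone kept by old peer, zone for new peer)."""
    low, high = zone[axis]
    mid = (low + high) / 2
    kept = list(zone)
    given = list(zone)
    kept[axis] = (low, mid)
    given[axis] = (mid, high)
    return tuple(kept), tuple(given)


def contains(zone, point):
    """True if the point falls inside the zone (low inclusive, high exclusive)."""
    return all(low <= x < high for (low, high), x in zip(zone, point))


if __name__ == "__main__":
    whole = ((0.0, 1.0), (0.0, 1.0))
    old_zone, new_zone = split_zone(whole, axis=0)
    print(old_zone, new_zone)
    # A key hashed to (0.7, 0.2) now belongs to the joining peer.
    print(contains(new_zone, (0.7, 0.2)))
```

After such a split, keys whose coordinates fall in the new zone would be handed to the joining peer, and both peers' neighbour lists would be updated as the slide describes.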

CAN improvement

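Multiple coordinate spaces (realities): each peer owns a different zone in every reality, while each data item maps to the same point in all of them

Increasing dimensions: gives shorter routing paths. Multiple realities and more dimensions complement each other, so both are needed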

Overloading zones: more data availability - fault tolerance - shorter routing

Topological awareness of IP network

Using multiple hash functions: increases data availability

Chord

Peers are organized around a circle according to their ID, an m-bit identifier assigned by a uniform hash function

Each data item is given an ID on the same circle and is stored at its successor peer, the first peer whose ID is equal to or follows the item’s ID

Routing takes O(log N) if peer information is up to date
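The mapping of items to successor peers can be sketched in a few lines. The sketch assumes SHA-1 as the uniform hash and a small `m` for readability; `chord_id` and `successor` are illustrative names, and the peer IDs in the example are made up.

```python
import hashlib
from bisect import bisect_left


def chord_id(name, m=16):
    """Map a name to an m-bit identifier on the circle (SHA-1 is an assumption here)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(sorted_peer_ids, key_id):
    """First peer whose ID is equal to or follows key_id, wrapping around the circle."""
    i = bisect_left(sorted_peer_ids, key_id)
    return sorted_peer_ids[i % len(sorted_peer_ids)]


if __name__ == "__main__":
    peers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # example IDs with m = 6
    print(successor(peers, 54))  # key 54 is stored at peer 56
    print(successor(peers, 60))  # wraps around to peer 1
    print(chord_id("some-file.txt"))
```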

Each peer carries a finger table with info on the peers that are successors of IDs at distances increasing by powers of two [ hence the O(log N) routing ]

Resilience is increased by maintaining another list of length r of the peer’s direct successors

Joining and leaving: successor pointers need to be updated, which is done by a stabilization protocol that runs periodically in the background
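The finger table and the lookup it enables can be sketched as follows. This is a simplified model that assumes every peer has an up-to-date global view when building its fingers, so it ignores stabilization and the successor list; the function names are illustrative.

```python
from bisect import bisect_left


def successor(peers, ident):
    """First peer ID equal to or following ident on the circle (peers must be sorted)."""
    i = bisect_left(peers, ident)
    return peers[i % len(peers)]


def finger_table(peers, n, m):
    """Finger i of peer n points at the successor of n + 2**i (mod 2**m)."""
    return [successor(peers, (n + 2 ** i) % (2 ** m)) for i in range(m)]


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the clockwise interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def lookup(peers, start, key, m):
    """Return the peers visited while resolving `key`, ending at the responsible peer."""
    path = [start]
    n = start
    while True:
        succ = successor(peers, (n + 1) % (2 ** m))
        if in_half_open(key, n, succ):
            path.append(succ)
            return path
        # Jump to the farthest finger that still precedes the key.
        nxt = succ
        for f in reversed(finger_table(peers, n, m)):
            if in_open(f, n, key):
                nxt = f
                break
        n = nxt
        path.append(n)


if __name__ == "__main__":
    peers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
    print(finger_table(peers, 8, 6))
    print(lookup(peers, 8, 54, 6))
```

Each jump roughly halves the remaining distance to the key, which is the intuition behind the logarithmic hop count.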

It needs O(log N) hops for routing, much better than CAN

A peer joining or leaving needs O(log² N) messages, which is worse than CAN, which requires O(2 × d)

Could borrow some of the CAN improvement ideas, such as multiple realities

Cannot take into account IP topology

Tapestry

The nth peer that the message reaches shares a suffix of at least length n with destination ID

Routing takes O(log_b N) where b is the base of IDs
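The suffix-matching rule can be illustrated with a small sketch. It assumes every peer can see all other peers (a real neighbour map is per-peer and per-digit) and that the destination is itself one of the peers; the IDs and function names are illustrative.

```python
def shared_suffix_len(a, b):
    """Number of trailing digits that the two IDs have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(peers, src, dst):
    """Resolve the destination ID one (or more) suffix digits per hop."""
    path = [src]
    current = src
    while current != dst:
        have = shared_suffix_len(current, dst)
        # Candidates match a strictly longer suffix; dst itself always qualifies.
        better = [p for p in peers if shared_suffix_len(p, dst) > have]
        current = min(better, key=lambda p: shared_suffix_len(p, dst))
        path.append(current)
    return path


if __name__ == "__main__":
    peers = ["0325", "B4F8", "9098", "7598", "4598"]
    print(route(peers, "0325", "4598"))
```

With IDs of length log_b N digits and at least one digit fixed per hop, the hop count is bounded by the ID length.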

Uses multiple roots for each data object to avoid single points of failure

Robustness is increased by making the neighbour map maintain two backup peers in addition to the primary ones

Pastry

Same as Tapestry

Doesn’t have optimization for locality of peers

Less efficient replication algorithm

Viceroy

- General ring: every node is connected to its successor and predecessor
- Level ring: every node is connected to the other nodes of its level in a ring
- Butterfly: every node at level L has:
  - a down-right edge to a long-range node
  - a down-left edge to a close-range node
  - an up edge to a close-range node

Routing performance is O(log N)

Search in Unstructured Networks

Freenet

It uses a steepest-ascent hill-climbing search with backtracking

It caches the found file on the peers along the path, which improves routing

Anonymity is one of the main properties of the network

Least Recently Used (LRU) is the basic cache replacement algorithm

An enhanced cache replacement algorithm could be used instead

Enhanced-clustering with Random Shortcut

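It uses the small-world concept: when the cache is full, it picks the farthest entry in the cache as the replacement candidate

If the newly added entry is closer, it replaces the candidate in the cache

If it is farther, it replaces the candidate only with a certain probability (the random shortcut)

The optimal choice of this probability is still an open question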
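The replacement rule can be sketched as below. Several details are assumptions made for illustration: keys are integers, distance is the absolute difference to a per-node reference key called `seed`, and `p` is the shortcut probability; none of these names come from the slides.

```python
import random


def replace_entry(cache, new_key, seed, p, rng):
    """Apply the rule to a full cache; return the evicted key, or None if nothing changed."""
    farthest = max(cache, key=lambda k: abs(k - seed))
    closer = abs(new_key - seed) <= abs(farthest - seed)
    # Clustering: keep closer keys. Random shortcut: accept a farther key with probability p.
    if closer or rng.random() < p:
        cache.remove(farthest)
        cache.append(new_key)
        return farthest
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    cache = [48, 52, 90]
    print(replace_entry(cache, 55, seed=50, p=0.03, rng=rng), cache)
```

Keeping mostly nearby keys produces the clustering, while the occasional far key plays the role of a long-range shortcut.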

Gnutella

Routing through the network is mainly done by flooding (BFS) with a TTL that limits the number of hops

This causes high overload of the network when too many nodes join
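A minimal sketch of TTL-limited flooding follows. It assumes the overlay is given as an adjacency dictionary and that `has_file` is a caller-supplied predicate; both are illustrative and not part of the Gnutella protocol itself. The message counter shows how traffic grows with the TTL.

```python
from collections import deque


def flood_search(graph, start, has_file, ttl):
    """Breadth-first flood from `start`; return (peers holding the file, messages sent)."""
    visited = {start}
    queue = deque([(start, ttl)])
    hits = []
    messages = 0
    while queue:
        node, remaining = queue.popleft()
        if has_file(node):
            hits.append(node)
        if remaining == 0:
            continue  # TTL expired: this peer does not forward the query
        for neighbour in graph[node]:
            messages += 1  # every forwarded copy costs bandwidth, even a duplicate
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits, messages


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(flood_search(graph, "a", lambda peer: peer == "d", ttl=2))
    print(flood_search(graph, "a", lambda peer: peer == "d", ttl=3))
```

A file that sits beyond the TTL horizon is simply not found, which is the rare-resource weakness of unstructured search.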

To join, a client connects to one of the peers and broadcasts its content, also by flooding

The concept of ultra peers, which have higher bandwidth, is introduced: they carry the network routing and search operations for their leaves

BitTorrent

A centralized P2P system

It cuts files into pieces of fixed size (256 KB each) and hashes each piece with SHA-1 to confirm the integrity of the data
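The piece-and-hash step can be sketched with Python's standard `hashlib`. The 256 KB piece size and SHA-1 come from the slide; `piece_hashes` and `verify_piece` are hypothetical helper names, and the sketch works on an in-memory byte string rather than a real torrent file.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # 256 KB, as stated on the slide


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Split the data into fixed-size pieces and return the SHA-1 digest of each piece."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def verify_piece(piece, index, hashes):
    """Check a downloaded piece against the expected digest for its position."""
    return hashlib.sha1(piece).digest() == hashes[index]


if __name__ == "__main__":
    data = bytes(600 * 1024)  # 600 KB of zeros: two full pieces and one short piece
    hashes = piece_hashes(data)
    print(len(hashes))
    print(verify_piece(data[:PIECE_SIZE], 0, hashes))
    print(verify_piece(b"corrupted", 0, hashes))
```

Because each piece is verified on its own, a client can accept pieces from different peers and discard only the ones that fail the check.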

A client needs to connect to a tracker, which gives the client a random set of peers that have the needed file

A downloaded piece can be seeded to other peers

The introduction of a DHT made trackerless BitTorrent possible

Questions?

Thank you
