Peer To Peer Systems Architecture & Research Overview by Shay Horovitz

Post on 19-Dec-2015

Peer To Peer Systems

Architecture & Research Overview

by

Shay Horovitz

Lecture contents

• What is Peer-to-Peer computing
• What makes it distinctive
• What are the potentials
• Examples of existing applications
• Possible applications
• Research item – Distributed trie
• Summary

Introduction

Peer-To-Peer Computing Today

Based on research of the Swedish Institute of Computer Science

Rise and Fall of P2P in Media

“Peer-to-peer is the next great thing for Internet” Stanford Law Net Guru – Lawrence Lessig, 2000

“Peer-to-peer computing is leading us into the 3rd age of the internet” Bob Knighton – Intel Corp, Fall 2000

“Is P2P plunging off the deep end” Wall Street Journal, April 2, 2001

“Does Peer-to-Peer Suck?” Jon Katz – Slashdot, April 4, 2001

What’s P2P computing ?

Webster definition of peer:

“one that is of equal standing with another”

P2P Computing – Computing between equals

What makes it distinctive ?

A class of applications that takes advantage of resources (storage, CPU, content) available at the edges of the network

Edges? Users and their PCs and devices, which are often:
• Without a permanent IP address
• Turned off!

The Contrast: Server-Centric

• The Client/Server paradigm
• The client is basically a glorified I/O device
• Information, control and computation are kept at the server
• Centralized systems are simpler to build

Client Server & ‘Problems.COM’

A typical Client-Server architecture:
• Server is a “Super Computer”
• Addresses/ports of servers are known
• MANY clients to ONE server
• Client is just a “Monitor”
• Server is down = network is down
• Server is expensive
• Scalability: more clients = more servers

Client Server Solutions

• Replication (many servers): expensive, needs synchronization, and much more…
• Brute force (faster server): expensive, limited scalability, single point of failure

P2P Topologies

• Centralized
• Hierarchical
• Decentralized
• Hash Circle
• Decentralized with Super Nodes

Topologies - Centralized

• Like Client-Server: many clients and one server entity (one server or a group of servers)
• Used in Napster
• Server acts like “144” (directory assistance): it just helps to initiate the communication
• Simple to design

Topologies - Centralized

Topologies - Centralized

Ways of action:
• The client sends the query to the server; the server asks everyone and responds to the client
• The client gets a list of clients from the server
• All clients send the IDs of the data they hold to the server; when a client asks for data, the server responds with specific addresses

Topologies - Hierarchical

• Servers are organized in a tree
• Suits communication between “hierarchical objects” like companies and organizations: P2P inside, Client-Server outside
• Suits security architectures like a Certificate Authority

Topologies - Hierarchical

Topologies - Hierarchical

Ways of action:
• Much like the Centralized topology
• Policy rules can be set at the level of servers
• A server sends queries to its ancestor when needed

Topologies - Decentralized

• It’s the “Pure” P2P topology
• No servers (well, maybe just one!)
• Topology changes as peers join and leave the network
• Mainly, the topology is based on the “logical” behavior of the peers

Topologies - Decentralized

Topologies - Decentralized

Ways of action:
• A peer sends requests to its “neighbors”
• Neighbors route the requests to their neighbors
• Many messages could be dropped, since “weak” peers might not work as fast as needed
• In the future, special algorithms will dictate the behavior of this topology

Topologies – Hash Circle

Mainly for file-sharing, storage-distribution

All resources are represented by a hash value

Only “Exact” searches are allowed

Topologies – Hash Circle

Topologies – Hash Circle

Ways of action:
• When a peer joins, it gets responsibility for part of the hash space
• Each peer knows its neighbors in the hash space and a few other randomly chosen peers
• Requests are forwarded to the node closest to the hash query
• Requires O(log N) forwards = low bandwidth
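
The responsibility rule can be sketched in a few lines of Python. This is only an illustration under assumptions the slides do not state: positions come from SHA-1 truncated to a small ring, and "closest" is taken to mean the first peer at or after the key's position (the successor rule used by Chord). The names ring_position and responsible_peer are hypothetical, and the O(log N) forwarding itself is not modelled.

```python
import hashlib
from bisect import bisect_left


def ring_position(name, bits=16):
    """Map a peer address or resource name to a point on a 2^bits hash circle."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def responsible_peer(peers, key, bits=16):
    """Return the peer whose position is the first at or after the key's position.

    Assumption: successor rule with wrap-around, as in Chord.
    """
    ring = sorted((ring_position(p, bits), p) for p in peers)
    positions = [pos for pos, _ in ring]
    # Past the last peer, wrap around to the first one on the circle.
    i = bisect_left(positions, ring_position(key, bits)) % len(ring)
    return ring[i][1]
```

Because only the hash of the exact name is looked up, this also shows why only “Exact” searches are possible on such a topology.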

Topologies – Decent’ + Super Nodes

• New topology
• Still no servers (not expensive ones at least)
• Used in iMesh, Kazaa
• Slow peers do not slow the search

Topologies – Decent’ + Super Nodes

Topologies – Decent’ + Super Nodes

Ways of action:
• A super node is a normal node that’s elected to act as a local server
• Usually super nodes are elected for their bandwidth
• Requests are forwarded from slow peers to super nodes

P2P Application

An application is P2P if it:
• Allows for variable connectivity & temporary network addresses
• Gives the nodes at the edges of the network significant autonomy

Another Point Of View

In P2P, peers in relation to each other act as: Clients AND Servers AND Routers AND Caches AND… EVERYTHING

What makes (made) it so hyped ?

• Industry looking for something positive after the .Com death
• Has large social consequences: the Internet has already changed society, and we can expect further changes
• Some very interesting applications became widely known and used: Napster, iMesh, Gnutella, FreeNet, Kazaa, Morpheus, CuteMX, Scour …

Potential of P2P

• Better resource utilization
• Scalability
• Fault-tolerance
• Denial of service tolerance

Research item

Efficient Peer-To-Peer Lookup Based on a Distributed Trie

Michael J. Freedman
MIT Lab for Computer Science

Radek Vingralek
InterTrust STAR Lab

Published - 2002

Until Now…

2 main approaches for lookup:
• Broadcast searches (Gnutella)
• Location-deterministic algorithms (Chord)

The new approach: Distributed Trie!

What’s a Trie ?

A trie is a tree that stores a string by representing each character in the string as an edge on the path from root to leaf

Trie example

Words :

•Then

•Them

•Those

•Toss

•Ball
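
The five words above can be stored in a small character trie. The sketch below is illustrative only: the class and method names are assumptions made for this example, and words are lower-cased so that "Then" and "Them" share the path t-h-e.

```python
class Trie:
    """Minimal character trie: each edge is one character of a stored word."""

    def __init__(self):
        self.children = {}    # character -> child Trie
        self.is_word = False  # True if a stored word ends at this node

    def insert(self, word):
        node = self
        for ch in word.lower():
            # Reuse the existing edge if present, otherwise create it.
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def contains(self, word):
        node = self
        for ch in word.lower():
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def count_nodes(self):
        return 1 + sum(child.count_nodes() for child in self.children.values())


trie = Trie()
for word in ["Then", "Them", "Those", "Toss", "Ball"]:
    trie.insert(word)
```

Shared prefixes are stored once: the 21 characters of the five words need only 15 nodes below the root.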

The lookup scales

Lookup efficiency

Maintenance cost

Efficient lookup methods

Replicating the lookup structure on every peer

BUT – slow maintenance

How to reduce maintenance costs ?

Reducing maintenance costs

1st approach : Eliminate the lookup structure – thus there is NO maintenance

Lookups are “broadcast-like” – costing efficiency and scalability

Implemented in Gnutella

Reducing maintenance costs

2nd approach : Partitioning the lookup structure

• Distribute subsets of partitions on each peer
• Peers update only a small number of replicas
• The system assigns partitions to peers by static assignment or dynamic assignment

Static partitions assignment

Each node replicates only partitions that are “Close” to its address

Dynamic partitions assignment

Each node replicates only partitions that are frequently accessed by the node

Relaxing the consistency

Update local lookups “Lazily”: only when a node actually gets a request for a key!

BUT – What do we get out of this ?

Relaxing

The Good

Should reduce maintenance cost, since we actually use fewer updates

Relaxing

The Bad

Peers hold stale replicas because of the “lazy” updates of the local lookup structures

Relaxing

and… the UGLY

Limit addressing errors by piggybacking the updates on other traffic

Nature of the proposed algorithms

Use dynamic partitioning based on peers’ access locality

Use Lazy updates to reduce maintenance cost

Piggyback trie state on lookup responses only

Use Timestamping to reconcile conflicting updates

Algorithms differences

Difference in the volume of the trie structure piggybacked

Algorithms differences

Difference in how aggressively the requester uses the partitions

Security in a trie…

• Who knows who the caller is?
• Who knows who the callee is?

The system model

Lookup ( key )

If successful, the callee sends the caller the value associated with the key; otherwise it sends a failure message

The system model

Insert ( key, value )

The callee inserts a < key, value > pair into its lookup structure

The system model

Join()

The callee sends to the caller initial state needed to bootstrap lookup operations
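
The three operations of the system model can be pictured as the interface a callee exposes. The following is a toy, single-process sketch: the Peer class, its fields and the shape of the failure message are assumptions for illustration, not the paper's wire protocol.

```python
class Peer:
    """Toy callee exposing the three operations of the system model."""

    def __init__(self, address):
        self.address = address
        self.storage = {}             # key -> value held locally
        self.root_table = [address]   # stand-in for the real bootstrap state

    def lookup(self, key):
        """Return (True, value) on success, or (False, None) as the failure message."""
        if key in self.storage:
            return True, self.storage[key]
        return False, None

    def insert(self, key, value):
        """Insert a <key, value> pair into the callee's lookup structure."""
        self.storage[key] = value

    def join(self):
        """Return the initial state a caller needs to bootstrap lookups."""
        return list(self.root_table)
```

A caller would first call join() on a known peer, then issue insert and lookup requests against the peers it learns about.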

Back to Maintenance…

• Update a value? NO!
• Delete a value? Oh NO!
• So what’s left? Re-insert!!!

• Can re-insert under the same key
• Can re-insert under other keys
• “Actual deletion” is made for old info, according to the timestamp value

A Closer Look at Dist’ Trie

What’s in it ?

Peer storage

Each peer holds a number of key/value pairs locally

Peer storage

The peer also stores partitions of a lookup structure organized as a trie

Peer storage

Very Important – A trie representation is insensitive to the insertion ordering !

It’s easy to merge two versions of the lookup structure
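
A rough way to see why merging is easy: if each version is a map from key prefix to the peers known for that prefix, merging is a union in which the newer timestamp wins. The flat dictionary layout and the name merge are simplifying assumptions for this sketch; a real trie would nest the tables.

```python
def merge(a, b):
    """Merge two versions of a lookup structure.

    Each version maps a key prefix to {peer_address: timestamp}.
    For a peer present in both versions, the newer timestamp wins.
    """
    merged = {prefix: dict(entries) for prefix, entries in a.items()}
    for prefix, entries in b.items():
        table = merged.setdefault(prefix, {})
        for peer, ts in entries.items():
            table[peer] = max(ts, table.get(peer, ts))
    return merged


v1 = {"01": {"peerA": 5}, "0101": {"peerB": 7}}
v2 = {"01": {"peerA": 9, "peerC": 2}}
```

Merging v1 into v2 or v2 into v1 gives the same result, which is the order insensitivity the slide refers to.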

Trie internals

• A trie node consists of 2^m routing tables
• Each routing table consists of L entries

Trie Internals

Each entry in a table consists of a peer address ‘a’ and a timestamp ‘t’

Trie Internals

Each level of trie “consumes” ‘m’ bits of the ‘k’-bit key

Trie Internals

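If the node is a leaf (having depth ⌈k/m⌉), an entry (a, t) in its i-th routing table means that peer ‘a’ was known at time ‘t’ to hold the value for the key; otherwise it means that peer ‘a’ was known at time ‘t’ to hold a replica of the i-th child of the node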
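
The layout described on the "Trie Internals" slides can be sketched as follows. The class names, the add method and the rule of evicting the oldest entries once a table exceeds L are assumptions made for this illustration; the slides only fix the sizes 2^m, L, m and k.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Entry:
    address: str    # peer address 'a'
    timestamp: int  # time 't' at which the peer was known to hold the data


@dataclass
class TrieNode:
    m: int  # bits of the key consumed per trie level
    L: int  # maximum number of entries per routing table
    tables: list = field(default_factory=list)

    def __post_init__(self):
        # One routing table per possible m-bit value: 2^m tables.
        self.tables = [[] for _ in range(2 ** self.m)]

    def add(self, index, entry):
        table = self.tables[index]
        table.append(entry)
        # Assumed policy: keep only the L most recent entries, newest first.
        table.sort(key=lambda e: e.timestamp, reverse=True)
        del table[self.L:]


def key_chunks(key_bits, m):
    """Split a k-bit key string into the routing-table index used at each level."""
    return [int(key_bits[i:i + m], 2) for i in range(0, len(key_bits), m)]


def leaf_depth(k, m):
    """Depth of the leaves: ceil(k / m) levels are needed to consume k bits."""
    return math.ceil(k / m)
```

For example, with m = 2 the 10-bit key 0101000000 selects table 1 at the first two levels and table 0 at the remaining three, and leaves sit at depth 5.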

The Ancestor Invariant

All peers maintain the Ancestor Invariant:

If a peer holds a trie node, it must hold all ancestors of the node

The Ancestor Invariant

Conclusion – Logically, nodes closer to the trie root are more widely replicated by peers, removing any single point of failure

Welcome to “PeerLand”

JOIN – in order to join the system, a peer must know the address of at least one participating peer, called its introducer.

The introducer sends its root routing table to the new client

Inserting Data

Performed locally by inserting a key/value pair in the local storage

Alternatively, a peer can send an insert request to other peers

General Lookup Algorithm

Lookup(KeyName)
{
    Value = LocalStorage.CheckForKey(KeyName)
    // Key not held locally: fall back to the distributed trie
    If Empty(Value)
    {
        CreateProcess(DistributedLookup)
    }
}

Lookup example

Assume A calls: lookup(0101000000)

keyName = 0101000000
A.LocalTrie.FindMatch(keyName)

Lookup example - cont

B = A.currentNode.GetLatestUpdatedAddressInTable
If B.HasActualValue then
    B.returnValueTo(A)
Else
    B.returnNullTo(A)

Lookup example - cont

“Else B.returnNullTo(A)” = Failure – so A will turn to the next “B” in the table according to decreasing TIMESTAMP order

Lookup example - cont

If there are no more “B”s in the table, A will call the process on the parent’s routing table
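
The lookup example can be condensed into one function. This is a simplified sketch under stated assumptions: the caller's trie is reduced to the list of routing tables on the key's path, ask is a hypothetical stand-in for the remote call to B, and the piggybacked tables that a real callee would return (and that would deepen the caller's trie) are left out.

```python
def distributed_lookup(key, local_storage, path_tables, ask):
    """Sketch of the lookup walk from the example slides.

    local_storage: dict of key -> value held by the caller A.
    path_tables:   routing tables along the key's path in A's trie, root first;
                   each table is a list of (address, timestamp) pairs.
    ask:           hypothetical callable (address, key) -> value or None.
    """
    if key in local_storage:               # satisfied locally
        return local_storage[key]
    for table in reversed(path_tables):    # deepest match first, then its parents
        # Try the most recently updated peer first (decreasing timestamp).
        for address, _ts in sorted(table, key=lambda e: e[1], reverse=True):
            value = ask(address, key)
            if value is not None:
                return value
    return None                            # lookup failure
```

Trying newer entries first reflects the idea that a recently refreshed address is less likely to be stale.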

Modes of action

Modes for exploring the tradeoff between the size of the piggybacked trie and the speed of convergence to an accurate map:
• Bounded mode
• Unbounded mode
• Full Path mode

Bounded Mode

The Callee sends its most specific routing table matching the key (or the value itself) if its routing table is more specific (deeper) than the Caller’s table.

Unbounded Mode

The Callee sends its most specific routing table matching the key (or the value itself) REGARDLESS of the Caller’s table.

WHY ???

Unbounded Mode

This might be useful to get new peers when backtracking on higher levels of the trie !!!

Full Path Mode

The Callee sends the Caller all of its tables, from the root to its most specific (deepest) table.
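
The three modes differ only in which tables the callee attaches to its response. A minimal sketch, assuming the callee's matching tables are given as a non-empty list indexed by depth (the root is always held because of the Ancestor Invariant); the function name and the mode strings are made up for this example.

```python
def piggyback(callee_tables, caller_depth, mode):
    """Pick which routing tables the callee attaches to its response.

    callee_tables: callee's tables matching the key, root first (index = depth).
    caller_depth:  depth of the caller's most specific matching table.
    mode:          "bounded", "unbounded" or "full_path".
    """
    deepest = len(callee_tables) - 1
    if mode == "full_path":
        return list(callee_tables)           # every table from root to deepest
    if mode == "unbounded":
        return [callee_tables[deepest]]      # regardless of the caller's table
    if mode == "bounded":
        if deepest > caller_depth:           # only if strictly more specific
            return [callee_tables[deepest]]
        return []
    raise ValueError(f"unknown mode: {mode}")
```

Bounded mode sends the least state and Full Path mode the most, which is the size-versus-convergence tradeoff named on the "Modes of action" slide.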

About Security

Most P2P lookup algorithms are susceptible to malicious behavior.

How can you fool the trie ?

Security leaks

Possible Answer :

Security leaks

Another Possible Answer :

Security Modes

• Conservative Mode – Update the trie just with the tables of nodes that actually led to the information; ignore all other updates
• Liberal Mode – Callers immediately update their local tries with any piggybacked state!
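
The difference between the two security modes can be illustrated by how a caller folds piggybacked state into its trie. The sketch below is an assumption-laden simplification: the trie is a flat map from prefix to {peer: timestamp}, every response carries a flag saying whether that peer led to the value, and the timing aspect (liberal mode updates immediately, during the lookup) is not modelled.

```python
def apply_updates(local_trie, responses, mode):
    """Fold piggybacked tables into the caller's trie after a lookup.

    responses: list of (tables, led_to_value) pairs, one per contacted peer;
               tables maps prefix -> {peer: timestamp}.
    mode:      "conservative" or "liberal".
    """
    for tables, led_to_value in responses:
        if mode == "conservative" and not led_to_value:
            continue  # ignore state from peers that did not lead to the data
        for prefix, entries in tables.items():
            table = local_trie.setdefault(prefix, {})
            for peer, ts in entries.items():
                table[peer] = max(ts, table.get(peer, ts))
    return local_trie
```

In conservative mode, a peer that returns bogus tables but never leads to real data cannot pollute the caller's trie; in liberal mode it can, in exchange for faster convergence.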

Conservative example

Experiments & Simulation

• 200 peers
• L=10 (size of table in each node)
• M=2 (size of step in trie levels)
• For each step, 2000 random key/value pairs
• During simulation, peers were added/removed with probability of 0.005

Failure Probability

A lookup fails when the requesting peer’s trie doesn’t contain sufficient information to locate an existing key/value pair (even after contacting other peers during lookup)

Probability of lookup failure

Message Overhead

A lookup can be:
• Local – satisfied locally
• Remote – requires contacting peers

We measure the number of lookup operations that were sent to other peers in order to satisfy the request.

Message Overhead

Bibliography

• “Efficient Peer-to-Peer Lookup Based on a Distributed Trie” / IPTPS ’02, Cambridge http://www.cs.rice.edu/Conferences/IPTPS02/167.pdf
• “Peer-to-Peer Computing” / Swedish Institute of Computer Science (SICS), Stockholm http://www.sics.se/~perbrand/open.pdf
• “Distributed Hash Tables: Building large-scale, robust distributed applications” / PODC ’02, Monterey http://www.podc.org/podc2002/kaashoek.ppt
• Gnutella website http://gnutella.wego.com
• “Chord: A scalable peer-to-peer lookup service for internet applications” / ACM SIGCOMM ’01, San Diego

THANK YOU !!!

For not falling asleep : - )