Peer-to-Peer Computing CSC8530 – Dr. Prasad Jon A. Preston April 21, 2004


Agenda

Overview of Peer-to-Peer Computing
Parallel Downloading
Peer-to-Peer Media Streaming
References

Collaborative Software Engineering


Peer-to-Peer Computing

Autonomy from centralized servers
Dynamic (peers added & removed frequently)
File Sharing (KaZaA – outpaces Web traffic, 3,000 terabytes, 3 million up peers)
Communication (instant messenger)
Computation (SETI@home)


Peer-to-Peer Computing (cont)

De-centralized data sharing
Dynamic growth of system capacity
Various data lookup/discovery schemes
– Centralized directory servers (Napster)
– Controlled request flooding (Gnutella)
– Hierarchy with supernodes (KaZaA)
Heterogeneous collection of peers
– Need a way of encouraging reporting of true outgoing bandwidth
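To make "controlled request flooding" concrete, here is a minimal sketch in which a query spreads hop by hop and is limited by a time-to-live (TTL) counter. The graph layout, the `flood_query` name, and the TTL semantics are assumptions for illustration, not the actual Gnutella protocol.

```python
from collections import deque

def flood_query(graph, start, has_file, ttl):
    """Flood a query outward from `start`, at most `ttl` hops away.

    graph:    dict mapping each peer to a list of its neighbors (assumed layout)
    has_file: set of peers that hold the requested file
    Returns the peers holding the file that the query reached.
    """
    seen = {start}                      # peers that already saw this query
    hits = []
    frontier = deque([(start, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if peer in has_file:
            hits.append(peer)
        if remaining == 0:
            continue                    # TTL exhausted, do not forward
        for neighbor in graph[peer]:
            if neighbor not in seen:    # drop duplicate copies of the query
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits

graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"],
         "d": ["b", "e"], "e": ["d"]}
print(flood_query(graph, "a", {"d", "e"}, ttl=2))
```

A small TTL keeps traffic bounded but can miss distant copies of the file, which is the trade-off that the "controlled" in controlled flooding refers to.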


Worldwide Computer (P2P Computation)

“Moonlight” your computer
Share/lease processor and storage
Process others’ simulations, etc.
Archive others’ files (even when computer off)
Receive micropayments for services rendered
PC is component of worldwide computer
“Internet-scale OS” – centralized structure
– Must allocate resources, coordination, security/privacy, etc.


Parallel Downloading

Potential widespread utilization on P2P networks

Past work shows parallel downloading (PD) achieves higher aggregate download throughput

Shorter download times for clients


Communication in PD

Client must determine segments of file for each server request

Alternative: “Tornado Code”
– Servers keep sending until client says “enough”
– Requires less communication about quantity and which part of the file the client wants
– Does require high buffering on client (entire file)
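As an illustration of the first point (the client deciding which segment to request from each server), the sketch below splits a file into contiguous byte ranges of nearly equal size. The `split_ranges` name and the equal-size policy are assumptions; a real client might weight the ranges by observed server throughput.

```python
def split_ranges(file_size, num_servers):
    """Return inclusive (start, end) byte ranges, at most one per server.

    Sizes differ by at most one byte. Servers that would receive an
    empty range (file smaller than the server count) are skipped.
    """
    base, extra = divmod(file_size, num_servers)
    ranges = []
    start = 0
    for i in range(num_servers):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((start, start + length - 1))
        start += length
    return ranges

print(split_ranges(10, 3))
```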


Parallel vs. Sequential Download

Parallel incurs non-trivial cost
– Synchronization
– Coordination
– Encoding/decoding

Adopt PD if download performance improves significantly…


Large-Scale Deployment of PD

Koo et al. developed a model in May 2003 that shows sequential downloading (SD) is better than PD
– Assumes that Capacity_servers >> Capacity_clients
– Homogeneous network
– Analyzed average download time
– Performance is similar, but SD requires less overhead


Peer-to-Peer Media Streaming

Peer-to-peer file sharing
– Act as server and client
– “Open-after-download”

Media Streaming
– “Play-while-downloading”
– Subset of peers “owns” a media file
– These peers stream media to requesting peers
– Recipients become supplying peers themselves


Characteristics of P2P Media Streaming Systems

Self-growing – requesting peers become supplying peers (total system capacity grows)

Serverless – no peer is expected to act as a server (i.e. open a large number of simultaneous client connections)

Heterogeneous – peers contribute different outbound connection bandwidths

Many-to-one – many supplying peers to one real-time playing client (hard deadlines)


Two Problems

Media data assignment

Fast amplification


Media Data Assignment

Given
– Requesting peer
– Multiple supplying peers
– Heterogeneous outbound bandwidth on suppliers

Determine
– Subset of media to request from each supplier



Variable Buffer Delays

Buffer delay depends upon the ordering of which segments of the media file to obtain from each supplying peer.
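A small worked example may help show why the ordering matters. The sketch below assumes a simplified model that is not taken from the slides: every segment plays for one time unit, each supplier sends its assigned segments back to back, and a segment must have fully arrived before its playback slot begins. The function name and the numbers are illustrative only.

```python
def buffer_delay(assignment, seg_time, dt=1):
    """Smallest start-up delay that lets playback run without stalling.

    assignment[s]: ordered list of segment indices fetched from supplier s
    seg_time[s]:   time supplier s needs to transmit one segment
    dt:            playback duration of one segment
    """
    delay = 0
    for s, segments in enumerate(assignment):
        for order, k in enumerate(segments, start=1):
            arrival = order * seg_time[s]       # segments are sent back to back
            # segment k is played at time delay + k * dt
            delay = max(delay, arrival - k * dt)
    return delay

# Four suppliers offering 1/2, 1/4, 1/8 and 1/8 of the playback rate,
# so one segment takes them 2, 4, 8 and 8 time units respectively.
seg_time = [2, 4, 8, 8]
contiguous = [[0, 1, 2, 3], [4, 5], [6], [7]]
interleaved = [[0, 1, 3, 4], [2, 5], [6], [7]]
print(buffer_delay(contiguous, seg_time), buffer_delay(interleaved, seg_time))
```

Under these assumptions, both assignments give every supplier the same number of segments, yet the interleaved one lets playback start one time unit earlier.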


Fast Amplification

Differential selection algorithm
– Favor higher-class (higher outbound bandwidth)
– Ultimately benefit all requesting peers
– Should not starve any lower-class peer
– Enforced via pure distributed algorithm
– Probability of selection proportional to requesting peer’s promised outbound bandwidth


Variable Capacity Growth


Selection Algorithm

Each supplying peer
– Determines which requesting peer to serve
– Maintains probability vector – one entry per class of peers (class defined by bandwidth)
– Receives “reminders” from peers

If supplier (Ps) is busy, it can receive a reminder from requesting peer (Pr)

This reminder tells the supplier to remember the requesting peer (Pr) and not elevate other peers in classes below Pr when the current service completes


Admission Probability Vector

One entry per class-i set of peers
If not busy, Ps grants request of Pr with probability Pr[i], where i = class of Pr

If Ps is a class-k peer, Pr[i] defined as follows
– For i < k, Pr[i] = 1.0 (favored class)
– For i >= k, Pr[i] = 1/2^(i-k)

If idle, elevate non-favored (and non-served) entries by factor of 2 (i.e. Pr[i] = Pr[i] * 2)

Use reminders to affect what happens after service completed (raise or not)
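The vector rules above can be sketched as follows. The code assumes that classes are numbered from 1 with class 1 having the highest bandwidth, and that elevated entries are capped at 1.0. Neither detail is stated on the slide, and the function names are hypothetical.

```python
import random

def init_admission_vector(k, num_classes):
    """Initial vector for a class-k supplier: 1.0 for i < k, else 1/2^(i-k)."""
    return {i: 1.0 if i < k else 1.0 / 2 ** (i - k)
            for i in range(1, num_classes + 1)}

def elevate(vector):
    """Idle step: double every entry, capped at 1.0 (the cap is an assumption)."""
    return {i: min(1.0, p * 2) for i, p in vector.items()}

def grants(vector, requester_class):
    """A non-busy supplier admits a class-i requester with probability Pr[i]."""
    return random.random() < vector[requester_class]

vector = init_admission_vector(k=2, num_classes=4)
print(vector)
print(elevate(vector))
```

Repeated idle periods push every entry toward 1.0, which is how lower-class peers avoid starvation while higher classes are still served first.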


Making a Request

Pr knows candidate supplying peers {Ps1, Ps2, … Psn}

Pr will be admitted if it obtains permission from enough suppliers such that aggregated outbound bandwidth is sufficient to service request
– Requesting peer then computes media data assignment

If not admitted, send “reminders” to busy supplying peers that favor Pr. Back off exponentially.

When request is finished, Pr becomes a supplying peer, increasing the overall system capacity.
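A rough sketch of the admission check on the requesting side is shown below. It assumes each candidate has already answered with a yes/no permission, treats "sufficient" as "at least the playback rate", and uses a plain doubling schedule for the backoff. The tuple layout and function names are illustrative, not from the slides.

```python
def try_admission(candidates, playback_rate):
    """Collect permissions until aggregate bandwidth covers the playback rate.

    candidates: list of (name, outbound_bandwidth, permitted) tuples
    Returns the chosen supplier names, or None if the request is rejected.
    """
    chosen = []
    total = 0.0
    # Ask the highest-bandwidth suppliers first (an assumed ordering).
    for name, bandwidth, permitted in sorted(candidates, key=lambda c: -c[1]):
        if not permitted:
            continue
        chosen.append(name)
        total += bandwidth
        if total >= playback_rate:
            return chosen
    return None

def backoff_delays(base, attempts):
    """Waiting times before each retry, doubling every time."""
    return [base * 2 ** n for n in range(attempts)]

candidates = [("a", 0.5, True), ("b", 0.25, False),
              ("c", 0.25, True), ("d", 0.25, True)]
print(try_admission(candidates, playback_rate=1.0))
print(backoff_delays(1, 4))
```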


Differential Acceptance Results


Non-differential Acceptance Results


References

Simon Koo, Catherine Rosenberg, Dongyan Xu, "Analysis of Parallel Downloading for Large File Distribution", Proceedings of IEEE International Workshop on Future Trends in Distributed Computing Systems (FTDCS 2003), San Juan, PR, May 2003.

Dongyan Xu, Mohamed Hefeeda, Susanne Hambrusch, Bharat Bhargava, "On Peer-to-Peer Media Streaming", Proceedings of IEEE International Conference on Distributed Computing Systems (ICDCS 2002), Vienna, Austria, July 2002.

M. Ripeanu, "Peer-to-Peer Architecture Case Study: Gnutella Network", International Conference on Peer-to-peer Computing, 2001.

J. Kangasharju, K.W. Ross, D. Turner, "Adaptive Content Management in Structured P2P Communities", 2002, http://cis.poly.edu/~ross/papers/AdaptiveContentManagement.pdf

S. Androutsellis-Theotokis, "A Survey of Peer-to-Peer File Sharing Technologies", Whitepaper, Athens University of Economics and Business, Greece, 2002.


Collaborative Software Engineering

Overview of Collaborative Computing
Synchronous and Asynchronous
Notification Algorithms
Distributed Mutex
Achieving “undo” and “redo”
Transparencies vs. Awareness
Distributed Software Engineering


Overview of Collaborative Computing

Utilize computing to improve workflow and coordination/communication
– Shared displays/applications
– Online meetings
– Collaborative development (configuration management)
– Minimize impact of physical distance

Collaboratories
– Emulate scientific labs


Synchronous and Asynchronous

Synchronous
– Same time, different place
– ICQ, Chat, etc.
– Can store session

Asynchronous
– Different time, same/different place
– Email, newsgroups, web forums
– Store session, replay


Notification Algorithms

Unicast
– Latency is a potential issue

Multicast
– Significant bandwidth consumption
– Network flooding

Frequency
– Synchronous implies high frequency of change notifications
– Asynchronous implies low frequency of change notifications

Granularity
– Differentials or whole state
– How to incorporate new users (latecomers)


Distributed Mutex

Token-based
– Only the process that holds the token can enter the critical section
– Transmission of token algorithm (round-robin, hold & wait for request)
– How does a process know where to request token?

Permission-based
– Sends request to enter CS to other processes
– Other processes get to “vote”
– Process enters CS only if it achieves enough votes
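The permission-based scheme can be sketched as a single-process simulation in which "enough votes" is taken to mean a strict majority. The `Voter` class and `try_enter` function are hypothetical names, and real implementations exchange these messages over the network and must also handle failures and deadlock.

```python
class Voter:
    """A process that grants its single vote to one requester at a time."""

    def __init__(self):
        self.voted_for = None

    def request(self, pid):
        if self.voted_for is None:
            self.voted_for = pid
            return True
        return False

    def release(self, pid):
        if self.voted_for == pid:
            self.voted_for = None

def try_enter(pid, voters):
    """Enter the critical section only with a strict majority of votes."""
    granted = [v for v in voters if v.request(pid)]
    if len(granted) > len(voters) // 2:
        return True
    for v in granted:               # hand back votes from a losing attempt
        v.release(pid)
    return False

voters = [Voter() for _ in range(5)]
print(try_enter("p1", voters))      # p1 collects all five votes
print(try_enter("p2", voters))      # p2 gets none while p1 holds them
```

Because any two majorities share at least one voter, two processes can never both collect enough votes at the same time.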


Achieving “undo” and “redo”

Particularly important in collaborative systems
– High level of “what if” inherent in the system
– Others might adversely affect someone else’s work

In OO-based systems, undo and redo are inverses of each other

In text-based systems, insert and delete are inverses of each other

In bitmap-based systems, undo and redo are not so easy
– Save entire image (too much space)
– Save only differential area (replay sequence of actions to recreate state)
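For the text-based case, the insert/delete inverse relationship can be sketched with two stacks. This is a single-user illustration with hypothetical names. It assumes every delete operation records the text it removed so that it can be inverted, and it ignores the position shifts that concurrent edits would cause in a collaborative system.

```python
class TextBuffer:
    """Text with undo/redo, where insert and delete invert each other."""

    def __init__(self):
        self.text = ""
        self.undo_stack = []
        self.redo_stack = []

    def _apply(self, op):
        kind, pos, s = op
        if kind == "insert":
            self.text = self.text[:pos] + s + self.text[pos:]
        else:  # "delete": remove len(s) characters starting at pos
            self.text = self.text[:pos] + self.text[pos + len(s):]

    @staticmethod
    def _inverse(op):
        kind, pos, s = op
        return ("delete" if kind == "insert" else "insert", pos, s)

    def do(self, op):
        self._apply(op)
        self.undo_stack.append(op)
        self.redo_stack.clear()     # a new action invalidates the redo history

    def undo(self):
        op = self.undo_stack.pop()
        self._apply(self._inverse(op))
        self.redo_stack.append(op)

    def redo(self):
        op = self.redo_stack.pop()
        self._apply(op)
        self.undo_stack.append(op)

buf = TextBuffer()
buf.do(("insert", 0, "hello"))
buf.do(("insert", 5, " world"))
buf.undo()
print(buf.text)
```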


Transparencies vs. Awareness

Does the application know about the collaboration or not?

– Transparencies
  Communication layer sits on top of the application
  Useful for sharing legacy systems
  Have no access to source (or cannot modify it)
  Negative – no concurrency (one input/output at a time)

– Aware Applications
  Collaboration integrated into the application
  Requires centralized execution with distributed I/O
  Or requires a homogeneous architecture (same client on each user’s machine)


Distributed Software Engineering

Synchronous and asynchronous collaboration

Provide meta view of others in system
Allow for viewing of entire current system
Fine-grain source locking/check-out
Provide sandbox for developers to test/build local source
How do we improve concurrency?


Handling Concurrent Development

Split-combine (low level of concurrent development)

Copy-merge (high level of concurrency, problematic to merge)
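To illustrate why copy-merge is "problematic to merge", here is a minimal three-way merge sketch at whole-file granularity. Each developer's copy is a dict from file name to content, and a conflict arises when both copies changed the same file differently. The function name and data layout are assumptions, and real tools merge at line level.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies against their common ancestor.

    Each argument maps file name -> content; a missing key means the
    file does not exist in that copy. Returns (merged, conflicts).
    """
    merged = {}
    conflicts = []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:
            result = m              # both copies agree
        elif m == b:
            result = t              # only their copy changed
        elif t == b:
            result = m              # only my copy changed
        else:
            conflicts.append(name)  # both changed, differently
            continue
        if result is not None:      # None means the file was deleted
            merged[name] = result
    return merged, conflicts

base = {"a": "1", "b": "2", "c": "3"}
mine = {"a": "1x", "b": "2", "c": "3m"}
theirs = {"a": "1", "b": "2", "c": "3t", "d": "4"}
print(three_way_merge(base, mine, theirs))
```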