
A methodology for workload characterization of file-sharing peer-to-peer networks

Diêgo Nogueira, Leonardo Rocha, Juliano Santos,

Paulo Araújo, Virgílio Almeida, Wagner Meira Jr.

Department of Computer Science

Federal University of Minas Gerais - Brazil

e-commerce, system performance evaluation, and experimental development lab Index 1


What is peer-to-peer (P2P)?

Class of distributed applications on the Internet

• peers act as both servers and clients (servent)

• servents share computational resources

• particular features:

– dual role of servents

– totally distributed processing nature

– dynamic nature


Why characterize P2P networks?

• growth of P2P networks (especially file-sharing)

• lack of characterization methodologies

• provide important information for further research


Gnutella: case study

• file-sharing open P2P network

• why Gnutella?

– simple P2P network

– intense traffic, with users all over the world

• previous work:

– did not focus on standard statistical distributions

– not in the context of a general characterization methodology


Gnutella: case study


Workload characterization methodology

• derives from the classic client-server characterization

• divided into:

– Qualitative characterization

∗ conceptual definition of attributes

– Quantitative characterization

∗ client-side criteria

∗ server-side criteria


Qualitative characterization - (1/2)

Conceptual definition of the attributes:

P2P architecture:

• type of resources shared

• communication protocol (connection + service interface)

P2P application:

• set of messages that implement the protocol


Qualitative characterization - (2/2)

P2P network:

• application implemented by the servents

• set of servents

– the resources shared by the servent

– the servent’s neighborhood
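The conceptual definition above (a network is an application plus a set of servents, each with its shared resources and its neighborhood) can be sketched as a small data model. This is only an illustration: the class names, the `address` field, and the helper methods are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Servent:
    """One node of the P2P network (hypothetical model)."""
    address: str                                   # assumed identifier, e.g. "host:port"
    shared_resources: Set[str] = field(default_factory=set)
    neighbors: Set[str] = field(default_factory=set)


@dataclass
class P2PNetwork:
    """An application plus the set of servents that implement it."""
    application: str
    servents: Dict[str, Servent] = field(default_factory=dict)

    def add_servent(self, servent: Servent) -> None:
        self.servents[servent.address] = servent

    def connect(self, a: str, b: str) -> None:
        # Neighborhood is modeled as symmetric: each peer lists the other.
        self.servents[a].neighbors.add(b)
        self.servents[b].neighbors.add(a)

    def average_neighbors(self) -> float:
        # Mean neighborhood size over all known servents.
        if not self.servents:
            return 0.0
        total = sum(len(s.neighbors) for s in self.servents.values())
        return total / len(self.servents)
```

Keeping the neighborhood on each servent makes a connectivity metric such as the average number of neighbors a one-line aggregate.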


Gnutella qualitative characterization

Gnutella architecture

• Shared resource: storage device (hard disk)

• Communication protocol:

– connection interface: {ping, pong}

– service interface: {query, queryhit, download, push}

Gnutella application

Gnutella network (gNet)
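As a rough sketch, the split of Gnutella messages into a connection interface and a service interface can be written as a lookup table, which also gives a simple way to tally control traffic against search traffic in a trace. The message names come from the slide; the `Interface` enum and the helper functions are hypothetical.

```python
from enum import Enum


class Interface(Enum):
    CONNECTION = "connection"   # control traffic
    SERVICE = "service"         # search and transfer traffic


# Message types as grouped on the slide.
MESSAGE_INTERFACE = {
    "ping": Interface.CONNECTION,
    "pong": Interface.CONNECTION,
    "query": Interface.SERVICE,
    "queryhit": Interface.SERVICE,
    "download": Interface.SERVICE,
    "push": Interface.SERVICE,
}


def classify(message_type):
    """Return the interface a message type belongs to (case-insensitive)."""
    return MESSAGE_INTERFACE[message_type.lower()]


def traffic_by_interface(message_types):
    """Count observed messages per interface."""
    counts = {interface: 0 for interface in Interface}
    for message_type in message_types:
        counts[classify(message_type)] += 1
    return counts
```

For a collected trace, the ratio of the two counts is one way to compare search traffic with control traffic.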


Quantitative characterization

Workload characterization of a live P2P network

• collection of traffic and peer behavior data, analysis of the data

Client-side criteria

• demand for resources

• interaction pattern

• servents’ connectivity

Server-side criteria

• resource availability

• service capacity


Gnutella quantitative characterization

• Data collector

– developed over Gnut

∗ collected data addressed to, and routed through, the peer

∗ periodically sent random queries

• Experiments

– 2 Linux workstations connected to Brazilian research network

– connection to reference servents on Gnutella

– results presented cover 24 hours of collection (10/02/2001)


Demand for resources characterization

Identifies:

• the subjects of interest

• popularity of subjects among users

• temporal locality among requests

Gnutella:

• Servents’ interests

– 2,992,390 queries received / 94,642 distinct words (including stop words)
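A minimal sketch of how distinct words and their popularity could be derived from a query log. It assumes queries are plain strings and that words are separated by whitespace; the slides do not describe the actual tokenization, so `word_popularity` and `popularity_ranking` are hypothetical helpers.

```python
from collections import Counter


def word_popularity(queries):
    """Count how often each word appears across all queries.

    Stop words are kept, matching the count reported on the slide.
    Tokenization by whitespace and lowercasing are assumptions.
    """
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts


def popularity_ranking(counts):
    """Return (rank, word, frequency) tuples, most popular first."""
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]
```

The number of distinct words is `len(counts)`, and plotting frequency against rank on log-log axes is a common way to inspect the popularity distribution.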


Servents’ interests


Interaction pattern characterization

• quality-of-service metric

• used to quantify the overall performance of a P2P network

Gnutella:

• Latency

– sent pings with TTL 1

– 1,823,972 registered pongs
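One way to turn such a measurement into latency samples is sketched below. It assumes the collector logs a send time per ping and an arrival time per pong, and that each pong can be matched to its ping by a shared message identifier; the record layout and the function name are hypothetical, not the authors' collector.

```python
def ping_pong_latencies(ping_sent, pong_received):
    """Compute one latency sample per registered pong.

    ping_sent:     dict mapping message id -> send time (seconds)
    pong_received: iterable of (message id, arrival time) pairs

    With TTL 1 a ping only reaches direct neighbors, so several pongs
    may answer the same ping; each one yields its own sample.
    """
    samples = []
    for message_id, arrival in pong_received:
        sent = ping_sent.get(message_id)
        # Skip pongs that cannot be matched or that predate the ping.
        if sent is not None and arrival >= sent:
            samples.append(arrival - sent)
    return samples
```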


Latency


Servents’ connectivity characterization

Quantified through:

• average number of neighbors

• network traffic associated with communication

• amount of data exchanged

Gnutella:

• Unique servents

– number of servents varied approximately 15% across collection periods

– 75% of peers answered


Resource availability characterization

• how the dynamics of P2P affect access to information

• assess effectiveness of network in providing information

• good for comparing data distribution protocols and mechanisms

Gnutella:

• Shared kbytes

– 120,535 addressed servents / 90,282 replied

– information from pong messages
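As a quick arithmetic check, the two counts above give the fraction of addressed servents that replied. It appears consistent with the "75% of peers answered" figure on the connectivity slide, although the slides do not state that both numbers come from the same measurement.

```python
addressed = 120_535   # servents addressed (from the slide)
replied = 90_282      # servents that replied (from the slide)

reply_rate = replied / addressed
print(f"{reply_rate:.1%}")  # prints 74.9%
```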


Shared kbytes


Service capacity characterization

• quantifies amount of information provided by servents

• helps to understand if idle capacity is used efficiently

• provides information to improve scalability of servents

Gnutella:

• Servents’ availability

– 98% of servents stayed on for at most 45 min.

– 84% stayed on no longer than 10 min.
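Figures of this kind are points on the empirical distribution of session durations. The sketch below shows the computation on a list of observed durations in minutes; the function name is hypothetical and any sample passed to it is illustrative, not the measured Gnutella data.

```python
def fraction_at_most(durations_min, limit_min):
    """Fraction of sessions whose duration is at most limit_min minutes.

    This is the empirical CDF of the durations evaluated at limit_min.
    """
    if not durations_min:
        raise ValueError("need at least one observed session")
    within = sum(1 for d in durations_min if d <= limit_min)
    return within / len(durations_min)
```

Evaluating it at 10 and 45 minutes over the collected sessions would yield the two percentages listed above.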


Servents’ availability


Data convergence analysis

• verify how representative the data collected and studied are

• mechanism:

– Perform the same experiments in shorter periods

∗ 6, 12, 18 and 24 hours, for example

∗ verify the distributions from each period
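The slides do not say how the distributions from each period are compared. One possible sketch, swapping in the two-sample Kolmogorov-Smirnov distance as an assumed comparison measure, checks each shorter window against the longest one; the `(timestamp_hours, value)` record layout and all function names are hypothetical.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical CDF of a sorted sample evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_distance(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    # The maximum gap is always reached at one of the sample points.
    return max(abs(ecdf(sa, x) - ecdf(sb, x)) for x in sa + sb)


def convergence(observations, cutoffs_hours):
    """Compare each collection window against the longest one.

    observations:  list of (timestamp_hours, value) pairs
    cutoffs_hours: window lengths, e.g. [6, 12, 18, 24]
    Returns {cutoff: KS distance to the longest window}.
    """
    longest = max(cutoffs_hours)
    full = [v for t, v in observations if t < longest]
    return {
        cutoff: ks_distance([v for t, v in observations if t < cutoff], full)
        for cutoff in cutoffs_hours
    }
```

Distances that shrink toward zero as the window grows suggest that the collected data have stabilized.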


Data convergence analysis: shared files


Conclusions

Definition of a workload characterization methodology

• Qualitative characterization (conceptual definitions)

• Quantitative characterization (client + server side criteria)

Successful application of methodology on Gnutella

Interesting results about Gnutella’s traffic and peer behavior

• Statistical distribution analysis

• Latency distribution follows the Log-Normal distribution

• Search traffic 50 times larger than control traffic
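To illustrate what fitting a log-normal distribution to latency samples involves, the sketch below uses the standard maximum-likelihood estimates: the mean and the population standard deviation of the logarithms of the samples. The slides do not describe the authors' fitting procedure, so this is an assumed approach with hypothetical function names.

```python
import math
import statistics


def fit_lognormal(samples):
    """Maximum-likelihood estimates (mu, sigma) of a log-normal fit.

    Only strictly positive samples are used, since log(x) is
    undefined otherwise.
    """
    logs = [math.log(x) for x in samples if x > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive samples")
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """CDF of the log-normal distribution with parameters mu, sigma."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))
```

Comparing `lognormal_cdf` with the empirical distribution of the measured latencies is one way to judge how well the fitted model describes them.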


Questions?

contact: [email protected]


Index

P2P definition

Motivation

Gnutella

Methodology introduction

Qualitative characterization

gNet qualitative characterization

Quantitative characterization

gNet quantitative characterization

Demand for resources

Interaction pattern

Servents’ connectivity

Resource availability

Service capacity

Data convergence analysis

Conclusions
