a super-peer model for resource discovery services in...

Future Generation Computer Systems 21 (2005) 1235–1248

A super-peer model for resource discovery servicesin large-scale Grids�

Carlo Mastroiannia, Domenico Taliab,∗, Oreste Vertab

a ICAR-CNR 87036 Rende (CS), Italyb DEIS University of Calabria, 87036 Rende (CS), Italy

Received 23 March 2005; received in revised form 3 June 2005; accepted 6 June 2005Available online 19 July 2005

Abstract

As deployed Grids increase from tens to thousands of nodes, peer-to-peer (P2P) techniques and protocols can be used to imple-ment scalable services and applications. The super-peer model is a novel approach that helps the convergence of P2P models andGrid environments and can be used to deploy a P2P information service in Grids. A super-peer serves a single physical organiza-tion in a Grid, and manages metadata associated to the resources provided by the nodes of that organization. Super-peers connectto each other to form a peer network at a higher level. This paper examines how the super-peer model can handle membershipmanagement and resource discovery services in a multi-organizational Grid. A simulation analysis evaluates the performance ofa resource discovery protocol; simulation results can be used to tune protocol parameters in order to increase search efficiency.© 2005 Elsevier B.V. All rights reserved.

1

mmtfiu

E

f

t

s in

aseitiesP2Puselizedted

pen

inceerac-s.

0

. Introduction

Grid computing and peer-to-peer (P2P) computingodels share several features and have more in com-on than we generally recognize. The integration of

he two computing models could bring benefits in bothelds and could result in future integrations. In partic-lar, the use of P2P protocols is expected to improve

� Extended version of the Best Paper Award winning paper in theuropean Grid Conference 2005.∗ Corresponding author. Tel.: +39 0984 494726;

ax: +39 0984 494713.E-mail addresses: [email protected] (C. Mastroianni),

[email protected] (D. Talia), [email protected] (O. Verta).

the efficiency and scalability of information servicelarge-scale Grid systems[8,19].

As Grids used for complex applications increfrom tens to thousands of nodes, their functionalshould be decentralized to avoid bottlenecks. Themodel could favour Grid scalability: designers canP2P style and techniques to implement decentraGrid systems. The adoption of the service orienmodel in novel Grid systems (for example, the OGrid Services Architecture (OGSA)[2], or the WebServices Resource Framework (WSRF)[20]) will sup-port the convergence between the two models, sWeb Services can be used to implement P2P inttions between hosts belonging to different domain

167-739X/$ – see front matter © 2005 Elsevier B.V. All rights reserved.doi:10.1016/j.future.2005.06.001

1236 C. Mastroianni et al. / Future Generation Computer Systems 21 (2005) 1235–1248

In particular, an ongoing effort aims at studying howit is possible to drive the integration trend to efficientlyhandle two key services in Grid information systems:membership management (or simplymembership) andresource discovery services. The objective of a mem-bership management service is two-fold[8]: adding anew node to the network, and assigning this node a setof neighbour nodes. The resource discovery service isinvoked by a node when it needs to discover and usehardware or software resources matching given criteriaand characteristics.

In currently deployed Grid systems, resources areoften owned by research centres, public institutions,or large enterprises: in such organizations hosts andresources are generally stable. Hence, membershipmanagement and resource discovery services are effi-ciently handled through centralized or hierarchicalapproaches, as in the OGSA and WSRF frameworks.As opposed to Grids, in P2P systems nodes andresources provided to the community are very dynamic:peers can be frequently switched off or disconnected.In such an environment, a distributed approach is moreeffective and fault-tolerant than a centralized or hierar-chical one.

Recently, super-peer networks have been proposed[21] to achieve a balance between the inherent effi-ciency of centralized search, and the autonomy, loadbalancing and fault-tolerant features offered by dis-tributed search. A super-peer node acts as a centralizedresource for a number of regular peers, while super-p thate

uslya verye ndi aleG all-s hichf ni-z onea oren , cana uperp s ands irtualO edoe cide;

in particular when a VO represents resources that arelong-lived within one administrative domain.

This paper examines how the super-peer model canhandle membership management and resource discov-ery services in a multi-organizational Grid. A simula-tion analysis evaluates the performance of a resourcediscovery protocol; presented results can be used totune protocol parameters in order to increase searchefficiency. The remainder of the paper is organized asfollows. Section2 discusses related work. Section3introduces the super-peer model, and shows how it canbe used in large-scale Grids, in particular in service-oriented Grid frameworks. A discovery protocol basedon the super-peer model is proposed and discussed.Section4 analyzes the performance of the proposeddiscovery protocol by means of an event-driven simula-tion framework. The influence of network and protocolparameters on performance indices is evaluated, so thatthe protocol can be tuned to increase search efficiency.Section5 concludes the paper.

2. Related work

Discovery services in P2P networks can be classi-fied as using unstructured or structured approaches tosearch resources. Gnutella[7] and Kazaa[9] are exam-ples of unstructured P2P networks: hosts and resourcesare made available on the network without a globaloverlay planning. Structured P2P networks, such asC -t ble( isa ey,d keyi

ainlyu net-w largev idesa thea uests.I ustc ordr ede-c s, then ddedt -

eers connect to each other to form a networkxploits P2P mechanisms at a higher level.

The super-peer model can be advantageodopted in large-scale Grids because it allows for afficient implementation of the information service a

t is naturally appropriate for Grids. In fact, a large-scrid can be viewed as a network interconnecting smcale, proprietary Grids; each of these Grids, wrom now on will be referred to as a Physical Orgaation (PO), is composed of a set of hosts withindministrative domain. Within each PO, one or modes, e.g. those that have the largest capabilitiesct as super-peers, while the other nodes can use seers to access the Grid and search for resourceervices. Note that a PO is not the same as a Grid Vrganization (VO), which is defined as a short-livrganization that spans administrative boundaries[3],ven if in some cases the two concepts can coin

-

AN [14], Chord[17] and Pastry[16], use highly strucured overlays and exploit a Distributed Hash TaDHT) to route queries over the network. A DHT

data structure for distributed storing of pairs (kata) which allows for fast locating of data when a

s given.Peer discovery and membership services are m

sed for the construction and the start up of P2Porks. Such services can be provided using a veryariety of techniques. For example, Gnutella provnumber of well known “cache servers” that storeddresses of peers that can accept connection req

n Chord, a peer that wants to join the network montact a known peer, already included in the Ching, and request to it the addresses of its future pressor and successor peers in the ring; afterwardew peer will ask these neighbour peers to be a

o the ring. The FLAPPS system[15] is a P2P infras

C. Mastroianni et al. / Future Generation Computer Systems 21 (2005) 1235–1248 1237

tructure which has the merit of flexibly combining anumber of different styles of peer network construc-tion methods as well as different resource discoveryprotocols.

Membership and resource discovery services arealso key issues in Grid systems. Today, a centralized orhierarchical approach is usually adopted. For example,in the Globus Toolkit 2, or GT2[1], a node that wantsto connect to the Grid registers at a centralized indexserver, the Globus Index Information Server (GIIS),and periodically sends to that server information aboutthe resources offered to other nodes. GIIS servers areorganized according to a hierarchical approach. Querymessages are delivered to a high-level GIIS and thenpossibly forwarded to lower-level information servers.

The information model exploited in the GlobusToolkit 3, or GT3, built upon OGSA, is based on IndexServices[6], a specialized type of Grid Services. IndexServices are used to aggregate and indexService Data,i.e. metadata associated to the resources provided byGrid hosts. There is typically one Index Service perorganization but, in large organizations, several IndexServices can be organized in a hierarchy. A similarapproach is used in the WSRF-based Globus Toolkit4: ServiceGroup services are used to form a wide vari-ety of collections of WS-Resources, a WS-Resourcebeing a Web service that is associated with a statefulresource.

Nowadays, the Grid community agrees that it isnot efficient to devise scalable Grid resource discoveryb en al havet ity ofs

osedt cy ofc cinga rch.I alu-a signo per-f e tol lm ce ofa n thisw rma-t anya .

Kazaa[9] designates the more available and pow-erful peers as supernodes. In Kazaa, when a new peerwants to join the network, it searches for the exist-ing supernodes, and establishes an overlay connectionwith the supernode that has the shortest RTT. In[4],a framework that combines the structural DHT designwith a multi-level architecture based on super-peersis proposed. Peers are organized in disjoint groups,and lookup messages are first routed to the destina-tion group, then to the destination peer using an intra-group overlay. In[12], both resources and the contentstored at peers are described by means of RDF meta-data. Routing indices located at super-peers use suchmetadata to perform the routing of queries expressedthrough the RDF-QEL query language. Puppin et al.[13] proposed a Grid Information Service based onthe super-peer model and its integration within OGSA.The Hop Counting Routing Index algorithm is used toexchange queries among the super-peers and, in partic-ular, to select the neighbour super-peers that offer thehighest probability of success.

3. A super-peer model for Grids

The super-peer model can be advantageouslyexploited in Grid systems for the deployment of infor-mation and discovery services. To maximize the effi-ciency of the super-peer model in Grids, it is useful toc orks.

inceargerea-en-

ctive

archEGis-atchord-onin

bridero-t theced

ased on a centralized or hierarchical approach wharge number of Grid hosts, resources, and userso be managed, also because of the heterogeneuch resources[8].

Recently, super-peer networks have been propo achieve a balance between the inherent efficienentralized search, and the autonomy, load balannd fault-tolerant features offered by distributed sea

n [21], performance of super-peer networks is evted, and rules of thumb are given for an efficient def such networks: the objective is to enhance the

ormance of search operations and at the same timimit bandwidth and processing load. In[11], a genera

echanism for the construction and the maintenansuper-peer network is proposed and evaluated. Iork, a gossip paradigm is used to exchange info

ion among peers and dynamically decide how mnd which peers can efficiently act as superpeers

ompare the characteristics of Grids and P2P netw

(i) Grids are less dynamic than P2P networks, sGrid nodes and resources often belong to lenterprises or public institutions and securitysons generally require that Grid nodes authticate each other before accessing resperesources.

(ii) Whereas in a P2P network users usually sefor well defined resources (e.g. MP3 or MPfiles), in Grid systems they often need to dcover software or hardware resources that man extensible set of resource descriptions. Accingly, while structured protocols, e.g. baseddistributed indices, are usually very efficientfile sharing P2P networks, unstructured or hyprotocols seem to be preferable in largely hetgeneous Grids. Another consequence is thaperformance of a discovery service is influen


by the distribution of classes of resources, a classof resource being a set of resources that satisfysome given constraints on resource properties, asdiscussed in Section4.

(iii) In a Grid, it is feasible to identify, for each PO, asubset of powerful nodes having high availabilityproperties; these nodes can be used as super-peers.

These considerations guided us through the designof membership and discovery services. For the sakeof simplicity, we suppose that only one super-peer isassociated to each PO, i.e. we will not considerredun-dant super-peers serving the same PO. A super-peeraccomplishes two main tasks: it is responsible for thecommunications with the other POs, and it maintainsmetadata about all the nodes of the local PO. The setof nodes belonging to a PO (i.e. the super-peer and theordinary nodes) is also referred to as acluster in thefollowing.

The membership protocol exploits the characteris-tics of contact nodes. A contact node is a Grid nodethat plays the role of an intermediary node during thebuilding process of the Grid network. One or morecontact nodes are made available by each organiza-tion. Whenever an organization wants to connect to theGrid, the corresponding super-peer contacts a subsetof contact nodes and registers at those nodes. In turn,the selected contact nodes randomly choose a numberof previously registered Grid super-peers and com-municate their addresses to the requesting super-peer:t et oft atesw r itd er, ino

age-m ted,a eredsw tactn dt to itb per-p odeS es apn eersr

Fig. 1. The membership management protocol: (a) a new super-peerregisters at a set of contact nodes and (b) receives the identities of itsneighbour super-peers.

Ordinary nodes, i.e. simple peers, can be alreadyconnected to the super-peer before it initiates the join-ing procedure or can connect to the super-peer after ithas joined the Grid.

As shown inFig. 2, the super-peer model exploits thecentralized/hierarchical information service providedby the Grid infrastructure of the local PO: e.g. the MDS-2 service of GT2[5] or the Index Service of GT3[6].It is not necessary that the same Grid framework isinstalled in all the POs: it is only required that super-peers are able to communicate with each other using a

Fig. 2. A Grid network configuration exploiting the super-peermodel.

hese super-peers will constitute the neighbour she new Grid super-peers. A super-peer communicith contact nodes either periodically or wheneveetects the disconnection of a neighbour super-perder to ask for its substitution.

Fig. 1shows a schema of the membership manent protocol. A number of contact nodes are depicnd for each of them the corresponding set of registuper-peers is reported. InFig. 1(a), the super-peerSants to connect to the Grid and selects two conodes. InFig. 1(b), the selected contact nodes adS

o the list of registered super-peers and respondy communicating the addresses of a number of sueers, which will constitute the neighbour set of n. The membership management protocol requirroper setting of thecontact parameter K, i.e. theumber of contact nodes at which a new super-pegisters (K = 2 in Fig. 1).


standard protocol and that each super-peer knows howto interact with the information service of the local PO.

The resource discovery protocol, exploited by thediscovery service, is defined as follows. Query mes-sages generated by a Grid node are forwarded to thelocal super-peer. The super-peer examines the localinformation service to verify if the requested resourcesare present in some of the nodes belonging to the localPO, and in this case sends to the requesting node aqueryHit containing the IDs of those nodes.

Furthermore, the super-peer forwards a copy of thequery to a selected number of neighbour super-peers,which in turn contact the respective information sys-tems and so on. Whenever a resource, matching thecriteria specified in the query, is found in a remote PO, aqueryHit is generated and is forwarded along the samepath back to the requesting node, and a notificationmessage is sent by the remote super-peer to the nodethat handles the discovered resource.

The set of neighbours to which a query is for-warded is determined through an empirical approach.Each super-peer maintains statistics on the number ofqueryHits received from all the known super-peers. Thesuper-peer forwards a query to the neighbour super-peers from which the highest numbers of queryHitswere received in the past. The maximum number ofneighbours to which a query is forwarded can be tunedon the basis of the network configuration, as discussedin Section4.

A number of techniques are adopted to decrease then bya re-m twos achq e then per-p -peert ain-t f thel cardst en-e veralr localP es-s oser

oidt om-

Fig. 3. Implementation of the super-peer model using the GT3 frame-work.

plementary, since technique (ii) canprevent cycles onlyin particular cases (i.e. when a query, forwarded by asuper-peer, is subsequently delivered to the same super-peer), whereas technique (iii) canremove cycles in allthe other cases (e.g. whentwo copies of a query, sent bya super-peerA to two distinct super-peersB andC, aresubsequently both delivered to the remote super-peerD).

Fig. 3 shows how the GT3 information service isused in a PO to implement the super-peer model. Sucharchitecture extends the one presented in[18]. TheIndex Service subscribes to the Service Data containedin the Grid Services published on the nodes of the localPO. Specialized GT3 aggregators periodically collectService Data, which typically contain metadata infor-mation about Grid Services, and send it to the IndexService. The Superpeer Service is a static Grid Servicethat processes requests coming from the remote POs,queries the Index Service to find resources matchingthe query constraints, and forwards query and queryHitmessages through the Network Module. Minor modifi-cations will be needed in this architecture to replace theGT3 framework with the WSRF-based Globus Toolkit4.

A simplified version of the resource discovery algo-rithm, executed by a Superpeer Service when receivinga query from an external PO, is reported inFig. 4.

4. Simulation analysis

ocold essi ate

etwork load. (i) The number of hops is limitedTime-To-Live (TTL) parameter; the TTL is decented only when the query is forwarded between

uper-peers, i.e. between two different POs. (ii) Euery message contains a field used to annotatodes that the query traverses along its path. A sueer does not forward a query to a neighbour super

hat has already received it. (iii) Each super-peer mains a cache memory where it annotates the IDs oast received query messages. A super-peer dishe queries that it has already received. (iv) Whver a super-peer, after receiving a query, finds seesources that satisfy the query constraints in theO, it constructs and forwards only one queryHit mage containing the IDs of the nodes that own thesources.

Note that techniques (ii) and (iii) are used to avhe formation of cycles in the query path, and are c

The performance of the resource discovery protescribed in Section3 was analyzed in order to ass

ts effectiveness in a Grid environment and estim


Fig. 4. The resource discovery algorithm executed by the Superpeer Service.

the influence of protocol parameters on performanceindices. An event-based object-oriented simulator wasused both for building a super-peer network, using themembership protocol, and for simulating the behaviourof the resource discovery protocol in very large Gridnetworks.

In this section, we introduce and discuss the mainparameters and performance indices used to evaluatethe resource discovery protocol, and describe the maincomponents of the simulator. Then, we report resultsobtained for two main scenarios. In the first one, theGrid network has a fixed global size, whereas the meancluster size changes. In the second scenario, the meancluster size is constant and the Grid size changes.

4.1. Simulation parameters and performanceindices

The performance of a resource discovery protocoldepends on the distribution of resources among Gridhosts. As mentioned in Section3, in Grid systems usersoften need to discover resources that belong to classesof resources, rather than a specific single resource. A

class of resources is defined as the set of resources thatsatisfy some given constraints on resource properties.For example, when building a distributed data miningapplication[10], a user may need to discover a softwarethat performs a clustering task on a given type of sourcedata. A query, containing the appropriate constraints, isgenerated to find such resources on the Grid; at a latertime, one of the discovered resources will be chosenby the Grid scheduler for execution. This is a com-mon problem in Grids where resources and/or servicesmust be searched to be used in complex applications.Therefore, the performance of a resource discoveryprotocol in a Grid is strictly related to the categoriza-tion of heterogeneous resources in a given applicationdomain.

We assumed, as in[8], that the average number ofelementary resources offered by a single node (peer orsuper-peer) remains constant as the Grid size increases.This average value was set to 5, and a gamma stochas-tic function was used to determine the number ofresources owned by each node. However, as the Gridsize increases, it becomes more and more unlikely thata new node connecting to the Grid provides resources


Table 1Simulation parameters

Parameter Value

Grid size (number of nodes),N 10–10000Mean cluster size,C 1–5000Overall number of resources offered by the Grid

vs. the Grid size5N

Overall number of resource classes offered by theGrid vs. the Grid size

5(log2 N)2

Maximum number of neighbours to which asuper-peer can forward a query, Nh

2–8

Time to live, TTL 1–5Mean query generation time, MQGT (s) 300Mean internal hop time (between peer and

super-peer) (ms)10

Mean external hop time (between two super-peers)(ms)

50

Mean elaboration time at super-peers (ms) 100

belonging to a new resource class. Therefore, weassumed that the overall number of distinct resourceclasses offered by the Grid does not increase linearlywith the Grid size. We adopted a logarithmic distribu-tion: the number of resource classes offered by a Gridnetwork withN nodes (whereN is comprised between10 and 10,000) is equal to5(log2 N)2. As an exam-ple, a Grid having 1024 nodes provides 5120 resourcesbelonging to 500 different classes.

Tables 1 and 2report, respectively, the simulationparameters and the performance indices used in ouranalysis. During a simulation run, each peer or super-peer node generates queries at random times. In par-ticular, the mean query generation timeMQGT is theaverage interarrival time between two successive queryrequests issued by the same node.MQGT is a stochasticvariable having a gamma probability distribution func-tion with a shape parameter equal to 2, and a meanvalue equal to 300 ms. For each generated query, thesimulator randomly selects the class of the resources

that the user needs to discover in accordance with auniform stochastic distribution.

Among the performance indices,Nres is deemedto be more important than the probability of successPsucc, since it is often argued that thesatisfaction ofthe query depends on the number of results (i.e. thenumber of discovered resources) returned to the userthat issued the query: for example, in[22] a resourcediscovery operation is consideredsatisfactory only ifthe number of results exceeds a given threshold. Themessage loadL should obviously be kept as low as pos-sible. This performance index often counterbalancesthe success indices, in the sense that high successprobabilities are achievable at the cost of having highelaboration loads. The ratioR is an index of efficiency:if we succeed in increasing the value ofR, a higherrelative number of queryHit messages, containing use-ful results, is generated or forwarded with respect tothe overall number of messages. Response times arerelated to thetime to satisfaction experienced by a user:to calculate them, we assumed that the mean hop time isequal to 10 ms for an internal hop (i.e. a hop between apeer and the local super-peer) and to 50 ms for an exter-nal hop (i.e. a hop between two super-peers). Further-more, we assume that a super-peer spends a mean elab-oration time of 100 ms to process an incoming query.

4.2. Description of the simulation environment

m-p ventc

d tos h ac tersa , ther -e

Table 2Performance indices

Performance index Definition

Probability of success, Psucc Probability that a query issued b ryHitMean number of results, Nres Mean number of resources thatMessage load,L Frequency of all messages receive cationsqueryHits/messages ratio,R Number of queryHits received by a hat nodeResponse times, Tr, Tr(K), Tr(L) Mean amount of time (s) that elap a generic

result (average response time), of

The simulation environment includes two coonents: a network generator and a discrete-eontinuous-time simulator.

The network generator, written in Java, is useimulate the construction of a Grid network. Suconstruction is driven by a set of network paramemong which: the Grid size, the mean cluster sizeesource distribution (Section4.1), the contact paramter. The contact parameter, defined in Section3, is

y a peer will succeed, i.e. will be followed by at least one quea node discovers after its queryd by a node (messages/s), including queries, queryHits, notifinode divided by the overall number of messages received by tses between the generation of a query and the reception ofthe kth result, of the last result


set to 2. The network generator adopts an incrementalapproach: at each step a new super-peer is added to theGrid, according to the membership protocol describedin Section3, and a number of peers are connected to thatsuper-peer, thus forming a new physical organizationconnected to the Grid. The output of the network gener-ator is a simple script describing the Grid topology andthe distribution of resources on Grid nodes; this scriptis passed as an input to the discrete-event simulator.

The discrete-event simulator, written in C++, is fullyobject-oriented and designed upon objects which sim-ulate the behaviour of Grid/P2P components and areable to exchange messages among them. Every time anobject receives a message, it performs a related proce-dure and possibly sends new messages to other objects.The objects types defined in the simulator are the fol-lowing:

• SuperPeer, which models a super-peer serving aPO. In particular, if the deployed Grid framework isthe Globus Toolkit 3, aSuperPeer object sim-ulates the behaviour of a Superpeer Service (seeFig. 3) and executes the algorithm shown inFig. 4.EachSuperPeer object is connected to a numberof similar objects serving the neighbour POs, to anumber of Peer objects corresponding to the localpeers, to aUserAgent object and to anInfoS-ervice object.

• Peer, which models a simple peer located within aPO. Such an object is connected to the localSuper-

• ser.

ure is

e

• ged

gheliv-

the

(b) Query (message exchanged betweenPeerandSuperPeer objects to search for resourcesbelonging to a given class);

(c) queryHit (message exchanged betweenPeer andSuperPeer objects to carry resultsto the user that originated a query);

(d) Notification (message sent by aSuper-Peer object to a localPeer that owns arequested resource; such aPeer is so notifiedthat it can receive a download request from thenode that originated the query message).

• InfoService, which simulates the behaviour ofthe Grid information service of a PO (e.g. the GT3Index Service) and is connected to theSuperPeerserving that PO. AnInfoServicemanages infor-mation about the resources located in the local POand is contacted by theSuperPeerobject to searchfor the resources specified by a query message.

• Event Dispatcher, which manages events,stores them in a queue ordered by message deliverytimes, and dispatches them to destination objects.

4.3. Performance evaluation

The proposed discovery protocol was first analyzedfor a Grid network having a fixed number of nodes(10,000, including super-peers and simple peers), anda mean cluster sizeC ranging from 1 (correspondingto a fully decentralized P2P network, in which peersand super-peers coincide) to 5000 (corresponding to aG dif-f wt ancew t ane com-p ers.

icalv sec-t tionp ape ofs tiveb ver,f imu-l eso ted int

e ed

Peer object and to aUserAgent object.UserAgent generates queries on behalf of a uEachUserAgent object is connected to aPeer orSuperPeerobject, in order to model the behavioof a user attached to a Grid node. If such a noda simple peer, queries are forwarded by thePeerobject to the localSuperPeer object; otherwisqueries are forwarded by theSuperPeer object toa number of adjacentSuperPeers.Event, which embodies a message exchanamong UserAgent, Peer and SuperPeerobjects. AnEvent objects is characterized throuits source and destination objects, its message dery time and its type. PossibleEvent types are:(a) GeneratedQuery (message sent by aUser-

Agent to the attachedPeer or SuperPeerobject when a new query is generated byuser);

rid composed of only two clusters). We testederent values ofNh andTTL, in order to analyze hohose parameters can be tuned to improve performhen the mean cluster size is known. Notice thastimated value of the mean cluster size can beuted by exchanging information among super-pe

It should be remarked that, though the numeralues of the performance curves reported in thision are obviously driven by the values of the simulaarameters that have been chosen, the general shuch curves can give a useful insight into the qualitaehaviour of the resource discovery protocol. Howe

or the sake of clearness, our discussion about the sation results will often refer to exact numerical valuf parameters and performance indices represen

hese curves.Results shown inFigs. 5–14are obtained withNh

qual to 4. InFig. 5, the probability of success is plott


Fig. 5. Probability of success w.r.t. the cluster size, for different val-ues of TTL and Nh = 4.

Fig. 6. Mean number of results w.r.t. the cluster size, for differentvalues of TTL and Nh = 4: overall results.

Fig. 7. Mean number of results w.r.t. the cluster size, for differentvalues of TTL and Nh = 4: internal and external results.

Fig. 8. Message load at super-peers w.r.t. the cluster size, for differentvalues of TTL and Nh = 4.

Fig. 9. QueryHits/messages ratio at super-peers w.r.t. the cluster size,for different values of TTL and Nh = 4.

Fig. 10. Average response times w.r.t. the cluster size, with Nh = 4,for different values of TTL.


Fig. 11. Response times of the first result w.r.t. the cluster size, withNh = 4, for different values of TTL.

Fig. 12. Response times of the 10th result w.r.t. the cluster size, withNh = 4, for different values of TTL.

Fig. 13. Response times of the last result w.r.t. the cluster size, withNh = 4, for different values of TTL.

Fig. 14. Response times w.r.t. the cluster size, with Nh = 4 andTTL = 4: comparison between Tr, Tr(1), Tr(10) and Tr(L).

versusC, for TTL values ranging from 1 to 5. It isobserved that, if the mean cluster size is low, a highTTL value is necessary to explore a significant portionof the Grid and achieve an acceptable probability ofsuccess. With larger cluster sizes (higher than 100),the probability of success is close to 1, but the figuredoes not allow for figuring out how many results canbe expected on average.

Fig. 6 reports the mean number of discoveredresources versusC, forTTL values ranging from 1 to 5.It appears that, as long asC is lower than 1000, the num-ber of results can be notably increased by increasingtheTTL value. Beyond that threshold, curves relatedto different values ofTTL tend to converge, meaningthat the protocol allows for exploring the entire Grid,irrespective of theTTL value. This information can beexploited when tuning the value ofTTL. For example,if we want to maximize the number of results, witha value ofC equal to 100, theTTL value should be5 or higher, whereas if the value ofC is higher than500, it is almost ineffective to increase theTTL valuebeyond 3. Moreover, the very small number of resultsobtained for a decentralized network, i.e. with a clustersize equal to 1, demonstrates the great advantage thatderives from the use of the super-peer model.

While Fig. 6 reports the overall number of resultsexpected when issuing a query,Fig. 7 distinguishesbetween internal results (corresponding to resourcesdiscovered within the local cluster) and external results(coming from remote clusters). With a fixedTTL, then r size
umber of external results increases as the cluste


increases, because a higher number of peers is searchedin remote clusters. However, in a Grid composed byvery large clusters, it is observed that the number ofexternal results begins to decrease because a consid-erable number of peers is located in the local cluster;accordingly, the contribution of internal peers to theoverall number of results becomes significant. Note thatthe number of internal results is obviously independentfrom theTTL value.

FromFig. 8, we see that a high processing load atsuper-peers is a toll to pay if a high number of resultsis desired. Indeed, the curves of message load show atrend similar to the curves, observed inFig. 7, relatedto the number of external results.

A trade-off should be reached between maximizingthe number of results and minimizing processing load;to this aim, we calculated theR index at super-peers.From Fig. 9 we see thatR, for a fixed value ofTTL,initially increases withC, as a result of the fact that thenumber of incoming queryHits experiences a higherincrease rate than the overall number of messages.Beyond a threshold value ofC, which depends on thevalue ofTTL, an opposite trend is observed; the numberof received queryHits falls down, due to the fact thata consistent percentage of peers is located within thelocal cluster. Remind that a super-peer receives query-Hits only from remote clusters, since it already knowsthe resources offered by the local cluster. Values ofRconverge to 1/3 withC = 5000, i.e. with only two clus-ters in the Grid, for the following reason: each super-p eries( riesf e then anda er isf

T thembtev lueso een2

sizea ea

value and decreases with the cluster size. The val-ues of response times decrease as the cluster sizeincreases, for two main reasons: (i) queries and query-Hits traverse a smaller number of super-peers and (ii)a higher fraction of queryHits are internal queries,which are statistically received earlier than externalqueryHits.

The response time related to the first resultTr(1)is depicted inFig. 11; if compared toTr, curves ofTr(1) have a steeper slope because, as the clustersize increases, it is more and more likely that the firstqueryHit comes from a peer located within the localcluster, thus taking a shorter time to be delivered.

The response time related to the 10th responseTr(10) is reported inFig. 12. Note that the curvesof Tr(10) are depicted only for the values of clus-ter size andTTL that allow for obtaining at least 10results with a significant probability.Fig. 13shows theresponse time related to the last result received for agiven query,Tr(L). This index corresponds to theamount of time after which all results are received;note the mean number of results was shown inFig. 6.If compared with the other response times discussedso far,Tr(L) shows a dissimilar behaviour, since itslightly increases as the cluster size increases from 2 toa value that depends on theTTL value. The reason isthat the mean number of results increases very rapidlywith the cluster size, therefore there is a higher andhigher probability that the last query, which is usuallyan external query, experiences a long response time.

icesa4 s,w llert uallyrT fol-l s the1 . Forla lts ish

ofN da -u ce.Ti s

eer receives comparable numbers of internal ququeries from local peers), external queries (querom the other super-peer) and queryHits, becausumbers of peers in the two clusters are similarlmost each query directed to the other super-pe

ollowed by one queryHit message.Fig. 9 helps to identify, for a given value ofC, the

TL value that maximizes protocol efficiency. Asean cluster size increases, the optimal value ofTTLecomes smaller and smaller. For example, ifC is set

o 100, the value ofTTL that maximizes the ratioR isqual to 3, whereas ifC is set to 500, the optimalTTLalue is 2. It is interesting to note that the highest vaf R are obtained for cluster sizes comprised betw00 and 500 and aTTL value equal to 2.

Values of response times versus the clusterre reported inFigs. 10–14. Fig. 10 shows that thverage response timeTr increases with theTTL

In Fig. 14, all the discussed response time indre compared for a fixed value ofTTL, which is set to. It is observed thatTr(10) presents higher valueith respect toTr, as long as the cluster size is sma

han 20. The reason is that the 10th result is not acteceived for all queries, as can be derived fromFig. 12;r(10) is calculated only for the queries that are

owed by at least 10 results, and for those querie0th result usually experiences a quite long delay

arger cluster sizes,Tr(10) becomes lower thanTrndTr(L) because the number of expected resuigher than 10.

Figs. 15 and 16report, respectively, the valuesres andL obtained for aTTL value equal to 4 anvariable number of neighboursNh, in order to eval

ate howNh can be tuned to optimize performanhe number of results, depicted inFig. 15, significantly

ncreases with the value ofNh only if the cluster size i


Fig. 15. Mean number of results w.r.t. the cluster size, for differentvalues of Nh, with TTL = 4.

lower than 100; with larger clusters, a value ofNhequalto 4 is sufficient to achieve a high number of results.As expected, fromFig. 16it appears that the processingload is highly increased if queries are forwarded to alarge number of neighbours.

Fig. 17 shows that the values ofR are maxi-mized withNh equal to 2, and decrease as the num-ber of neighbours increases. From an analysis ofFigs. 15, 16 and 17, we can conclude that it is notconvenient to setNh to a value higher than 4 if thecluster size exceeds 100, because we would increasethe network and processing load without significantlyincreasing the number of results. For example, fromFig. 15, we can deduce that, withC equal to 200, themean number of results does not appreciably increaseif Nh is set to a value greater than 4, whereas by using

F r dif-f

Fig. 17. QueryHits/messages ratio at super-peers w.r.t. the clustersize, for different values of Nh, with TTL = 4.

Nh= 8 the message load increases significantly (seeFig. 16) and the value ofR decreases (seeFig. 17).

Finally, the performance of the super-peer modelwas analyzed for different Grid sizes. The mean clustersize was set to 10, while the number of Grid nodes wasvaried from 10 (corresponding to a super-peer networkhaving only one cluster) to 10,000. The value ofNhwas set to 4. It appears fromFig. 18that an increase intheTTL value allows for discovering a notably highernumber of resources only with networks having morethan 1000 nodes. As usual, a similar effect is observedon the processing load (Fig. 19); as discussed above, atrade-off may be obtained by analyzing the efficiencyindexR.

Fig. 20shows that the optimumTTL value, i.e. thevalue that maximizesR, increases with the Grid size.

F rentv

ig. 16. Message load at super-peers w.r.t. the cluster size, foerent values of Nh, with TTL = 4.

ig. 18. Mean number of results w.r.t. the Grid size, for diffealues of TTL, with Nh = 4.


Fig. 19. Message load at super-peers w.r.t. the Grid size, for differentvalues of TTL, with Nh = 4.

Fig. 20. QueryHits/messages ratio at super-peers w.r.t. the Grid size,for different values of TTL, with Nh = 4.

For example, in a Grid with 1000 nodes the maximumvalue ofR is obtained with aTTL equal to 3, whereas,in a Grid with 10,000 nodes,R is maximized with aTTL equal to 5. As a consequence,TTL should be setto a value equal or greater than 5 only if the numberof nodes exceeds 5000; for smaller networks, tuningdecisions should take into account that a high valueof TTL can slightly increase the number of results butsurely decreases the overall efficiency.

5. Conclusions

The P2P model is a distributed computing paradigmthat is used to harness the computing, storage, andcommunication power of hosts in the network to make

their underutilized resources available to others. P2Pshares this goal with the Grid, which is designed toprovide access to remote computing resources for high-performance and data-intensive applications.

Resource discovery in Grid environments is cur-rently based on centralized or hierarchical models.Because such information systems are built to addressthe requirements of organizational-based Grids, theydo not deal with more dynamic, large-scale distributedenvironments. The number of queries in such envi-ronments makes a client–server approach ineffective.Future large-scale Grid systems should implement aP2P-style decentralized resource discovery model.

The super-peer model is a novel approach that facil-itates the convergence of P2P models and Grid envi-ronments, since a super-peer serves a single organi-zation in a Grid and at the same time connects toother super-peers to form a peer network at a higherlevel. This paper proposed an approach based on thesuper-peer model for handling membership manage-ment and resource discovery services in a Grid. Inparticular, a resource discovery protocol was presentedand discussed. We reported simulation results, obtainedwith different network configurations, which give somegeneral insight into how real implementations of thediscovery protocol might behave. In particular, we eval-uated the influence of protocol parameters (such asthe number of neighbour super-peers and the time tolive) on performance indices. Performance evaluationallows for efficiently tuning the values of such protocolp

A

tal-i er theF theE 5).

R

ridPro-igh-SA,

arameters when a real Grid must be deployed.

cknowledgements

This work has been partially supported by the Ian MIUR project Grid. It (RBNE01KNFP), and thesearch work of D. Talia is also carried out underP6 Network of Excellence CoreGRID funded byuropean Commission (Contract IST-2002-00426

eferences

[1] K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman, Ginformation services for distributed resource sharing, in:ceedings of the 10th IEEE International Symposium on HPerformance Distributed Computing, San Francisco, CA, U2001.


[2] I. Foster, C. Kesselman, J.M. Nick, S. Tuecke, Grid servicesfor distributed system integration, IEEE Comput. 35 (6) (2002)37–46.

[3] I. Foster, C. Kesselman, S. Tuecke, The anatomy of the grid:enabling scalable virtual organizations, Int. J. Supercomput.Appl. 15 (3) (2001).

[4] L. Garces-Erce, E. Biersack, P. Felber, K. Ross, G. Urvoy-Keller, Hierarchical peer-to-peer systems, in: Proceedings ofEuroPar, Klagenfurt, Austria, 2003.

[5] The Globus Alliance: Information Services in theGlobus Toolkit 2 Release,http://www.globus.org/mds/mdstechnologybriefdraft4.pdf.

[6] The Globus Alliance: Information Services in the GlobusToolkit 3 Release,http://www.globus.org/mds.

[7] The Gnutella Protocol Specification v.0.4.http://www9.limewire.com/developer/gnutellaprotocol0.4.pdf.

[8] A. Iamnitchi, I. Foster, J. Weglarz, J. Nabrzyski, J. Schopf, M.Stroinski, in: Grid Resource Management (ed.), A Peer-to-PeerApproach to Resource Location in Grid Environments, KluwerPublishing, 2003.

[9] Kazaa,http://www.kazaa.com.[10] C. Mastroianni, D. Talia, P. Trunfio, Managing heterogeneous

resources in data mining applications on Grids using XML-based metadata, in: Proceedings of Heterogeneous ComputingWorkshop, Nice, France, 2003.

[11] A. Montresor, A robust protocol for building superpeer overlaytopologies, in: Proceedings of the International Conference onPeer-to-Peer Computing, Zurich, Switzerland, 2004.

[12] W. Nejdl, M. Wolpers, W. Siberski, C. Schmitz, M. Schlosser, I.Brunkhorst, A. Loser, Super-peer-based routing and clusteringstrategies for RDF-based peer-to-peer networks, in: Proceed-ings of World Wide Web Conference, Budapest, Hungary, 2003,pp. 536–543.

[13] D. Puppin, S. Moncelli, R. Baraglia, N. Tonellotto, F. Silvestri,A peer-to-peer information service for the Grid, in: Proceedings

Jose,

[ er, AACM

[ vel

[ bjects, in:nce

[ sh-ernetgo,

[ signce on

[ rids,

[ .

[21] B. Yang, H. Garcia-Molina, Designing a super-peer network,in: 19th International Conference on Data Engineering, IEEEComputer Society Press, Los Alamitos, CA, USA, 2003.

[22] B. Yang, H. Garcia-Molina, Efficient search in peer-to-peer net-works, in: Proceedings of ICDCS, Wien, Austria, 2002.

Carlo Mastroianni is a researcherat the Institute for High PerformanceComputing and Networks of the ItalianNational Research Council (ICAR-CNR)in Cosenza, Italy. He received a PhD incomputer engineering from the Universityof Calabria, Italy, in 1999. His researchinterests include distributed systems andnetworks, grid computing, peer-to-peer net-works, multimedia distribution networks.As a lecturer, he teaches computer networks

at the University of Calabria and Databases at the “Parthenope”University of Naples, Italy.

Oreste Verta is a research associate at theFaculty of Engineering at the Universityof Calabria, Italy. He received the Laureadegree in computer engineering from theUniversity of Calabria in 2004. His researchinterests include grid computing, peer-to-peer networks, web service orchestration.

Domenico Talia is a full professor of com-puter science at the Faculty of Engineer-ing at the University of Calabria, Italy, aresearch associate at ICAR-CNR in Rende,

Heat

er-edg,laralia

p tionalj uter,I S,F nfer-e utureG al onW urnal,a rnal.H wrotet utiveC er oft e iss es andi

of International Conference on Broadband Networks, SanCA, USA, 2004.

14] S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenkscalable content-addressable network, in: Proceedings ofSIGCOMM, San Diego, CA, USA, 2001.

15] P. Reiher, M. Scott, Forwarding Layer for Application-lePeer-to-Peer Services (FLAPPS),http://flapps.cs.ucla.edu/.

16] A. Rowstron, P. Druschel, Pastry: scalable, distributed olocation and routing for large-scale peer-to-peer systemProceedings of the 18th IFIP/ACM International Confereon Distributed Systems Platforms, 2001.

17] I. Stoica, R. Morris, D. Karger, M.F. Kaashoek, H. Balakrinan, Chord: a scalable peer-to-peer lookup service for intapplications, in: Proceedings of ACM SIGCOMM, San DieCA, USA, 2001.

18] D. Talia, P. Trunfio, A P2P Grid services-based protocol: deand evaluation, Proceedings of the European ConferenParallel Computing (EuroPar), LNCS 3149, 2004.

19] D. Talia, P. Trunfio, Towards a synergy between P2P and GIEEE Internet Comput. 7 (4) (2003) 94–96.

20] The Web Services Resource Framework,http://www.globusorg/wsrf/.

Italy and a partner at Exeura s.r.l.received the Laurea degree in physicsUniversity of Calabria. His research intests include grid computing, distributknowledge discovery, parallel data mininparallel programming languages, celluautomata and peer-to-peer systems. T

ublished three books and more than 150 papers in internaournals such as Communications of the ACM, IEEE CompEEE TKDE, IEEE TSE, IEEE TSMC-B, IEEE Micro, ACM CGCS, Parallel Computing, IEEE Internet Computing and conce proceedings. He is a member of the editorial boards of the Feneration Computer Systems journal, the International Journeb and Grid Services, the Parallel and Distributed Practices jo

nd the Web Intelligence and Agent Systems International joue is a member of the European Commission expert panel that

he “Next Generation Grid” report and is a member of the Execommittee of the CoreGRID Network of Excellence and memb

he European Knowledge Discovery Network of Excellence. Herving as a program committee member of several conferenc

s a member of the ACM and the IEEE Computer Society.

http://www.globus.org/mds/mdstechnologybrief_draft4.pdf

http://www.globus.org/mds/mdstechnologybrief_draft4.pdf

http://www.globus.org/mds

http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf

http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf

http://www.kazaa.com/

http://flapps.cs.ucla.edu/

http://www.globus.org/wsrf/

http://www.globus.org/wsrf/

a super-peer model for resource discovery services in...

Documents