master of science on e-commerce - comp.polyu.edu.hkcstyng/xian/isreport4.pdf · independent study...


Master of Science

On

E-Commerce

COMP5009 Independent Study

Underground Web [Scalability on P2P Models]

Name: Khunsarnsombat Chairat, Kelvin 02407735g

Lecturer: Dr. Vincent Ng

Date: 05/12/2003

Version: Draft 0.9


Table of Contents

1 Aims of project
2 Introduction
    2.1 History
    2.2 What Benefits to society? (Legal)
    2.3 How P2P program affect society? (Illegal)
3 How that application works?
    3.1 What's P2P?
    3.2 What's Peer?
    3.3 Peer Autonomy
    3.4 What P2P applications can share?
    3.5 Adoption of P2P
4 P2P Architectures
    4.1 Semi-centralized P2P Models
        4.1.1 Single Centralized Index Server Topology
        4.1.2 Computational Topology
        4.1.3 Multiple Server Nodes Topology
    4.2 Decentralized P2P Models
        4.2.1 Ring Topology
        4.2.2 Hierarchical Topology
        4.2.3 Mesh Topology
        4.2.4 Pure Decentralized Topology
5 P2P Scalability
    5.1 Files Storage – Response Time
    5.2 Index/Catalogue
    5.3 Searching Process
    5.4 Fault Tolerance
    5.5 Cost And Effective
    5.6 Symmetric Communication
    5.7 Pervasive Computing
6 Models Comparison Against Scalability
    6.1 Comparison with different semi-centralized models
    6.2 Comparison with different decentralized models
    6.3 Semi-centralized Vs Decentralized
7 Conclusion
Reference


1 Aims of project

This project focuses on the technical side of P2P technology rather than on social issues. The aims of this project are to:

• Study P2P technology and P2P models.

• Compare different models in order to identify their characteristics.

• Discuss the scalability of the different models in several areas: file storage and response time, index/catalogue, searching process, fault tolerance, cost and effectiveness, symmetric communication and pervasive computing.


2 Introduction

2.1 History

P2P rose to prominence in early 2000. It became a new paradigm in network computing: a new technology for connecting people and for effectively utilizing untapped resources anywhere on the Internet and the Web.

P2P technologies and policies are not newcomers to the field of computing science. The first implementation of the Internet was composed of computers (nodes) that behaved as equals, or peers, to each other. Every computer had equal rights in sending and receiving packets. In the early days of the Internet (called the ARPANET at the time), applications such as ftp and telnet were used in a P2P fashion, as every computer could ftp or telnet to any other.

After the ARPANET, various P2P applications arose. Usenet appeared in 1979 and was used by students at the University of North Carolina and Duke University to exchange messages within a set of predefined topics. It works much like an email system: users can send messages to others and receive messages from others. 1

The Domain Name System (DNS) is another well-known example and is essential to the Internet today. The main concept of DNS is to distribute the contents of the original host table file, hosts.txt, across many machines. That file mapped host names to their IP addresses. As the Internet grew exponentially, no single machine could store all domain names and related IP addresses, so the table had to be partitioned and shared.

When a client needs an Internet address, it sends the host name to its nearest DNS server. If that server knows the name, it replies with the corresponding IP address; otherwise, it propagates the query to the authority for the particular namespace. This authority (another DNS server) may delegate the query to a further authority until the query is answered, and the result is propagated back and cached along the way. In other words, the name servers work in a P2P fashion, since they are servers and clients at the same time.

Napster, an MP3 file-sharing program, is an example of a P2P system. It is not a true P2P system, but it was the first one to raise important issues for the P2P community. The most important of these was that the concepts of ownership and distribution of information became separated. Using Napster, many people distributed information that they did not own, in a free and pseudo-anonymous manner.

Users of Napster could share MP3 files over the Internet 24 hours a day. According to Webnoize (www.webnoize.com), the Napster user base was around 1.5 million in February 2001. Exchanges of MP3 files reached 2.7 billion in February 2001, and copyright enforcement seemed to be out of control. 1

Unlike Napster, Gnutella is a true P2P system. It gave birth to Infrasearch, which was the first approach to P2P information retrieval. As the computing industry started showing great interest in P2P solutions, several protocols emerged. JXTA, initiated by Sun Microsystems, is a set of protocols intended to provide a P2P solution to the industry.

2.2 What Benefits to society? (Legal)

P2P can help people share information and knowledge over the Internet. It can be more powerful than a simple search engine, as it gives users a quick, easy and effective way to find the information they want by simply entering one or more keywords. The information can be in any format. Users can also share their own information and knowledge with the public.

2.3 How P2P program affect society? (Illegal)

Clearly, if users of a P2P program share files that are not their own work, they violate intellectual property and copyright laws.

'Intellectual Property Laws' - "Intellectual property laws are not necessarily looking at who is right or wrong, but how a company can protect what is rightfully theirs and what can be done if these laws are violated." 2

'Copyright' - "In the United States, the copyright law protects the right of an author to control the public distribution, reproduction, display, and adaptation of his original work. The law covers many categories of work: pictorial, graphic, musical, dramatic, literary, pantomimes, motion picture, sculptural, sound recording, and architectural." "A copyright protects the expression of ideas rather than the ideas themselves." 2

In addition, P2P gives criminals an easy and effective way to share illegal material, such as child pornography pictures and movies. Terrorists can also use it to communicate with each other easily when planning their activities.


3 How that application works?

3.1 What's P2P?

Peer-to-peer is a communications model in which each party has the same capabilities and

either party can initiate a communication session. Other models with which it might be

contrasted include the client/server model and the master/slave model. In some cases,

peer-to-peer communications is implemented by giving each communication node both

server and client capabilities. In recent usage, peer-to-peer has come to describe

applications in which users can use the Internet to exchange files with each other directly or

through a mediating server. 3

Because each party has the same capabilities, each party exists as a distinct, discrete unit, as shown in Figure 3.1. The parties form a network of equals in which anyone can speak and listen.

Figure 3.1 P2P application network diagram (peers exchanging request/response messages across the network)
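The idea that each communication node has both server and client capabilities can be sketched in a few lines of Python. This is only an illustrative model: the class name Peer, its methods and the sample file names are hypothetical and do not come from any particular P2P product.

```python
class Peer:
    """A node with both server and client capabilities (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.files = {}  # file name -> content held by this peer

    def handle_request(self, filename):
        # Server role: answer a request that another peer initiated.
        return self.files.get(filename)

    def request(self, other, filename):
        # Client role: initiate a session with another peer.
        return other.handle_request(filename)


alice = Peer("alice")
bob = Peer("bob")
alice.files["song.mp3"] = b"alice-data"
bob.files["notes.txt"] = b"bob-data"

# Either party can initiate: each peer acts as client and server at once.
print(bob.request(alice, "song.mp3"))
print(alice.request(bob, "notes.txt"))
```

In a client/server model only one side would expose handle_request; here both sides do, which is what makes the two nodes peers.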

3.2 What's Peer?

“A person who has equal standing with another or others, as in rank, class, or age.” 1 The main idea is equality: each peer is the same within the community or society, with no difference in rank, class or age between peers. The client/server and master/slave models are completely different from P2P, because they have two levels of roles: one is the master or server, the other is the client or slave. Different roles have different duties. For example, the master or server will most likely store data or process business logic, while the client or slave will just display the data or perform some simple calculations on it.

3.3 Peer Autonomy

To implement decentralized models, the peers should be independent and self-governing; this is known as node autonomy. In client/server models, by contrast, the server controls and manages all the clients, including the file storage, database and networks. In decentralized models, every peer must manage its own file storage and communicate with other peers.

3.4 What P2P applications can share?

P2P applications might share files, bandwidth, processing power, application components and/or raw data. These fall into four categories:

• Managing and sharing information – files, documents, photos, music, videos, and movies can all be shared with business partners, friends, and colleagues. More advanced sharing enables one machine to act as a general task manager by collecting and aggregating results. For example, Napster used a central directory to hold information on which MP3 music files were stored on its users' PCs (in effect, a yellow pages), which was made available to other users so that they could download the files to their own machines. Therefore, a general definition is more useful for practical purposes: "any application or process that uses a distributed architecture and allows peers to provide and consume resources." 1

• Collaboration – individual users find that address book, schedule, chat and email software improves their productivity. Connecting desktop productivity software together enables collaborative e-business communities to form flexible, productive, and efficient working teams. For example, Java developers use OpenPorject.net to collaborate. On a broader scale, hundreds of thousands of people use instant messaging, which may be the most popular P2P application to date.


• Enterprise resource management – coordinating workflow processes within an

organization leverages the existing infrastructure of networked desktop computer

systems. For example, Groove enables an aerospace manufacturer to post job

order requests to partner companies and route the completed requests from one

department to the next.

• Distributing computation – a natural extension of the Internet’s philosophy of

robustness through decentralization is to design P2P systems that send computing

tasks to millions of servers, each one possibly also being a desktop computer. For

example, SETI@home uses a central system to divide up radio-telescope data for

processing by Internet users' home PCs, when they are not being used for other

purposes, and coordinates the results. 1
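The distributed-computation category can be illustrated with a small sketch in which a central coordinator divides the data into work units, hands them out, and combines the partial results. It is not SETI@home's actual code: the function names are hypothetical, the "processing" is a stand-in sum of squares, and the units are processed in a local loop where a real system would send them to users' PCs.

```python
def split_work(data, chunk_size):
    """Central coordinator: divide the data into work units."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def process_unit(unit):
    """Work done on one home PC; a sum of squares stands in for real analysis."""
    return sum(x * x for x in unit)


def coordinate(data, chunk_size):
    """Hand out the units, then collect and combine the partial results."""
    units = split_work(data, chunk_size)
    partial_results = [process_unit(unit) for unit in units]  # would run on peers
    return sum(partial_results)


samples = list(range(10))
print(split_work(samples, 3))
print(coordinate(samples, chunk_size=3))
```

Because each unit is independent, the coordinator does not care which PC processes which unit or in what order the results come back.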

3.5 Adoption of P2P

“Peer-to-peer (P2P) computing for business will become common within five years, as more

content management vendors offer P2P functions, according to predictions by analyst

company Gartner.” P2P systems let users search and access data and content held on other

users' systems, rather than on a server. Gartner said P2P poses challenges in security, policy

and workflow, but said firms will gain "significant competitive advantage" by being able to

access content quickly via P2P systems. 5

It seems that P2P will become more common in business in the future. One example is CEO Jon Zimmerman, who has been researching P2P for two years and likes to stay ahead of the curve. "I was working on Web sites in 1994," Zimmerman says. 6

The New York City-based consultancy is a Groove Networks developer and is exploring projects with as-yet-unnamed grid computing firms. Symbiant has also established a discussion group, PEER Grid, which meets to explore grid development issues. "If P2P is implemented in a hybrid collaborative environment, companies are excited by the dynamic ways that people can work together." 6


Int*Net Methods president Glenn Witerski wasn't even thinking about P2P when he

stumbled across P2P collaboration vendor 1st Works. The Tucson-based integrator and

consulting firm, staffed by pros with 20 years in the computer business, isn't necessarily

looking for bleeding edge technology. Witerski dug up 1st Works through a straightforward

Web search, and simply liked what he saw. "They were the only ones that had the right

buzzwords as far as real-time collaboration," he says. 6


4 P2P Architectures

P2P systems are built up of nodes, most likely computers. Those computers or devices are connected together in some fashion, called a model. Computers or devices communicate with each other by passing messages. A message can either transfer simple data or carry commands from one device to control other device(s). This section identifies P2P dynamic networks, nodes and models.

There are two core types of model. One is semi-centralized, which is partially centralized; the other is decentralized, which is fully decentralized, including file storage, indexing and searching. Finally, this report compares the different models based on their characteristics.

4.1 Semi-centralized P2P Models

A semi-centralized P2P model contains one or more central points of control, the server node(s), each of which can be a server or a simple PC. The central point serves two purposes. First, it can maintain strict control over the whole network, including authentication (a certain level of security control) and resource access control. Second, it can act as a kind of index server, without providing any further content or services. In that case it is just a lookup server; Napster follows this model.

This kind of model can generally be divided into three parts: physical file and resource storage, indexing, and searching. Indexing and searching are processed at the central server(s), which provide a lookup service to client nodes. The files and resources are stored on the client nodes.

Because this kind of model needs central server node(s), its cost will be much higher due to the extra hardware and software for the server(s). Besides, if the server(s) go down, the whole system will be out of service as well; the server becomes a single point of failure.


4.1.1 Single Centralized Index Server Topology

Figure 4.1 Single Centralized Index Server

• Indexing and searching are centralized in a single central server, the server node, which provides a lookup service to all client nodes (solid lines).

• The server node maintains the index/catalogue that client nodes search.

• Files or resources are stored in client nodes, and they can be stored in the server node as well.

• Client nodes transfer data to and communicate with other client nodes directly (dashed lines), after getting the reference from the server node (solid lines).

• Client nodes are autonomous, independent and intelligent.
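The two-step interaction in this topology (a lookup at the server node, then a direct transfer between client nodes) can be sketched as follows. The classes IndexServer and Client, their methods and the file names are hypothetical and are used only to illustrate the flow; no real Napster interface is implied.

```python
class IndexServer:
    """Central lookup server: holds only the index, never the files."""

    def __init__(self):
        self.index = {}  # file name -> set of client names holding it

    def register(self, client_name, filenames):
        for filename in filenames:
            self.index.setdefault(filename, set()).add(client_name)

    def lookup(self, filename):
        return sorted(self.index.get(filename, set()))


class Client:
    """Client node: stores the files and serves them to other clients."""

    def __init__(self, name, files, server):
        self.name = name
        self.files = dict(files)
        server.register(name, self.files)  # iterating a dict yields its keys

    def send_file(self, filename):
        return self.files[filename]


server = IndexServer()
clients = {
    "a": Client("a", {"x.mp3": b"X"}, server),
    "b": Client("b", {"y.mp3": b"Y"}, server),
}

owners = server.lookup("y.mp3")                # lookup request to the server node
data = clients[owners[0]].send_file("y.mp3")   # direct client-to-client transfer
print(owners, data)
```

The server only ever answers "who has it"; the file content itself never passes through the server node.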

4.1.2 Computational Topology

Figure 4.2 Computational (diagram labels: Server, Client, Lookup requests, Data transfers and communications)

• Indexing and searching are centralized in a single central server, the server node.

• The server node maintains the index/catalogue that client nodes search.

• Files or resources are stored in all nodes, which is why this is not a client/server model.

• All client nodes get data from the server node, as the server obtains the file for them.

• Client nodes are not autonomous, independent or intelligent.

4.1.3 Multiple Server Nodes Topology

Figure 4.3 Multiple Server Nodes

• Indexing and searching are centralized in multiple central servers, the server nodes, which provide a lookup service to all client nodes.

• The server nodes maintain the index/catalogue that client nodes search. The index/catalogue can be either a full-set or a partitioned implementation, as described in section 5.2. If each server stores the full set of the index, a single point of failure can be avoided.

• Files or resources are stored in all nodes.

• Client nodes transfer data to and communicate with other client nodes directly, after getting the reference from the server nodes.

• Client nodes are autonomous, independent and intelligent.
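One possible way to partition the index across several server nodes is to assign each file name to a server by hashing the name. The source does not say how the partitions are chosen, so hash partitioning is an assumption made here for illustration, and the class PartitionedIndex and its methods are hypothetical.

```python
import zlib


class PartitionedIndex:
    """Index split across several server nodes by a hash of the file name."""

    def __init__(self, server_count):
        # One dictionary per server node; each holds only its share of the index.
        self.partitions = [{} for _ in range(server_count)]

    def _server_for(self, filename):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(filename.encode("utf-8")) % len(self.partitions)

    def register(self, filename, client_name):
        partition = self.partitions[self._server_for(filename)]
        partition.setdefault(filename, set()).add(client_name)

    def lookup(self, filename):
        partition = self.partitions[self._server_for(filename)]
        return sorted(partition.get(filename, set()))


index = PartitionedIndex(server_count=3)
index.register("a.mp3", "alice")
index.register("a.mp3", "bob")
index.register("b.mp3", "carol")
print(index.lookup("a.mp3"))
```

With this scheme each entry lives on exactly one server, so the loss of a server loses part of the index; storing the full set on every server trades that risk for extra storage and synchronization work.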


4.2 Decentralized P2P Models

A decentralized P2P model does not contain any central point of control (server node), so it does not need any extra hardware or software for a server node. The cost of this kind of model can therefore be lower.

Files or resources, indexing and searching are distributed across all peers, so all peers are equal, autonomous, independent and intelligent. Each node must maintain its own index/catalogue, which can be either a full-set or a partitioned implementation, as described in section 5.2. Peers communicate symmetrically. Gnutella is one example.

4.2.1 Ring Topology

Figure 4.4 Ring

• Peers are organized in a structured fashion.

• Each peer is connected to two other peers, forming a loop. Data is sent from peer to peer around the loop.

• Data can be sent in either direction to reach the destination peer.

• If any peer fails, the whole network will be down.

• Each peer must be aware of two peers: the two closest to it.
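The ring structure can be modelled by numbering the peers 0 to n-1 around the loop. The sketch below, with hypothetical function names, computes the two neighbours a peer must know and the number of hops a message needs when it travels in the shorter of the two directions.

```python
def neighbours(position, ring_size):
    """The two peers each peer must know: its predecessor and its successor."""
    return ((position - 1) % ring_size, (position + 1) % ring_size)


def ring_hops(source, destination, ring_size):
    """Hops between two positions on the ring, taking the shorter direction."""
    clockwise = (destination - source) % ring_size
    counter_clockwise = (source - destination) % ring_size
    return min(clockwise, counter_clockwise)


print(neighbours(0, 8))    # (7, 1)
print(ring_hops(0, 6, 8))  # 2, going counter-clockwise via peer 7
```

The hop count grows with the size of the ring (at most half the ring in the worst case), which is one reason why response time in this topology depends on the number of peers.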

4.2.2 Hierarchical Topology

Figure 4.5 Hierarchical

• Peers are organized in parent-child relationships, that is, in a tree structure.

• Each peer can communicate directly with a few close peers, namely its parent and its children, and indirectly with all other peers.

• If any peer fails, only the peers below it suffer from the problem; the rest of the network is not affected.

• Child peers must be aware of their parent peer.
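The effect of a failure in the tree can be made concrete with a small sketch. Assuming the tree is stored as a child-to-parent mapping (a representation chosen here for illustration, with hypothetical peer names), the peers that suffer are exactly those that have the failed peer somewhere on their path to the root.

```python
def cut_off_by_failure(parent_of, failed):
    """Return the peers that lose their path to the root when `failed` goes down."""
    affected = set()
    for peer in parent_of:
        node = parent_of[peer]
        # Walk up towards the root, looking for the failed peer on the way.
        while node is not None:
            if node == failed:
                affected.add(peer)
                break
            node = parent_of.get(node)
    return affected


parent_of = {
    "root": None,
    "a": "root", "b": "root",
    "a1": "a", "a2": "a",
    "b1": "b",
}
print(sorted(cut_off_by_failure(parent_of, "a")))  # ['a1', 'a2']
```

A failure near the leaves affects nobody else, while a failure of the root cuts off every other peer, so the risk depends strongly on the position of the failed peer in the tree.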

4.2.3 Mesh Topology


Figure 4.6 Mesh

• Peers are organized in a non-structured fashion.

• Each peer can communicate directly with a few close peers and indirectly with all other peers.

• If any peer is down, it will not affect the whole network, as the remaining peers can communicate with each other using other paths.

• A peer is not aware of which other peers exist on the network.
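The claim that other paths remain available when a peer is down can be checked on a small example with a breadth-first search. The mesh below and the function name reachable are hypothetical; the sketch only shows which peers can still be reached from a starting peer when some peers have failed.

```python
from collections import deque


def reachable(links, start, failed=frozenset()):
    """Breadth-first search over the mesh, skipping failed peers."""
    if start in failed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for neighbour in links[peer]:
            if neighbour not in seen and neighbour not in failed:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


links = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
# With "b" down, "d" and "e" are still reachable from "a" through "c".
print(sorted(reachable(links, "a", failed={"b"})))
```

The resilience depends on redundant links: in this example peer "e" has a single link, so it would be cut off if "d" failed.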

4.2.4 Pure Decentralized Topology

Figure 4.7 Pure Decentralized

• Peers are organized in a structured fashion; this is the pure decentralized model.

• Each peer can communicate directly with, and transfer files to or from, every other peer.

• If any peer is down, it will not affect the whole network or other peers.

• Each peer is aware of which other peers exist on the network.


5 P2P Scalability

“Scalability is jargon for the ability to support large numbers of nodes. Arguments over

whether completely decentralized designs such as Gnutella could support large numbers of

nodes have been present since the first day of its release. Justin Frankel, Gnutella's inventor,

stated that 250-300 nodes was the likely maximum. The Gnutella protocol tends to generate

more network traffic than most protocols. One reason is that a search request must be sent

to every node eligible to fulfill it. This is considered by some to make such excessive

demands on available bandwidth that the network must eventually collapse, and Gnutella

become unusable, at a large size. Possible counterarguments are that the Gnutella network

has already reached a fairly large size without collapsing; that the limited lifespan of a

Gnutella message prevents out-of-control resource usage; and that the topography of

Gnutella is in fact isomorphic with the topography of Napster.” 13

This section discusses the scalability of decentralized, semi-centralized and client/server models in several areas: file storage and response time, index/catalogue, searching process, fault tolerance, cost and effectiveness, symmetric communication and pervasive computing.
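The traffic argument in the quotation above can be made concrete with a rough upper bound. Assume, purely for illustration, that every peer keeps the same number of connections, that a query is forwarded to every connection except the one it arrived on, and that the network contains no cycles; the values 4 and 7 used below are likewise illustrative assumptions. Under these assumptions the number of peers a single flooded query can reach within its lifespan (time to live, TTL) is:

```python
def max_peers_reached(connections_per_peer, ttl):
    """Upper bound on peers reached by one flooded query in a cycle-free network."""
    reached = 0
    frontier = connections_per_peer  # peers exactly one hop away
    for _ in range(ttl):
        reached += frontier
        # Each newly reached peer forwards to all its connections but the sender.
        frontier *= connections_per_peer - 1
    return reached


print(max_peers_reached(connections_per_peer=4, ttl=7))  # 4372
```

The bound grows geometrically with the lifespan, which shows both sides of the argument: one query can generate thousands of messages, yet the limited lifespan keeps that number finite no matter how large the network becomes.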

5.1 Files Storage – Response Time

In client/server models the data is stored only on the server, so every client must fetch it from that one location, however far away the client is in the network. Semi-centralized and decentralized models can decrease the response time because they can move resources closer to where clients access them, which reduces network latency.

5.2 Index/Catalogue

For semi-centralized and client/server models, the index and catalogue are stored in one central server or in several duplicated central servers. In decentralized models, the index and catalogue are stored across all peers. In the client/server model, the server(s) store a full copy. However, the peers in decentralized models and the server(s) in semi-centralized models can store the index and catalogue either as a full copy or as partitioned copies.

With a full-copy index implementation in decentralized and semi-centralized models, the copies will quickly fall out of synchronization. Besides, each peer needs to store many unnecessary index records, and the index grows very fast. It is therefore not recommended to store a full copy in each peer. A partitioned implementation works much like the Domain Name System (DNS): each peer holds a file/resource lookup file containing file/resource names or descriptions together with the physical location of each file/resource (that is, which peer holds it). When a peer requests a file/resource, it searches its own lookup file for the physical location. If it cannot find the location there, it propagates the query to nearby peers until the query is answered and the result is propagated back.
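The partitioned lookup described above can be sketched as a recursive query. The class IndexPeer, its attributes and the sample resource names are hypothetical; the visited set, which stops a query from circulating forever, is an addition made for this sketch and is not described in the text.

```python
class IndexPeer:
    """A peer holding one partition of the file/resource lookup table."""

    def __init__(self, name, lookup_table):
        self.name = name
        self.lookup_table = dict(lookup_table)  # resource name -> peer storing it
        self.neighbours = []

    def find(self, resource, visited=None):
        visited = set() if visited is None else visited
        visited.add(self.name)
        if resource in self.lookup_table:
            return self.lookup_table[resource]
        # Not known locally: propagate the query to nearby peers.
        for peer in self.neighbours:
            if peer.name not in visited:
                location = peer.find(resource, visited)
                if location is not None:
                    return location  # the answer travels back along the call chain
        return None


a = IndexPeer("a", {"report.pdf": "a"})
b = IndexPeer("b", {"song.mp3": "b"})
c = IndexPeer("c", {"photo.jpg": "c"})
a.neighbours = [b]
b.neighbours = [a, c]
c.neighbours = [b]

print(a.find("photo.jpg"))    # c, found via peer b
print(a.find("missing.txt"))  # None
```

Each peer stores only its own partition, so no peer has to hold or synchronize the full index; the price is that a lookup may take several hops.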

5.3 Searching Process

Although the servers in semi-centralized and client/server models can process requests in parallel, each server has a capacity threshold beyond which it becomes a bottleneck. Buying faster and better hardware raises that threshold, but it may not be a long-term solution for a company. Distributing the load across multiple machines is a popular way to reduce bottlenecks.
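One simple way to distribute load across multiple machines is round-robin dispatching, in which incoming requests are handed to the servers in turn. Round-robin is chosen here only as an example of load distribution; the class name and server names are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinDispatcher:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, server_names):
        self._next_server = cycle(server_names)

    def dispatch(self, request):
        # Return the chosen server together with the request it will handle.
        return next(self._next_server), request


dispatcher = RoundRobinDispatcher(["s1", "s2", "s3"])
assignments = [dispatcher.dispatch(f"query-{i}")[0] for i in range(9)]
print(Counter(assignments))  # each server receives three of the nine queries
```

Adding a server to the rotation lowers the share of requests each machine handles, which postpones the point at which any single server reaches its threshold.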

5.4 Fault Tolerance

Semi-centralized and client/server models often suffer from a single point of failure, because they depend on the server(s) being present in the network. Failure can be due to a server going down, an application error or a network problem. It is expensive and difficult to build redundancy into every component of the system. When the servers in these models go down, they bring the company’s operations down as well. Decentralized models avoid this problem, as they do not need any central point of control (server node) at all.


5.5 Cost And Effective

Semi-centralized and client/server models are more expensive, because they require server hardware and software in addition to client hardware and software. The cost of decentralized models is lower, as no extra hardware is needed. Besides, decentralized models make much fuller use of the peers' storage, bandwidth and CPU resources.

5.6 Symmetric Communication

Decentralized models provide two-way communication between peers. The two-way channel increases the value of the network, because a more symmetric flow of information is established. In centralized models, by contrast, communication is one-way or unidirectional: client nodes connect to the central server and push data to or pull data from it. This adds less value to the network.

5.7 Pervasive Computing

Semi-centralized and client/server models require one more hardware item: a server. A decentralized model does not need a server at all. When a group of PCs connect together, they can form a P2P system. Even non-PC devices such as PDAs or handsets can be peers.


6 Models Comparison Against Scalability

6.1 Comparison with different semi-centralized models

                              Single Centralized            Computational                 Multiple Server Nodes
                              Index Server
----------------------------  ----------------------------  ----------------------------  ----------------------------
Scalability
File Storage – response time  Store in all nodes            Store in all nodes            Store in all nodes
                              (server and clients)          (server and clients)          (server and clients)
Index/Catalogue               Full set of index             Full set of index             Can be either full set or
                                                                                          partitioned copies of index
Searching process             Processed at the central      Processed at the central      Can be either parallel
                              server                        server                        processes or load balancing
Fault tolerance               Single point of failure       Single point of failure       Multiple servers can provide
                                                                                          redundancy
Cost and effective            Expensive                     Expensive                     The most expensive
Symmetric Communication       Symmetric                     Asymmetric                    Symmetric
Pervasive Computing           No                            No                            No
Security                      Central server can act as     Central server can act as     Central servers can act as
                              authentication server         authentication server         authentication servers
Maintenance                   Medium                        Easy                          Hard
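The difference between a full set of index and partitioned copies of index, as listed for the multiple server nodes model, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the sample catalogue and the rule of choosing a server by a CRC32 checksum of the file name are hypothetical and are not prescribed by any of the models compared here.

```python
import zlib


def shard_of(filename, n_servers):
    # Assumed partitioning rule: a stable checksum of the file name
    # decides which index server is responsible for the entry.
    return zlib.crc32(filename.encode("utf-8")) % n_servers


def build_partitioned_index(catalogue, n_servers):
    # Each server keeps only its own part of the index.
    shards = [{} for _ in range(n_servers)]
    for filename, holders in catalogue.items():
        shards[shard_of(filename, n_servers)][filename] = holders
    return shards


def lookup_partitioned(shards, filename):
    # Only the responsible server has to be asked.
    return shards[shard_of(filename, len(shards))].get(filename)


# Hypothetical catalogue: file name -> nodes that store the file.
catalogue = {
    "a.mp3": ["peer1"],
    "b.mp3": ["peer2", "peer3"],
    "c.txt": ["peer1"],
}

full_index = dict(catalogue)                     # single server, full set
shards = build_partitioned_index(catalogue, 3)   # three servers, partitioned

print(full_index["b.mp3"])                  # ['peer2', 'peer3']
print(lookup_partitioned(shards, "b.mp3"))  # ['peer2', 'peer3']
```

With a full index every query goes to the same server, whereas with a partitioned index the queries are spread over the servers, which is one way to obtain the load balancing mentioned in the table.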


6.2 Comparison with different decentralized models

                              Ring                    Hierarchical            Mesh                    Pure Decentralized
----------------------------  ----------------------  ----------------------  ----------------------  ----------------------
Scalability
File Storage – response time  Storing in all peers    Storing in all peers    Storing in all peers    Storing in all peers
Index/Catalogue               Partitioned copies      Partitioned copies      Partitioned copies      Partitioned copies
                              of index                of index                of index                of index
Searching process             Easy                    Medium                  Hard                    Hard
Fault tolerance               High risk, affects the  Medium risk, affects    Low risk, no effect on  Low risk, no effect on
                              whole network           part of the nodes       the network             the network
Cost and effective            Low                     Low                     Low                     Low
Symmetric Communication       Yes                     Yes                     Yes                     Yes
Pervasive Computing           Yes                     Yes                     Yes                     Yes
Security                      Hard to implement       Hard to implement       Hard to implement       Hard to implement
Maintenance                   Easy                    Medium                  Hard                    Hard
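The fault tolerance entries for the ring and mesh models can be illustrated with a small Python sketch. It rests on two assumptions that are not stated in the comparison itself: the ring forwards messages in one direction only, and the mesh is fully connected. The function names are hypothetical. The sketch removes one node and checks whether every surviving node can still reach every other surviving node.

```python
from collections import deque


def ring(n):
    # Assumed unidirectional ring: node i forwards only to node i + 1.
    return {i: [(i + 1) % n] for i in range(n)}


def full_mesh(n):
    # Assumed fully connected mesh: every node links to every other node.
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def reachable(links, start, failed):
    # Breadth-first search that never passes through the failed node.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links[node]:
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def survives_failure(links, failed):
    # True if all surviving nodes can still reach each other.
    survivors = set(links) - {failed}
    return all(reachable(links, s, failed) == survivors for s in survivors)


print(survives_failure(ring(6), failed=2))       # False
print(survives_failure(full_mesh(6), failed=2))  # True
```

Under these assumptions a single failed node breaks the ring for the nodes behind it, while the mesh keeps working because alternative paths exist.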


6.3 Semi-centralized Vs Decentralized

                              Semi-centralized                        Decentralized
----------------------------  --------------------------------------  --------------------------------------
Scalability
File Storage – response time  Storing in all nodes                    Storing in all peers
Index/Catalogue               Single full copy stored in server       Partitioned copies in all peers
Searching process             Easy to implement searching process     Hard to implement searching process
Fault tolerance               Single point of failure                 No single point of failure
Cost and effective            Expensive, ineffective resource usage   Not expensive in hardware, more
                                                                      effective usage of resources (storage,
                                                                      bandwidth and CPU)
Symmetric Communication       No                                      Yes
Pervasive Computing           No                                      Yes
Security                      Central server(s) can act as            Hard to secure
                              authentication server(s)
Maintenance                   Easy                                    Hard
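Why the searching process is easy in one case and hard in the other can be sketched in Python. The semi-centralized search is a single lookup in the server's full index, while the decentralized search below uses query flooding with a hop limit (TTL). Flooding is chosen here only as one well-known option, and the peer names, the neighbour links and the function names are hypothetical.

```python
def central_search(index, filename):
    # Semi-centralized: one lookup in the full index kept by the server.
    return index.get(filename, [])


def flood_search(neighbours, shared, start, filename, ttl):
    # Decentralized: the query is passed from neighbour to neighbour
    # until the hop limit (ttl) is used up.
    hits = [start] if filename in shared.get(start, set()) else []
    visited = {start}
    frontier = [start]
    messages = 0
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            for neighbour in neighbours[peer]:
                messages += 1  # every forwarded query costs one message
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
                    if filename in shared.get(neighbour, set()):
                        hits.append(neighbour)
        frontier = next_frontier
    return hits, messages


# Hypothetical overlay network and shared files.
neighbours = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
shared = {"C": {"song.mp3"}, "E": {"song.mp3"}}
index = {"song.mp3": ["C", "E"]}

print(central_search(index, "song.mp3"))                          # ['C', 'E']
print(flood_search(neighbours, shared, "A", "song.mp3", ttl=2))   # (['C'], 6)
print(flood_search(neighbours, shared, "A", "song.mp3", ttl=3))   # (['C', 'E'], 9)
```

In this small example the central index answers with one lookup, whereas the flooding search needs nine messages to find both copies and misses peer E when the hop limit is too small.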


7 Conclusion

P2P systems are continuously evolving and are becoming a new trend in computing technology.
Although Napster had to shut down for legal reasons during 2002, it has proved the
importance of P2P.

This project has given me an understanding of P2P technology and its models in general. It
identifies the differences between those models and compares them. Finally, it briefly
discusses how the different models differ in scalability with respect to file storage –
response time, index/catalogue, searching process, fault tolerance, cost and effective,
symmetric communication and pervasive computing.

