2009-2010 1
Introduction to DISTRIBUTED SYSTEMS
Tran, Van Hoai
Department of Systems & Networking
Faculty of Computer Science & Engineering
HCMC University of Technology
Outline
• Why are distributed systems needed?
• Examples
• Definitions
• Goals of building distributed systems
Why are distributed systems needed? (1)
• Functional distribution: computers have different functional capabilities
  – Client/server
  – Host/terminal
  – Data gathering/data processing
  → sharing of resources with specific functionalities
• Inherent distribution: stemming from the application domain, e.g.,
  – cash register and inventory systems for supermarket chains
  – computer-supported collaborative work
Why are distributed systems needed? (2)
• Load distribution/balancing: assign tasks to computers such that overall performance is optimized
• Replication of processing power: independent computers working on the same task
  – a collection of microcomputers may have processing power that no supercomputer will ever achieve
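To make load distribution concrete, here is a minimal sketch of one common heuristic: give each task to the machine that currently has the least work. The task costs and the machine count are made-up values, and the function names are illustrative only; real schedulers also account for communication cost and machine speed.

```python
import heapq

def assign_tasks(task_costs, num_machines):
    """Greedy load balancing: each task goes to the currently least-loaded machine."""
    # Heap of (current_load, machine_index); the smallest load is popped first.
    heap = [(0, m) for m in range(num_machines)]
    heapq.heapify(heap)
    assignment = {m: [] for m in range(num_machines)}
    # Placing the largest tasks first usually gives a more even result.
    for cost in sorted(task_costs, reverse=True):
        load, machine = heapq.heappop(heap)
        assignment[machine].append(cost)
        heapq.heappush(heap, (load + cost, machine))
    return assignment

def makespan(assignment):
    """Finishing time of the busiest machine."""
    return max(sum(costs) for costs in assignment.values())

tasks = [7, 5, 4, 3, 3, 2]      # hypothetical task costs (time units)
plan = assign_tasks(tasks, 3)   # hypothetical pool of 3 machines
print(plan, "makespan:", makespan(plan), "vs. one machine:", sum(tasks))
```

With these sample values the busiest machine finishes after 9 time units, whereas a single machine would need 24.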
Why are distributed systems needed? (3)
• Physical separation: relying on the fact that computers are physically separated (e.g., to satisfy reliability requirements)
• Economics: collections of microprocessors offer a better price/performance ratio than large mainframes
  – mainframes: 10 times faster, 1000 times as expensive
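The economics argument can be checked with a few lines of arithmetic. The sketch below uses only the two factors from the slide (10 times faster, 1000 times as expensive) and normalizes a microprocessor to speed 1 and cost 1. The assumption that ten microprocessors together match one mainframe is idealized, since it ignores coordination overhead.

```python
# Normalized units: one microprocessor has speed 1 and cost 1 (assumption for illustration).
micro_speed, micro_cost = 1, 1
mainframe_speed, mainframe_cost = 10 * micro_speed, 1000 * micro_cost

def cost_per_unit_speed(cost, speed):
    """Price/performance expressed as cost paid per unit of speed (lower is better)."""
    return cost / speed

mainframe_ratio = cost_per_unit_speed(mainframe_cost, mainframe_speed)
micro_ratio = cost_per_unit_speed(micro_cost, micro_speed)
advantage = mainframe_ratio / micro_ratio

# Idealized: 10 microprocessors with perfect speedup match the mainframe's speed.
cluster_cost = 10 * micro_cost
print(f"mainframe pays {advantage:.0f}x more per unit of speed")
print(f"matching cluster costs {cluster_cost} vs. mainframe {mainframe_cost}")
```

Under these assumptions the mainframe costs 100 times more per unit of speed.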
Examples (1)
• Network of workstations
  – all files accessible from all machines in the same way and using the same path name
  – system looks for the best place to execute a command
  → distributed system
• Workflow information system: automatic order processing
  – people from several departments at different locations
  – users are unaware of how an order is processed
  → distributed system
Examples (2)
• World Wide Web: offering a uniform model of distributed documents
  – in theory, no need to know where a document is fetched from
  – in practice, users must be aware of its location
Examples (3)
• Internet
[Figure: a portion of the Internet showing intranets, ISPs, backbones, and a satellite link; the legend marks desktop computers, servers, and network links]
• interconnected collection of computer networks of many different types
• computers interact by passing messages using a common means of communication
Examples (4)
• Intranet
[Figure: an intranet with desktop computers, email servers, a Web server, a file server, and print and other servers on local area networks, connected through a router/firewall to the rest of the Internet]
• resources are shared among different computers
Definitions (1)
“A system in which hardware or software located at networked computers communicate and coordinate their actions only by message passing.” [Coulouris]
“A system that consists of a collection of two or more independent computers which coordinate their processing through exchange of synchronous or asynchronous message passing.”
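Both definitions hinge on coordination through message passing alone. As a rough illustration, the sketch below runs a "server" node in a separate thread that shares no variables with the client; the two sides interact only through messages. In-process queues stand in for a network here, and the message format (`type`, `a`, `b`, `value`) is invented for this example.

```python
import queue
import threading

def server(inbox, outbox):
    """A node that only reacts to incoming messages; it shares no state with the client."""
    while True:
        msg = inbox.get()                 # block until a message arrives
        if msg["type"] == "stop":
            break
        if msg["type"] == "add":
            outbox.put({"type": "result", "value": msg["a"] + msg["b"]})

# Two one-way channels play the role of the network links (assumption for illustration).
to_server = queue.Queue()
to_client = queue.Queue()
node = threading.Thread(target=server, args=(to_server, to_client))
node.start()

# The client sends a request and waits for the reply (synchronous style).
to_server.put({"type": "add", "a": 2, "b": 3})
reply = to_client.get(timeout=5)

to_server.put({"type": "stop"})
node.join(timeout=5)
print(reply)
```

Waiting on `to_client.get` right after sending makes the exchange synchronous; sending several requests before collecting any reply would be the asynchronous variant.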
Definitions (2)
“A distributed system is a collection of independent computers that appear to the users of the system as a single computer.” [Tanenbaum]
“A distributed system is a collection of autonomous computers linked by a network with software designed to produce an integrated computing facility.”
Computer networks vs. Distributed systems
• Computer network: autonomous computers are explicitly visible (have to be explicitly addressed)
• Distributed system: existence of multiple computers is transparent
• However,
  – many problems in common
  – in some sense networks (or parts of them, e.g. name services) are also distributed systems
  – normally, every distributed system relies on services provided by a computer network
Which examples are distributed systems?
• Network of workstations
  → distributed system
• Workflow information system: automatic order processing
  → distributed system
• World Wide Web
  → not fully qualified as a distributed system (Tanenbaum)
  → distributed system (Coulouris)
Middleware service
[Figure: distributed applications run on top of a middleware service layer that spans Machine A, Machine B, and Machine C, each running its own local OS]
• To guarantee
  – support for heterogeneous computers
  – a single view for users
Goals of building a distributed system (1)
• Connecting users and resources
  – sharing resources
  – easier to collaborate and exchange information
  → disadvantages: security (intrusion), privacy violation (communication tracking)
Goals of building a distributed system (2)
• Transparency
  Transparency   Description
  Access         Hide differences in data representation and how a resource is accessed
  Location       Hide where a resource is located
  Migration      Hide that a resource may move to another location
  Relocation     Hide that a resource may be moved to another location while in use
  Replication    Hide that a resource may have many copies
  Concurrency    Hide that a resource may be shared by several competing users
  Failure        Hide the failure and recovery of a resource
  Persistence    Hide whether a (software) resource is in memory or on disk
  → tradeoff between a high degree of transparency and the performance of the system
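As a small illustration of location and migration transparency, the sketch below lets a client read a resource by logical name only, while a name service tracks which host holds it. The `NameService` class, the host names, and the path are all hypothetical; hosts are modelled as plain dictionaries rather than real machines.

```python
class NameService:
    """Maps a logical resource name to the host that currently stores it."""
    def __init__(self):
        self._location = {}

    def register(self, name, host):
        self._location[name] = host

    def resolve(self, name):
        return self._location[name]

# Hypothetical hosts, modelled as dictionaries of name -> content.
hosts = {"hostA": {"/docs/report": "Q1 figures"}, "hostB": {}}
names = NameService()
names.register("/docs/report", "hostA")

def read(name):
    """Client-side access: the caller supplies only the logical name."""
    host = names.resolve(name)
    return hosts[host][name]

def migrate(name, new_host):
    """Move a resource to another host and update the name service."""
    old_host = names.resolve(name)
    hosts[new_host][name] = hosts[old_host].pop(name)
    names.register(name, new_host)

before = read("/docs/report")
migrate("/docs/report", "hostB")
after = read("/docs/report")     # same call, same result, different host
print(before, after, names.resolve("/docs/report"))
```

The extra lookup on every `read` is a small instance of the tradeoff noted on the slide: the indirection that hides the location also costs time.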
Goals of building a distributed system (3)
• Openness
  – Offering services according to standard rules that describe the syntax and semantics of those services
    • syntax specification: in an interface definition language
    • semantic specification: in natural language
  – Interoperability and portability
  – Flexibility: using different components from different developers
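The idea of an agreed interface with interchangeable implementations can be sketched in Python, using `typing.Protocol` as a stand-in for an interface definition language. The `KeyValueStore` interface and both implementations are invented for this example; the semantics (what `get` returns after `put`) are stated only in the docstrings, mirroring the natural-language semantic specification.

```python
from typing import Protocol

class KeyValueStore(Protocol):
    """Hypothetical service interface: fixes the syntax (names, parameters, types).

    Semantics, in natural language: get(key) returns the value most recently
    stored with put(key, value).
    """
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> str: ...

class DictStore:
    """Implementation from one (imaginary) developer, backed by a dictionary."""
    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]

class ListStore:
    """Independent implementation that keeps (key, value) pairs in a list."""
    def __init__(self) -> None:
        self._pairs = []

    def put(self, key: str, value: str) -> None:
        # Drop any older pair for this key, then append the new one.
        self._pairs = [(k, v) for k, v in self._pairs if k != key]
        self._pairs.append((key, value))

    def get(self, key: str) -> str:
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

def store_greeting(store: KeyValueStore) -> str:
    """Client code written only against the interface."""
    store.put("greeting", "hello")
    return store.get("greeting")

results = [store_greeting(DictStore()), store_greeting(ListStore())]
print(results)
```

Because the client depends only on the interface, either component can be swapped in without changing `store_greeting`, which is the flexibility the slide refers to.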
Goals of building a distributed system (4)
• Scalability
  – Measured in three dimensions
    • size: more users and resources can be added easily
    • geography: users and resources may lie far apart
    • administration: still easy to manage even when spanning many independent administrative organizations
  – Some problems must be solved
    • size: centralization
      – centralized service: a single server for all users
      – centralized data: a single online telephone book
      – centralized algorithm: routing based on complete information
Goals of building a distributed system (5)
• geography: synchronous & unreliable communication
  – some systems are designed only for LANs (blocking communication depends strongly on quick responses)
• administration: conflicting policies w.r.t. resource usage, management, security
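A back-of-the-envelope model shows why blocking communication designed for a LAN scales poorly with distance. All numbers below are assumed for illustration (a 0.5 ms LAN round trip, a 100 ms wide-area round trip, 100 requests), and the "overlapped" formula is a simplified model in which requests are sent back to back without waiting for replies.

```python
def blocking_total_us(num_requests, round_trip_us):
    """Each request waits for its reply before the next one is sent."""
    return num_requests * round_trip_us

def overlapped_total_us(num_requests, round_trip_us, send_gap_us):
    """Simplified model: requests are sent back to back; replies overlap in flight."""
    return round_trip_us + (num_requests - 1) * send_gap_us

# Hypothetical figures, in microseconds.
LAN_RTT_US = 500
WAN_RTT_US = 100_000
SEND_GAP_US = 100
N = 100

lan_blocking = blocking_total_us(N, LAN_RTT_US)
wan_blocking = blocking_total_us(N, WAN_RTT_US)
wan_overlapped = overlapped_total_us(N, WAN_RTT_US, SEND_GAP_US)

print(f"LAN, blocking:   {lan_blocking / 1000:.1f} ms")
print(f"WAN, blocking:   {wan_blocking / 1000:.1f} ms")
print(f"WAN, overlapped: {wan_overlapped / 1000:.1f} ms")
```

Under these assumptions the same blocking exchange takes 50 ms on the LAN but 10 seconds over the wide-area link, while overlapping the requests brings the wide-area case down to about 110 ms.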
Some numbers (1)
• Computers in the Internet
  Date         Computers      Web servers
  1979, Dec.   188            0
  1989, July   130,000        0
  1999, July   56,218,000     5,560,866
  2003, Jan.   171,638,297    35,424,956
Some numbers (2)
• Computers vs. Web servers in the Internet
  Date         Computers      Web servers    Percentage
  1993, July   1,776,000      130            0.008
  1995, July   6,642,000      23,500         0.4
  1997, July   19,540,000     1,203,096      6
  1999, July   56,218,000     6,598,697      12
  2001, July   125,888,197    31,299,592     25
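The Percentage column is the number of Web servers divided by the number of computers, times 100. The short sketch below recomputes it from the counts in the table; only the variable names are new.

```python
# (date, computers, web servers), copied from the table above.
rows = [
    ("1993, July", 1_776_000, 130),
    ("1995, July", 6_642_000, 23_500),
    ("1997, July", 19_540_000, 1_203_096),
    ("1999, July", 56_218_000, 6_598_697),
    ("2001, July", 125_888_197, 31_299_592),
]

# Share of Web servers among all computers, in percent.
percentages = [100 * web / computers for _, computers, web in rows]

for (date, _, _), pct in zip(rows, percentages):
    print(f"{date}: {pct:.4f} %")
```

The recomputed values round to the table's 0.4, 6, 12, and 25; the first row works out to roughly 0.007, slightly below the 0.008 shown.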
Text books & materials
• Andrew S. Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms, Prentice Hall, Second Edition, 2007
• George Coulouris, Jean Dollimore, Tim Kindberg, Distributed Systems: Concepts and Design, Addison Wesley, Fourth Edition, 2005
How to reach me
• [email protected] [email protected]
• http://www.cse.hcmut.edu.vn/~hoai