part 1 introduction to distributed computing

Upload: tapir999

Post on 30-May-2018


  • 8/14/2019 Part 1 Introduction to Distributed Computing

    1/42

    EMM4086: DMS 1

    Chapter 1: Part 1

    Introduction to Distributed Computing


    Contents

    Definition

    Examples of DS

    Advantages and Disadvantages

    Key characteristics

    Challenges in designing DS


    Definition

    Coulouris - A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing


    Tanenbaum - A distributed system is a collection of independent computers that appear to the users of the system as a single computer


    A distributed system is a collection of autonomous computers linked by a network, with software designed to produce an integrated computing facility.

    Sharing of resources is a main motivation for constructing DS.

    Resources may be managed by servers and accessed by clients, or they may be encapsulated as objects and accessed by other client objects.

    The term resource extends from hardware such as disks and printers to software-defined entities such as files, databases and data objects of all kinds.

    It also includes video streams and audio streams.


    Reasons for DS

    A DS consisting of a collection of microcomputers may have processing power that no single supercomputer will ever achieve.

    10,000 CPUs, each running at 50 MIPS*, yield 500,000 MIPS. A single processor delivering this would have to execute one instruction every 0.002 nsec, the time in which light travels 0.6 mm. Any processor chip of that size would melt immediately.

    A collection of microprocessors offers a better price/performance ratio than mainframes: a mainframe is 10 times faster but 1000 times as expensive.

    (*) MIPS: million instructions per second
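The arithmetic behind the 0.002 nsec and 0.6 mm figures can be checked with a short sketch. The CPU count and MIPS rating come from the slide; the speed of light is rounded to 3 x 10^8 m/s, which is an assumption of this sketch.

```python
# Figures from the slide; the speed of light is a rounded assumption.
CPUS = 10_000
MIPS_PER_CPU = 50
SPEED_OF_LIGHT_M_PER_S = 3.0e8

total_mips = CPUS * MIPS_PER_CPU                      # aggregate rating
instructions_per_second = total_mips * 1_000_000
seconds_per_instruction = 1 / instructions_per_second

# Time budget per instruction for one processor matching the aggregate rate.
nsec_per_instruction = seconds_per_instruction * 1e9

# Distance light covers within that time budget, in millimetres.
light_distance_mm = SPEED_OF_LIGHT_M_PER_S * seconds_per_instruction * 1000

print(total_mips, nsec_per_instruction, light_distance_mm)
```

A signal cannot cross a chip wider than that distance within one instruction time, which is why the slide concludes that a single processor of this speed is not practical.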


    Why DS and not isolated hardware?

    Enhance person-to-person communication

    Need to share data and resources amongst users

    Flexibility: different computers with different capabilities can be shared among users


    Problems with distributed, connected systems

    Software - how to design and manage it in a DS

    Dependency on the underlying infrastructure

    Easy access to shared data raises security concerns


    Example of DS - Internet

    Figure 1.1: A typical portion of the Internet - Coulouris (diagram labels: intranet, ISP, desktop computer, backbone, satellite link, server, network link)


    Example of DS - Internet

    Vast collection of interconnected computer networks of many different types.

    Programs on various computers interact by passing messages as a means of communication.

    The design and construction of the Internet communication mechanisms (the Internet protocols) is a major technical achievement, enabling a program running anywhere to address messages to programs anywhere else.
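A minimal sketch of two programs interacting only by passing messages. As an assumption made to keep the example self-contained, both endpoints live in one process and are joined by `socket.socketpair()`; in a real DS they would run on different computers and be reached through a network address.

```python
import socket

# socketpair() returns two already-connected endpoints. Here they stand in
# for program A and program B on two networked computers.
program_a, program_b = socket.socketpair()

program_a.sendall(b"hello from program A")   # A sends a message
received = program_b.recv(1024)              # B receives it

program_b.sendall(b"ack: " + received)       # B answers with another message
reply = program_a.recv(1024)

program_a.close()
program_b.close()
print(received.decode(), "|", reply.decode())
```

Neither endpoint reads the other's memory: everything each side learns about the other arrives as a message.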


    Example of DS - Internet

    As shown in Figure 1.1, the Internet is a collection of intranets: subnetworks operated by organizations. ISPs are companies that provide modem links and other connections to individual users and small organizations, enabling them to access Internet services.

    Intranets are linked together by backbones. A backbone is a network link with a high transmission capacity, employing satellite connections, fiber optic cables and other high-bandwidth circuits.

    Multimedia services are available in the Internet, but they are quite limited because the Internet does not have facilities to reserve network capacity for individual streams of data.


    Example of DS - Intranet

    Figure 1.2: A typical intranet - Coulouris (diagram labels: desktop computers, local area network, email server, Web server, file server, print and other servers, router/firewall, the rest of the Internet)


    Example of DS - Intranet

    An intranet is a portion of the Internet that is separately administered and has a boundary that can be configured to enforce local security policies.

    It is composed of several LANs linked by backbone connections.

    An intranet is connected to the Internet via routers. Many organizations need to protect their own services.

    A firewall protects an intranet by preventing unauthorized messages from leaving or entering.


    Example of DS - DMS

    Often use Internet infrastructure

    Characteristics: heterogeneous data sources and sinks that need to be synchronized in real time (video, audio, text)

    Often distribution services - multicast

    Examples:

    1. Tele-teaching

    2. Video-conferencing

    3. Video and audio on demand


    Example of DS - Mobile and Ubiquitous systems

    Figure 1.3: Portable and handheld devices in a distributed system - Coulouris (diagram labels: Internet, host intranet, home intranet, WAP gateway, wireless LAN, host site, laptop, mobile phone, printer, camera)


    Example of DS - Mobile and Ubiquitous systems

    Integration of small and portable devices such as:

    Laptop computers

    Handheld devices, including PDAs, mobile phones, pagers, cameras

    Wearable devices, such as smart watches with functionality similar to a PDA

    Devices embedded in appliances such as washing machines, cars, and refrigerators are integrated into DS


    Example of DS - Mobile and Ubiquitous systems

    In mobile computing, users who are away from their home intranet are still provided with access to resources via the devices they carry with them. They can continue to access the Internet.

    Ubiquitous computing:

    It would be convenient for users to control their washing machine etc. from a universal remote control device in the home.

    Equally, the washing machine could page the user via a smart badge when the washing is done.


    Example of DS - Mobile and Ubiquitous systems

    As shown in Figure 1.3, the user has access to three forms of wireless connection:

    1. Their laptop has a means of connecting to the host's wireless LAN. This provides coverage of a few hundred meters.

    2. The wireless LAN connects to the rest of the host intranet through a gateway.

    3. Mobile users are connected to the Internet through WAP via a gateway.

    The user carries a digital camera, which can communicate over an infrared link when pointed at a corresponding device such as a printer.


    Other examples of DS

    MMU campus IT network, where each faculty has a pool of computers linked together in a LAN, sharing common resources such as an email server, web server and FTP server.


    Advantages of DS

    1. Extensibility: can be extended as the demand for service grows without replacing any of the existing components

    2. Resource sharing: Extensive hardware can be shared among users. Data files can be managed and maintained centrally, but remain accessible to all users

    3. Replication: High reliability achieved by maintaining several copies of data in different computers

    4. Continued availability: the failure of a single workstation does not affect the service to all users


    Disadvantages of DS

    1. Loss of flexibility in allocation of memory and processing resources: In a centralized system, all processor and memory resources are available for allocation by the OS in any manner required by the current workload. In a DS, the processor and memory capacity of the workstation determine the largest task that can be performed.

    2. Dependence on network performance and reliability: Failure of the local network can cause the service to users to be interrupted. Overloading of the network can degrade the performance and responsiveness experienced by users.

    3. Security weakness: To achieve openness, many of the software interfaces are made readily available to clients. This leads to security problems.


    Challenges

    1. HETEROGENEITY: The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks.

    Heterogeneity applies to the following:

    computer networks

    computer hardware

    OS

    Programming languages

    Implementation by different developers

    Middleware: a software layer that provides a programming abstraction and masks the heterogeneity of the underlying networks, hardware, OS and programming languages (PLs).

    Example: CORBA, Java RMI


    2. OPENNESS: The openness of a DS is determined by the degree to which new resource-sharing services can be added and made available for use by a variety of client programs.

    Openness cannot be achieved unless the specification and documentation of the key software interfaces are published.

    An open DS can be constructed from heterogeneous hardware and software, possibly from different vendors.


    3. SECURITY: There are 3 components:

    i. Confidentiality (protection against disclosure to unauthorized individuals)

    ii. Integrity (protection against alteration or corruption)

    iii. Availability (protection against interference with the means to access the resources).

    Two security challenges that are yet to be met:

    1. Denial of service attacks: a user may wish to disrupt a service for some reason. This is achieved by bombarding the service with such a large number of pointless requests that the serious users are unable to use it. This is called a denial of service attack.

    2. Security of mobile code: consider someone who receives an executable (EXE) file as an email attachment. When run, it may access local resources or take part in a denial of service attack.


    4. SCALABILITY:

    A DS is scalable if it remains effective when there is a significant increase in the number of resources and the number of users.

    Date          Computers      Web servers
    1979, Dec.    188            0
    1989, July    130,000        0
    1999, July    56,218,000     5,560,866
    2003, Jan.    171,638,297    35,424,956

    Figure 1.5: Computers in the Internet - Coulouris


    The design of a scalable DS presents the challenges given below:

    1. Controlling the cost of physical resources: as the demand for a resource grows, it should be possible to extend the system at reasonable cost. In general, for a system with n users to be scalable, the quantity of physical resources required to support them should be at most O(n), that is, proportional to n. Example: if 20 users need 1 server, then 40 users need 2 servers, and so on.

    2. Controlling the performance loss: in managing the data, the time taken to access data is O(log n), where n is the size of the data. For a system to be scalable, the maximum performance loss should be no worse than this.
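The two growth rates above can be illustrated with a small sketch. The ratio of 20 users per server comes from the slide; using binary search over a sorted list as the O(log n) access method is an assumption of this sketch, chosen only because it makes the step count easy to observe.

```python
import math

USERS_PER_SERVER = 20  # ratio taken from the slide's example


def servers_needed(users):
    """Physical resources grow in proportion to the number of users: O(n)."""
    return math.ceil(users / USERS_PER_SERVER)


def lookup_steps(sorted_keys, target):
    """Count the comparisons made by a binary search: O(log n)."""
    low, high, steps = 0, len(sorted_keys) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_keys[mid] == target:
            return steps
        if sorted_keys[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


for n in (1_000, 1_000_000):
    keys = list(range(n))
    print(n, servers_needed(n), lookup_steps(keys, n - 1))
```

Growing the data a thousandfold multiplies the server count by a thousand but only roughly doubles the number of lookup steps.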


    3. Preventing software resources running out: an example of lack of scalability is shown by the numbers used as Internet addresses. In the late 1970s, it was decided to use 32 bits for this purpose. The supply of addresses was expected to run out in the early 2000s. The new version of the protocol uses 128-bit Internet addresses. The demand is difficult to predict, and over-large Internet addresses occupy extra space in messages and in computer storage.

    4. Avoiding bottlenecks: algorithms should be decentralized to avoid performance bottlenecks. Example: the Domain Name System name table could be kept in a single master file and downloaded to any computer that needs it. But this is efficient only when the number of computers is small. With more computers, it has to be decentralized.
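The trade-off in item 3 between address supply and address size can be made concrete. This sketch uses Python's standard `ipaddress` module and assumes the 32-bit and 128-bit addresses on the slide are IPv4 and IPv6 addresses.

```python
import ipaddress

# Number of distinct addresses available at each width.
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

# Space each address occupies in a message or in storage, in bytes.
largest_v4 = ipaddress.IPv4Address(ipv4_addresses - 1)
largest_v6 = ipaddress.IPv6Address(ipv6_addresses - 1)
v4_bytes = len(largest_v4.packed)
v6_bytes = len(largest_v6.packed)

print(ipv4_addresses, largest_v4, v4_bytes)
print(ipv6_addresses, largest_v6, v6_bytes)
```

The wider address removes the supply limit at the cost of four times the space in every message that carries it.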


    Date          Computers      Web servers    Percentage
    1993, July    1,776,000      130            0.008
    1995, July    6,642,000      23,500         0.4
    1997, July    19,540,000     1,203,096      6
    1999, July    56,218,000     6,598,697      12
    2001, July    125,888,197    31,299,592     25
                                 42,298,371

    Figure 1.6: Computers vs. Web servers in the Internet - Coulouris
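The Percentage column appears to be the number of Web servers as a share of all computers. A short sketch recomputes it for four of the rows, with the figures copied from the table; the function name is hypothetical.

```python
# (date, computers, web servers) copied from four rows of Figure 1.6.
rows = [
    ("1995, July", 6_642_000, 23_500),
    ("1997, July", 19_540_000, 1_203_096),
    ("1999, July", 56_218_000, 6_598_697),
    ("2001, July", 125_888_197, 31_299_592),
]


def web_server_percentage(computers, web_servers):
    """Share of Internet computers that are Web servers, in percent."""
    return 100 * web_servers / computers


percentages = {date: web_server_percentage(c, w) for date, c, w in rows}
for date, pct in percentages.items():
    print(f"{date}: {pct:.1f}%")
```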


    5. FAILURE HANDLING: Failures in a DS are partial, that is, some components fail while others continue to function. Therefore the handling of failures is difficult.

    Detecting failures: some failures can be detected, e.g. checksums can be used to find errors in a message. But other failures, such as a remote crashed server in the Internet, are difficult to detect. The challenge for a DS is to manage in the presence of failures that cannot be detected but may be suspected.

    Masking failures: some failures that have been detected can be hidden or made less severe. Examples:

    1. Messages can be retransmitted

    2. File data can be written to a pair of disks so that if one is corrupted, the other may still be correct
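Detection by checksum and masking by retransmission can be combined in one sketch. CRC-32 from the standard `zlib` module is used as the checksum; `unreliable_channel` is a hypothetical stand-in for a network that corrupts the first transmission only.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_intact(packet: bytes) -> bool:
    """Detect corruption by recomputing the checksum on arrival."""
    payload, checksum = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == checksum


def unreliable_channel(packet: bytes, attempt: int) -> bytes:
    """Hypothetical channel that flips one bit on the first attempt only."""
    if attempt == 0:
        return bytes([packet[0] ^ 0x01]) + packet[1:]
    return packet


def send_with_retransmission(payload: bytes, max_attempts: int = 3):
    """Mask the failure: resend until an intact copy arrives."""
    packet = make_packet(payload)
    for attempt in range(max_attempts):
        received = unreliable_channel(packet, attempt)
        if is_intact(received):
            return received[:-4], attempt + 1
    raise ConnectionError("message could not be delivered intact")


data, attempts = send_with_retransmission(b"bid: 56")
print(data, attempts)
```

The caller sees only a correct message; the corrupted first copy is detected and hidden by the retry.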


    Tolerating failures: for example, when a web browser cannot contact a web server, it does not make the user wait forever while it keeps on trying; it informs the user about the problem.

    Recovery from failures: recovery involves designing software so that the state of permanent data can be recovered or rolled back after a server has crashed.

    Redundancy: services can be made to tolerate failures by the use of redundant components. Examples:

    1. There should always be at least two different routes between any two routers in the Internet

    2. In the DNS, every name table is replicated in at least two different servers

    3. A database may be replicated in several servers to ensure that the data remains accessible after the failure of any single server
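A minimal sketch of how replication keeps data accessible after one server fails. The `Replica` class is a hypothetical in-memory stand-in for a server, and the name and address in the table are example values only.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve requests."""


class Replica:
    """Hypothetical in-memory stand-in for one replicated server."""

    def __init__(self, name, data, alive=True):
        self.name = name
        self.data = dict(data)
        self.alive = alive

    def read(self, key):
        if not self.alive:
            raise ReplicaDown(self.name)
        return self.data[key]


def read_from_any(replicas, key):
    """Try each replica in turn; fail only if every copy is unavailable."""
    for replica in replicas:
        try:
            return replica.read(key), replica.name
        except ReplicaDown:
            continue  # tolerate this failure and try the next copy
    raise ReplicaDown("no replica available")


table = {"www.example.org": "192.0.2.10"}
replicas = [Replica("primary", table, alive=False), Replica("backup", table)]
value, served_by = read_from_any(replicas, "www.example.org")
print(value, served_by)
```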


    Concurrency: since several clients may attempt to access a shared resource at the same time, any object that represents a shared resource in a DS must be responsible for ensuring that it operates correctly in a concurrent environment.

    For an object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data remains consistent.

    This can be achieved by standard techniques such as semaphores.

    Example: IF TWO CONCURRENT BIDS ARE A: 15 AND B: 56, AND THE CORRESPONDING OPERATIONS ARE INTERLEAVED WITHOUT ANY CONTROL, THEN THEY MIGHT GET STORED AS A: 56 AND B: 15.

    Several threads may be executing concurrently within an object, in which case their operations on the object may conflict with one another and produce inconsistent results, as above.
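A minimal sketch of the bid example with its operations synchronized by a semaphore. The `Auction` class and its method names are hypothetical; a binary `threading.Semaphore` guards every access so that a bidder and its amount are always stored together.

```python
import threading


class Auction:
    """Shared object whose operations are synchronized by a semaphore."""

    def __init__(self):
        self._mutex = threading.Semaphore(1)  # binary semaphore
        self._bids = {}
        self.highest = None

    def place_bid(self, bidder, amount):
        # Only one thread at a time may update the shared state.
        with self._mutex:
            self._bids[bidder] = amount
            if self.highest is None or amount > self.highest[1]:
                self.highest = (bidder, amount)

    def snapshot(self):
        with self._mutex:
            return dict(self._bids), self.highest


auction = Auction()
threads = [
    threading.Thread(target=auction.place_bid, args=bid)
    for bid in (("A", 15), ("B", 56))
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

bids, highest = auction.snapshot()
print(bids, highest)
```

Whichever thread runs first, each update completes as a unit, so the stored bids match what the bidders submitted.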


    Transparencies

    Access transparency: enables local and remote resources to be accessed using identical operations.

    Location transparency: enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).

    Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them.

    Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.

    Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.

    Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs.

    Performance transparency: allows the system to be reconfigured to improve performance as loads vary.

    Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.


    Consequences

    1. Concurrency:

    In a network of computers, concurrent program execution is the norm (e.g. sharing of files from a common server).

    The coordination of concurrently executing programs that share resources is an important topic.

    2. No global clock:

    When programs need to cooperate, they coordinate their actions by exchanging messages.

    Close coordination often depends on a shared idea of the time at which the programs' actions occur.

    But it turns out that there are limits to the accuracy with which the computers in a network can synchronize their clocks.

    There is no single global notion of the correct time.


    3. Independent failures:

    A DS can fail in new ways. Faults in the network result in the isolation of the computers that are connected to it, but that does not mean that they stop running.

    In fact, the programs on them may not be able to detect whether the network has failed or has become unusually slow.

    Each component can fail independently, leaving the others still running.


    Clock synchronization

    With a single computer and a single clock, it does not matter much if this clock is off by a small amount.

    As soon as multiple CPUs are introduced, each with its own clock, the situation changes. Although the frequency at which a crystal oscillator runs is usually fairly stable, it is impossible to guarantee that the crystals in different computers all run at exactly the same frequency. This causes the clocks to gradually get out of synchronization and give different values when read out. The difference is called clock skew.

    As a consequence of this clock skew, programs that expect the time associated with a file, object, process or message to be correct and independent of the machine on which it was generated can fail.


    Computer on which compiler runs: local clock 2144 2145 2146 2147 (output.o created)

    Computer on which editor runs: local clock 2142 2143 2144 2145 (output.c created)

    Figure 1.7: When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time


    Suppose output.o has time 2144 and shortly thereafter output.c is modified but is assigned time 2143 because the clock of its machine is slightly behind.

    Make will not call the compiler. The resulting executable binary program will then contain a mixture of object files from the old sources and the new sources.

    It will probably crash, and the programmer will go crazy trying to understand what is wrong with the code.

    Another case: the programmer has finished changing all the source files and starts make, which examines the times at which the source and object files were last modified.

    If the source file input.c has time 2151 and the corresponding object file input.o has time 2150, make knows that input.c has been changed since input.o was created, and thus input.c must be recompiled.
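The timestamp comparison described above can be sketched in a few lines. This is a simplified model of the rule, not make's actual implementation; the function name is hypothetical and the times are the ones used on the slides.

```python
def needs_recompile(source_time, object_time):
    """Simplified make rule: rebuild when the source is newer than the object."""
    return source_time > object_time


# input.c (2151) is newer than input.o (2150): recompiled, as intended.
normal_case = needs_recompile(source_time=2151, object_time=2150)

# output.c was edited after output.o was built, but the editor machine's
# clock is behind, so it is stamped 2143 against 2144: the edit is missed.
skewed_case = needs_recompile(source_time=2143, object_time=2144)

print(normal_case, skewed_case)
```

The rule itself is sound; it gives the wrong answer only because the two timestamps come from clocks that disagree.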


    For ordinary clocks based on quartz crystals, the drift rate is about 10^-6 seconds/second, giving a difference of 1 second every 11.6 days.

    High-precision quartz clocks: 10^-7 to 10^-8

    More accurate physical clocks use atomic oscillators, whose drift rate is about one part in 10^13
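The 11.6-day figure follows directly from the drift rate. The sketch below redoes that arithmetic and applies it to the atomic-clock rate as well; the function name is hypothetical and a year is taken as 365.25 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_drift_one_second(drift_rate):
    """Days until a clock with this drift rate (s/s) is off by one second."""
    return (1 / drift_rate) / SECONDS_PER_DAY


quartz_days = days_to_drift_one_second(1e-6)
atomic_years = days_to_drift_one_second(1e-13) / 365.25

print(round(quartz_days, 1), round(atomic_years))
```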


    Key characteristics

    1. Openness

    2. Resource sharing

    3. Concurrency

    4. Scalability

    5. Fault tolerance

    6. Transparency


    Computer Networks and DS

    Computer network: the autonomous computers are explicitly visible (they have to be explicitly addressed)

    Distributed system: the existence of multiple autonomous computers is transparent

    However:

    many problems are in common

    in some sense, networks are also DS

    normally, every DS relies on services provided by a computer network