Source: laurel.datsi.fi.upm.es/~jmpena/INSA/DS-1-introduction.pdf


Distributed (Operating) Systems

Introduction

Fernando Pérez, José María Peña, María S. Pérez

Schedule

Sessions:
1. Introduction: Distributed systems (Hardware/Software issues)
2. Process management in clusters: Load balancing and job scheduling
3. Distributed communications
4. Distributed services

Scenarios:
• High-performance solutions for scientific applications (process management)
• Distributed systems for transactional services

[Timetable, Mon and Tue, 8:00 to 17:00, with a lunch break: sessions 1-Intro, 2-proc, 3-comm and 4-serv, plus Scenario 1 and Scenario 2]



Introduction and Concepts


Distributed System (DS)
• Hardware: Network-connected processors without shared physical memory:
  – Loosely coupled system
  – No common clock
  – Processor-dependent I/O systems
  – Independent failures of system components
  – Heterogeneous system
• Goal of this seminar: Distributed System Software
  – Distributed Operating Systems (classical view)
  – Software interfaces that hide distributed system complexity:
    • Single System Image


Advantages and Drawbacks
• Advantages:
  – Cost/performance ratio
  – Parallel processing: high performance
  – Fault tolerance: high availability
  – Scalable, open and heterogeneous
  – Most appropriate for inherently distributed applications
    • E.g., a geographically distributed enterprise
• Drawbacks:
  – More complex software development
  – Network connection problems: latency, bandwidth and availability
  – Security


New Paradigms for DS
• Cluster Computing:
  – Dedicated systems:
    • High performance.
    • High availability.
  – Homogeneous system:
    • Nodes.
    • LAN (general-purpose or specific).
  – Open issues: Degree of coupling, distributed services.
• Grid Computing:
  – Resource sharing and use of idle processors.
  – Restricted to some specific tasks.
  – Different scopes:
    • Inter-departmental grids.
    • Inter-organization grids.
  – Open issues: Coordination, security and dynamic changes.


Operating System Support
1. OS for Distributed Systems:
  • Requirements
  • Characteristics
2. Distributed Systems
3. Parallel/Distributed OS:
  • Operating Systems Parallelisation
  • Distributed System Services
  • Microkernels


Distributed Architectures
A distributed system is a collection of independent computers presented to the user as a single computer.

Distributed Computer Architectures:
– Flynn’72: SISD, SIMD, MISD, MIMD
– Johnson’88: UMA, NUMA, NORMA
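Flynn's taxonomy classifies a machine by how many instruction streams and data streams it handles at once. The sketch below is only an illustration of that two-way split; the names `FlynnClass` and `classify` are hypothetical and not part of the slides.

```python
from enum import Enum


class FlynnClass(Enum):
    SISD = "single instruction stream, single data stream"
    SIMD = "single instruction stream, multiple data streams"
    MISD = "multiple instruction streams, single data stream"
    MIMD = "multiple instruction streams, multiple data streams"


def classify(instruction_streams: int, data_streams: int) -> FlynnClass:
    """Map stream counts to one of the four Flynn classes."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be positive")
    multiple_instructions = instruction_streams > 1
    multiple_data = data_streams > 1
    if multiple_instructions:
        return FlynnClass.MIMD if multiple_data else FlynnClass.MISD
    return FlynnClass.SIMD if multiple_data else FlynnClass.SISD


# A set of networked nodes, each running its own program on its own data:
print(classify(instruction_streams=16, data_streams=16).name)
```

Under this classification, the network-connected processors discussed in these slides fall into the MIMD class.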


Distributed System Applications
• Internet services: e-mail, news, web, ...
• Corporate networks or intranets.
• Parallel processing:
  – Massive processing (+efficiency).
  – Distributed topology (problems that are distributed by nature).
• Distributed massive data management.
• High-performance multimedia.
• Industrial and control systems.
• Real-time systems.

<and many others...>


Distributed System Profile
Distributed systems have:
1. No common clock: Message and co-ordination aspects.
2. Global concurrency: Real parallel execution.
3. Independent failures: Partial failures.

Distributed system usage:
1. Collaborative processing: combined features and services.
2. Parallel processing: massive or high-performance calculation.
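Without a common clock, nodes can still agree on an ordering of events by stamping messages with counters. The sketch below uses Lamport's logical clocks, a well-known technique chosen here purely for illustration; the slides do not prescribe it, and the class name `LamportClock` is hypothetical.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Local event: advance the counter."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Merge the sender's timestamp, then advance past it."""
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()             # local event on node a
stamp = a.send()     # a sends a message carrying its timestamp
b.receive(stamp)     # b's clock jumps past the timestamp it received
print(stamp, b.time)
```

The receive event always gets a larger timestamp than the matching send, so causally related events are ordered consistently even though the nodes never share a physical clock.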

System Requirements

Parallel systems:
– Performance
– Scalability
– Reliability
– Transparency
– Security

Collaborative systems:
– Openness
– Scalability
– Reliability
– Transparency
– Security

Common characteristics but different hardware platforms and applications. All of them DISTRIBUTED.


Operating System Distribution
• Operating systems for multiprocessors with shared memory (SMP):
  – Software tightly coupled
  – Hardware tightly coupled
• Distributed operating systems (DOS):
  – Software tightly coupled
  – Hardware loosely coupled
• Network operating systems:
  – Software loosely coupled
  – Hardware loosely coupled


Operating Systems for SMPs
Architectures with multiple processors (2 to 8) sharing memory with uniform access (SMP: Symmetric Multiprocessors).

Characteristics:
– “Small” variations of traditional OS versions.
– There is only one copy of the OS.
– Concurrency with real parallelism (≠ time sharing).
– Commercial versions (Linux, WinNT, Solaris, AIX, ...).
– Different problems: kernel code running on multiple processors (concurrent system calls), synchronisation mechanisms (spin-locks), optimisation and scheduling (processor affinity), ...
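A spin-lock protects a short critical section by busy-waiting instead of putting the waiter to sleep. The following is a minimal user-level sketch of the idea, assuming Python threads stand in for processors. A real kernel spin-lock relies on atomic hardware instructions, and CPython's interpreter lock means this version shows the logic rather than true parallelism. `SpinLock` and `run` are hypothetical names.

```python
import threading


class SpinLock:
    """Busy-waiting lock built on a non-blocking test-and-set style primitive."""

    def __init__(self) -> None:
        self._flag = threading.Lock()

    def acquire(self) -> None:
        # Spin: retry the non-blocking acquire until it succeeds.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self) -> None:
        self._flag.release()


def run(workers: int = 4, increments: int = 500) -> int:
    """Have several threads increment one shared counter under the lock."""
    lock = SpinLock()
    counter = 0

    def work() -> None:
        nonlocal counter
        for _ in range(increments):
            lock.acquire()
            try:
                counter += 1      # critical section
            finally:
                lock.release()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run())
```

Spinning only pays off when the critical section is very short, which is why it suits kernel code on a multiprocessor rather than long-running work.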


Distributed Operating Systems (DOS)
A distributed operating system runs on a group of processors interconnected by a communication network and hides that complexity, presenting a “virtual uniprocessor” to the user.

Characteristics:
– It runs on a distributed system, making it appear as a centralised system.
– Transparency: It must hide the complexity introduced by distribution.
– This is easier said than done.
– Experimental systems reach this goal only partially.
– Failures make the users complain.


Distributed Operating Systems (DOS)
Problems:
– Each node has a copy of the OS: Which tasks are performed locally and which globally?
– How is mutual exclusion achieved without shared memory?
– How are deadlocks detected without global state?
– Process scheduling: Each copy of the operating system has its own task queue (process migration).
– How is a single directory tree defined?
– Problems due to the lack of a common clock, partial failures and heterogeneity.

Main result:
– New concepts have been developed, and they are useful in other domains.
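One way to approach the deadlock question is to have a coordinator collect each node's local wait-for edges and search the merged graph for a cycle. The sketch below assumes such a centralised collection step exists. It is an illustration, not a description of any particular DOS, and `merge_wait_for` and `has_deadlock` are hypothetical names.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (waiting process, process it waits for)


def merge_wait_for(local_graphs: Iterable[Iterable[Edge]]) -> Dict[str, Set[str]]:
    """Union the wait-for edges reported by every node."""
    graph: Dict[str, Set[str]] = {}
    for edges in local_graphs:
        for waiter, holder in edges:
            graph.setdefault(waiter, set()).add(holder)
            graph.setdefault(holder, set())
    return graph


def has_deadlock(graph: Dict[str, Set[str]]) -> bool:
    """Depth-first search; an edge back to a process on the current path is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {p: WHITE for p in graph}

    def visit(p: str) -> bool:
        colour[p] = GREY
        for q in graph[p]:
            if colour[q] == GREY:
                return True
            if colour[q] == WHITE and visit(q):
                return True
        colour[p] = BLACK
        return False

    return any(colour[p] == WHITE and visit(p) for p in graph)


# No single node sees a cycle, but the merged view does.
node1 = [("P1", "P2")]
node2 = [("P2", "P3")]
node3 = [("P3", "P1")]
print(has_deadlock(merge_wait_for([node1, node2, node3])))
```

Because the local reports are gathered at different moments, the merged graph is only an approximation of a global state, which is exactly the difficulty the slide points at.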


DOS Evolution
• First network operating systems:
  – New network services in a conventional OS
  – E.g.: UNIX 4BSD (≈1980)
• New network functionalities:
  – Sun’s ONC (≈1985): includes NFS, RPC, NIS
• First DOS:
  – New OS based on conventional (monolithic) versions.
  – E.g.: Sprite, University of California, Berkeley (≈1988)
• DOS based on μ-kernels. E.g.:
  – Mach, CMU (≈1986)
  – Amoeba, designed by Tanenbaum (≈1984)
  – Chorus, INRIA, France (≈1988)


Network Operating Systems
A loosely coupled network of computers that share resources, with no external control over the hardware/software of each node.

Characteristics:
– No virtual uniprocessor view is presented (independent nodes).
– Each node runs its own copy of an OS (possibly a different one).
– Conventional OS + network utilities.
– Communication protocols for resource sharing and high-level service access.
– From rcp/rlogin to Sun’s Open Network Computing (ONC).


Cooperative Systems
Software systems oriented to high-level services, which require communication mechanisms to build upper-level services.

Characteristics:
– A degree of transparency is provided, but no single-system view is presented. Autonomous, independent systems.
– They are built on middleware (CORBA, DCE, COM+, ...).
– These systems are designed as a combination of multiple services offered by different network elements.


Middleware

Middleware:
– Software layer over the operating system that provides standard distributed services.
– Open systems, independent of the vendor.
– Hardware and OS independent.

Examples:
– DCE (Open Group).
– CORBA (OMG).
– ...

[Figure: a single middleware layer spanning three nodes, each with its own OS and hardware]


Single System Image (SSI)
The illusion, created by hardware/software, that presents a collection of resources as one.

– Hardware SSI: DEC Memory Channel or SMPs
– Operating system: DOS or gluing layer
– Applications and services: Middleware (many levels).

Every SSI has a boundary.


Why is SSI useful?
• It is easy to program/use:
  – Traditional programming, known interfaces.
  – Low-level issues hidden.
• Allows centralized or distributed management, depending on task requirements.
• (Potentially) provides:
  – Fault tolerance.
  – Scalability.
  – Modular improvement.


Operating System Layers

A simplified view of an operating system has the following layers:

• Hardware.
• Kernel.
• System services.
• Application programs.
• Users.

[Figure: layer stack, from Hardware at the bottom through Kernel, Services and Applications up to Users]


Kernel Responsibilities

Monolithic kernels: Many OS functionalities reside inside the kernel: scheduler, memory manager, drivers, file systems, ...

μ-kernels: Many OS tasks are performed outside the kernel. What remains inside: (i) process communication, (ii) memory management, (iii) low-level management and scheduling, and (iv) low-level I/O.

Distributed services: Distributed system structure. Depending on the level: distributed operating systems, network operating systems, or cooperative systems.

[Figure: three structures: a kernel with services on one computer; a μ-kernel with services on one computer; services spanning several μ-kernels]


Operating System on Distributed Systems

            | MPPs              | SMPs                             | Clusters                            | Distributed
Size        | 100s – 1000s      | 10s                              | 100s or less                        | 10s – 1000s
OS          | N x kernels       | Single OS kernel                 | N x OS platforms                    | N x OS platforms
OS type     | Specific purpose  | Special variants of standard OSs | Standard OS plus tools (not always) | Standard OS and special tools
Communic.   | Message / DSM     | Shared memory                    | Message passing (e.g.: MPI)         | Message passing or middleware
Scheduling  | Single queue      | Single queue                     | Multiple coordinated queues         | Independent queues

Single System Image (SSI)


Tools for Distributed/Cluster Systems
• Operating system:
  – Modular/layered monolithic
  – Based on μ-kernels
• Runtime systems:
  – Parallel file systems or I/O libraries
  – Distributed shared memory software
• Resource management:
  – Process scheduling tools
  – Load balancing
• Applications:
  – Management and administration tools.
  – Processing tasks and jobs
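Load balancing can be illustrated with a simple greedy policy: take jobs from the most expensive down and give each one to whichever node currently has the least work. This is a sketch under the assumption that job costs are known in advance. Real cluster schedulers use richer information, and `assign_jobs` is a hypothetical name.

```python
import heapq
from typing import Dict, List, Tuple


def assign_jobs(job_costs: List[float], nodes: List[str]) -> Dict[str, List[int]]:
    """Place each job (largest first) on the currently least-loaded node."""
    heap: List[Tuple[float, str]] = [(0.0, name) for name in nodes]
    heapq.heapify(heap)
    placement: Dict[str, List[int]] = {name: [] for name in nodes}
    # Job indices ordered by decreasing cost.
    order = sorted(range(len(job_costs)), key=lambda i: job_costs[i], reverse=True)
    for job in order:
        load, node = heapq.heappop(heap)           # least-loaded node so far
        placement[node].append(job)
        heapq.heappush(heap, (load + job_costs[job], node))
    return placement


costs = [5, 3, 3, 2, 2, 1]
plan = assign_jobs(costs, ["n0", "n1"])
for node, jobs in plan.items():
    print(node, jobs, sum(costs[j] for j in jobs))
```

The heap keeps the least-loaded node at the front, so each placement costs only a logarithmic number of steps in the number of nodes.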

Hardware and Software Overview


Concept of Cluster
• Alternative to traditional supercomputing facilities.
• Instead of traditional systems:
  – Specific hardware.
  – High cost.
  – Slow hardware development.
  – Painful software development.
• The use of general-purpose systems provides:
  – Commodity hardware (commercial off-the-shelf: COTS).
  – Moderate cost.
  – Fast hardware development.
  – Even more painful software development.


Concept of Cluster
Cluster: Hardware system based on commodity hardware connected by a dedicated (high-performance) network.
– Nodes: PCs or workstations (SMPs).
– Network: From high-speed networks to specific hardware.

Mysterious acronyms:
– PoPCs: Piles of PCs
– COWs: Clusters of workstations
– CLUMPS: Clusters of multiprocessors
– NOWs: Networks of workstations
– ...


Hardware Characteristics
• Nodes:
  – Processor: Intel Pentium, AMD Athlon, Compaq Alpha, IBM PowerPC, Sun SuperSPARC (3-4 ... GHz)
  – Memory: SDRAM, DDR or similar (2-8 GB)
  – Storage: SCSI or RAID
• Network:
  – Key element.
  – It can account for 50% or more of the system cost.
  – Cheap alternative: Ethernet (100-1000 Mb/s)


Cluster Networks (I)
• General-purpose network technologies:
  – Improvement in network bandwidth.
  – Only small improvements in latency, so not well suited.
• Low-latency protocols:
  – Active Messages (Berkeley): “Zero-copy” synchronous model. GAM.
  – Fast Messages (Illinois): Reliable, in-order AM.
  – VMMC (Princeton): Distributed shared memory pages (DSM).
  – U-Net (Cornell): Virtual interfaces for memory pages.
  – BIP (ENS Lyon): Low-latency basic interface.


Cluster Networks (II)
• Cluster communication standards:
  – VIA: Hardware interface (native/emulated) for communications. Maps physical memory regions and virtual network interfaces. MPI versions over VIA.
  – InfiniBand: I/O hardware standard (2.5 Gbps) using one-way connections. Six communication models. Uses RDMA and IPv6.
• Network hardware:
  – Ethernet, Fast Ethernet, Gigabit Ethernet: Cheap but limited. Collision problems. VIA emulations.
  – Giganet (cLAN): Implementation over VIA (1.26 Gbps)
  – Myrinet: Low-latency programmable networks. Cut-through routing and failure detection. GM protocol.
  – Others: QsNet, ServerNet, SCI, ATM, Fibre Channel, HIPPI, ATOLL, ...


Technology Comparison

                               | Gigabit Ethernet       | Giganet       | Myrinet       | QsNet              | SCI           | ServerNet2
MPI bandwidth, stable (MB/s)   | 35-50                  | 105           | 140           | 208                | 80            | 65
MPI latency (μs)               | 100-200                | 20-40         | ~18           | 5                  | 6             | 20.2
Maximum number of nodes        | 1000’s                 | 1000’s        | 1000’s        | 1000’s             | 1000’s        | 64k
VIA support                    | Win/Linux              | Win/Linux     | Over GM       | None               | Software      | Hardware
MPI support type               | MPICH over MVIA or TCP | Third parties | Third parties | Quadrics or Compaq | Third parties | Compaq or third parties

© Amy Apon / Mark Baker 2000
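To see why latency matters as much as bandwidth, a rough first-order model estimates the time to deliver one message as latency plus size divided by bandwidth. The sketch below applies that model to figures from the table. The model itself, the choice of the best-case Gigabit Ethernet values, and the reading of MB as 10^6 bytes are assumptions made for illustration.

```python
def transfer_time_us(size_bytes: int, latency_us: float, bandwidth_mb_s: float) -> float:
    """Estimated delivery time in microseconds.

    With MB taken as 10**6 bytes, 1 MB/s equals 1 byte per microsecond,
    so size / bandwidth is already in microseconds.
    """
    return latency_us + size_bytes / bandwidth_mb_s


# (latency in microseconds, bandwidth in MB/s), taken from the table above
networks = {
    "Gigabit Ethernet (best case)": (100.0, 50.0),
    "Myrinet": (18.0, 140.0),
    "QsNet": (5.0, 208.0),
}

for size in (1024, 1_000_000):
    for name, (latency, bandwidth) in networks.items():
        estimate = transfer_time_us(size, latency, bandwidth)
        print(f"{size:>9} bytes over {name}: {estimate:.1f} us")
```

For the 1 KB message the latency term dominates the estimate on Gigabit Ethernet, while for the 1 MB message the bandwidth term dominates on every network.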


Software Development (I)
• Operating systems:
  – Linux:
    • Free, cheap, fast, and rapidly developing.
    • E.g., Beowulf
  – Solaris:
    • Good parallelism support and good network services.
    • E.g., Solaris MC
  – AIX:
    • Powerful and well-optimized software development tools.
    • E.g., SP2
  – Windows:
    • Why not?
    • E.g., Wolfpack


Software Development (II)
• Middleware and SSI:
  – SSI (Single System Image): The whole cluster is presented as a single uniprocessor.
  – Layered development:
    • Hardware (local).
    • Operating system (μ-kernel) or gluing level: GLUnix or MOSIX
    • Applications, services and middleware: CODINE
  – Common services (desirable):
    • Single access point.
    • Single file hierarchy.
    • Single management point.
    • Single network connection.
    • Single work-management service.
    • Single user interface.
    • Single I/O space.
    • Single process space.
    • Checkpointing.
    • Process migration.
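Checkpointing means saving enough state to resume a computation after a failure instead of starting over. The sketch below shows the idea at application level with a file on disk. A cluster-wide checkpointing service would work at process level and is far more involved. The function names and the file name `state.ckpt` are hypothetical.

```python
import os
import pickle
import tempfile


def checkpoint(state: dict, path: str) -> None:
    """Write the state to a temporary file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)   # the old checkpoint is replaced in one step


def restore(path: str) -> dict:
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path: str, upto: int) -> int:
    """Sum 0..upto-1, resuming from the last checkpoint if one exists."""
    state = restore(path) if os.path.exists(path) else {"i": 0, "acc": 0}
    while state["i"] < upto:
        state["acc"] += state["i"]
        state["i"] += 1
        if state["i"] % 10 == 0:
            checkpoint(state, path)
    return state["acc"]


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "state.ckpt")
    run(ckpt, 50)             # first run stops early, as if interrupted
    print(run(ckpt, 100))     # second run resumes from the saved state
```

Writing to a temporary file first avoids leaving a half-written checkpoint behind if the process dies during the save.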


Software Development (III)
• Programming tools:
  – Thread support: Pthreads or OpenMP
  – Message passing in clusters:
    • MPI: MPICH or LAM/MPI.
    • PVM: Worse performance but more features.
  – DSM: Distributed shared memory:
    • Software: TreadMarks, Linda or Nanos
    • Hardware: DASH or Merlin
  – Parallel debuggers
  – Instrumentation tools.
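The message-passing style can be sketched without any cluster software: each worker owns a mailbox, receives its share of the data as a message, and sends a partial result back to a root. In the sketch below, threads and in-memory queues stand in for processes and the network. This is an assumption made so the example runs anywhere, and a real program would use an MPI implementation instead. `parallel_sum` is a hypothetical name.

```python
import queue
import threading
from typing import List


def parallel_sum(data: List[int], workers: int = 4) -> int:
    """Scatter slices of `data` to workers, then gather and add partial sums."""
    # Mailbox 0 belongs to the root; mailboxes 1..workers belong to the workers.
    inboxes = [queue.Queue() for _ in range(workers + 1)]

    def worker(rank: int) -> None:
        chunk = inboxes[rank].get()            # receive work from the root
        inboxes[0].put((rank, sum(chunk)))     # send the partial result back

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(1, workers + 1)]
    for t in threads:
        t.start()
    for rank in range(1, workers + 1):
        inboxes[rank].put(data[rank - 1::workers])   # scatter strided slices
    total = sum(inboxes[0].get()[1] for _ in range(workers))   # gather and reduce
    for t in threads:
        t.join()
    return total


print(parallel_sum(list(range(101))))
```

No worker touches another worker's data. All coordination happens through explicit messages, which is the property that lets the same structure run across nodes without shared memory.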


Software Development (IV)
• Administration tools:
  – Remote management:
    • Administrative commands: install software, copy files.
    • Process-level resource management.
    • User list and other system information: NIS.
    • E.g., SP2 tools, Cluster Command & Control (C3)
  – Scheduling systems:
    • Work queues and workload management.
    • Resource supervision.
    • E.g., CODINE, Condor, PBS (Portable Batch System)


Input/Output System
• I/O crisis:
  – Exponential growth of CPU power (Moore’s law).
  – I/O systems grow much more slowly.
  – The I/O phase is the real bottleneck of high-performance systems.
• Solution based on I/O parallelism:
  – Parallel I/O systems: MPI-IO
  – Parallel file systems: ParFiSys, GPFS
  – Intelligent I/O: Armada, Panda
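One common form of I/O parallelism is striping: a file is cut into fixed-size blocks that are spread round-robin over several disks, so reads and writes can proceed on all disks at once. The sketch below models only the layout, with Python lists standing in for disks. It does not describe how any of the systems named above work internally, and `stripe` and `unstripe` are hypothetical names.

```python
from typing import List


def stripe(data: bytes, disks: int, block: int) -> List[List[bytes]]:
    """Distribute fixed-size blocks of `data` round-robin over `disks` lists."""
    layout: List[List[bytes]] = [[] for _ in range(disks)]
    for n, offset in enumerate(range(0, len(data), block)):
        layout[n % disks].append(data[offset:offset + block])
    return layout


def unstripe(layout: List[List[bytes]]) -> bytes:
    """Read the blocks back row by row to rebuild the original byte order."""
    out = []
    depth = max(len(disk) for disk in layout)
    for row in range(depth):
        for disk in layout:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


payload = bytes(range(10))
layout = stripe(payload, disks=3, block=3)
print(layout)
print(unstripe(layout) == payload)
```

With three disks, three consecutive blocks can be transferred concurrently, which is where the bandwidth gain of a parallel file system comes from.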