01b Intro: HPC, Cluster Computing and Research



    Outline

    Why we need parallel computing: problems demand high computational resources

    Introduction to parallel computing

    Cluster Computing

    Research in Cluster Computing

    Rocks for Cluster Computing


    Resource Hungry Applications

    Solving grand challenge applications using computer modeling, simulation, and analysis:

    Life Sciences

    CAD/CAM

    Aerospace

    Digital Biology

    Military Applications

    Internet & E-commerce


    Example of a Parallel Process


    Example of a Parallel Process


    Example of a Parallel Process


    Parallel I/O


    The Area / Research Topics

    Applications/problems to be solved

    Cluster computing problems


    Era of Computing

    Rapid technical advances:

    the recent advances in VLSI technology

    software technology: OS, PL, development methodologies, & tools

    grand challenge applications have become the main driving force

    Parallel computing:

    one of the best ways to overcome the speed bottleneck of a single processor

    good price/performance ratio of a small cluster-based parallel computer


    In Summary

    Need more computing power: improving the operating speed of processors & other components is constrained by the speed of light, thermodynamic laws, & the high financial costs of processor fabrication.

    Connect multiple processors together & coordinate their computational efforts: parallel computers allow the sharing of a computational task among multiple processors.


    Technology Trends


    Trend

    [Traditional Usage] Workstations with Unix for science & industry vs. PC-based machines for administrative work & word processing.

    [Trend] A rapid convergence in processor performance and kernel-level functionality of Unix workstations and PC-based machines.


    Rise and Fall of Computer Architectures

    Vector Computers (VC) - proprietary systems: provided the breakthrough needed for the emergence of computational science, but they were only a partial answer.

    Massively Parallel Processors (MPP) - proprietary systems: high cost and a low performance/price ratio.

    Symmetric Multiprocessors (SMP): suffer from scalability limits.

    Distributed Systems: difficult to use and hard to extract parallel performance.

    Clusters - gaining popularity:

    High Performance Computing - commodity supercomputing

    High Availability Computing - mission-critical applications


    The Dead Supercomputer Society

    http://www.paralogos.com/DeadSuper/

    ACRI
    Alliant
    American Supercomputer
    Ametek
    Applied Dynamics
    Astronautics
    BBN
    CDC
    Convex
    Cray Computer
    Cray Research (SGI? Tera)
    Culler-Harris
    Culler Scientific
    Cydrome
    Dana/Ardent/Stellar
    Elxsi
    ETA Systems
    Evans & Sutherland Computer Division
    Floating Point Systems
    Galaxy YH-1
    Goodyear Aerospace MPP
    Gould NPL
    Guiltech
    Intel Scientific Computers
    Intl. Parallel Machines
    KSR
    MasPar
    Meiko
    Myrias
    Thinking Machines
    Saxpy
    Scientific Computer Systems (SCS)
    Soviet Supercomputers
    Suprenum

    [Photo on slide: Convex C4600]


    Towards Clusters

    The promise of supercomputing to the average PC user?


    Towards Commodity Parallel Computing

    linking together two or more computers to jointly solve computational problems

    since the early 1990s, an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards clusters of workstations; it is hard to find money to buy expensive systems

    the rapid improvement in the availability of commodity high-performance components for workstations and networks

    Low-cost commodity supercomputing: from specialized traditional supercomputing platforms to cheaper, general-purpose systems consisting of loosely coupled components built up from single- or multiprocessor PCs or workstations


    History: Clustering of Computers for Collective Computing

    [Timeline figure: 1960, 1980s, 1990, 1995+, 2000+ — the evolution of collective computing, ending with PDAs and clusters.]


    Why PC/WS Clustering Now?

    Individual PCs/workstations are becoming increasingly powerful

    Commodity network bandwidth is increasing and latency is decreasing

    PC/workstation clusters are easier to integrate into existing networks

    Typical low user utilization of PCs/WSs

    Development tools for PCs/WSs are more mature

    PC/WS clusters are cheap and readily available

    Clusters can easily be grown


    Cluster Architecture

    [Architecture figure] Sequential and parallel applications run on top of a parallel programming environment and cluster middleware (single system image and availability infrastructure), which spans multiple PC/workstation nodes; each node runs communications software over its network interface hardware, and all nodes are connected by a cluster interconnection network/switch.


    Cluster Components


    System CPUs

    Intel x86 processors: Pentium Pro and Pentium Xeon; AMD x86, Cyrix x86, etc.

    Digital Alpha: the Alpha 21364 processor integrates processing, memory controller, and network interface into a single chip

    IBM PowerPC

    Sun SPARC

    SGI MIPS

    HP PA


    System Disk

    Disk and I/O: overall improvement in disk access time has been less than 10% per year

    Amdahl's law: the speed-up obtained from faster processors is limited by the slowest system component (see the worked form below)

    Parallel I/O: carry out I/O operations in parallel, supported by a parallel file system based on hardware or software RAID
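    As a worked form of Amdahl's law above (the standard formulation; the numbers are an illustrative assumption, not from the slides): if a fraction p of the run time benefits from a speed-up s and the rest (e.g., disk I/O) does not, then

```latex
S = \frac{1}{(1-p) + p/s}
\qquad
p = 0.9,\ s = 10 \;\Rightarrow\; S = \frac{1}{0.1 + 0.09} \approx 5.3,
\qquad
\lim_{s \to \infty} S = \frac{1}{1-p} = 10
```

    So even a 10x faster CPU yields barely 5x overall if 10% of the time is spent in unimproved I/O, which is why parallel I/O matters.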


    Commodity Components for Clusters (II): Operating Systems

    An operating system provides 2 fundamental services for users:

    make the computer hardware easier to use: create a virtual machine that differs markedly from the real machine

    share hardware resources among users: processor multitasking

    The new concept in OS services: support multiple threads of control in a process itself; parallelism within a process (multithreading); the POSIX thread interface is a standard programming environment

    Trend: modularity (MS Windows, IBM OS/2); microkernels provide only essential OS services


    Prominent Components of Cluster Computers

    State-of-the-art operating systems:

    Linux (MOSIX, Beowulf, and many more)

    Microsoft NT (Illinois HPVM, Cornell Velocity)

    SUN Solaris (Berkeley NOW, C-DAC PARAM)

    IBM AIX (IBM SP2)

    HP UX (Illinois PANDA)

    Mach, a microkernel-based OS (CMU)

    Cluster operating systems (Solaris MC, SCO Unixware, MOSIX (an academic project))

    OS gluing layers (Berkeley GLUnix)



    Commodity Components for Clusters

    Operating Systems: Linux

    Unix-like OS

    Runs on cheap x86 platforms, yet offers the power and flexibility of Unix

    Readily available on the Internet and can be downloaded without cost

    Easy to fix bugs and improve system performance

    Users can develop or fine-tune hardware drivers which can easily be made available to other users

    Features such as preemptive multitasking, demand-paged virtual memory, multiuser and multiprocessor support


    Commodity Components for Clusters

    Operating Systems: Microsoft Windows NT (New Technology)

    Preemptive, multitasking, multiuser, 32-bit OS

    Object-based security model and a special file system (NTFS) that allows permissions to be set on a file and directory basis

    Supports multiple CPUs and provides multitasking using symmetrical multiprocessing

    Supports different CPUs and multiprocessor machines with threads

    Has the network protocols & services integrated with the base OS: several built-in networking protocols (IPX/SPX, TCP/IP, NetBEUI) & APIs (NetBIOS, DCE RPC, Windows Sockets (Winsock))


    Prominent Components of Cluster Computers (III)

    High Performance Networks/Switches:

    Ethernet (10 Mbps), Fast Ethernet (100 Mbps), Gigabit Ethernet (1 Gbps)

    SCI (Scalable Coherent Interface; ~12 µs MPI latency)

    ATM (Asynchronous Transfer Mode)

    Myrinet (1.2 Gbps)

    QsNet (Quadrics Supercomputing World; 5 µs latency for MPI messages)

    Digital Memory Channel

    FDDI (Fiber Distributed Data Interface)

    InfiniBand


    Prominent Components of Cluster Computers (IV)

    Fast Communication Protocols and Services (User-Level Communication):

    Active Messages (Berkeley)

    Fast Messages (Illinois)

    U-net (Cornell)

    XTP (Virginia)

    Virtual Interface Architecture (VIA)


    Cluster Interconnects: Comparison (created in 2000)

    | Metric | Myrinet | QsNet | Giganet | ServerNet2 | SCI | Gigabit Ethernet |
    | Bandwidth (MBytes/s) | 140 @ 33 MHz, 215 @ 66 MHz | 208 | ~105 | 165 | ~80 | 30-50 |
    | MPI latency (µs) | 16.5 @ 33 MHz, 11 @ 66 MHz | 5 | ~20-40 | 20.2 | 6 | 100-200 |
    | List price/port | $1.5K | $6.5K | $1.5K | ~$1.5K | ~$1.5K | ~$1.5K |
    | Hardware availability | Now | Now | Now | Q2'00 | Now | Now |
    | Linux support | Now | Now | Now | Q2'00 | Now | Now |
    | Maximum #nodes | 1000s | 1000s | 1000s | 64K | 1000s | 1000s |
    | Protocol implementation | Firmware on adapter | Firmware on adapter | Firmware on adapter | Implemented in hardware | Implemented in hardware | Firmware on adapter |
    | VIA support | Soon | None | NT/Linux | Done in hardware | Software | NT/Linux (TCP/IP, VIA) |
    | MPI support | 3rd party | Quadrics/Compaq | 3rd party | Compaq/3rd party | 3rd party | MPICH over TCP/IP |


    Commodity Components for Clusters

    Cluster Interconnects: nodes communicate over high-speed networks using a standard networking protocol such as TCP/IP or a low-level protocol such as Active Messages (AM)

    Ethernet, Fast Ethernet, and Gigabit Ethernet:

    Standard Ethernet (10 Mbps): a cheap, easy way to provide file and printer sharing, but its bandwidth & latency are not balanced with the nodes' computational power

    Fast Ethernet (100 Mbps)

    Gigabit Ethernet: preserves Ethernet's simplicity and delivers a very high bandwidth to aggregate multiple Fast Ethernet segments


    Commodity Components for Clusters

    Cluster Interconnects: Myrinet

    1.28 Gbps full-duplex interconnection network

    Uses low-latency cut-through routing switches, which are able to offer fault tolerance by automatic mapping of the network configuration

    Supports both Linux & NT

    Advantages:

    Very low latency (5 µs, one-way point-to-point)

    Very high throughput

    Programmable on-board processor for greater flexibility

    Disadvantages:

    Expensive: $1500 per host

    Complicated scaling: switches with more than 16 ports are unavailable


    Prominent Components of Cluster Computers (V): Cluster Middleware

    Single System Image (SSI)

    System Availability (SA) infrastructure

    Hardware: DEC Memory Channel, DSM (Alewife, DASH), SMP techniques

    Operating system kernel / gluing layers: Solaris MC, Unixware, GLUnix, MOSIX

    Applications and subsystems:

    Applications (system management and electronic forms)

    Runtime systems (software DSM, PFS, etc.)

    Resource management and scheduling (RMS) software: SGE (Sun Grid Engine), LSF, PBS, Libra (an economy cluster scheduler), NQS, etc.


    Advanced Network Services / Communication SW

    The communication infrastructure supports protocols for bulk-data transport, streaming data, and group communications

    Communication services provide the cluster with important QoS parameters: latency, bandwidth, reliability, fault tolerance, jitter control

    Network services are designed as a hierarchical stack of protocols with relatively low-level communication APIs, providing the means to implement a wide range of communication methodologies: RPC, DSM, stream-based and message-passing interfaces (e.g., MPI, PVM) (a minimal sockets sketch follows)
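    The stack bottoms out in a low-level communication API; as a minimal illustration (generic POSIX calls, not from the slides), two processes exchange a message over a stream socket:

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* The kind of low-level endpoint on which RPC, DSM, and message-passing
   layers such as MPI and PVM are built. */
int main(void) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return 1;

    if (fork() == 0) {                     /* child: one endpoint */
        char buf[64];
        ssize_t n = read(fds[1], buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("child got: %s\n", buf); }
        _exit(0);
    }

    const char *msg = "bulk-data chunk";   /* parent: the other endpoint */
    write(fds[0], msg, strlen(msg));
    wait(NULL);
    return 0;
}
```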


    Prominent Components of Cluster Computers (VII)

    Applications:

    Sequential

    Parallel / distributed (cluster-aware applications)

    Grand challenge applications: weather forecasting, quantum chemistry, molecular biology modeling, engineering analysis (CAD/CAM), ...

    PDBs, web servers, data mining


    Key Operational Benefits of Clustering

    High performance

    Expandability and scalability

    High throughput

    High availability


    Clusters Classification (I)

    Application target:

    High Performance (HP) clusters: grand challenge applications

    High Availability (HA) clusters: mission-critical applications


    Clusters Classification (II)

    Node ownership:

    Dedicated clusters

    Non-dedicated clusters: adaptive parallel computing; communal multiprocessing


    Clusters Classification (III)

    Node hardware:

    Clusters of PCs (CoPs) / Piles of PCs (PoPs)

    Clusters of Workstations (COWs)

    Clusters of SMPs (CLUMPs)


    Clusters Classification (IV)

    Node operating system:

    Linux clusters (e.g., Beowulf)

    Solaris clusters (e.g., Berkeley NOW)

    NT clusters (e.g., HPVM)

    AIX clusters (e.g., IBM SP2)

    SCO/Compaq clusters (Unixware)

    Digital VMS clusters

    HP-UX clusters

    Microsoft Wolfpack clusters


    Clusters Classification (V)

    Node configuration:

    Homogeneous clusters: all nodes have similar architectures and run the same OS

    Heterogeneous clusters: nodes have different architectures and run different OSs


    Cluster Programming


    Levels of Parallelism

    Code granularity vs. code item, and the mechanism that exploits each level:

    Large grain (task level): whole programs (Task i-1, Task i, Task i+1), exploited with PVM/MPI

    Medium grain (control level): functions (func1(), func2(), func3()), exploited with threads

    Fine grain (data level): loop iterations (a(0)=.., b(0)=.., a(1)=.., b(1)=.., a(2)=.., b(2)=..), exploited by compilers

    Very fine grain (multiple issue): individual operations (+, x, load), exploited by CPU hardware

    A short sketch contrasting these levels follows.
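    A minimal C sketch (illustrative only, not from the slides) marking where each granularity level shows up in ordinary code:

```c
#include <stdio.h>

#define N 8

/* Medium grain (control level): this whole function could run as a thread. */
static void fill(double *a, double *b) {
    /* Fine grain (data level): independent loop iterations a compiler can
       vectorize or parallelize. */
    for (int i = 0; i < N; i++) {
        a[i] = 2.0 * i;    /* Very fine grain: the multiply, add, and loads/stores */
        b[i] = a[i] + 1.0; /* below can be issued in parallel by the CPU.          */
    }
}

int main(void) {
    /* Large grain (task level): on a cluster, several instances of this
       program would run as separate tasks coordinated via PVM/MPI. */
    double a[N], b[N];
    fill(a, b);
    printf("a[%d] = %g, b[%d] = %g\n", N - 1, a[N - 1], N - 1, b[N - 1]);
    return 0;
}
```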


    Cluster Programming Environments

    Shared memory based:

    DSM

    Threads/OpenMP (enabled for clusters; see the OpenMP sketch below)

    Java threads (IBM cJVM)

    Message passing based:

    PVM

    MPI

    Parametric computations: Nimrod-G and Gridbus Data Grid Broker

    Automatic parallelising compilers

    Parallel libraries & computational kernels (e.g., NetSolve)
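    As a hedged illustration of the shared-memory entry above (a generic OpenMP sketch, not code from the slides), one pragma parallelizes a loop across the cores of a node:

```c
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];
    double sum = 0.0;

    /* Iterations are divided among threads; reduction(+:sum) combines the
       per-thread partial sums safely. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        sum += a[i];
    }

    printf("sum = %g (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```

    Build with an OpenMP-capable compiler, e.g. gcc -fopenmp.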


    Programming Environments and Tools (I)

    Threads (PCs, SMPs, NOW, ...)

    In multiprocessor systems: used to simultaneously utilize all the available processors

    In uniprocessor systems: used to utilize the system resources effectively

    Multithreaded applications offer quicker response to user input and run faster

    Potentially portable, as there exists an IEEE standard for the POSIX threads interface (pthreads)

    Extensively used in developing both application and system software (a minimal pthreads sketch follows)
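    A minimal pthreads sketch (illustrative; assumes any Unix with POSIX threads): two workers sum disjoint ranges concurrently inside one process, so no lock is needed.

```c
#include <pthread.h>
#include <stdio.h>

struct range { long lo, hi; long long sum; };

/* Worker body: each thread sums its own disjoint range. */
static void *worker(void *arg) {
    struct range *r = arg;
    r->sum = 0;
    for (long i = r->lo; i < r->hi; i++)
        r->sum += i;
    return NULL;
}

int main(void) {
    struct range r[2] = { { 0, 500000, 0 }, { 500000, 1000000, 0 } };
    pthread_t t[2];

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &r[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("total = %lld\n", r[0].sum + r[1].sum);
    return 0;
}
```

    Build with -pthread (e.g. gcc -pthread).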


    Programming Environments and Tools (II)

    Message Passing Systems (MPI and PVM): allow efficient parallel programs to be written for distributed-memory systems

    The 2 most popular high-level message-passing systems are PVM & MPI

    PVM: both an environment & a message-passing library

    MPI: a message-passing specification, designed to be a standard for distributed-memory parallel computing using explicit message passing; an attempt to establish a practical, portable, efficient, & flexible standard for message passing

    Generally, application developers prefer MPI, as it is fast becoming the de facto standard for message passing (a minimal MPI sketch follows)
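    A minimal MPI sketch in C (illustrative; MPI_Send/MPI_Recv are standard MPI calls, and the launcher name depends on the MPI implementation): rank 0 passes a message that rank 1 receives explicitly.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Explicit message passing: send one int to rank 1 with tag 0. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

    Typically run with something like mpirun -np 2 ./a.out.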


    Programming Environments and Tools (III)

    Distributed Shared Memory (DSM) Systems

    Message passing: the most efficient, widely used programming paradigm on distributed-memory systems, but complex & difficult to program

    Shared-memory systems: offer a simple and general programming model, but suffer from scalability limits

    DSM on a distributed-memory system: an alternative, cost-effective solution

    Software DSM: usually built as a separate layer on top of the communication interface; takes full advantage of application characteristics (virtual pages, objects, & language types are the units of sharing); e.g., TreadMarks, Linda (a single-node analogue of page sharing is sketched below)

    Hardware DSM: better performance, no burden on user & software layers, fine granularity of sharing, extensions of the cache coherence scheme, & increased hardware complexity; e.g., DASH, Merlin
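    Real DSM needs a runtime such as TreadMarks; as a single-node analogue only (assumes MAP_ANONYMOUS, available on Linux/BSD/macOS), a MAP_SHARED page shows the "unit of sharing" idea between two processes:

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* A page mapped MAP_SHARED is the unit of sharing between two processes;
   software DSM systems extend this illusion across cluster nodes. */
int main(void) {
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return 1;
    *shared = 0;

    if (fork() == 0) {    /* child writes through the shared mapping */
        *shared = 123;
        _exit(0);
    }
    wait(NULL);
    printf("parent sees %d\n", *shared);  /* prints 123 */
    return 0;
}
```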


    Programming Environments and Tools (IV): Parallel Debuggers and Profilers

    Debuggers: very limited

    HPDF (High Performance Debugging Forum), a Parallel Tools Consortium project started in 1996, developed the HPD version specification, which defines the functionality, semantics, and syntax for a command-line parallel debugger

    TotalView:

    A commercial product from Dolphin Interconnect Solutions

    The only widely available GUI-based parallel debugger that supports multiple HPC platforms

    Only usable in homogeneous environments, where each process of the parallel application being debugged must be running under the same version of the OS


    Functionality of a Parallel Debugger

    Managing multiple processes and multiple threads within a process

    Displaying each process in its own window

    Displaying source code, stack trace, and stack frame for one or more processes

    Diving into objects, subroutines, and functions

    Setting both source-level and machine-level breakpoints

    Sharing breakpoints between groups of processes

    Defining watch and evaluation points

    Displaying arrays and their slices

    Manipulating code variables and constants


    Performance Analysis


    Performance Analysis and Visualization Tools

    | Tool | Supports | URL |
    | AIMS | Instrumentation, monitoring library, analysis | http://science.nas.nasa.gov/Software/AIMS |
    | MPE | Logging library and snapshot performance visualization | http://www.mcs.anl.gov/mpi/mpich |
    | Pablo | Monitoring library and analysis | http://www-pablo.cs.uiuc.edu/Projects/Pablo/ |
    | Paradyn | Dynamic instrumentation running analysis | http://www.cs.wisc.edu/paradyn |
    | SvPablo | Integrated instrumentor, monitoring library and analysis | http://www-pablo.cs.uiuc.edu/Projects/Pablo/ |
    | Vampir | Monitoring library, performance visualization | http://www.pallas.de/pages/vampir.htm |
    | Dimemas | Performance prediction for message-passing programs | http://www.pallas.com/pages/dimemas.htm |
    | Paraver | Program visualization and analysis | http://www.cepba.upc.es/paraver |


    Programming Environments and Tools (VI): Cluster Administration Tools

    Berkeley NOW: gathers & stores data in a relational DB; uses a Java applet to allow users to monitor the system

    SMILE (Scalable Multicomputer Implementation using Low-cost Equipment): called K-CAP; consists of compute nodes, a management node, & a client that can control and monitor the cluster; K-CAP uses a Java applet to connect to the management node through a predefined URL address in the cluster

    PARMON: a comprehensive environment for monitoring large clusters; uses client-server techniques (parmon-server & parmon-client) to provide transparent access to all nodes to be monitored


    Cluster Applications


    Cluster Applications

    Numerous scientific & engineering applications

    Business applications: e-commerce applications (Amazon, eBay); database applications (Oracle on clusters)

    Internet applications: ASPs (Application Service Providers); computing portals; e-commerce and e-business

    Mission-critical applications: command-control systems, banks, nuclear reactor control, star-wars, and handling life-threatening situations


    Some Cluster Systems: Comparison

    | Project | Platform | Communications | OS | Other |
    | Beowulf | PCs | Multiple Ethernet with TCP/IP | Linux and Grendel | MPI/PVM, Sockets and HPF |
    | Berkeley NOW | Solaris-based PCs and workstations | Myrinet and Active Messages | Solaris + GLUnix + xFS | AM, PVM, MPI, HPF, Split-C |
    | HPVM | PCs | Myrinet with Fast Messages | NT or Linux connection and global resource manager + LSF | Java-fronted, FM, Sockets, Global Arrays, SHMEM and MPI |
    | Solaris MC | Solaris-based PCs and workstations | Solaris-supported | Solaris + Globalization layer | C++ and CORBA |


    Many Types of Clusters

    High Performance Clusters: Linux clusters; 1000 nodes; parallel programs; MPI

    Load-leveling Clusters: move processes around to borrow cycles (e.g., MOSIX)

    Web-Service Clusters: LVS/Piranha; load-level TCP connections; replicate data

    Storage Clusters: GFS; parallel filesystems; same view of data from each node

    Database Clusters: Oracle Parallel Server

    High Availability Clusters: ServiceGuard, LifeKeeper, FailSafe, heartbeat, failover clusters


    Cluster Design Issues

    Enhanced performance (performance @ low cost)

    Enhanced availability (failure management)

    Single system image (look-and-feel of one system)

    Size scalability (physical & application)

    Fast communication (networks & protocols)

    Load balancing (CPU, net, memory, disk)

    Security and encryption (clusters of clusters)

    Distributed environment (social issues)

    Manageability (administration and control)

    Programmability (simple API if required)

    Applicability (cluster-aware and non-aware applications)


    Summary: Cluster Advantages

    Low price/performance ratio compared with a dedicated parallel supercomputer

    Incremental growth that often matches demand patterns

    The provision of a multipurpose system: scientific, commercial, and Internet applications

    Clusters have become mainstream enterprise computing systems: in the 2003 Top 500 Supercomputers list, over 50% of the systems are based on clusters, and many of them are deployed in industry


    References

    High Performance Cluster Computing: Architectures and Systems, edited by Rajkumar Buyya; slides by Hai Jin and Raj Buyya, http://www.buyya.com/cluster

    Lecture materials for Topik Dalam Komputasi Paralel (Topics in Parallel Computing), Fasilkom UI