grid everywhere - jpgrid.org · maestro: nasa jet propulsion laboratory first steps towards the...

Adrian Cockcroft Chief Architect High Performance & Technical Computing Grid Everywhere Accelerating Innovation: Sun’s HPTC/Grid Strategy Grid Everywhere www.sun.com/solutions/hpc/


Post on 04-Feb-2021


TRANSCRIPT

  • Adrian Cockcroft

    Chief Architect

    High Performance & Technical Computing

    Grid Everywhere

    Accelerating Innovation: Sun's HPTC/Grid Strategy

    www.sun.com/solutions/hpc/

  • Maestro: NASA Jet Propulsion Laboratory

    First Steps Towards the Interplanetary Grid

    - Problem

    Simulate remote control and data collection

    Global web access to image data

    Multi-platform visualization

    - Solution

    Multi-site, multi-organizational collaboration

    Open standard imaging technology

    Java Advanced Imaging and Java 3D

    - Result

    Universal access to scientific data

    Real-time public involvement in science and exploration

    NASA recorded 109 million hits in the first 24 hours

    250,000 Maestro software downloads in 3 weeks

    "I love this program. I have never felt so close to space exploration as I do when I'm poking around it. It is an awe inspiring mission and this software practically lets you touch it."

    E-mail to JPL commenting on the Java 3D and JAI based Maestro software

  • Agenda

    HPTC @ Sun

    Early Adopters

    Grid Technologies

    Grid Alliances and Standards

    Grid Solutions

    Visual Grid

  • What does HPTC mean @ Sun?

    - Perform mathematical modeling of natural phenomena (science and engineering) or business processes (business optimization)

    Crash simulation, risk management, electronic design, gene modeling, weather forecasting, etc.

    - Many possible types of workloads

    Transactions, queries, numerical, visual/multimedia, networking

    - Represent early adopters, often a leading indicator of future IT requirements (64-bit, SMP, Web, Grid, graphics, ...)

  • 8000 Grids using Sun technology

    350,000 CPUs served

  • Sun: HPTC Volume Leader

    - Sun is the volume leader based on IDC data for 2003

    - CY2003 technical server shipments: 89,907 total; Sun #1 with 40.2% market share (36,156 units)

    - CY2003 technical server revenue: $5.4B total; Sun #3 with 18.3% market share ($981M)

    Source: IDC Technical Server Tracker, February 23rd 2004

    [Chart: Technical Server Unit Shipments by quarter, Q402 to Q403; axis 0 to 14,000 units]

    [Chart: Technical Server Revenue $M by quarter, Q302 to Q403; axis 0 to 325]
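    The share figures on this slide can be checked with a little arithmetic. The sketch below uses only the numbers quoted above; the average revenue per unit is derived here and is not stated in the source. Because the $5.4B total is rounded on the slide, the computed revenue share lands slightly below the quoted 18.3%.

```python
# Figures quoted on the slide (IDC Technical Server Tracker, CY2003).
TOTAL_UNITS = 89_907
SUN_UNITS = 36_156
TOTAL_REVENUE_USD = 5.4e9   # rounded on the slide
SUN_REVENUE_USD = 981e6

# Market share as a percentage of units and of revenue.
unit_share = 100 * SUN_UNITS / TOTAL_UNITS
revenue_share = 100 * SUN_REVENUE_USD / TOTAL_REVENUE_USD

# Derived here, not on the slide: average revenue per server shipped.
market_avg_usd = TOTAL_REVENUE_USD / TOTAL_UNITS
sun_avg_usd = SUN_REVENUE_USD / SUN_UNITS

print(f"Sun unit share:      {unit_share:.1f}%")
print(f"Sun revenue share:   {revenue_share:.1f}%")
print(f"Market avg per unit: ${market_avg_usd:,.0f}")
print(f"Sun avg per unit:    ${sun_avg_usd:,.0f}")
```

    A high unit share combined with a lower revenue share means Sun's average revenue per unit is below the market average, which is consistent with the "volume leader" framing.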

  • Early Adopters

  • Sun HPTC Transforms Innovation into Customer Value

    [Diagram labels: Center of Excellence; Proof of Concept; Reference Architectures and Solutions; Customer Ready Systems; Advanced Systems & Services; Solutions Development Methodology; Products and Labs; Technology]

  • Early Adopter Example: Interval Math

    Coordinate research, partner and take it to market

    Interval Components

    - An interval is the span of all numbers between two values [x, y]

    - Interval math manages numerical accuracy and stability issues

    - Sun Fortran has an interval datatype and operations

    - Solver libraries are being researched

    - Interval algorithms exist

    - More scalable, and more efficient (fewer GF to solve)

    The Interval Problem

    Most textbooks say you can't solve nonlinear optimization problems; that is true for point solutions only!

    Key Book: Global Optimization Using Interval Analysis by Eldon Hansen, G. William Walster, ISBN: 0824740599
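    To make the idea of an interval datatype concrete, here is a minimal sketch in Python. It is not Sun Fortran's interval type: the class name `Interval` and its methods are illustrative assumptions, and a production implementation would also round endpoints outward so that floating-point error cannot shrink the enclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; a toy stand-in for a real interval datatype."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Smallest result pairs our low end with their high end, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The extremes of a product always occur at endpoint combinations.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def width(self):
        return self.hi - self.lo


x = Interval(1.0, 2.0)
y = Interval(-1.0, 3.0)

print("x + y =", x + y)
print("x - y =", x - y)
print("x * y =", x * y)
print("x - x =", x - x)  # not [0, 0]: each occurrence of x is treated independently
```

    The last line shows the so-called dependency problem: the result still encloses the true answer, but it is wider than necessary. Interval algorithms are largely about keeping such enclosures tight.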

  • Hill Climb Optimization Example

    Try a hill climb algorithm on a difficult dataset with very narrow and specific optimal solutions: it's not a robust solution.

    [Chart: Telegraph poles on the prairie; axis 0 to 0.9]

    [Chart: Sample Prairie; axis 0 to 0.9]
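    The failure mode on this slide can be reproduced in a few lines. The landscape below is a made-up stand-in for the "telegraph poles on the prairie" data (the pole positions, heights and widths are assumptions, not values from the slide), and `hill_climb` is a plain greedy neighbour search.

```python
# Hypothetical landscape: a low, gentle hill with a few very narrow spikes.
POLES = [(0.235, 0.5), (0.615, 0.9), (0.875, 0.7)]  # (position, height), made up
HALF_WIDTH = 0.002                                   # half-width of each pole


def prairie(x):
    ground = 0.1 * (1.0 - abs(x - 0.4))              # gentle hill peaking at x = 0.4
    poles = (h * max(0.0, 1.0 - abs(x - p) / HALF_WIDTH) for p, h in POLES)
    return max(ground, *poles)


def hill_climb(f, start_index=0, steps=100):
    """Greedy climb over the grid x = k / steps, always moving to the better neighbour."""
    k = start_index
    while True:
        neighbours = [j for j in (k - 1, k + 1) if 0 <= j <= steps]
        best = max(neighbours, key=lambda j: f(j / steps))
        if f(best / steps) <= f(k / steps):
            return k / steps, f(k / steps)           # no neighbour is better: stop
        k = best


found_x, found_value = hill_climb(prairie)
true_value = max(h for _, h in POLES)

print(f"hill climb stopped at x={found_x:.2f} with value {found_value:.3f}")
print(f"tallest pole has height {true_value}")
```

    The climber follows the gentle slope to the top of the low hill and stops there, because every pole is narrower than its step size and is never sampled.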

  • Interval Optimization Example

    An interval solver tells you the range of the possible solution over an interval; here we split the data into four intervals and evaluate each.

    [Chart: Telegraph poles on the prairie; axis 0 to 0.9]

    [Chart: Prairie Interval ranges; axis 0 to 0.9]

  • Interval Optimization Example

    Keep zooming in until the result is a small enough interval, and you end up with a deterministic and correctly bounded solution.

    [Chart: Zoom in on Maximum; axis 0 to 0.9 in steps of 0.05]

    [Chart: Interval Driven True Solution; axis 0 to 0.9]

    The Answer!
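    The split-and-zoom procedure described on these slides can be sketched as a small branch-and-bound loop. This is an illustration under stated assumptions, not the solver from the talk: the landscape `prairie` is made up, and `upper_bound` is a hand-derived bound for this toy function that stands in for a true interval evaluation (a real interval solver would obtain it from interval arithmetic, with outward rounding).

```python
# Hypothetical landscape: a low hill plus narrow spikes (values are made up).
POLES = [(0.235, 0.5), (0.615, 0.9), (0.875, 0.7)]  # (position, height)
HALF_WIDTH = 0.002


def prairie(x):
    ground = 0.1 * (1.0 - abs(x - 0.4))
    poles = (h * max(0.0, 1.0 - abs(x - p) / HALF_WIDTH) for p, h in POLES)
    return max(ground, *poles)


def gap(p, a, b):
    """Distance from point p to the interval [a, b]; zero if p lies inside."""
    return max(a - p, p - b, 0.0)


def upper_bound(a, b):
    """Guaranteed upper bound of prairie over [a, b] (stand-in for interval evaluation)."""
    ground = 0.1 * (1.0 - gap(0.4, a, b))
    poles = (h * max(0.0, 1.0 - gap(p, a, b) / HALF_WIDTH) for p, h in POLES)
    return max(ground, *poles)


def interval_maximize(lo=0.0, hi=1.0, pieces=4, tol=1e-6):
    boxes = [(lo, hi)]
    best_seen = prairie((lo + hi) / 2)               # value actually attained somewhere
    while max(b - a for a, b in boxes) > tol:
        children = []
        for a, b in boxes:                           # split every surviving interval
            w = (b - a) / pieces
            children += [(a + i * w, a + (i + 1) * w) for i in range(pieces)]
        best_seen = max([best_seen] + [prairie((a + b) / 2) for a, b in children])
        # Discard intervals that provably cannot contain the maximum.
        boxes = [(a, b) for a, b in children if upper_bound(a, b) >= best_seen]
    return min(a for a, _ in boxes), max(b for _, b in boxes), best_seen


x_lo, x_hi, best = interval_maximize()
print(f"maximum lies in [{x_lo:.7f}, {x_hi:.7f}], value at least {best:.4f}")
```

    Unlike the hill climb, an interval is only discarded when its bound proves it cannot hold the maximum, so a narrow pole cannot be skipped however coarse the first split is.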

  • Grid Technologies

  • The Strategic Need for Shared Resources

    Industries: Computing Tasks

    - Life Sciences: Discovery informatics, database queries

    - Electronic Design: Simulations, verifications, regression testing

    - Financial Services: Risk and portfolio analysis, simulations

    - Automotive Manufacturing: Crash testing simulations, stress testing, aerodynamics modeling

    - Scientific Research: Large computational problems, collaboration

    - Oil and Gas Exploration: Simulations, seismic analysis, visualization

    - Telecommunications: Enhanced delivery of network services

    - Business Computing: Grid enabled enterprise applications, database and transaction processing

  • Types of Grid

    Sensor Grid

    Desktop Grid

    Cluster Grid

    Enterprise Grid

    Global Grid

    Data Capture

    Computation and Visualization

    Data Storage

    Trade Exchange

    Visual Grid

  • The Value of Grid

    The business benefits of Grid are those of a network-wide consolidation (across line of business boundaries and geographies)

    Enable a service-oriented, automated delivery of IT

    Improved utilization

    - Enables new capabilities and services

    - Lower operational costs of running IT

    Maximize productivity through optimal use of horizontally or vertically scaled systems

  • Sun's Evolutionary Compute Grid Strategy: Expand existing capabilities and integrate new ones

    Cluster Grid: Resource Sharing

    - Simplest Grid deployment

    - Optimal utilization of resources

    - Enables new capabilities and services

    - Lower operational costs for IT

    - Maximize productivity

    Enterprise Grid: Java Web Services, Data Grid

    - Resources shared within the enterprise

    - Policies ensure computing on demand

    - Gives multiple groups seamless access to enterprise resources

    - Meets commercial enterprise standards

    Global Grid: Trade Exchange

    - Resources shared over the Internet

    - Global view of distributed resources and data

    - Trade Exchange for compute capacity and data access

    - Foundation for Utility Computing

  • Grid: Where is your reality today? What is the effect upon your business?

    No grid - Local grid Enterprise grid Global grid

    Failing Surviving Profiting Investing

    Old Reality Current Reality Future Reality

    Lagging - Conforming Exploiting Leading Pioneering

  • Grid Crossing the Chasm

    HPTC has led in adoption of Grid technologies

    [Diagram labels: Early adopters; Chasm; Technical Computing; Clusters; Global Grid; Enterprise Grid]

  • Grid Crossing the Chasm

    HPTC methodologies and expertise are key for Grid to cross the business computing chasm

    [Diagram labels: Early adopters; Chasm; Technical Computing; Clusters; Global Grid; Enterprise Grid; Chasm; Business Computing]

  • Grid Datacenter

    Enterprise Grid System Components

    Virtualized Physical Resources

    Java Web Services

    Utility Computing

    Business Model

    Technology Foundation

  • What it takes

    - Thought Leadership

    Trade exchange model, Utility Computing, Reference Architectures, Infrastructure Solutions, Grid Professional Services practice

    DARPA HPCS (High Productivity Computing Systems) covers administration and developer issues, not just peta-scale performance

    - Expertise

    HPTC experience to assemble solution recipes out of technologies

    Professional services to architect, implement, manage

    - Technology

    Java web services, infrastructure and tools (JES, Java Studio, OSS/J)

    Automatic cluster configuration (Sun Control Station/SDSC ROCKS)

    Optimal matching of workloads to resources (N1 Grid Engine)

    Automated provisioning of resources, applications and services (N1 Grid)

    Advanced platforms (SPARC, Opteron, Solaris, Linux, SAN/NAS, Visualization)

    - Plus alliances and standards...

  • Alliances and Standards

  • Sun Grid Services Environment

    [Diagram labels: Globus Toolkit; Grid Engine Portal; Java Enterprise System; N1 Grid Engine; DRMAA Standard Interface; C, C++, FORTRAN and Java Applications; Network Enabled; Web Service Enabled; Enterprise Grid Enabled; Global Grid Enabled]

  • Alliances and Standards

    - Partners

    Sun has wide-ranging agnostic HW/SW/Services/Channel alliances

    HPTC@Sun has signed up more than 25 interconnect and grid partners

    - Global Grid Forum (GGF), www.gridforum.org

    Sun led the creation of the GGF DRMAA standard

    Sun holds an Area Director position within GGF

    - Globus Alliance, www.globus.org

    Many Globus toolkit deployments are on Sun

    Sun is working closely with Globus on future developments

    - Enterprise Grid Alliance (EGA), www.gridalliance.org

    Sun is a founding member of the EGA

    The EGA is working to specify and standardize Enterprise Grid solutions

    - DMTF and OASIS

    Sun has many representatives and is working to make sure that future Grid standards are truly open and royalty-free

  • Enterprise Grid Alliance (www.gridalliance.org)

    New! April 2004

    - Focus: Practical Solutions for Enterprise Grid

    Example: run General Ledger on a Grid (not a scientific research grid!)

    Short term: database cluster grid and datacenter virtualization

    Medium term: combine to run technical applications in spare cycles

    Long term: global grid integration once standards have matured

    - Board-level Members

    EMC, Fujitsu-Siemens, HP, Intel, NEC, Network Appliance, Oracle, Sun

    - Initial Working Group Proposals

    Reference Model: vocabulary, context, use cases

    Component Provisioning: standardization

    Data Provisioning: standardized high level functionality

    Utility Accounting: use cases, data models, mappings to standards

    Grid Security: mapping to existing security standards

    - Relationships

    Many EGA member companies and people are members of other groups

    Adopt mature standards from GGF, DMTF, OASIS, TMF, SNIA, ITIL...

  • Grid Solutions

  • SDSC ROCKstar Supercomputer: 0 to #201 in under 2 hours!

    SDSC ROCKstar: A 128 node/256 CPU x86 cluster, #201 in the Top 500

    Built live on the SC2003 show floor in 100 minutes, adding a new server every 30 seconds. 200 jobs were submitted to the queue and the cluster began to execute 15 minutes into the build cycle.

  • Grid Computing Components: Both Horizontal Scale-out and Vertical Scale-up

    Sun Fire E25K

    Sun Fire E20K

    Sun Fire E6900

    Sun Fire E4900

    Sun Fire E2900

    Sun Fire V880

    Sun Fire V480

    Sun Fire V440

    Sun Fire Blades

    Sun Fire V20z (Opteron)

    Sun Fire(TM) V210 (SPARC)

    Sun Blade(TM) Workstation

    UltraSPARC

    IA32, AMD64

    UltraSPARC

    Capability Computing

    Services

    Capacity Computing

    Services

    Prebuilt Racks

  • The Grid Architecture Dilemma: Large memory SMP nodes or many thin nodes?

    Workload scales vertically:

    - Parallel codes: OpenMP

    - Large shared memory

    - Top performance

    - Higher acquisition cost

    - Lower development and management complexity & cost

    Workload scales horizontally:

    - Serial and parallel codes: MPI

    - Optimized throughput

    - Lower acquisition cost

    - Higher development and management complexity & cost

    The Deciding Factor: What do the workloads require?

    [Diagram axes: Acquisition Cost / CPU; Complexity Cost / CPU]

  • Interconnect Components: Mapping Out Bandwidth and Latency

    Component: bandwidth; latency; cost

    - CMT On-Chip: xxx GB/s; 0.1 - 0.01 µs; $xxx?

    - Memory: 9.6 - 57 GB/s; 1 - 0.1 µs; $xx,xxx

    - Myri/IB/Q/FL: 0.4 - 4.8 GB/s; 10 - 1 µs; $x,xxx

    - Ethernet: 0.1 GB/s; 100 - 10 µs; $xxx

    [Chart: latency (inverted log scale, 10 ns to 100 µs) against bandwidth (log scale, 0.1 to 1000 GB/s). The latency axis is also labeled by proximity: Load/Store Instruction, Library Call, System Call]
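    One way to read the bandwidth and latency figures together is a first-order cost model: time to move a message is the latency plus the size divided by the bandwidth. The model is a common approximation and an assumption here, not something stated on the slide. The numbers are taken from the slow end of each range above, and the function names are illustrative.

```python
# (bandwidth in GB/s, latency in microseconds), slow end of each range on the slide.
LINKS = {
    "Ethernet": (0.1, 100.0),
    "Myri/IB/Q/FL": (0.4, 10.0),
    "Memory": (9.6, 1.0),
}


def transfer_time_us(message_bytes, bandwidth_gb_s, latency_us):
    """First-order model: fixed latency plus size over bandwidth."""
    bytes_per_us = bandwidth_gb_s * 1e3      # 1 GB/s = 1e9 B/s = 1e3 B/us
    return latency_us + message_bytes / bytes_per_us


def half_performance_bytes(bandwidth_gb_s, latency_us):
    """Message size at which the latency term equals the bandwidth term."""
    return latency_us * bandwidth_gb_s * 1e3


for name, (bw, lat) in LINKS.items():
    small = transfer_time_us(1_000, bw, lat)
    large = transfer_time_us(1_000_000, bw, lat)
    crossover = half_performance_bytes(bw, lat)
    print(f"{name:14s} 1 KB: {small:9.1f} us   1 MB: {large:9.1f} us   "
          f"latency-bound below ~{crossover:,.0f} bytes")
```

    Under this model small messages are dominated by latency and large ones by bandwidth, which is why the slide plots the two on separate axes rather than ranking interconnects by a single number.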

  • Interconnect Components: Mapping Out Bandwidth and Latency

    Estimated/Approximate performance numbers

    [Chart: latency (inverted log scale, 10 ns to 100 µs) against bandwidth (log scale, 0.1 to 1000 GB/s). Plotted points: GBE Solaris 9; GBE Linux; Usermode GBE Linux; 10GBE Linux AMD; 10GBE PCI-E; Quadrics Linux AMD; Myrinet / IB x4 Linux AMD; IB x12 Linux AMD; SMP V20z; SMP V440; SMP E6900; SMP E25K; CMT Future. Limits marked: PCI-X 133 Speed Limit; PCI-Express x8 Speed Limit; MPI call latency Limit]

  • Visual Grid

  • High End Visual System Evolution

    Trend: Disaggregation and Virtualization of Components

    Starting point: Applications run on a single system with rendering distributed across multiple 3D Graphics Accelerators and a hardwired display

    [Diagram labels: Integrated Visual System; Application; OpenGL; Xlib; Xinerama, MDU, CAVElib, etc.; OpenGL Server; X11 Server; DDX; OpenGL DP; 3D Graphics Accelerator (two); Storage System; SAN or Network]

  • High End Visual System Evolution

    Virtualize and Distribute 3D Display Capability

    Virtualize Display: Applications run on a single system with rendering distributed across multiple 3D Graphics Accelerators and a Virtualized Display

    [Diagram labels: Integrated Visual System; Application; OpenGL; Xlib; Xinerama, MDU, CAVElib, etc.; OpenGL Server; X11 Server; DDX; OpenGL DP; 3D Graphics Accelerator (two); Storage System; SAN or Network; Switch Display via Video Switch or Redirect over IP Network; IP; Video]

  • High End Visual Grid System

    Virtualize and Distribute 3D Accelerators

    Applications run on a High Capacity Enterprise Server with rendering distributed via Grid Engine across a visual grid of multiple 3D server nodes

    [Diagram labels: Application Servers; Application; OpenGL; DZGL; Xlib; Xinerama, MDU, CAVElib, etc.; High Bandwidth Grid Interconnect; 3D server (two), each with DZGL, Xlib, OpenGL Server, X11 Server, DDX, OpenGL DP and 3D Graphics Accelerator; X11 Server; Storage System; SAN or Network; Switch Display via Video Switch or Redirect over IP Network (two); IP; Video]

  • High End Visual Grid System

    Virtualize and Distribute Application and Visualization

    Applications virtualized using Grid Engine on a cluster of any kind of servers with rendering distributed across multiple 3D server nodes

    [Diagram labels: Application Server Cluster; Application; OpenGL; DZGL; Xlib; Xinerama, MDU, CAVElib, etc.; High Bandwidth Grid Interconnect; 3D server (two), each with DZGL, Xlib, OpenGL Server, X11 Server, DDX, OpenGL DP and 3D Graphics Accelerator; X11 Server; Storage System; SAN or Network; Switch Display via Video Switch or Redirect over IP Network (two); IP; Video]

  • The University of Texas at Austin: Visualization at the ACES Visualization Lab

    "The speed of graphics and superior image quality of the Sun Fire V880z visualization system will play a significant role in these key areas of technology and research."

    Dr. Kelly Gaither, Associate Director of the Texas Advanced Computing Center

    - Visualization for a range of applications including computational fluid dynamics, bioinformatics and geophysical modeling

    - Complex rendering and multiprocessing of graphics and images for researchers in a diverse range of fields

    - Sun Fire V880z and a Scalable Distributed Sun Fire 6800 visualization system

  • Sun Grid Reference Architectures: Proven Solutions, Updated Regularly

    [Diagram labels: Data; Compute and Visualization; Grid System Software]

  • Implementing a Grid Solution

    - Compute Grid Infrastructure Solution

    Sun Fire V20z based Compute Grid Rack System: Opteron

    Sun Fire E25K Supercluster: 144 SPARC cores/node

    Customer Ready Systems: factory built to specification

    StarterCluster for Bioinformatics: low cost entry point

    - Grid Reference Architecture

    Designed, tested, tuned, benchmarked and documented

    Opteron with Gbit Ethernet, Myrinet and InfiniBand

    Scalable NFS using clustered QFS

    Sun BluePrint: Implementing Globus on Solaris

    - Grid Professional Services Worldwide Practice

    Workshop, architecture assessment, design and implementation

  • Case Study: Celltech, Life Sciences

    - Better

    Up to 99% utilization

    Heterogeneous resources

    - Faster

    6-10x improvement in productivity

    - Cheaper

    Lower cost of software licenses

    "Sun enabled us to meet our peak demand turnaround requirements and run algorithms that simply weren't possible before."

    Alan Hart, Celltech

  • Sun's Grid Ecosystem

    Grid System Software

    Grid Data Center

    Global Grid Practice

    Alliances

    Data Compute Visual

  • Adrian Cockcroft

    [email protected] Everywhere.

    www.sun.com/solutions/hpc/

    Grid Everywhere

    Sun's HPTC/Grid Strategy