
Page 1: Australian Partnership for Advanced Computing

Australian Partnership for Advanced Computing

“providing advanced computing and grid infrastructure for eResearch”

Rhys Francis, Manager, APAC grid program

Partners:
• Australian Centre for Advanced Computing and Communications (ac3) in NSW
• The Australian National University (ANU)
• Commonwealth Scientific and Industrial Research Organisation (CSIRO)
• Interactive Virtual Environments Centre (iVEC) in WA
• Queensland Parallel Supercomputing Foundation (QPSF)
• South Australian Partnership for Advanced Computing (SAPAC)
• The University of Tasmania (TPAC)
• Victorian Partnership for Advanced Computing (VPAC)

Page 2: Australian Partnership for Advanced Computing

APAC Programs

• National Facility Program
  – a world-class advanced computing service
  – currently 232 projects and 659 users (27 universities)
  – major upgrade in capability (1650-processor Altix 3700 system)
• APAC Grid Program
  – integrate the National Facility and Partner Facilities
  – allow users easier access to the facilities
  – provide an infrastructure for Australian eResearch
• Education, Outreach and Training Program
  – increase skills to use advanced computing and grid systems
  – courseware project
  – outreach activities: national and international activities

Page 3: Australian Partnership for Advanced Computing

[Organisation chart: a Steering Committee oversees APAC Grid Development (Research Activities and Development Activities, each with Project and Research Leaders) and APAC Grid Operation, supported by an Engineering Taskforce and an Implementation Taskforce.]

Scale: 140 people, >50 full-time equivalents, $8M p.a. in people, plus compute/data resources.

Page 4: Australian Partnership for Advanced Computing

Projects

Grid Infrastructure
• Computing Infrastructure
  – Globus middleware
  – certificate authority
  – system monitoring and management (grid operation centre)
• Information Infrastructure
  – resource broker (SRB)
  – metadata management support (Intellectual Property control)
  – resource discovery
• User Interfaces and Visualisation Infrastructure
  – portals to application software
  – workflow engines
  – visualisation tools

Grid Applications
• Astronomy
• High-Energy Physics
• Bioinformatics
• Computational Chemistry
• Geosciences
• Earth Systems Science
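The "resource discovery" service listed above can be pictured as querying a registry of sites by advertised capability. A minimal sketch in Python; the registry entries and capability names are made up for illustration, not a real APAC registry:

```python
# Illustrative registry: each site advertises a set of capabilities.
REGISTRY = [
    {"site": "vpac", "capabilities": {"gt2", "srb", "portal"}},
    {"site": "ivec", "capabilities": {"gt2", "gt4", "srb"}},
    {"site": "anu",  "capabilities": {"gt2", "srb"}},
]

def discover(capability):
    """Return the sites advertising a given capability."""
    return [r["site"] for r in REGISTRY if capability in r["capabilities"]]

gt4_sites = discover("gt4")   # only iVEC advertises GT4 in this mock
```

A real deployment would answer such queries from an information service (e.g. MDS) rather than a static list, but the lookup pattern is the same.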

Page 5: Australian Partnership for Advanced Computing

Organisation Chart

Program Manager: Rhys Francis
APAC Executive Director: John O’Callaghan

Research projects (Project Leader / Steering Committee Chair):
• Astronomy – Gravity Wave: Susan Scott / Rachael Webster
• Astrophysics portal: Matthew Bailes / Rachael Webster
• Australian Virtual Observatory: Katherine Manson / Rachael Webster
• Genome annotation: Matthew Bellgard / Mark Ragan
• Molecular docking: Rajkumar Buyya / Mark Ragan
• Chemistry workflow: Andrey Bliznyuk / Brian Yates
• Earth Systems Science workflow: Glenn Hyland / Andy Pitman
• Geosciences workflow: Robert Woodcock / Scott McTaggart
• EarthBytes: Dietmar Muller / Scott McTaggart
• Experimental high energy physics: Glenn Moloney / Tony Williams
• Theoretical high energy physics: Paul Coddington / Tony Williams
• Remote instrument management: Chris Willing / Bernard Pailthorpe

Service projects (Project Leader / Services):
• Gateway VM Compute Infrastructure – David Bannon: CA, NG1, NG2, VOMS/VOMRS, Gram2/4
• Information Infrastructure – Ben Evans: SRB, NGdata, GridFTP, MDS2/4
• UI & VI – Rajesh Chabbra: Gridsphere, NGportal, Myproxy
• Collaboration Services – Chris Willing: A/G

Partner contacts:
• Youzhen Cheng (ac3), David Baldwyn (ANU), Bob Smart (CSIRO), Darran Carey (iVEC), Martin Nicholls (QPSF/UQ), Grant Ward (SAPAC), John Dalton (TPAC), Chris Samuel (VPAC)

Associated grid nodes:
• David Green (QPSF/Griffith), Ian Atkinson (QPSF/JCU), Ashley Wright (QPSF/QUT), Marco La Rosa (UoM)

Other roles:
• Gateway Servers Support Team: David Bannon
• Services Architect: Markus Buchhorn
• LCG VM: Marco La Rosa

Responsibility areas: Strategic Management; Middleware Deployment; Research Applications; Systems Management; Infrastructure Support (Middleware); Infrastructure Support (Systems); Application Support.

Page 6: Australian Partnership for Advanced Computing

Experimental High Energy Physics

• Belle Physics Collaboration
  – KEK B-factory detector, Tsukuba, Japan
  – matter/anti-matter investigations
  – 45 institutions, 400 users worldwide
  – 10 TB data currently
• Australian grid for KEK-B data
  – testbed demonstrations
  – data grid centred on APAC National Facility
• Atlas Experiment
  – Large Hadron Collider (LHC) at CERN
  – 3.5 PB data per year (now 15 PB p.a.), operational in 2007
  – installing LCG (GridPP), will follow EGEE

Page 7: Australian Partnership for Advanced Computing

Belle Experiment

• Simulated collisions or events
  – used to predict what we’ll see (features of data)
  – essential to support design of systems
  – essential for analysis
• 2 million lines of code

Page 8: Australian Partnership for Advanced Computing

Belle simulations

• Computationally intensive – simulate beam particle collisions, interactions, decays
  – all components and materials: 10 x 10 x 20 m at 100 µm accuracy
  – tracking and energy deposition through all components
  – all electronics effects (signal shapes, thresholds, noise, cross-talk)
  – data acquisition system (DAQ)
• Need 3 times as many simulated events as real events to reduce statistical fluctuations
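The "3 times as many" rule of thumb can be illustrated numerically: when an efficiency or background shape is taken from Monte Carlo, the MC sample contributes its own statistical uncertainty, which shrinks as the MC sample grows relative to the data. A simplified sketch (the quadrature combination below is an illustration of the scaling, not Belle's actual error model):

```python
import math

def combined_relative_error(n_data, mc_factor):
    """Fractional statistical error when a correction is taken from MC:
    data term 1/sqrt(N) plus MC term 1/sqrt(k*N), added in quadrature
    (simplified illustration of the scaling)."""
    data_term = 1.0 / math.sqrt(n_data)
    mc_term = 1.0 / math.sqrt(mc_factor * n_data)
    return math.sqrt(data_term**2 + mc_term**2)

n = 1_000_000
err_1x = combined_relative_error(n, 1)  # MC sample equal to data
err_3x = combined_relative_error(n, 3)  # MC sample 3x the data
```

With equal samples the MC term inflates the error by about 41%; at 3x it adds only about 15%, after which further MC gives diminishing returns.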

Page 9: Australian Partnership for Advanced Computing

Belle status

• Apparatus at KEK in Japan
• Simulation work done worldwide
• Shared using an SRB federation: KEK, ANU, VPAC, Korea, Taiwan, Krakow, Beijing… (led by Australia!)
• Previous research work used script-based workflow control; the project is currently evaluating LCG middleware for workflow management
• Testing in progress: LCG job management, APAC grid job execution (2 sites), APAC grid SRB data management (2 sites), with data flow using international SRB federations
• The limitation is international networking

Page 10: Australian Partnership for Advanced Computing

Earth Systems Science Workflow

• Access to Data Products
  – Inter-governmental Panel on Climate Change scenarios of future climate (3 TB)
  – Ocean Colour Products of the Australasian and Antarctic region (10 TB)
  – 1/8 degree ocean simulations (4 TB)
  – weather research products (4 TB)
  – Earth Systems Simulations
  – terrestrial land surface data
• Grid Services
  – Globus-based version of OPeNDAP (UCAR/NCAR/URI)
  – server-side analysis tools for data sets: GrADS, NOMADS
  – client-side visualisation from on-line servers
  – THREDDS (catalogues of OPeNDAP repositories)
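OPeNDAP lets a client pull just a subset of a remote dataset by appending a constraint expression to the dataset URL, which is what makes server-side subsetting of multi-terabyte archives practical. A sketch of building such a URL; the endpoint and variable name are hypothetical, and index ranges are inclusive as in the DAP2 convention:

```python
def opendap_subset_url(base_url, variable, ranges):
    """Build an OPeNDAP constraint-expression URL requesting a
    hyperslab of one variable, e.g. temp[0:11][100:200][300:400]."""
    slab = "".join(f"[{lo}:{hi}]" for lo, hi in ranges)
    return f"{base_url}.dods?{variable}{slab}"

# Hypothetical ocean-simulation endpoint (illustrative only)
url = opendap_subset_url(
    "http://example.org/dods/ocean_eighth_degree",
    "temp",
    [(0, 11), (100, 200), (300, 400)],
)
```

Tools such as GrADS or netCDF clients generate these constraints automatically; the point is that only the requested hyperslab crosses the network.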

Page 11: Australian Partnership for Advanced Computing

Workflow Vision

[Diagram: Discovery and Visualisation tools sit above a Digital Library fed by a Crawler; an Analysis Toolkit and Job/Data Management layer connect to OPeNDAP servers at the partner sites (APAC NF, VPAC, ac3, SAPAC, iVEC).]

Page 12: Australian Partnership for Advanced Computing

Workflow Components

[Diagram, by layer:
• Gridsphere Portal – Discovery, Visualisation, Get Data and Analysis Toolkit portlets
• Web Services – Web Map Service, Web Processing Service, Web Coverage Service, OAI
• Application Layer – Library API (Java), Live Access Server (LAS), OPeNDAP Server, Processing App., Metadata, Crawler, Config Metadata
• Data Layer – Digital Repository
• Hardware Layer – Compute Engine]

Page 13: Australian Partnership for Advanced Computing

OPeNDAP Services

• APAC NF (Canberra): international IPCC model results (10–50 TB); TPAC 1/8 degree ocean simulations (7 TB)
• Met Bureau Research Centre (Melbourne): near real-time LAPS analysis products (<1 GB); sea- and sub-surface temperature products
• TPAC & ACE CRC (Hobart): NCEP2 (150 GB); WOCE3 Global (90 GB); Antarctic AWS (150 GB); climate modelling (4 GB); sea-ice simulations, 1980–2000
• CSIRO Marine Research (Hobart): ocean colour products & climatologies (1 TB); satellite altimetry data (<1 GB); sea-surface temperature product
• CSIRO HPSC (Melbourne): IPCC CSIRO Mk3 model results (6 TB)
• AC3 Facility (Sydney): land surface datasets

Page 14: Australian Partnership for Advanced Computing
Page 15: Australian Partnership for Advanced Computing

Australian Virtual Observatory

[Diagram: a User's query goes to a Registry (AVD) and fans out to SIA/SSA services, each returning a list of matches; the selected SSA service then fetches data via SRB get() calls against an MCAT-backed SRB store.]

Page 16: Australian Partnership for Advanced Computing
Page 17: Australian Partnership for Advanced Computing
Page 18: Australian Partnership for Advanced Computing

APAC Grid Geoscience

• Conceptual models
• Databases
• Modeling codes
• Mesh generators
• Visualization packages
• People
• High performance computers
• Mass storage facilities

[Diagram: an earth-system cross-section showing Core, Deep Mantle, Upper Mantle, Oceanic Lithosphere, Subcontinental Lithosphere, Oceanic Crust, Lower Crust, Upper Crust, Sediments, Oceans, Biosphere, Atmosphere, and weathering.]

Page 19: Australian Partnership for Advanced Computing

Mantle Convection

• Observational Databases – access via SEE Grid Information Services standards
• Earthbytes 4D Data Portal – allows users to track observations through geological time and use them as model boundary conditions and/or to validate process simulations
• Mantle Convection – solved via Snark on HPC resources
• Modeling Archive – stores the problem descriptions so they can be mined and audited

Trial application provided by:
• D. Müller (Univ. of Sydney)
• L. Moresi (Monash Univ./MC2/VPAC)

Page 20: Australian Partnership for Advanced Computing

Workflows and services

[Diagram: a User logs in, edits a problem description, runs the simulation and monitors the job through an AAA Job Management Service, which drives the EarthBytes and Snark services. Supporting components: Resource Registry, Service Registry, Results Archive, Data Management Service, HPC Repository, Local Repository and Archive Search, over datasets such as Geology S.A., Geology W.A., Rock Properties N.S.W. and Rock Properties W.A.]
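The service chain on this slide – boundary conditions from EarthBytes, simulation in Snark, results into an auditable archive – can be sketched as three chained calls. The function bodies below are stand-ins; only the service names come from the slide:

```python
def earthbytes_service(time_ma):
    # Stand-in for the EarthBytes 4D data portal: returns boundary
    # conditions for a reconstruction time (millions of years ago).
    return {"time_ma": time_ma, "plates": ["AUS", "ANT"]}

def snark_service(boundary):
    # Stand-in for the Snark mantle-convection solver on HPC resources.
    return {"input": boundary, "converged": True}

def archive_service(store, description, result):
    # Stand-in for the modelling archive: keeps the problem description
    # alongside the result so runs can later be mined and audited.
    run_id = len(store)
    store[run_id] = {"description": description, "result": result}
    return run_id

store = {}
bc = earthbytes_service(time_ma=50)
result = snark_service(bc)
run_id = archive_service(store, {"model": "mantle-convection"}, result)
```

Archiving the problem description with the result, not just the output, is what makes the runs auditable.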

Page 21: Australian Partnership for Advanced Computing

Status update

APAC National Grid

Page 22: Australian Partnership for Advanced Computing

Key steps

• Implementation of our own CA

• Adoption of VDT middleware packaging

• Agreement to a GT2 base for 2005, GT4 in 2006

• Agreement on portal implementation technology

• Adoption of federated SRB as base for shared data

• Development of gateways for site grid architecture

• Support for inclusion of ‘associated’ systems

• Implementation of VOMS/VOMRS

• Development of user and provider policies
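The VOMS/VOMRS step above boils down to a policy shape worth making concrete: a virtual organisation maps member identities (certificate subject DNs) to roles, and a service grants an action only if the requester holds a matching role. A minimal sketch; the VO name, DNs and roles are illustrative, not APAC's real ones:

```python
# Illustrative VO membership table: VO -> subject DN -> roles.
VO_ROLES = {
    "belle": {
        "/C=AU/O=APACGrid/CN=Alice": ["production"],
        "/C=AU/O=APACGrid/CN=Bob": ["user"],
    },
}

def authorise(vo, subject_dn, required_role):
    """Return True if the certificate subject holds the role in the VO."""
    return required_role in VO_ROLES.get(vo, {}).get(subject_dn, [])
```

In a real deployment the roles travel inside the proxy certificate as VOMS attributes rather than being looked up in a table, but the authorisation decision has this form.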

Page 23: Australian Partnership for Advanced Computing

VDT components

First release:
• DOE and LCG CA Certificates v4 (includes LCG 0.25 CAs)
• GriPhyN Virtual Data System 1.2.14 (containing Chimera and Pegasus)
• Condor/Condor-G 6.6.7 with VDT Condor configuration script
• Fault Tolerant Shell (ftsh) 2.0.5
• Globus Toolkit 2.4.3 + patches with VDT Globus configuration script
• GLUE Schema 1.1, extended version 1
• GLUE Information Providers, CVS version 1.79 (4 April 2004)
• EDG Make Gridmap 2.1.0
• EDG CRL Update 1.2.5
• GSI-Enabled OpenSSH 3.4
• Java SDK 1.4.2_06
• KX509 20031111
• Monalisa 1.2.12
• MyProxy 1.11
• PyGlobus 1.0.6
• UberFTP 1.3
• RLS 2.1.5
• ClassAds 0.9.7
• Netlogger 2.2

Later release:
• Apache HTTPD v2.0.54
• Apache Tomcat v4.1.31 and v5.0.28
• Clarens v0.7.2
• ClassAds v0.9.7
• Condor/Condor-G v6.7.12 with VDT Condor configuration script
• DOE and LCG CA Certificates v4 (includes LCG 0.25 CAs)
• DRM v1.2.9
• EDG CRL Update v1.2.5
• EDG Make Gridmap v2.1.0
• Fault Tolerant Shell (ftsh) v2.0.12
• Generic Information Provider v1.2 (2004-05-18)
• gLite CE Monitor v1.0.2
• Globus Toolkit, pre web-services, v4.0.1 + patches
• Globus Toolkit, web-services, v4.0.1
• GLUE Schema v1.2 draft 7
• Grid User Management System (GUMS) v1.1.0
• GSI-Enabled OpenSSH v3.5
• Java SDK v1.4.2_08
• jClarens v0.6.0
• jClarens Web Service Registry v0.6.0
• JobMon v0.2
• KX509 v20031111
• Monalisa v1.2.46
• MyProxy v2.2
• MySQL v4.0.25
• Nest v0.9.7-pre1
• Netlogger v3.2.4
• PPDG Cert Scripts v1.6
• PRIMA Authorization Module v0.3
• PyGlobus vgt4.0.1-1.13
• RLS v3.0.041021
• SRM Tester v1.0
• UberFTP v1.15
• Virtual Data System v1.4.1
• VOMS v1.6.7
• VOMS Admin v1.1.0-r0 (client 1.0.7, interface 1.0.2, server 1.1.2)

Page 24: Australian Partnership for Advanced Computing

Our most important design decision

[Diagram: two sites, each with clusters and a datastore on a V-LAN behind a Gateway Server, linked by dedicated networking.]

• Installing Gateway Servers at all grid sites, using VM technology to support multiple grid stacks
• Gateways will support GT2, GT4, LCG/EGEE, data grid (SRB etc.), production portals, development portals and experimental grid stacks
• High-bandwidth, dedicated private networking between grid sites

Page 25: Australian Partnership for Advanced Computing

Gateway Systems

• Support the basic operation of the APAC National Grid and translate grid protocols into site-specific actions
  – limit the number of systems that need grid components installed and managed
  – enhance security, as many grid protocols and associated ports only need to be open between the gateways
  – in many cases only the local gateways need to interact with site systems
  – support roll-out and control of the production grid configuration
  – support production and development grids and local experimentation using Virtual Machine implementation
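The gateway's "translate grid protocols into site-specific actions" role can be sketched as a per-site adapter table: a single grid-level request maps onto whatever the local batch system expects, so only the gateway speaks both languages. Site names and commands below are illustrative (a PBS-style and a LoadLeveler-style site), not the actual APAC configuration:

```python
# Illustrative per-site adapters: grid-level action -> local command.
SITE_ADAPTERS = {
    "vpac": {"submit": "qsub {script}", "status": "qstat {job_id}"},
    "anu":  {"submit": "llsubmit {script}", "status": "llq {job_id}"},
}

def translate(site, action, **params):
    """Map a grid-level action onto the site's local command line."""
    template = SITE_ADAPTERS[site][action]
    return template.format(**params)

cmd = translate("vpac", "submit", script="run_belle.sh")
```

Because the adapters live only on the gateways, internal cluster nodes never need grid middleware installed, which is the security and management win the slide describes.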

Page 26: Australian Partnership for Advanced Computing
Page 27: Australian Partnership for Advanced Computing

Grid pulse – every 30 minutes

[Gateway heartbeat list: 19 gateways reported, 14 Up and 5 Down; contact addresses redacted in this copy.]

• NG1 – Globus Toolkit 2 services: ANU, iVEC, VPAC
• NG2 – Globus Toolkit 4 services: iVEC, SAPAC (down), VPAC
• NGDATA – SRB & GridFTP: ANU, iVEC, VPAC (down)
• NGLCG – special physics stack: VPAC
• NGPORTAL – Apache/Tomcat: iVEC, VPAC

http://goc.vpac.org/
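A grid-pulse summary of the kind shown above amounts to folding the half-hourly "Gateway Up/Down" lines into per-state counts so an operator sees availability at a glance. A sketch, with the input format modelled on the slide and the hostnames invented for the example:

```python
def summarise_pulse(lines):
    """Count gateways by reported state; returns {'Up': n, 'Down': m}."""
    counts = {"Up": 0, "Down": 0}
    for line in lines:
        state = line.split()[1]   # "Gateway Up host" -> "Up"
        counts[state] += 1
    return counts

sample = [
    "Gateway Down ng2.example-sapac.edu.au",  # hostnames illustrative
    "Gateway Up ng1.example-vpac.org",
    "Gateway Up ngdata.example-ivec.org",
]
counts = summarise_pulse(sample)
```

The real grid operation centre page (http://goc.vpac.org/) presented exactly this kind of rollup per service stack.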

Page 28: Australian Partnership for Advanced Computing

A National Grid

[Map: the GrangeNet backbone, a Centie/GrangeNet link and AARNet links connect the sites – Townsville (QPSF), Brisbane, Canberra (ANU), Melbourne (VPAC, CSIRO), Sydney (ac3), Perth (iVEC, CSIRO), Adelaide (SAPAC), Hobart (TPAC, CSIRO).]

+3500 processors
+3 PB near-line storage

Page 29: Australian Partnership for Advanced Computing

Significant Resource Base

Mass stores (15 TB cache, 200+ TB holdings, 3 PB capacity):
• ANU 5 + 1300 TB, CSIRO 5 + 1300 TB, plus several 70–100 TB stores

Compute systems (aggregate 3500+ processors):
• Altix: 1,680 x 1.6 GHz Itanium-II, 3.6 TB memory, 120 TB disk
• NEC: 168 SX-6 vector CPUs, 1.8 TB memory, 22 TB disk
• IBM: 160 Power5 CPUs, 432 GB memory
• 2 x Altix: 160 x 1.6 GHz Itanium-II, 160 GB memory
• 2 x Altix: 64 x 1.5 GHz Itanium-II, 120 GB memory, NUMA
• Altix: 128 x 1.3 GHz Itanium-II, 180 GB memory, 5 TB disk, NUMA
• 374 x 3.06 GHz Xeon, 374 GB memory, Gigabit Ethernet
• 258 x 2.4 GHz Xeon, 258 GB memory, Myrinet
• 188 x 2.8 GHz Xeon, 160 GB memory, Myrinet
• 168 x 3.2 GHz Xeon, 224 GB memory, GigE (28 with InfiniBand)
• 152 x 2.66 GHz P4, 153 GB memory, 16 TB disk, GigE

Page 30: Australian Partnership for Advanced Computing

Functional decomposition

Dimensions: Resources and Users, across Data, Compute, Monitoring, Constraints, Activities and Interfaces.

Functions:
• Authentication (identity management); authorisation (policy and enforcement)
• VO management (rights, shares, delegations)
• Resource discovery; resource registration; configuration management
• Resource availability; global resource allocation and scheduling; queues
• Grid staging and execution; workflow processing (job execution); progress monitoring
• Data movement; data and metadata management (curation); files, DBs, streams
• Binaries, libraries, licenses
• Portals and workflow access services; command-line access to resources; grid interfaces; 3rd-party GUIs for applications and activities
• Application development; AccessGrid interaction
• Portal for grid management (GOC); accounting; history and auditing; reporting, analysis and summarisation
• Operating systems and hardware; firewalls, NATs and physical networks
• Security: agreements, obligations, standards, installation, configuration, verification

Page 31: Australian Partnership for Advanced Computing

[Repeats the functional decomposition of the previous slide, with the functions grouped into six numbered areas.]

Page 32: Australian Partnership for Advanced Computing

APAC National Grid

• Basic Services
  – single sign-on to the facilities
  – portals to the computing and data systems
  – access to software on the most appropriate system
  – resource discovery and monitoring

[Diagram: VPAC, QPSF, TPAC, iVEC, ANU, CSIRO, SAPAC and AC3, together with the APAC National Facility, presented as one virtual system of computational facilities.]