Open Science Grid


TRANSCRIPT

Rob Gardner • University of Chicago

Open Science Grid
Rocky Mountain HPC Symposium
University of Colorado, August 12, 2015

OSG User Support and Campus Grids area coordinator, [email protected]

OSG: distributed high throughput computing (HTC)

aggregate of individual clusters > 100k cores

● 2014 stats
  ○ 67% the size of XD, 35% of Blue Waters
  ○ 2.5 million CPU hours/day
  ○ 800M hours/year
  ○ 125M hours/year provided opportunistically
● > 1 petabyte of data transferred per day
● 50+ research groups
● thousands of users
● XD service provider for XSEDE

Rudi Eigenmann, Program Director, Division of Advanced Cyberinfrastructure (ACI), NSF CISE

CASC Meeting, April 1, 2015

OSG: distributed high throughput computing (HTC)

OSG for users - HTC profile (vs HPC)

● Does your science workflow involve computations that can be split into many independent jobs?

● Can individual jobs be run on a single processor (as opposed to a single large-scale MPI job simultaneously utilizing many processors)?

● Can your applications be interrupted and restarted at a later time or on another processor or computing site?

● Is your application software “portable”?
● How does computing fit into your science workflow? Are results needed immediately after submission?
● Do your computations require access to, or produce, huge amounts of data?

Login Service: connect users to national ACI

OSG Connect Service

OSG as a campus cluster

★ Login host
★ Job scheduler
★ Software
★ Storage
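In practice, getting started looks like logging in to any campus cluster. A minimal sketch, assuming the OSG Connect login host of the era (hostname illustrative):

$ ssh [email protected]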

Submit jobs to OSG with HTCondor

● Simple HTCondor submission (see the sketch below)
● Complexity hidden from the user
● No grid (X509) certificates required
● Uses HTCondor ClassAd and glidein technology
● DAGMan and other workflow tools
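As an illustration, a minimal HTCondor submit description for a batch of independent jobs might look like the following sketch (the script and file names are hypothetical, and site-specific requirements are omitted):

# short.sub -- hypothetical submit description
universe   = vanilla
executable = short.sh            # user's portable application wrapper
arguments  = $(Process)          # each job receives its own index
output     = log/job.$(Process).out
error      = log/job.$(Process).err
log        = log/job.log
queue 25                         # queue 25 independent jobs

$ condor_submit short.sub
$ condor_q

DAGMan can then chain submissions like this into larger workflows.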

Software & tools on the OSG

● Distributed software file system
● Special module command
  ○ identical software on all clusters
● Common tools & libs
● Curated on demand, continuously
● HTC apps in XSEDE campus bridging yum repo

$ switchmodules oasis    # switch to the OSG-wide (OASIS) software catalog
$ module avail           # list software published to all OSG sites
$ module load R
$ module load namd
$ module purge           # unload all loaded modules
$ switchmodules local    # switch back to the cluster's local modules

https://goo.gl/A7kNkf

Storage service: “Stash”

● Storage service for job input/output data
● Globus server for managed transfers between campus and the OSG
● POSIX access provided on the login host
● 500 TB capacity
● Personalized HTTP endpoint
● Connected to 100 Gbps SciDMZ (I2, ESnet)
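As a sketch of staging data, inputs can be copied through the login host (the ~/stash path is an assumption about how Stash was exposed to users):

$ scp inputs.tar.gz [email protected]:~/stash/

Jobs then read and write the same POSIX path, and Globus or the HTTP endpoint can move results back to campus.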

(diagram: data flow from campus to OSG via Stash)

Connecting campus to national ACI

OSG for resource providers

● Connect your campus users to the OSG
  ○ connect-client: job submit client for the local cluster
  ○ provides “burst”-like capability, sending HTC jobs to shared opportunistic resources
● Connect your campus cluster to the OSG
  ○ Lightweight connect: OSG sends “glidein” jobs to your cluster using a simple user account
    ■ No local software or services needed!
  ○ Large scale: deploy the OSG software stack
    ■ Support more science communities at larger scale
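For the first option, a rough connect-client session might look like this (the subcommand names are assumptions based on the OSG Connect client of the time; treat this as a sketch, not a reference):

$ connect setup             # one-time pairing of the local account with OSG Connect
$ connect submit short.sub  # submit an HTCondor job from the campus login node
$ connect q                 # check job status
$ connect pull              # pull outputs back to the campus cluster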

@OsgUsers
support.opensciencegrid.org
[email protected]

opensciencegrid