Data View of TeraGrid Logical Site Model
TRANSCRIPT
SAN DIEGO SUPERCOMPUTER CENTER, UCSD
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE
TeraGrid: Logical Site Model
Chaitan Baru, Data and Knowledge Systems
San Diego Supercomputer Center
National Science Foundation TeraGrid
• Prototype for Cyberinfrastructure (the “lower” levels)
• High Performance Network: 40 Gb/s backbone, 30 Gb/s to each site
• National Reach: SDSC, NCSA, CIT, ANL, PSC
• Over 20 Teraflops compute power
• Approx. 1 PB rotating storage
• Extending by 2-3 sites in Fall 2003
Services/Software View of Cyberinfrastructure
Hardware
Grid Services & Middleware
Development Tools & Libraries
Applications
• Environmental Science
• High Energy Physics
• Proteomics/Genomics
• …
Domain-specific Cybertools (software)
Shared Cybertools (software)
Distributed Resources (computation, communication, storage, etc.)
SDSC Focus on Data: A Cyberinfrastructure “Killer App”
• Over the next decade, data will come from everywhere
  • Scientific instruments
  • Experiments
  • Sensors and sensornets
  • New devices (personal digital devices, computer-enabled clothing, cars, …)
• And be used by everyone
  • Scientists
  • Consumers
  • Educators
  • General public
• SW environment will need to support unprecedented diversity, globalization, integration, scale, and use
[Figure: data flows from sensors, simulations, instruments, and analysis]
Prototype for Cyberinfrastructure
SDSC Machine Room Data Architecture
• Enable SDSC to be the grid data engine
[Diagram: Blue Horizon, a 4 TF Linux cluster, a Sun F15K, and Power 4 / Power 4 DB nodes (Database Engine, Data Miner, Vis Engine) connected via LAN (multiple GbE, TCP/IP), SAN (2 Gb/s, SCSI), and WAN (30 Gb/s, SCSI/IP or FC/IP) to HPSS, FC disk cache (400 TB), FC GPFS disk (100 TB, 200 MB/s per controller), local disk (50 TB), DBMS disk (~10 TB), and tape silos (6 PB, 32 tape drives at 30 MB/s per drive, 1 GB/s disk to tape)]
• .5 PB disk
• 6 PB archive
• 1 GB/s disk-to-tape
• Support for DB2/Oracle
The TeraGrid Logical Site View
• Ideally, applications / users would like to see:
  • One single computer
  • Global everything: filesystem, HSM, database system
  • With highest possible performance
• We will get there in steps
• Meanwhile, the TeraGrid Logical Site View provides a uniform view of sites
  • A common abstraction supported by every site
Logical Site View
• The Logical Site View is currently provided simply as a set of environment variables
  • It can easily become a set of services
• This is the minimum required to enable a TG application to easily make use of TG storage resources
• However, for “power” users, we also anticipate the need to expose the mapping from logical to physical resources at each site
  • This enables applications to take advantage of site-specific configurations and obtain optimal performance
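The logical-to-physical exposure described above could be as simple as a per-site lookup. The sketch below is purely illustrative: the map contents, paths, and function names are assumptions, not part of any TeraGrid specification.

```python
# Hypothetical sketch: a site exposes its logical-to-physical resource
# mapping as a lookup, so "power" users can target specific resources.
# All names and paths below are illustrative assumptions.

SITE_RESOURCE_MAP = {
    "cluster_scratch": "/gpfs/scratch",    # parallel filesystem scratch
    "global_scratch":  "/global/scratch",  # grid-wide scratch
    "staging":         "/staging/incoming" # staging area for transfers
}

def resolve(logical_name: str) -> str:
    """Map a logical storage name to a site-specific physical path."""
    try:
        return SITE_RESOURCE_MAP[logical_name]
    except KeyError:
        raise ValueError(f"unknown logical resource: {logical_name}")

print(resolve("cluster_scratch"))  # -> /gpfs/scratch
```

A real deployment would presumably populate such a map from site configuration rather than hard-code it.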
Basic Data Operations
• The Data WG has stated as minimum requirements:
  a) The ability for a user to transfer data from any TG storage resource to memory on any TG compute resource, possibly via an intermediate storage resource
  b) The ability to transfer data between any two TG storage resources
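Requirement (a) can be sketched in a few lines: stage the data to an intermediate area, then read it into compute memory. This is a minimal local stand-in, with temporary directories playing the role of TG storage resources; it is not the actual TG transfer machinery.

```python
# Illustrative sketch of requirement (a): storage -> intermediate
# staging -> compute memory. Local directories stand in for TG
# storage resources; names here are assumptions for the sketch.
import os
import shutil
import tempfile

def transfer_via_staging(src_path: str, staging_dir: str) -> bytes:
    """Copy a file to a staging area, then read it into memory."""
    staged = os.path.join(staging_dir, os.path.basename(src_path))
    shutil.copy(src_path, staged)   # storage -> intermediate staging
    with open(staged, "rb") as f:   # staging -> compute memory
        return f.read()

# Demo with temporary directories standing in for two resources.
with tempfile.TemporaryDirectory() as storage, \
     tempfile.TemporaryDirectory() as staging:
    src = os.path.join(storage, "data.bin")
    with open(src, "wb") as f:
        f.write(b"payload")
    assert transfer_via_staging(src, staging) == b"payload"
```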
[Diagram: Logical Site View — compute clusters, a DBMS, an HSM, collection management, scratch space, per-resource staging areas, and a shared “network” staging area]
Environment Variables
• TG_NODE_SCRATCH
• TG_CLUSTER_SCRATCH
• TG_GLOBAL_SCRATCH
• TG_SITE_SCRATCH…?
• TG_CLUSTER_HOME
• TG_GLOBAL_HOME
• TG_STAGING
• TG_PFS
• TG_PFS_GPFS, TG_PFS_PVFS, TG_PFS_LUSTRE
• TG_SRB_STAGING
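An application could consume these variables as shown below. The variable names come from the list above; the preference order and the fallback path are illustrative assumptions, not TG policy.

```python
# Sketch of how an application might consume the Logical Site View
# environment variables, preferring the most local scratch space.
# Preference order and the /tmp fallback are illustrative assumptions.
import os

def scratch_dir() -> str:
    """Pick the most local scratch space defined at this site."""
    for var in ("TG_NODE_SCRATCH", "TG_CLUSTER_SCRATCH",
                "TG_GLOBAL_SCRATCH"):
        path = os.environ.get(var)
        if path:
            return path
    return "/tmp"  # last-resort fallback (illustrative)

print(scratch_dir())
```

Because the variables form a common abstraction across sites, the same code runs unchanged at every TG site.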
Issues Under Consideration
• Suppose a user wants to run computation, C, on data, D
• The TG middleware should automatically figure out:
  • Whether C should move to where D is, or vice versa
  • Whether data, D, should be pre-fetched or “streamed”
  • Whether output data should be streamed to persistent storage, or staged via intermediate storage
  • Whether prefetch/staging time ought to be “charged” to the user or not
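The first decision above amounts to a cost comparison. A back-of-envelope sketch, assuming a single-link model where transfer time is just size over bandwidth (the model, function name, and numbers are illustrative, not TG policy):

```python
# Back-of-envelope sketch of the "move C to D, or D to C" decision:
# compare estimated transfer times for the data versus the code.
# The single-link model and all numbers are illustrative assumptions.

def should_move_data(data_bytes: float, code_bytes: float,
                     bandwidth_bps: float) -> bool:
    """Return True if shipping the data is cheaper than shipping the code."""
    data_time = data_bytes * 8 / bandwidth_bps  # seconds to move D
    code_time = code_bytes * 8 / bandwidth_bps  # seconds to move C
    return data_time < code_time

# A 10 GB dataset vs. a 50 MB application over a 30 Gb/s WAN:
# moving the (much smaller) code wins.
print(should_move_data(10e9, 50e6, 30e9))  # -> False
```

A real scheduler would also weigh queue wait times, prefetch overlap with computation, and whether staging time is charged to the user.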