NMI Testbed GRID Utility for Virtual Organization
Art Vandenberg [email protected]
Director, Advanced Campus Services Georgia State University
NSF Supported
This material is based in part upon work supported by the National Science Foundation under
Grant No. ANI-0123937 and
Grant No. ITR-0312636.
Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
Overview
NMI Testbed GRID – “virtual organization”
Participating sites
Resources for VO
Catalog of grid applications
Example: genome alignment for VO
Plans for May-August 2004
Vision – NMI Testbed GRID VO
NMI Integration Testbed Program
• NSF #ANI-0123937
Explore grid capability – interoperability
• Researchers & faculty
• Across heterogeneous sites
• Integrated with enterprise middleware
A utility grid using NMI components
• Non-specialized, open, transparent
Collaborative environment VO
Beyond application specific grids
Leverage enterprise middleware
• Identity management, authN, authZ...
• Strive for transparent access
Portals
• Ease of use: submit, monitor, retrieve data
Security policy & technology
• Federation of cooperating sites
Your Grid is here, now
We want VO utility grid to be here...
Participating sites - the VO
Testbed sites – push interoperation limits
• Georgia State University
• Texas Advanced Computing Center
• University of Alabama at Birmingham
• University of Alabama at Huntsville
• University of Michigan
• University of Southern California
• University of Virginia
Site resources – VO
Testbed sites – interoperation challenges
• GSU: Shibboleth, GridPort portal, REU & Grads, disk
• TACC: REU student, portal, Enterprise CA, cluster
• UAB: Beowulf cluster, CA, Pubcookie, OGCE portal
• UAH: application expertise, NASA IPG Certs
• UMich: KX.509 & Kerberos, MGrid, ATLAS integration
• USC: CA, Pubcookie, Shibboleth, Linux cluster, KX.509
• UVa: Bridge CA model
Sites non-homogeneous – a VO challenge
Catalog of grid applications
Knowledge base is important
• REU students – Nicole Geiger, Anish Shindore
• Graduate Research Asst – Manish Garg
NMI Testbed Sites initially
• Researchers, schools, projects
• Grid specific as well as grid potential
• Started as spreadsheet, now online db
Catalog of grid applications
Catalog of Grid Applications (current version)
http://art12.gsu.edu:8080/grid_cat/index5.jsp
Expanding scope beyond testbed sites
• 18 schools/labs, 300 researchers & counting
Differentiated from Globus www.gpds.org
• Oriented to researcher, institutional level
• Planning clustering, visualization modality
• Clustering work related to: NSF #ITR-0312636
Example: genome alignment for VO
(GSU – UAB)
An opportunity for utility Grid VO
• Nova Ahmed, CS grad with Dr. Yi Pan, GSU
• Dynamic programming algorithm for genome sequence alignment
Initial runs on GSU shared-memory machine Hydra
• Limited access (grad student, shared cycles)
• Algorithm improvement using a multi-processor cluster across a grid?
The Genome Alignment Problem
• Alignment of DNA sequences
  Sequence X: TGATGGAGGT
  Sequence Y: GATAGG
• Count the matching score as 1 => matching, 0 => non-matching
• Populate the similarity matrix using a dynamic-programming fill rule (sketched below)
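The exact fill rule is not reproduced here; a standard local-alignment recurrence of the kind commonly used for this problem (an assumption, with match score m(i,j) and gap penalty g) is:

  S(i,j) = max{ S(i-1,j-1) + m(i,j),  S(i-1,j) - g,  S(i,j-1) - g,  0 }

where m(i,j) = 1 if the i-th symbol of X matches the j-th symbol of Y and 0 otherwise, with S(i,0) = S(0,j) = 0. The clamp at 0 is consistent with the observation below that many matrix entries are zero.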
Observation re similarity matrix:
• Many zero values
• Reduction of memory possible by reducing zero-value elements
Improved Parallel Algorithm for Genome Alignment
The new data structure:
• New algorithm calculates only non-zero values of the similarity matrix
• Memory is dynamically allocated as needed
The parallel method:
• Similarity matrix is divided among processors
• Processors calculate in parallel to match the partial sequence
• Communication is done among the processors to match the whole sequence
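For illustration only (a sketch under assumed details; the struct and function names below are invented and this is not the project's actual code), the two ideas above can be expressed in C roughly as follows: each row keeps only its non-zero cells in a dynamically grown list, and each processor is assigned a contiguous block of rows, exchanging boundary rows with its neighbour so the whole sequence can be matched.

#include <stdio.h>
#include <stdlib.h>

/* One non-zero entry S(i,j) of the similarity matrix. */
typedef struct { int col; int val; } NonZero;

/* A row stored sparsely: memory grows only as non-zero values appear. */
typedef struct { int n, cap; NonZero *cells; } SparseRow;

static void row_add(SparseRow *r, int col, int val)
{
    if (r->n == r->cap) {                       /* grow storage on demand */
        r->cap = r->cap ? 2 * r->cap : 8;
        r->cells = realloc(r->cells, r->cap * sizeof(NonZero));
    }
    r->cells[r->n].col = col;
    r->cells[r->n].val = val;
    r->n++;
}

/* Block partition of the ny rows over `size` processors; rank `rank`
 * computes rows [first, last).  Neighbouring ranks would exchange their
 * boundary rows (e.g. with MPI_Send/MPI_Recv) to stitch the full match. */
static void my_rows(int ny, int rank, int size, int *first, int *last)
{
    int chunk = (ny + size - 1) / size;
    *first = rank * chunk;
    *last  = (rank + 1) * chunk < ny ? (rank + 1) * chunk : ny;
}

int main(void)
{
    SparseRow row = {0, 0, NULL};
    row_add(&row, 3, 1);                        /* pretend S(i,3) = 1 */
    row_add(&row, 7, 2);                        /* pretend S(i,7) = 2 */
    for (int k = 0; k < row.n; k++)
        printf("col %d -> %d\n", row.cells[k].col, row.cells[k].val);

    int first, last;
    my_rows(6, 0, 2, &first, &last);            /* rank 0 of 2, ny = 6 */
    printf("rank 0 handles rows %d..%d\n", first, last - 1);

    free(row.cells);
    return 0;
}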
Results on the Shared Memory Machine (Hydra)
Performance
Computation time decreases with increased number of processors
[Chart: Computation Time (Shared Memory) vs. Number of Processors (2–12)]
Limitations
• Cannot allocate memory for long sequences
  Ex: largest sequence to align is 2000 x 2000
• Number of processors is limited
  Ex: Hydra has 12 processors
• Not scalable
Results on the Beowulf Cluster at UAB
Using the Beowulf cluster:
• Longer genome sequences can be aligned
  Ex: highest sequence length on the cluster is 10,000
• Limited scalability
  Number of processors can be increased only up to a certain limit
[Chart: Computation Time (sec) vs. Number of Processors (2–30), Cluster vs. Shared Memory]
Results via the GRID at UAB
Advantages:
• Scalable – can add new clusters to the grid
• Easier job submission – don’t need an account on every node
• Scheduling is easier – can submit multiple jobs at one time
Submitting genome alignment program using Globus and MPICH
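As a rough, hedged illustration (the hostname, jobmanager name, file paths, and processor count below are placeholders, not the actual testbed configuration), submitting such an MPI job with the pre-web-services Globus tools might look like:

  # obtain a proxy credential, then submit an 8-way MPI run through a remote jobmanager
  grid-proxy-init
  globusrun -r gridnode.example.edu/jobmanager-pbs \
    '&(executable=/home/user/genome_align)(jobtype=mpi)(count=8)(arguments=seqX.fasta seqY.fasta)'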
[Chart: Computation Time vs. Number of Processors (2–30), Grid vs. Cluster vs. Shared Memory]
Future Work – Genome Alignment
Use MPICH-G2 (instead of MPICH)
• Use the power of the Grid (see the sketch after this list)
Expand the computational resources
• Combine more clusters across the Grid
Develop program to align multiple genome sequences (rather than two at a time)
• Requiring more computation resources
Use Georgia State certificate via Bridge CA
• Via Shibboleth-protected sector CA…?
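One reason moving to MPICH-G2 is attractive: the MPI source itself should not need to change, since MPICH-G2 implements the same MPI interface over Grid services; only the library the program is built against and the way the job is launched differ. A generic sketch of the program shape (not the project's actual source):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this processor's id          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processors in the job  */
    /* Each rank would compute its block of similarity-matrix rows here
     * and exchange boundary rows with its neighbours. */
    printf("rank %d of %d ready\n", rank, size);
    MPI_Finalize();
    return 0;
}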
Plans for May-August 2004
More resources
• Contributed from current sites (others?)
Portal for NMI Testbed GRID
• Cf. NPACI Hotpage https://hotpage.npaci.edu/
Integration of campus authN
• UVa Bridge CA
More applications
• Utility grid for grad research & education
Plans for May-August 2004…
Documentation
• Web site
• Application docs and demos
Catalog of Grid Applications
• Provide for self-service contribution
• Develop clustering (SOM), visualization options (“find researchers or projects like X”)
• Auto-discovery of Grid researchers & apps based on reference sets (core sites)?
Contact Information
Art Vandenberg•[email protected]
NMI Testbed GRID•http://www.gsu.edu/~wwwacs/GRID_Group/NMI.html