User-driven resource selection in GRID superscalar
Last developments and future plans in the framework of CoreGRID
Rosa M. Badia
Grid and Clusters Manager
Barcelona Supercomputing Center
http://www.coregrid.net
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Outline
1. GRID superscalar overview
2. User defined cost and constraints interface
3. Deployment of GRID superscalar applications
4. Run-time resource selection
5. Plans for CoreGRID
1. GRID superscalar overview
Programming environment for the Grid
Goals:
– Grid as transparent as possible to the programmer
Approach
– Sequential programming (small changes from original code)
– Specification of the Grid tasks
– Automatic code generation to build Grid applications
– Underlying run-time (resource, file, job management)
1. GRID superscalar overview: interface
GS_On();                                  /* start the GRID superscalar runtime */
for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    subst (referenceCFG, newBWd, newCFG);
    dimemas (newCFG, traceFile, DimemasOUT);
    post (newBWd, DimemasOUT, FinalOUT);
    if (i % 3 == 0) display(FinalOUT);
}
fd = GS_Open(FinalOUT, R);                /* open the result file for reading */
printf("Results file:\n"); present (fd);
GS_Close(fd);
GS_Off();                                 /* stop the runtime */
1. GRID superscalar overview: interface
void dimemas(in File newCFG, in File traceFile, out File DimemasOUT)
{
    char command[500];
    putenv("DIMEMAS_HOME=/usr/local/cepba-tools");
    sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG );
    GS_System(command);
}

void display(in File toplot)
{
    char command[500];
    sprintf(command, "./display.sh %s", toplot);
    GS_System(command);
}
1. GRID superscalar overview: interface
interface MC {
void subst (in File referenceCFG, in double newBW, out File newCFG);
void dimemas (in File newCFG, in File traceFile, out File DimemasOUT);
void post (in File newCFG, in File DimemasOUT, inout File FinalOUT);
void display (in File toplot);
};
1. GRID superscalar overview: code generation
[Figure: gsstubgen takes app.idl and produces app.h, app-stubs.c and app-worker.c. Master: app.c, app-stubs.c. Worker: app-worker.c, app-functions.c.]
1. GRID superscalar overview: behaviour
for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    substitute ("nsend.cfg", newBWd, "tmp.cfg");
    dimemas ("tmp.cfg", "trace.trf", "output.txt");
    postprocess (newBWd, "output.txt", "final.txt");
    if(i % 3 == 0) display("final.txt");
}
[Figure: task graph generated from the loop, with nodes T10, T20, T30, T40, T50, T11, T21, T31, T41, T51, T12, …]
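The task graph on this slide can be approximated with a small sketch of file-based dependence detection. This is not the GRID superscalar runtime code: the `Task` record, `build_dependences` and `one_iteration` are hypothetical names introduced for illustration, and only true (read-after-write) dependences are tracked.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical task record: the files a task reads and the files it writes.
struct Task {
    std::string name;
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// For each task, return the indices of the earlier tasks it must wait for.
// Only read-after-write dependences are considered in this sketch.
std::vector<std::set<std::size_t>> build_dependences(const std::vector<Task>& tasks) {
    std::map<std::string, std::size_t> last_writer;  // file -> last task that wrote it
    std::vector<std::set<std::size_t>> deps(tasks.size());
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        for (const std::string& f : tasks[i].inputs) {
            auto it = last_writer.find(f);
            if (it != last_writer.end()) deps[i].insert(it->second);
        }
        for (const std::string& f : tasks[i].outputs) last_writer[f] = i;
    }
    return deps;
}

// One iteration of the loop on this slide, written as task records.
// postprocess both reads and writes final.txt (an inout parameter).
std::vector<Task> one_iteration() {
    return {
        {"substitute", {"nsend.cfg"}, {"tmp.cfg"}},
        {"dimemas", {"tmp.cfg", "trace.trf"}, {"output.txt"}},
        {"postprocess", {"output.txt", "final.txt"}, {"final.txt"}},
        {"display", {"final.txt"}, {}},
    };
}
```

Calling `build_dependences(one_iteration())` yields a chain: each task depends on the one before it, which matches the vertical structure of one column of the task graph.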
1. GRID superscalar overview: runtime features
Data dependence analysis
File renaming
Shared disks management
File locality exploitation
Resource brokering
Task scheduling
Task submission
Checkpointing at task level
Exception handling
Current version over Globus 2.x, using the API
File transfer, security, … provided by Globus
Ongoing developments of versions:
– Ninf-g2
– ssh/scp
– GT4
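The file renaming feature listed above can be illustrated with a minimal sketch. It assumes that each write to a file gets a fresh versioned name, so that a later writer does not disturb tasks still reading an older version. The `Renamer` class and the `name.N` naming scheme are hypothetical and are not taken from the GRID superscalar runtime.

```cpp
#include <map>
#include <string>

// Hypothetical renaming table: maps a logical file name to its latest version.
class Renamer {
public:
    // Name a reader should use: the latest version written so far,
    // or the original name if nothing has written the file yet.
    std::string for_read(const std::string& file) const {
        auto it = version_.find(file);
        return it == version_.end() ? file : versioned(file, it->second);
    }

    // Name a writer should use: a new version, leaving older versions intact.
    std::string for_write(const std::string& file) {
        int v = ++version_[file];
        return versioned(file, v);
    }

private:
    static std::string versioned(const std::string& file, int v) {
        return file + "." + std::to_string(v);
    }
    std::map<std::string, int> version_;
};
```

With this scheme, two loop iterations that both write `tmp.cfg` would write `tmp.cfg.1` and `tmp.cfg.2`, so the second `substitute` would not have to wait for the first `dimemas` to finish reading.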
2. User defined cost and constraints interface
[Figure: gsstubgen takes app.idl and produces app.h, app-stubs.c and app-worker.c as before, plus app_constraints.cc, app_constraints_wrapper.cc and app_constraints.h. Master: app.c, app-stubs.c. Worker: app-worker.c, app-functions.c.]
2. User defined cost and constraints interface
File app_constraints.cc contains the interface of functions for
– Resource constraints specification
– Performance cost estimation
Sample default functions:
string Subst_constraints(file referenceCFG, double seed, file newCFG) {
    string constraints = "";
    return constraints;
}
double Subst_cost(file referenceCFG, double seed, file newCFG) {
    return 1.0;
}
2. User defined cost and constraints interface
Users can edit and specify constraints and performance cost for each function
– Constraints syntax: Condor ClassAds
– Performance cost syntax: pure C/C++
string Dimem_constraints(file cfgFile, file traceFile) {
    return "(member(\"Dimemas\", other.SoftNameList) && "
           "other.OpSys == \"Linux\" && other.Mem > 1024 )";
}
double Dimem_cost(file cfgFile, file traceFile) {
    double complexity, time;
    complexity = 10.0 * num_processes (traceFile) + 3.0 * no_p_to_p (traceFile)
               + no_collectives (traceFile) + no_machines(cfgFile);
    time = complexity / GS_GFlops();
    return(time);
}
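To make the arithmetic of the cost function concrete, here is a standalone sketch that uses the same weights as `Dimem_cost`. The `TraceStats` struct and `estimate_time` are hypothetical stand-ins: on the slide the four inputs come from `num_processes`, `no_p_to_p`, `no_collectives` and `no_machines`, and the speed comes from `GS_GFlops()`.

```cpp
// Hypothetical summary of a trace and configuration file. On the slide these
// values are obtained by helper functions that inspect the actual files.
struct TraceStats {
    double processes;       // num_processes(traceFile)
    double point_to_point;  // no_p_to_p(traceFile)
    double collectives;     // no_collectives(traceFile)
    double machines;        // no_machines(cfgFile)
};

// Same weighting as Dimem_cost: a complexity figure divided by the
// speed of the candidate resource (GS_GFlops() on the slide).
double estimate_time(const TraceStats& s, double gflops) {
    double complexity = 10.0 * s.processes + 3.0 * s.point_to_point
                      + s.collectives + s.machines;
    return complexity / gflops;
}
```

For a trace with 4 processes, 10 point-to-point operations, 2 collectives and 1 machine, the complexity is 40 + 30 + 2 + 1 = 73, so a resource rated at 2 GFlops gets an estimate of 36.5 and a resource rated at 1 GFlops gets 73.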
3. Deployment of GRID superscalar applications
Java-based GUI
Allows GRID resources specification: host details, libraries location…
Selection of Grid configuration
Grid configuration checking process:
– Aliveness of host (ping)
– Globus service is checked by submitting a simple test
– Sends a remote job that copies the code needed in the worker, and compiles it
Automatic deployment
– Sends and compiles code in the remote workers and the master
Configuration file generation
3. Deployment of GRID superscalar applications
Resource specification (by Grid administrator)
– Done only once: the description of the GRID resources is stored in a hidden XML file
3. Deployment of GRID superscalar applications
Project specification (by user)
– Selection of hosts
Afterwards, the application is automatically deployed
A project configuration XML file is generated
4. Runtime resource selection
Runtime evaluation of the functions
– Constraints and performance cost functions dynamically drive the resource selection
When an instance of a function is ready for execution
– Constraint function is evaluated using the ClassAd library to match resource ClassAds with task ClassAds
– Performance cost function is used to estimate the elapsed time of the function (ET)
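The matching step can be pictured with a plain C++ stand-in for the Dimemas constraint shown earlier. This sketch does not use the ClassAd library: the `Resource` struct, `meets_dimemas_constraints` and `candidates` are hypothetical, and details of real ClassAd evaluation (undefined attributes, string case handling) are ignored.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical resource description, mirroring the attributes used in the
// constraint expression (SoftNameList, OpSys, Mem).
struct Resource {
    std::string host;
    std::vector<std::string> soft_name_list;
    std::string op_sys;
    int mem;  // assumed to be in the same unit as the constraint threshold
};

// Plain C++ stand-in for:
//   member("Dimemas", other.SoftNameList) && other.OpSys == "Linux" && other.Mem > 1024
bool meets_dimemas_constraints(const Resource& r) {
    bool has_dimemas = std::find(r.soft_name_list.begin(), r.soft_name_list.end(),
                                 std::string("Dimemas")) != r.soft_name_list.end();
    return has_dimemas && r.op_sys == "Linux" && r.mem > 1024;
}

// Keep only the resources on which the task is allowed to run.
std::vector<Resource> candidates(const std::vector<Resource>& all) {
    std::vector<Resource> ok;
    std::copy_if(all.begin(), all.end(), std::back_inserter(ok),
                 meets_dimemas_constraints);
    return ok;
}
```

Only the resources returned by `candidates` would then be passed on to the cost estimation step.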
4. Runtime resource selection
For those resources r that meet constraints
FT = File transfer time to resource r
ET = Execution time of task on resource r
$f(r) = FT(r) + ET(r)$
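A short sketch of how f(r) could drive the choice, assuming the runtime picks the resource with the smallest f(r) among those that meet the constraints. The slide defines f but does not state the selection rule, so the minimum is an assumption here, and `Candidate`, `total_time` and `select_resource` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-resource estimates for one task, in seconds.
struct Candidate {
    std::string host;
    double file_transfer_time;  // FT(r)
    double execution_time;      // ET(r), e.g. from the user's cost function
};

// f(r) = FT(r) + ET(r)
double total_time(const Candidate& c) {
    return c.file_transfer_time + c.execution_time;
}

// Index of the candidate with the smallest f(r), or -1 if the list is empty.
// Ties are resolved in favour of the earlier candidate.
int select_resource(const std::vector<Candidate>& candidates) {
    int best = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (best < 0 || total_time(candidates[i]) < total_time(candidates[best])) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

The sketch shows why both terms matter: a fast machine that needs 30 s of file transfer and 10 s of execution (f = 40) loses to a slower one that already holds most of the files and needs 2 s plus 25 s (f = 27).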
4. Runtime resource selection: call sequence
[Figure: call sequence on LocalHost, with app.c and app-functions.c.]
4. Runtime resource selection: call sequence
[Figure: call sequence across LocalHost and RemoteHost over GT2. Labels: app.c, app-stubs.c, GRID superscalar runtime, app_constraints_wrapper.cc, app_constraints.cc, GT2, app-functions.c, app-worker.c.]
5. Plans for CoreGRID WP7
[Figure: WP7 architecture. Labels: Grid-unaware application, Grid-aware application, PSE user portal, integrated toolkit, steering/tuning component, steering, application manager, runtime environment, application meta-data repository, app-level info-cache, monitoring services, information services, resource management.]
5. Plans for CoreGRID task 7.3
Leader: UPC
Participants: INRIA, USTUTT, UOW, UPC, VUA, CYFRONET
Objectives:
– Specification and development of an Integrated Toolkit for the Generic platform
5. Plans for CoreGRID task 7.3
The integrated toolkit will
– provide means for simplifying the development of Grid applications
– allow executing the applications in the Grid in a transparent way
– optimize the performance of the application
5. Plans for CoreGRID task 7.3: subtasks
Design of a component-oriented integrated toolkit
– Applications' basic requirements will be mapped to components, based on the generic platform (task 7.1)
Definition of the interface and requirements with the mediator components
– Performed in close coordination with the definition of the mediator components (task 7.2)
Component communication mechanisms
– Enhancement of the communication of integrated toolkit application components
5. Plans for CoreGRID task 7.3
Ongoing work
– Study of partners' projects
– Definition of Roadmap
– Integration of the PACX-MPI Configuration Manager with the GRID superscalar deployment center
– Specification of GRID superscalar based on the component model
Summary
GRID superscalar has proven to be a good solution for programming grid-unaware applications
New enhancements for resource selection are very promising
Examples of extensions
– Cost driven resource selection
– Limitation of data movement (confidentiality preservation)
Ongoing work in CoreGRID to integrate with component-based platforms and other partners' tools