Post on 20-Jan-2016
The Open Grid Computing Environments Project
Marlon Pierce
Community Grids Laboratory
Indiana University
Acknowledgements
Funding from NSF NMI (2003-2007) and OCI SDCI (2007-2010).
Current participants: Indiana University (Pierce, Gannon), RENCI (Kandaswamy), RIT (von Laszewski), SDSC (Wilkins-Diehr), SDSU (Thomas, Edwards), TACC (Dahan)
Outline
Web Portals and Science Gateways
OGCE efforts
OGCE Portal Software: portal tools, Java CoG, GTLAB
OGCE Gateway Services: GFAC, GPIR
Software Engineering Issues
What is next?
OGCE Goals
To provide easily installable, well-tested software for building Web client and service components that constitute a Grid Computing Environment.
Science Web Portal --> GCE --> Science Gateway
To support developing groups through training, outreach, and divine intervention.
Gateways have many needs that can’t be solved by downloadable software alone.
What Is a Web Portal?
Aggregate content from multiple sources into a single display.
Typically consume RSS/Atom news feeds.
More powerful versions these days support Flickr, calendars, games, etc. (gadgets, widgets).
Examples: iGoogle, Netvibes, My Yahoo!
Science Portals and Gateways
Science portals resemble standard portals, but must also:
Support access to computing and storage resources.
Allow users remote, Unix-like access to these resources.
Provide access to science applications and data sets.
So security is crucial.
And we must provide value-added services as well as user interfaces.
A Comprehensive Gateway Architecture
[Architecture diagram. Clients: User’s Browser; Workflow Composer on the User’s Desktop. Gateway Services: Grid Portal Server, Security Services, Workflow/Application Execution Engine, Application Resource Catalogs, User Data & Metadata Catalogs. Globus-TeraGrid “OGSA-Like” Services: Data Services, Information Services, Job Management, Resource Broker and Scheduling Services, Security Services.]
Components for Science Portals
OGCE is founded on the principle that portals should be built out of reusable parts.
Key standard in our first phase: the JSR 168 portlet specification.
Portlets can run in multiple containers: uPortal, Sakai, GridSphere, Liferay, etc.
This allows us to build Grid-specific components and deploy them alongside other goodies: Sakai collaboration tools, contributed portlets, etc.
OGCE Portal Software
The OGCE GPIR portlet can interoperate with TeraGrid and your own GPIR services.
Manage TeraGrid MyProxy credentials with the OGCE ProxyManager portlets.
OGCE file management client portlets interact with TeraGrid GridFTP servers.
General-purpose batch and interactive job submission to GRAM and WS-GRAM is supported.
Dashboard Portlet
The dashboard portlet allows users to track jobs on the selected resource. The user can view either his own set of jobs or get information on all submitted jobs.
Queue forecasting portlets work with the NWS QBETS to predict wait times and deadlines.
PURSe portlets manage user requests for portal accounts and Grid credentials.
Condor and Condor-G
OGCE IFrame Portlet can be used to integrate external sites.
Building Your Own Grid Portlets
Coding Portlets
Portlets are just servlet-like Java classes.
Basic API key methods: doView(), processAction().
These are coupled to JSP pages (typically) through tag libraries and request dispatchers.
OGCE supports Velocity portlets.
So we must provide the coding logic for processAction().
CoG abstraction layers provide this.
[Diagram: Java CoG Kit layers. Applications: Nanomaterials, Bio-Informatics, Disaster Management, Portals. Development Support: CoG GridIDE. Layers: CoG Gridfaces Layer, CoG Data and Task Management Layer, CoG Abstraction Layer. CoG providers: GT2, GT3(X), GT4 WS-RF, Condor, Unicore, SSH, Others.]
CoG Abstraction Layers
[Class diagram: Task, TaskHandler, Service, Specification, SecurityContext, ServiceContact.]
The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).
Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
Task and Specification
Task task = new TaskImpl("mytask", Task.JOB_SUBMISSION);
task.setProvider("GT2");
JobSpecification spec = new JobSpecificationImpl();
spec.setExecutable("rm");
spec.setBatchJob(true);
spec.setArguments("-r");
…
task.setSpecification(spec);
Service and Security Context
Service service = new ServiceImpl(Service.JOB_SUBMISSION);
service.setProvider("GT2");
SecurityContext securityContext = CoreFactory.newSecurityContext("GT2");
// Use cred object from ProxyManager
securityContext.setCredentials(cred);
service.setSecurityContext(securityContext);
Service Contact and Submit
ServiceContact serviceContact = new ServiceContact("myhost.myorg.org");
service.setServiceContact(serviceContact);
task.setService(Service.JOB_SUBMISSION_SERVICE, service);
TaskHandler handler = new GenericTaskHandler();
handler.submit(task);
Coupling CoG Tasks
The CoG abstractions also simplify creating coupled tasks.
Tasks can be assembled into task graphs with dependencies: “Do Task B after successful Task A”.
Graphs can be nested.
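The dependency rule quoted on this slide can be illustrated with a small, self-contained sketch. This is plain Java and deliberately does not use the Java CoG Kit API. The class TaskGraphSketch and its methods addTask, addDependency, asTask, and run are hypothetical names chosen for illustration only. A task runs only after every prerequisite has succeeded, and a whole graph can be wrapped as a single task of an outer graph.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a task graph; NOT the Java CoG Kit API. */
final class TaskGraphSketch {
    // Task id -> unit of work; the supplier returns true on success.
    private final Map<String, BooleanSupplier> tasks = new LinkedHashMap<>();
    // Task id -> ids of the tasks that must succeed first.
    private final Map<String, Set<String>> dependsOn = new LinkedHashMap<>();

    void addTask(String id, BooleanSupplier work) {
        tasks.put(id, work);
        dependsOn.putIfAbsent(id, new LinkedHashSet<>());
    }

    /** Declares "do task after successful prerequisite". */
    void addDependency(String task, String prerequisite) {
        if (!tasks.containsKey(task) || !tasks.containsKey(prerequisite)) {
            throw new IllegalArgumentException("Unknown task id");
        }
        dependsOn.get(task).add(prerequisite);
    }

    /** Nesting: exposes this graph as one task that succeeds only if all of its tasks do. */
    BooleanSupplier asTask() {
        return () -> run().size() == tasks.size();
    }

    /** Runs each task once its prerequisites succeeded; returns the ids that succeeded, in order. */
    List<String> run() {
        List<String> succeeded = new ArrayList<>();
        Set<String> pending = new LinkedHashSet<>(tasks.keySet());
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String id : new ArrayList<>(pending)) {
                if (succeeded.containsAll(dependsOn.get(id))) {
                    pending.remove(id);
                    progress = true;
                    if (tasks.get(id).getAsBoolean()) {
                        succeeded.add(id);
                    }
                }
            }
        }
        // Tasks whose prerequisites failed (or that form a cycle) are never run.
        return succeeded;
    }

    public static void main(String[] args) {
        TaskGraphSketch graph = new TaskGraphSketch();
        graph.addTask("B", () -> true);
        graph.addTask("A", () -> true);
        graph.addDependency("B", "A");
        System.out.println(graph.run());
    }
}
```

In this sketch a failed task simply leaves its dependents unexecuted. A real Grid task graph would also need status events and asynchronous submission, which are omitted here.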
Problems with Portlet Development
Grid portlets typically wrap each single Grid capability in a separate portlet.
The problem is that Grid portlets need to combine these operations.
Portlets are entire web applications, so we need a component model for portlets: reusable portlet parts.
Even with the CoG Abstraction Layer, we must still do a lot of coding to build new applications.
To address these problems we have adopted Java Server Faces (JSF).
JSF provides several nice Model-View-Controller features.
JSF provides an extensible framework (tag libraries) for making reusable components.
The Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).
Grid Tag Libraries and Beans (GTLAB)
GTLAB provides common components for building portlets using tags and reusable parts.
The goal of GTLAB is to simplify Grid portlet development and enable rapid development.
GTLAB capabilities include Grid operations with XML-based tags within the Java Server Faces (JSF) framework.
Grid tag libraries are built using JSF custom component development techniques.
Grid tags are interfaces to backing Grid beans.
End users pass values to Grid beans by using tag attributes.
We build on Java CoG 4’s abstraction layer.
Each backing Grid bean has the same capability that an entire portlet application has in the one-portlet-per-capability approach.
GTLAB Example
• Grid tags are associated with Grid services via Grid beans.
• Grid beans wrap the Java CoG Kit (version 4).
• The JSF page section below is an example.
• This allows you to develop new Grid portlets with no additional Java code.
<html>
<body>
<f:form>
  <o:submit id="test" action="next_page">
    <o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512" lifetime="2" username="mnacar" password="***" />
    <o:jobsubmit id="task" hostname="cobalt.ncsa.teragrid.org" provider="GT4" executable="/bin/ls" stdout="tmp/result" stderr="tmp/error" />
  </o:submit>
</f:form>
</body>
</html>
Grid Tags          Associated Grid Beans   Features
<submit/>          ComponentBuilderBean    Creating components, job handlers, submitting jobs
<handler/>         MonitorBean             Handling monitoring page actions
<multitask/>       MultitaskBean           Constructing simple workflow
<dependency/>      MultitaskBean           Defining dependencies among sub jobs
<myproxy/>         MyproxyBean             Retrieving myproxy credential
<fileoperation/>   FileOprationBean        Providing Gridftp operations
<jobsubmission/>   JobSubmitBean           Providing GRAM job submissions
<filetransfer/>    FileTransferBean        Providing Gridftp file transfer
(no tag)           ResourceBean            Describes common properties among all tags and beans; passes values given by standard visual JSF components
How to prepare application pages
Developers embed Grid tag snippets into a JSF page.
These components are non-visual and are not displayed in HTML.
The Resource bean bridges form inputs and the GTLAB framework.
<h:outputText value="Taskname: "/>
<h:inputText value="#{resource.taskname}" />
<o:multitask id="multi" persistent="true" taskname="#{resource.taskname}" />
Dynamic values for Grid tag attributes are provided by the Resource bean.
The only visual component is the <o:submit/> tag, which is associated with the action method of GTLAB.
GTLAB Dashboard Portlet Example
<o:submit id="track" action="list_page">
  <o:multitask id="dashboard" taskname="track" persistent="true">
    <o:myproxy id="proxy" hostname="gf1.ucs.indiana.edu" lifetime="2" username="#{resource.username}" password="#{resource.password}" />
    <o:jobsubmit id="jobA" hostname="cobalt.ncsa.teragrid.org" provider="GT4" executable="/bin/whoami" stdout="tmp/result" stderr="tmp/error" />
    <o:jobsubmit id="jobB" hostname="cobalt.ncsa.teragrid.org" provider="GT4" executable="/bin/showq" stdin="tmp/result" stdout="tmp/list" stderr="tmp/error" />
    <o:dependency id="depend" task="jobB" dependsOn="jobA" />
  </o:multitask>
</o:submit>
Tracking and Managing Jobs
GTLAB manages the lifecycles of jobs and monitors their status.
Grid operations are usually batch processes.
We provide a callback mechanism to follow up on jobs.
GTLAB creates handlers for jobs and persistently stores them.
GTLAB handlers manage job events such as stopping, cancelling, or resuming running jobs.
GTLAB provides an archive for job metadata and allows managing the archive.
The handler tag helps to organize the user’s job repository.
<o:handler id="delete" action="#{monitor.delete}">
  <f:param id="task" name="taskname" value="#{task}"/>
</o:handler>
OGCE Gateway Services
Web Services in Scientific Communities (G. Kandaswamy)
Web services are used to “wrap” scientific applications to:
Describe, publish, discover and consume scientific applications in a standard way
Compose complex workflows from scientific applications
Run and monitor complex workflows on distributed resources
Such web services that “wrap” scientific applications are called “application services”.
A Simple Application Service
[Diagram spanning Host1 and Host2: a Web Service Client sends a SOAP Request to the Application Service, which passes Command-line Arguments to the Command-line Application, collects the Output Results, and returns a SOAP Response.]
Things Are Usually More Complicated
[Workflow diagram. Applications: ARPS-TRN, ARPS-SFC, EXT2ARPS (appears twice), MCI2ARPS, NIDS2ARPS, 88D2ARPS, ADAS, ARPS2WRF, WRF, ARPS-PLOT. Inputs: satellite data, Level II data, Level III data, decoded data from other programs (sfc, rwh etc.). Other labels: initial boundary conditions; lateral boundary conditions; run once per forecast region; run once per day; run for each forecast and/or ADAS analysis.]
The Problem
Application services may not be available during a workflow execution:
Unreliable resources (software, computers, networks)
Heavy load on service
Does not meet QoS or security requirements of client
Workflows cannot complete unless all services are available.
GFAC Solution
A Generic Application Factory: a persistent web service that knows how to create instances of any application service.
Use a Generic Application Factory to create instances of application services on demand from workflows.
Implementation
The Generic Application Factory (GFac)
The Generic Service Toolkit: a toolkit that “wraps” any command-line application as an application service
Without writing any web service code
Without modifying the application in any significant way
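The factory idea can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the GFac or Generic Service Toolkit API. The names ApplicationFactorySketch, ApplicationService, register, getOrCreate, commandLine, and invoke are hypothetical, and an in-memory map stands in for the registry. The factory creates a service instance the first time a workflow asks for it, and the service wraps an unmodified command-line executable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an on-demand application-service factory; NOT the GFac API. */
final class ApplicationFactorySketch {

    /** One wrapped command-line application bound to a fixed executable. */
    static final class ApplicationService {
        private final String name;
        private final String executable;

        ApplicationService(String name, String executable) {
            this.name = name;
            this.executable = executable;
        }

        /** Builds the command line the service would run for one request. */
        List<String> commandLine(List<String> arguments) {
            List<String> command = new ArrayList<>();
            command.add(executable);
            command.addAll(arguments);
            return command;
        }

        /** Runs the unmodified application and returns its combined stdout and stderr. */
        String invoke(List<String> arguments) throws IOException, InterruptedException {
            Process process = new ProcessBuilder(commandLine(arguments))
                    .redirectErrorStream(true)
                    .start();
            String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException(name + " exited with code " + exitCode);
            }
            return output;
        }
    }

    // Stand-in for the registry: application name -> executable path (an assumption of this sketch).
    private final Map<String, String> registry = new HashMap<>();
    // Instances created so far, so a repeated request reuses the live service.
    private final Map<String, ApplicationService> instances = new HashMap<>();

    void register(String name, String executable) {
        registry.put(name, executable);
    }

    /** Returns a service instance for the named application, creating it on first request. */
    ApplicationService getOrCreate(String name) {
        String executable = registry.get(name);
        if (executable == null) {
            throw new IllegalArgumentException("No description registered for " + name);
        }
        return instances.computeIfAbsent(name, n -> new ApplicationService(n, executable));
    }
}
```

A caller would register a description once, for example register("list", "/bin/ls"), and then call getOrCreate("list").invoke(...) from each workflow step. Security, remote hosts, and the SOAP interface are outside this sketch.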
Creating an Application Service (1/2)
Write “ServiceMap” document to describe your service
Write “Application Deployment Description” document to describe a deployment of your application
Upload the above two documents to a Registry service
Creating an Application Service (2/2)
[Diagram. Components: Portal with Generic Service Portlet; GFac; Registry Service; Generic Web Service; Application Service; Certificate & Capabilities Vault (MyProxy Service, Capability Manager Service); Service Provider; Host1; Host2.]
1. Create service request
2. Get ServiceMap & Host Description
3. Create service
4. Configure service
5. Register capabilities
5. Register WSDL
Invoking an Application Service
[Diagram. Components: User; Portal with Generic Service Portlet; Application Service; Application; Registry Service; Certificate & Capabilities Vault (MyProxy Service, Capability Manager Service); Host2; Host3.]
2. Access service
3. Return user interface
4. Invoke Service
4. Run application
5. Get Application Deployment Description and Host Description
6. Send notifications
7. Return results
Software Engineering Issues
OGCE Code Repository
We use SourceForge, SVN: http://sourceforge.net/projects/ogce
Other SourceForge tools are useful.
Replaced old OGCE bugzilla with SF bugzilla recently after we were attacked by robots.
Portal Build System
The portal download gives you everything you need to get started except Java.
Includes Tomcat, GridSphere, Ant, and Maven.
Assume you have a Grid somewhere.
The build system (recently revised) is designed to build everything in one command: “mvn clean install”.
It is also designed to support extensibility (i.e., replacing GridSphere with Sakai) and simple updates of portlets.
We use Maven 2 exclusively.
Nice for managing third-party jar dependencies.
It can call Ant as necessary.
Testing portals is another matter.
Normal unit test systems like JUnit are not really appropriate.
JMeter Test Suite
File Transfer portlet unit tested in JMeter UI: check for valid HTML response.
Create lots of unit tests, run, and see results in a dashboard.
Nightly Builds and Tests on NMI Testbed
What’s Next?
Some Future Issues
Better support for science tools, not just bare grids.
Experiment builder, Xbaya workflow manager, metadata repository services and clients.
Better support for TeraGrid Science Gateways.
Logging, auditing, integration with GridShib.
JavaScript Grid abstraction layers and agent services to support non-portlet clients.
More projects: obviously we are interested in working with the OSG.
What About Web 2.0?
This is another talk entirely.
http://grids.ucs.indiana.edu/ptliupages/presentations/Web20Tutorial_CTS.ppt
http://grids.ucs.indiana.edu/ptliupages/publications/Web20ChapterFinal.pdf
See also recent OGF 19 and 21 Workshops.
Join us at SC07 for the GCE07 Science Gateway Workshop.
~20 peer-reviewed or invited talks, with focus on Web 2.0.
More Information
OGCE Web Site: www.collab-ogce.org
Announcements Atom Feed http://collab-ogce.blogspot.com/atom.xml
Contact me: mpierce@cs.indiana.edu
Some Example Portals
LEAD Gateway Portal
NSF Large ITR and TeraGrid Gateway
Adaptive response to mesoscale weather events
Supports data exploration, Grid workflow
TeraGrid User Portal
User Portal Sharable Portlets
Account Management: view projects and allocation usage; view system account usernames; view DNs registered for account; add users to projects; supports >3500 users
Resource: view comprehensive list of TG resources and their attributes; view job queues, load, status of resources
Documentation: current User Info documentation; contextual help for all interfaces
Consulting: TG help desk information; portal feedback channel
Allocation: info about how to apply for/renew allocations
North Carolina Bioportal
Principal collaborators: John McGee and Lavanya Ramakrishnan
Features
access to common bioinformatics tools
extensible toolkit and infrastructure
OGCE and National Middleware Initiative (NMI)
leverages emerging international standards
remotely accessible or locally deployable
packaged and distributed with documentation
National reach and community
TeraGrid deployment
Portals hosted at RENCI and NCSA
Education and training
UNC-Charlotte Visual Grid Portal
Project Lead: Prof. Barry Wilkinson
Portal Developer: Jeremy Villalobos