holding slide prior to starting show
DESCRIPTION
Holding slide prior to starting show. (Some) Key Issues in Grid Computing. David Walker, School of Computer Science, Cardiff University. http://users.cs.cf.ac.uk/David.W.Walker. Main Thesis of Talk.
TRANSCRIPT
![Page 1: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/1.jpg)
Holding slide prior to starting show
![Page 2: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/2.jpg)
(Some) Key Issues in Grid Computing
David Walker
School of Computer Science
Cardiff University
http://users.cs.cf.ac.uk/David.W.Walker
![Page 3: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/3.jpg)
Main Thesis of Talk
• At a surface level many aspects of Grid Computing appear to be straightforward, and reduce to simple programming tasks and the use of existing tools.
• This talk aims to show that for domain scientists to effectively use the Grid many challenging CS issues need to be addressed.
![Page 4: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/4.jpg)
What is the Grid?
• The Grid is an emerging communication and computational infrastructure for the transparent sharing of distributed computing resources.
• Resources include computers, data, instruments, sensors, visualisation platforms, and sometimes even people.
The Grid is more than just a faster Internet
![Page 5: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/5.jpg)
UK Grid Network
• National Centre in Edinburgh/Glasgow
• 8 regional centres
• Grid support centre
[Map of sites: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Southampton, London, Belfast, DL, RAL, Hinxton]
![Page 6: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/6.jpg)
Why is the Grid Important?
• It will give access to more computational power.
• It will make more computing and data resources more readily available.
• It will enable collaborative working and resource sharing through virtual organisations and communities.
• It will create new economic resources, products, and services.
![Page 7: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/7.jpg)
Benefits from the Grid
• Empowering individuals and organisations.
• Enhancing national economic competitiveness, security, and quality of life.
• An engine for social transformation.
![Page 8: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/8.jpg)
Areas of Impact
• e-Science: large-scale collaborative multi-disciplinary science in areas such as high energy physics, bio-informatics, astrophysics, chemistry, etc.
• e-Business: streamline, distribute, and enhance business processes.
• e-Government: transform relations between government and citizens, businesses, and other arms of government.
– Better delivery of government services to citizens.
– Improved interactions with business and industry.
– Citizen empowerment through access to information.
![Page 9: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/9.jpg)
Grid Model - the Information Utility
[Diagram (CLRC Daresbury): scientists, experiments, computing, storage, and analysis resources connected through MIDDLEWARE]
![Page 10: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/10.jpg)
A Typical Scientific Process
![Page 11: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/11.jpg)
Key Elements of the Grid
• The specification of problems – how do you program the Grid?
• The dynamic discovery of Grid resources.
• Provenance support for Grid applications.
• The interoperability and federation of different Grid middleware stacks.
• Grid access to legacy applications.
• Support for remote collaboration over the Grid.
![Page 12: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/12.jpg)
A Simple Example
• A simple use of the Grid involves the use of a PSE or portal to do a set of pre-determined tasks.
• This corresponds to the “utility computing” mode of use.
• No support for building new applications or services.
• No support for dynamic discovery of resources.
• No support for collaboration.
![Page 13: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/13.jpg)
Programming the Grid
• Problem specification could involve:
– Use of a high-level domain-specific programming/scripting language.
– Representing coordinated tasks with a workflow graph assembled in a visual programming environment.
– Use of recommender systems to assist users in formulating and solving problems.
![Page 14: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/14.jpg)
Workflow
• Commonly used to represent applications composed of interacting services.
• Services may be hierarchical – composed of other services.
• Easy to represent graphically, but not scalable with number of services or number of inputs/outputs.
![Page 15: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/15.jpg)
Workflow Composition
Would like to support the domain scientist in designing workflows to solve problems.
![Page 16: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/16.jpg)
Problems in Workflow Composition
• How do you know that the input port of one service is compatible with the output port of another service, given that the services may have been created by different people or organisations?
• Type signatures must match, but semantics must also match.
![Page 17: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/17.jpg)
Annotating Services
• Supporting “plug-and-play” between services in a workflow requires the use of ontologies.
• Need to give semantic content (meaning) to service inputs and outputs.
• This allows composition hints in the form of “semantic suggestions”. For example, for a given service port we could find all services that could be connected to it.
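The idea of a semantic suggestion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the framework's implementation: the toy ontology, the service names, and the helpers `is_a` and `suggest` are all hypothetical. An output port is treated as compatible with an input port when the output concept is the same as, or is subsumed by, the input concept.

```python
# Hypothetical toy ontology: child concept -> parent concept.
ONTOLOGY = {
    "ComplexSpectrum": "Spectrum",
    "PowerSpectrum": "Spectrum",
    "VectorType": "Data",
    "Spectrum": "Data",
}

# Hypothetical service annotations: name -> (input concept, output concept).
SERVICES = {
    "FFT": ("VectorType", "ComplexSpectrum"),
    "SpectrumPlot": ("Spectrum", "Image"),
    "VectorNorm": ("VectorType", "Scalar"),
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or is subsumed by it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)
    return False


def suggest(output_concept):
    """List services whose input port accepts data of `output_concept`."""
    return sorted(
        name for name, (inp, _out) in SERVICES.items()
        if is_a(output_concept, inp)
    )


# Services that could be connected after FFT, whose output is a ComplexSpectrum.
print(suggest(SERVICES["FFT"][1]))
```

Matching on concepts rather than on raw type signatures is what lets services written by different organisations be connected safely: two ports may both carry "a vector of floats" and still mean different things.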
![Page 18: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/18.jpg)
Types of Workflow Composition
Manual
– User generates workflow graphically or through a text editor.
– Examples: Triana, BPWS4J, Self-Serve.
Semi-automated
– “Semantic suggestions”: user still has to select the service required from a shortlist.
– Examples: Cardoso & Sheth, GEODISE, myGRID, Sirin, Hendler et al.
Automated
– The entire composition is automated using AI technologies.
– Examples: SHOP2, Pegasus (ISI), McIlraith, IRS-II.
![Page 19: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/19.jpg)
Workflow Composition in Semantic Grids
• Semantic Web technologies enable automation at several levels – automated resource discovery, selection, management, service composition, execution.
• Promises automated seamless interoperation of autonomous, heterogeneous distributed applications.
• Our focus is on the use of Semantic Web technologies to automate service composition in Grid environments.
• See S Majithia, DW Walker, and WA Gray “Automatic Composition of Web Services,” in Proceedings of the UK e-Science Programme All-Hands Meeting 2004. Available online at http://www.allhands.org.uk/proceedings/papers/148.pdf
• Main developer is Shalil Majithia.
![Page 20: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/20.jpg)
Framework - Overview
• WFMS – Workflow Manager Service
• AWFC – Abstract Workflow Composition Service
• CWFC – Concrete Workflow Composition Service
• RS – Reasoning Service
• MMS – Matchmaking Service
• AWFR – Abstract Workflow Repository
• CWFR – Concrete Workflow Repository
• RB – Rulebase
[Diagram: components WFMS, AWFC, CWFC, RS, MMS, RB, AWFR, and CWFR, with a “High level objective” as input]
![Page 21: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/21.jpg)
Framework - Interactions
[Sequence diagram between Client, WFMS, AWFC, CWFC, and WFEE, with numbered messages:]
1. High Level Request
2. Request for Abstract WF
3. Composed Abstract WF
4. Request for Concrete WF
5. Composed Concrete WF
6. Request for Execution
7. Results or Request for Alternatives
8. Final Results
![Page 22: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/22.jpg)
Abstract Workflow Composer
• An abstract workflow specifies a workflow without referring to a specific service implementation.
• The Abstract Composer tries to generate an abstract workflow by using:
– AWF Repository: stores semantically annotated descriptions of services and workflows. An ontology is used to match services.
– Rulebase: specifies the “recipe” to achieve an objective.
– Chaining services: try to chain services by matching service outputs and inputs.
![Page 23: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/23.jpg)
Concrete Workflow Composer
• A concrete workflow specifies an executable workflow by referring to specific service implementations.
• The Concrete Composer tries to generate an executable workflow by using:
– Matchmaking: match abstract workflow with service implementations available at that time.
– Chaining services: try to chain services by matching service outputs and inputs.
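The step from an abstract to a concrete workflow can be sketched as follows. This is a minimal illustration, assuming a hypothetical registry snapshot and made-up `example.org` endpoints; the function `make_concrete` is not part of the framework described in the talk.

```python
# Abstract workflow: an ordered list of task names with no endpoints.
abstract_workflow = ["Parents", "Children", "Exclude"]

# Hypothetical registry snapshot: task -> endpoints available right now.
registry = {
    "Parents": ["http://example.org/a/parents", "http://example.org/b/parents"],
    "Children": ["http://example.org/a/children"],
    "Exclude": [],
}


def make_concrete(workflow, available):
    """Bind each abstract task to the first available implementation.

    Returns (bindings, unmatched) so the caller can request alternatives
    when some task has no implementation at the moment.
    """
    bindings, unmatched = {}, []
    for task in workflow:
        endpoints = available.get(task, [])
        if endpoints:
            bindings[task] = endpoints[0]
        else:
            unmatched.append(task)
    return bindings, unmatched


bindings, unmatched = make_concrete(abstract_workflow, registry)
print(bindings)
print(unmatched)
```

Returning the unmatched tasks separately mirrors the "Results or Request for Alternatives" step: a workflow that cannot be fully bound is reported rather than silently dropped.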
![Page 24: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/24.jpg)
Other Components
• Matchmaker service (based on that of Paolucci et al.) adapted for dynamic substitution.
• Chaining service: backward chaining service based on domain ontologies.
• Repositories: store semantically annotated abstract and concrete workflows.
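Backward chaining over service inputs and outputs can be sketched like this. The service table and the function `backward_chain` are hypothetical, and for brevity the sketch matches concepts by exact name; an ontology-based chaining service would instead accept any output concept subsumed by the required input concept.

```python
# Hypothetical service descriptions: name -> (required inputs, produced output).
SERVICES = {
    "ReadSignal": (set(), "VectorType"),
    "FFT": ({"VectorType"}, "ComplexSpectrum"),
    "PowerSpectrum": ({"ComplexSpectrum"}, "PowerSpectrum"),
}


def backward_chain(goal, known, _visiting=None):
    """Return an ordered list of services that produces `goal`.

    Works backwards from the goal: find a service whose output is the goal,
    then recursively satisfy each of its inputs. Raises ValueError if no
    chain exists.
    """
    if goal in known:
        return []  # already available, nothing to invoke
    visiting = _visiting or set()
    if goal in visiting:
        raise ValueError(f"cyclic dependency on {goal}")
    for name, (inputs, output) in SERVICES.items():
        if output != goal:
            continue
        try:
            plan = []
            for needed in sorted(inputs):
                for step in backward_chain(needed, known, visiting | {goal}):
                    if step not in plan:
                        plan.append(step)
            return plan + [name]
        except ValueError:
            continue  # try another producer of the same concept
    raise ValueError(f"no service produces {goal}")


print(backward_chain("PowerSpectrum", {"VectorType"}))
```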
![Page 25: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/25.jpg)
Implementation
• All components implemented as Web services using Axis server.
• Services and workflows described using OWL-S.
• DQL/JTP server used for subsumption reasoning.
• Rulebase implemented in RuleML.
• Plug-in module enables generation of concrete workflows in BPEL4WS.
<profileHierarchy:SignalProcessing rdf:ID="FFT">
  <profile:input>
    <profile:ParameterDescription rdf:ID="FFTInput">
      <profile:restrictedTo rdf:resource="Concepts.owl#VectorType"/>
    </profile:ParameterDescription>
  </profile:input>
  <profile:output>
    <profile:ParameterDescription rdf:ID="FFTOutput">
      <profile:restrictedTo rdf:resource="Concepts.owl#ComplexSpectrum"/>
    </profile:ParameterDescription>
  </profile:output>
</profileHierarchy:SignalProcessing>
Snippet of OWL-S Profile for FFT
![Page 26: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/26.jpg)
Family Tree Example
• Family trees have 3 basic relationships:
– Spouse_of
– Child_of
– Parent_of
• Other relationships (aunt, grandparent, cousin, etc.) can be expressed in terms of these relationships through an ontology.
![Page 27: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/27.jpg)
Cousins Example
• Suppose we want to create a workflow to find the cousins of a given person, X.
• The query is submitted to the WFMS, which checks the AWF repository (i.e., checks the annotated names of workflows).
• If no match is found, the rulebase is checked.
![Page 28: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/28.jpg)
Rulebase
Grandparents(X) = Parents[Parents[X]]
Cousins(X) = Exclude[Grandchildren[Grandparents(X)], Children[Parents[X]]]
Note: There is no rule for Grandchildren[X]. The Chaining Service would deduce how to do this from the ontology.
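The rules can be tried out on a small data set. The sketch below is illustrative only: the family data and function names are hypothetical, and it reads the Cousins rule as "the grandchildren of X's grandparents, excluding the children of X's parents" (that is, excluding X and X's siblings). Grandchildren has no rule of its own, so it is derived here by chaining Children twice, standing in for what the Chaining Service would deduce.

```python
# Hypothetical family data: child -> set of parents (the Parent_of relation).
PARENTS = {
    "X": {"Mum", "Dad"},
    "Sib": {"Mum", "Dad"},
    "Mum": {"Gran", "Grandad"},
    "Aunt": {"Gran", "Grandad"},
    "Cousin1": {"Aunt"},
    "Cousin2": {"Aunt"},
}


def parents(people):
    """Atomic service: all parents of the given people."""
    return {p for person in people for p in PARENTS.get(person, set())}


def children(people):
    """Atomic service: all children of the given people."""
    return {c for c, ps in PARENTS.items() if ps & set(people)}


def grandparents(people):
    # Rule: Grandparents(X) = Parents[Parents[X]]
    return parents(parents(people))


def grandchildren(people):
    # No rule given; derived by chaining Children twice.
    return children(children(people))


def exclude(candidates, unwanted):
    """Atomic service: remove unwanted people from the candidates."""
    return set(candidates) - set(unwanted)


def cousins(x):
    # Grandchildren of X's grandparents, minus the children of X's parents.
    return exclude(grandchildren(grandparents({x})), children(parents({x})))


print(sorted(cousins("X")))
```

Note that `parents({x})` is evaluated twice in `cousins`; a composer could share that result between the two branches.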
![Page 29: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/29.jpg)
Abstract Workflow From Rulebase
[Workflow diagram: X → Grandparents → Grandchildren → Exclude → Cousins; X → Parents → Children → Exclude. Legend: atomic service, composite service.]
![Page 30: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/30.jpg)
WF after Recursive Application of Rulebase
[Workflow diagram: X → Parents → Parents → Grandchildren → Exclude → Cousins; X → Parents → Children → Exclude.]
![Page 31: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/31.jpg)
WF after Application of Chaining Service
[Workflow diagram: X → Parents → Parents → Children → Children → Exclude → Cousins; X → Parents → Children → Exclude.]
Note opportunity for optimization and parallelism.
![Page 32: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/32.jpg)
Dynamic Resource Discovery and Scheduling
• Assume that semantically annotated services can be found through a registry or repository service.
• Scheduling of workflow nodes on distributed resources.
– Early binding model: bind to specific service/platform at composition time (“validation”).
– Intermediate binding model: bind at “compile” time (when converting from XML to executable form).
– Late binding model: bind dynamically at runtime.
• Later binding allows the use of more up-to-date information to make scheduling decisions.
• In our framework binding is done by the Matchmaker Service, and can follow any of the above binding models.
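The difference between early and late binding can be shown with a small sketch. The `Registry` class, host names, and load figures are hypothetical and stand in for whatever information a matchmaker would consult; only the timing of the lookup is the point.

```python
class Registry:
    """Hypothetical registry mapping a task name to {endpoint: load}."""

    def __init__(self):
        self.entries = {}

    def publish(self, task, endpoint, load):
        self.entries.setdefault(task, {})[endpoint] = load

    def least_loaded(self, task):
        candidates = self.entries[task]
        return min(candidates, key=candidates.get)


def bind_early(registry, task):
    """Resolve once, at composition time; the choice is then fixed."""
    endpoint = registry.least_loaded(task)
    return lambda: endpoint


def bind_late(registry, task):
    """Defer resolution until the node is actually about to run."""
    return lambda: registry.least_loaded(task)


registry = Registry()
registry.publish("FFT", "hostA", load=0.2)
registry.publish("FFT", "hostB", load=0.5)

early = bind_early(registry, "FFT")
late = bind_late(registry, "FFT")

# Conditions change between composition and execution.
registry.publish("FFT", "hostA", load=0.9)

print(early())  # the endpoint chosen at composition time
print(late())   # the endpoint chosen from the current load information
```

The early-bound node still targets the host that was best when the workflow was composed, while the late-bound node picks up the changed load.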
![Page 33: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/33.jpg)
Provenance Support in Service-Oriented Grids
• A workflow may produce many intermediate and final data products that may need to be later reviewed and analysed.
• A person, project, or organisation may need to archive many such workflows and their results.
• Want to store the provenance of data products: how they were produced and why.
• Main developer is Shrija Rajbhandari.
![Page 34: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/34.jpg)
Provenance
• Provenance can be regarded as historical metadata that provides an explanation of how a particular data product has been generated.
• Uniquely defines the derived data.
• Identifies what data is passed between services.
• Provides a traceable path to the origin of the data.
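A traceable path to the origin of a data product can be sketched with a simple record of which service produced each product from which inputs. The record layout, product names, and the functions `origins` and `path` are hypothetical; they illustrate the idea rather than any particular provenance schema.

```python
# Hypothetical provenance records: data product -> (service, input products).
PROVENANCE = {
    "spectrum": ("FFT", ["signal"]),
    "plot": ("SpectrumPlot", ["spectrum", "style"]),
}


def origins(product):
    """Return the set of original inputs that `product` was derived from."""
    if product not in PROVENANCE:
        return {product}  # not produced by any service: an origin
    _service, inputs = PROVENANCE[product]
    found = set()
    for item in inputs:
        found |= origins(item)
    return found


def path(product):
    """Return the chain of service invocations leading to `product`."""
    if product not in PROVENANCE:
        return []
    service, inputs = PROVENANCE[product]
    steps = []
    for item in inputs:
        steps.extend(path(item))
    return steps + [f"{service}({', '.join(inputs)}) -> {product}"]


print(sorted(origins("plot")))
for step in path("plot"):
    print(step)
```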
![Page 35: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/35.jpg)
Provenance Importance and Problem
• No known standards support archiving provenance in a service-oriented Grid environment.
• Requires recording the provenance of:
– The transformation of data that occurs during the invocation of services in a workflow.
– Complex services executed via a workflow engine.
![Page 36: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/36.jpg)
Original Motivation
• Would like to be able to view an electronic publication, and click on tables and figures of results to:
– See how they were generated: requires provenance browser.
– Re-run the workflows that generated the results to verify them, or to perform a “what-if” study by changing the workflow inputs.
– See the results of any re-run workflows in the same format as the original data (table or graph).
![Page 37: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/37.jpg)
Provenance Model
[Diagram components: Workflow Engine [BPWS4J], Provenance Server (PCS, PQS), JENA, RDF Schema, interface, Provenance MySQL Database]
• PCS = Provenance Collection Service
• PQS = Provenance Query Service
• Jena is a Java framework for building Semantic Web applications. http://jena.sourceforge.net/
![Page 38: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/38.jpg)
Prototype Provenance System
• Provenance Schema
– Resource Description Framework (RDF).
– Provenance of workflow execution.
• Provenance Collection Service (PCS)
– Provenance is represented in RDF statements.
– Database storage.
• Provenance Query Service (PQS)
– Client interface to browse provenance.
– Allows re-execution of workflows from retrieved provenance for “what-if” style analysis.
![Page 39: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/39.jpg)
Prototype Dataflow
1) User Client Interface sends the workflow invocation parameters to PCS.
2) PCS sends the invocation initiation of a workflow to BPWS4J.
3) BPWS4J invokes the partner services.
4) BPWS4J sends a message about invoked services, and the input and output parameters, to PCS.
5) PCS creates an RDF representation of the collected provenance data of the workflow execution.
6) PCS stores the RDF graph in the database server using Jena tools.
7) PQS Client passes a query to the database server, which returns the provenance data using Jena tools to access RDF data.
8) PQS allows re-execution of the workflow from the provenance data retrieved. It also allows parameter changes during re-execution of such a workflow.
[Diagram components: PCS Client Interface, PQS Client Interface, PCS, BPWS4J Engine, Web Services, Provenance Database; PCS uses the Provenance RDF schema]
![Page 40: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/40.jpg)
Services Composition and Invocation
• Compose Web services using BPEL4WS.
• Execute with BPEL4WS compliant engine: IBM’s BPWS4J.
• Dynamically invoke Web services using Web Service Invocation Framework (WSIF).
![Page 41: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/41.jpg)
Provenance Recording
Example: Adding two numbers and multiplying the result with a third number
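The example can be mimicked in plain Python to show what gets recorded and how a recorded run supports a “what-if” re-execution. This is a sketch under stated assumptions: the two local functions stand in for the partner Web services, and the log format and the `rerun` helper are hypothetical rather than the prototype's RDF representation.

```python
import operator

# Hypothetical stand-ins for the partner Web services.
SERVICES = {"Add": operator.add, "Multiply": operator.mul}


def run_workflow(a, b, c, log):
    """Compute (a + b) * c, appending one provenance record per invocation."""
    def invoke(service, *inputs):
        output = SERVICES[service](*inputs)
        log.append({"service": service, "inputs": inputs, "output": output})
        return output

    return invoke("Multiply", invoke("Add", a, b), c)


def rerun(log, **changes):
    """Re-execute from recorded provenance, optionally changing inputs."""
    a, b = log[0]["inputs"]          # inputs of the recorded Add call
    c = log[1]["inputs"][1]          # second input of the recorded Multiply call
    params = {"a": a, "b": b, "c": c, **changes}
    new_log = []
    result = run_workflow(params["a"], params["b"], params["c"], new_log)
    return result, new_log


log = []
print(run_workflow(2, 3, 4, log))
result, new_log = rerun(log, c=10)
print(result)
```

Each re-execution produces its own log, so the original record and the “what-if” run can be compared side by side.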
![Page 42: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/42.jpg)
Provenance Recording (cont..)
![Page 43: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/43.jpg)
Provenance Recording (cont..)
![Page 44: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/44.jpg)
Provenance Query
![Page 45: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/45.jpg)
Re-execution for “what-if” analysis
![Page 46: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/46.jpg)
Other Grid Projects
• Quality of Service: http://www.cs.cf.ac.uk/user/Rashid/
• Resource-Aware Visualization Environment (RAVE): http://www.wesc.ac.uk/projects/rave/
• Grid-Enabled Computational Electromagnetics (GECEM): http://www.wesc.ac.uk/projects/gecem/
• Workflow Optimization Services for e-Science (WOSE): http://www.wesc.ac.uk/projects/wose/
![Page 47: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/47.jpg)
Summary
• Semantic Web technologies play a key role in enabling:
– “Plug-and-play” in the composition of services to create workflows.
– Dynamic discovery of resources.
– Support for provenance.
• The above, together with collaborative visualisation, are important in convincing scientists (and others) to use the Grid.
![Page 48: Holding slide prior to starting show](https://reader035.vdocuments.us/reader035/viewer/2022070404/56813b60550346895da45c4a/html5/thumbnails/48.jpg)