geospatial service workflow concepts and tools

Page 1: Geospatial Service Workflow Concepts and Tools

Page 1 LAITS

Laboratory for Advanced Information Technology and Standards

Duh 7/10/03

Geospatial Service Workflow Concepts and Tools

Liping Di

Laboratory for Advanced Information Technology and Standards (LAITS)

George Mason University

[email protected]

Page 2: Geospatial Service Workflow Concepts and Tools

Contents

• What are Service oriented architecture and web services?

• What is a workflow tool? 

• What does it do? 

• Why do we need one in the Grid? 

• What are some common workflow tools used by the Grid community and web service community? 

Page 3: Geospatial Service Workflow Concepts and Tools

The Service-Oriented Architecture (SOA)

• Services are the key component of the service-oriented architecture.

• A service is a well-defined set of actions. It is self-contained, stateless, and does not depend on the state of other services.

• Stateless means that each time a consumer interacts with a Web Service, an action is performed. After the results of the service invocation have been returned, the action is finished. There is no assumption that subsequent invocations are associated with prior ones.

• In the service-oriented architecture, the description of a service is essentially a description of the messages that are exchanged between the consumer and the service.

• Standard-based individual services can be chained together to solve complex tasks.

• The implementation of SOA in the web environment is called Web services.

Page 4: Geospatial Service Workflow Concepts and Tools

Web Services

• Web Services are self-contained, self-describing, modular applications that can be published, located, and dynamically invoked across the Web.

• Web services perform functions, which can be anything from simple requests to complicated business processes.

• Once a Web service is deployed, other applications (and other Web services) can discover and invoke the deployed service.

• The real power of web services relies on

– Everyone on the Internet can set up a web service to provide service to anyone who wants it, so many services will be available.

– The standard-based services can be chained together dynamically to solve complicated tasks: just-in-time integration.

Page 5: Geospatial Service Workflow Concepts and Tools

Globus Toolkit 3.0 (GT3)

-- OGSA, OGSI, and GT3

[Figure labels: the architect, the engineer, the workers]

Page 6: Geospatial Service Workflow Concepts and Tools

Difference between Web Service and Open Grid Service

• Globus 3.0 implemented the Open Grid Services Architecture (OGSA).

• The fundamental concepts of services in the Grid are the same as those of Web services.

• The differences between Grid and Web services include

– A Web service can be invoked by any consumer over the Web, while a Grid service can only be invoked by consumers within the virtual organization, similar to the difference between the Internet and an intranet.

– Web services practice has been extended in the Grid to accommodate the additional requirements of Grid services:

• Stateful interactions between consumers and services

• Exposure of a web service’s “publicly visible state”

• Access to (possibly large amounts of) identifiable data

• Service lifetime management

• Currently the Grid and Web communities are merging through the Web Services Resource Framework (WSRF).
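
The contrast between a stateless service and a stateful service with lifetime management can be illustrated with a minimal Python sketch. The class and method names are hypothetical and are not part of Globus, OGSI, or WSRF; they only mirror the ideas listed on this slide.

```python
class StatelessAreaService:
    """Web-service style: each invocation carries everything it needs."""

    def invoke(self, width_m: float, height_m: float) -> float:
        # Nothing is remembered between calls.
        return width_m * height_m


class StatefulCounterService:
    """Grid-service style: an instance keeps state and has a lifetime."""

    def __init__(self) -> None:
        self.invocations = 0   # "publicly visible state"
        self.alive = True      # lifetime management flag

    def invoke(self) -> int:
        if not self.alive:
            raise RuntimeError("service instance has been destroyed")
        self.invocations += 1
        return self.invocations

    def destroy(self) -> None:
        # Explicitly end the lifetime of this service instance.
        self.alive = False
```

Two calls to the stateless service with the same inputs always give the same answer, while the stateful instance returns a different count on each call until it is destroyed.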

Page 7: Geospatial Service Workflow Concepts and Tools

Service operations

[Figure: service operations among Requestor, Broker, and Provider: 1. Publish, 2. Find, 3. Bind, 4. Chain]

Page 8: Geospatial Service Workflow Concepts and Tools

Service Operations

• Publish – advertise (or remove) data and services to a broker (e.g., a registry, catalog or clearinghouse).

• Find – Service requestors and service brokers collaborate to perform the find operation. Service requestors describe the kinds of services they’re looking for to the broker and the broker delivers the results that match the request.

• Bind – A service requestor and a service provider negotiate as appropriate so the requestor can access and invoke services of the provider.

• Chain – The chain operation binds a sequence of services.
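
The four operations can be sketched in Python with an in-memory registry standing in for the broker. This is only an illustration under simplifying assumptions: `Broker`, `chain`, and the two unit-conversion services are hypothetical names, and a real broker would be a remote registry or catalog rather than a dictionary.

```python
from typing import Callable, Dict, List, Set

Service = Callable[[float], float]


class Broker:
    """Hypothetical in-memory registry standing in for a catalog."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}
        self._keywords: Dict[str, Set[str]] = {}

    def publish(self, name: str, keywords: Set[str], service: Service) -> None:
        # Publish: advertise a service under searchable keywords.
        self._services[name] = service
        self._keywords[name] = set(keywords)

    def unpublish(self, name: str) -> None:
        # Publish also covers removing an advertisement.
        self._services.pop(name, None)
        self._keywords.pop(name, None)

    def find(self, keyword: str) -> List[str]:
        # Find: return the names of services matching the request.
        return sorted(n for n, kw in self._keywords.items() if keyword in kw)

    def bind(self, name: str) -> Service:
        # Bind: hand the requestor something it can invoke.
        return self._services[name]


def chain(services: List[Service]) -> Service:
    # Chain: bind a sequence of services; each output feeds the next input.
    def run(value: float) -> float:
        for service in services:
            value = service(value)
        return value
    return run


broker = Broker()
broker.publish("feet_to_meters", {"convert", "length"}, lambda ft: ft * 0.3048)
broker.publish("meters_to_km", {"convert", "length"}, lambda m: m / 1000.0)

names = broker.find("length")
to_km = chain([broker.bind("feet_to_meters"), broker.bind("meters_to_km")])
```

Here `to_km(10000.0)` converts feet to kilometers by running both bound services in order.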

Page 9: Geospatial Service Workflow Concepts and Tools

Service Chaining

• A Service Chain is defined as: a sequence of services where, for each adjacent pair of services, occurrence of the first action is necessary for the occurrence of the second action.

• When services are chained, they are combined in a dependent series to achieve larger tasks.

• Three types of chaining are defined in ISO 19119 and by OGC:

• User-defined (transparent) – the human user defines and manages the chain.

• Workflow-managed (translucent) – the human user invokes a service that manages and controls the chain, where the user is aware of the individual services in the chain.

• Aggregate (opaque) – the human user invokes a service that carries out the chain, where the user has no awareness of the individual services in the chain.
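
The three chaining types differ mainly in who runs the chain and how much of it the user can see. The following Python sketch illustrates that difference; the `subset` and `rescale` services and all function names are hypothetical and are not taken from ISO 19119 or OGC.

```python
from typing import Callable, List, Tuple

Pixels = List[float]
Step = Tuple[str, Callable[[Pixels], Pixels]]


def subset(pixels: Pixels) -> Pixels:
    """Hypothetical service: keep the first three values."""
    return pixels[:3]


def rescale(pixels: Pixels) -> Pixels:
    """Hypothetical service: map 0..255 values to 0..1."""
    return [p / 255.0 for p in pixels]


STEPS: List[Step] = [("subset", subset), ("rescale", rescale)]


def user_defined(pixels: Pixels) -> Pixels:
    # Transparent: the user calls each service and manages the order.
    return rescale(subset(pixels))


def workflow_managed(pixels: Pixels, steps: List[Step]) -> Tuple[Pixels, List[str]]:
    # Translucent: a workflow service runs the chain, and the user
    # still sees which individual services were executed.
    trace = []
    for name, service in steps:
        pixels = service(pixels)
        trace.append(name)
    return pixels, trace


def aggregate(pixels: Pixels) -> Pixels:
    # Opaque: one service call; the individual steps stay hidden.
    result, _ = workflow_managed(pixels, STEPS)
    return result
```

All three produce the same result for the same input; only the visibility and control of the intermediate services change.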

Page 10: Geospatial Service Workflow Concepts and Tools

Construction of Service Chains

• The first type of chaining allows users to construct a geospatial model to be run in the system.

– Requires domain knowledge; it lets experts contribute their domain knowledge.

– The knowledge is kept in the geo-tree/service chain.

• The second type of chaining basically uses an existing geo-tree to materialize a virtual object.

– Anyone can use this type of chaining to produce a virtual product on demand.

– It cannot produce a product whose geo-tree does not already exist in a data/information system.

• The third type of chaining requires the system to be intelligent enough to automatically form a geo-tree/service chain by decomposing the user’s query.

– Requires domain knowledge.

– Requires automated reasoning.

– Anyone can use it, and it can produce a new product automatically based on the user’s query.

• The first two types of chaining do not require significant machine intelligence.

– Current technology is sufficient to implement these chaining approaches.

• The third type requires significant machine intelligence.

– Current technologies are not able to provide this kind of chaining.

– Significant research is needed.

Page 11: Geospatial Service Workflow Concepts and Tools

Workflows and workflow tools

• What we mean:

– The executable scripts representing the service chains.

– The total composition and orchestration of an experimental run, including all the details of post-processing, data mining, and visualization.

– What the high-end user (scientist) needs to do in order to get the underlying computational code to produce accessible and usable results somewhere.

– What in the past was usually done through shell scripting, but involving more (e.g., RPCs).

– Previous examples: not a “single” workflow, but a number of decoupled, cooperating, communicating workflows.

• Workflows, in most cases, are encoded in BPEL4WS, an OASIS standard.

• Any tools dealing with the creation, management, and execution of workflows are called workflow tools.

– The most significant ones are the workflow engines that manage the execution of workflows.

Page 12: Geospatial Service Workflow Concepts and Tools

Steps from Geospatial process model to a user defined product (User geo-object)

[Figure: Geospatial Model -> Virtual geo-object -> Logical Workflow -> Concrete Workflow -> Workflow execution -> user geo-object, spanning the knowledge capture phase, the user query phase, and the user retrieval phase]
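
One way to picture the step from a logical workflow to a concrete workflow is as a binding step: each abstract service type is resolved to a specific service instance. The Python sketch below assumes that interpretation; the `ConcreteStep` type, the catalog dictionary, and the example.org endpoints are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ConcreteStep:
    service_type: str   # what kind of processing is needed
    endpoint: str       # which service instance will do it


def make_concrete(logical: List[str], catalog: Dict[str, str]) -> List[ConcreteStep]:
    """Bind each logical step (a service type) to a catalog endpoint."""
    concrete = []
    for service_type in logical:
        if service_type not in catalog:
            raise LookupError(f"no service registered for {service_type!r}")
        concrete.append(ConcreteStep(service_type, catalog[service_type]))
    return concrete


# Hypothetical logical workflow and catalog.
logical_workflow = ["subset", "reproject", "classify"]
catalog = {
    "subset": "http://example.org/services/subset",
    "reproject": "http://example.org/services/reproject",
    "classify": "http://example.org/services/classify",
}

concrete_workflow = make_concrete(logical_workflow, catalog)
```

The logical workflow stays valid when services move or are replaced; only the catalog and the resulting concrete workflow change.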

Page 13: Geospatial Service Workflow Concepts and Tools

Availability of Workflow Tools for Geospatial Services

• Tools are needed for every step from the creation of geospatial models to the materialization of virtual geospatial products.

• General workflow tools are being developed both in Grid and Web service communities.

• Most of the tools have not been tested in a geospatial environment.

Page 14: Geospatial Service Workflow Concepts and Tools

Workflow Tools built by GriPhyN

• Using Virtual Data Language (VDL) from Globus team to encode both abstract and concrete workflows.

• Build an abstract workflow based on VDL descriptions (Chimera)

• Build an executable workflow based on the abstract workflows (Pegasus)

• Execute the workflow (Condor’s DAGMan)

• Those tools run under Globus 2
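
The execution step amounts to running jobs in an order that respects their dependencies, that is, executing a directed acyclic graph (DAG). The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9 or later). It is not DAGMan's interface or file format; the job names and the `run_dag` helper are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_dag(jobs: Dict[str, Callable[[], None]],
            parents: Dict[str, Set[str]]) -> List[str]:
    """Run every job after all of its parent jobs; return the run order."""
    # TopologicalSorter takes a mapping of node -> set of predecessors.
    order = list(TopologicalSorter(parents).static_order())
    for name in order:
        jobs[name]()
    return order


log: List[str] = []
jobs = {
    "stage_in": lambda: log.append("copied input"),
    "process": lambda: log.append("ran model"),
    "stage_out": lambda: log.append("copied output"),
}
parents = {
    "stage_in": set(),
    "process": {"stage_in"},
    "stage_out": {"process"},
}

order = run_dag(jobs, parents)
```

A real engine would also submit jobs to remote resources, run independent jobs in parallel, and retry failures; this sketch only shows the ordering constraint.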

Page 15: Geospatial Service Workflow Concepts and Tools

Alliance Science Portal Expedition Workflow Tools Development

• Objective

– Provide a workflow tool (engine + interface) through which all of this can be accomplished without any knowledge of:

• XML

• Jython, Java, or any particular PL

– Provide a tool which is reusable in the sense of not being specific to any one scientific research domain

• Approach

1. Templated Patterns (a repertoire of pre-defined, parameterized “workflow scripts”)

• Just as with designing software systems in general ...

• High-level (Sequence, Branch/Merge, Parallel, ...)

• Extend these down several levels, e.g.:

– “STAGE” = [ make dirs, get files, set permissions ]

2. An Environment through which the high-level user can create and manipulate workflow scripts.
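
A templated pattern such as “STAGE” can be thought of as a parameterized function that expands into a sequence of lower-level tasks. The Python sketch below assumes that reading; `sequence` and `stage` are hypothetical helpers written for illustration and do not reproduce the portal's actual tool.

```python
import os
import shutil
import stat
import tempfile
from typing import Callable, List

Task = Callable[[], object]


def sequence(tasks: List[Task]) -> Task:
    """High-level 'Sequence' pattern: run tasks one after another."""
    def run() -> None:
        for task in tasks:
            task()
    return run


def stage(source: str, work_dir: str) -> Task:
    """'STAGE' template = [ make dirs, get files, set permissions ]."""
    target = os.path.join(work_dir, os.path.basename(source))
    return sequence([
        lambda: os.makedirs(work_dir, exist_ok=True),            # make dirs
        lambda: shutil.copy(source, target),                     # get files
        lambda: os.chmod(target, stat.S_IRUSR | stat.S_IWUSR),   # set permissions
    ])


# Demonstration with a throwaway directory and a small input file.
root = tempfile.mkdtemp()
source = os.path.join(root, "input.dat")
with open(source, "w") as handle:
    handle.write("demo")

work_dir = os.path.join(root, "run1")
stage(source, work_dir)()
staged = os.path.join(work_dir, "input.dat")
```

The user supplies only the parameters (which file, which directory); the template decides which low-level tasks run and in what order.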

Page 16: Geospatial Service Workflow Concepts and Tools

O.G.R.E.: An Extension to Apache Ant

• O.G.R.E. = Open Grid Computing Environments Runtime Engine

• What Ant lacked, but we needed:

1. Broader conditional execution

• Ant: based on write-once String properties.

2. A general “loop” structure for Task execution.

3. Data-communication between Tasks (and with their containers).

4. Specialized tasks

– File reading and writing

– Local and remote file management (GridFTP)

– Web service related tasks

– Event- and process-monitoring tasks

Page 17: Geospatial Service Workflow Concepts and Tools

Workflow Execution Engines in Web Services

• We are examining two workflow execution engines

– IBM BPWS4J: http://www.alphaworks.ibm.com/tech/bpws4j

– The Collaxa BPEL Server

• IBM BPWS4J is free software, while the Collaxa BPEL Server is commercial software.

– Collaxa BPEL Server, Developer Edition: $2K per developer

– Collaxa BPEL Server, Enterprise Edition: $20K per CPU

• Both engines work under the web service environment.

• Questions that need to be answered:

– Are the engines good enough for geospatial Grid/Web services?

– Can we make those engines work under the Grid environment?

– What is the evolution of Grid workflow standards and the execution engines?

Page 18: Geospatial Service Workflow Concepts and Tools

BPWS4J -- The BPEL Engine for Execution

• What is BPWS4J?

The IBM Business Process Execution Language for Web Services Java Runtime provides a platform upon which business processes written using BPEL4WS may execute. BPWS4j-engine-2.0 version supports the BPEL4WS v1.1 specification.

• How does it work?

For each process, the engine takes in a BPEL4WS document which describes the process, a WSDL document (without binding information) which describes the interface that the process will present to clients, and WSDL documents (with binding information) which describe the services that the process may/will invoke during its execution. After deployment the process will be made available to outside consumers through a SOAP interface.

The engine has been tested on WebSphere Application Server 5.0 and on Apache Tomcat under both Linux and Windows.

** Note: This and the next slide are from the BPWS4J documentation.

Page 19: Geospatial Service Workflow Concepts and Tools

Developing and Deploying a Process

Step 1: Create a BPEL4WS document and the corresponding WSDL document. The WSDL document describes the interface of the process that will be presented to the outside world. (This includes the description of all receive and onMessage elements.) The WSDL document should not contain any bindings; the SOAP binding will be added by the engine during deployment. One service element must be present within the WSDL file (the name of the process is taken from the name attribute on the service element).

Step 2: If the process invokes another Web service (i.e. if the process contains an invoke activity), then create/obtain the WSDL document(s) that describe the service which is to be invoked.

These WSDL documents must have bindings and endpoint information that describe where and how the service may be invoked. The engine supports SOAP, EJB, JMS, and direct Java class bindings.

Step 3: Deploy the process to the engine. When deploying the process, you will need to specify the WSDL documents which fulfill the partner roles.

Step 4: Create the SOAP client.

The client interaction with the service is defined by the process's WSDL document that you provided during deployment.
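
As a rough illustration of Step 4, the sketch below builds a SOAP 1.1 request in Python using only the standard library. It is not taken from the BPWS4J documentation: the endpoint URL, the `urn:example:geoprocess` namespace, the `runProcess` operation, and the `region` parameter are all hypothetical, and a real client would take these from the process's WSDL document.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PROCESS_NS = "urn:example:geoprocess"                  # hypothetical process namespace


def build_envelope(operation: str, params: dict) -> bytes:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{PROCESS_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def build_request(endpoint: str, operation: str, params: dict) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST carrying the envelope."""
    return urllib.request.Request(
        endpoint,
        data=build_envelope(operation, params),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{PROCESS_NS}/{operation}"',
        },
        method="POST",
    )


# Hypothetical endpoint; the request is only constructed here.
request = build_request(
    "http://localhost:8080/hypothetical/process", "runProcess", {"region": "VA"}
)
# urllib.request.urlopen(request) would send it to a deployed process.
```

In practice a SOAP toolkit that generates client stubs from the WSDL would normally replace this hand-built envelope.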

Additional Notes: All imports within the WSDL documents must be absolute. If you are deploying on Tomcat and have WSDL documents which have imports, you must make sure that you have defined the .wsdl extension and text/xml MIME type to Tomcat, otherwise it will complain about not being able to resolve the imports. You can do so either by modifying the conf/web.xml under Tomcat, or by modifying the WEB-INF/web.xml file within your WAR file. See the web.xml file in the engine's WAR for an example.

Page 20: Geospatial Service Workflow Concepts and Tools

The Collaxa BPEL Server

• Native BPEL 1.1 Implementation

• Easy-to-Use Modeling Tool

• Rich and Flexible Binding Framework (Web Services but also JCA, JMS, Email, EDI)

• Unparalleled Management and Monitoring (In-flight Instance Management, Auditing, Debugging)

• High Performance and Scalability (Throughput, Clustering, Large XML Documents)

• Easy-to-deploy/Non-intrusive (Get up and running in less than 15 minutes)

Page 21: Geospatial Service Workflow Concepts and Tools

The Collaxa BPEL Server

[Figure: Collaxa BPEL Server architecture on the Java platform: BPEL Designer in Eclipse (design); BPEL Task Service (tasks, portal); BPEL Console (monitor); WSDL Binding Framework with JCA, JMS, and Email (connect); BPEL Server (dehydrate)]