provenance aware service oriented architecture (1 year on) pasoa

Post on 05-Jan-2016



Provenance Aware Service Oriented Architecture (1 year on)

www.pasoa.org

Professor Luc Moreau, University of Southampton, L.Moreau@ecs.soton.ac.uk

The PASOA Team

PASOA Southampton: Simon Miles, Paul Groth, Miguel Branco, Luc Moreau

PASOA Cardiff: Ian Wootten, Shrija Rajbhandari, Omer Rana, David Walker

Provenance Definition

Merriam-Webster Online dictionary: the origin, source; the history of ownership of a valued object or work of art or literature

The provenance of a piece of data is the process that led to the data

Our aim is to conceive a computer-based representation of provenance that allows us to perform useful analysis and reasoning to support our use cases

Provenance Use Cases (1)

High Energy Physics: tracking, analysing, verifying data sets in the ATLAS Experiment of the Large Hadron Collider (CERN)

Bioinformatics: verification of “experiment validity”.

Provenance Use Cases (2)

Aerospace engineering: maintain a historical record of design processes, up to 99 years.

Organ transplant management: tracking of previous decisions, crucial to maximise the efficiency in matching and recovery rate of patients

The Provenance Problem

Given a set of services in an open grid environment that are composed in order to produce a given result, how can we determine the process that generated the result (especially after their composition, i.e., the virtual organisation, has been disbanded)?

Provenance “Lifecycle”

[Diagram: an application produces results and records documentation of execution in the Provenance Store; the store supports querying the provenance of data and managing the store and its contents.]

Core Interfaces to Provenance Store

Logical Architecture

Adopted by EU Provenance as strawman [Miles et al. 05]

Recording & Querying

PReP [Groth et al. 04]

Protocol adopted by application components

Allow for multiple provenance stores (scalability)

Query Interface [Miles et al. 05]

Purpose: obtain the provenance of some specific data

Purpose: allow for “navigation” of the data structure representing provenance

Abstract interface: allows us to view the provenance store as if containing XML data structures

Abstract interface: based on XPath and XQuery
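
A minimal sketch of what viewing the store "as if containing XML data structures" could look like. The XML layout, element names, and the helper `trace` are assumptions made for illustration and are not the PASOA schema or query interface. Python's `xml.etree.ElementTree` supports only a subset of XPath, which is enough here to find the interaction that produced a data item and to navigate back through its inputs.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML view of a provenance store. Element and attribute
# names are illustrative assumptions, not the PASOA schema.
STORE_XML = """
<provenanceStore>
  <interaction id="i1" receiver="compress-service">
    <invocation><data ref="seq-42"/></invocation>
    <result><data ref="size-42"/></result>
  </interaction>
  <interaction id="i2" receiver="collate-service">
    <invocation><data ref="size-42"/></invocation>
    <result><data ref="table-7"/></result>
  </interaction>
</provenanceStore>
""".strip()


def trace(root, data_ref):
    """Walk backwards from a data item through the interactions that led to it.

    Returns a list of (interaction id, input data refs) pairs, starting with
    the interaction whose result contains data_ref.
    """
    steps = []
    for interaction in root.findall("interaction"):
        # XPath-subset query: does this interaction's result hold the item?
        if interaction.find(f"result/data[@ref='{data_ref}']") is not None:
            inputs = [d.get("ref") for d in interaction.findall("invocation/data")]
            steps.append((interaction.get("id"), inputs))
            for ref in inputs:
                steps.extend(trace(root, ref))
    return steps


root = ET.fromstring(STORE_XML)
provenance_of_table = trace(root, "table-7")
```

Here `provenance_of_table` lists interaction `i2` first and then `i1`, the interaction that produced the input to `i2`.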

[Diagram: a client sends an invocation to a service and receives a result; the invocation and result are recorded in Provenance Stores on both sides.]
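
The recording pattern in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the PReP message format: the class `ProvenanceStore`, its `record` method, and the function `invoke_with_recording` are hypothetical names. Both the client and the service document the same interaction, and each may use a different store.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceStore:
    """Hypothetical in-memory store; not the PReServ implementation."""
    name: str
    records: list = field(default_factory=list)

    def record(self, interaction_id, asserter, kind, content):
        # Each record says who asserted what about which interaction.
        self.records.append(
            {"interaction": interaction_id, "asserter": asserter,
             "kind": kind, "content": content}
        )


def invoke_with_recording(interaction_id, service, argument,
                          client_store, service_store):
    """Call the service and let both parties record invocation and result."""
    client_store.record(interaction_id, "client", "invocation", argument)
    service_store.record(interaction_id, "service", "invocation", argument)
    result = service(argument)
    service_store.record(interaction_id, "service", "result", result)
    client_store.record(interaction_id, "client", "result", result)
    return result


store_a = ProvenanceStore("store-a")
store_b = ProvenanceStore("store-b")
length = invoke_with_recording("i1", len, "MKVLAA", store_a, store_b)
```

Because both parties share the interaction identifier, records held in separate stores can later be matched up when querying.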

Assertions about Performance and Availability

A taxonomy of gathered information about performance

Recorded (invocation start/end time and counts)

Derived from Recorded Information (averages)

Queried against other actor owned metrics

Compilation of assertions in a measure of trust (both from service and client perspective)

[Wootten, Rana 05]
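
As an illustration of the first two levels of the taxonomy, the sketch below derives counts and average durations from recorded start/end times. The record layout and the function `derive_metrics` are assumptions for this example, and the timings are made-up sample values, not measurements.

```python
from statistics import mean

# Recorded information: hypothetical invocation records with made-up times.
recorded = [
    {"service": "compress", "start": 0.0, "end": 1.5},
    {"service": "compress", "start": 2.0, "end": 3.0},
    {"service": "collate", "start": 3.0, "end": 3.5},
]


def derive_metrics(invocations):
    """Derive per-service invocation counts and average durations."""
    durations = {}
    for inv in invocations:
        durations.setdefault(inv["service"], []).append(inv["end"] - inv["start"])
    return {
        service: {"count": len(values), "average_duration": mean(values)}
        for service, values in durations.items()
    }


metrics = derive_metrics(recorded)
```

Derived values such as these are what could then be compared against metrics published by other actors.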

Trust is a subjective probability that an actor will perform a particular action [Gambetta]

[Rajbhandari, Rana 05]

PReServ [Groth et al. 05]

Implementation of PReP protocol and Query Interface

Provenance store implemented as a Web Service

Client side libraries for using Provenance Store

Axis Handler for automatically recording communication between Axis-based Web Services

[Diagram: PReServ components. Labels: Axis Handler; Provenance Service; Backend Store Interface; Backend Stores (File System Store, In-Memory Store, …); PS Client Side Library; Web Service; WS Client; Query Actor. Legend: WS calls, Java calls.]
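
The idea of a backend store interface with interchangeable file-system and in-memory stores can be sketched as below. PReServ itself is Java-based; this Python sketch only illustrates the design, and every class and method name in it (`BackendStore`, `put`, `get`, `ProvenanceService`) is an assumption rather than the PReServ API.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class BackendStore(ABC):
    """Hypothetical backend interface the provenance service depends on."""

    @abstractmethod
    def put(self, key, record): ...

    @abstractmethod
    def get(self, key): ...


class InMemoryStore(BackendStore):
    def __init__(self):
        self._records = {}

    def put(self, key, record):
        self._records[key] = record

    def get(self, key):
        return self._records[key]


class FileSystemStore(BackendStore):
    def __init__(self, directory):
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def put(self, key, record):
        # One JSON file per record, named after its key.
        (self._dir / f"{key}.json").write_text(json.dumps(record))

    def get(self, key):
        return json.loads((self._dir / f"{key}.json").read_text())


class ProvenanceService:
    """Service logic that is independent of the chosen backend."""

    def __init__(self, backend):
        self._backend = backend

    def record(self, key, record):
        self._backend.put(key, record)

    def lookup(self, key):
        return self._backend.get(key)


with tempfile.TemporaryDirectory() as tmp:
    results = []
    for backend in (InMemoryStore(), FileSystemStore(tmp)):
        service = ProvenanceService(backend)
        service.record("i1", {"kind": "invocation", "content": "seq-42"})
        results.append(service.lookup("i1"))
```

Swapping the backend changes where records live without touching the service logic, which is the point of placing an interface between the two.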

Bioinformatics Application

Bioinformatics workflow studying compressibility of biological sequences

Implemented as a VDT workflow, scheduled by Condor

Each service, script, command records provenance

[HPDC’05]

Bioinformatics Application (2)

Use case: algorithm verification

A bioinformatician, A, downloads a protein sequence from the RefSeq database and runs the compressibility experiment.

A later performs the same experiment on the same sequence data, again downloaded from RefSeq.

A compares the two experiment results and notices a difference.

A determines whether the difference was caused by the algorithms used to process the sequence data having been changed.
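
One way the last step of this use case could work is sketched here, assuming each run's provenance records a fingerprint of the algorithm used at every step. The step names, the fingerprinting with SHA-256, and the function `changed_algorithms` are illustrative assumptions, not the method of the cited paper.

```python
import hashlib


def fingerprint(source_text):
    """Fingerprint an algorithm by hashing its source or command line."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def changed_algorithms(run_a, run_b):
    """Return the steps whose recorded fingerprint differs between two runs."""
    common_steps = run_a.keys() & run_b.keys()
    return sorted(step for step in common_steps if run_a[step] != run_b[step])


# Hypothetical provenance of two runs: step name -> algorithm fingerprint.
first_run = {
    "encode": fingerprint("encode v1"),
    "compress": fingerprint("gzip -9"),
}
second_run = {
    "encode": fingerprint("encode v1"),
    "compress": fingerprint("bzip2 -9"),
}

differences = changed_algorithms(first_run, second_run)
```

An empty result would suggest that the algorithms were unchanged and that the difference in results has another cause, such as changed input data.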

Bioinformatics Application (3)

[Plots: Recording Scalability; Querying Scalability]

Other Applications

EU Provenance project: pre-prototype about baking cakes

e-Demand: detect sharing of services in workflow execution to offer more resilient execution [Townend et al 05][Xu et al 05]
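
A rough sketch of detecting shared services, assuming provenance tells us which services each redundant branch of a workflow invoked. The branch and service names and the function `shared_services` are hypothetical, and this is not the scheme of the cited papers.

```python
def shared_services(branches):
    """Map each service used by more than one branch to those branches."""
    users = {}
    for branch, services in branches.items():
        for service in set(services):
            users.setdefault(service, set()).add(branch)
    return {s: sorted(b) for s, b in users.items() if len(b) > 1}


# Hypothetical provenance: services invoked by three redundant branches.
invoked = {
    "branch-1": ["parser-A", "solver-X"],
    "branch-2": ["parser-B", "solver-X"],
    "branch-3": ["parser-C", "solver-Y"],
}

overlap = shared_services(invoked)
```

Branches that share a service are not fully independent, so a fault in that service could affect all of them at once.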

Conclusions

Mostly unexplored area that is crucial for developing trusted systems

Current work: system and protocol design, architecture specification, generic support for use cases

Pursue deployment in concrete applications and performance evaluation

Download our software from www.pasoa.org

Tell us about your use cases: we are keen to find new collaborations in this space!

Talk to Paul and Simon

Publications

1. Paul Groth, Simon Miles, Weijian Fang, Sylvia C. Wong, Klaus-Peter Zauner, and Luc Moreau. Recording and Using Provenance in a Protein Compressibility Experiment. In Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing (HPDC'05), July 2005.

2. Paul T. Groth. Recording Provenance in Service-Oriented Architectures. 9 Month Report, University of Southampton; Faculty of Engineering, Science and Mathematics; School of Electronics and Computer Science, 2004.

3. Paul Groth, Michael Luck, and Luc Moreau. A protocol for recording provenance in service-oriented Grids. In Proceedings of the 8th International Conference on Principles of Distributed Systems (OPODIS'04), Grenoble, France, December 2004.

4. Paul Groth, Michael Luck, and Luc Moreau. Formalising a protocol for recording provenance in Grids. In Proceedings of the UK OST e-Science second All Hands Meeting 2004 (AHM'04), Nottingham, UK, September 2004.

5. Simon Miles, Paul Groth, Miguel Branco, and Luc Moreau. The requirements of recording and using provenance in e-Science experiments. Technical report, University of Southampton, 2005.

6. Luc Moreau, Syd Chapman, Andreas Schreiber, Rolf Hempel, Omer Rana, Laszlo Varga, Ulises Cortes, and Steven Willmott. Provenance-based Trust for Grid Computing: Position Paper. 2003.

7. Paul Townend, Paul Groth, and Jie Xu. A Provenance-Aware Weighted Fault Tolerance Scheme for Service-Based Applications. In Proc. of the 8th IEEE International Symposium on Object-oriented Real-time distributed Computing (ISORC 2005), May 2005.
