Recording Actor Provenance in Scientific Workflows
Ian Wootten, Shrija Rajbhandari, Omer Rana
I.M.Wootten@cs.cf.ac.uk
Cardiff University, UK
What?
- Provenance is concerned with process
  - This may or may not be documented
- Data Provenance: the process which leads to a particular piece of data
- Actor Provenance: the process which leads to a particular actor state
  - How an actor (client or service) arrived at a particular state during an interaction (for stateless actors)
What? Actor Provenance
[Diagram labels: Service; Enactment Engine; Service; A1; A2; B1; B2]
- Interaction Assertions: asserting the contents of a message by an actor sending or receiving it.
- Actor State Assertions: asserting the state of an actor at a particular time during an interaction.
Metrics for Actor State Assertion
- Static: no variation in value over actor lifetime
  - Per Node: node identity, operating system
  - Per Actor: actor identity, name, owner, version
- Dynamic: variation in value over actor lifetime
  - Per Node: memory usage, network traffic
  - Per Actor: execution time, availability
- Instrumented: actor is ‘instrumented’ at key points in its execution
  - Description of internal data flow
  - E.g. German Aerospace Center (DLR)
    - Completion states for action events and file transfers
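The three metric categories above could be captured in a small data model. The following is a minimal sketch, not the authors' representation: the names `MetricKind`, `Scope`, `Metric` and `ActorStateAssertion` are hypothetical, the sample values are invented for illustration, and giving instrumented metrics an actor scope is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, List


class MetricKind(Enum):
    STATIC = "static"              # no variation over the actor lifetime
    DYNAMIC = "dynamic"            # varies over the actor lifetime
    INSTRUMENTED = "instrumented"  # emitted at key points in execution


class Scope(Enum):
    NODE = "node"
    ACTOR = "actor"


@dataclass(frozen=True)
class Metric:
    name: str
    kind: MetricKind
    scope: Scope
    value: Any


@dataclass
class ActorStateAssertion:
    """State of one actor at one moment during an interaction (hypothetical shape)."""
    actor_id: str
    interaction_id: str
    timestamp: float
    metrics: List[Metric] = field(default_factory=list)

    def of_kind(self, kind: MetricKind) -> List[Metric]:
        """Return the metrics of the given kind, in recorded order."""
        return [m for m in self.metrics if m.kind is kind]


# Invented sample values, for illustration only.
assertion = ActorStateAssertion(
    actor_id="modelling-service",
    interaction_id="interaction-1",
    timestamp=0.0,
    metrics=[
        Metric("operating_system", MetricKind.STATIC, Scope.NODE, "Linux"),
        Metric("version", MetricKind.STATIC, Scope.ACTOR, "1.0"),
        Metric("memory_usage_kb", MetricKind.DYNAMIC, Scope.NODE, 20480),
        Metric("execution_time_ms", MetricKind.DYNAMIC, Scope.ACTOR, 1250),
        # Scope for instrumented metrics is an assumption of this sketch.
        Metric("file_transfer", MetricKind.INSTRUMENTED, Scope.ACTOR, "completed"),
    ],
)
```

Separating kind from scope mirrors the slide's two axes (static/dynamic/instrumented and per node/per actor), so a consumer can filter on either.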
How? Actor Provenance
[Diagram labels: Service; Enactment Engine; Service; B1; B2; M1; M2; Instrumented Output; Monitor Output]
- Monitoring Sources: service information derived from the hosting platform via monitoring sources (e.g. Ganglia)
- Instrumented Actor: service information obtained from instrumented points within an actor
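The two capture paths can be contrasted with a small sketch. It assumes a decorator is an acceptable way to mark a key point in an actor; `instrumented`, `MonitoringSource` and `build_model` are hypothetical names, and `MonitoringSource` is only a stand-in that does not use Ganglia's real interface.

```python
import time
from functools import wraps
from typing import Callable, Dict, List

# Hypothetical sink for instrumented output; a real actor would send this
# to wherever actor state assertions are recorded.
instrumented_output: List[Dict] = []


def instrumented(point: str) -> Callable:
    """Mark a function as a key point; record its completion state and duration."""
    def decorate(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            state = "failed"
            try:
                result = func(*args, **kwargs)
                state = "completed"
                return result
            finally:
                # Runs on both success and failure, so every call is recorded.
                instrumented_output.append({
                    "point": point,
                    "state": state,
                    "elapsed_ms": (time.perf_counter() - start) * 1000.0,
                })
        return wrapper
    return decorate


class MonitoringSource:
    """Stand-in for a platform monitor; returns fixed readings for illustration."""

    def __init__(self, readings: Dict[str, float]):
        self._readings = readings

    def read(self, metric: str) -> float:
        return self._readings[metric]


@instrumented("build_model")
def build_model(rows: List[float]) -> float:
    return sum(rows) / len(rows)


monitor = MonitoringSource({"memory_usage_kb": 20480.0})
model = build_model([1.0, 2.0, 3.0])                                     # instrumented path
monitor_output = {"memory_usage_kb": monitor.read("memory_usage_kb")}   # monitoring path
```

The instrumented path needs changes inside the actor, whereas the monitoring path observes the hosting platform from outside.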
Why? Standalone and Combined Value
- Standalone State Assertion Value
  - Actor Selection
    - Performance: evaluation of past / prediction of future
  - Resource Allocation
    - Actor administrator allocates resources according to performance metrics
- Combined Value: Putting Assertions into Context
  - Interaction, through Actor State Assertions
    - Determining the likely cause of error / results
    - Understanding what an actor is doing
  - Actor, through Interaction Assertions
    - Understanding performance pattern observations
    - Understanding instrumented metric observations
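As one illustration of standalone value, past execution-time assertions could drive actor selection. This is a minimal sketch, assuming assertions are available as (actor, execution time) pairs and that the mean of past times is an acceptable predictor of future performance; neither assumption comes from the slides, and `select_actor` and the sample history are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def select_actor(history: Iterable[Tuple[str, float]]) -> str:
    """Return the actor with the lowest mean recorded execution time (ms)."""
    times: Dict[str, List[float]] = defaultdict(list)
    for actor_id, execution_time_ms in history:
        times[actor_id].append(execution_time_ms)
    if not times:
        raise ValueError("no assertions recorded")
    return min(times, key=lambda actor_id: mean(times[actor_id]))


# Invented execution times, for illustration only.
history = [
    ("service-a", 1200.0),
    ("service-b", 900.0),
    ("service-a", 1000.0),
    ("service-b", 1100.0),
    ("service-c", 1500.0),
]
chosen = select_actor(history)  # means: a = 1100, b = 1000, c = 1500
```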
How? Actor Provenance Registry
- Attempt to provide a mechanism to specify and record actor state assertions for any application
- Generic Mechanism Problems
  - No knowledge of potential resources
    - Monitoring sources, containers
  - No direct knowledge of implementation
    - Instrumented data capture
How? Actor Provenance Registry
- Resource and Rule Registration
  - Resource: monitoring tool
  - Rule: user-defined instructions
- Indirectly from Resources
  - Coordinator polls resources for information
  - Times of interest: service invocation, request
- Directly from actor
  - Collection of instrumented data
- Representation?
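The registration and polling steps above might be sketched as follows. This is an illustrative outline under stated assumptions, not the registry's actual interface: `Registry`, `register_resource`, `register_rule`, `on_event` and `record_instrumented` are hypothetical names, and rules are plain Python callables here purely for illustration.

```python
from typing import Callable, Dict, List

Resource = Callable[[], Dict[str, float]]               # returns current readings
Rule = Callable[[Dict[str, float]], Dict[str, float]]   # selects or derives values


class Registry:
    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}
        self._rules: List[Rule] = []
        self.recorded: List[Dict] = []

    def register_resource(self, name: str, resource: Resource) -> None:
        self._resources[name] = resource

    def register_rule(self, rule: Rule) -> None:
        self._rules.append(rule)

    def on_event(self, event: str) -> None:
        """Indirect path: poll every resource, apply every rule, record the results."""
        readings: Dict[str, float] = {}
        for resource in self._resources.values():
            readings.update(resource())
        for rule in self._rules:
            self.recorded.append({"event": event, "values": rule(readings)})

    def record_instrumented(self, event: str, values: Dict[str, float]) -> None:
        """Direct path: the actor hands over instrumented data itself."""
        self.recorded.append({"event": event, "values": values})


registry = Registry()
# Invented readings from a pretend monitoring tool.
registry.register_resource("node-monitor", lambda: {"memory_kb": 20480.0, "net_kbps": 12.5})
registry.register_rule(lambda readings: {"memory_kb": readings["memory_kb"]})

# Times of interest named on the slide.
registry.on_event("request")
registry.on_event("service invocation")
```

Keeping resources and rules as separately registered items reflects the slide's point that a generic mechanism cannot know the available monitoring tools in advance.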
How? Actor Provenance Registry
- Integration with PReP [Groth et al.]
[Diagram labels: Provenance Store; Client; Service; Invoke; Result; Record Provenance (twice); Record Actor Provenance (twice); Registry (twice); Monitoring Sources (twice); Local Store (twice)]
Data Mining Prototype
- Record assertions using the registry during invocation of a data modelling service
- Service takes incoming data sets and generates a model based upon them
  - Uses Quantitative Structure-Activity Relationship (QSAR) to attempt to correlate biological activity to a chemical compound
  - Larger data set = longer run time
Performance Evaluation
[Figure: Invocation Time (ms), axis 0 to 40000, against Size of Data Set (KB), axis 0 to 400; series: No rules, 1 rule, 5 rules]
Conclusions / Future Work
- Actor Provenance data is important
  - Without it, we don’t get the full picture
- Prototype shows that it can be done
  - Room for improvement
    - Interface to Monitoring System
      - Caching of results
    - No inclusion of ‘instrumented’ actor capture
      - Requires service provider adoption to work
Prototype Configuration
- Single machine holding client, service and registry
- Rules executed on invocation of service
  - XQuery
- Invocations performed 100 times on datasets between 30KB and 340KB in size
- Coordinator records rule results to a local file store
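A measurement loop in the spirit of this configuration could look like the sketch below. It is not the prototype: `invoke_service` is a trivial stand-in for the data modelling service, rules are Python callables rather than XQuery, the JSON-lines file is an assumed format for the local file store, and no timing figures from the talk are reproduced.

```python
import json
import os
import tempfile
import time
from typing import Callable, Dict, List

Rule = Callable[[], Dict[str, float]]


def invoke_service(data: bytes) -> int:
    """Stand-in for the data modelling service; work grows with input size."""
    return sum(data) % 251


def timed_invocation(data: bytes, rules: List[Rule], store_path: str) -> float:
    """Invoke once, run each rule, append rule results to the store; return elapsed ms."""
    start = time.perf_counter()
    invoke_service(data)
    with open(store_path, "a", encoding="utf-8") as store:
        for rule in rules:
            store.write(json.dumps(rule()) + "\n")
    return (time.perf_counter() - start) * 1000.0


def mean_invocation_ms(size_kb: int, n_rules: int, repeats: int, store_path: str) -> float:
    """Average the invocation time over `repeats` runs for one data set size."""
    data = bytes(size_kb * 1024)
    # Hypothetical rule returning a fixed value; repeated n_rules times.
    rules: List[Rule] = [lambda: {"memory_kb": 0.0}] * n_rules
    total = sum(timed_invocation(data, rules, store_path) for _ in range(repeats))
    return total / repeats


store_path = os.path.join(tempfile.mkdtemp(), "rule_results.jsonl")
results = {
    n_rules: mean_invocation_ms(size_kb=30, n_rules=n_rules, repeats=100, store_path=store_path)
    for n_rules in (0, 1, 5)
}
with open(store_path, encoding="utf-8") as store:
    recorded_lines = sum(1 for _ in store)
```

Timing the rule execution and the file write together with the invocation matches the idea of measuring the overhead that recording adds to each call.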