Ryan Fraser, Nicholas Carr
Connecting RDSI, NeCTAR and ANDS via Provenance - VHIRL
Outline
● What are VLs?
● What is provenance?
● How do we represent VLs using standardised provenance?
What are VLs?
From https://nectar.org.au/virtual-laboratories-1, they are:
● data repositories and computational tools and streamlining research workflows
Connecting the commons with VHIRL and Provenance
What is provenance?
From http://en.wikipedia.org/wiki/Provenance#Computer_Science:
“Computer science uses the term provenance to mean the lineage of data or processes, as per data provenance. However there is a field of informatics research within computer science called provenance that studies how provenance of data and processes should be characterised, stored and used. Semantic web standards bodies, such as the World Wide Web Consortium, ratified a standard for provenance representation in 2013, known as PROV.”
Do you make decisions? Yes. Should someone remember how you made those decisions? Yes = PROV
Data Services
Data Layers discovered
Layers consist of numerous remote data services
PROV:
a) Service: captures data service information (hosted on RDS)
b) Captures subset details of data selected
Subset Selected for processing
Compute/Storage Services
Flexibility in what compute provider to utilise
PROV: Captures job details, login info, and where/what/when/how it was computed, etc.
Includes all relevant NeCTAR details for cloud processing
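As a rough illustration of the job details listed above, the sketch below records one cloud job as a PROV-style activity. It is not VHIRL's actual implementation: the function name `job_activity`, the dictionary layout and every example value are assumptions. Only the `prov:` terms are taken from the W3C PROV-O vocabulary.

```python
from datetime import datetime, timezone


def job_activity(job_id, provider, user, started, ended, command):
    """Describe one compute job as a PROV-style activity (plain dict).

    Hypothetical layout: keys prefixed with "prov:" are PROV-O terms,
    "command" is an illustrative extra field for the "how".
    """
    if ended < started:
        raise ValueError("job cannot end before it starts")
    return {
        "id": job_id,                                # what
        "type": "prov:Activity",
        "prov:atLocation": provider,                 # where
        "prov:startedAtTime": started.isoformat(),   # when
        "prov:endedAtTime": ended.isoformat(),
        "prov:wasAssociatedWith": user,              # who (no credentials stored)
        "command": command,                          # how
    }


record = job_activity(
    job_id="job-0001",                     # hypothetical identifier
    provider="NeCTAR Cloud",
    user="user:jane",                      # hypothetical user
    started=datetime(2015, 1, 1, 9, 0, tzinfo=timezone.utc),
    ended=datetime(2015, 1, 1, 9, 30, tzinfo=timezone.utc),
    command="python run_model.py",         # hypothetical command
)
print(record["prov:startedAtTime"])
```

Storing a reference to the user rather than the login credentials themselves is a design choice of this sketch, not something the slides specify.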
Available Toolboxes
TCRM – estimates wind speed from cyclones and severe wind
ANUGA – estimates inundation from riverine floods, tsunami, dam breaks and storm surges
PROV: Captures code utilised along with “how” it is used (template/input files)
Example for tsunami inundation
PROV: Captures location (PID) of where input files/scripts are persisted
Processing Services
The steps so far have been building an environment to run a processing script
Either write your own script...
...or build from existing templates
...when you’re done, it will be submitted for processing on the Cloud!
PROV: Captures location (PID) of where input files/scripts are persisted
PROV: Finalised outputs are persisted with PIDs on RDS and captured in the PROV information
PROV: After the job is completed, the finalised PROV record is published to the provenance store
PROV record endpoints could be registered in ANDS RDA alongside the output data!
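The publishing step could look roughly like the sketch below, which builds (but does not send) an HTTP POST carrying a PROV record. The slides do not describe the provenance store's interface, so the store URL, the use of Turtle as the payload format and the function name `build_publish_request` are all assumptions made for illustration.

```python
from urllib.request import Request


def build_publish_request(store_url, turtle_document):
    """Build (but do not send) an HTTP POST carrying a PROV record.

    Assumes the store accepts a Turtle document; that is a guess,
    not a documented interface.
    """
    return Request(
        store_url,
        data=turtle_document.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


# Minimal illustrative PROV record; the ex: names are made up.
prov_record = (
    "@prefix prov: <http://www.w3.org/ns/prov#> .\n"
    "@prefix ex: <http://example.org/> .\n"
    "ex:output a prov:Entity ; prov:wasGeneratedBy ex:job .\n"
)
request = build_publish_request(
    "http://example.org/provenance-store/reports",  # hypothetical endpoint
    prov_record,
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually send it. The sketch stops short of that because the endpoint here is a placeholder.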
Components of the Virtual Hazard Impact & Risk Laboratory (VHIRL)
[Diagram] Component groups: Data Services, Processing Services, Compute Services, Enablers, Virtual Laboratories/Apps, Data Analytics
Components shown: Magnetics, Gravity, DEM, Landsat, Bathymetry, eScript, ANUGA, TCRM, Cyclone Wind Model, Surface Wave Propagation (earthquake), NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop, Service Orchestration, Provenance Metadata, Auth., Coastal Inundation, Tsunami Inundation Scenario, Cyclone Wind Path Calculation
Basic scientific data processing model - 1
Input Data → Process → Output Data
Background: How do we represent VLs using standardised provenance?
Basic scientific data processing model - 2
Code, Config, Input Data (input item roles) → Process → Output Data
Basic scientific data processing model - 3, PROV
PROV classes: Entity, Activity, Agent
● Entities: Code, Config, Input Data, Output Data
● Activity: Process
● Agents: Who / which system, Who
● Relations: used, wasGeneratedBy, wasAssociatedWith, wasAttributedTo
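One way to read this slide is as a small set of PROV statements about a single processing run. The sketch below writes them out as (subject, predicate, object) triples. The relation names and their directions follow the W3C PROV-O vocabulary; the function name `describe_run` and the `ex:` identifiers are hypothetical, and using one agent for both agent relations is a simplification of this sketch.

```python
def describe_run(code, config, input_data, output_data, process, agent):
    """Return PROV statements for one processing run as triples."""
    triples = []
    # Code, config, input data and output data are all PROV entities.
    for entity in (code, config, input_data, output_data):
        triples.append((entity, "rdf:type", "prov:Entity"))
    triples.append((process, "rdf:type", "prov:Activity"))
    triples.append((agent, "rdf:type", "prov:Agent"))
    # The activity used each of its inputs.
    for used in (code, config, input_data):
        triples.append((process, "prov:used", used))
    # The output was generated by the activity.
    triples.append((output_data, "prov:wasGeneratedBy", process))
    # The activity was carried out by an agent (a person or a system).
    triples.append((process, "prov:wasAssociatedWith", agent))
    # The output is attributed to an agent.
    triples.append((output_data, "prov:wasAttributedTo", agent))
    return triples


# Hypothetical identifiers for illustration only.
run = describe_run(
    code="ex:code",
    config="ex:config",
    input_data="ex:input",
    output_data="ex:output",
    process="ex:process",
    agent="ex:vhirl-user",
)
for subject, predicate, obj in run:
    print(subject, predicate, obj)
```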
Basic scientific data processing model - 4, PROMS
PROV classes: Entity, Activity, Agent
PROMS classes: Reporting System (R.S.), Report
● Reporting System X
● Report N
● Relations: reportingSystem, hadStartingActivity / hadEndingActivity
Basic scientific data processing model - 5, Storage
PROV classes: Entity, Activity, Agent
PROMS classes: Reporting System (R.S.), Report
[Diagram] Reporting System X and Reporting System Y each issue reports (Report N, Report M, ...), which are reported to and stored in the Organisational Provenance Store
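The reporting-and-storage arrangement can be pictured with a toy in-memory store, sketched below. It only mirrors the shape of the diagram: a report names its reporting system and its starting and ending activities, and the store collects reports from several systems. The class names, field names and identifiers are assumptions, and a real provenance store would persist RDF rather than Python objects.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    report_id: str
    reporting_system: str    # slide label: reportingSystem
    starting_activity: str   # slide label: hadStartingActivity
    ending_activity: str     # slide label: hadEndingActivity


class ProvenanceStore:
    """Toy organisational store that groups reports by reporting system."""

    def __init__(self):
        self._by_system = defaultdict(list)

    def submit(self, report):
        self._by_system[report.reporting_system].append(report)

    def reports_from(self, system):
        # Return a copy so callers cannot modify the store's contents.
        return list(self._by_system.get(system, []))


# Hypothetical systems and reports for illustration only.
store = ProvenanceStore()
store.submit(Report("report-1", "system-x", "activity-a", "activity-b"))
store.submit(Report("report-2", "system-x", "activity-c", "activity-c"))
store.submit(Report("report-3", "system-y", "activity-d", "activity-e"))
print(len(store.reports_from("system-x")))
```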
Data Management
Items: managed data, web service data, user supplied data, managed code, user supplied code, output data
Handling notes on the diagram:
● VL ID’d and persisted
● cited using PROMS-O format
● soon to be VL ID’d and persisted, with minimal metadata recorded too
● SSSC ID’d and persisted
● perhaps SSSC ID’d and persisted, perhaps VL managed
● soon to be VL ID’d and persisted, if required, perhaps with time limits
Virtual Labs Service Citation Example
[{ref}] {service title} {service endpoint URI} {query} {time queried} {cached copy ID}
[1] “Subset of elevation”
http://pid.csiro.au/service/anuga-thredds
“bussleton.nc?var=elevation&spatial=bb&north=-33.06495205829679&south=-33.551573283840156&west=114.84967874597227&east=115.70661233971667&temporal=all&time_start=&time_end=&horizStride”
“2014-12-15T13:15:11”
http://pid.csiro.au/dataset/abcd1234
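The citation template can be filled in mechanically. The sketch below holds the six fields of the example, renders them in template order, and pulls the bounding box back out of the query string. The class name `ServiceCitation`, its methods and the single-space separator are assumptions, not part of any PROMS specification; the field values are copied from the example, including the query string exactly as it appears on the slide.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs


@dataclass
class ServiceCitation:
    ref: int
    title: str
    endpoint: str
    query: str
    time_queried: str
    cached_copy_id: str

    def render(self) -> str:
        """Fill the template, separating fields with single spaces."""
        return (
            f'[{self.ref}] "{self.title}" {self.endpoint} "{self.query}" '
            f'"{self.time_queried}" {self.cached_copy_id}'
        )

    def bounding_box(self) -> dict:
        """Extract north/south/west/east from the query string as floats."""
        _, _, params = self.query.partition("?")
        parsed = parse_qs(params)  # blank and value-less parameters are dropped
        return {k: float(parsed[k][0]) for k in ("north", "south", "west", "east")}


citation = ServiceCitation(
    ref=1,
    title="Subset of elevation",
    endpoint="http://pid.csiro.au/service/anuga-thredds",
    query=(
        "bussleton.nc?var=elevation&spatial=bb"
        "&north=-33.06495205829679&south=-33.551573283840156"
        "&west=114.84967874597227&east=115.70661233971667"
        "&temporal=all&time_start=&time_end=&horizStride"
    ),
    time_queried="2014-12-15T13:15:11",
    cached_copy_id="http://pid.csiro.au/dataset/abcd1234",
)
print(citation.render())
print(citation.bounding_box())
```

Keeping the query verbatim alongside the time queried and the cached copy ID is what lets a reader re-run the same subset request or fall back to the persisted copy.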