Electronic Notebooks: An Interface Component for Semantic Records Systems

James D. Myers, Michael Peterson, K Prasad Saripalli, Tara Talbott
Mathematics and Computational Science Directorate, Pacific Northwest National Laboratory


Post on 26-Dec-2015


Page 1: Electronic Notebooks: An Interface Component for Semantic Records Systems

James D. Myers, Michael Peterson, K Prasad Saripalli, Tara Talbott

Mathematics and Computational Science Directorate
Pacific Northwest National Laboratory

Page 2: Outline

Why have an electronic notebook?
The changing science/IT landscape
Semantic repositories
  Scientific Annotation Middleware
ENs on semantic repositories
  The ELN on SAM

Page 3: PNNL Electronic Laboratory Notebook (ELN), ~1995+

Secure shared WWW based space
Hierarchical Chapters/Pages/Notes
Add/View/Search Notes
File upload, sketch, text, equations, forms, image capture, …
Interactive views of data
Editor/Viewer APIs
Cross-out capability
Digital Signatures/Timestamps
Java Client, Perl and Java (2001+) servers

Page 4: What distinguishes ENs from other tools?

Emphasis on multimedia human-entered information
Chronological, page-oriented display
Master/personal project record
Records functionality:
  Non-repudiation: digital signatures and timestamps
  Persistence/completeness: write-once/no deletions/audit trail
  Standardized lifecycle: signing/witnessing policies, archiving, retention schedules, …
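
The persistence/completeness requirement (write-once, no deletions, audit trail) can be illustrated with a small append-only store. This is only a sketch under stated assumptions: the class and field names (`WriteOnceNotebook`, `Entry`, `cross_out`) are hypothetical and are not taken from the ELN code. A cross-out is recorded as a new entry, so the original text is hidden from the page view but never leaves the trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Entry:
    """One immutable record in the notebook's audit trail."""
    seq: int
    action: str        # "add" or "cross_out"
    note_id: int
    text: str
    timestamp: str


@dataclass
class WriteOnceNotebook:
    """Append-only store: entries are added, never edited or deleted."""
    _trail: list = field(default_factory=list)

    def _append(self, action, note_id, text):
        entry = Entry(len(self._trail), action, note_id, text,
                      datetime.now(timezone.utc).isoformat())
        self._trail.append(entry)
        return entry

    def add_note(self, text):
        # The new note's id is the sequence number of its "add" entry.
        return self._append("add", len(self._trail), text).note_id

    def cross_out(self, note_id, reason):
        # Crossing out records a new entry; the original stays in the trail.
        known = {e.note_id for e in self._trail if e.action == "add"}
        if note_id not in known:
            raise KeyError(note_id)
        self._append("cross_out", note_id, reason)

    def audit_trail(self):
        return tuple(self._trail)

    def visible_notes(self):
        crossed = {e.note_id for e in self._trail if e.action == "cross_out"}
        return [e.text for e in self._trail
                if e.action == "add" and e.note_id not in crossed]
```

A real records system would add signatures and timestamps from a trusted source on top of this; the sketch only shows the no-deletion behavior.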

Page 5: The Systems Science Revolution

Community Resources
Bi-directional flow/feedback of information
Partial results being combined to produce new knowledge
Experiment/Theory/Model comparisons
Multiscale optimizations
Rapid Evolution
High Complexity
Shifting/Emerging disciplinary boundaries
Resources will be distributed
With multiple curators

[Figure: Supernova Cosmology Requires Complex, Widely Distributed Workflow Management. Slide from Bill Johnston, LBNL]

Page 6: Advances in Problem Solving Environments/Grids/Semantic Technologies

Multiple Applications recording data
  Pedigree/Provenance
  Experiment Metadata
  Project Organization
  Workflow
  Categorization
  Detected Features
  Instrument logs
  …
  Replica Locations
  Endorsements
  Community Annotations
  …

How do we provide EN capabilities in this larger context?

Page 7: Semantic Repositories

Use self-describing metadata/relationships
  Triple-stores
  RDF
  OWL
Aggregate information generated by multiple applications
Allows browsing, searching, reasoning across integrated information
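
As a rough illustration of how a triple store aggregates statements from several applications and lets one query span all of them, here is a minimal in-memory sketch. The `TripleStore` class, the `ex:` predicate names, and the resource identifiers are all made up for the example; a real deployment would use an RDF library or a dedicated store.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate obj")


class TripleStore:
    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if subject in (None, t.subject)
            and predicate in (None, t.predicate)
            and obj in (None, t.obj)
        )


store = TripleStore()
# An instrument application records where a data file came from.
store.add("data:run42.dat", "ex:producedBy", "instrument:spectrometer1")
# A notebook client attaches a human-entered note to the same file.
store.add("data:run42.dat", "ex:hasNote", "note:17")
# An analysis code records a derived result.
store.add("data:fit42.xml", "ex:derivedFrom", "data:run42.dat")

# One query sees statements from all three applications.
about_run = store.match(subject="data:run42.dat")
derived = store.match(predicate="ex:derivedFrom", obj="data:run42.dat")
```

Because every statement is self-describing, none of the three applications needs to know the others' schemas in order to contribute.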

Page 8: Scientific Annotation Middleware (SAM) - 5 yr DOE funded research project

Develop middleware to create semantic repositories
Enable the sharing of this information among portals and problem solving environments, software agents, scientific applications, and electronic notebooks
  With different levels of sophistication
  Without global schema
Improve the completeness, accuracy, and availability of the scientific record.

http://www.scidac.org/SAM/

Page 9: SAM Architecture

[Figure: SAM architecture diagram. Layers: Notebook Services, Semantic Services, Metadata Services. Stores: DataGrid, Database, Web. Protocol labels: DAV, DASL, JMS, SAM Extensions; DAV, JDBC, GridFTP]

Page 10: Web Distributed Authoring and Versioning (WebDAV)

An early web service
Put/Get data with arbitrary properties (dynamic)
Properties can be discovered and accessed independently
DASL, Versioning, Transactions, …
Widely supported (MS Office, databases, file system drivers, …)
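
To make "arbitrary properties" concrete, the sketch below builds the XML bodies a client would send with the WebDAV PROPPATCH method (to set properties) and PROPFIND method (to read selected properties back). The element names in the `DAV:` namespace come from the WebDAV specification; the property namespace `http://example.org/props/` and the property names are assumptions for illustration. No request is sent here; only the bodies are constructed.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
EX = "http://example.org/props/"   # hypothetical property namespace


def proppatch_body(properties):
    """Build a PROPPATCH request body that sets the given properties."""
    root = ET.Element(f"{{{DAV}}}propertyupdate")
    prop = ET.SubElement(ET.SubElement(root, f"{{{DAV}}}set"), f"{{{DAV}}}prop")
    for name, value in properties.items():
        ET.SubElement(prop, f"{{{EX}}}{name}").text = str(value)
    return ET.tostring(root, encoding="unicode")


def propfind_body(names):
    """Build a PROPFIND request body that asks for the named properties."""
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    for name in names:
        ET.SubElement(prop, f"{{{EX}}}{name}")
    return ET.tostring(root, encoding="unicode")


set_body = proppatch_body({"units": "meters", "instrument": "spectrometer1"})
get_body = propfind_body(["units"])
```

The point of the second function is that a client can ask for `units` alone, without fetching the content or knowing which other properties exist.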

Page 11: Binary Format Description (BFD) Language

XML Language to describe ASCII, Binary, and XML data formats
Generic Parser to extract and semantically tag data in files/streams
The meaning of data can be captured, regardless of format, for future use
Data Format Description Language Standard

[Figure: translation pipeline with the labels File Format1, BFD Description 1, BFD Parser, XML Format1, XSL Stylesheet (reformat), XSLT Processor, XML Format2]

<XSIL>
  <Param Name="units">meters</Param>
  <Param Name="numColumns" Type="int"/>
  <Vector Name="orbitData">
    <Dim><XBFDvalue-of select="/XSIL/Param[@Name='numColumns']"/></Dim>
    <Dim>4</Dim>
  </Vector>
  <Stream Type="remote" XBFDStreamnumber="0" Encoding="binary"/>
</XSIL>
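
The idea of a generic parser driven by a format description can be sketched with Python's `struct` module. This is not BFD syntax and not the BFD parser: the `DESCRIPTION` list, its field names, and the little-endian layout are assumptions chosen to mirror the `numColumns` and `units` parameters in the sample above. What it shows is the separation of the parser from the description, so that one parser can tag values from any described layout.

```python
import struct

# Hypothetical description: (field name, struct format code, units or None).
DESCRIPTION = [
    ("numColumns", "i", None),
    ("x", "d", "meters"),
    ("y", "d", "meters"),
]


def parse(data, description, byte_order="<"):
    """Decode bytes according to the description and tag each value."""
    fmt = byte_order + "".join(code for _, code, _ in description)
    values = struct.unpack(fmt, data[:struct.calcsize(fmt)])
    return [
        {"name": name, "value": value, "units": units}
        for (name, _, units), value in zip(description, values)
    ]


# Stand-in for a binary file written by some application.
raw = struct.pack("<idd", 4, 1.5, -2.0)
tagged = parse(raw, DESCRIPTION)
```

Supporting a new file layout then means writing a new description, not a new parser.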

Page 12: SAM Metadata Services Layer

Jakarta Slide DAV server plus configurable:
  Mapping to Data Store(s)
  Property Generation from binary/ASCII/XML files
  Dynamic Virtual Translations
  Server generated Properties and Relationships
    Timestamp, size, CopyOf

[Figure: labels Fortran Application, 'LocalDisk' DAV, DAV+, Content, ELN, Prop1, Prop2, hastranslation, BFD, WebService, XSLT, Translated Content, RDF Export]
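
The notion of server-generated properties and relationships (timestamp, size, CopyOf) can be shown with an in-memory stand-in for the store. The `MetadataStore` class and its methods are hypothetical and do not reflect the Jakarta Slide or SAM APIs; the sketch only shows that the client supplies content and its own properties while the server adds the rest.

```python
from datetime import datetime, timezone


class MetadataStore:
    """In-memory stand-in for a DAV-style store with per-resource properties."""

    def __init__(self):
        self.content = {}
        self.props = {}

    def put(self, path, data, **client_props):
        self.content[path] = data
        # Client-supplied properties are kept alongside server-generated ones.
        self.props[path] = dict(client_props)
        self.props[path]["size"] = len(data)
        self.props[path]["timestamp"] = datetime.now(timezone.utc).isoformat()

    def copy(self, src, dst):
        # Carry over client properties; regenerate the server-side ones.
        carried = {k: v for k, v in self.props[src].items()
                   if k not in ("size", "timestamp", "CopyOf")}
        self.put(dst, self.content[src], **carried)
        # Relationship generated by the server, not by the client.
        self.props[dst]["CopyOf"] = src
```

For example, after `store.put("/a.dat", b"12345", units="meters")` and `store.copy("/a.dat", "/b.dat")`, the copy carries `units`, a fresh `size` and `timestamp`, and a `CopyOf` link back to `/a.dat`.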

Page 13: SAM Semantic Services Layer

SAM Metadata Layer plus configurable:
  Relation-scoped Queries
  Translation of DAV Properties to RDF Triples
  RDF/GXL Pedigree Generation
  …
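
One plausible shape for the translation of DAV properties to RDF triples is sketched below: the resource URL becomes the subject, the property name (in its namespace) becomes the predicate, and the value becomes the object. The function name, the example URLs, and the rule that URL-valued properties become resources are all assumptions; the actual SAM mapping is configurable and is not described on the slide. Output is in N-Triples form with only minimal literal escaping.

```python
def to_ntriples(resource_url, properties, namespace):
    """Render each (name, value) property as one N-Triples statement."""
    lines = []
    for name, value in sorted(properties.items()):
        predicate = f"<{namespace}{name}>"
        if isinstance(value, str) and value.startswith(("http://", "https://")):
            obj = f"<{value}>"            # treat URL values as resources
        else:
            # Minimal escaping: backslashes and double quotes only.
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'          # everything else becomes a literal
        lines.append(f"<{resource_url}> {predicate} {obj} .")
    return lines


triples = to_ntriples(
    "http://example.org/data/run42.dat",
    {"units": "meters", "CopyOf": "http://example.org/data/run41.dat"},
    "http://example.org/props/",
)
```

Treating relationship-style properties such as `CopyOf` as resources rather than literals is what lets a pedigree graph be followed from one resource to the next.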

Page 14: Back to ENs…

What is needed to be able to provide
  Unstructured human entry of information?
  Chronological, page-oriented display?
  A master/personal project record?
  Records functionality?

Page 15: Creating Notes

A ‘standard’ ELN client can create notes
  Stored as content with a hasNote relationship with pages, notes
Plus…any app can store notes the same way
Page generation: works as before
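
A page view built from hasNote relationships might look like the following sketch. The note identifiers, dates, authors, and text are invented sample data, and the storage layout (a dictionary of notes plus a list of relationship pairs) is an assumption, not the SAM representation. The sketch shows that the page renderer does not care which application wrote a note, only that a hasNote link exists, and that notes attached to notes nest under their parent.

```python
from datetime import datetime

# Each note is stored as content plus properties; all values are sample data.
notes = {
    "note:1": {"created": "2003-05-01T09:00:00", "author": "ELN client",
               "text": "Mounted sample A."},
    "note:2": {"created": "2003-05-01T09:30:00", "author": "instrument logger",
               "text": "Scan started, 300 K."},
    "note:3": {"created": "2003-05-01T09:10:00", "author": "ELN client",
               "text": "Reply: sample A is batch 7."},
}

# hasNote relationships link pages to notes and notes to follow-up notes.
has_note = [("page:1", "note:1"), ("page:1", "note:2"), ("note:1", "note:3")]


def render(parent, depth=0):
    """Return display lines for everything reachable from parent via hasNote."""
    children = [child for source, child in has_note if source == parent]
    # Chronological order within each level, as on a paper notebook page.
    children.sort(key=lambda n: datetime.fromisoformat(notes[n]["created"]))
    lines = []
    for child in children:
        note = notes[child]
        lines.append(f"{'  ' * depth}{note['created']} [{note['author']}] {note['text']}")
        lines.extend(render(child, depth + 1))
    return lines


page = render("page:1")
```

The sketch assumes the hasNote links form a tree; a production renderer would also need to guard against cycles.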

Page 16: ENs as a Primary View?

Instruments, PSEs, etc. may organize parts of the experiment that an EN should not duplicate
Define other relationships as part of the EN chapter/page/note hierarchy:

[Figure: a hierarchy of Project, Experiment1, Experiment2, Data1, Data2 shown beside a hierarchy of Notebook1, Chapter1, Chapter2, Page1, Page2. Caption: Defined by PSE, Interpreted by EN]
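
The "defined by PSE, interpreted by EN" idea can be sketched as a configurable mapping from PSE relationships to notebook levels. The predicate names `hasExperiment` and `hasData`, the `EN_VIEW` table, and the choice of which level each predicate maps to are hypothetical; the slide only names the two hierarchies. The notebook walks relationships it did not create and presents them as its own chapter/page structure, without copying any data.

```python
# Relationships recorded by a PSE; predicate names are hypothetical.
triples = [
    ("Project", "hasExperiment", "Experiment1"),
    ("Project", "hasExperiment", "Experiment2"),
    ("Experiment1", "hasData", "Data1"),
    ("Experiment1", "hasData", "Data2"),
]

# Configuration telling the notebook which predicates it may read as
# "contains" links, and which notebook level each one leads to.
EN_VIEW = {"hasExperiment": "Chapter", "hasData": "Page"}


def notebook_view(root, level="Notebook", depth=0):
    """Walk PSE-defined relationships and label nodes with EN levels."""
    lines = [f"{'  ' * depth}{level}: {root}"]
    for subject, predicate, obj in triples:
        if subject == root and predicate in EN_VIEW:
            lines.extend(notebook_view(obj, EN_VIEW[predicate], depth + 1))
    return lines


view = notebook_view("Project")
```

Changing `EN_VIEW` changes how the notebook presents the project without touching anything the PSE stored.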

Page 17: Records?

Digital Signatures, Timestamps, etc. are services that can be exposed as repository services and associated metadata
But
  What do we sign (content/metadata)?
  Where is the edge of the record?
  How deep do we travel through the web of relations?
  How do we stop other applications from changing/deleting signed content?

Page 18: Multiple Options

Simple:
  Sign content plus defined subset of metadata
  Stop at edge of server
  Treat relationship cycles as links
  Lock content and metadata subset when signed
Advanced:
  Multiple self-describing signatures (e.g. XMLSignature)
  Allow records across servers via trust, cached metadata/data
  Define fine-grained retention schedules
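
The "simple" option can be sketched as follows: hash the content together with a fixed subset of metadata, sign that digest, and refuse later changes to the signed subset. Several things here are assumptions: the `Resource` class, the choice of `author` and `created` as the signed subset, and the JSON serialization of the metadata. An HMAC with a shared key stands in for a real public-key signature, which Python's standard library does not provide.

```python
import hashlib
import hmac
import json

SIGNED_PROPERTIES = ("author", "created")   # the "defined subset" of metadata


def record_digest(content, metadata):
    """Hash the content together with the chosen metadata subset."""
    subset = {k: metadata[k] for k in SIGNED_PROPERTIES if k in metadata}
    # sort_keys gives a stable byte form for the metadata.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    content_hash = hashlib.sha256(content).digest()
    return hashlib.sha256(content_hash + canonical.encode("utf-8")).hexdigest()


class Resource:
    def __init__(self, content, metadata):
        self.content, self.metadata = content, dict(metadata)
        self.signature = None

    def sign(self, key):
        digest = record_digest(self.content, self.metadata)
        # HMAC stands in for a real public-key signature in this sketch.
        self.signature = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()

    def set_property(self, name, value):
        # Only the signed subset is locked; other metadata stays editable.
        if self.signature is not None and name in SIGNED_PROPERTIES:
            raise PermissionError(f"{name} is locked by a signature")
        self.metadata[name] = value

    def verify(self, key):
        digest = record_digest(self.content, self.metadata)
        expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
        return self.signature is not None and hmac.compare_digest(self.signature, expected)
```

Leaving unsigned properties editable is the design choice that lets other applications keep annotating a signed record without invalidating it.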

Page 19: SAM Notebook Services Layer

SAM Metadata and Semantic Layers plus:
  Notebook Management, Page Display, …
  Digital Signatures
  Canonicalization
  Notarized Timestamps
  Data/Signature Migration Capabilities
Notebook API, Notebook Components
Supports ELN 5.1, Annotation Applet, new portal-based EN client
EN Portlets
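
Why canonicalization matters for signatures and migration can be shown with a short example. It uses `xml.etree.ElementTree.canonicalize` (Python 3.8+), which implements W3C Canonical XML 2.0; whether SAM uses this particular canonical form is not stated on the slide, and the `note` element and its attributes are invented. Two serializations that carry the same information hash differently as raw text but identically once canonicalized.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_digest(xml_text):
    """Digest the C14N 2.0 form so equivalent serializations hash the same."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = '<note author="jm" created="2003-05-01">Mounted sample A.</note>'
# Same information after a hypothetical migration: attribute order and
# quoting style changed.
migrated = "<note created='2003-05-01' author='jm'>Mounted sample A.</note>"

raw_same = (hashlib.sha256(original.encode()).hexdigest()
            == hashlib.sha256(migrated.encode()).hexdigest())
canonical_same = canonical_digest(original) == canonical_digest(migrated)
```

Signing the canonical form rather than the raw bytes is what allows a signature to survive a move between storage systems that re-serialize the XML.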

Page 20: Collaborations

Collaboratory for Multiscale Chemical Science (CMCS)
  SAM as primary data system, pedigree, notebook
NEESgrid/CHEF Portal/NMI Grid User Computing Environments
  ELN, SAM as a metadata/pedigree store?
Genomes-To-Life
  SAM as annotation/metadata repository, notebook
Internal PNNL Projects
  Concept Map Repository, Interface to Lustre, Biological Data Annotation
DOE2000 Notebook Community (1500+ email addresses)
  Upgrades to DOE2K Notebooks
  E.g. Columbia University Environmental Science Lab Notebooks

Page 21: A Scientific Content Repository Vision

Notebooks become just one view of the scientific information
Applications contribute data, metadata, and relationships directly
Records functionality provided by middleware, available to multiple applications
Content is stored in multiple repositories managed independently
The scientific record becomes richer and re-integrated

Page 22: Acknowledgments

Carina Lansing, PNNL
Al Geist, Jens Schwidder, David Jung, ORNL
U.S. Department of Energy

Pacific Northwest National Laboratory is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract DE-AC06-76RL0 1830

Oak Ridge National Laboratory is a multiprogram national laboratory operated by UT-Battelle, LLC for the U.S. Department of Energy under Contract DE-AC05-00OR22725

Mathematical, Information and Computational Sciences Division of the Office of Science