Dr Liz Lyon, Associate Director Outreach UK Digital Curation Centre An Introduction Digital Curation Centre a centre of support for data curation and preservation Grand Challenge Meeting, Bath June 2005



Dr Liz Lyon, Associate Director Outreach

UK Digital Curation Centre An Introduction

Digital Curation Centre: a centre of support for data curation and preservation

Grand Challenge Meeting, Bath June 2005


Repositories and digital curation

Data preservation: static. For later use?

Data curation: dynamic. In use now (and the future)?

“maintaining and adding value to a trusted body of digital information for current and future use”


Assuring permanent access to the records of science & the humanities?

Long term access to primary data

• Increasing data volumes from eScience and Grid-enabled / cyberinfrastructure applications

• Changing research paradigm: data-driven science, “big science”

• Observational data, simulations, large-scale experimentation

• Multi-media resources, statistical data, surveys, geo-spatial data……


Facilitate “post-processing” and knowledge extraction

Enable the acquisition of newly-derived information and knowledge

• Run complex algorithms over primary datasets

• Mining (data, text, structures)

• Modelling (economic, climate, mathematical, biological)

• Analysis (statistical, lexical, pattern matching, gene)

• Presentation (visualisation, rendering)


Provide additional functionality beyond digital preservation processes

Annotations

• Gene and protein sequences

• e-Lab books (Smart Tea Project in chemistry)


Research & e-Science workflows

Aggregator services: national, commercial

Repositories: institutional, e-prints, subject, data, learning objects

Data curation: databases & databanks

Validation

Harvesting metadata

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Deposit / self-archiving

Peer-reviewed publications: journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Searching, harvesting, embedding

Presentation services: subject, media-specific, data, commercial portals

Resource discovery, linking, embedding

Linking

The scholarly knowledge cycle: linking research data to publications

eBank UK Project: http://www.ukoln.ac.uk/projects/ebank-uk/

Emerging policy on open access to data


DCC people (some of them…)

• Management & Co-ordination

– Director Chris Rusbridge (University of Edinburgh)

• Community Support & Outreach

– Led by Dr Liz Lyon (UKOLN, University of Bath)

• Service Definition & Delivery

– Led by Professor Seamus Ross (HATII [ERPANET], University of Glasgow)

• Development

– Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)

• Research

– Led by Professor Peter Buneman (Informatics, University of Edinburgh)


(Some of) the challenges we face

Standards: Interoperability issues: technical & ??soluble

Scale: Volume and diversity of datasets

Culture: Bringing communities together

• Library/information science/archives “document tradition”

• Domain research (chemists, astronomers, biologists)

• Computer science (databases)

• Commercial suppliers (storage technology)

Process & Skills: Highly-distributed organisation

• Use collaborative tools, combined skills

Engagement: Existing work & key players


User requirements analysis: some sound bytes…

R&D issues: Annotation services, Ontology development, Automating metadata creation, Tools and toolkits, Data Format Description Language, Identifiers, Registries, Economic and cost-benefits studies

Advisory services: “Ask-a-Curator”, FAQs, reports, briefings, awareness-raising materials, best practice guidance, Storage media, “Like Erpanet”, advise Government, Research Councils, funding bodies

Professional development: Short courses, conferences, seminars, workshops, secondments to DCC and to working repository services

Outreach: Leadership for the future, case studies, sharing solutions, collaboration with other partners, international peers, industry links

Taxonomy of “Users”


Outline Taxonomy of digital curation users by role

1. Data Creators

2. Data Curators

3. Data Re-users

4. Policy makers

– funding bodies

– other leaders

Data Preservers

Data publishers


Outline Taxonomy by significant function of organisational entity

1. Research

2. Service provision

3. Learning & teaching

4. Funders

5. Policy / strategy makers

“Designated communities”

Commercial


Advisory services

• Responses to queries, from legal to technical guidance [email protected]

• FAQs constructed

• Informing workshops and information services

• Monthly site visits (National Institute of Environmental eScience)


Professional development workshops

• 2005 Programme

– Persistent identifiers: June, Glasgow

– Institutional repositories: July, University of Cambridge, with DSpace

– Cost models: July, British Library, London, with the Digital Preservation Coalition

– Preservation of medical databases: October, Gulbenkian Institute, Lisbon, with ERPANET & the Wellcome Trust


Standards Watch

• Covering existing and emerging standards

• Working with community and standards bodies (e.g. ISO)

• Organising associates groups around new standards developments

• Initiating standardisation definitions where gaps identified

• Currently re-purposing Diffuse database of standards materials


Digital Curation Manual

• A world class resource

• Constructed from topic-specific chapters

– written by international experts

– editorial board comprising leading researchers and practitioners

• 45 initial topics including

– Appraisal and Selection; Costs; Freedom of Information; Interoperability; the OAIS Reference Model; Preservation Strategies; and Open Source

• Less in-depth insight offered by DCC Briefing Papers, aimed at needs of senior managers


OAIS Reference Model – Functional Model

[Figure: OAIS functional model. External entities: Producer, Consumer, Management. Functional entities: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access. Labelled flows: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders.]
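
The flow of information packages through the OAIS functional model (SIP in via Ingest, AIP held in Archival Storage, DIP out via Access) can be sketched roughly as below. This is a minimal illustration only, not part of the original slides and not a conformant OAIS implementation: the class names, the `aip-N` identifier scheme and the fields recorded as descriptive information are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package, sent by a Producer."""
    content: bytes
    producer: str


@dataclass
class AIP:
    """Archival Information Package, held in Archival Storage."""
    aip_id: str
    content: bytes
    descriptive_info: dict


@dataclass
class DIP:
    """Dissemination Information Package, delivered to a Consumer."""
    aip_id: str
    content: bytes


@dataclass
class Archive:
    """Toy archive; attribute names are hypothetical stand-ins for OAIS entities."""
    storage: dict = field(default_factory=dict)    # stands in for Archival Storage
    catalogue: dict = field(default_factory=dict)  # stands in for Data Management

    def ingest(self, sip: SIP) -> str:
        # Ingest: turn a SIP into an AIP and record its Descriptive Info.
        aip_id = f"aip-{len(self.storage) + 1}"  # assumed identifier scheme
        info = {"producer": sip.producer, "size": len(sip.content)}
        self.storage[aip_id] = AIP(aip_id, sip.content, info)
        self.catalogue[aip_id] = info
        return aip_id

    def query(self, producer: str) -> list:
        # Access: a Consumer query returns a result set of AIP identifiers.
        return [aip_id for aip_id, info in self.catalogue.items()
                if info["producer"] == producer]

    def order(self, aip_id: str) -> DIP:
        # Access: a Consumer order is answered with a DIP built from the AIP.
        aip = self.storage[aip_id]
        return DIP(aip.aip_id, aip.content)
```

Administration and Preservation Planning are left out of the sketch on purpose; they govern how the archive operates over time rather than the path a single package takes.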


Audit and Certification (1)

• How can people know who to entrust with their information?

• There is a demand for a certification process for

– Repositories and components, e.g. archive storage

– Software

• Certification standards (ISO 9000 and ISO 17799) do not do the job

• OCLC/RLG Trusted Digital Repositories: Attributes and Responsibilities

– high level model for design, delivery and maintenance of digital repositories


Audit and Certification (2)

• International expert group led by RLG and NARA is drafting a Certification standard

• DCC is participating: aiming for international consensus

• Draft goes to Technical Editor end of June

• DCC testbeds to support development of audit and certification standards

• Commitment to

– offer guidance on self-audit and self-certification

– carry out independent audits

– issue certificates to qualifying repositories


Tools and Technologies

• Accumulate and Maintain Registry and online Repository of relevant tools

– Repository Implementations

– Packaging Tools

– Rendering Software

– Format Converters

– Device Drivers


Representation Registry development

• Simple PHP prototype

• Scoping study

– Formats, standards, tools

• More robust prototype in development

– Based on ebXML & JAXR

– Potentially distributed, cooperative maintenance model

– Representation information: describe CCLRC (science) data using EAST,

• Links to PRONOM, GDFR and other pilots

• Aim to hand over to services

Development info: see http://dev.dcc.ac.uk for details of Wiki and email list open to all
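
The general idea of a representation registry, looking up the standards and tools needed to interpret data in a given format, can be sketched as follows. This is an illustration only and says nothing about the DCC prototypes themselves (which the slide describes as PHP and ebXML/JAXR based): the class names, the identifier string and the example entry are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RepresentationInfo:
    """What a registry might hold about one format (hypothetical fields)."""
    format_name: str
    standards: list = field(default_factory=list)
    tools: list = field(default_factory=list)


class RepresentationRegistry:
    """In-memory stand-in for a registry keyed by a format identifier."""

    def __init__(self):
        self._entries = {}

    def register(self, identifier: str, info: RepresentationInfo) -> None:
        self._entries[identifier] = info

    def lookup(self, identifier: str):
        # Returns None when the format is not yet described in the registry.
        return self._entries.get(identifier)


registry = RepresentationRegistry()
registry.register(
    "example:format/1",  # hypothetical identifier, not a real registry key
    RepresentationInfo(
        format_name="Example tabular format",
        standards=["Example format specification"],
        tools=["example-renderer"],
    ),
)
```

A distributed, cooperatively maintained registry of the kind the slide mentions would also need identifiers that resolve across sites; that is outside the scope of this sketch.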


Research agenda (1)

• Publishing & integrating scientific databases

• ‘Archiving’ past states of volatile databases

• Database provenance and annotation

• Organisational dynamics of trusted repositories

• Automating metadata extraction

• Cost-benefit analysis of data curation

• Rights and responsibilities


The database picture

Source data

Curated data: classified, cleaned, annotated, integrated, cross-linked


Curated databases – some issues

• Integrating, publishing and citing data so that someone else can use it.

• Annotating existing data and moving annotations to other databases

• Provenance: where did this data come from?

• Archiving: how do you preserve something that is constantly changing?
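
One simple way to picture the provenance and archiving questions above is an append-only store that never overwrites an entry: each change is recorded as a new version together with where it came from, so any past state can be read back. The sketch below is an assumption-laden illustration, not a method described in the slides; the class names, the integer version counter and the example keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    number: int   # position in the database-wide change sequence
    value: str    # the curated value at that point
    source: str   # provenance: where this value came from


class VersionedDatabase:
    """Append-only store: old states are kept, never overwritten."""

    def __init__(self):
        self._history = {}  # key -> list of Version, oldest first
        self._clock = 0     # simple counter standing in for a timestamp

    def put(self, key: str, value: str, source: str) -> int:
        self._clock += 1
        self._history.setdefault(key, []).append(
            Version(self._clock, value, source)
        )
        return self._clock

    def get(self, key: str, as_of=None):
        # Latest version by default, or the latest one at or before `as_of`.
        versions = [v for v in self._history.get(key, [])
                    if as_of is None or v.number <= as_of]
        return versions[-1] if versions else None


db = VersionedDatabase()
first = db.put("entry-1", "kinase", source="submitted by lab")
db.put("entry-1", "protein kinase", source="curator annotation")
current = db.get("entry-1")             # the entry as it stands now
past = db.get("entry-1", as_of=first)   # the entry as first submitted
```

Keeping every version is the most direct answer to "how do you preserve something that is constantly changing?", at the cost of storage that grows with every edit.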


Research agenda (2)

• Publishing & integrating scientific databases

• ‘Archiving’ past states of volatile databases

• Database provenance and annotation

• Organisational dynamics of trusted repositories

• Automating metadata extraction

• Cost-benefit analysis of data curation

• Rights and responsibilities

– “Public domain, public interest, public funding” paper, Waelde & McGinley


www.dcc.ac.uk


• www.ijdc.net

• Launch planned July

• Peer-review Editorial Board

• Peter Buneman Editor (research)

• Production editor Philip Hunter

• Papers for submission are very welcome!


1st DCC International Conference

• Location - Bath UK

• 29-30 September 2005

• Keynote speakers

– Clifford Lynch, CNI

– Graham Cameron, European Bio-informatics Institute

• DCC Research update

• Social highlights


Associates Network

Goals

Develop understanding, share best practice, advance research, promote recognition, develop consensus

Membership

International groups, national bodies, industry partners, funders, research groups, HEIs, FEIs, individuals……

Benefits

Early access to R&D outputs, advisory services, training, input to definition and design, community participation

Discussion Forum: www.dcc.ac.uk

Please join us!