Managing and exploiting the digital deluge: issues, challenges and opportunities

Michael Day

UKOLN R&D TL presentation, Bath, 5 June 2008


DESCRIPTION

Presentation, University of Bath, 5 June 2008

TRANSCRIPT

Page 1

Managing and exploiting the digital deluge: issues, challenges and opportunities

Michael Day

UKOLN R&D TL presentation, Bath, 5 June 2008

Page 2

The digital deluge - outline

- Understanding the scope of the problem
- Some challenges
- Opportunities for researchers

Page 3

The digital deluge - what is it? (1)

A phrase applicable in more than one context:

- The network infrastructure (the 'Exaflood'):
  - The rapid expansion of Internet traffic, e.g. from the streaming of movies or TV (BBC iPlayer)
- Managing a rapidly growing and diverse range of digital content, e.g.
  - Personal content, e.g. from digital cameras, e-mail
  - Digitised content, e.g. sound and video reformatting, e-texts generated by mass-digitisation programmes
- The "Data Deluge" - curating the vast amounts of research data being generated by experiments, observational instruments and computer simulation


Page 6

The digital deluge - what is it? (2)

International Data Corporation (IDC) White Paper:

- Estimates the digital universe in 2007 as 281 exabytes (281 billion gigabytes) and still growing fast
  - But these estimates include outputs from surveillance cameras, financial transaction journals, Web search logs, as well as more directly user-generated forms of content
- Notes a growing environmental impact:
  - Increased power consumption, electronic waste
- Key areas of recent growth identified include:
  - Healthcare data, e.g. medical imaging
  - User-generated content, e.g. YouTube videos
  - Scientific experiments, e.g. LHC (300 exabytes a year)
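As a rough check on the units quoted for the IDC estimate, the sketch below converts 281 exabytes to gigabytes. It is not part of the original presentation: it assumes decimal (SI) prefixes, i.e. 1 exabyte = 10^18 bytes and 1 gigabyte = 10^9 bytes, and the function name is purely illustrative.

```python
# Assumption: decimal (SI) prefixes, not binary ones (EiB/GiB).
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**18


def exabytes_to_gigabytes(exabytes: int) -> int:
    """Convert a whole number of exabytes to gigabytes (hypothetical helper)."""
    return exabytes * BYTES_PER_EXABYTE // BYTES_PER_GIGABYTE


# IDC estimate of the digital universe in 2007, as quoted on the slide.
digital_universe_2007_gb = exabytes_to_gigabytes(281)
print(f"{digital_universe_2007_gb:,} GB")  # prints "281,000,000,000 GB"
```

Under these assumptions the result is 281 billion gigabytes, which agrees with the figure given in parentheses on the slide.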

Page 7

The digital deluge - challenges (1)

Problems of scale:

- Can our infrastructures begin to cope with dealing with petabytes or exabytes of data?
- Technology has been quite good at keeping pace with data growth in the past (although Moore's Law will not rescue us for ever)
- Dealing with organisational change is more problematic
- The need for better co-ordination of effort is compromised by:
  - Professional and disciplinary differences
  - Fragmented funding structures

Page 8

The digital deluge - challenges (2)

Problems of complexity:

- Many different types of digital content:
  - Structured, semi-structured, completely unstructured
  - Mediated, non-mediated
  - Interactivity and contextual links
  - Sometimes key supporting information (documentation, metadata, representation information, etc.) is missing
- Content is stored in many different places:
  - Active environments
  - 'Repositories' of various kinds (new forms of silos?)
- Ownership and privacy issues

Page 9

The digital deluge - challenges (3)

Organisational problems:

- Lack of co-ordination between sectors, institutions, funding bodies, etc.
- Still little consensus on:
  - Deciding what needs to be kept (selection and appraisal)
  - Deciding who should ultimately be responsible for looking after content, i.e. who pays?
- Infrastructures for preservation
  - These are still emerging from R&D projects and the commercial sector (rapid progress in last five years)
  - In HE, still questions on exactly where institutional repositories fit within the digital preservation landscape

Page 10

The digital deluge - opportunities

Some opportunities:

- While many curation challenges remain to be solved, the growing availability of digital content means that researchers:
  - Will find new and innovative ways of combining data to develop and test new research hypotheses
  - Will develop methodologies for mining and analysing vast amounts of data
- It could also foster new and innovative ways of doing research, e.g. 'Science 2.0'

Page 11

Thank you for your attention