xldbeuropeedinburgh-09-jun2011
DESCRIPTION
Personal views on what Research Infrastructures really need for data - a more comprehensive version of the 5-minute presentation I gave at XLDB-Europe, 8-10 June 2011 in Edinburgh
TRANSCRIPT
What does Research Infrastructure really need for Data?
An e-Science infrastructure for biodiversity and ecosystem science
Common Operations of Environmental Research Infrastructures (ENVRI)
Alex Hardisty
School of Computer Science & Informatics
portal.lifewatch.eu www.lifewatch.eu
What is LifeWatch?
• European Research Infrastructure for understanding biodiversity as a whole interacting system
– Exploring patterns of biodiversity and processes of biodiversity across space/time
• A geospatial data e-Infrastructure
– Distributed observatories / sensors
– Data mgmt., processing and analytical tools
– Computational capability and capacity
– Collaborative environments
– Support, training, partnering, fellowship
1800 terrestrial Long-Term Ecological Research (LTER) sites: increasingly sensor instrumented
>200 Marine reference and focal sites, with more to come: increasingly sensor instrumented
Hundreds of millions of specimens in natural science collections: >275m now indexed, increasing at 20% p.a.
Challenge of SCALE: > 25,000 users
Plus: all kinds of small, personal, group, and departmental datasets that need to get published
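To give a feel for the scale challenge, the 20% p.a. growth quoted above for indexed specimen records can be compounded forward. The following is only a rough sketch in Python: it assumes the growth rate stays constant, which the slide does not claim, and the function name `projected_records` is a hypothetical one chosen for illustration.

```python
def projected_records(current: float, annual_rate: float, years: int) -> float:
    """Compound `current` forward by `annual_rate` for `years` whole years."""
    return current * (1.0 + annual_rate) ** years


# Figures taken from the slide: >275m records indexed, growing at 20% p.a.
INDEXED_NOW = 275e6
GROWTH = 0.20  # assumed constant for the whole projection

if __name__ == "__main__":
    for years in (1, 5, 10):
        total = projected_records(INDEXED_NOW, GROWTH, years)
        print(f"after {years:2d} years: ~{total / 1e6:,.0f}m records")
```

At a constant 20% p.a. the indexed total roughly doubles every four years, since 1.2^4 is about 2.07.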
Challenge of HETEROGENEITY: Interconnected nature of biodiversity ideas, outputs, repositories
From Peterson et al (2010), Syst Biodivers 8(2), 159-168
From Guralnick and Hill (2010), http://www.slideshare.net/robgur/ievobio-keynote-talk-2010
Common solutions to common challenges faced by ESFRI environmental infrastructures
(left to right, top to bottom)
Global ocean observing infrastructure
Svalbard arctic Earth observing system
Aircraft for global observing system
Tropospheric research aircraft
Polar research icebreaker
Biodiversity and ecosystem research
Multidisciplinary seafloor observatory
Upgrade of incoherent scatter facility
Plate observing system
Integrated carbon observation system
Source: EC
Data generators
Data users
Data Services
Community-specific Services
Data transfer
Fast data transmission
Operation at remote sites
User functionalities
Virtual Environments & Collaborative organisations
Security & Protection
Data discovery & Navigation
Data submission tools
(meta) data tagging tools
Operational Semantic Interoperability
Workflow Generator
Knowledge management
Virtualisation
Persistent storage capacity
24/7 operation
Preservation & Sustainability (digital asset management)
Authenticity
Certification & Integrity
GUIDs
Source: W.Los, UvA
• Common solutions to common problems
– adopted by each infrastructure through its construction phase
• Common Reference Model providing multiple ‘views’ of RI:
– Science business / enterprise view, Information view, Computational / services view, Engineering view, Technology view
• Standards, Standards, Standards
– Data capture from distributed sensors, Metadata definition, Management of high volume data, Execution of workflows, Visualization of data, Provenance and annotation, Interoperability between assets
• Common tools e.g., for data discovery and access
– in a federation of distributed data repositories and interoperating infrastructures
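As an illustration of what a common tool for data discovery across a federation of repositories might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Record`, `Repository`, `InMemoryRepository` and `federated_search` names are hypothetical, the repositories are in-memory stand-ins for remote services, and the slides do not specify any particular protocol or API. It only shows the idea of sending one query to every repository and merging the results by GUID.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Record:
    guid: str    # globally unique identifier of the dataset
    title: str
    source: str  # name of the repository that returned the record


class Repository(Protocol):
    """Anything with a name and a keyword search can join the federation."""
    name: str

    def search(self, keyword: str) -> list[Record]: ...


class InMemoryRepository:
    """Stand-in for a remote repository; a real one would call a web service."""

    def __init__(self, name: str, datasets: dict[str, str]) -> None:
        self.name = name
        self._datasets = datasets  # maps guid -> title

    def search(self, keyword: str) -> list[Record]:
        kw = keyword.lower()
        return [
            Record(guid, title, self.name)
            for guid, title in self._datasets.items()
            if kw in title.lower()
        ]


def federated_search(repos: Iterable[Repository], keyword: str) -> list[Record]:
    """Query every repository and keep the first record seen for each GUID."""
    seen: dict[str, Record] = {}
    for repo in repos:
        for rec in repo.search(keyword):
            seen.setdefault(rec.guid, rec)
    return list(seen.values())


if __name__ == "__main__":
    marine = InMemoryRepository("marine", {"guid-1": "Plankton counts"})
    lter = InMemoryRepository("lter", {"guid-1": "Plankton counts",
                                       "guid-2": "Lake plankton survey"})
    for rec in federated_search([marine, lter], "plankton"):
        print(rec.guid, rec.title, rec.source)
```

Merging on the GUID is what lets the same dataset, held in two repositories, appear once in the result; this is one reason shared identifiers and metadata standards matter for interoperating infrastructures.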
What do RIs REALLY need for data?
• Report of the High Level Expert Group on Scientific Data
• Neelie Kroes, EC Vice-President for the Digital Agenda
– “... use it as a reference point when discussing the priorities of EU research investments.”