Patterns for E-Research

Dave Berry, Research Manager

E-Research within the University of Edinburgh, 2nd March 2005

E-Research

“The invention and application of computing methods to extend our capabilities in any research discipline”

“Research in any discipline which benefits from and often depends on the use of advanced facilities and methods for computation, data curation, digital communication and visualisation”

Technology Growth

Gilder’s Law (32X in 4 yrs): Optical Fibre (bits per second)

Storage Law (16X in 4 yrs): Data Storage (bits per sq. inch)

Moore’s Law (5X in 4 yrs): Chip capacity (# transistors)

Figure: Performance per Dollar Spent against Number of Years (0 to 5), with doubling times of 9, 12 and 18 months.

Triumph of Light – Scientific American. George Stix, January 2001
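The link between a doubling time and a four-year multiplier can be checked with a few lines of arithmetic. The sketch below assumes pure exponential growth and pairs the 9, 12 and 18 month doubling times with optical fibre, data storage and chip capacity respectively; that pairing and the function name `growth_factor` are assumptions made for illustration, not taken from the slide.

```python
def growth_factor(doubling_months: float, years: float = 4.0) -> float:
    """Multiplicative growth over `years`, assuming steady exponential doubling."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Assumed pairing of technology and doubling time (months).
doubling_times = [
    ("Optical fibre", 9),
    ("Data storage", 12),
    ("Chip capacity", 18),
]

for label, months in doubling_times:
    factor = growth_factor(months)
    print(f"{label}: doubling every {months} months is about {factor:.1f}x in 4 years")
```

A 12-month doubling time gives exactly 16X in four years. The 9 and 18 month cases come out near 40X and 6X, so the slide's 32X and 5X read as rounded figures rather than exact results.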

Pattern 1: Distributed Collaboration

Groups in different sites working together
Sharing knowledge and ideas

Technologies:
Shared repositories
Wikis, SourceForge/NeSCForge, Forums, …

Videoconferencing
Computer Supported Cooperative Work (CSCW)

Technology: Access Grid

Microphones

Cameras

Pattern 2: Simulation & Modelling

Large variety of topics, e.g.
Protein folding
Position of atoms in semiconductors
Human heart
Ecology of ice sheets

Multiple scales
Remote visualisation and control

Example: The TeraGyroid Scientific Experiment

High-density isosurface of the late-time configuration in a ternary amphiphilic fluid as simulated on a 64³ lattice by LB3D.

Gyroid ordering coexists with defect-rich, sponge-like regions.

The dynamical behaviour of such defect-rich systems can only be studied with very large scale simulations, in conjunction with high-performance visualisation and computational steering.

See http://www.realitygrid.org/workshop-2004/presentations/blake.ppt

Example: Terrestrial Carbon Dynamics

Pattern 3: Data archives

Data archives maintain data for widespread use, e.g.

UK Borders, Go-Geo, … (EDINA)
ArkDB (Roslin)
Mouse Atlas (HGU)
EMBL, UniProt, … (EBI)
Census, … (MIMAS)

Client-server access

Schemas defined centrally
Often subject to change…
… if they’re defined at all!

Infrastructure: Digital Curation Centre

Diagram labels: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; community support & outreach; research; development co-ordination; service definition & delivery; management & admin support; Collaborative Associates Network of Data Organisations; curation organisations, e.g. DPC

Pattern 4: Federated data

Sites maintain their own data
Remote access to other sites
Control access to your site

Integrated views
Community-defined schemas
Translation between schemas

Distributed algorithms
Run jobs remotely
Distributed data mining
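Translation between schemas and an integrated view can be illustrated with a small sketch. Everything named here is hypothetical: the two sites, their field names, the community schema `COMMUNITY_FIELDS` and the sample rows are invented for illustration, and in-memory lists stand in for remote access to each site.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, Any]

# Hypothetical community-defined schema that every site agrees to expose.
COMMUNITY_FIELDS = ("gene_id", "species", "length_bp")

# Hypothetical site-local data, each in its own schema.
SITE_A: List[Record] = [
    {"id": "g1", "organism": "mouse", "len": 1200},
    {"id": "g2", "organism": "human", "len": 900},
]
SITE_B: List[Record] = [
    {"GeneID": "g3", "Species": "rat", "length_kb": 1.5},
]


def translate_site_a(row: Record) -> Record:
    """Rename site A's fields to the community schema."""
    return {"gene_id": row["id"], "species": row["organism"], "length_bp": row["len"]}


def translate_site_b(row: Record) -> Record:
    """Rename site B's fields and convert kilobases to base pairs."""
    return {
        "gene_id": row["GeneID"],
        "species": row["Species"],
        "length_bp": round(row["length_kb"] * 1000),
    }


Source = Tuple[Iterable[Record], Callable[[Record], Record]]


def integrated_view(sources: Iterable[Source]) -> Iterator[Record]:
    """Yield every site's records, translated into the community schema."""
    for rows, translate in sources:
        for row in rows:
            yield translate(row)


view = list(integrated_view([(SITE_A, translate_site_a), (SITE_B, translate_site_b)]))
for record in view:
    print(record)
```

Each site keeps its own schema and only supplies a translation function, so a change in one site's layout touches one translator rather than every consumer of the integrated view.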

Example: Mass-scale Data Mining

Pattern 5: Parameter Search

Run the same algorithm on different data, e.g.

Finding local minima
Combinatorial search

Allows the use of multiple machines, e.g.
A cluster
Multiple clusters
Desktop PCs
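A minimal sketch of the parameter-search pattern follows, assuming a toy objective function and a small grid of candidate values. A local thread pool stands in for the cluster or desktop PCs; the names `objective` and `parameter_search` are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import List, Tuple

Params = Tuple[float, float]


def objective(params: Params) -> float:
    """Toy function to minimise; its minimum is at x = 1, y = -2."""
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def parameter_search(grid: List[Params], workers: int = 4) -> Tuple[Params, float]:
    """Evaluate the same function on every grid point and return the best one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so scores[i] belongs to grid[i].
        scores = list(pool.map(objective, grid))
    best_index = min(range(len(grid)), key=scores.__getitem__)
    return grid[best_index], scores[best_index]


values = [-2.0, -1.0, 0.0, 1.0, 2.0]
grid = list(product(values, repeat=2))
best_params, best_score = parameter_search(grid)
print(best_params, best_score)
```

Because every evaluation is independent, the same structure carries over to separate machines: only the executor changes, not the search logic.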

Example: ClimatePrediction.net

See www.climateprediction.net

Composing Patterns

Patterns that compose…
Complex problems require many inputs and many processes
Shared contributions compose indefinitely, accumulating knowledge

… and how to compose them
A common infrastructure
Technologies, naming, schemas, …

Workflow languages
Portals and “problem-solving environments”
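The idea behind a workflow language can be sketched as named steps that declare which results they depend on, with a small runner that executes them in dependency order. This is an illustrative toy, not any particular workflow system; the step names, the sample data and the `run_workflow` helper are all assumptions. It uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[Callable[..., Any], List[str]]


def run_workflow(steps: Dict[str, Step], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Run each step once all of its named dependencies have results."""
    graph = {name: deps for name, (_, deps) in steps.items()}
    results = dict(inputs)
    for name in TopologicalSorter(graph).static_order():
        if name in results:
            continue  # supplied as an input, nothing to compute
        func, deps = steps[name]
        results[name] = func(*(results[d] for d in deps))
    return results


# Hypothetical three-step workflow: clean the data, summarise it, report.
steps: Dict[str, Step] = {
    "cleaned": (lambda raw: [v for v in raw if v is not None], ["raw"]),
    "mean": (lambda xs: sum(xs) / len(xs), ["cleaned"]),
    "report": (lambda xs, m: f"{len(xs)} values, mean {m:.2f}", ["cleaned", "mean"]),
}

results = run_workflow(steps, {"raw": [1.0, None, 2.0, 3.0]})
print(results["report"])
```

Steps only name their inputs, so a contribution from another group can be added as one more entry without rewriting the existing ones.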

Example: BRIDGES (BioInformatics)

Diagram labels: Glasgow, Edinburgh, Leicester, Oxford, London, Netherlands; Private data (six instances); CFG Virtual Organisation; Publicly Curated Data (Ensembl, MGI, HUGO, OMIM, SWISS-PROT, RGD, …); DATA HUB; SyntenyGrid Service; blast; Authorisation

Example: FireGrid (proposal)

Maps, models, scenarios

Super-real-time simulation (HPC)

KBS and Planning

Emergency Responders

1000s of sensors & gateway processing

Piper Alpha

Mont Blanc

Kobe

Kings Cross

WTC

Practical Challenges

Technical
A variety of partial answers
Standardisation work is long and political

Social
Sharing of resources means sharing YOUR resources
Contributor recognition and IPR
Defining common schemas and ontologies
Training, funding for software developers and sysadmins

Responsibility of data publishers
Cost, dependability, trustworthiness, capability, flexibility, …

Management of infrastructure
Operation – NGS (national), ACF (local)
Funding
