Patterns for E-Research
Dave Berry, Research Manager
E-Research within the University of Edinburgh, 2nd March 2005
E-Research
“The invention and application of computing methods to extend our capabilities in any research discipline”
“Research in any discipline which benefits from and often depends on the use of advanced facilities and methods for computation, data curation, digital communication and visualisation”
Technology Growth
Gilder’s Law (32X in 4 yrs)
Storage Law (16X in 4 yrs)
Moore’s Law (5X in 4 yrs)
Triumph of Light – Scientific American. George Stix, January 2001
[Figure: Performance per Dollar Spent against Number of Years (0 to 5). Curves: Optical Fibre (bits per second), Chip capacity (# transistors), Data Storage (bits per sq. inch). Doubling times (months): 9, 12, 18.]
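A doubling time can be converted into a growth factor over four years with a one-line formula. The sketch below does that arithmetic for the three doubling times on the chart. The function name is hypothetical, and the slide does not say which doubling time belongs to which law, so the labels name only the doubling times. The 12-month case gives 16X exactly; the 9-month and 18-month cases come out somewhat above 32X and 5X, so the slide's multipliers are probably rounded or based on slightly different doubling times.

```python
def growth_factor(doubling_months: float, years: float = 4.0) -> float:
    """Growth multiple after `years` for a quantity that doubles every `doubling_months`."""
    return 2.0 ** (12.0 * years / doubling_months)


# Doubling times as read from the chart (months).
for months in (9, 12, 18):
    print(f"doubling every {months} months: about {growth_factor(months):.1f}X in 4 years")
```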
Pattern 1: Distributed Collaboration
Groups in different sites working together
Sharing knowledge and ideas
Technologies:
  Shared repositories
    Wikis, SourceForge/NeSCForge, Forums, …
  Videoconferencing
  Computer Supported Cooperative Work (CSCW)
Technology: Access Grid
Microphones
Cameras
Pattern 2: Simulation & Modelling
Large variety of topics, e.g.
  Protein folding
  Position of atoms in semiconductors
  Human heart
  Ecology of ice sheets
Multiple scales
Remote visualisation and control
Example: The TeraGyroid Scientific Experiment
High-density isosurface of the late-time configuration in a ternary amphiphilic fluid as simulated on a 64³ lattice by LB3D.
Gyroid ordering coexists with defect-rich, sponge-like regions.
The dynamical behaviour of such defect-rich systems can only be studied with very large scale simulations, in conjunction with high-performance visualisation and computational steering.
See http://www.realitygrid.org/workshop-2004/presentations/blake.ppt
Example: Terrestrial Carbon Dynamics
Pattern 3: Data archives
Data archives maintain data for widespread use, e.g.
  UK Borders, Go-Geo, … (EDINA)
  ArkDB (Roslin)
  Mouse Atlas (HGU)
  EMBL, UniProt, … (EBI)
  Census, … (MIMAS)
Client-server access
Schemas defined centrally
  Often subject to change…
  … if they’re defined at all!
Infrastructure: Digital Curation Centre
[Diagram labels: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; community support & outreach; research; development co-ordination; service definition & delivery; management & admin support; Collaborative Associates Network of Data Organisations; curation organisations, e.g. DPC]
Pattern 4: Federated data
Sites maintain their own data
  Remote access to other sites
  Control access to your site
Integrated views
  Community-defined schemas
  Translation between schemas
Distributed algorithms
  Run jobs remotely
  Distributed data mining
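A minimal sketch of the federated pattern, assuming two hypothetical sites that keep records in their own local schemas. Each site controls who may query it and translates its rows into a community-defined schema, and an integrated view merges whatever the user is allowed to see. All names, fields, and values here are invented for illustration and do not come from any real archive.

```python
from typing import Callable, Dict, List, Set

Record = Dict[str, object]

# Hypothetical community-defined schema: {"gene": str, "length_bp": int}.


def from_site_a(row: Record) -> Record:
    """Translate site A's local schema (lengths in kilobases) to the community schema."""
    return {"gene": row["symbol"], "length_bp": round(row["len_kb"] * 1000)}


def from_site_b(row: Record) -> Record:
    """Translate site B's local schema (different field names) to the community schema."""
    return {"gene": row["gene_name"], "length_bp": row["bp"]}


class Site:
    """A site that maintains its own data and controls access to it."""

    def __init__(self, name: str, rows: List[Record],
                 translate: Callable[[Record], Record], allowed_users: Set[str]):
        self.name = name
        self.rows = rows
        self.translate = translate
        self.allowed_users = allowed_users

    def query(self, user: str) -> List[Record]:
        # Users without permission simply see nothing from this site.
        if user not in self.allowed_users:
            return []
        return [self.translate(row) for row in self.rows]


def integrated_view(sites: List[Site], user: str) -> List[Record]:
    """Merge translated records from every site, tagging each with its source."""
    view = []
    for site in sites:
        for record in site.query(user):
            view.append({**record, "source": site.name})
    return view


# Made-up example data for two sites.
site_a = Site("A", [{"symbol": "geneX", "len_kb": 1.5}], from_site_a, {"alice", "bob"})
site_b = Site("B", [{"gene_name": "geneY", "bp": 2300}], from_site_b, {"alice"})
```

Calling `integrated_view([site_a, site_b], "alice")` returns records from both sites in the shared schema, while the same call for `"bob"` returns only site A's record, because site B has not granted him access.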
Example: Mass-scale Data Mining
Pattern 5: Parameter Search
Run the same algorithm on different data, e.g.
  Finding local minima
  Combinatorial search
Allows the use of multiple machines, e.g.
  A cluster
  Multiple clusters
  Desktop PCs
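The parameter-search pattern can be sketched as the same local-minimum routine run independently from several starting points, with the results compared at the end. This is an illustration under assumptions: the objective function, step size, and starting points are invented, and a local thread pool stands in for a cluster or a set of desktop PCs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def f(x: float) -> float:
    """Hypothetical objective with two local minima."""
    return x**4 - 3 * x**2 + x


def df(x: float) -> float:
    """Derivative of the objective."""
    return 4 * x**3 - 6 * x + 1


def descend(start: float, rate: float = 0.01, steps: int = 5000) -> float:
    """Plain gradient descent: the 'same algorithm' applied to each starting point."""
    x = start
    for _ in range(steps):
        x -= rate * df(x)
    return x


def parameter_search(starts: Sequence[float], workers: int = 4) -> Tuple[float, List[float]]:
    """Run every descent independently and return the best minimum plus all results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        minima = list(pool.map(descend, starts))
    return min(minima, key=f), minima


best, minima = parameter_search([-2.0, -1.0, 0.0, 1.0, 2.0])
print(f"best x = {best:.3f}, f(best) = {f(best):.3f}")
```

Because the runs share no state, the work divides cleanly across however many machines are available. That independence is what makes the pattern suit clusters and volunteer desktop PCs alike.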
Example: ClimatePrediction.net
See www.climateprediction.net
Composing Patterns
Patterns that compose…
  Complex problems require many inputs and many processes
  Shared contributions compose indefinitely, accumulating knowledge
… and how to compose them
  A common infrastructure
    Technologies, naming, schemas, …
  Workflow languages
  Portals and “problem-solving environments”
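As a rough illustration of what a workflow language provides, the sketch below treats a workflow as named steps plus their dependencies and runs them in dependency order. It is not modelled on any particular workflow system. The step names and functions are hypothetical, and it uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


def run_workflow(steps: Dict[str, Callable], deps: Dict[str, List[str]],
                 inputs: Dict[str, object]) -> Dict[str, object]:
    """Run each step once all the steps it depends on have produced results."""
    results = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        if name in results:
            continue  # supplied as an input, nothing to compute
        args = [results[d] for d in deps.get(name, [])]
        results[name] = steps[name](*args)
    return results


# Hypothetical three-step workflow: simulate, archive the output, then report.
steps = {
    "simulate": lambda params: [p * p for p in params],
    "archive": lambda sims: {"count": len(sims)},
    "report": lambda sims, arch: f"{arch['count']} runs, max {max(sims)}",
}
deps = {
    "simulate": ["parameters"],
    "archive": ["simulate"],
    "report": ["simulate", "archive"],
}

results = run_workflow(steps, deps, {"parameters": [1, 2, 3]})
print(results["report"])
```

Describing the dependencies as data rather than as hard-coded call order is the design choice that lets steps contributed by different groups be recombined.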
Example: BRIDGES (BioInformatics)
[Diagram labels: sites Glasgow, Edinburgh, Leicester, Oxford, London, Netherlands; Private data (×6); CFG Virtual Organisation; DATA HUB; Publicly Curated Data: Ensembl, MGI, HUGO, OMIM, SWISS-PROT, RGD, …; SyntenyGrid Service; blast + Authorisation]
Example: FireGrid (proposal)
Maps, models, scenarios
Super-real-time simulation (HPC)
KBS and Planning
Emergency Responders
1000s of sensors & gateway processing
Piper Alpha
Mont Blanc
Kobe
Kings Cross
WTC
Practical Challenges
Technical
  A variety of partial answers
  Standardisation work is long and political
Social
  Sharing of resources means sharing YOUR resources
  Contributor recognition and IPR
  Defining common schemas and ontologies
  Training, funding for software developers and sysadmins
Responsibility of data publishers
  Cost, dependability, trustworthiness, capability, flexibility, …
Management of infrastructure
  Operation – NGS (national), ACF (local)
  Funding