Science Cloud: Paul Watson, Newcastle University, UK, paul.watson@ncl.ac.uk
![Page 2: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/2.jpg)
Research Challenge
Understanding the brain is the greatest informatics challenge
• Enormous implications for science:
• Medicine
• Biology
• Computer Science
![Page 3: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/3.jpg)
Collecting the Evidence
100,000 neuroscientists generate huge quantities of data:
– molecular (genomic/proteomic)
– neurophysiological (time-series activity)
– anatomical (spatial)
– behavioural
![Page 4: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/4.jpg)
Neuroinformatics Problems
• Data is:
– expensive to collect but rarely shared
– in proprietary formats & locally described
• The result is:
– a shortage of analysis techniques that can be applied across neuronal systems
– limited interaction between research centres with complementary expertise
![Page 5: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/5.jpg)
Data in Science
• Bowker’s “Standard Scientific Model”
1. Collect data
2. Publish papers
3. Gradually lose the original data
The New Knowledge Economy & Science & Technology Policy, G.C. Bowker
• Problems:
– papers often draw conclusions from data that is not published
– inability to replicate experiments
– data cannot be re-used
![Page 6: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/6.jpg)
Codes in Science
• Three stages for codes
1. Write code and apply to data
2. Publish papers
3. Gradually lose the original codes
• Problems:
– papers often draw conclusions from codes that are not published
– inability to replicate experiments
– codes cannot be re-used
![Page 7: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/7.jpg)
Plan
• Neuroinformatics - a challenging e-science application
• CARMEN – addressing the challenges
• Cloud Computing for e-science
– Lessons we’ve Learnt
• The Promise of Commercial Clouds
![Page 8: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/8.jpg)
Focus on Neural Activity
Cracking the neural code
[Figure labels: neurone 1, neurone 2, neurone 3]
Raw voltage signal data typically collected using single or multi-electrode array recording
![Page 9: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/9.jpg)
Epilepsy Exemplar
Data analysis guides surgeon during operation
Further analysis provides evidence
WARNING! The next 2 slides show an exposed human brain
![Page 12: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/12.jpg)
CARMEN
enables sharing and collaborative exploitation of data, analysis code and expertise that are not physically collocated
![Page 13: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/13.jpg)
CARMEN Project
Stirling
St. Andrews
Newcastle
York
Sheffield
Cambridge
Imperial
Plymouth
Warwick
Leicester
Manchester
UK EPSRC e-Science Pilot
$7M (2006-10)
20 Investigators
![Page 15: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/15.jpg)
CARMEN e-Science Requirements
• Store
– very large quantities of data (100TB+)
• Analyse
– suite of neuroinformatics services
– support data intensive analysis
• Automate
– workflow
• Share
– under user-control
![Page 16: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/16.jpg)
Background: North East Regional e-Science Centre
• 25 Research Projects across many domains:
– Bioinformatics, Ageing & Health, Neuroscience, Chemical Engineering, Transport, Geomatics, Video Archives, Artistic Performance Analysis, Computer Performance Analysis, ...
• Same key needs:
Store
Analyse
Automate
Share
![Page 17: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/17.jpg)
Result: e-Science Central
• Integrated Store-Analyse-Automate-Share infrastructure
• Web-based
• Generic
– CARMEN neuroinformatics & chemistry as pilots
![Page 18: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/18.jpg)
Science Cloud Architecture
Data storage
and
analysis
Access over Internet
(typically via browser)
Upload data &
services
Run analyses
![Page 19: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/19.jpg)
Cloud Services Continuum (based on Robert Anderson)
Platform (PaaS)
Infrastructure (IaaS)
Software (SaaS)
Google Apps
Google AppEngine
Amazon EC2 & S3
http://et.cairene.net/2008/07/03/cloud-services-continuum/
Microsoft Azure
Salesforce.com
![Page 20: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/20.jpg)
Science Cloud Options
Cloud Infrastructure: Storage & Compute
Science App 1 .... Science App n
Cloud Infrastructure: Storage & Compute
Science Platform
Science App 1 .... Science App n
Users
Service Developers
![Page 21: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/21.jpg)
CARMEN Cloud
Filestore with Pattern Search
Database
Metadata
Service Repository
Processing
Workflow Enactment
Workflow
Security
Browsers & Rich Clients
![Page 22: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/22.jpg)
Editing and Running a Workflow on the Web
![Page 23: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/23.jpg)
Viewing the output of Workflow Runs
Workflow
Result File
![Page 24: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/24.jpg)
Viewing results
![Page 25: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/25.jpg)
Blogs and links
Communicating Results
Linking to results & workflows
![Page 26: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/26.jpg)
What we learnt: Moving into a Cloud
• Moving existing technologies into a cloud can be difficult
– some can’t run in a Cloud at all
![Page 27: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/27.jpg)
Raw Data Exploration with Signal Data Explorer
![Page 28: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/28.jpg)
What we learnt: Scalability
• Clouds offer the potential for scalability
– grab compute power only when needed
• But developers have to write scalable code
– for Infrastructure as a Service Clouds
![Page 29: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/29.jpg)
Dynasoar: Dynamic Deployment
A request to s4
[Diagram labels: Consumer (C), Web Service Provider (WSP), Host Provider with node 1 (s2, s5), node 2, …, node n (s2), Service Repository (SR); arrows req and res; steps 1, 2: service fetch & deploy, 3]
The deployed service remains in place and can be re-used, unlike job scheduling
![Page 30: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/30.jpg)
Dynasoar
[Diagram labels: Consumer (C), Web Service Provider (WSP), Host Provider with node 1 (s2, s5), node 2, …, node n (s2); arrows req and res]
A request for s2 is routed to an existing deployment of the service
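The routing behaviour described on the two Dynasoar slides above can be sketched roughly as follows. This is an illustrative sketch only, not Dynasoar's actual code: the `Node` and `HostProvider` classes, the `handle` method, and the "least loaded node" placement rule are all assumptions introduced for the example. A request for a service that is already deployed is routed to the existing deployment; otherwise the service is fetched from the repository, deployed on a node, and left in place for later requests.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """One host-provider node and the services currently deployed on it."""
    name: str
    deployed: Dict[str, Callable] = field(default_factory=dict)


class HostProvider:
    """Hypothetical host provider: routes requests and deploys on demand."""

    def __init__(self, nodes: List[Node], repository: Dict[str, Callable]):
        self.nodes = nodes
        self.repository = repository  # stands in for the Service Repository

    def handle(self, service: str, payload) -> Tuple[str, str, object]:
        # Route to an existing deployment if any node already hosts the service.
        for node in self.nodes:
            if service in node.deployed:
                return node.name, "reused", node.deployed[service](payload)
        # Otherwise fetch the service from the repository and deploy it on the
        # node with the fewest services (an assumed placement rule).
        target = min(self.nodes, key=lambda n: len(n.deployed))
        target.deployed[service] = self.repository[service]
        # The deployment stays in place, so later requests can reuse it.
        return target.name, "deployed", target.deployed[service](payload)
```

With node 1 hosting s2 and s5, node 2 empty, and node n hosting s2, as in the diagrams, a first request to s4 would trigger a fetch and deploy, while a request for s2 would be served by an existing deployment.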
![Page 31: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/31.jpg)
Adaptive Dynamic Deployment with Dynasoar
[Figure: response time (seconds) and processors in pool, plotted against arrival rate (messages per second)]
Adding processors as you need them optimises resources and saves money in pay-as-you-go clouds
Commercial pay-as-you-go clouds would allow us to avoid this limit
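One way to picture "adding processors as you need them" is a simple reactive rule driven by the measured response time. The sketch below is an assumption made for illustration, not the policy Dynasoar used: the function name `next_pool_size`, the `target` and `low_water` thresholds, and the one-processor step are all hypothetical. Setting `max_pool` models a fixed local pool; leaving it as `None` models a pay-as-you-go cloud with no such limit.

```python
from typing import Optional


def next_pool_size(current: int,
                   response_time: float,
                   target: float,
                   max_pool: Optional[int] = None,
                   low_water: float = 0.5) -> int:
    """Return the pool size to use for the next interval (hypothetical rule)."""
    if response_time > target:
        # Too slow: add one processor, unless a fixed pool limit stops us.
        grown = current + 1
        return grown if max_pool is None else min(grown, max_pool)
    if response_time < low_water * target and current > 1:
        # Comfortably fast: release one processor to save money.
        return current - 1
    # Within the acceptable band: keep the pool as it is.
    return current
```

Under this rule a fixed pool stops growing once it reaches `max_pool`, so response time can only rise from then on, whereas an uncapped pool keeps adding processors while demand stays high.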
![Page 32: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/32.jpg)
Hot Off the Press..
• Recent experiments with Microsoft Azure Cloud
– running Chemical analyses
– Silverlight UI
Thanks to:
- Paul Appleby & Team at the Microsoft Technology Centre, Reading
- & MS e-Science Group
![Page 35: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/35.jpg)
Microsoft Azure Cloud for e-Science Demo
![Page 36: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/36.jpg)
Why are Commercial Clouds Important: Before
Research
1. Have good idea
2. Write proposal
3. Wait 6 months
4. If successful, wait 3 months
5. Install Computers
6. Start Work
Science Start-ups
1. Have good idea
2. Write Business Plan
3. Ask VCs to fund
4. If successful..
5. Install Computers
6. Start Work
![Page 37: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/37.jpg)
Why Use Commercial Clouds:
1. Have good idea
2. Grab nodes from Cloud provider
3. Start Work
4. Pay for what you used
• also scalability, cost, sustainability
![Page 38: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/38.jpg)
Commercial Clouds to the Rescue?
• Focus currently on infrastructure as a service
• But, this is only part of the stack
• Can we have pay-as-you-go Science Cloud Platforms?
![Page 39: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/39.jpg)
A Sustainable Science Cloud
Science Platform as a Service
Science App 1 .... Science App n
Commercial Clouds
?
?
Problem: delivering the e-science platform
www.inkspotscience.com
e-Science Central
Cloud Infrastructure: Storage & Compute
![Page 40: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/40.jpg)
Summary: e-Science Central & CARMEN
Software as a Service
Cloud Computing
Social Networking
e-Science Central / CARMEN
• Dynamic Resource Allocation
• Pay-as-you-Go*
• Web based
• Works anywhere
• Controlled Sharing
• Collaboration
• Communities
![Page 41: Science Cloud Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk](https://reader035.vdocuments.us/reader035/viewer/2022062313/56649cd85503460f949a19ee/html5/thumbnails/41.jpg)
Summary
• e-Science Central
– Store-Analyse-Automate-Share e-science platform
– Adding content from a range of domains
• CARMEN is piloting this approach for neuroinformatics
• Cloud computing can revolutionise e-science
– reduce time from idea to realisation