TRANSCRIPT
Cloud infrastructure for training in Life Sciences
Manuel Corpas
The Genome Analysis Centre
[egi.edu]
@manuelcorpas
Bottleneck is NOT
• Production of data
• Technology
• Budget
Bottleneck IS
• TRAINING!
  – Bioinformatics
Bioinformatics Training
Mick Watson
Roslin Institute
1. Most bioinformaticians are bad scientists
2. Most biologists are bad bioinformaticians: poor computer skills, bad at maths/statistics
3. Short courses benefit no-one
Carole Goble
University of Manchester
• Students and trainers don’t like learning how to use new things
• Trainees need to be eased in by using familiar stuff
How can we bridge the gap?
Titus Brown
Michigan State University
1. Participants bring their laptops
2. Pre-installed machines
3. Cloud computing
Cloud + Bioinformatics + Training
=
Why Bioinformatics Training in the Cloud?
3 Advantages
[Adapted from Titus Brown]
1. Participants can use their own
   – Computers
   – Web browser
2. Graphical interaction via
   – X Windows
   – IPython
   – Knitr
3. Compute can be scaled up/down depending on what is being taught
3 Challenges
[Adapted from Titus Brown]
1. Institutional resistance
   – Privacy of clinically sensitive data
2. Reliable network access and servers needed
   – >30 people clicking at the same time!
3. Cost
[Diagram: Trainee, Trainer, Registry, Materials, Data, Genomics, VMs+tools, NM]
National eResearch Collaboration Tools and Resources (NeCTAR)
Watson-Haigh et al. 2013
MRC UK Microbial Genomics
http://climb.ac.uk
• OpenStack
• Each VM: 32 GB RAM, 8 cores, 1 TB
• Bio-Linux
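As a rough illustration of what the per-VM figures above imply for a whole course, the sketch below multiplies them out for a class. It assumes one VM per trainee plus a couple of spares, which the slides do not state. The class size of 30 is borrowed from the challenges slide purely as an example, and the names `VMFlavor` and `class_footprint` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMFlavor:
    """Per-VM resources, taken from the slide (32 GB RAM, 8 cores, 1 TB)."""
    ram_gb: int
    cores: int
    disk_tb: float


def class_footprint(flavor: VMFlavor, trainees: int, spare: int = 0) -> dict:
    """Total resources for a course, ASSUMING one VM per trainee plus spares."""
    n = trainees + spare
    return {
        "vms": n,
        "ram_gb": n * flavor.ram_gb,
        "cores": n * flavor.cores,
        "disk_tb": n * flavor.disk_tb,
    }


if __name__ == "__main__":
    slide_flavor = VMFlavor(ram_gb=32, cores=8, disk_tb=1.0)
    # 30 trainees and 2 spare VMs are illustrative numbers, not from the talk.
    print(class_footprint(slide_flavor, trainees=30, spare=2))
```

Under these assumptions, a 30-person course with two spares would need 32 VMs, about 1 TB of RAM, and 256 cores in total, which gives a sense of why reliable servers and cost appear among the challenges.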
Nick Loman, University of Birmingham
Why Cloud?
• Very little technical knowledge required
• Snapshot ready for replication
• User can take instance home
Cloud + Bioinformatics + Training
=
• Rafael Jiménez
• Titus Brown
• Mick Watson
• Carole Goble
• Nick Loman
• Vicky Schneider