Post on 01-Jan-2016


Cloud infrastructure for training in Life Sciences

Manuel Corpas

The Genome Analysis Centre

[egi.edu]

@manuelcorpas


Bottleneck is NOT

• Production of data
• Technology
• Budget


Bottleneck IS

• TRAINING!
– Bioinformatics


Bioinformatics Training


Mick Watson

Roslin Institute


1. Most bioinformaticians are bad scientists

2. Most biologists are bad bioinformaticians: poor computer skills, bad at maths/statistics

3. Short courses benefit no-one


Carole Goble

University of Manchester


• Students and trainers don’t like learning how to use new things

• Trainees need to be eased in by using familiar stuff


How can we bridge the gap?


Titus Brown

Michigan State University


1. Participants bring their laptops

2. Pre-installed machines

3. Cloud computing


Cloud + Bioinformatics + Training

=


Why Bioinformatics Training in the Cloud?


3 Advantages


1. Participants can use their own
– Computers
– Web browser

2. Graphical interaction via
– X Windows
– IPython
– Knitr

3. Compute can be scaled up/down depending on what is being taught


3 Challenges


1. Institutional resistance
– Privacy of clinically sensitive data

2. Reliable network access and servers needed
– >30 people clicking at the same time!

3. Cost


[Diagram; labels: Materials, Data, NM, Trainee, Trainer, Registry, Genomics, VMs+tools]


National eResearch Collaboration Tools and Resources (NeCTAR)

Watson-Haigh et al. 2013

MRC UK Microbial Genomics

• OpenStack
• Each VM: 32 GB RAM, 8 cores, 1 TB
• Bio-Linux
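
As a rough illustration of what the per-VM figures above mean for a whole course, the sketch below multiplies them out. It assumes one VM per trainee and a class of 30, a number borrowed from the "30 people clicking at the same time" challenge; neither assumption is stated for this deployment, and the names `VMSpec` and `course_footprint` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMSpec:
    """Resources of a single training VM (figures taken from the slide)."""
    ram_gb: int
    cores: int
    disk_tb: int


def course_footprint(spec: VMSpec, trainees: int) -> VMSpec:
    """Total resources needed if every trainee gets one VM of this spec.

    Assumption: one VM per trainee, with no sharing or oversubscription.
    """
    if trainees < 0:
        raise ValueError("trainees must be non-negative")
    return VMSpec(
        ram_gb=spec.ram_gb * trainees,
        cores=spec.cores * trainees,
        disk_tb=spec.disk_tb * trainees,
    )


# Per-VM figures from the slide: 32 GB RAM, 8 cores, 1 TB.
COURSE_VM = VMSpec(ram_gb=32, cores=8, disk_tb=1)

# Assumed class size of 30 (not a figure given for this course).
total = course_footprint(COURSE_VM, trainees=30)
print(f"{total.ram_gb} GB RAM, {total.cores} cores, {total.disk_tb} TB")
```

Changing `trainees` or the `VMSpec` values gives a quick sense of how the footprint grows or shrinks with the course being taught.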


Nick Loman, University of Birmingham

Why Cloud?

• Very little technical knowledge required

• Snapshot ready for replication

• User can take instance home


Cloud + Bioinformatics + Training

=


Rafael Jiménez

[email protected]