Amazon resources for bioinformatics
DESCRIPTION
Walk through using CloudBioLinux, CloudMan, and BioCloudCentral to do custom biological analyses on Amazon EC2 hardware.
TRANSCRIPT
Amazon resources for bioinformatics
Brad Chapman
Bioinformatics Interest Group, 18 Oct 2012
Goals
Automate:
Reduce steps
Remove activation energy
Increase abstraction
Improve:
Sharing
Reproducibility
Teaching
Installation
Easier installation
No installation
Challenge
Biology computing platform
Widely accessible
Customizable
Community driven
Not only Amazon
http://gigaom.com/cloud/what-google-compute-engine-means-for-cloud-computing/
CloudBioLinux
Amazon image with bioinformatics software and libraries
Automated build framework
Community effort to maintain and extend
http://cloudbiolinux.org
CloudMan
SGE cluster plus automation
Web interface and monitoring
Persistence and sharing
Powers the Galaxy Cloud offering
http://usecloudman.org/
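Because CloudMan provides an SGE cluster, work is handed to the scheduler as batch scripts. The following is a minimal sketch of generating such a script in Python. The function name `sge_job_script`, the job name, the command, and the parallel environment name `smp` are illustrative assumptions and not part of CloudMan; only the `#$` directives (`-N`, `-cwd`, `-j y`, `-pe`) are standard SGE options.

```python
def sge_job_script(job_name, command, slots=1, parallel_env="smp"):
    """Return the text of an SGE batch script suitable for `qsub`.

    `parallel_env` is an assumed name; parallel environments are
    configured per cluster, so check `qconf -spl` on the real system.
    """
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",  # job name shown by qstat
        "#$ -cwd",            # run from the submission directory
        "#$ -j y",            # merge stderr into stdout
    ]
    if slots > 1:
        # Request several slots through a parallel environment.
        lines.append(f"#$ -pe {parallel_env} {slots}")
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical job: the command is a placeholder, not the real pipeline.
script = sge_job_script("variant_calling", "echo running on $(hostname)", slots=4)
print(script)
```

The generated text could be saved to a file and submitted with `qsub` from the master node.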
BioCloudCentral
Automate setup of Amazon instance
Launch CloudBioLinux and CloudMan
Provide easy ssh access, no key pairs
http://biocloudcentral.org
Acknowledgments
CloudBioLinux: Ntino Krampis, Tim Booth, Dawn Field, Pjotr Prins, John Chilton, and the CloudBioLinux community.
CloudMan: Enis Afgan, James Taylor
BioCloudCentral: Enis Afgan, John Chilton, Dannon Baker
Documentation
http://cda.currentprotocols.com/WileyCDA/CPUnit/refId-bi1109.html
What we'll do
1 Sign up for Amazon
2 Start a CloudBioLinux/CloudMan instance
3 Add nodes to create a compute cluster
4 Run variant calling pipeline
Everything done through the web
Getting started
Sign up for Amazon Web Services: http://aws.amazon.com
Get security credentials (Access Key and Secret Key): http://portal.aws.amazon.com/gp/aws/securityCredentials
Launch: http://biocloudcentral.org
Ready two minutes later
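The launch form needs both the Access Key and the Secret Key, so it can help to confirm both are at hand before starting. A small sketch, assuming the keys are kept in the conventional `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. The helper names `load_aws_credentials` and `mask` are hypothetical, and the example values are made-up placeholders rather than real keys.

```python
import os


def load_aws_credentials(environ=None):
    """Return (access_key, secret_key), raising if either is missing."""
    environ = os.environ if environ is None else environ
    wanted = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    values = {name: environ.get(name, "").strip() for name in wanted}
    missing = [name for name in wanted if not values[name]]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return values["AWS_ACCESS_KEY_ID"], values["AWS_SECRET_ACCESS_KEY"]


def mask(secret, visible=4):
    """Hide all but the last `visible` characters for safe display."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Placeholder values for illustration only.
example_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLE-ACCESS-KEY",
    "AWS_SECRET_ACCESS_KEY": "EXAMPLE-SECRET-KEY",
}
access_key, secret_key = load_aws_credentials(example_env)
print("Access Key:", access_key)
print("Secret Key:", mask(secret_key))
```

Masking the Secret Key before printing keeps it out of terminal scrollback and shared screens.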
Log in to CloudMan
Shared CloudMan images
Package a complete analysis environment
Data
Customizations
Sharable with other users
Share string with NGS analysis platform:
cm-b53c6f1223f966914df347687f6fc818/shared/2012-07-23--19-23/
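The share string above appears to have three slash-separated parts. A brief sketch of splitting it apart, assuming the first part names the storage bucket and the last part is a timestamp of when the share was created. That reading is inferred from the shape of the string, and `parse_share_string` is a hypothetical helper, not a CloudMan function.

```python
from datetime import datetime
from typing import NamedTuple


class ShareString(NamedTuple):
    bucket: str              # assumed: name of the bucket holding the share
    snapshot_time: datetime  # assumed: when the share was created


def parse_share_string(share):
    """Split '<bucket>/shared/<YYYY-MM-DD--HH-MM>/' into its parts."""
    parts = share.strip().strip("/").split("/")
    if len(parts) != 3 or parts[1] != "shared":
        raise ValueError(f"unexpected share string: {share!r}")
    bucket, _, stamp = parts
    return ShareString(bucket, datetime.strptime(stamp, "%Y-%m-%d--%H-%M"))


share = parse_share_string(
    "cm-b53c6f1223f966914df347687f6fc818/shared/2012-07-23--19-23/"
)
print(share.bucket)
print(share.snapshot_time.isoformat())
```

Validating the string before pasting it into the launch form catches truncated copies early.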
Start CloudMan
CloudMan console
CloudMan admin page
CloudMan: managing a cluster
Associated Galaxy instance
Analysis data on shared instance
Graphical variant-calling pipeline
Analysis data linked to pipeline
Configure pipeline
Run pipeline
Shut everything down
What happened
1 Sign up for Amazon
2 Start a CloudBioLinux/CloudMan instance
3 Add nodes to create a compute cluster
4 Run variant calling pipeline
Everything done through the web
ssh to the machine
$ ssh [email protected]
[email protected]'s password:
Welcome to Ubuntu 12.04 LTS
(GNU/Linux 3.2.0-23-virtual x86_64)
ubuntu@ip-10-72-197-11:~$
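The shell prompt shows the instance's internal EC2 hostname, which encodes its private IP address with dashes in place of dots. A short sketch of recovering the address from that hostname; `private_ip_from_hostname` is a hypothetical helper name.

```python
import ipaddress
import re


def private_ip_from_hostname(hostname):
    """Convert an EC2 internal hostname like 'ip-10-72-197-11' to an IPv4 address."""
    match = re.fullmatch(r"ip-(\d+)-(\d+)-(\d+)-(\d+)", hostname)
    if match is None:
        raise ValueError(f"not an EC2-style internal hostname: {hostname!r}")
    # IPv4Address rejects out-of-range octets such as 999.
    return ipaddress.IPv4Address(".".join(match.groups()))


address = private_ip_from_hostname("ip-10-72-197-11")
print(address, "private" if address.is_private else "public")
```

This address is only reachable from inside Amazon's network, which is why cluster nodes use it to talk to each other while the ssh login goes through the public hostname.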
NX graphical client: login
http://www.nomachine.com/download.php
NX graphical client: desktop
Summary
Use cloud resources to build:
Machines with standard software
Cluster management
Analysis pipelines
Reproducible, sharable instances
Web-based interfaces