Cloud Computing Mick Watson Director of ARK-Genomics The Roslin Institute

Upload: israel-hallaway

Post on 22-Dec-2015


Page 1: Cloud Computing Mick Watson Director of ARK-Genomics The Roslin Institute

Cloud Computing

Mick Watson
Director of ARK-Genomics

The Roslin Institute


Structure

• What is a computer
  – Desktops / servers / clusters
  – Clients / servers
• Virtualisation
• The Cloud
• Accessing the Amazon cloud
• Costs etc


What is a computer?


What is a server?


What is a cluster?

• A cluster is a connected set of computers (nodes)
• Any job can be run on any of the nodes
• A server may be a single PC, or it may be a cluster
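The idea that any job can run on any node can be sketched in a few lines of Python. This is a toy model only: the `Node` class, the `dispatch` function, the node names and the job names are all hypothetical, and a real cluster scheduler is far more sophisticated. Here each job simply goes to whichever node currently has the fewest jobs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One computer in the cluster (hypothetical toy model)."""
    name: str
    jobs: list = field(default_factory=list)


def dispatch(jobs, nodes):
    """Give each job to the node that currently holds the fewest jobs."""
    for job in jobs:
        # min() returns the first node with the smallest job count,
        # so ties are broken by the order of the node list.
        target = min(nodes, key=lambda n: len(n.jobs))
        target.jobs.append(job)
    return {node.name: node.jobs for node in nodes}


nodes = [Node("node1"), Node("node2"), Node("node3")]
assignment = dispatch(["align", "assemble", "annotate", "blast"], nodes)
print(assignment)
```

With four jobs and three nodes, the first three jobs land on separate nodes and the fourth goes back to `node1`.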


An operating system?

• An OS is the software that runs the computer:
  – E.g. Windows
  – E.g. Linux
  – E.g. Mac OS X
  – E.g. Android
  – E.g. Solaris
  – E.g. iOS


Virtualisation

• You can run an entire computer inside a computer
• Take the OS, the data and all running processes
• Create an “image”
• Recreate that image inside another PC
• Access it as if it were a normal (physical) PC

• http://en.wikipedia.org/wiki/Virtualization


What is “the cloud”?

• There is not just one!
  – Amazon EC2
  – Rackspace
  – Etc
• “The Cloud” refers to a large cluster of computers, in which you can create, for a fee, as many virtual computers as you like


AMAZON EC2


We will use Amazon EC2

• Terminology:
  – EC2 = “Elastic Compute Cloud”
  – Image – a preconfigured computer image. Like a template.
  – Instance – a running virtual computer created from an image, which you can log in to and use
• Select a pre-configured Amazon Machine Image (AMI) to get up and running immediately.
• Or create an AMI containing your applications, libraries, data, and associated configuration settings.
• Configure security and network access
• Choose which instance type(s) you want, then start, terminate, and monitor as many instances as you like.
• Pay only for the resources that you actually consume, like instance-hours or data transfer
• Close down the instance(s) when finished
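The relationship between an image, its instances and the instance-hours you pay for can be sketched as a small Python model. Nothing here talks to Amazon: the `Image` and `Instance` classes, the `launch` helper, the image name and the hourly rate are all hypothetical, and the rule that a partial hour is charged as a full hour is an assumption made for illustration, so check the current pricing pages before relying on it.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """A preconfigured template (stands in for an AMI)."""
    name: str


@dataclass
class Instance:
    """A virtual computer started from an image."""
    image: Image
    instance_type: str
    state: str = "running"

    def terminate(self):
        self.state = "terminated"


def launch(image, instance_type, count=1):
    """Start `count` independent instances from the same image."""
    return [Instance(image, instance_type) for _ in range(count)]


def instance_hours(seconds_running):
    """Billable hours, ASSUMING each partial hour is charged in full."""
    return math.ceil(seconds_running / 3600)


def cost(seconds_running, hourly_rate):
    """Cost of one instance for the given run time."""
    return instance_hours(seconds_running) * hourly_rate


image = Image("my-bioinformatics-image")      # hypothetical image name
instances = launch(image, "m1.small", count=2)

# Close down every instance when finished, so billing stops.
for inst in instances:
    inst.terminate()

HYPOTHETICAL_RATE = 0.10  # placeholder price per instance-hour, not a real quote
print(instance_hours(90 * 60), cost(90 * 60, HYPOTHETICAL_RATE))
```

Under the rounding assumption above, a 90-minute run counts as two instance-hours, which is why closing down instances promptly matters.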


Linux

• Linux refers to an entire family of operating systems:
  – Red Hat
  – Ubuntu
  – Debian etc
• Linux is free
• Many of the computers that power the internet run Linux
• Almost all bioinformaticians use it
• Powerful, extendable and open


The power is the command line

• Don’t panic!


BioLinux

• http://nebc.nerc.ac.uk/tools/bio-linux/bio-linux-6.0
• A Linux operating system designed for bioinformatics
• More than 500 bioinformatics software programs installed on top of an Ubuntu 10.04 base
• There are BioLinux AMIs on EC2 – CloudBioLinux
• We have created our own AMI based on CloudBioLinux


How powerful?

• Once you have selected the “type” of computer (AMI) you must then select the size and power
• M1 Small Instance (Default): 1.7 GiB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of local instance storage, 32-bit or 64-bit platform
• M1 Medium Instance: 3.75 GiB of memory, 2 EC2 Compute Units (1 virtual core with 2 EC2 Compute Units), 410 GB of local instance storage, 32-bit or 64-bit platform
• M1 Large Instance: 7.5 GiB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform
• M1 Extra Large Instance: 15 GiB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform
• M3 Extra Large Instance: 15 GiB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), EBS storage only, 64-bit platform
• M3 Double Extra Large Instance: 30 GiB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), EBS storage only, 64-bit platform


How powerful?

• Once you have selected the “type” of computer (AMI) you must then select the size and power
• Micro Instance: 613 MiB of memory, up to 2 EC2 Compute Units (for short periodic bursts), EBS storage only, 32-bit or 64-bit platform (free)
• High-Memory Extra Large Instance: 17.1 GiB of memory, 6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each), 420 GB of local instance storage, 64-bit platform
• High-Memory Double Extra Large Instance: 34.2 GiB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform
• High-Memory Quadruple Extra Large Instance: 68.4 GiB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform
• High-CPU Medium Instance: 1.7 GiB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of local instance storage, 32-bit or 64-bit platform
• High-CPU Extra Large Instance: 7 GiB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform


How powerful?

• Once you have selected the “type” of computer (AMI) you must then select the size and power
• Cluster Compute Quadruple Extra Large: 23 GiB of memory, 33.5 EC2 Compute Units, 1690 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• Cluster Compute Eight Extra Large: 60.5 GiB of memory, 88 EC2 Compute Units, 3370 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• Cluster GPU Quadruple Extra Large: 22 GiB of memory, 33.5 EC2 Compute Units, 2 x NVIDIA Tesla “Fermi” M2050 GPUs, 1690 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• High I/O Quadruple Extra Large: 60.5 GiB of memory, 35 EC2 Compute Units, 2 x 1024 GB of SSD-based local instance storage, 64-bit platform, 10 Gigabit Ethernet
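Choosing a size comes down to comparing your job's memory and compute needs against the figures on these slides. The Python sketch below shows one way to do that by hand. The memory and EC2 Compute Unit values are copied from a subset of the instance types listed in this talk; the function names `total_ecu` and `candidates` are hypothetical, and the catalogue is a snapshot from the slides rather than a current price list.

```python
# (name, memory in GiB, total EC2 Compute Units), in slide order.
INSTANCE_TYPES = [
    ("M1 Small", 1.7, 1),
    ("M1 Medium", 3.75, 2),
    ("M1 Large", 7.5, 4),
    ("M1 Extra Large", 15, 8),
    ("M3 Extra Large", 15, 13),
    ("M3 Double Extra Large", 30, 26),
    ("High-Memory Extra Large", 17.1, 6.5),
    ("High-Memory Double Extra Large", 34.2, 13),
    ("High-Memory Quadruple Extra Large", 68.4, 26),
    ("High-CPU Medium", 1.7, 5),
    ("High-CPU Extra Large", 7, 20),
    ("Cluster Compute Quadruple Extra Large", 23, 33.5),
    ("Cluster Compute Eight Extra Large", 60.5, 88),
]


def total_ecu(virtual_cores, ecu_per_core):
    """Total compute units = number of virtual cores x units per core."""
    return virtual_cores * ecu_per_core


def candidates(min_memory_gib, min_ecu):
    """Names of instance types that meet both requirements, in slide order."""
    return [
        name
        for name, memory, ecu in INSTANCE_TYPES
        if memory >= min_memory_gib and ecu >= min_ecu
    ]


# M3 Double Extra Large: 8 virtual cores with 3.25 units each.
print(total_ecu(8, 3.25))

# A job needing at least 30 GiB of memory and 20 compute units.
print(candidates(30, 20))
```

The sketch deliberately returns every type that fits rather than a single "best" one, because the final choice also depends on cost and storage, which are not modelled here.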


THE PRACTICAL


• http://www.ark-genomics.org/events-online-training/eu-training-course

• In this course you will start a new Amazon EC2 instance and begin to learn some of the essentials of the Linux command line

• Don’t panic.