
University of Tartu, Institute of Computer Science

Basics of Grid and Cloud Computing (Gridi ja pilvetehnoloogia alused)

(http://courses.cs.ut.ee/2012/cloud)

[email protected]

2011/12 Spring

Practical Information

Lectures: Wed 10:15, Liivi 2 - 111

• 1-8: Eero Vainikko – Grid Computing

• 9-16: Satish Narayana Srirama – Clouds

Computer Classes:

• group 4: Mon 10:15 Liivi 2 - 205 ;

– Grid: Hardi Teder [email protected]

– Cloud: Reimo Rebane [email protected]

• group 3: Tue 8:15 Liivi 2 - 205; Pelle Jakovits [email protected]

• group 1: Thu 10:15 Liivi 2 - 205; Pelle Jakovits [email protected]

• group 2: Thu 14:15 Liivi 2 - 205; Riivo Talviste [email protected]


• Final grade:

– Active participation at lectures (ca 10%)

* Devising questions for the online study questionnaire within 24 h after each lecture

– Solution of Computer Class exercises

– Cloud project

– Written exam (Wed, 30 May 2012): 50%

NB! It is crucial to meet the deadlines for all home assignments!

Syllabus

Lectures (1-8):

• Introduction to the subject (HPC history, supercomputers, clusters, Grid; examples, visions, projects...)

• Grid architecture

• Grid Security concepts (PKI, Authorisation, CA, etc.)

• Globus Toolkit (what a virtual organisation is, how to achieve it using GT, etc.), OGSA, WSRF

• Other Grids (UNICORE, LCG2, SunGE, ...)

• Condor, OpenPBS, Sun GE, LSF.

• NorduGrid, BalticGrid, Estonian Grid.

• Desktop-Grids (MiG, F2F)

• Examples of different grid solutions in the world


Computer Classes (preliminary) schedule:

Exercises on Grid computing (Hardi Teder, Pelle Jakovits, Riivo Talviste)

• 6-9.02 Python intro

• 13-16.02 Hello, Grid! Grid information systems, submitting first grid job

• 20-23.02 Grid security. Breaking RSA code

• 27.02-1.03 Data management on grid

• 5-8.03 Job management on grid

• 12-15.03 Grid user interfaces and tools. POV-Ray rendering

• 19-22.03 Grids and clouds, the road ahead.

• 26-29.03 TBA


Cloud Lectures (9-16):

• by: Dr Satish Narayana Srirama

Exercises on Cloud computing (Pelle Jakovits, Riivo Talviste, Reimo Rebane)

• 5-9.04 Amazon EC2, Amazon S3, Elastic Fox, Google AppEngine

• 12-16.04 Eucalyptus, SciCloud, Auto Scaling & special features in EC2

• 19-23.04 Hadoop

• 26-30.04 Hadoop continued & Selecting the mini project topic

• 3-7.05

• 10-14.05 Preliminary results of project

• 17-21.05

• 24-28.05 Project delivery

Literature

1. Fran Berman, Geoffrey C. Fox and Anthony J. G. Hey, Grid Computing: Making the Global Infrastructure a Reality, John Wiley & Sons, 2003 (Grid Computing (http://www.grid2002.org/)).

2. Ian Foster and Carl Kesselman (eds.), The Grid: Blueprint for a New Computing Infrastructure, 2nd edition, Morgan Kaufmann Publishers, 2004.

3. Michael Di Stefano, Distributed Data Management for Grid Computing, John Wiley & Sons, 2005.

4. F. Travostino, J. Mambretti, G. Karmous-Edwards (eds.), Grid Networks: Enabling Grids with Advanced Communication Technology, John Wiley & Sons, 2006.

5. Vladimir Silva, Grid Computing For Developers, Charles River Media, 2006.

6. A. Chakrabarti, Grid Computing Security, Springer 2007.

7. R. Prodan, T. Fahringer, Grid Computing: Experiment Management, Tool Integration, and Scientific Workflows, Springer, 2007.

8. Yang Xiao, Security in Distributed, Grid, Mobile, and Pervasive Computing, Auerbach Publications, 2007.

9. Introduction to Grid Computing (http://www.redbooks.ibm.com/redbooks/pdfs/sg246778.pdf).

10. Open Grid Forum (http://www.ogf.org).

11. The Globus Alliance (http://www.globus.org/).

12. Nordugrid (http://www.nordugrid.org/).

13. Estonian Grid (http://grid.eenet.ee/).

14. Baltic Grid (http://www.balticgrid.org).


Python:

1. Jeffrey Elkner, Allen B. Downey, and Chris Meyers, How to Think Like a Computer Scientist: Learning with Python, 2nd edition. Book homepage (http://openbookproject.net/thinkcs/python/english2e/).

2. Hans Petter Langtangen, A Primer on Scientific Programming with Python, Springer, 2009. Book webpage (http://vefur.simula.no/intro-programming/).

3. Hans Petter Langtangen, Python Scripting for Computational Science, Third Edition, Springer, 2008. Book homepage (http://folk.uio.no/hpl/scripting/).

4. Neeme Kahusk, Sissejuhatus Pythonisse (Introduction to Python, in Estonian) (http://www.cl.ut.ee/inimesed/nkahusk/sissejuhatus-pythonisse/).

5. Python Documentation (http://www.python.org/doc/); to start with, the Python Tutorial (http://docs.python.org/tut/tut.html).

6. Mark Lutz and David Ascher, Learning Python, O’Reilly Media, Inc., 2004.

7. Mark Lutz, Learning Python (4th edition), O’Reilly Media, Inc. (and Safari Books), 2009.

8. Travis E. Oliphant, Guide to NumPy (http://www.tramy.us), Trelgol Publishing, 2006.

Some lecture slides:

1. Kent Engström, Python Introduction (slides) (http://www.nsc.liu.se/ngssc-grid/python-engstrom.pdf), NGSSC course in grid computing, January 10-18, 2005.

2. Chris Meers, An introduction to Python, with application to scientific computing (slides (http://hughm.cs.ukzn.ac.za/~murrellh/bio/lit/pysci.pdf)), Cornell Theory Center.

3. Introduction to Scientific Computing with Python (slides (http://www.physics.rutgers.edu/grad/509/python1.pdf))

Past and future of the course; related courses

About this course:

• First given in Spring 2005 as Basics of Grid and Cluster Computing

• Since 2009 (Spring): Basics of Grid and Cloud Computing

– Second part: Basics of Cloud Computing (3 ECTS)

Other related courses:

• MTAT.08.022 Parallel Programming Languages (6 ECTS) (2011)

• Distributed Systems Seminar Wed 14:15 (Fri 14:15) Liivi 2 - 315

– MSc students: MTAT.08.024 12 ECTS (3+3+3+3)

– Bachelor students: MTAT.08.014 8 ECTS (2+2+2+2)

– PhD students: Distributed Systems Research Seminar MTAT.08.019 20 ECTS (5+5+5+5)

• Parallel Computing: MTAT.08.007, 6 ECTS, Autumn 2012

• Scientific Computing: MTAT.08.010, 6 ECTS, Spring 2014

• Introduction to Scientific Computing: MTAT.08.025, 3 ECTS, April-May 2012 (university-wide elective course for PhD students)


1 Introduction

1.1 Driving forces of computational science

High Performance Computing (HPC)

• Environment simulation; some examples:

– Climate change

– Prediction of fish stocks in Norwegian fjords

– Glacier flow simulation

• Solving fluid dynamic problems

– Weather predictions

– Design of hypersonic airplanes

– Design of more efficient cars


– Extremely quiet submarines

– Design of efficient and safe nuclear power stations

* solution bisection, turbulence

• Simulation of nuclear explosions

• Satellite data analysis

• Data analysis of DNA-sequences

• Simulation of 3D protein molecules

• Simulation of global economic processes

• etc. in more and more fields


Common to all examples: the need for a larger-than-usual set of resources:

• CPU cycles

• data volume

• special devices producing data

=> parallel processing

=> questions:

• how to store data?

– Data repositories

– Data repository services

• how to move data?

– Networks

– Internet and private networks

• which algorithms can be used?

– Theory and practice of parallel algorithms


1.2 History of HPC

Pre-history (human arrays):

• 1929 – parallelisation of weather predictions

A bit similar:

• ≈1940 – Russian war defence: parallel computing (tank T40 calculations)

Some experts’ predictions:

• 1947 – computer engineer Howard Aiken: the USA will need at most 6 computers in the future!

• 1977 – Seymour Cray: the Cray-1 computer will potentially attract only ca 100 clients

Reality: how many Cray-1-class computers’ worth of computing power do you carry with you today?


Gordon E. Moore’s law (1965): the number of switches (transistors) on a chip doubles every year.

1975 refinement: the number of switches doubles every second year. In the popular formulation, the [number of switches / performance] of a CPU doubles every 18 months.
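The difference between the doubling periods quoted above (every second year versus every 18 months) compounds quickly. The following is a minimal sketch of that arithmetic; the function name `growth_factor` and the ten-year horizon are illustrative assumptions, not part of the lecture.

```python
def growth_factor(years, doubling_period_years):
    """Factor by which a quantity grows if it doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Illustrative ten-year horizon (an assumption chosen for the example).
slow = growth_factor(10, 2.0)   # doubling every second year
fast = growth_factor(10, 1.5)   # doubling every 18 months

print(f"10 years, doubling every 24 months: x{slow:.0f}")
print(f"10 years, doubling every 18 months: x{fast:.0f}")
```

Over ten years the 24-month rule gives a factor of 32, while the 18-month rule gives a factor of about 100, so the choice of doubling period matters a great deal for long-range predictions.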


• first processors: 10^2 (100 Flops)

• modern desktop computers: 10^9, Gigaflops (GFlops)

• modern supercomputers: 10^12, Teraflops (TFlops)

• we are about to achieve soon: 10^15, Petaflops (PFlops)

• next step: 10^18, Exaflops (EFlops)
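To get a feeling for the performance scales listed above (10^2 to 10^18 Flops), one can compare how long a fixed workload would take at each rate. This is a rough sketch only: the workload of 10^15 floating-point operations is a hypothetical job size, and sustained performance in practice is lower than the nominal rate.

```python
# Nominal performance levels from the list above, in operations per second.
FLOPS = {
    "first processors": 1e2,
    "desktop (GFlops)": 1e9,
    "supercomputer (TFlops)": 1e12,
    "PFlops": 1e15,
    "EFlops": 1e18,
}

SECONDS_PER_YEAR = 365 * 24 * 3600


def runtime_seconds(operations, flops):
    """Idealised runtime: number of operations divided by the nominal rate."""
    return operations / flops


workload = 1e15  # hypothetical job size (an assumption for illustration)
for name, rate in FLOPS.items():
    secs = runtime_seconds(workload, rate)
    print(f"{name:>24}: {secs:.3g} s ({secs / SECONDS_PER_YEAR:.3g} years)")
```

At 100 Flops the job would take hundreds of thousands of years; at 1 PFlops it takes one second.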

History of Computers (http://smashinghub.com/history-of-computers.htm)

Supercomputers → Clusters ⇢ Grids ⇢ Clouds


1.3 Data Challenges

How large is 1 petabyte?

• Several high-resolution pictures of each person on Earth

• 5 years ago, an example of a petabyte storage device:

– a train wagon full of high-resolution magnetic tapes

– about 3 years to read through with a fast tape reader

• Today: the largest tape drives store 5 TB of uncompressed data (StorageTek T10000) => 1 PB takes ca 205 tapes. Reading one tape takes ca 6.1 h => 52 days to read them all. These tapes would pile up into a tower about 5.2 m high, weighing less than 60 kg; by volume, ca 40% of them would fit into hand baggage on a plane.

* Largest commercial databases today ≈ a few terabytes (10^12 bytes)
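The tape figures above can be checked with a few lines of arithmetic. This sketch uses the capacity and read time given on the slide; treating 1 PB as 1024 TB is an assumption, chosen because it reproduces the quoted "ca 205 tapes".

```python
import math

TAPE_CAPACITY_TB = 5   # from the slide (StorageTek T10000)
HOURS_PER_TAPE = 6.1   # from the slide
TB_PER_PB = 1024       # assumption: binary units

# Number of whole tapes needed to hold one petabyte.
tapes = math.ceil(TB_PER_PB / TAPE_CAPACITY_TB)

# Time to read all tapes one after another on a single drive.
read_days = tapes * HOURS_PER_TAPE / 24

print(f"tapes needed: {tapes}")
print(f"sequential read time: {read_days:.1f} days")
```

With decimal units (1 PB = 1000 TB) the count would be 200 tapes and about 51 days, so the conclusion is the same either way: reading a petabyte from tape on one drive takes weeks.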

Science’s needs in the near future (an example): particle physics experiments produce around 10 petabytes a year


Prediction for the needs: around the year 2015 there is a need for exabyte (10^18 bytes) storage databanks and Petaflops processing power

How large is an exabyte?

• All the information generated in 1999: 2 exabytes

• All words ever spoken by all people: 5 exabytes!

One of the most challenging problems – data updates!


1.4 Classification of Parallel Computers

• Architecture

Flynn’s classification (instruction stream vs. data stream):

                        Single data   Multiple data
Single instruction      SISD          SIMD
Multiple instruction    (MISD)        MIMD

Abbreviations: S - Single, M - Multiple, I - Instruction, D - Data

For example, SIMD: Single Instruction, Multiple Data stream

– Single processor computer

– Multicore processor

– distributed system

– shared memory system

• Network

– topology

* ring, array, hypercube...

– properties

* bandwidth, latency

• Memory access

– shared , distributed , hybrid
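Flynn’s classification from the Architecture item above depends only on whether there is one or more than one instruction stream and data stream. A minimal sketch of that rule follows; the function name `flynn_class` and the stream counts in the examples are illustrative assumptions.

```python
def flynn_class(instruction_streams, data_streams):
    """Return Flynn's class for the given numbers of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


print(flynn_class(1, 1))  # one instruction stream, one data stream
print(flynn_class(1, 8))  # one instruction applied to many data elements
print(flynn_class(4, 4))  # independent instruction streams on separate data
```

In these terms a classic single-processor computer is SISD, a vector unit applying one instruction to many data elements is SIMD, and a multicore processor or a cluster running independent programs is MIMD.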


• Operating system

– UNIX

– Linux

– (OpenMosix)

– WIN*

• Algorithm realisation

– using only hardware modules

– mixed modules (hardware and software)

• Control type

– synchronous

– dataflow-driven

– asynchronous

• Scope

– supercomputing

– distributed computing

– real-time systems

– mobile systems

– grid and cloud computing

– etc


But it is impossible to ignore (implicit or explicit) parallelism in a computer or a set of computers

1.5 Supercomputers

• Last word in computer hardware

– one step ahead in technology

– Note: today’s supercomputers are tomorrow’s commodity systems!

• expensive

• shipped with OS

• Supercomputers→ Clusters


1.6 Computer Clusters

Groups of workstations connected by a LAN and running uniform software

Example: Linux clusters (Beowulf clusters), University of Tartu HPC aurumasin

Special network solutions

• Myrinet (Clos-networks)

• Scali

• *-Ethernet

• Infiniband


Top 500

Top500 (http://www.top500.org)

• Also, http://www.bbc.co.uk/news/10187248