COMP3019 Coursework: Introduction to M-grid
Steve Crouch, [email protected], stc@ecs
School of Electronics and Computer Science

Objectives
• To equip students to drive a lightweight grid implementation to solve a problem that can benefit from using grid technology.
• To develop an understanding of the basic mechanisms used to solve such problems.
• To develop a general architectural and operational understanding of typical production-level grid software.
• To develop the programming skills required to drive typical services on a production-level grid.

Overview
• Part 1: m-grid
  – m-grid: lightweight software illustrating grid concepts in use
  – Develop a program with m-grid’s Java API to solve a simple problem, submit it to m-grid with input data, collect results
• Part 2: Google MapReduce & GridSAM
  – MapReduce: framework for distributed processing of large datasets using many computers
  – GridSAM: job submission web service interface to a computational resource (e.g. compute cluster, single machine)
  – Extend code stubs to submit jobs to GridSAM and monitor them to completion
  – Extend pseudocode that implements a basic MapReduce framework

Where to get stuff/help?
• Can obtain coursework materials from website
  – Ready for Wednesday
• Software documentation
• Coursework help lecture 19th March
• Myself: [email protected]
• Building 32: Level 4 lab 4067 Bay 23

Background

The Problem
• Basically, want to run a compute-intensive task
• Don’t have enough resources to run the job locally
  – At least, not to return results within a sensible timeframe
• Would like to use another, more capable resource

Distributed Computing in Olden Times
• Small number of ‘fast’ computers
  – Very expensive
  – Centralised
  – Used nearly all the time
  – Time allocations for users
  – Not updated often
• Punched cards
• Wait time huge
• MailNet, SneakerNet, TyperNet, etc…
• Mainframes
• Cray-1 (1976): $8.8 million, 160 megaflops, 8MB memory
[Images: Cray X-MP (Cray-1 successor); Univac 1710]

The Present…
• Now… large number of slow computers:
  – Cheap
  – Distributed
    • Computation
    • Ownership
  – Not used all the time
  – Exclusive access to users
  – Updated often
  – e.g. desktop computers, PDAs, mobile phones
• Low utilisation of computing power
  – e.g.: institutional/university resources…

It’s About Scaling Up…
• Compute and data – you need more, you go somewhere else to get it
• Then… the march towards localisation of computation, the Personal Computer
• Computational Science develops in laboratories
• Is this changing again?
Images: nasaimages, Extra Ketchup, Google Maps, Dave Page

The Grid - a Reminder
• The grid – many definitions!
  – “Grid computing offers a model for solving massive computational problems by making use of the unused CPU cycles of large numbers of disparate, often desktop, computers treated as a virtual cluster embedded in a distributed telecommunications infrastructure” – Wikipedia
  – “A service for sharing computer power and data storage capacity over the Internet.” – CERN (European Organisation for Nuclear Research)
• Two components of grid computing:
  – Computational/data resource – e.g. computational cluster, supercomputer, desktop machine
  – Infrastructure for externalising that resource to others

Some Examples…
• Grid (i.e. internet-accessible) examples:
  – SETI@Home - http://setiathome.ssl.berkeley.edu/
    • Process data from Arecibo Radio Telescope, Puerto Rico
    • 2 million volunteers installed software
  – Univa.org - http://www.univa.org/
    • Projects such as Cancer Research, Smallpox
    • 2.5 million volunteer systems
    • Sells processing time to organisations
• Computational resource (i.e. intranet-accessible):
  – Cluster managers, supercomputer, single machine

The Idea - as a Provider…
• Goal: I want others to access my resources & applications
• I want to provide secure controlled access to:
  – My applications:
    • Specify who can access which applications
  – My computational or data resources:
    • I can limit external usage of my resources
• Provides an interface that allows remote users to access my resources
• Enable collaboration with other partners

The Idea - as a User (or Client)
• Goal: I want to use other resources & applications
• Through a network of service providers I can…:
  – Gain access to applications that I do not have installed locally
  – Use remote machines [larger resource] with more CPU, memory or storage
    • Process larger problem sizes
  – Transparently switch between different service providers
    • No exposure to underlying OS, queuing policy, disk layout etc.

Cluster Computing & the Grid
• Grid is predominantly built on Cluster Computing solutions
[Diagram: University A, University B, University C; labels: Grid, Cluster Computing]

The General Idea…
• Abstract ‘virtualisation’ of local network resources
  – Infrastructure manages many machines
  – Visualisation as a single resource
  – Submitted jobs get put on queue(s)
[Diagram: Client, Client, …; Coordinator; Executor, Executor, …]
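
A minimal sketch of the "many machines seen as a single resource" idea, written in Python for brevity. The `Coordinator` class, the executor names and the round-robin policy are assumptions made for illustration; real infrastructures use far richer scheduling, and jobs here simply run in-process rather than on remote machines.

```python
from collections import deque


class Coordinator:
    """Hypothetical coordinator: one FIFO queue in front of many executors."""

    def __init__(self, executors):
        self.executors = list(executors)  # names of the managed machines
        self.queue = deque()              # submitted jobs wait here
        self.results = {}                 # job id -> (executor name, result)

    def submit(self, job_id, func, *args):
        """Client-facing call: the client never chooses a machine."""
        self.queue.append((job_id, func, args))

    def run_all(self):
        """Drain the queue, assigning jobs to executors in round-robin order."""
        turn = 0
        while self.queue:
            job_id, func, args = self.queue.popleft()
            executor = self.executors[turn % len(self.executors)]
            # In this sketch the job runs locally; the executor name only
            # records which machine would have been given the work.
            self.results[job_id] = (executor, func(*args))
            turn += 1
        return self.results


coordinator = Coordinator(["executor-1", "executor-2"])
coordinator.submit("job-a", sum, [1, 2, 3])
coordinator.submit("job-b", max, [4, 9, 2])
coordinator.submit("job-c", len, "grid")
results = coordinator.run_all()
```

The client only ever calls `submit`; which executor ends up with the job is decided by the coordinator, which is the point of the virtualisation.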

Condor – Background
• Begun in 1988, based on Remote-Unix (RU) project
• Predominantly makes use of idle cycles on machines

Condor Components
• Four main machine ‘roles’ (daemons):
  – Submit Client (condor_schedd): used to submit resource requests, monitor, modify and delete jobs.
  – Central Manager, Server
    • condor_collector: collects information about pool resources.
    • condor_negotiator: negotiates (match-makes) between resources and resource requests.
  – Job Executor (condor_startd): executes jobs, advertises resources. Enforces local policy.
  – (Checkpoint Server (condor_ckpt_server): services requests to store and retrieve checkpoint files.)

Condor Architecture
1. Client submits job (executable + input data) to local queue
2. Client schedd advertises job request to server collector
3. Server negotiator gets next priority request from collector
4. Negotiator negotiates w/ client schedd to match resource/job
5. Client removes job from queue and sends it to executor
6. Job runs on executor
7. Job output results returned to client
[Diagram: Client (submit client: condor_schedd, condor_shadow…, with queue); Server (negotiator: condor_negotiator; collector: condor_collector); Executor (condor_startd, condor_starter…, with queue); Shared Disk; arrows numbered 1 to 7 matching the steps above]
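
The match-making in steps 3 and 4 can be modelled roughly as follows. This is a simplified Python sketch, not Condor's actual mechanism or API: the `JobRequest` and `Machine` classes, the single memory requirement and the "higher priority first, first free machine wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """What a submit client advertises (hypothetical fields)."""
    name: str
    priority: int        # assumption: a higher number is served first
    min_memory_mb: int


@dataclass
class Machine:
    """What an executor advertises (hypothetical fields)."""
    name: str
    memory_mb: int
    busy: bool = False


def negotiate(requests, machines):
    """Match each request, highest priority first, to the first free
    machine that satisfies its memory requirement."""
    matches = {}
    for req in sorted(requests, key=lambda r: r.priority, reverse=True):
        for machine in machines:
            if not machine.busy and machine.memory_mb >= req.min_memory_mb:
                machine.busy = True
                matches[req.name] = machine.name
                break  # this request is placed; move on to the next one
    return matches


requests = [
    JobRequest("small", priority=1, min_memory_mb=512),
    JobRequest("big", priority=5, min_memory_mb=4096),
    JobRequest("huge", priority=3, min_memory_mb=65536),
]
machines = [Machine("node-a", 2048), Machine("node-b", 8192)]
matches = negotiate(requests, machines)
```

Here "big" is placed on node-b, "small" on node-a, and "huge" stays unmatched because no advertised machine satisfies it; in this model it would simply wait for a later negotiation round.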

M-grid
An overview

Computational Grids - in General
• Users supply tasks to be performed via client
• Execution nodes contribute processing power
• Coordinator node sends tasks to execution nodes, ensuring results returned
• Existing grid tech. sophisticated -> significant complexity
  – To what extent can this be reduced?
[Diagram: Client, Client, …; Coordinator; Executor, Executor, …]

Java Applets?
• How about Java applets as a program unit?
  – Browsers could act as execution nodes
• Security concerns?
  – Web browsers execute foreign code
  – Java applets executed within a ‘sandbox’ virtual machine
  – Stringent security restrictions imposed
  – In-built security configuration in browsers
  – Applet can only contact originating server
• Risk significantly reduced

M-grid: A Lightweight Grid I
• M-grid:
  – Execution node = Java-applet enabled browser
  – Client = browser
  – Coordinator = web server
  – Tasks distributed as Applets in web pages
    • Execution node browser opens web page on server
    • Runs task applet
    • Uploads results to server
[Diagram: Client, Client, …; Coordinator; Executor, Executor, …]

M-grid: Overview
• Implemented on:
  – Microsoft’s IIS (Internet Information Server) using ASP
  – Apache Tomcat – we’ll use this one!
• Client
  – Develops applet class as extension to MGridApplet class
  – Can run applet locally in appletviewer for testing
  – Compiles and packages applet with input parameters file into a jar file
  – Submits jar to web server via JobSubmit web page
  – Eventually collects results via ViewJobs web page
• Execution node
  – Requests a job via JobRequest page
  – Applet submits results from job using SubmitResults page
• Security provided by session authentication
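
The job lifecycle implied by the four pages (JobSubmit, JobRequest, SubmitResults, ViewJobs) can be modelled as a small state machine. The Python sketch below is a toy in-memory stand-in, not the real m-grid server or its Java API: the class name, method names, job states and jar file names are all hypothetical, and session authentication is left out.

```python
class MGridCoordinatorModel:
    """Toy in-memory model of the coordinator; not the real m-grid code."""

    def __init__(self):
        self.jobs = {}     # job id -> {"jar": ..., "state": ..., "results": ...}
        self.next_id = 1

    def job_submit(self, jar_name):
        """Client side: stands in for uploading a jar via the JobSubmit page."""
        job_id = self.next_id
        self.next_id += 1
        self.jobs[job_id] = {"jar": jar_name, "state": "queued", "results": None}
        return job_id

    def job_request(self):
        """Execution node side: stands in for the JobRequest page."""
        for job_id, job in self.jobs.items():
            if job["state"] == "queued":
                job["state"] = "running"
                return job_id, job["jar"]
        return None  # nothing is waiting to run

    def submit_results(self, job_id, results):
        """Applet side: stands in for the SubmitResults page."""
        self.jobs[job_id]["results"] = results
        self.jobs[job_id]["state"] = "done"

    def view_jobs(self):
        """Client side: stands in for the ViewJobs page."""
        return {job_id: (job["state"], job["results"])
                for job_id, job in self.jobs.items()}


server = MGridCoordinatorModel()
first = server.job_submit("primes.jar")
second = server.job_submit("wordcount.jar")

# One execution node picks up the oldest queued job and reports back.
job_id, jar = server.job_request()
server.submit_results(job_id, "25 primes below 100")
status = server.view_jobs()
```

After this run the first job is marked done with its results attached, while the second is still queued until another execution node requests it.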

Architecture