![Page 1: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/1.jpg)
The OurGrid Project
www.ourgrid.org
Walfredo Cirne [email protected]
Universidade Federal de Campina Grande
![Page 2: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/2.jpg)
What is a Grid?
Computational Grid (source of computational resources and services)
![Page 3: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/3.jpg)
Solving a real problem
•To finish my Ph.D., I had to run hundreds of thousands of independent simulations
•Since my simulations were independent, I had the perfect application for the grid
•I was in a top grid research lab, but could not use the grid
  −Grid solutions are not in place yet
![Page 4: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/4.jpg)
The motivation for MyGrid
•Users of loosely-coupled applications could benefit from the Grid now
•However, they don't run on the Grid today because Grid infrastructure is not widely deployed
•What if we build a solution that does not depend upon installed Grid infrastructure?
![Page 5: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/5.jpg)
MyGrid
•MyGrid allows a user to run Bag-of-Tasks parallel applications on whatever resources she has access to
  −Bag-of-Tasks applications are those parallel applications formed by independent tasks
•One’s grid is all resources one has access to
  −No grid infrastructure software is necessary
  −Grid infrastructure software can be used (whenever available)
![Page 6: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/6.jpg)
Bag-of-Tasks Applications
•Data mining
•Massive search (such as search for crypto keys)
•Parameter sweeps
•Monte Carlo simulations
•Fractals (such as Mandelbrot)
•Image manipulation (such as tomography)
•And many others…
![Page 7: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/7.jpg)
What is MyGrid?
•A broker (or application scheduler)
•A set of abstractions that hide the grid heterogeneity from the user
![Page 8: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/8.jpg)
An Example: Factoring with MyGrid
•init: mg-services put $PROC ./Fat.class $PLAYPEN
•grid1: java Fat 3 18655 34789789798 output-$TASK
•collect: mg-services get $PROC $PLAYPEN output-$TASK
•grid2: java Fat 18655 37307 34789789798 output-$TASK
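The two grid lines above differ only in the divisor range handed to Fat. A minimal sketch of how such task lines might be generated, assuming Fat takes a lower bound, an upper bound, the number to factor, and an output file name (the argument meanings are inferred from the slide; `factoring_tasks` is a hypothetical helper, not part of MyGrid, and the task number stands in for $TASK):

```python
def factoring_tasks(n, first, chunk, count):
    """Build one 'java Fat' command per contiguous divisor range.

    Hypothetical helper: ranges share their boundary, as on the slide
    (3..18655, then 18655..37307).
    """
    tasks = []
    lo = first
    for task_id in range(1, count + 1):
        hi = lo + chunk
        # The task number replaces $TASK in the output file name (assumption).
        tasks.append(f"java Fat {lo} {hi} {n} output-{task_id}")
        lo = hi
    return tasks


tasks = factoring_tasks(34789789798, first=3, chunk=18652, count=2)
for line in tasks:
    print(line)
```

Because each range is independent of the others, the resulting tasks form a Bag-of-Tasks job and can run on any machines in any order.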
![Page 9: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/9.jpg)
Defining my personal Grid
proc:
  name = ostra.lsd.ufcg.edu.br
  attributes = lsd, linux
  type = user_agent
proc:
  name = memba.ucsd.edu
  attributes = lsd, solaris
  type = grid_script
  rem_exec = ssh %machine %command
  copy_to = scp %localdir/%file %machine:%remotedir
  copy_from = scp %machine:%remotedir/%file %localdir
[...]
![Page 10: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/10.jpg)
Making MyGrid Encompassing
[Figure: on the Home Machine, the Scheduler talks to Grid Machines through the GridMachine Interface. Implementations shown: GlobusProxy (to a Grid Machine running Globus GRAM), UAProxy (to a Grid Machine running UserAgent), GridScript, ...]
![Page 11: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/11.jpg)
Dealing with Firewalls, Private IPs, and Space-Shared Machines
[Figure: components shown are the Scheduler (Home Machine), User Agent, Grid Script, Globus Proxy, Grid Machine Gateway, and Space-Shared Gateway.]
![Page 12: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/12.jpg)
The Scheduling Challenge
•Grid scheduling typically depends on information about the grid (e.g. machine speed and load) and the application (e.g. task size)
•However, getting grid information makes it harder to build an encompassing system
  −The GridMachine Interface would have to be richer, and thus harder to implement
•Moreover, getting application information makes the system harder to use and more complex
  −The user would have to provide task size estimates
![Page 13: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/13.jpg)
Scheduling with no information
•Work-queue with Replication
  −Tasks are sent to idle processors
  −When there are no more tasks, running tasks are replicated on idle processors
  −The first replica to finish is the official execution
  −Other replicas are cancelled
  −Replication may have a limit
•The key is to avoid having the job wait for a task that runs on a slow/loaded machine
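The rules above can be sketched as a small event-driven simulation. This is an illustrative model, not MyGrid's scheduler: `wqr_makespan`, its parameters, and the choice to replicate the running task with the fewest copies are all assumptions, and each machine is modeled by a single constant speed.

```python
import heapq


def wqr_makespan(task_sizes, speeds, max_replicas=2):
    """Simulate Work-queue with Replication and return the job makespan.

    max_replicas=1 degenerates to a plain work queue (no replication).
    """
    pending = list(range(len(task_sizes)))  # tasks never started, FIFO
    replicas = {}                           # running task -> set of processors
    events = []                             # heap of (finish_time, proc, task)
    done = set()

    def assign(proc, now):
        if pending:
            task = pending.pop(0)
        else:
            # No fresh tasks left: replicate a running task, if the limit allows.
            candidates = [t for t, procs in replicas.items()
                          if len(procs) < max_replicas]
            if not candidates:
                return                      # processor stays idle
            task = min(candidates, key=lambda t: (len(replicas[t]), t))
        replicas.setdefault(task, set()).add(proc)
        finish = now + task_sizes[task] / speeds[proc]
        heapq.heappush(events, (finish, proc, task))

    for proc in range(len(speeds)):
        assign(proc, 0.0)

    now = 0.0
    while events:
        finish, proc, task = heapq.heappop(events)
        if task in done:
            continue                        # event of a cancelled replica
        now = finish
        done.add(task)                      # first replica to finish wins
        for freed in sorted(replicas.pop(task)):
            assign(freed, now)              # winner and cancelled replicas are idle again
    return now


sizes = [10, 10, 10]
speeds = [1.0, 1.0, 0.125]                  # the third machine is 8x slower
print(wqr_makespan(sizes, speeds, max_replicas=1))
print(wqr_makespan(sizes, speeds, max_replicas=2))
```

In this toy run the plain work queue waits 80 time units for the slow machine, while with one extra replica a fast machine reruns the straggling task and the job finishes at time 20, without the scheduler knowing any machine speed in advance.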
![Page 14: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/14.jpg)
Work-queue with Replication
•8000 experiments
•Experiments varied in
  −grid heterogeneity
  −application heterogeneity
  −application granularity
•Performance summary:

| | Sufferage | DFPLTF | Workqueue | WQR 2x | WQR 3x | WQR 4x |
|---|---|---|---|---|---|---|
| Average | 13530.26 | 12901.78 | 23066.99 | 12835.70 | 12123.66 | 11652.80 |
| Std. Dev. | 9556.55 | 9714.08 | 32655.85 | 10739.50 | 9434.70 | 8603.06 |
![Page 15: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/15.jpg)
WQR Overhead
•The drawback of WQR is the cycles wasted by the cancelled replicas
•Wasted cycles:

| | WQR 2x | WQR 3x | WQR 4x |
|---|---|---|---|
| Average | 23.55% | 36.32% | 48.87% |
| Std. Dev. | 22.29% | 34.79% | 48.93% |
![Page 16: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/16.jpg)
Data Aware Scheduling
•WQR achieves good performance for CPU-intensive BoT applications
•However, many important BoT applications are data-intensive
•These applications frequently reuse data
  −During the same execution
  −Between two successive executions
•There are knowledge-dependent schedulers that exploit data reuse
![Page 17: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/17.jpg)
Storage Affinity
•The “affinity” between a task and a site is the number of bytes of the task's input that are already stored at that site
  −The heuristic is based on easy-to-get static information (size and location of data)
•The task with the largest “affinity” is prioritized
  −The idea is to avoid unnecessary data transfers
•Replication is used to cope with mistakes
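The affinity definition above reduces to a sum over the input files a site already holds. A minimal sketch, assuming each task is described by a mapping from input file name to size in bytes and each site by the set of file names it stores (these data structures, the function names, and the sample file and site names are illustrative, not MyGrid's):

```python
def storage_affinity(task_inputs, site_files):
    """Bytes of the task's input that are already stored at the site."""
    return sum(size for name, size in task_inputs.items() if name in site_files)


def pick_assignment(tasks, sites):
    """Return the (task, site) pair with the largest storage affinity."""
    pairs = ((t, s) for t in tasks for s in sites)
    return max(pairs, key=lambda p: storage_affinity(tasks[p[0]], sites[p[1]]))


# Hypothetical tasks and sites.
tasks = {
    "t1": {"a.dat": 500, "b.dat": 100},
    "t2": {"c.dat": 300},
}
sites = {
    "site_x": {"a.dat"},
    "site_y": {"b.dat", "c.dat"},
}

best = pick_assignment(tasks, sites)
print(best, storage_affinity(tasks[best[0]], sites[best[1]]))
```

Only file sizes and locations are needed, which is why the heuristic counts as easy-to-get static information; run-time estimates such as machine load never enter the computation.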
![Page 18: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/18.jpg)
Storage Affinity Results
•3000 experiments
•Experiments varied in
  −grid heterogeneity
  −application heterogeneity
  −application granularity
•Performance summary:

| | Storage Affinity | XSufferage | WQR |
|---|---|---|---|
| Average (seconds) | 57.046 | 59.523 | 150.270 |
| Standard Deviation | 39.605 | 30.213 | 119.200 |
![Page 19: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/19.jpg)
Proof of Concept
•During a 40-day period, we ran 600,000 simulations using 178 processors located in 6 different administrative domains spread widely across the USA
•We only had GridScript and WorkQueue
•MyGrid took 16.7 days to run the simulations
•My desktop machine would have taken 5.3 years to do so
•Speed-up is 115.8 for 178 processors
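The speed-up figure follows directly from the two run times on this slide. A short check, assuming a 365-day year for converting the desktop estimate (the efficiency value is derived here and is not stated on the slide):

```python
DAYS_PER_YEAR = 365            # assumption used for the unit conversion

sequential_days = 5.3 * DAYS_PER_YEAR   # estimated desktop run time
grid_days = 16.7                         # measured MyGrid run time
processors = 178

speedup = sequential_days / grid_days
efficiency = speedup / processors        # fraction of ideal linear speed-up

print(round(speedup, 1), round(efficiency, 2))
```

This reproduces the reported speed-up of 115.8, which corresponds to roughly 65% of the ideal linear speed-up on 178 processors.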
![Page 20: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/20.jpg)
HIV research with MyGrid
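[Figure: HIV classification tree. HIV splits into HIV-1 and HIV-2; HIV-1 into groups M, N(?), and O; group M into subtypes A, B, C, D, F, G, H, J, K. Annotations on the figure: "B, C, F", "prevalent in Europe and Americas", "prevalent in Africa", "majority in the world", "18% in Brazil".]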
![Page 21: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/21.jpg)
HIV protease + Ritonavir
[Figure: RMSD for Subtype B and Subtype F.]
![Page 22: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/22.jpg)
The HIV Research Grid
•55 machines in 6 administrative domains in the US and Brazil
  −The machines were accessed via User Agent, UA + Grid Machine Gateway, UA + ssh tunnel, and Grid Scripts
•Task = 3.3 MB input, 1 MB output, 4 to 33 minutes of dedicated execution
•Ran 60 tasks in 38 minutes
•Speed-up is 29.2 for 55 machines
  −Considering an 18.5-minute average machine
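One reading that reproduces the reported figure, offered as an interpretation rather than the authors' stated method: take 18.5 minutes (the midpoint of the 4-to-33-minute range) as the time an average machine needs per task, so that running all 60 tasks sequentially would take 60 × 18.5 minutes. The symbols below are introduced here for illustration, and the efficiency E is derived, not taken from the slide.

```latex
S = \frac{N_{\text{tasks}} \cdot \bar{t}}{T_{\text{grid}}}
  = \frac{60 \times 18.5\ \text{min}}{38\ \text{min}}
  = \frac{1110}{38} \approx 29.2,
\qquad
E = \frac{S}{55} \approx 0.53
```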
![Page 23: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/23.jpg)
MyGrid Status
•MyGrid is open source and is available at http://www.ourgrid.org/mygrid
  −About 150 downloads
  −Version 2.0 released two months ago
  −Base of the PAUÁ Grid, currently being deployed by HP Brazil
•Bag-of-tasks parallel applications can currently benefit from the Grid
  −However, firewalls, private IPs, and the like make it much harder than we initially thought
![Page 24: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/24.jpg)
Demands of MyGrid Users
•More resources (OurGrid)
  −People want to use more resources than they have access to
•Good “debugging” (MyGridDoctor)
  −Abstractions are wonderful when they work, but when they fail... :-(
•More security (SWAN)
  −Local resources
  −Use of grid machine as attack launchpad
•Richer programming model
![Page 25: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/25.jpg)
OurGrid: A Network of Favors
•Getting access to resources is out of scope of today’s grid solutions
  −Grid economy will solve this problem some day
•But BoT applications can use lots of resources now
•Let’s trade off generality for simplicity
  −There are at least 2 resource providers
  −Applications that shall use the resources need no QoS guarantees
•P2P resource sharing community
  −Network of Favors
![Page 26: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/26.jpg)
Making people collaborate
•It’s important to encourage collaboration within OurGrid (i.e., resource sharing)
  −In file-sharing, most users free-ride
•OurGrid uses a P2P Reputation Scheme
  −All peers maintain a local balance for all known peers
  −Peers with greater balances have priority
  −The emergent behavior of the system is that by donating more, you get more resources
  −No additional infrastructure is needed
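The reputation scheme above needs only a per-peer table of balances. A minimal sketch under stated assumptions: the `Peer` class and its method names are hypothetical, favors are measured in arbitrary units, and clamping a balance at zero is a modeling choice made here, not something the slide specifies.

```python
from collections import defaultdict


class Peer:
    """Keeps a purely local balance for every peer it has interacted with."""

    def __init__(self, name):
        self.name = name
        self.balance = defaultdict(float)   # unknown peers start at 0

    def record_favor_received(self, donor, amount):
        # The donor did work for us, so we owe it more.
        self.balance[donor] += amount

    def record_favor_given(self, consumer, amount):
        # We did work for the consumer; clamp at zero (assumption).
        self.balance[consumer] = max(0.0, self.balance[consumer] - amount)

    def pick_consumer(self, requesters):
        # Peers with greater balances have priority.
        return max(requesters, key=lambda r: self.balance[r])


a = Peer("A")
a.record_favor_received("B", 60)
a.record_favor_received("D", 45)

print(a.pick_consumer(["D", "E"]))       # D has donated before, E has not
print(a.pick_consumer(["B", "D", "E"]))  # B has the largest balance
```

Because every peer consults only its own table, no central bank or shared ledger is required, and a free-rider that never donates always ranks at the bottom with a balance of 0.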
![Page 27: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/27.jpg)
OurGrid resource sharing [1]
[Figure: peers A, B, C, D, and E, with a broker. Messages shown: ConsumerQuery (broadcast), ProviderWorkRequest, ConsumerFavor, ProviderFavorReport. Peers marked * have no idle resources now. Balances shown: B 60, D 45.]
![Page 28: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/28.jpg)
OurGrid resource sharing [2]
[Figure: peers A, B, C, D, and E, with two brokers. Messages shown: ConsumerQuery, ProviderWorkRequest. Peers marked * have no idle resources now. Balances shown: B 60, D 45, E 0.]
![Page 29: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/29.jpg)
Free-rider consumption
•Epsilon is the fraction of resources consumed by free-riders
![Page 30: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/30.jpg)
Equity among collaborators
![Page 31: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/31.jpg)
Sandboxing for BoT applications
•In the OurGrid Community, a peer runs unknown code that comes from the Grid
•This is an obvious security concern
  −Threat to local data and resources
  −Use of machine as drone to attack others
•Because BoT applications communicate only to receive input and send output, we can run the guest application in a very tight sandbox, with no network access
![Page 32: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/32.jpg)
Adding a second line of defense
•We also reboot to add a second layer of protection to the user data and resources
•This has the extra advantage of enabling us to use an OS different from that chosen by the user
  −That is, even if the user prefers Windows, we can still have Linux
•Booting back to the user OS can be done fast by using hibernation
![Page 33: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/33.jpg)
SWAN architecture
[Figure: a machine switches by reboot between two stacks: the Host OS running the Native Application, and the Guest OS (Grid OS) running the Grid Middleware and the Grid Application.]
![Page 34: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/34.jpg)
OurGrid overall architecture
[Figure: sites 1, ..., n. Components shown: User Interface, Site Manager, MyGrid, and SWAN Sandboxing.]
![Page 35: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/35.jpg)
Collaboration/Interest on OurGrid
•HP Brazil R&D
•HP Labs Bristol
•HP Partners
  −LNCC, UniSantos, UniFor, Instituto Atlântico
  −CESAR/UFPE, Instituto Eldorado, IPT, AMR
  −PUCRS, UniSinos, UFRGS, USP
•Others
  −UCSD, UnB, UFBA, UCS, UniCap, UFPB, UFAL ...
![Page 36: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/36.jpg)
Questions?
![Page 37: The OurGrid Project ourgrid](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c51550346895da5d161/html5/thumbnails/37.jpg)
Thank you! Merci! Danke! Grazie! Gracias! Obrigado!
More at www.ourgrid.org