Volunteer computing using BOINC

Post on 12-Apr-2017

Category:

Software

Cover from Linux magazine, BOINC

Volunteer computing using Berkeley Open Infrastructure for Network Computing

To get more references visit: http://bit.ly/boinc_srbiau

Pooyan Mehrparvar

This presentation covers:

i. A brief introduction to volunteer computing & BOINC
ii. BOINC’s applications
iii. BOINC’s architecture
iv. How to join a popular BOINC project
v. How to set up your own BOINC project

I – A brief Introduction to volunteer computing & BOINC

Why do we choose this topic? What is volunteer computing? What is BOINC?

What is volunteer computing

An established technology that enables ordinary citizens to donate their computing resources to one or more "projects".

BOINC is the most widely-used middleware system

a client program runs on the volunteer's computer

BOINC was introduced by David P. Anderson

BOINC is designed to support applications that have large computation requirements, storage requirements, or both. The main requirement of the application is that it be divisible into a large number (thousands or millions) of jobs that can be done independently.
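The divisibility requirement can be illustrated with a minimal Python sketch. It is not BOINC code: the prime-counting task and the names split_into_jobs and run_job are hypothetical, chosen only to show a problem cut into jobs that need nothing from each other.

```python
from typing import Iterator


def split_into_jobs(start: int, stop: int, job_size: int) -> Iterator[range]:
    """Yield non-overlapping sub-ranges that together cover [start, stop)."""
    if job_size <= 0:
        raise ValueError("job_size must be positive")
    for lo in range(start, stop, job_size):
        yield range(lo, min(lo + job_size, stop))


def is_prime(n: int) -> bool:
    """Trial division; slow but enough for an illustration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def run_job(job: range) -> int:
    """Hypothetical unit of work: count the primes inside one sub-range."""
    return sum(1 for n in job if is_prime(n))


# Each job could run on a different volunteer computer, in any order.
jobs = list(split_into_jobs(0, 1000, 100))
partial_counts = [run_job(job) for job in jobs]
total = sum(partial_counts)  # combining the partial results is cheap
```

Because no job reads another job's output, the list comprehension could be replaced by any number of machines working in parallel.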

Goal: Use all the computers in the world, all the time, to do worthwhile things

What is BOINC

Basic overview of BOINC jobs

Security: BOINC uses code signing to prevent distribution of malicious executables. Each project has a key pair for code signing, and the private key is kept on a network-isolated machine.
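The idea behind code signing can be sketched in a few lines of Python. This is a toy, not BOINC's actual scheme: it uses textbook RSA with deliberately tiny keys, and the parameters and the names sign and verify are assumptions made for illustration only.

```python
import hashlib

# Toy key pair built from two small known primes. Far too weak for real use.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent, shipped to every client
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept on the offline machine


def digest_as_int(data: bytes) -> int:
    """Hash the executable and reduce the digest into the range of the toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(executable: bytes) -> int:
    """Done by the project, on the network-isolated machine."""
    return pow(digest_as_int(executable), D, N)


def verify(executable: bytes, signature: int) -> bool:
    """Done by each client before running a downloaded executable."""
    return pow(signature, E, N) == digest_as_int(executable)


app = b"bytes of a science application"
signature = sign(app)
```

A client holding only the public part (N, E) can check a download, but an attacker who breaks into the public server cannot produce a valid signature for a modified executable without the private exponent.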

Virtualization as a solution for client’s security

Basic concepts of BOINC:

Project:
• An entity that does distributed computing using BOINC.

Application:
• A project can include multiple applications.
• An application includes several programs (for different platforms) and a set of workunits and results.

Application version:
• A particular application program version, compiled for a particular platform.

Workunit:
• A computation to be performed.
• Associated with an application, not with an application version.

Result:
• An instance of a computation, either unstarted, in progress, or completed.
• Each result is associated with a workunit.
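The relationships between these concepts can be sketched as a small Python data model. The class and field names are illustrative assumptions and do not mirror BOINC's real database schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ResultState(Enum):
    UNSTARTED = "unstarted"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"


@dataclass
class AppVersion:
    version: str
    platform: str  # one compiled program per platform


@dataclass
class Result:
    workunit_name: str  # every result belongs to exactly one workunit
    state: ResultState = ResultState.UNSTARTED


@dataclass
class Workunit:
    name: str
    app_name: str  # tied to the application, not to an application version
    results: List[Result] = field(default_factory=list)

    def new_result(self) -> Result:
        """Create one more instance of this computation."""
        result = Result(self.name)
        self.results.append(result)
        return result


@dataclass
class Application:
    name: str
    versions: List[AppVersion] = field(default_factory=list)
    workunits: List[Workunit] = field(default_factory=list)


@dataclass
class Project:
    name: str
    applications: List[Application] = field(default_factory=list)


project = Project("testproject")
app = Application("fold")
app.versions += [AppVersion("1.0", "linux-x64"), AppVersion("1.0", "windows-x64")]
wu = Workunit("wu_0001", app_name=app.name)
first, second = wu.new_result(), wu.new_result()
first.state = ResultState.COMPLETED
app.workunits.append(wu)
project.applications.append(app)
```

Keeping the workunit independent of the application version is what lets the same computation be handed to a Linux host and a Windows host.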

When does BOINC operate?

Cycle scavenging: by default, BOINC operates on the computer’s idle CPU/GPU cycles and other idle resources.

To avoid high battery consumption and 3G data charges, the BOINC Android app runs only on A/C power and a WiFi connection (by default).
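The default behaviour described above amounts to a small decision rule. A minimal sketch follows; the DeviceState fields and the should_run function are hypothetical names, and real BOINC preferences are more fine-grained than this.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    user_active: bool   # keyboard, mouse, or touch input seen recently
    is_mobile: bool     # phone or tablet running the Android app
    on_ac_power: bool
    on_wifi: bool


def should_run(state: DeviceState) -> bool:
    """Apply the defaults from the slide: idle cycles only, and on mobile
    devices only while charging and connected over WiFi."""
    if state.user_active:
        return False
    if state.is_mobile:
        return state.on_ac_power and state.on_wifi
    return True


idle_desktop = DeviceState(user_active=False, is_mobile=False, on_ac_power=True, on_wifi=False)
phone_on_battery = DeviceState(user_active=False, is_mobile=True, on_ac_power=False, on_wifi=True)
phone_charging = DeviceState(user_active=False, is_mobile=True, on_ac_power=True, on_wifi=True)
```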

The client can run BOINC as:
Screensaver (with fancy graphics)
Windows service (running in the background)
Application (displaying results in tabular form)

Brief history of BOINC
SETI@home development (1998)
Infrastructure issues, United Devices (2000)
United Devices falling out, SETI@home failed, hacked (2001)
BOINC was introduced (2002)
Climateprediction.net, SETI@home, LHC@home were implemented on BOINC (2004)
Rosetta@home, Einstein@home, IBM World Community Grid, PrimeGrid (2005)
BOINCstats (BAM!), GridRepublic, BOINC wrapper (2006)
GPU support, multi-core apps (2008)
BOINC packages for Debian (2010)
Apps in virtual machines, vbox wrapper (server-side) (2011)
Android, Condor/OSG collaboration, Git (2012)
VirtualBox client (2013)
Samsung app (Power Sleep), HTC app (Power To Give) (2014)

BOINC supported platforms (until December 2014)

Server:
Unix-based operating systems (Debian-based Linux is recommended)
Microsoft Windows Vista or later (POSIX ready)

Client:
Linux (x86/x64)
Microsoft Windows (x86/x64)
MacOS (x86/x64)
PlayStation 3
Android

Virtualization as a solution for cross-platform distributed systems (VirtualBox, VMware, Virtual PC, ...)

Performance benchmarking:
FLOPS (floating-point operations per second) as a measure of computer performance in scientific fields
Some computer systems are unable to run a FLOPS benchmark
MIPS/MOPS: suitable for database queries, word processing, spreadsheets, or running multiple virtual operating systems

Benchmarking records : Fastest supercomputer: China's Tianhe-2 running 33.86 petaflops (June 10, 2013)

BOINC: Active: 232,691 volunteers, 718,577 computers. 24-hour average: 8.308 petaFLOPS (December 19, 2014)
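To put the two records side by side, the figures quoted above can be combined with some simple arithmetic. The sketch below uses only the numbers from the slide; the variable names are illustrative.

```python
# Figures quoted on the slide
tianhe2_pflops = 33.86       # Tianhe-2, June 2013
boinc_pflops = 8.308         # BOINC 24-hour average, December 19, 2014
boinc_computers = 718_577    # active computers on the same date

# Average contribution of one volunteer computer, in gigaFLOPS
avg_gflops_per_computer = boinc_pflops * 1e15 / boinc_computers / 1e9

# BOINC's combined throughput as a fraction of the fastest supercomputer
share_of_tianhe2 = boinc_pflops / tianhe2_pflops

print(f"{avg_gflops_per_computer:.2f} GFLOPS per computer on average")
print(f"{share_of_tianhe2:.1%} of Tianhe-2")
```

On these numbers, an average volunteer computer contributes roughly 11.6 GFLOPS, and the whole network reaches about a quarter of Tianhe-2's throughput.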

Why use GPUs & PlayStation?
General-purpose computing on graphics processing units (GPGPU)
PlayStation 4 (AMD/ATI GPU):
18 compute units, 64 cores per unit = 1,152 cores
Theoretical peak performance = 1.84 TFLOPS

The PlayStation 3 (Cell processor, NVIDIA GPU) was widely used in volunteer computing projects, e.g. Folding@home
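The PlayStation 4 peak figure can be reproduced from the core count. The slide gives only the compute units and cores; the 800 MHz clock and the two floating-point operations per core per cycle (one fused multiply-add) are assumptions added here to complete the calculation.

```python
compute_units = 18
cores_per_unit = 64
clock_hz = 800e6               # assumption: 800 MHz GPU clock
flops_per_core_per_cycle = 2   # assumption: one fused multiply-add counts as 2 FLOPs

cores = compute_units * cores_per_unit
peak_tflops = cores * clock_hz * flops_per_core_per_cycle / 1e12

print(cores, "cores,", round(peak_tflops, 2), "TFLOPS theoretical peak")
```

Under these assumptions the result is 1,152 cores and about 1.84 TFLOPS, matching the slide. A theoretical peak assumes every core issues a multiply-add on every cycle, which real applications rarely sustain.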

Computational science: Rosetta@home

Virtual campus supercomputing: University of Westminster in London

Desktop grids for business: Slicify project

Integration with HTCondor to allow Globus-based grids to run jobs for BOINC projects: Einstein@OSG

II – BOINC’s applications

What is computational Science?

Computational science is concerned with constructing mathematical models and quantitative analysis techniques and using computers to analyze and solve scientific problems.

BOINC is used in:
Physics, astrophysics
Mathematics
Biology and medicine (protein folding)
Distributed sensing
Climate modeling
Games, 3D animation rendering, ...

Popular BOINC projects:
Physics, astrophysics: LHC@home, Einstein@home, SETI@home
Mathematics: SZTAKI Desktop Grid, PrimeGrid
Biology and medicine: Folding@home, Rosetta@home
Distributed sensing: Quake Catcher Network
Climate modeling: Climateprediction.net
Games, 3D animation rendering, ...: Chess@home, Enigma@home

Find more projects on: http://boinc.berkeley.edu/projects.php

LHC@home, SETI@home

What is protein folding?

Amino acids, proteins, DNA
Protein structure prediction / modeling protein structures
Rosetta@home helps to find a cure for:
Alzheimer’s disease
HIV
Malaria
Anthrax
Herpes simplex virus 1

Rosetta@home’s screensaver

Crowdsourcing
Crowdsourcing is the process of getting work or funding, usually online, from a crowd of people.
Human vs computer

Foldit: Solve puzzles for science!
From the creators of Rosetta@home (University of Washington)
Similar to Rosetta@home, Foldit is aimed at discovering native protein structures faster, through a combination of crowdsourcing and distributed computing.
Game with a purpose
In 2011, gamers helped to decipher the crystal structure of the M-PMV retroviral protease (M-PMV is a virus causing AIDS in monkeys)

Snapshot of Foldit game

III – BOINC’s architecture

Challenges while using a BOINC system:

How can we depend on clients? Are they permanent?
Will they reach the deadlines?
Is the produced result valid?
How can we trust a BOINC server?
How can we trust a BOINC client?
How to send jobs to various platforms?

Some features of a BOINC system:

homogeneous redundancy (sending the copies of a workunit only to computers of the same platform, e.g. Win XP SP2 only)

workunit trickling (sending information to the server before the workunit completes)

locality scheduling (sending workunits to computers that already have the necessary files and creating work on demand)
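A much simplified sketch of the locality idea follows, in Python. It only ranks workunits by how many input files the host would still have to download; the function name, the file names, and the data layout are assumptions, and BOINC's real locality scheduler is considerably more involved.

```python
from typing import Dict, Optional, Set


def pick_workunit(host_files: Set[str],
                  workunits: Dict[str, Set[str]]) -> Optional[str]:
    """Return the workunit whose input files the host is missing the fewest of.

    `workunits` maps a workunit name to the set of files it needs.
    Ties are broken by name so the choice is deterministic.
    """
    if not workunits:
        return None
    return min(sorted(workunits),
               key=lambda name: len(workunits[name] - host_files))


# Hypothetical data: the host already holds one large input file.
host_files = {"sky_A.dat"}
pending = {
    "wu_B1": {"sky_B.dat"},
    "wu_A1": {"sky_A.dat"},
    "wu_AB": {"sky_A.dat", "sky_B.dat"},
}
chosen = pick_workunit(host_files, pending)
```

Here the host is given wu_A1, which needs no download at all, rather than a workunit that would force it to fetch another large file.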

work distribution based on host parameters (workunits requiring 512 MB of RAM, for example, will only be sent to hosts having at least that much RAM; more jobs are sent to hosts with multi-core CPUs/GPUs)
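The two rules in this item (respect the RAM requirement, give more jobs to hosts with more cores) can be sketched as follows. Host, Job, and assign are hypothetical names, and one job per core is an assumption made to keep the example small.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Host:
    name: str
    ram_mb: int
    n_cpus: int


@dataclass
class Job:
    name: str
    min_ram_mb: int


def assign(jobs: List[Job], hosts: List[Host]) -> Dict[str, List[str]]:
    """Give each job to the first host with enough RAM and a free core.

    A host receives at most one job per core, so multi-core hosts get more.
    Jobs that fit nowhere are left unassigned.
    """
    free_cores = {h.name: h.n_cpus for h in hosts}
    plan: Dict[str, List[str]] = {h.name: [] for h in hosts}
    for job in jobs:
        for host in hosts:
            if free_cores[host.name] > 0 and host.ram_mb >= job.min_ram_mb:
                plan[host.name].append(job.name)
                free_cores[host.name] -= 1
                break
    return plan


hosts = [Host("laptop", ram_mb=256, n_cpus=2), Host("desktop", ram_mb=2048, n_cpus=4)]
jobs = [Job("wu1", 512), Job("wu2", 128), Job("wu3", 512), Job("wu4", 128), Job("wu5", 128)]
plan = assign(jobs, hosts)
```

The 512 MB workunits never reach the 256 MB laptop, and once the laptop's two cores are busy the remaining small job goes to the desktop.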

Double checking: the server sends the same workunit to at least two clients, then compares the results using a validation technique (bitwise, sample trivial, fuzzy) or a customized validation technique
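Double checking can be sketched as a search for a quorum of agreeing results. The Python below shows a bitwise comparison and a fuzzy one; the function names, the tolerance, and the quorum of two are assumptions and are not taken from BOINC's validator code.

```python
import math
from typing import Callable, List, Optional, Sequence

Output = Sequence[float]


def bitwise_equal(a: Output, b: Output) -> bool:
    """Outputs must match exactly."""
    return list(a) == list(b)


def fuzzy_equal(a: Output, b: Output, rel_tol: float = 1e-6) -> bool:
    """Outputs may differ by small floating-point noise."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def find_canonical(outputs: List[Output],
                   same: Callable[[Output, Output], bool],
                   quorum: int = 2) -> Optional[Output]:
    """Return the first output that at least `quorum` results agree on.

    Each output counts as agreeing with itself, so quorum=2 means one
    other client must have returned a matching result.
    """
    for candidate in outputs:
        agreeing = sum(1 for other in outputs if same(candidate, other))
        if agreeing >= quorum:
            return candidate
    return None


# Two honest clients with slightly different rounding, one bad result.
returned = [[1.0, 2.0], [1.0, 2.0000001], [9.0, 9.0]]
strict = find_canonical(returned, bitwise_equal)
tolerant = find_canonical(returned, fuzzy_equal)
```

With the bitwise comparison no two results match, so no result is accepted; with the fuzzy comparison the first two agree and the outlier is rejected. Which comparison is appropriate depends on whether the application is expected to produce identical bits on every platform.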

Job scheduling is needed in both server & client: priority between different tasks, reaching deadlines
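One simple way to reason about deadlines on the client side is earliest-deadline-first ordering. The sketch below is an assumption used for illustration, not a description of BOINC's actual scheduler; Task and edf_schedule are hypothetical names, and a single CPU is assumed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    name: str
    remaining_hours: float  # estimated compute time still needed
    deadline_hours: float   # time from now until the result is due


def edf_schedule(tasks: List[Task]) -> Tuple[List[str], List[str]]:
    """Run tasks one after another, earliest deadline first.

    Returns the execution order and the tasks that would finish late.
    """
    clock = 0.0
    order: List[str] = []
    missed: List[str] = []
    for task in sorted(tasks, key=lambda t: t.deadline_hours):
        clock += task.remaining_hours
        order.append(task.name)
        if clock > task.deadline_hours:
            missed.append(task.name)
    return order, missed


tasks = [Task("A", 4, 10), Task("B", 2, 3), Task("C", 5, 9)]
order, missed = edf_schedule(tasks)
```

In this example B and C finish in time, while A would finish after 11 hours against a 10-hour deadline, which is the kind of signal a client could use to stop requesting more work.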

To avoid cheating, credits are granted only after the job is validated by the server

BOINC server DAEMONS: In multitasking computer systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user.

Generator, Transitioner, Feeder, Scheduler, Validator, Assimilator, File deleter

IV – How to join a popular BOINC project

Register at BOINCstats (BAM)
Choose your project(s)
Join/create a team (optional)
Download the BOINC client (BOINC Manager)
Log into the software
The project(s) will be attached immediately
Tasks will be downloaded
Completed tasks will be uploaded to the server
You will gain credit according to your performance
New jobs will be downloaded

Step by step manual is available at: http://bit.ly/boinc_srbiau

BOINC client manager running four tasks (three projects)

V – How to set-up your own BOINC project

Project configuration in a nutshell:

Configure a LAMP stack (Linux, Apache server, MySQL, PHP/Python). A VirtualBox Debian Linux image is available at BOINC’s website

Develop your own project (mostly done in C++ (GCC), Fortran)

Attach your project URL in BOINC Manager (client): “your server ip/projectname” (e.g. http://192.168.1.80/testproject)

Monitor your project performance/administration at “your server ip/projectname_ops” (e.g. http://192.168.1.80/testproject_ops)
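The two address patterns above can be captured in a tiny helper. This is only a restatement of the slide's examples in Python; project_urls is a hypothetical name, and plain http is assumed because the examples use it.

```python
from typing import Dict


def project_urls(server_ip: str, project_name: str) -> Dict[str, str]:
    """Build the client attach URL and the administration URL for a project."""
    base = f"http://{server_ip}/{project_name}"
    return {
        "attach": base,        # entered in the BOINC Manager on the client
        "ops": base + "_ops",  # monitoring and administration pages
    }


urls = project_urls("192.168.1.80", "testproject")
```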

Step by step video is available at: http://bit.ly/boinc_srbiau

BOINC server (left) + BOINC manager client (right) using virtualization

Conceived to be used by scientists, not IT professionals

BOINC offers tools for
◦ Creating, starting, stopping and querying projects
◦ Adding new applications, new platforms, …
◦ Creating workunits
◦ Monitoring server performance

(All these procedures could be done by UNIX shell commands directly)

Server administration’s page:

What you probably need to implement a BOINC ecosystem:
A precise evaluation before choosing BOINC as a solution, comparing it to other alternatives (JPPF, other cloud or distributed systems)
Fundamental understanding of BOINC architecture, including its daemons, repositories, shell-script commands, etc.
Good knowledge and experience of the Linux environment (shell)
C/C++, Fortran programming skills (GCC, VC++, etc.)
Server/network configuration
Security issues

BOINC’s weaknesses:
The architecture is centralized, in contrast to most Grid systems
Insufficient interest from computer science
Insufficient interest from scientists
Insufficient interest from funding agencies
According to official reports, the number of volunteers is not increasing
Complexity of server and job submission
Insufficient documentation (obsolete docs)
Little shared experience in the web community (you can’t find your answers in BOINC forums, Stack Overflow, etc.)
Risk of failure
