![Page 1: Volunteer Computing with BOINC Dr. David P. Anderson University of California, Berkeley SC10 Nov. 14, 2010](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649efd5503460f94c116ab/html5/thumbnails/1.jpg)
Volunteer Computing with BOINC
Dr. David P. Anderson, University of California, Berkeley
SC10, Nov. 14, 2010
Goals
• Explain volunteer computing
• Teach how to create a volunteer computing project using BOINC
• Target audience: high-throughput computing users
• Technical skills: basic Linux/Apache sysadmin; familiarity with PHP, SQL and XML; C/C++ (optional)
Outline
• Why use volunteer computing?
• Basic concepts of BOINC
• Developing BOINC applications
• (15 minute break)
• Deploying a BOINC server
• Deploying applications
• Submitting jobs
• Organizational issues
Part 1:
Why use volunteer computing?
The Consumer Digital Infrastructure
• 1 billion PCs
– current GPUs: 1 TeraFLOPS (1,000 ExaFLOPS total)
– Storage: ~1,000 Exabytes
• Commodity Internet: 10-1,000 Mbps to home
• Consumers pay for
– hardware
– sysadmin
– network costs
– electricity
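The "1,000 ExaFLOPS total" figure follows from the two numbers on the slide. A minimal sketch of the arithmetic, assuming (as the slide implies) that every one of the 1 billion PCs has a 1 TeraFLOPS GPU; the function name is illustrative only:

```python
# Figures taken from the slide.
NUM_PCS = 10**9            # 1 billion PCs
FLOPS_PER_GPU = 10**12     # 1 TeraFLOPS per current GPU
EXA = 10**18               # 1 ExaFLOPS in FLOPS

def aggregate_exaflops(num_pcs, flops_per_pc):
    """Upper-bound aggregate throughput in ExaFLOPS (assumes every PC contributes fully)."""
    return num_pcs * flops_per_pc // EXA

total_exaflops = aggregate_exaflops(NUM_PCS, FLOPS_PER_GPU)
print(total_exaflops, "ExaFLOPS")
```

This is a ceiling, not a deliverable rate: it ignores availability, the fraction of owners who volunteer, and PCs without such a GPU.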
Volunteer computing
PC owners donate computing resources to projects (e.g., computational science)
Applications run at zero priority while PC in use, and/or while PC is not in use
Examples

| Project | start | where | area | peak #hosts |
|---|---|---|---|---|
| GIMPS | 1994 | | math | 10,000 |
| distributed.net | 1995 | | cryptography | 100,000 |
| SETI@home I | 1999 | UCB | SETI | 600,000 |
| Folding@home | 1999 | Stanford | biology | 200,000 |
| United Devices | 2002 | commercial | biomedicine | 200,000 |
| CPDN | 2003 | Oxford | climate change | 150,000 |
| LHC@home | 2004 | CERN | physics | 60,000 |
| Predictor@home | 2004 | Scripps | biology | 100,000 |
| WCG | 2004 | commercial | biomedicine | 200,000 |
| Einstein@home | 2005 | LIGO | astrophysics | 200,000 |
| SETI@home II | 2005 | UCB | SETI | 850,000 |
| Rosetta@home | 2005 | U. Wash | biology | 100,000 |
| SIMAP | 2005 | T.U. Munich | bioinformatics | 10,000 |
| ... | ... | ... | ... | ... |
Current status
• ~50 projects
• 500,000 volunteers
• 800,000 computers
[Chart by processor type: NVIDIA, CPU, PS3 (Cell), ATI; data labels 4.6, 2.4, 2.2, 1.2]
[Figure: computing platforms arranged by # processors (1, 100, 1000, 10K-1M) and by single job vs. multiple jobs. High-performance computing: cluster (MPI), supercomputer. High-throughput computing: cluster (batch), Grid, Commercial cloud, Volunteer computing.]
Volunteer computing is different
• You don’t buy resources; you ask for them
• Resources are:
– heterogeneous
– sporadically available and connected
– untrusted and not private
– behind firewalls/NATs/proxies
Part 2:
Basic concepts of BOINC
About BOINC
• Funded by NSF since 2002
• Open-source (LGPL)
• Based at UC Berkeley
• Few staff, but lots of volunteers
– software testing
– translation
– documentation
– support (email lists, message boards, Skype)
Volunteers and projects
[Diagram: volunteers linked to projects (CPDN, LHC@home, WCG) by attachments]
BOINC software overview
• volunteer host: client, apps, screensaver, GUI
• project server: scheduler, MySQL, data server, daemons
• volunteer host and project server communicate via HTTP
BOINC scheduler
[Diagram: applications have app versions (Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X) and jobs; jobs have instances]
• Scheduler request:
– HW, SW description
– existing workload
– per resource type: # of instances requested, # of seconds requested
• Scheduler reply:
– app version descriptions
– job descriptions
Job replication
• Job instances may fail or return wrong results
• Job replication: do 2, see if they agree
– “agree” may be fuzzy
• Homogeneous replication
– numerical equivalence of hosts
• Adaptive replication
– reduce replication for hosts that seem trustworthy
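The "do 2, see if they agree" idea can be sketched as a quorum check with a fuzzy comparison. This is an illustration only, not BOINC's validator: the function names, the representation of a result as a list of floats, and the tolerance are all assumptions made for the example.

```python
import math

def results_agree(a, b, rel_tol=1e-6):
    """Fuzzy agreement: same length and element-wise equality within a relative tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )

def find_canonical(results, min_quorum=2, rel_tol=1e-6):
    """Return the first result that at least min_quorum results agree with, else None."""
    for candidate in results:
        matches = sum(1 for r in results if results_agree(candidate, r, rel_tol))
        if matches >= min_quorum:
            return candidate
    return None  # no quorum yet: another instance of the job would be needed

# Two instances that differ only by floating-point noise agree.
agreed = find_canonical([[1.0, 2.0], [1.0000000001, 2.0]])
# Two instances that genuinely differ do not reach a quorum.
disputed = find_canonical([[1.0], [1.5]])
```

With homogeneous replication, instances are sent only to numerically equivalent hosts, so the comparison could be exact rather than fuzzy.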
The job pipeline
work generator → BOINC → validator → assimilator
The BOINC data model
• App versions, job inputs, and job outputs can consist of arbitrarily many files
• Each file has a physical name (unique, immutable); each reference to a file has a “logical name”
• Files have various attributes (e.g., sticky)
• Each file can have one or more URLs and is transferred via HTTP
• App version files are digitally signed
What kinds of jobs can BOINC handle?
• Pretty much anything you’d run on a Grid
• Bag of tasks (but IPC support soon)
• Short/long jobs
• Data intensive, up to a point
• Geared towards
– Few apps, many jobs (high startup cost per app)
– Jobs with high slack time
Part 3:
Application development for BOINC
The BOINC runtime environment
processes
files
Native BOINC applications
• boinc_init(): create runtime system thread
• boinc_finish(): write finish file
• boinc_resolve_filename(logical, physical)
• boinc_fraction_done(x)
Checkpointing
• bool boinc_time_to_checkpoint(): call when in checkpointable state
• boinc_checkpoint_done()
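The calling pattern behind these two functions can be sketched as follows. The real API is C/C++; this Python version only mirrors the control flow, and `time_to_checkpoint` and `checkpoint_done` are hypothetical callables standing in for `boinc_time_to_checkpoint()` and `boinc_checkpoint_done()`. The state file format and the toy workload are also assumptions.

```python
import json
import os
import tempfile

def run_job(n_steps, state_path, time_to_checkpoint, checkpoint_done):
    """Sum the integers 0..n_steps-1, resuming from state_path if it exists."""
    step, total = 0, 0
    if os.path.exists(state_path):          # resume after an interruption
        with open(state_path) as f:
            saved = json.load(f)
        step, total = saved["step"], saved["total"]

    while step < n_steps:
        total += step                        # one unit of work
        step += 1
        # The state is consistent here, so this is a checkpointable point.
        if time_to_checkpoint():
            tmp_path = state_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump({"step": step, "total": total}, f)
            os.replace(tmp_path, state_path)  # atomic swap: never a half-written file
            checkpoint_done()                 # tell the runtime the checkpoint is complete
    return total

def demo():
    """Run 5 steps, then 'restart' and continue to 10 steps from the saved state."""
    acks = []
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "state.json")
        first = run_job(5, path, lambda: True, lambda: acks.append(1))
        resumed = run_job(10, path, lambda: True, lambda: acks.append(1))
    return first, resumed, len(acks)
```

The application asks whether to checkpoint only when its state is consistent, and the decision of how often to write stays with the runtime rather than with the application.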
The BOINC wrapper
• Can use for legacy apps
• XML input file lists sub-jobs
– executable, input files
• What it does:
– interfaces to BOINC client
– copies files to/from slot directory
– runs executables
– does checkpointing at sub-job level
Building app versions
• Linux: gcc
• Windows: Visual Studio, MinGW (gcc)
• Mac OS X: Xcode
Multithread apps
• boinc_init_parallel()
• Allows suspend/resume of all threads
– Unix: fork/exec
– Windows: direct thread control
GPU app versions
• Develop for NVIDIA or ATI, with CUDA, CAL, OpenCL, etc. (BOINC supplies samples)
• Each version has a “plan class”
• For each plan class, supply a function that determines
– can app run on this host? (hardware, driver version, etc.)
– what resources will it use? (#CPUs, #GPUs, GPU RAM, etc.)
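A plan class function answers the two questions above for one host. The sketch below shows the shape of such a function only; the `Host` and `Usage` types, the field names, and every threshold are hypothetical and do not reflect BOINC's actual server-side interface.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Host:                      # hypothetical host description
    gpu_vendor: Optional[str]    # e.g. "NVIDIA", "ATI", or None
    gpu_ram_mb: int
    driver_version: int
    ncpus: int

@dataclass
class Usage:                     # hypothetical resource-usage description
    ncpus: float
    ngpus: float
    gpu_ram_mb: int

def cuda_plan_class(host, min_driver=19000, min_gpu_ram_mb=256):
    """Return the resources the app version would use, or None if it cannot run."""
    # Question 1: can the app run on this host?
    if host.gpu_vendor != "NVIDIA":
        return None
    if host.driver_version < min_driver:
        return None
    if host.gpu_ram_mb < min_gpu_ram_mb:
        return None
    # Question 2: what resources will it use? (illustrative numbers)
    return Usage(ncpus=0.1, ngpus=1.0, gpu_ram_mb=min_gpu_ram_mb)

usable = cuda_plan_class(Host("NVIDIA", 512, 19500, 4))
wrong_vendor = cuda_plan_class(Host("ATI", 512, 19500, 4))
```

Returning the resource usage together with the yes/no answer gives the scheduler what it needs to match the request's per-resource instance counts.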
VM apps
• Develop apps on your favorite OS
• Create a VirtualBox VM image
• App version consists of
– VM wrapper (supplied by BOINC)
– VM image
– app executable
Part 4:
Deploying a BOINC server
Hardware options
• Native Linux host
– download/compile BOINC software
• BOINC server VM (VMware/Debian)
• BOINC Amazon EC2 image
Components of a project
• Master URL name
• MySQL database
• Directory hierarchy
• A set of daemon processes and cron jobs
Processes
[Diagram: server processes around the MySQL DB: work generator, validator, assimilator, feeder, scheduler, transitioner, file deleter, DB purger; clients contact the scheduler]
Project directory hierarchy
apps/ application files
bin/ daemon programs
cgi-bin/ BOINC scheduler and upload CGI
config.xml configuration file
download/ downloadable files
html/ web site; master URL points here
keys/ keys for code signing, upload auth
log_(hostname) daemon log files
project.xml list of platforms and apps
upload/ uploaded files
BOINC database
platform
app
app_version
user
host
workunit
result
...
Creating a project
• make_project name
• creates
– directory hierarchy
– DB
– mods for httpd.conf
– crontab entry
Project configuration and control
• config.xml
– scheduling and other options
– list of daemons
– list of periodic tasks
• project control
– bin/start: start daemons, enable scheduler
– bin/stop: stop daemons, disable scheduler
– bin/status
Scaling a BOINC server
• Components can run on different machines sharing a file system
• Each component can be distributed
• MySQL server is typically the bottleneck
• 1 server machine can issue ~100K jobs/day; 4 machines can issue > 1 million
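The per-day figures are easier to reason about as sustained dispatch rates. A small sketch of the conversion, using only the two numbers from the slide; the function name is illustrative:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def jobs_per_second(jobs_per_day):
    """Average dispatch rate implied by a daily job count."""
    return jobs_per_day / SECONDS_PER_DAY

one_machine = jobs_per_second(100_000)       # ~100K jobs/day on 1 server machine
four_machines = jobs_per_second(1_000_000)   # > 1 million jobs/day on 4 machines

print(f"1 machine:  {one_machine:.2f} jobs/s")
print(f"4 machines: {four_machines:.2f} jobs/s")
```

These are averages over a day; they say nothing about peak load.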
Part 5:
Deploying applications
Adding an application
• edit project.xml
<app>
    <name>multi_thread</name>
    <user_friendly_name>Test multi-thread apps</user_friendly_name>
</app>
• run bin/xadd
Adding an application version
• Create application version directory
apps/uppercase/
    uppercase_6.14_windows_intelx86__cuda.exe/
        uppercase_6.14_windows_intelx86__cuda.exe
        graphics_app=uppercase_graphics_6.14_windows_intelx86.exe
        logo.jpg
        Helvetica.txf
• Sign files on offline computer
• run bin/update_versions
Part 6:
Submitting jobs
Describing job inputs
• Input template file
<file_info>
    <number>0</number>
</file_info>
<workunit>
    <file_ref>
        <file_number>0</file_number>
        <open_name>in</open_name>
    </file_ref>
    <target_nresults>1</target_nresults>
    <min_quorum>1</min_quorum>
    <command_line>-cpu_time 60</command_line>
    <rsc_fpops_bound>446797000000000</rsc_fpops_bound>
    <rsc_fpops_est>279248000000000</rsc_fpops_est>
</workunit>
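The two `rsc_fpops` values are counts of floating-point operations: an estimate of the job's size and an upper bound on it. A rough sketch of what they imply for run time, assuming a hypothetical host that sustains 1 GFLOPS (the host speed is not from the slides):

```python
# Values copied from the input template.
RSC_FPOPS_EST = 279_248_000_000_000
RSC_FPOPS_BOUND = 446_797_000_000_000

HOST_FLOPS = 1e9   # assumption: a host sustaining 1 GFLOPS

def seconds_at(fpops, host_flops):
    """Run time in seconds for a job of `fpops` operations at `host_flops` FLOPS."""
    return fpops / host_flops

est_hours = seconds_at(RSC_FPOPS_EST, HOST_FLOPS) / 3600
bound_hours = seconds_at(RSC_FPOPS_BOUND, HOST_FLOPS) / 3600
headroom = RSC_FPOPS_BOUND / RSC_FPOPS_EST

print(f"estimated: {est_hours:.1f} h, bound: {bound_hours:.1f} h, ratio: {headroom:.2f}")
```

In this template the bound is about 1.6 times the estimate.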
Describing job outputs
• Output template file
<file_info>
    <name><OUTFILE_0/></name>
    <generated_locally/>
    <upload_when_present/>
    <max_nbytes>5000000</max_nbytes>
    <url><UPLOAD_URL/></url>
</file_info>
<result>
    <file_ref>
        <file_name><OUTFILE_0/></file_name>
        <open_name>out</open_name>
    </file_ref>
</result>
Submitting a job
• Stage input files
cp test_files/12ja04aa `bin/dir_hier_path 12ja04aa`
• Submit job
create_work -appname A -wu_name B -wu_template C -result_template D
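The staging command asks `bin/dir_hier_path` where a file belongs inside the download hierarchy, so that files spread over many subdirectories instead of piling up in one. The general technique is a hash-based fan-out; the sketch below illustrates that idea only. The hash, the fan-out of 1024, and the directory naming are assumptions and may differ from what `bin/dir_hier_path` actually computes.

```python
import hashlib
import os

def hier_path(filename, root="download", fanout=1024):
    """Map a filename to root/<bucket>/<filename>, with the bucket derived from a hash."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % fanout       # stable bucket in [0, fanout)
    return os.path.join(root, format(bucket, "x"), filename)

staged = hier_path("12ja04aa")
print(staged)
```

Because the bucket depends only on the filename, any server component can recompute the location without a lookup table.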
Part 7:
Organizational issues
Single-scientist projects
• Need to:
– port apps
– get publicity
– interface with public
– maintain servers
• Not many research groups have the resources
• And it creates a lot of competing “brands”
Umbrella projects
• Example: IBM World Community Grid
[Diagram: the project handles publicity, web development, sysadmin, app porting]
The Berkeley@home model
• A university has
– scientists
– a powerful “brand”
– PR resources
– IT infrastructure
– lots of alumni (UCB: 500,000)
Hubs
• nanoHUB: “science portal” for nanoscience
– social network + “app store”
– sharing of ideas, data, software
– computational portal
• HUBzero: generalization to other areas
– currently ~20 hubs
• Integration of BOINC with HUBzero
– each hub has a volunteer computing project