Volunteer Computing with BOINC
David P. Anderson
Space Sciences Laboratory
University of California, Berkeley
High-throughput computing
• Goal: finish lots of jobs in a given time
• Paradigms:
– Supercomputing
– Cluster computing
– Grid computing
– Cloud computing
– Volunteer computing
Cost of 1 TFLOPS-year
• Cluster: $145K
– Computing hardware; power/AC infrastructure; network hardware; storage; power; sysadmin
• Cloud: $1.75M
• Volunteer: $1K - $10K
– Server hardware; sysadmin; web development
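The cost figures above can be compared directly. The sketch below only divides the quoted numbers; the dictionary name, the function name, and the choice to treat the low and high ends of the volunteer range as separate entries are illustrative assumptions, not part of the talk.

```python
# Cost of one TFLOPS-year in US dollars, as quoted on the slide.
# The key names are hypothetical labels chosen for this example.
COST_PER_TFLOPS_YEAR = {
    "cluster": 145_000,
    "cloud": 1_750_000,
    "volunteer_low": 1_000,
    "volunteer_high": 10_000,
}


def cost_ratio(a: str, b: str) -> float:
    """Return how many times more expensive paradigm `a` is than paradigm `b`."""
    return COST_PER_TFLOPS_YEAR[a] / COST_PER_TFLOPS_YEAR[b]


if __name__ == "__main__":
    for other in ("cluster", "cloud"):
        for volunteer in ("volunteer_high", "volunteer_low"):
            print(f"{other} / {volunteer}: {cost_ratio(other, volunteer):g}x")
```

Even against the high end of the volunteer estimate, the quoted cluster cost is 14.5 times larger and the quoted cloud cost is 175 times larger.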
Performance
• Current
– 500K people, 1M computers
– 6.5 PetaFLOPS (3 from GPUs, 1.4 from PS3s)
• Potential
– 1 billion PCs today, 2 billion in 2015
– GPU: approaching 1 TFLOPS
– How to get 1 ExaFLOPS:
  • 4M GPUs * 0.25 availability
– How to get 1 Exabyte:
  • 10M PC disks * 100 GB
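The two "how to get" estimates are simple products. The sketch below spells out the arithmetic, assuming each GPU delivers the 1 TFLOPS mentioned on the slide and using decimal units (1 GB = 10^9 bytes, 1 Exabyte = 10^18 bytes). The function and constant names are illustrative.

```python
# Decimal unit definitions (an assumption; the slide does not say which convention it uses).
TFLOPS = 10**12
EXAFLOPS = 10**18
GB = 10**9
EXABYTE = 10**18


def aggregate_flops(num_gpus: int, flops_per_gpu: int, availability: float) -> float:
    """Sustained throughput when each GPU is available a given fraction of the time."""
    return num_gpus * flops_per_gpu * availability


def aggregate_storage_bytes(num_disks: int, bytes_per_disk: int) -> int:
    """Total storage contributed by a number of disks offering the same amount each."""
    return num_disks * bytes_per_disk


if __name__ == "__main__":
    flops = aggregate_flops(4_000_000, 1 * TFLOPS, 0.25)
    storage = aggregate_storage_bytes(10_000_000, 100 * GB)
    print(flops / EXAFLOPS, "ExaFLOPS")
    print(storage / EXABYTE, "Exabytes")
```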
History of volunteer computing
[Timeline figure; axis marks: 1995, 2000, 2005, now]
• Applications: distributed.net, GIMPS; SETI@home, Folding@home; Climateprediction.net, Predictor@home, IBM World Community Grid, Einstein@home, Rosetta@home, ...
• Middleware:
– Commercial: Entropia, United Devices, ...
– Academic: Bayanihan, Javelin, ...
– BOINC
The BOINC computing ecosystem
[Figure: volunteers linked to projects (CPDN, LHC@home, WCG) by attachments]
• Projects compete for volunteers
• Volunteers make their contributions count
• Optimal equilibrium
What apps work well?
• Bags of tasks
– parameter sweeps
– simulations with perturbed initial conditions
– compute-intensive data analysis
• Native, legacy, Java, GPU
– soon: VM-based
• Job granularity: minutes to months
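A parameter sweep is the simplest kind of bag of tasks: every point in the parameter grid becomes a job that can run on a different machine with no communication between jobs. The sketch below illustrates that structure only. It does not use any BOINC API; `make_jobs`, `run_job`, and the parameter names are hypothetical.

```python
import itertools


def make_jobs(param_grid):
    """Expand a parameter grid into a list of independent job descriptions."""
    names = sorted(param_grid)
    combos = itertools.product(*(param_grid[name] for name in names))
    return [
        {"job_id": i, "params": dict(zip(names, values))}
        for i, values in enumerate(combos)
    ]


def run_job(job):
    """Stand-in for the real compute-intensive application (hypothetical)."""
    p = job["params"]
    return p["temperature"] * p["pressure"]


if __name__ == "__main__":
    grid = {"temperature": [280, 290, 300], "pressure": [1.0, 2.0]}
    jobs = make_jobs(grid)
    # Each job depends only on its own parameters, so the jobs could be
    # handed to different hosts in any order.
    results = {job["job_id"]: run_job(job) for job in jobs}
    print(len(jobs), "independent jobs")
```

Because no job reads another job's output, a slow or vanished host delays only its own job, which is what makes this workload shape a good fit.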
Data size issues
[Figure: two links joined by the commodity Internet]
– Institution link: ~1 Gbps, non-dedicated, underutilized
– Link at the other end: ~1 Mbps (450 MB/hr), possibly sporadic, non-dedicated, underutilized
• Can handle moderately data-intensive apps
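The 450 MB/hr figure follows from converting 1 Mbps to bytes and seconds to hours. The sketch below does that conversion and estimates how long a job's input would take to download, assuming the link is fully usable for the transfer (the slide notes it may be sporadic). The function names and the 100 MB example size are illustrative.

```python
def mbps_to_mb_per_hour(mbps: float) -> float:
    """Convert megabits per second to megabytes per hour (8 bits per byte)."""
    return mbps / 8 * 3600


def transfer_hours(size_mb: float, mbps: float) -> float:
    """Hours needed to move `size_mb` megabytes over a link of `mbps` megabits per second."""
    return size_mb / mbps_to_mb_per_hour(mbps)


if __name__ == "__main__":
    print(mbps_to_mb_per_hour(1.0), "MB/hr at 1 Mbps")
    print(transfer_hours(100, 1.0), "hours to fetch a 100 MB input file")
```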
Example projects
• Einstein@home
• Climateprediction.net
• Rosetta@home
• IBM World Community Grid
• GPUGRID.net
Creating a volunteer computing project
• Set up a server
• Port applications, develop graphics
• Develop software for job submission and result handling
• Develop web site
• Ongoing:
– publicity, volunteer communication
– system, DB admin (Linux, MySQL)
How many CPUs will you get?
• Depends on:
– PR efforts and success
– public appeal
• 12 projects have > 10,000 active hosts
• 3 projects have > 100,000 active hosts
Security
• Code signing
• Client: account-based sandbox
[Figure: diagram with Project, Volunteer, and Hacker]
Organizational issues
• Creating a volunteer computing project has startup costs and requires diverse skills
• This limits its use by individual scientists and research groups
• Better model: umbrella projects
– Institutional
  • Lattice, VTU@home
– Corporate
  • IBM World Community Grid
– Community
  • AlmereGrid
Summary
• Volunteer computing is an important paradigm for high-throughput computing
– price/performance
– performance potential
• Low technical barriers to entry (due to BOINC)
• Organizational structure is critical
• Use GPUs if developing new app