An open source approach for grids
Bob Jones
CERN
EU DataGrid Project Deputy Project Leader
EU EGEE Designated Technical Director
Bob Jones (CERN) Delivery of industrial-strength Grid middleware: establishing an effective European approach 21 January 2004 - n° 2
9.8 M Euros EU funding over 3 years www.eu-datagrid.org
90% for middleware and applications (Physics, Earth Observation, Biomedical)
3 year phased developments & demos
Total of 21 partners: research and academic institutes as well as industrial companies
Extensions (time and funds) on the basis of first successful results:
DataTAG (2002-2003) www.datatag.org
CrossGrid (2002-2004) www.crossgrid.org
GridStart (2002-2004) www.gridstart.org
The EU DataGrid Project
Project started in January 2001
Testbed 0 (early 2001): international testbed infrastructure deployed
Globus 1 only, no EDG middleware
Testbed 1 (early 2002): first release of EU DataGrid software to defined users within the project
Testbed 2 (end 2002): builds on Testbed 1 to extend the facilities of DataGrid
Focus on stability
Testbed 3 (2003): advanced functionality and scalability
Currently being deployed
Project ends in March 2004; final review on 19 and 20 February 2004
DataGrid is a project funded by the European Commission under contract IST-2000-25182
Software
50 use cases
~20 software releases
Current release 2.1
~900K lines of code
People
>350 registered users
12 Virtual Organisations
20 Certificate Authorities
>500 people trained
~380 man-years of effort
~138 man-years funded
Scientific applications
5 Earth Observation institutes
9 bio-informatics applications
6 HEP experiments
Testbeds
~20 regular sites
>40 sites using EDG software
Tens of thousands of jobs submitted
>1000 CPUs
>15 TeraBytes disk
3 Mass Storage Systems
DataGrid in Numbers
DataGrid open source license
Open source license developed for all grid middleware produced by the project
The project has developed an open-source-style software license to protect the IPR of the project, in line with GGF recommendations: http://www.edg.org/license.html
The license is similar to the "modified BSD License", but subsequent contributors providing modifications, enhancements or derivative works are free to decide on the license conditions that apply to their own contributions
This license was accepted by all project partners including commercial partners working on software development (IBM UK, DATAMAT, CSSI)
This license has subsequently been used as the basis of the Globus contributors’ license and is being considered by other EU projects for their own use
The license does not apply to application software running on the grid
Sites must acquire the necessary licenses before such application software is installed
Sites then advertise the software via the grid information service and users specify it in their "job description"; the resource broker performs the matchmaking
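The matchmaking step can be illustrated with a minimal sketch. It is not the EDG resource broker and does not use its job description syntax. The class names, software tags and ranking rule (most free CPUs first) are all hypothetical, chosen only to show the idea of matching a job's software requirements against what sites advertise.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Site:
    name: str
    # Software tags the site advertises via the information service (hypothetical names).
    software: Set[str] = field(default_factory=set)
    free_cpus: int = 0


@dataclass
class JobDescription:
    executable: str
    # Software the user declares as required in the job description.
    required_software: Set[str] = field(default_factory=set)


def match_sites(job: JobDescription, sites: List[Site]) -> List[Site]:
    """Return sites advertising every required tag, most free CPUs first.

    The ranking rule is an assumption made for illustration only.
    """
    candidates = [
        s for s in sites
        if job.required_software <= s.software and s.free_cpus > 0
    ]
    return sorted(candidates, key=lambda s: s.free_cpus, reverse=True)
```

A job requiring a tag that no site advertises yields an empty list, which mirrors the point above: application software must be licensed and installed at a site before jobs that need it can be matched there.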
DataGrid Exploitation
DataGrid middleware is used by many sites to connect to the grid
Take-up would have been slower if the sites had to pay a license fee or enter into a written agreement to access the middleware
DataGrid middleware has been adopted and enhanced by many other projects
CrossGrid, GridPP, INFNgrid etc. have adopted the EDG middleware and deployed it on their own grids
DataGrid industrial partners are investigating commercial exploitation
CSSI: working with SMEs as a basis for providing large computing facilities for making technology breakthroughs
LHC Computing Grid (LCG) has enhanced DataGrid middleware to provide a production facility (currently 26 sites) for physics data processing
This will be the basis for the proposed EGEE project grid facility
The next generation of grids - EGEE
• Create a European-wide Grid production quality infrastructure for multiple sciences
• Profit from current and planned national and regional Grid programmes, building on the results of existing projects such as DataGrid, LCG and others, and on the EU research network Geant, and work closely with relevant industrial Grid developers and NRENs
• Support Grid computing needs common to the different communities
• Integrate the computing infrastructures and agree on common access policies
• Exploit international connections (US and AP)
• Provide interoperability with other major Grid initiatives such as the US NSF Cyberinfrastructure, establishing a worldwide Grid infrastructure
EGEE is proposed as a project funded by the European Union under contract IST-2003-508833
EGEE - Strategy
Start with LCG (enhanced DataGrid) middleware to provide a service from day 1
In parallel, middleware will be re-engineered based on implementations of the Open Grid Services Infrastructure (an extension of web services) that have demonstrated scalability via large scale deployment and adoption for many applications
This new middleware will be distributed under an open source license
New application groups will be expected to contribute resources consistent with their requirements
The project must demonstrate that this is less than would be needed without the grid (i.e. by making use of idle resources at different resource centres)
Based on experience with DataGrid and LCG, we expect to attract many resources
From 3000 CPUs in 2004 to 8000 CPUs across 50 sites in 2006
Summary
DataGrid has developed an open source license suitable for EU projects and acceptable to industrial partners
DataGrid has demonstrated the validity of an open source approach for software development in EU projects
Adopting an Open Source approach has facilitated the take-up of DataGrid middleware, structures and processes by resource providers, other projects and industrial partners
An open source approach has been adopted for the future EGEE project