EGEE-II INFSO-RI-031688
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE and gLite are registered trademarks
Me, Myself and GD
Felix Ehm
Felix Ehm, CERN 2008 2/16
Little Introduction
• Introduction
  – Fellow
  – IT-GD-ITR
  – Oliver, Laurence as supervisors
  – Andreas, Joachim, Gergo, Louis, Maria, Di, Juha, Ricardo
  – (currently) 28 years old
• Activities
  – EGEE Information System
  – GLUE 2.0
  – GSTAT 2.0
  – VOMRS
HARIBO
Location
The EGEE Information System
• What is it?
  – Decentralized information system providing ways to discover services in a Grid infrastructure
  – Hierarchical structure of BDIIs (Berkeley Database Information Index)
  – Uses the OpenLDAP server (and Berkeley database) internally
  – Essential part of the infrastructure
• What is it used for?
  – Publishing Grid resource/service status information
  – Matchmaking of jobs/resources
  – Monitoring
  – Accounting
• Who uses it?
  – Nearly every gLite component (SE, CE, WMS, UI, ...)
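As an illustration of the matchmaking use case, here is a minimal sketch in Python. It assumes the information system's output is available as LDIF text using GLUE 1.3 attribute names (GlueCEUniqueID, GlueCEStateFreeCPUs, GlueCEAccessControlBaseRule). The hostnames, values, and the helper functions parse_ldif and match_ces are invented for this example; the parser is deliberately simplified (no line folding, no base64 values) and is not how a gLite component actually does it.

```python
# Hypothetical sample data: hostnames, sites and values are invented.
SAMPLE_LDIF = """\
dn: GlueCEUniqueID=ce1.example.org:2119/jobmanager-pbs-short,mds-vo-name=SITE-A,o=grid
objectClass: GlueCE
GlueCEUniqueID: ce1.example.org:2119/jobmanager-pbs-short
GlueCEStateFreeCPUs: 12
GlueCEAccessControlBaseRule: VO:atlas
GlueCEAccessControlBaseRule: VO:cms

dn: GlueCEUniqueID=ce2.example.org:2119/jobmanager-pbs-long,mds-vo-name=SITE-B,o=grid
objectClass: GlueCE
GlueCEUniqueID: ce2.example.org:2119/jobmanager-pbs-long
GlueCEStateFreeCPUs: 0
GlueCEAccessControlBaseRule: VO:atlas
"""


def parse_ldif(text):
    """Parse simplified LDIF: blank-line separated entries of 'key: value' lines."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, _, value = line.partition(": ")
            # Attributes may repeat, so every key maps to a list of values.
            entry.setdefault(key, []).append(value)
        entries.append(entry)
    return entries


def match_ces(entries, vo, min_free_cpus=1):
    """Return IDs of computing elements open to `vo` with enough free CPUs."""
    matches = []
    for entry in entries:
        if "GlueCE" not in entry.get("objectClass", []):
            continue
        free = int(entry["GlueCEStateFreeCPUs"][0])
        allowed = f"VO:{vo}" in entry.get("GlueCEAccessControlBaseRule", [])
        if allowed and free >= min_free_cpus:
            matches.append(entry["GlueCEUniqueID"][0])
    return matches


matching = match_ces(parse_ldif(SAMPLE_LDIF), "atlas")
print(matching)
```

With the sample data, only the first computing element is returned for the atlas VO, because the second one publishes zero free CPUs.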
The EGEE Information System
[Figure: Information Flow]
• Architecture
  – Resource-, site- and top-level BDII
  – One core component (BDII)
  – Each level caches information
  – Information flow follows the 'pull' principle
  – Data format is LDIF
  – DB is recreated every 2 min
• Facts and Figures
  – 70 top-level BDIIs worldwide
  – 256 site-level BDIIs
  – Refresh rate 2-5 min for top level
  – Data size ~32 MByte of LDIF data (30 s to add)
  – Top-level at CERN handles 4 million requests/day and serves the full database within 2.5 s; 8 head nodes at 20-30% average CPU load
  – BDII scales linearly (~32% speedup per additional core)
  – Requesting large data sizes (>1 MByte) slows the system down
  – Many supported platforms
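The pull principle with per-level caching can be sketched as a small simulation. This is not BDII code: the class InfoNode, its fields, and the sample entries are assumptions made for illustration, and the 120-second maximum age simply mirrors the 2-minute rebuild interval mentioned above.

```python
import time


class InfoNode:
    """One level of the hierarchy (resource, site or top). Hypothetical model."""

    def __init__(self, name, sources=(), local_entries=None, max_age=120.0):
        self.name = name
        self.sources = list(sources)          # lower-level nodes to pull from
        self.local_entries = dict(local_entries or {})
        self.max_age = max_age                # seconds before the cache is rebuilt
        self._cache = {}
        self._built_at = None

    def refresh(self, now=None):
        """Rebuild the cache from scratch by pulling from every source."""
        now = time.time() if now is None else now
        fresh = dict(self.local_entries)
        for source in self.sources:
            fresh.update(source.query(now))
        self._cache = fresh
        self._built_at = now

    def query(self, now=None):
        """Serve from the cache; rebuild first if it is missing or too old."""
        now = time.time() if now is None else now
        if self._built_at is None or now - self._built_at >= self.max_age:
            self.refresh(now)
        return dict(self._cache)


# Resource level publishes local data; site and top level only aggregate.
ce = InfoNode("resource-ce", local_entries={"GlueCEUniqueID=ce1": {"FreeCPUs": 12}})
se = InfoNode("resource-se", local_entries={"GlueSEUniqueID=se1": {"Status": "Production"}})
site = InfoNode("site", sources=[ce, se])
top = InfoNode("top", sources=[site])

first = top.query(now=0.0)                    # triggers a pull through all levels
ce.local_entries["GlueCEUniqueID=ce1"] = {"FreeCPUs": 3}
stale = top.query(now=60.0)                   # still served from the cache
fresh = top.query(now=120.0)                  # cache expired, new value is pulled
print(stale["GlueCEUniqueID=ce1"], fresh["GlueCEUniqueID=ce1"])
```

The sketch shows the trade-off of caching at each level: a change at the resource level only becomes visible at the top once the caches above it have been rebuilt.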
The EGEE Information System
• How am I involved?
  – Performance tests
  – Developments
      • Compressed Content Exchange mechanism (32 MByte reduced to 3.3 MByte)
      • BDII statistics
      • Monitoring the Information System infrastructure using Nagios
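Why LDIF content compresses so well can be seen with a small experiment. The slides do not say which algorithm the Compressed Content Exchange mechanism uses; the sketch below assumes zlib (DEFLATE) purely for illustration, and the generated entries (hostnames, service type) are invented. The ratio it prints applies to this synthetic data only and is not the 32 MByte to 3.3 MByte figure measured on the real system.

```python
import zlib


def build_sample_ldif(n_entries):
    """Generate repetitive, LDIF-like text. All values are made up."""
    blocks = []
    for i in range(n_entries):
        blocks.append(
            f"dn: GlueServiceUniqueID=service{i}.example.org,"
            f"mds-vo-name=SITE-{i % 50},o=grid\n"
            "objectClass: GlueService\n"
            f"GlueServiceUniqueID: service{i}.example.org\n"
            "GlueServiceType: org.glite.example\n"
            "GlueServiceStatus: OK\n"
        )
    return "\n".join(blocks).encode("utf-8")


raw = build_sample_ldif(5000)
packed = zlib.compress(raw, 6)        # assumption: zlib at compression level 6
restored = zlib.decompress(packed)    # the receiving side gets identical LDIF back

ratio = len(packed) / len(raw)
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, ratio: {ratio:.2f}")
```

Attribute names and DN suffixes repeat in every entry, which is what a dictionary-based compressor exploits.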
[Figure: Client execution time when requesting the same information (169 entries) from MySQL, Oracle RAC and OpenLDAP. x-axis: parallel requests (9 to 99); y-axis: time [sec] (0 to 1.6); series: OpenLDAP (138 KB), MySQL (35 KB), Oracle LB (35 KB)]
[Figure: Comparison of OpenLDAP with LDBM and BDB backends at 100 KB requested data size. x-axis: parallel requests (1 to 29); y-axis: avg. response time [sec] (0 to 5); series: LDBM, BDB, LDBM timeouts, BDB timeouts]
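A measurement of this kind (average response time as a function of the number of parallel requests) can be sketched as follows. The harness is an assumption, not the one used for the plots: fake_query is a stand-in that only sleeps, and would have to be replaced by a real client call against the server under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(delay=0.01):
    """Stand-in for one client request; returns its own duration in seconds."""
    start = time.perf_counter()
    time.sleep(delay)  # a real test would run the LDAP or SQL query here
    return time.perf_counter() - start


def measure(parallel_requests, query=fake_query):
    """Fire `parallel_requests` queries at once and return the average duration."""
    with ThreadPoolExecutor(max_workers=parallel_requests) as pool:
        futures = [pool.submit(query) for _ in range(parallel_requests)]
        durations = [future.result() for future in futures]
    return sum(durations) / len(durations)


# Same x-axis idea as the plots: step up the number of parallel clients.
results = {n: measure(n) for n in (1, 3, 5)}
for n, avg in results.items():
    print(f"{n:2d} parallel requests: avg {avg:.4f} s")
```

Timeouts, which the second plot reports separately, are not modelled here.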
[Figure: BDII statistics tree: o=infosys (Entries, TimeToAdd, TimeToSearch, ServedRequests) and o=grid…]
GLUE Schema
• What is it?
  – Grid Laboratory Uniform Environment
  – Defines a common conceptual data model to be used for Grid resource/service discovery
  – Working group is part of the Open Grid Forum (OGF)
  – Available as version 1.3 (http://forge.ogf.org/)
• Latest news:
  – GLUE 2.0 in progress:
      • Designed to address the problems found in 1.3
      • Not backward compatible with 1.3
      • Goes out for public comment at the end of April
• How am I involved?
  – Chairing the storage schema discussion
  – Rendering the relational model of the schema
GSTAT
• What is GSTAT?
  – Displays summarized information about a Grid infrastructure
  – Content checking on the IS
  – Monitors dynamic values from the information system
  – Database-driven architecture where data is gathered by agents developed by Academia Sinica (Taipei)
• Why another version?
  – "evolved over the past few years from a simple cgi script which displayed the summary of the grid infrastructure to a production service"
  – Optimizing the visualization: the CERN-PROD HTML page is 500 KB (source only) + 744 pictures of 14 KByte each ≈ 10.5 MByte
  – New requirements/use cases
• What is the plan?
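The page-weight figure quoted for CERN-PROD can be checked with a few lines of arithmetic. The numbers are the ones from the slide; the function name page_weight_kb is made up for this sketch.

```python
HTML_KB = 500        # page source only
IMAGE_COUNT = 744    # pictures embedded in the page
IMAGE_KB = 14        # size of each picture


def page_weight_kb(html_kb, image_count, image_kb):
    """Total transfer size of the page in KByte."""
    return html_kb + image_count * image_kb


total_kb = page_weight_kb(HTML_KB, IMAGE_COUNT, IMAGE_KB)
image_share = (IMAGE_COUNT * IMAGE_KB) / total_kb

print(f"total: {total_kb} KByte, images: {image_share:.0%} of the page")
```

The total comes to 10,916 KByte, which lands between 10.5 and 11 MByte depending on whether 1 MByte is counted as 1000 or 1024 KByte. The pictures account for roughly 95% of it, which is why the visualization is the obvious place to optimize.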
VOMRS
• Just started
  – Take over work from Lanxin
  – Nothing to report (yet)