Rod Walker IC 13th March 2002

SAM-Grid Middleware

http://d0db.fnal.gov/sam

SAM.

JIM.

RunJob.

Conclusions.

- Rod Walker, ICL.

SAM stands for “Sequential Access to Data via Metadata”. Access is sequential within files; the order of the files themselves isn’t important, as is typical of HEP data.

History of SAM
Project started in 1997 by the FNAL Computing Division (not just physicists). Meant for FNAL experiments, and recently taken up by CDF. So far ~20 FTE years – a lot of effort.

State of the art in Data Management
No-one else has tried to deliver TBs of user-selected data on demand.

Global file routing

• Many remote stations want files.
  – SAM allowed a free-for-all to the gridftp server.
  – MSS access only from FNAL site, cache on private network, ...

• Needed control and routing.

• Solution: all sites can route files, e.g.
  – Get fnal files from fnal-router.
  – route=fnal.gov::nijmegen, and the nijmegen station has route=fnal.gov::fnal-router.

• Janet – Geant – Esnet – FNAL: 155 Mbit bottleneck.

• Janet – Geant – Surfnet – FNAL: Gbit(?)
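
The route settings above can be read as a chain of per-station forwarding rules. The following is a minimal sketch of that idea, not SAM's actual implementation: it assumes that `route=fnal.gov::nijmegen` means "fetch files originating at fnal.gov via the nijmegen station". The `ROUTES` table, the `resolve_route` function and the requesting station name `imperial` are hypothetical names introduced for illustration.

```python
from typing import Dict, List

# Hypothetical per-station routing tables (an assumption, not SAM's format):
# for files whose source is the given domain, fetch them via the named station.
ROUTES: Dict[str, Dict[str, str]] = {
    "imperial": {"fnal.gov": "nijmegen"},      # route=fnal.gov::nijmegen
    "nijmegen": {"fnal.gov": "fnal-router"},   # route=fnal.gov::fnal-router
    "fnal-router": {},                         # last hop: no further route entry
}


def resolve_route(station: str, source_domain: str) -> List[str]:
    """Follow route entries hop by hop, from the requesting station to the last hop."""
    path = [station]
    while source_domain in ROUTES.get(path[-1], {}):
        next_hop = ROUTES[path[-1]][source_domain]
        if next_hop in path:
            # A station appearing twice would mean the tables form a loop.
            raise ValueError(f"routing loop via {next_hop}")
        path.append(next_hop)
    return path


print(resolve_route("imperial", "fnal.gov"))
# prints ['imperial', 'nijmegen', 'fnal-router']
```

Keeping one small table per station means a site behind a private network only needs to know its next hop, not the full path to FNAL.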

SAM Status

• Middleware Development
  • Global routing.
  • Diverse deployments, e.g. private network, firewall, shared vs local disk cache.
  • CDF deployment – GridPP
  • Bug fixes.
  • GridFTP and Authentication – GridPP

• Outlook
  • Decreasing development. FNAL CD support for RunII.

JIM history

• Purpose: to build on SAM’s data handling, to create a real grid.
  • Job definition & management
  • Information & Monitoring

• Novel concepts
  • Already have DH system.
  • ups/upd packaging and deployment.
    • rpm functionality plus multi-platform, tailoring.
    • Little dependence on native installation, e.g. python v2.1f
    • Hugely simplified deployment.
  • Use Condor as resource broker.

JIM components

• User Interface
  • Job Definition Language based on ClassAds.

• RB reduced to making MMS ranking function.
  • Static & dynamic constraints: os, code version, freecpu, …
  • Plus external function to query DH system.

• Collaboration with Wisconsin.
  • Choose gatekeeper, use external function, separate submission server from negotiator.
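
The ranking idea on this slide (filter resources by static and dynamic constraints, then rank the survivors using an external data-handling query) can be sketched roughly as follows. This is an illustrative assumption, not Condor's matchmaking code or the JIM broker: `cached_fraction`, `match`, the attribute names and the site names are all hypothetical, and the data-handling query is replaced by a hard-coded lookup.

```python
from typing import Any, Callable, Dict, List, Optional

Ad = Dict[str, Any]


def cached_fraction(site: str, dataset: str) -> float:
    """Hypothetical stand-in for the external query to the data-handling system:
    the fraction of the dataset's files already cached at the site."""
    cache = {("site-a", "ds1"): 0.9, ("site-b", "ds1"): 0.2}
    return cache.get((site, dataset), 0.0)


def match(job: Ad, resources: List[Ad],
          dh_query: Callable[[str, str], float] = cached_fraction) -> Optional[str]:
    """Return the name of the best resource for the job, or None if nothing qualifies."""
    # Constraints: operating system, installed code version, free CPUs.
    eligible = [
        r for r in resources
        if r["os"] == job["os"]
        and job["code_version"] in r["code_versions"]
        and r["free_cpus"] >= job["cpus"]
    ]
    if not eligible:
        return None
    # Rank: prefer the site that already holds most of the requested data.
    best = max(eligible, key=lambda r: dh_query(r["name"], job["dataset"]))
    return best["name"]


job = {"os": "linux", "code_version": "p13", "cpus": 2, "dataset": "ds1"}
resources = [
    {"name": "site-a", "os": "linux", "code_versions": ["p13"], "free_cpus": 4},
    {"name": "site-b", "os": "linux", "code_versions": ["p13"], "free_cpus": 8},
    {"name": "site-c", "os": "irix", "code_versions": ["p13"], "free_cpus": 16},
]
print(match(job, resources))
```

Passing the data-handling query in as a function mirrors the "external function" on the slide: the broker itself stays generic and the experiment-specific knowledge lives outside it.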

JIM components

• Information & Monitoring.
  • Currently: grid sensors > LDAP > MDS > PHP
  • Developing: grid sensors > XML > native DB > PHP, other.
  • Reliability, flexibility, persistency.
  • Same model works for grid system book-keeping and user level monitoring.
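
As a rough illustration of the "grid sensors > xml" step in the pipeline under development, the sketch below serialises one sensor reading to XML and parses it back. The element and attribute names (`site`, `attribute`, `name`) and both function names are assumptions made for this example; the slides do not give JIM's actual schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def sensor_report(site: str, readings: Dict[str, object]) -> str:
    """Serialise one site's sensor readings as an XML document (hypothetical schema)."""
    root = ET.Element("site", name=site)
    for key, value in readings.items():
        ET.SubElement(root, "attribute", name=key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def load_report(xml_text: str) -> Dict[str, str]:
    """Parse a report back into a name -> value mapping, as a consumer might."""
    root = ET.fromstring(xml_text)
    return {a.get("name"): a.text for a in root.findall("attribute")}


doc = sensor_report("site-a", {"free_cpus": 12, "os": "linux"})
print(doc)
print(load_report(doc))
```

Because the report is a self-describing document, the same format could be stored for book-keeping or rendered for user-level monitoring, which is the point the last bullet makes.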

Information Flow

[Diagram. Labels shown: User Interface, Parser, JDL, ClassAd, Cin/Cout, Condor Schedd, Condor Collector, Condor Negotiator, External Code, Condor Grid Manager, Condor-G, GRAM, Gatekeeper, Batch System, Compute Resource, Grid Sensors, Execution Site, Information And Monitoring.]

RunJob

• Vital tool for D0 MC productions on farms.
  • Chains, steers and parallelizes D0 executables. Creates metadata. Uses SAM to store to MSS.

• Now interfaced to SAM for input, and can handle real data and any D0 executables.
  • Will be used for skimming, re-processing datasets, and user analysis.
  • Fully automate monitoring, checking and storage.
  • Work underway by UK.
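
The "chains, steers and parallelizes" role described above can be sketched as a small driver that runs a sequence of stages per job, records metadata, and runs several jobs concurrently. This is only a toy model under stated assumptions, not RunJob's code: the stage functions `generate`, `simulate` and `reconstruct` are hypothetical placeholders for real executables, and the metadata fields are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Stage = Callable[[str], str]


# Hypothetical stages: each takes an input name and returns an output name.
def generate(name: str) -> str:
    return name + ".gen"


def simulate(name: str) -> str:
    return name + ".sim"


def reconstruct(name: str) -> str:
    return name + ".reco"


def run_chain(job_name: str, stages: List[Stage]) -> Dict[str, object]:
    """Run the stages in order, feeding each output to the next, and collect metadata."""
    current = job_name
    outputs = []
    for stage in stages:
        current = stage(current)
        outputs.append(current)
    return {
        "job": job_name,
        "stages": [s.__name__ for s in stages],
        "outputs": outputs,
        "final_output": current,
    }


def run_parallel(job_names: List[str], stages: List[Stage],
                 workers: int = 4) -> List[Dict[str, object]]:
    """Run one chain per job concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: run_chain(n, stages), job_names))


results = run_parallel(["mc001", "mc002"], [generate, simulate, reconstruct])
for record in results:
    print(record["job"], "->", record["final_output"])
```

In a real production the metadata record would be what gets declared to the data-handling system when the final output is stored.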

RunJob status

• Maintenance & development of RunJob, and interface to SAM-Grid, entirely by UK.

• CMS using branch of RunJob for production.
  • Dave Evans and Greg Graham collaborating on merging branches.
  • Goal: single package with EDG and SAM-Grid interfaces.

• RunJob “server” or job-manager.

SAM-Grid Logistics

[Diagram. Labels shown: Site; User Interface; Submission; Global Job Queue; Grid Client; Match Making; Resource Selector; Info Collector; Info Gatherer; Global DH Services; SAM Naming Server; SAM Log Server; Resource Optimizer; SAM DB Server; RC MetaData Catalog; Bookkeeping Service; Data Handling; SAM Stager(s); SAM Station (+ other servs); MSS; Cache; Local Job Handling; Worker Nodes; Grid Gateway; Local Job Handler (CAF, RunJob, Vanilla, ...); JIM Advertise; Cluster; AAA; Dist. FS; Info Manager; XML DB server; Site Conf.; Glob/Loc JID map; Info Providers; MDS; Web Serv; Grid Monitoring; User Tools.]

Conclusions

o Core SAM supported by FNAL CD.
  o Operational support via software shifts.
  o UK currently contributes 2 experts on shift.

o JIM post-development support:
  o bug fixing, deployment issues (like SAM).
  o will need software support shifts.

o RunJob is and will be UK supported.
  o Expanding functionality – analysis, reprocessing.
  o Increasing deployment – D0 sites, CMS.

o On target for end-March deliverable, and production Grid in April.

JIM V1: Package dependencies

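[Diagram: dependency graph; the edges between packages are not recoverable from the transcript. Packages shown: jim_broker_client, xml_meta_configurator, sam_common, jim_info_providers, jim_broker, orbacus, sam_config, globus, jim_www, server_run, jim_advertise, galax, samgrid, jim_client, jim_jobmanagers, jim_sandbox.]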