Slide 1: PROOF – The Parallel ROOT Facility
Gerardo Ganis / CERN
CHEP06, Computing in High Energy Physics, 13 – 17 Feb 2006, Mumbai, India
"Bring the KB to the PB and not the PB to the KB"
Slide 2: What is this talk about
G. Ganis, CHEP06, 13 Feb 2006
End-user analysis of HEP data on distributed systems using the ROOT data model.

ROOT? A package providing:
- efficient data storage supporting structured data sets
- an efficient query system to access the information
- a complete set of tools for scientific analysis
- advanced 2D / 3D visualization and GUI systems
- a C++ interpreter

HEP data? Collections of independent events.
Slide 3: The ROOT data model – Trees & Selectors
Selector workflow (loop over events):
- Begin(): create histos, ..., define the output list
- Process(): preselection, then analysis of each event that passes
- Terminate(): final analysis (fitting, ...), fill the output list

[Diagram: a Chain holds events 1, 2, ..., n, ..., last; each event is
structured into branches and leaves, and only the needed parts are read.]
Slide 4: End-user analysis scenarios
Analysis performance is typically I/O bound.

Analysis requirements of the LHC experiments (~100 users):
- CPU: ~ MSI2k (~250 Intel Core Duo)
- Storage: ~0.3 – 1.6 PB
- Bandwidth: ~100 MB/s

The intrinsic parallelism of the data is classically exploited by splitting
long analysis jobs into smaller sub-jobs, each addressing a different portion
of the data, submitted (run) concurrently.

Farms of a few hundred nodes; good access to data required.
Slide 5: The "classic" approach
[Diagram: the user's query (myAna.C plus a data file list) is split manually
into jobs and submitted to the queues of a batch farm; the manager schedules
the jobs on worker nodes, which read files from storage via the catalog; the
outputs are merged by hand for the final analysis.]

- "static" use of resources: jobs frozen, 1 job / worker node
- "manual" splitting and merging
- limited monitoring (only at the end of each single job)
Slide 6: The PROOF approach
[Diagram: the PROOF query (data file list plus myAna.C) goes to the MASTER of
a PROOF farm; a scheduler assigns workers, which read files from storage via
the catalog; merged outputs and merged feedbacks are returned to the user.]

- farm perceived as an extension of the local PC: same macro and syntax as in
  a local session
- more dynamic use of resources, real-time feedback, automated splitting and
  merging
Slide 7: PROOF – Multi-tier architecture
[Diagram annotations: a good connection is VERY important between master and
workers, less important between client and master; each node runs proofd.]

Optimize for data locality or for efficient data server access. The
architecture adapts to clusters of clusters or to wide-area virtual clusters:
geographically separated domains, heterogeneous machine types.
Slide 8: PROOF ingredients
- PROOF servers are full-featured ROOT applications
- communication layer set up via light daemons ((x)proofd);
  authentication: password-based, GSI, Kerberos, ...
- dynamic load balancing via a pull architecture: workers ask for work when
  finished, so faster workers get more work
- merging infrastructure via Merge(), implemented for standard objects
  (histograms, trees, ...); user-defined strategy by overloading / defining
  Merge()
- feedback at tunable frequency: standard statistics histograms
  (events/packets per node, ...), temporary version of any output object
- package manager for optimized upload of additional libraries needed by the
  analysis
Slide 9: PROOF – Scalability
32 nodes: dual Itanium II 1 GHz CPUs, 2 GB RAM, 2 x 75 GB 15K SCSI disks,
1 Fast Ethernet, 1 Gbit Ethernet NIC (not used).

Each node has one copy of the data set (4 files, 277 MB in total); over 32
nodes: 8.8 GB in 128 files, 9 million events. Efficiency ~90%.

8.8 GB, 128 files: 1 node takes 325 s; 32 nodes in parallel take 12 s.
This is the case of data locality.
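As a cross-check, the speedup and parallel efficiency follow from the two quoted run times; the raw times give roughly 85%, consistent with the quoted ~90% once fixed per-query startup costs are allowed for:

```latex
S = \frac{T_1}{T_{32}} = \frac{325\ \mathrm{s}}{12\ \mathrm{s}} \approx 27,
\qquad
\varepsilon = \frac{S}{N} = \frac{27}{32} \approx 0.85
```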
Slide 10: PROOF – Data access and scheduling issues
Low latency in data access is essential:
- minimize file-opening overhead (asynchronous open)
- caching, asynchronous read-ahead of the required segments
- supported by xrootd

Scheduling of large numbers of users:
- interface to a generic resource broker to optimize the load
- batch systems have this already: concrete implementations exist for LSF and
  Condor; planned for BQS, Sun Grid Engine, ...
- on the GRID, use the available services to determine the session
  configuration
F. Furano, #368, Feb 15th, 16:20
A. Hanushevsky, #407, Feb 15th, 17:00
Slide 11: PROOF @ GRID
[Diagram: a ROOT user session goes through the Grid/Root authentication and
the Grid access control service; via the TGrid UI / queue UI and the Grid
file/metadata catalogue, the client retrieves the list of logical files
(LFN + MSN), and proofd startup proceeds through the GRID service interfaces.
A PROOF master server coordinates PROOF sub-master servers, one per site;
guaranteed site access is obtained through the sub-masters calling out to the
master (agent technology). The PROOF slave servers access data via xrootd
from local disk pools.]

Demoed by ALICE at SC03, SC04, ...
Slide 12: What is our goal?
Slide 13: Typical end-user job-length distribution
- interactive analysis using local resources, e.g. end-analysis calculations,
  visualization
- analysis jobs with well-defined algorithms (e.g. production of personal
  trees)
- medium-term jobs, e.g. analysis design and development, also using
  non-local resources

Goal: bring these to the same level of perception.
Slide 14: Sample of analysis activity
Monday at 10h15, ROOT session on my laptop:
- AQ1: a 1 s query produces a local histogram
- AQ2: a 10 min query submitted to PROOF1
- AQ3 -> AQ7: short queries
- AQ8: a 10 h query submitted to PROOF2

Monday at 16h25, ROOT session on my laptop:
- BQ1: browse results of AQ2
- BQ2: browse temporary results of AQ8
- BQ3 -> BQ6: submit four 10 min queries to PROOF1

Wednesday at 8h40, browse from any web browser:
- CQ1: browse results of AQ8, BQ3 -> BQ6
Slide 15: What do we have now?
Most of the ingredients are in place:
- support for multiple sessions
- asynchronous (non-blocking) running mode
- support for disconnect / reconnect
- xrootd used as launcher of server sessions
- query-results classification and management: retrieve / archive / remove
- command-line control via the TProof API ...
- ... but also GUI control

Virtual demo
Slide 16: A real PROOF session – connection

[Screenshot: choose a predefined session or define a new one; a progress bar
shows the session startup, and the status becomes "ready".]
Slide 17: A real PROOF session – package manager

[Screenshot: the Package tab controls the setup of each worker.]

PAR (Proof ARchive): a ROOT-INF directory containing BUILD.sh and SETUP.C.
Slide 18: A real PROOF session – query definition and running

[Screenshot: execute a named macro to create the chain, select the chain,
choose the selector; feedback histograms and processing information are
shown while the query runs.]
Slide 19: A real PROOF session – query browsing and finalization

[Screenshot: details about the query, a folder with the output objects, the
raw histogram, and its finalization.]
Slide 20: A real PROOF session – disconnection / reconnection

Running sessions are kept alive by the server-side coordinator. Reconnection
is much faster: there is no process to fork.

[Screenshot: after disconnect / reconnect, the query is now terminated.]
Slide 21: A real PROOF session – chain viewer

[Screenshot: right-click the chain to open the chain viewer.]
Slide 22: Ongoing activities / near-future plans
- Data file upload manager: optimally distribute data on the cluster storage;
  keep track of existing data sets for optimized re-runs
- Dynamic cluster configuration: come-and-go functionality for worker nodes;
  olbd network to get information about the load on the cluster
- Optimizations: packetizer (re-assignment of packets being processed to
  fast idle slaves); data access (fully exploit the asynchronous features of
  xrootd)
- Monitoring of cluster behaviour: MonAlisa allows the definition of ad hoc
  parameters, e.g. I/O / node / query
- Improved handling of error conditions: identify cases hanging the system,
  improve error logging, ...; exploit the olbd control network for a better
  overview of the cluster
- Testing and consolidation
- Documentation
Slide 23: Who's using PROOF?
- ALICE (see, e.g., PROOF@AliEn at CHEP04; LHCC review, Nov 2005)
- PHOBOS
- CMS analysis prototype

Development test-beds:
- currently using CERN phased-out machines: 35 dual Pentium III 800 MHz /
  512 MB RAM, 100 Mbit/s Ethernet, 600 GB total hard disk
- contact with Lyon / CC-IN2P3 (ALICE): up to 16 dual Xeon 2.8 GHz, 200 GB
  scratch each
- request for a new test-bed at CERN (LCG / ALICE / CMS)

M. Ballintijn, #374, Feb 15th, 16:40
I. Gonzáles, #267, Feb 14th, 16:00
Slide 24: People working on the project

B. Bellenot, M. Biskup, R. Brun, G. Ganis, J. Iwaszkiewicz, G. Kickinger,
P. Nilsson, A. Peters, F. Rademakers (CERN); M. Ballintijn, C. Loizides,
C. Reed (MIT); D. Feichtinger (PSI); P. Canal (FNAL)
Slide 25: The End

Questions?

Links:
- http://root.cern.ch/root/PROOF.html
- Discussion topic at the ROOT forum: http://root.cern.ch/phpBB2/
Slide 26: PROOF – backup slides

- Additional recent improvements
- Stateless connection with XrdProofd
- XrdProofd basics
- Connection layer
- Pull architecture
- PROOF@AliEn: command-line session
Slide 27: Additional recent improvements

- Session startup: parallel startup with threads; optimized sequential
  startup; startup status and progress bar
- New progressive packetizer: opens files as needed, continuously
  re-estimates the number of entries, can order files based on availability
- Draw() and viewer functionality via PROOF
Slide 28: Interactive batch – stateless connection with XrdProofd

Keeping sessions alive when clients disconnect needs coordination; the
existing proofd required a deep re-design. xrd (the networking and
work-dispatching layer of xrootd) is a generic framework to handle
protocols, already used to launch rootd for TNetFile clients.

A dedicated protocol launches and manages PROOF sessions:
- forked in separate processes to protect from client bugs
- talking to the coordinator via a UNIX socket

Disconnect / reconnect is handled naturally. Asynchronous reading allows the
setup of a control-interrupt network independent of OOB. The xrd/olbd
control network is used to query status information.

A. Hanushevsky, #407, Feb 15th, 17:00
Slide 29: XrdProofd basics

Prototype based on XROOTD.

[Diagram: in XROOTD, XrdXrootdProtocol handles the links, files and
multi-threading machinery for each user; in XPD, XrdProofdProtocol plays the
same role for proofserv. XrdProofdProtocol is the client gateway to
proofserv, with a static area holding all client information and its
activities, shared across users and worker servers.]
Slide 30: XrdProofd communication layer

[Diagram: the client (xc = TXSocket) connects over XRD links to the XrdProofd
on the master, which forks a proofserv process; the master's XrdProofd
connects in turn to the XrdProofd on each slave (1 ... n), each of which
forks a proofslave process. XS marks the server-side socket endpoints.]
Slide 31: Workflow – pull architecture

Workers ask the master for a packet of work whenever they finish the
previous one; dynamic load balancing is naturally achieved.
Slide 32: PROOF @ AliEn – command-line session

TGrid: abstract interface for all services.

  // Connect
  TGrid *alien = TGrid::Connect("alien://");

  // Query
  TString path = "/alice/cern.ch/user/p/peters/analysis/miniesd/";
  TGridResult *res = alien->Query(path, "*.root");

  // Create chain from list of files
  TChain chain("Events", "session", res->GetFileInfoList());

  // Open a PROOF session
  TProof *proof = TProof::Open("proofmaster");

  // Process your query
  chain.Process("selector.C");