CHEP03 1
Analysis of CMS Heavy Ion Simulation Data Using
ROOT/PROOF/Grid
Jinghua Liu for
Pablo Yepes, Jinghua Liu
Rice University, Houston, TX
Maarten Ballintijn, Gunther Roland, Bolek Wyslouch, Jinlong Zhang
MIT, Cambridge, MA
Supported by NSF grants #0218603, #0219063
CHEP03 2
Outline
From the data analysis user’s point of view:
Why: ROOT/PROOF/Grid
How: Step by Step
What: Test Result Summary
Other PROOF talks in this conference:
Fons Rademakers
Maarten Ballintijn
CHEP03 3
ROOT/PROOF
ROOT as a data analysis tool
PROOF: Parallel ROOT Facility, based on and part of ROOT, running on clusters of heterogeneous machines
• parallel analysis of objects in a set of files
• parallel execution of scripts
Transparency, Scalability, Adaptability, Error handling, Authentication
“Bring the KB to the PB, not the PB to the KB” (KB: code --> CPU; PB: data)
Use distributed CPUs to analyze distributed data
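The "send the code to the data" idea can be pictured with a toy sketch. It assumes each worker fills a private histogram from the values stored on its own node, and only those small histograms travel back to be added bin by bin. The names `Histogram`, `fillLocal` and `mergeInto` are hypothetical and are not part of ROOT or PROOF.

```cpp
#include <cstddef>
#include <vector>

// Toy fixed-range histogram (hypothetical; it only stands in for a ROOT TH1F).
struct Histogram {
    double lo, hi;
    std::vector<long> bins;
    Histogram(std::size_t n, double lo_, double hi_) : lo(lo_), hi(hi_), bins(n, 0) {}
    void fill(double x) {
        if (x < lo || x >= hi) return;  // ignore out-of-range values
        std::size_t i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the edge
        ++bins[i];
    }
};

// Work done near the data: one worker histograms only its local values.
Histogram fillLocal(const std::vector<double>& localData,
                    std::size_t nbins, double lo, double hi) {
    Histogram h(nbins, lo, hi);
    for (double x : localData) h.fill(x);
    return h;
}

// Work done on the master: add one worker's partial result bin by bin.
void mergeInto(Histogram& total, const Histogram& part) {
    for (std::size_t i = 0; i < total.bins.size(); ++i) total.bins[i] += part.bins[i];
}
```

In this picture the master calls `mergeInto` once per worker, so the traffic is proportional to the histogram size rather than to the data size.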
CHEP03 4
PROOF/Grid Interface
Use a Grid Resource Broker to detect which nodes in a cluster can be used in the parallel session
Use Grid File Catalogue and Replication Manager
Utilize Grid Monitoring Services
Support Globus Authentication
Abstract Grid interface
CHEP03 5
Step by Step
Set up PC cluster(s) (for PROOF/Grid)
Prepare the data files
Write analysis code (algorithm)
Compile a data set for PROOF
Run a PROOF job
Get the results
CHEP03 6
PC Clusters
Client machine (desktop): P4 @ 1.8GHz / 512MB / 40GB
Cluster 1:
2 Dual Xeon @ 2.4GHz / 1GB / 360GB
1 Dual Athlon @ 1.73GHz / 1GB / 240GB
8 Dual PIII @ 400MHz / 512MB / 60GB
Cluster 2:
3 Dual Athlon @ 1.67GHz / 2GB / 200GB
Operating systems: RedHat 6.1, RedHat 7.3, Slackware 8.1
Globus version: 2.2
CHEP03 7
CMS Heavy Ion Simulation
Jet & high-pT particle angular correlation
Use calorimeters only
CHEP03 8
CMS Heavy Ion Simulation
Pythia (event generator): 10,000 jet events
Hijing (heavy ion event generator): 1000 events
Each Hijing event (dN/dy ~ 5000) was divided into ~500 sub-events
Randomly recombine 500 sub-events (from different events) to form a new Hijing event, a cheap way to obtain more Monte Carlo events
CMSIM (GEANT 3 based simulation program for CMS)
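The sub-event recombination can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how sub-events were cut or selected, so the sketch assumes slot k of the new event is filled with sub-event k of a randomly chosen source event, and that no source event is used twice. `Particle`, `SubEvent`, `Event` and `mixEvent` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical minimal particle record; the real sub-event content is not given in the slides.
struct Particle { double pt, eta, phi; };
using SubEvent = std::vector<Particle>;
using Event = std::vector<SubEvent>;  // one generated event, already cut into sub-events

// Build one mixed event from a pool of source events.
// If sourcesOut is given, it receives the index of the source event used for each slot.
Event mixEvent(const std::vector<Event>& pool, std::size_t nSub, std::mt19937& rng,
               std::vector<std::size_t>* sourcesOut = nullptr) {
    if (pool.size() < nSub) throw std::invalid_argument("need at least nSub source events");

    // Random permutation of the source-event indices.
    std::vector<std::size_t> order(pool.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::shuffle(order.begin(), order.end(), rng);

    Event mixed;
    mixed.reserve(nSub);
    for (std::size_t k = 0; k < nSub; ++k) {
        mixed.push_back(pool[order[k]].at(k));  // sub-event k taken from source event order[k]
    }
    if (sourcesOut) {
        sourcesOut->assign(order.begin(), order.begin() + static_cast<std::ptrdiff_t>(nSub));
    }
    return mixed;
}
```

Calling `mixEvent` repeatedly with the same generator yields many different recombined events from the same pool, which is the "cheap" part of the procedure.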
CHEP03 9
Data Production: Globus Jobs
[Diagram: a client PC connected to two Globus gatekeepers, one in front of PBS and one in front of Condor, each with a set of work nodes]
Globus used to submit & manage the jobs
No data replication (files were intentionally stored locally)
CHEP03 10
Build ROOT Tree
Superimpose jet events on top of Hijing events and generate a ROOT Tree
Standalone code linked with ROOT libraries
CMS calorimeters:
ECal (Electromagnetic Calorimeter): barrel 61200 cells, endcap 14648 cells
HCal (Hadronic Calorimeter): 14616 cells (multi-layer), 4032 towers
calotree: ECal cells (energy, position), HCal towers (energy, position)
10,000 events were split into 100 files of 100 events each; file size ~160MB, total data 16GB
Data distributed; each node got some local files
CHEP03 11
TSelector – The Algorithms
Create TSelector from TTree
$ root
root[0] TFile f("heavyion001.root")
root[1] calotree->MakeSelector("myselector")
root[2] .q
$ ls
myselector.C myselector.h
Add the analysis code (algorithm) into TSelector
$ vi myselector.h
$ vi myselector.C
CHEP03 12
TSelector – The Algorithms
myselector.h
class myselector : public TSelector {
public:
   TTree *fChain;
   .
   .
private:
   TH1F *hist1d;   // 1D output histogram
   TH2F *hist2d;   // 2D output histogram
   .
   .
   .
};
CHEP03 13
TSelector – The Algorithms
myselector.C
void myselector::Begin(TTree *tree) {
   // book the output histograms and register them in the output list
   hist1d = new TH1F("DeltaPhi","DeltaPhi",100,-180.,180.);
   hist2d = new TH2F("EtaPhi","EtaPhi",100,-5.,5.,100,-4.,4.);
   fOutput->Add(hist1d);
   fOutput->Add(hist2d);
}
Bool_t myselector::Process(Int_t entry) {
   // user’s analysis code goes here!
   for (i=0; i<nclusters; i++) {
      if (Et1>5)
         for (j=i+1; j<nclusters; j++) {
            if (Et2>5) {
               DeltaPhi = …
               hist1d->Fill(DeltaPhi);
            }
         }
   }
   return kTRUE;
}
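The slide leaves the DeltaPhi expression elided. As a standalone illustration only, and not the authors' code, a pair loop of this shape could look like the sketch below. It assumes azimuth angles in degrees and a difference wrapped into [-180, 180); `Cluster`, `deltaPhiDeg` and `pairDeltaPhi` are hypothetical names, and plain C++ containers are used instead of ROOT histograms.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical cluster record: transverse energy (same units as the threshold)
// and azimuth in degrees.
struct Cluster { double et; double phiDeg; };

// Azimuthal difference phi1 - phi2 wrapped into [-180, 180) degrees.
double deltaPhiDeg(double phi1, double phi2) {
    double d = std::fmod(phi1 - phi2, 360.0);  // now in (-360, 360)
    if (d >= 180.0) d -= 360.0;
    if (d < -180.0) d += 360.0;
    return d;
}

// Collect delta-phi for every unordered pair of clusters above the Et threshold.
std::vector<double> pairDeltaPhi(const std::vector<Cluster>& clusters, double etMin) {
    std::vector<double> out;
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        if (clusters[i].et <= etMin) continue;       // first cluster must pass the cut
        for (std::size_t j = i + 1; j < clusters.size(); ++j) {
            if (clusters[j].et <= etMin) continue;   // second cluster must pass the cut
            out.push_back(deltaPhiDeg(clusters[i].phiDeg, clusters[j].phiDeg));
        }
    }
    return out;
}
```

Inside a selector, each returned value would be passed to a Fill call on the one-dimensional histogram rather than stored in a vector.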
CHEP03 14
TDSet – Data Location
Specify a collection of TTrees or files
[] TDSet *ds = new TDSet("TTree", "calotree");
[] ds->Add("/data1/cms/cmsim/heavyion001.root");
[] ds->Add("/data1/cms/cmsim/heavyion002.root");
…
[] ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");
[] ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");
…
[] ds->Print();
The file list can be returned by a DB or File Catalog query, etc.
It’s better to put these commands into a macro
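One way to avoid typing every Add call by hand is to generate the file names in a loop. The helper below is a hypothetical sketch in plain C++: `makeFileList` is not a ROOT function, and the zero-padded three-digit numbering is inferred from the names shown on the slide. In a ROOT macro, each returned name would be handed to the data set's Add call.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Build names like <dir>/heavyion001.root for file numbers first..last (inclusive).
std::vector<std::string> makeFileList(const std::string& dir, int first, int last) {
    std::vector<std::string> files;
    for (int n = first; n <= last; ++n) {
        char name[32];
        std::snprintf(name, sizeof(name), "heavyion%03d.root", n);  // zero-padded file number
        files.push_back(dir + "/" + name);
    }
    return files;
}
```

For files spread over several nodes or catalogue entries, the directory argument would differ per node, so the function would be called once per location.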
CHEP03 15
Running a PROOF Job
$ root
[] gROOT->Proof("proofmaster.rice.edu");
[] TDSet *ds = new TDSet("TTree", "calotree");
[] ds->Add(". . .");
. . .
[] ds->Process("myselector.C+", "options", nentries, first);
(note: options must be pre-coded in myselector.C)
[] TH1F *h1=(TH1F *)gProof->GetOutput("DeltaPhi");
[] h1->Draw();
CHEP03 16
Angular Correlation
CHEP03 17
Scaling plot
[Plot: analysis speed vs. number of CPUs (PIII 1GHz equivalent)]
CPU power/data size balanced
CPU intensive calculations
CHEP03 18
Summary
CMS Heavy Ion Analysis implemented and tested with PROOF
Scales well with the number of CPUs
PROOF/Grid can provide data analysis power that is unavailable otherwise. This power can be achieved without much extra effort
The PROOF/Grid interface is under rapid development. The plan is to extend the presented study to use the Grid interface
CHEP03 19
The End