

Post on 19-Apr-2020


Page 1: Lecture 1.1 What is PAT and How to use it? · What is the Physics Analysis Toolkit It serves as well tested and supported common ground for group and user PAT is a toolkit as part

Lecture 1.1

What is PAT and How to use it?

PAT Tutorial June 2010

Content

● A short reminder of the CMS EDM and Analysis Workflow

● The answer to the question: What is PAT?

● An introduction to the PAT DataFormats

● Configuration of the PAT DataFormats

● An introduction to the PAT Workflow

● Support and Documentation

Page 2

Reminder of the Event Data Model

● Configurable edm::Modules communicate with/via the EventContent

● Same file structure (i.e. ROOT) for: Gen-Sim-Digi-Reco-Analysis

● Single framework for Reconstruction (POGs) and Analysis (PAGs)

Page 3

Typical CMS Analysis Workflow

● Prompt reconstruction at Tier-0.

● Central skims at Tier-1's.

● Users run cmsRun at Tier-2's:

  ● Perform high level analysis steps.

  ● Preselect events.

  ● Write their own user-defined EventContent to private T2/T3 space.

● The latter step might be iterated.

● Copy reduced datasets to your favorite machine.

● Run your final analysis/produce plots.

PAT helps you to create a user-defined EventContent.

Page 4

What is the Physics Analysis Toolkit

PAT is a toolkit as part of the CMSSW framework

● It serves as a well tested and supported common ground for group and user analyses.

● It facilitates reproducibility and comprehensibility of analyses.

● It is an interface between the sometimes complicated EDM and the simple mind of the common user.

● You can view it as a common language between CMS analysts:

● If another CMS analyst describes a PAT analysis to you, you can easily know what he/she is talking about.

Page 5

Three Aspects of PAT

Interface

● simplifies access via DataFormats

● between RECO expertise & Analysis Level ('vertical integration')

● canalizes expertise (via POG & PAG contacts)

● crossing point between POGs & PAGs

Common Tool

● quick start into analysis for beginners

● approved algorithms & sensible defaults

● synergy (everybody can profit from recent developments)

Common Format

● facilitates transfer & comparisons

● sustained provenance

● PAG common configurations

Page 6

Facilitated Access to Event Information

● Note: Each PAT Candidate IS a corresponding reco::RecoCandidate (and more)

● Do you know how to access this event information within the EDM?

Diagram: reco::Candidate, surrounded by the following information:

● Isolation (different from defaults)

● Object Id, Cluster shapes

● BTag Algorithms, TagInfos

● Associated Tracks, JetCharge

● JetFlavor

● Generator Match, Trigger Match

● Correction Factors, Object Resolutions

● More, ...

● With PAT Candidates you get this just by calling member functions!
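As a rough illustration of "just calling member functions", the sketch below reads a few quantities from a jet-like object. The accessor names pt(), jetCharge(), partonFlavour() and bDiscriminator(label) are believed to match pat::Jet, but should be checked against SWGuidePATDataFormats. The StubJet class, its values and the discriminator label are purely hypothetical stand-ins so that the snippet runs outside CMSSW.

```python
class StubJet:
    """Hypothetical stand-in for a pat::Jet, used only to run this sketch."""

    def __init__(self, pt, charge, flavour, discriminators):
        self._pt = pt
        self._charge = charge
        self._flavour = flavour
        self._discriminators = discriminators

    def pt(self):
        return self._pt

    def jetCharge(self):
        return self._charge

    def partonFlavour(self):
        return self._flavour

    def bDiscriminator(self, label):
        return self._discriminators[label]


def summarize_jet(jet, btag_label):
    """Collect a few quantities by calling member functions on the jet."""
    return {
        "pt": jet.pt(),
        "jet_charge": jet.jetCharge(),
        "parton_flavour": jet.partonFlavour(),
        "b_discriminator": jet.bDiscriminator(btag_label),
    }


# Made-up values; the label is an assumption, not taken from the slides.
jet = StubJet(pt=45.0, charge=-0.2, flavour=5,
              discriminators={"trackCountingHighEffBJetTags": 3.1})
summary = summarize_jet(jet, "trackCountingHighEffBJetTags")
print(summary)
```

The point of the sketch is that the analysis code never has to look up separate association maps: everything is asked of the object itself.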

Page 7

The PAT Data Formats

● A PAT Candidate is a reco::RecoCandidate PLUS more.

● All pat::Objects inherit from their corresponding reco::RecoCandidates

Page 8

PAT Candidate Member Functions

Check the Documentation: SWGuidePATDataFormats

Page 9

Combine Flexibility and User Friendliness

Figure labels: Flexibility, User Friendliness, Maximal Configuration

● You can choose yourself whether you really need all the extra information that the PAT Candidates provide.

● Still, you don't need to know how EDM/PAT manages this access for you under the hood.

● The key is: configuration of DataFormats by cfi file! (E.g. for pat::Jets).

Page 10

Configuration of PAT DataFormats

    import FWCore.ParameterSet.Config as cms

    patJets = cms.EDProducer("PATJetProducer",
        ...
        # embedding of AOD items
        embedCaloTowers = cms.bool(False),
        embedPFCandidates = cms.bool(False),
        # jet energy corrections
        addJetCorrFactors = cms.bool(True),
        jetCorrFactorsSource = cms.VInputTag("patJetCorrFactors"),
        # btag information
        addBTagInfo = cms.bool(True),
        addDiscriminators = cms.bool(True),
        discriminatorSources = cms.VInputTag( ... ),
        # clone tag infos ATTENTION: these take lots of space!
        # usually the discriminators from the default algos
        # are sufficient
        addTagInfos = cms.bool(True),
        tagInfoSources = cms.VInputTag( ... ),
        # track association
        addAssociatedTracks = cms.bool(True),
        trackAssociationSource = cms.InputTag("ak5JetTracksAssociatorAtVertex"),
        # jet charge
        addJetCharge = cms.bool(True),
        jetChargeSource = cms.InputTag("patJetCharge"),
        # add jet ID
        addJetID = cms.bool(True),
        jetIDMap = cms.InputTag("ak5JetID"),
    )

Size: 14 kB/event (for ttbar)

You can configure the content of the DataFormats yourself (example: pat::Jet)!
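To illustrate how these switches steer the content of a pat::Jet, the following sketch turns off two of the optional items. The parameter names are taken from the configuration shown above; the helper slim_jet_content and the SimpleNamespace stand-in are hypothetical and exist only so that the snippet runs outside CMSSW. Inside a real configuration one would presumably apply the same attribute assignments to the actual module (for instance process.patJets, an assumed name).

```python
from types import SimpleNamespace


def slim_jet_content(jet_module, keep_tag_infos=False, keep_tracks=True):
    """Toggle optional pat::Jet content on a module-like object in place."""
    jet_module.addTagInfos = keep_tag_infos
    jet_module.addAssociatedTracks = keep_tracks
    return jet_module


# Stand-in carrying the defaults shown on the slide (only the flags used here).
pat_jets = SimpleNamespace(
    addBTagInfo=True,
    addDiscriminators=True,
    addTagInfos=True,
    addAssociatedTracks=True,
)

slim_jet_content(pat_jets, keep_tag_infos=False, keep_tracks=False)
print(pat_jets.addTagInfos, pat_jets.addDiscriminators)
```

Dropping the tag infos follows the hint in the configuration comments that the discriminators are usually sufficient; the discriminators themselves are left untouched.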

Page 11

The PAT Workflow

Have a look at: SWGuidePATWorkflow

Workflow stages (from the diagram):

● Pre-Production steps before PAT Candidate creation

● PAT Candidate creation

● Main collection (w/o cleaning)

● Main collection (with cleaning)

The workflow is mirrored by the structure of the python directory in the PatAlgos package (don't be shy, check it out!)

Page 12

EventContent of the default PAT Tuple

● Have a look at patEventContent_cff.py.

● Have a look at patTemplate_cfg.py.

Size: 20 kB/event (for ttbar)

● But decide for yourself what your PAT Tuple should look like (add reco::Tracks or reco::GenParticles to the Event Content or BTag information to the jets, etc.).
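The quoted size per event translates directly into disk space. As a back-of-the-envelope sketch, the snippet below multiplies the roughly 20 kB per event quoted above (for ttbar) by a sample size. The one million events and the decimal unit convention (1 GB = 10^6 kB) are assumptions made for illustration.

```python
def sample_size_gb(n_events, kb_per_event):
    """Estimate the size of a sample in GB, using 1 GB = 1e6 kB."""
    return n_events * kb_per_event / 1e6


# 20 kB/event is the figure from the slide; the event count is hypothetical.
size = sample_size_gb(n_events=1_000_000, kb_per_event=20)
print(f"{size:.1f} GB")
```

The same function can be reused to see how adding or dropping content changes the footprint of a private PAT Tuple.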

Page 13

The concept of Maximal Configuration

● Configure your own DataFormats via embedding (see Lecture 2.2/Exercise 06).

● Configure your workflow via tools that PAT provides (see Lecture 2.1/Exercise 05).

● Apply selections via the StringCutParser.

● Add any extra info you need to the EventContent.
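Selections for the StringCutParser are written as plain strings over member functions of the object. The sketch below only assembles such a string in Python. The thresholds, the use of "&" as conjunction and the helper build_cut are assumptions to verify against the PAT documentation.

```python
def build_cut(conditions):
    """Join individual conditions into one cut string with a logical AND."""
    return " & ".join(conditions)


# Hypothetical jet selection; the numbers are placeholders, not recommendations.
jet_cut = build_cut(["pt > 30", "abs(eta) < 2.4"])
print(jet_cut)
```

In a configuration file such a string would presumably be handed to a selector module as a cms.string parameter; which module and parameter name apply should be looked up in the PatAlgos package.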

Page 14

The Code Location

DataFormats/PatCandidates

● Definition of all PAT Candidates.

● pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, ...

PhysicsTools/PatAlgos

● Implementation and filling of all data formats.

● Definition of common workflow and PAT tools.

PhysicsTools/PatUtils

● Definition of common tools and helper functions used in PatAlgos.

PhysicsTools/PatExamples

● Location of many examples, e.g. all non-trivial examples used during this Tutorial.

Page 15

Development

PAT is part of any CMSSW release. We recommend using it from the release!

Have a look at: SWGuidePATRecipes

Page 16

Development (cont'd)

In case you already want to use features/fixes that will go into the next release, follow the PAT release notes in the corresponding development branch.

Page 17

Support

Check the main entry page of PAT in the software guide: SWGuidePAT

A short extract of possible support:

● The quite developed PAT Documentation!

● Hypernews

● Community

● POG/PAG contacts

● Developers

● Lecturers & Tutors

Page 18

Documentation

● SWGuidePAT/WorkBookPAT: Main documentation pages.

● SWGuidePATRecipes: Installation recipes.

● WorkBookPATTutorial: Tutorials and examples to get started.

● WorkBookPATDataFormats: Description of all PAT Candidates.

● WorkBookPATConfiguration: Description of the configuration of PAT.

● WorkBookPATWorkflow: Description of the PAT workflow.

● SWGuidePATEventSize: Tools for event size estimates.

● SWGuidePATTools: Description of all PAT tools.

And last but not least: This Tutorial and/or former Tutorials...

Page 19

Exercises

By now you should be prepared to do the following Exercises on WorkBookPATTutorial:

● Exercise 1 (WorkBookPATDocNavigationExercise): The PAT Documentation is one of the most looked after parts of the WorkBook. Knowing your documentation and how to use it can speed up your learning enormously. Learn more about the PAT Documentation and how to make effective use of it.

● Exercise 2 (WorkBookTupleCreationExercise): Learn how the default PAT tuple is produced to be prepared to produce your own PAT tuples.

● Exercise 3 (WorkBookTupleCrapExercise): This is part of the CRAB tutorial. Once you are doing large-scale analyses you will need CRAB.

Have Fun!