SCT DQ Training 11.02.11: DQ Training 2011, Dr. Petra Haefner
Post on 18-Jan-2016
SCT DQ Training 11.02.11
DQ Training 2011
Dr. Petra Haefner
Introduction
What has not changed
- The tools / plots to check the data quality: DQMD, SCTDQWebTool, DCS, elog, …
What has changed
- The DQ flags were dismissed
- We have a defects database now
- The shifter only checks if defects are present
- The shifter does not decide on the severity of a defect (tolerable / intolerable)
New duty
- 36h calibration loop checks
2010 Data Quality Flags
5 types of DQ flags
- Green: data good
- Yellow: data has recoverable problem
- Red: data bad
- Grey: too few statistics / undefined
- Black: disabled in ATLAS partition
3 DQ regions in SCT
- Barrel
- Endcap A
- Endcap C
Status in 2011
- The 5 DQ flag types are DISCONTINUED
- Only one DQ region remains: “SCT”
2010 DQ Duties
Shifters:
- Look at every run to check for data flaws
- Decide if data is recoverable (e.g. during reprocessing) -> yellow flag
- Decide if problem is severe enough that data should not be used anywhere -> red flag
Experts:
- Define guidelines for shifters
- Help in cases of doubt
- Follow up unclear problems
- Make sure “recoverable” problems are recovered during reprocessing
Limits of DQ Flags
Most of the problems are recurring ones
- Should have a one-click mechanism for flagging
Unusual problems: not always clear if they cause issues or not (e.g. in tracking)
- Could be OK for some analyzers but not for others
Difficult to handle new problems
Difficult to change policy
Idea of Defects Database
Put problem = defect itself into a database
- Decide further downstream (virtual flags) if defect causes issues or not
- Easily create new defects “on the fly”
- Changes of policy much easier
- Combined performance / physics groups can require “perfect” data (i.e. no defects at all) or allow for certain flaws
Created defects based on 2010 experience
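The separation between stored defects and downstream virtual flags can be illustrated with a minimal sketch. Everything here is hypothetical (the `DefectEntry` record, the `lb_is_good` helper and the two policy sets are not the real defects database API); only the defect names and the run / LB range are taken from these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefectEntry:
    """One stored defect (hypothetical layout, not the real DB schema)."""
    name: str       # defect identifier, e.g. "SCT_RODOUT_1"
    run: int
    first_lb: int   # inclusive
    last_lb: int    # inclusive


def lb_is_good(run, lb, entries, intolerable):
    """Virtual flag: an LB is good unless a defect the consumer rejects covers it."""
    return not any(
        e.run == run and e.first_lb <= lb <= e.last_lb and e.name in intolerable
        for e in entries
    )


# The database only records what happened (1 ROD off in run 160954, LBs 226-273).
entries = [DefectEntry("SCT_RODOUT_1", 160954, 226, 273)]

# Two consumers apply different policies to the same stored defects.
most_analyses = {"SCT_GLOBAL_STANDBY"}                 # accepts 1 ROD off
perfect_data = {"SCT_GLOBAL_STANDBY", "SCT_RODOUT_1"}  # accepts no flaws

print(lb_is_good(160954, 230, entries, most_analyses))  # True
print(lb_is_good(160954, 230, entries, perfect_data))   # False
```

Changing policy then means editing a policy set, not re-flagging old runs.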
Defect “Definition”
A defect is everything that is out of the nominal data taking
What happens if “the nominal” changes (e.g. HV settings, large detector losses)?
- This changes the baseline!
- It is not a defect!
- A new data period has to be started
- The new nominal has to go into the MC
- …
Defect Policy
Numerical defects will have two limits
- The “perfect” data limit (no defects)
- The “crap” data limit (intolerable defects)
The operation policy follows accordingly
- No need to restart run for tolerable defects (e.g. 1 ROD off)
- For intolerable defects, action should be taken to lose as little data as possible
- Operation shifters must be aware of intolerable defects!
- Decision is always taken by Run Coordinators!
Intolerable Defects
We did not implement any defects (in advance) that were not present in 2010
- That means most of the intolerable ones!
- Will give us time to set limits in accordance with e.g. tracking studies
- List is prepared and a new defect can be created within a day!
Created “UNKNOWN” defect so that the shifter can flag these cases
- Expert intervention needed!
Defect Definitions
Each defect consists of
- An identifier (“name”)
- A (short) description
- A virtual flag (tolerable / intolerable)
Recoverable flag is set case by case
Intolerable means: should not be used by any analysis (unless for very good reasons)
Tolerable means: can be used by most analyses without problems
2 defect types
- Boolean: either present or not (e.g. standby)
- Numerical: limit on certain numerical value (e.g. hit efficiency)
Common Detector Defects
SCT_UNCHECKED (intolerable)
- Run has not been checked for defects yet
- Has to be cleared by the shifter with sign-off, otherwise data is not usable!
- Necessary to distinguish from defectless runs
SCT_GLOBAL_UNKNOWN (intolerable)
- 2 reasons
  A) new defect that is not in the database yet
  B) shifter does not know which defect is appropriate
- Needs expert help
  A) create new defect, decide on tolerable or not
  B) map to existing defects (i.e. change entry)
SCT_GLOBAL_LOWSTAT (tolerable)
- Not enough statistics for DQ checks (< 200 tracks)
- Old “grey” flag
SCT_GLOBAL_DISABLED (intolerable)
- (Part of) SCT not in ATLAS partition (should never happen!)
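As a rough illustration of why SCT_UNCHECKED has to be cleared, the four common defects above can be written as definitions (name, description, virtual flag) and checked against the defects present for a run. The `DefectDefinition` class and the `usable` helper are assumptions made for this sketch, not part of the real tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefectDefinition:
    """Name, short description and virtual flag (hypothetical container)."""
    name: str
    description: str
    tolerable: bool


COMMON_DEFECTS = {
    d.name: d
    for d in (
        DefectDefinition("SCT_UNCHECKED", "Run not yet checked for defects", False),
        DefectDefinition("SCT_GLOBAL_UNKNOWN", "New or unclear defect, needs expert", False),
        DefectDefinition("SCT_GLOBAL_LOWSTAT", "Not enough statistics (< 200 tracks)", True),
        DefectDefinition("SCT_GLOBAL_DISABLED", "(Part of) SCT not in ATLAS partition", False),
    )
}


def usable(present):
    """Data is usable by most analyses if no intolerable defect is present."""
    return all(COMMON_DEFECTS[name].tolerable for name in present)


print(usable(["SCT_UNCHECKED"]))       # False: not signed off yet
print(usable(["SCT_GLOBAL_LOWSTAT"]))  # True: tolerable
print(usable([]))                      # True: signed off, defectless
```

An unchecked run and a defectless run differ only by the SCT_UNCHECKED entry, which is why that defect exists.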
SCT Configuration Defects
SCT_GLOBAL_STANDBY (intolerable)
- The “classic” one ;-)
SCT_HV_NOTNOMINAL (tolerable)
- HV neither 50 V nor 150 V
- Put actual voltage in comment field
SCT_THR_NOTNOMINAL (tolerable)
- Threshold not at 1 fC
- Put actual threshold in comment field
SCT_TIMINGSCAN (tolerable)
- Can probably be used for most analyses
- Let tracking decide if they see flaws
- Put description (e.g. coarse timing scan) in comment field
SCT Numerical Defects
SCT_RODOUT_1 (tolerable)
- One or more RODs off during data taking
- Let tracking look for inefficiencies
SCT_EFF_LT99 (tolerable)
- Efficiency < 99.5 %
- Put rounded efficiency in comment field (e.g. 99%)
- Yes, we have a very good detector!
- Certainly no problem for most analyses
- Low efficiency is normally a symptom of another defect -> find out what the underlying problem is
SCT 2010 Efficiencies
[Figure: SCT 2010 efficiencies; panels: Barrel, ECA, ECC]
Annotated runs (run; region; cause):
- 160954; ECA; 1 ROD off
- 166658; SCT; compressed mode
- 167844; B; high # of errors layer 1,2
- 167963; SCT; toroid off, 50 ns bunch spacing, high beam backgrounds
SCT Module Defects - Noise
SCT_MOD_NOISE_GT40 (tolerable)
- # of noisy modules > 40
- Limit: noise occupancy > 100 x 10^-5
- Put actual number of modules in comment field
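The rule above amounts to counting modules over the occupancy cut and comparing the count with 40. The sketch below assumes a plain list of per-module noise occupancies as input; the function name and the returned (present, comment) pair are illustrative choices, not the behaviour of DQMD or the web display.

```python
def noisy_module_defect(occupancies, cut=100e-5, max_modules=40):
    """Return (defect_present, comment) for SCT_MOD_NOISE_GT40.

    occupancies: per-module noise occupancy values (hypothetical input format).
    A module counts as noisy if its occupancy exceeds the cut (100 x 10^-5).
    The defect is present if more than max_modules modules are noisy.
    The comment carries the actual number of noisy modules.
    """
    n_noisy = sum(1 for occ in occupancies if occ > cut)
    return n_noisy > max_modules, str(n_noisy)


# 41 modules above the cut plus 100 quiet ones: defect present, comment "41".
print(noisy_module_defect([2e-3] * 41 + [1e-5] * 100))
```

The same counting pattern would apply to modules with bytestream errors or modules excluded in the DAQ, each with its own per-module criterion.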
SCT Module Defects - Errors
SCT_MOD_ERR_GT40 (tolerable)
- # of modules with bytestream errors > 40
- Limit: > 50% errors
- Put actual number of modules in comment field
SCT Module Defects - Disabled
SCT_MOD_OUT_GT40 (tolerable)
- # of modules excluded in DAQ > 40
- Put actual number of modules in comment field
Settings / Cuts for Defects
Realized that settings/cuts were different for different offline tools (DQMD / web display)
- Different settings as well between online / offline
- Documentation contains inconsistencies as well
Unify all settings/cuts on same/similar plots/quantities and update documentation
Please tell me if you find inconsistencies!
Suggested Limits
Quantity | Yellow Limit | Red Limit | Cut
Noisy Modules | 40 | 120 | 100 x 10^-5
Error Modules | 40 | 120 | 50%
Disabled Modules | 40 | 120 |
Efficiencies | 99.5% | 97% |
- Should be implemented in DQMD (online / offline), OHP, SCTGUI, web display, …
- Yellow limits correspond to tolerable defects
- Red limits correspond to intolerable defects
Limits are not fixed yet and may be subject to change!
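A minimal sketch of how the suggested yellow / red limits map a measured value onto "no defect", "tolerable" or "intolerable". The yellow comparisons follow the defect definitions (> 40 modules, efficiency < 99.5 %); treating the red limits as strict comparisons too is an assumption, and the function names are hypothetical.

```python
def classify_count(n_modules, yellow=40, red=120):
    """Classify a module count (noisy, error or disabled modules)."""
    if n_modules > red:        # assumption: strict comparison, as for yellow
        return "intolerable"
    if n_modules > yellow:
        return "tolerable"
    return "none"


def classify_efficiency(eff_percent, yellow=99.5, red=97.0):
    """Classify a hit efficiency given in percent (lower is worse)."""
    if eff_percent < red:      # assumption: strict comparison, as for yellow
        return "intolerable"
    if eff_percent < yellow:
        return "tolerable"
    return "none"


print(classify_count(42))         # tolerable
print(classify_count(121))        # intolerable
print(classify_efficiency(99.0))  # tolerable
print(classify_efficiency(96.0))  # intolerable
```

Keeping the limits as default arguments makes it easy to update them in one place once they are fixed.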
SCT Defects List
SCT Prepared Defects List
As said before, the limits are suggestions!
- Limits will be fixed if serious defects occur!
- Implemented lower limits in DQ checks to have a “pre-warning”
“Expert” Work
- Add new defects to database
- Overwrite defect flagging (i.e. go from defect “present” to “absent”)
- Decide what happens with “recoverable” defects
- Decide what happens to “unknown” defects
  - Create new DB defect
  - Change to already existing defects
  - NOTHING should remain in this category, even if you think it can never happen again!!!
Defect Entry Tool
How-To
Defect Entry Tool
https://atlasdqm.cern.ch/defectentry-dev/?filter=SCT
When you start…
You are prompted for your NICE account
Choose database
- Test: for test purposes (training / development)
- Production: the “real” thing for DQ assessment
Choose the tag
- Head: for normal operation with Tier-0 data
Defects Overview
Check if there are already defects present
- From automatic checks (e.g. DCS flags)
- SCT_UNCHECKED is always present before sign-off
If you agree with all defects entered (or none are present)
- Go to “Sign off a run” directly
If you do not agree / if there are defects missing
- Go to “Upload”
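The decision above can be written as a small helper. This is only an illustration of the shifter's logic: the function, its arguments and the idea of comparing two sets of defect names are assumptions, not something the defect entry tool provides.

```python
def next_step(listed, observed):
    """Return the page to go to, given the defects already listed for a run
    and the defects the shifter actually found (both as defect names).

    SCT_UNCHECKED is ignored in the comparison because it is always
    present before sign-off.
    """
    already_listed = set(listed) - {"SCT_UNCHECKED"}
    if already_listed == set(observed):
        return "Sign off a run"
    return "Upload"


print(next_step(["SCT_UNCHECKED"], []))                # Sign off a run
print(next_step(["SCT_UNCHECKED"], ["SCT_RODOUT_1"]))  # Upload
```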
A “Real” Run Overview
All 2010 LBs that had yellow / red DQ flags got a “2010NONGREEN” defect
Removed Defects (“absent”)
If a defect was deleted (set to “absent”), you will see it struck through in the list of defects
Defect Entry Tool
- Filter “SCT” shows only the SCT defect subtree
- Expand the subtree if necessary
- Select the defect you want to add by clicking its checkbox
- Enter all information on the right side of the window
Enter Defect Information
Select the run for which the defect was present
- Only one run can be entered at a time!
Enter the LBs for which the defect was present
- Several LBs can be entered like this: “1-12,42,43,415-End”
- “1-End” selects the full run
- Note: LB ranges are inclusive
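To make the LB syntax concrete, here is a minimal parser sketch for strings like “1-12,42,43,415-End”. It is not the tool's own parser; the `parse_lbs` name and the `n_lbs` argument (the number of LBs in the run, used to resolve “End”) are assumptions for illustration.

```python
def parse_lbs(spec, n_lbs):
    """Parse an LB specification into a list of inclusive (first, last) ranges.

    spec:  e.g. "1-12,42,43,415-End"
    n_lbs: last LB of the run, substituted for "End" (hypothetical argument)
    """
    ranges = []
    for part in spec.split(","):
        first, sep, last = part.strip().partition("-")
        if not sep:                 # single LB such as "42"
            last = first
        lo = int(first)
        hi = n_lbs if last.strip().lower() == "end" else int(last)
        if lo < 1 or hi < lo:
            raise ValueError(f"invalid LB range: {part!r}")
        ranges.append((lo, hi))
    return ranges


print(parse_lbs("1-12,42,43,415-End", 500))
# [(1, 12), (42, 42), (43, 43), (415, 500)]
print(parse_lbs("1-End", 77))
# [(1, 77)]
```

Because the ranges are inclusive, “1-12” covers twelve LBs, including both LB 1 and LB 12.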
Run Information
As soon as you enter a run number, the run information pops up in the window
- Project tag
- Run start / stop time
- # of LBs, # of evts
- Stable beams?
- Online integrated luminosity
Enter Defect Information
Enter a comment like
- # of modules / RODs disabled (e.g. “42”)
- Efficiency (e.g. “Barrel, 96%”, “ECA, Disk 7, 93%”)
- Not nominal setting (e.g. “coarse timing scan”, “HV=100 V”, “threshold = 1.2 fC”)
No trivial comments needed! (e.g. “Standby”)
Enter Defect Information
Select “present” (default) or “absent”
- Shifters should only use “present”
- “absent” is for overwriting / deleting defects (should only be done by experts!)
Check the “Expected recoverable?” box if you think the defect can be recovered later (e.g. during reprocessing) -> old “yellow” flag
Upload Defect to Database
- Enter system password (see next page)
- Popup window summarizes defect info
- Click “OK” to store defect in database
Sign Off a Run
- Type in the run number you wish to sign off
- Select “SCT” from drop-down menu
- Enter system password
  - “Inner Detector” at the moment
  - Should be changed to an SCT password soon!
“Homework”
Test Defect Entry Tool
Take some of the 2010 runs with defects and load them into the defects (test) database!
List of 2010 Problems
Runquery command: find time 31.03.2010-06.12.2010 and ready and ev 20k+
Only 14 / 194 runs have defects other than Standby!
Run | LB | Region | Reason | Flag
152777 | 63-88 | | Timing Scan | green
159224 | 418-814 | | no significant physics effect | green
160800 | 52-87 | B | 1 ROD off | green
160954 | 226-273 | EA | 1 ROD off | green
160958 | 215-226 | EA | 5.1% bad due to errors | red
165732 | 1-86, 500-542, 565-589 | | same as DCS | red
165767 | 1-193, 576-606 | | same as DCS | red
166055 | 1-9 | | Pix test, SCT cluster size small, zero eff EA, EC | red
166056 | 1-21 | | SCT L2 stress test, non-standard data | red
167963 | 64-231 | | need to check eff calculation, ignoring 100 contamination | yellow
169837 | 7 | B | error ROD (many modules) | red
169837 | 8-53 | B | 1 ROD off | green
169838 | 1-57 | B | 1 ROD off | green
165927 | 1-250, 832-847 | | same as DCS | red
169961 | 1-77, 410-428 | | same as DCS | red
159927 | 1-26 | | not enough stats | grey
159934 | 1-101 | | not enough stats | grey
155569 | 224-470 | EC | 1 module with timeout errors | green
169750 | 274 | B | Barrel green but EA, EC red! | green
Outlook – Planned Changes
Include automatic checks into defects DB
- e.g. copy DCS red flag to standby defect
- e.g. automatic defects based on DQMD flags
- What CAN we implement?
- What do we WANT to implement?
- Needs thorough control by DQ shifters!
Create automatic history plots
- e.g. efficiencies, noise occupancies, …
- Monitor long-term changes
- Again, what do we want?
Stopless recovery
- Fewer / shorter cases of RODs in error / excluded
Automated module recovery
- Fewer modules with errors
Backup
ID Subsystem Defects
Documentation
Database concept & design
- ATL-COM-DAPR-2010-002
Talk by Peter Onyisi at DQ workshop
- http://indico.cern.ch/getFile.py/access?contribId=12&resId=0&materialId=slides&confId=117855
- See following slides for a summary
Demonstration of new tools in DQ Meeting
- http://indico.cern.ch/conferenceDisplay.py?confId=121253
SCT Defects TWiki
- https://twiki.cern.ch/twiki/bin/view/Sandbox/SCTDefectsDatabase