An error reporting system
for the upgraded
CDF data acquisition system

Peter J. Musgrave
Department of Physics, McGill University
Montréal, Québec, Canada

November, 1993

A Thesis submitted to the Faculty of Graduate Studies and Research
in partial fulfillment of the requirements for the degree of Master of Science

©Peter J. Musgrave, 1993
Abstract

This thesis describes the data acquisition error monitoring system developed for the 1993-94 physics run of the Collider Detector at Fermilab (CDF). It presents an overview of the CDF data acquisition system, indicating the role that the error monitoring system plays in the experiment. It then describes the custom software and software packages used to meet the error monitoring requirements of the CDF data acquisition system.
Résumé

This thesis describes the new monitoring system for the data acquisition system (DAQ) of the CDF experiment, recently upgraded for the 1993-94 data-taking period. After an overview of the DAQ, we explain the role of the monitoring system and the requirements it must meet. Finally, we describe the software developed to meet these requirements.
Acknowledgements

Paradise
is exactly like where you are right now,
only much,
much,
better.

Laurie Anderson
I would like to thank Ken Ragan for supervising this work. His sense of humor and flexibility made this project fun. Thanks also to Klaus Strahl for his help with Murmur and his willingness to discuss this project at great length. I am also grateful to Kurt Biery for his patience with my questions about the DAQ.

The members of the online software group at Fermilab developed an error reporting system (Murmur) which allowed this project to be completed in a timely manner and saved us a huge amount of work. Thanks!

The software package DAQERI was developed by me. The remaining bugs should be interpreted as design intent.

I enjoyed the company and conversation of my fellow CDF graduate students from McGill and Toronto during my stays at Fermilab. Happy histogramming, folks!

My parents continue to be a perpetual source of support regardless of the direction in which my aspirations and delusions propel me. Many thanks.

Finally, I thank Kathryn Adeney for being my partner on the journey up the mountain.
Contents

Abstract ii
Résumé iii
Acknowledgements iv

1 Introduction 1

2 The CDF Data Acquisition System 4
2.1 The Run 1A System 4
2.2 The Run 1B System 7

3 Error Reporting Objectives 10
3.1 Requirements 10
3.1.1 Summary of Requirements 11
3.2 Desirable Features 12

4 The CDF Error Reporting System 13
4.1 Overview 13
4.2 Murmur 15
4.2.1 Error Generation 15
4.2.2 Error Monitoring 16
4.3 DAQERI 18

5 DAQERI 24
5.1 DAQKER 24
5.1.1 The Murmur Interface 25
5.1.2 DAQGUI Interface 25
5.1.3 The DAQKER Data Structure 27
5.2 DAQGUI 30
5.3 Browse 31
5.4 How an Error is Handled 32

6 Conclusions 34

Bibliography 35
List of Figures

2.1 Run 1A data acquisition system 6
2.2 Run 1B data acquisition system 9
4.1 CDF error reporting system 14
4.2 Murmur error display window 17
4.3 DAQERI main window 20
4.4 DAQERI node window 21
4.5 DAQERI node error window 22
4.6 Logfile browser 23
5.1 DAQERI kernel data structure 28
Chapter 1

Introduction
Our present understanding of matter, as composed of the particle families of quarks and leptons, is called the standard model. This model is the result of decades of theoretical and experimental analysis. The standard model has survived all of the experimental tests to which it has been subjected. It predicts that six flavors of quarks exist, and five of these have been verified experimentally. The search for the sixth quark, known as the top quark, is underway at the Fermilab Tevatron. The Tevatron is presently the only operational collider which can collide particles at high enough energy to search for the creation of a top-antitop pair.

The Tevatron maintains bunches of protons and antiprotons in a circular orbit via superconducting magnets. It currently operates with six bunches of counter-rotating protons and antiprotons. At several points in the Tevatron ring the bunches of protons and antiprotons are focused to a common interaction point where collisions occur at a center of mass energy of 1.8 TeV at a frequency of 285 kHz.

The Tevatron has two detectors which examine the proton-antiproton collisions and extract signatures of interesting physics events. This thesis describes work relating to one of these, the Collider Detector at Fermilab (CDF). The CDF detector is a large cylindrically symmetric detector. It is described in detail elsewhere [1]. Briefly, it consists of (from the interaction point outward) a silicon vertex detector, vertex time projection chamber, drift chamber, and electromagnetic and hadronic calorimeters surrounded by an array of muon detectors. A superconducting solenoid magnet surrounds the vertex and drift chambers and creates a magnetic field of 1.4 Tesla. Each of these detector elements provides the data it measures in the form of electronic signals. A complete record of an event in the detector consists of approximately 250 Kbytes. It is not feasible to record this amount of data for each beam crossing and, since only comparatively few beam crossings produce interesting physics (of order a few Hertz), much of the recorded data would be discarded. A trigger system is used to examine events in real time and select those which are "interesting". Events selected by the trigger are then read by the data acquisition system (DAQ). These events are buffered and then passed on to further trigger processors which examine the event in more detail and determine if it should be retained.

Events are read from the detector by a variety of modules operating in parallel, so that the time to read out the detector is minimized. This is crucial because, while the detector is holding the event data, it cannot collect data from ongoing collisions. This is one source of deadtime: time during which collisions are occurring but cannot be recorded. The readout modules are part of the DAQ, which takes event data and passes it through various buffering, filtering and formatting processors. Included in the DAQ is another level of triggering, which can take advantage of the lower data rate from the detector to spend more time examining each event. The DAQ system is analogous to a computer network, with a variety of processors all handling separate parts of some task.
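The cost of readout can be made concrete with a back-of-the-envelope deadtime estimate. All of the numbers below are hypothetical, chosen only to illustrate the arithmetic; they are not CDF's actual parameters.

```python
# Illustrative deadtime estimate (all numbers are hypothetical, not
# CDF's actual parameters). While the detector holds an event it
# cannot record new collisions, so the dead fraction is roughly the
# readout time multiplied by the trigger accept rate.
readout_time_s = 2e-3    # assumed time to read out all scanners (2 ms)
accept_rate_hz = 50.0    # assumed pre-readout trigger accept rate

dead_fraction = readout_time_s * accept_rate_hz
print(f"deadtime: {dead_fraction:.0%}")
```

Shortening the readout time or buffering events in the scanners both reduce this product, which is why parallel readout matters.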
Correct operation of the DAQ system is clearly vital to the operation of the CDF detector. Monitoring the performance of the DAQ system is a key activity while collecting data. The failure or performance degradation of a DAQ node requires immediate attention from the operators. In the physics run which ended in June of 1993 (termed run 1A) the error monitoring of the DAQ nodes was distributed: nodes reported problems in different ways to physically distinct places in the CDF control room. In the 1993-94 physics run (run 1B) of the Tevatron the CDF DAQ will be upgraded and new components will be introduced. Error monitoring for these new components must be provided. It was suggested that the error monitoring system developed for the run 1B components could also handle error reports from the rest of the DAQ.

This thesis describes the development and implementation of this central DAQ error monitoring system. It is structured as follows. We first present an overview of the data acquisition system which was previously employed and the enhancements for run 1B. We then discuss the requirements and objectives for a monitoring system.
Following that, we describe the solution which was implemented in general terms. We then discuss the graphical interface used in the CDF control room in more detail. Finally, we compare the solution to the stated objectives and summarize the current status of the Murmur/DAQERI system.
Chapter 2
The CDF Data Acquisition System
In this chapter we seek to provide an overview of the CDF data acquisition system, primarily to identify those elements which are sources of error and diagnostic information. Details of the CDF DAQ can be found in the references provided in [1]; a description of the upgraded system is found in [2].

The goal of the DAQ is to take the signals from the detector elements, encode them, apply further trigger criteria and write the accepted events to a storage medium. In order to maximize the chance of recording interesting events the DAQ system must have as little deadtime as possible. Towards this end, improvements will be made to the detector DAQ system prior to run 1B. We first describe briefly the system used for run 1A; then we indicate the changes which will be implemented for run 1B.
2.1 The Run 1A System
Data collection necessarily begins with the signals from the various detector components (see Figure 2.1). These signals must be digitized and collected together in a buffer so that the detector and pre-readout trigger can go back to searching for interesting events. In the run 1A system this task is handled in two ways. The calorimetry and central muon chambers are read out by a Redundant Analog Bus-Based Information Transfer (RABBIT) analog interface in conjunction with an MX scanner and Multiple Event Port (MEP) interface to FASTBUS [3]. Tracking detectors are read out by a FASTBUS based time-to-digital conversion (TDC) module in conjunction with a SLAC Scanner Processor (SSP) [4]. Both the MX and SSP scanners have buffers to hold four events each. Event data from the scanners is passed to the event builder over FASTBUS. FASTBUS crates are inter-connected via Segment Interconnect (SI) modules. In the run 1A system there are approximately 60 MX scanners, 25 SSP scanners and 53 FASTBUS crates.
Data acquisition begins when an event passes the pre-readout trigger criteria. When this occurs, digitization begins and the MX and SSP scanners are instructed to begin moving data from the front end electronics to one of their data buffers. Once all the scanners have completed this readout the detector can return to the task of searching for another event. The event stored in the scanners is next transferred over four parallel FASTBUS segments to an event builder. The event builder has the task of collecting a complete event description from the buffers of all the scanners and reformatting the data for the subsequent level 3 trigger. Once this task has been completed the event is sent over FASTBUS to a VME interface to one of the processors in the level 3 processor farm. This processor evaluates the complete event and applies selection criteria. If the event passes then it is passed to a "consumer" VAX. Software on this VAX sends the event to a tape drive and may also send it to one or more processes which monitor data quality during the run.

The allocation of buffers in the scanners, event builder and level 3 nodes is handled by a buffer manager process running on a Micro-VAX. This process allocates buffers to events when the detector is read out and then sends instructions to move the event into the event builder and on to level 3 as buffers in those elements become available. The buffer manager directs the operation of the DAQ elements via FASTBUS interrupts.

Control and configuration of the DAQ elements is via a VAX process called run control. The detector operators use run control to bring the DAQ elements online. The detector can operate in a partitioned mode with several run control processes using different parts of the detector DAQ (this is particularly useful during testing and calibration). The run control VAXen each have a FASTBUS interface over which control messages and software downloading of FASTBUS components is done. Note that both data and control messages require FASTBUS.
Figure 2.1: Run 1A data acquisition system

A portion of the CDF DAQ for run 1A. Data flow is from top to bottom, starting with the detector calorimetry and tracking chambers and ending when the event is written to tape.
Error messages from run control are handled within the run control terminal display. Error messages from the event builders are displayed on two other screens. These screens are controlled by direct RS-232 links to the event builders. MX errors are reported to the event builder over FASTBUS and included in the event builder error display. The level 3 system has a separate error monitor which provides a graphical overview of the buffer status in the level 3 nodes.
2.2 The Run 1B System
Experience with the DAQ system has shown that there are a number of performance bottlenecks. Each event builder has only one link to level 3 and this restricts the maximum data rate of one event builder into level 3 to of order 20 Hz [2] and, with two event builders, 30 Hz. Another bottleneck is due to the buffer manager's use of FASTBUS interrupts via a VAX-FASTBUS interface to direct the data flow. Limitations of VAX message queues cause these instructions to get backed up, and the fact that such control information is carried on the same medium on which data is passed results in data transfers being interrupted by control messages. Performance limitations also arise in the consumer VAXen due to the fact that data from level 3 is carried by FASTBUS, which is limited by the VAX/FASTBUS interface bandwidth of 350 KB/sec.

Enhancements to the Tevatron will increase the luminosity and hence the collision rate for run 1B. In run 2 (1997) accelerator improvements will allow the production of more antiprotons, allowing a greater number of antiproton bunches, in turn allowing the time between beam collisions to be reduced to 400 ns. These improvements will increase the rate of interesting physics events and a DAQ with a higher throughput will be required. The run 1B system is a step in this direction.

One of the goals of the run 1B DAQ (Figure 2.2) is to eliminate the event builder bottleneck. This is done by moving the event building function to level 3 and introducing parallel data paths from the scanners to the level 3 processors. On the scanner side a new module, the FASTBUS readout controller (FRC), will be used to read out the MX scanners. These will connect via a scanner bus to VME based scanner CPUs (SCPUs) which will reformat and forward FRC data fragments to level 3 processors
via an ULTRANET hub. The use of multiple SCPUs and a cross-connect ULTRANET hub allows the transmission of data to be highly parallelized. Events will then be built in the level 3 nodes where trigger algorithms will be applied as before.

This process will be coordinated by several new elements, in place of the buffer manager. A scanner manager (SM) will direct the data from the FRCs to scanner CPUs and into the level 3 processes. Control of event building will be handled by processes running in the level 3 processor farm. These control elements will no longer use FASTBUS for control and error information. The scanner manager will be implemented on a commercial VME processor. To allow for communication between the scanners, scanner manager and level 3 farm (without using FASTBUS) a reflective memory board will be present in each VME crate containing an SCPU or SM and in each level 3 box. Communication will occur via this shared memory. These modules will also have Ethernet connections to allow for software downloading and error reporting. The VAX run control process will communicate with the SM via a User Control Interface (itself a VME board) which will then send control messages to SCPUs in other crates via reflective memory.

To summarize, the new DAQ system will require the introduction of the following components:
• FASTBUS Readout Controller (FRC)
• Scanner CPU (SCPU)
• Scanner Manager (SM)
• User Control Interface (UCI)
These DAQ elements require a mechanism to report errors. Before we discuss the specific solution which was chosen, we first discuss the general requirements for an error reporting system for the DAQ.
Figure 2.2: Run 1B data acquisition system

Only a portion of the DAQ is shown. Data flows from the calorimetry and tracking detectors at the top of the figure to the tape drive and consumer VAXen at the bottom.
Chapter 3
Error Reporting Objectives
3.1 Requirements
There are a number of objectives which an error reporting system for the CDF detector should meet. The new error reporting system is being developed for two main reasons. Firstly, the upgraded DAQ to be used in run 1B contains new DAQ elements and we require a mechanism to communicate their error and status messages to the detector operators. Secondly, this opens the door to an effort to centralize the error reporting from existing DAQ elements. Implicit in the requirement for a central error reporting system is the need for the message-sending component of the system to be highly portable: a variety of software environments are used in the DAQ system, and message generation routines must be available for each environment.

Monitoring the DAQ health is an important activity and information about problems in the DAQ should be easily accessible to the detector operators. This is best handled by a graphical display which provides an "at a glance" summary of the current status of the entire DAQ. When errors do arise it should be possible to access the specific error information quickly. If a large number of errors occur in a short time span the operator should not be overwhelmed with details unless they are specifically requested. In those cases where a DAQ element is recognized as faulty the operator should be able to disable error reports from the node.

Obviously we do not wish the presence of the error reporting system to in any way impede the normal operation of the DAQ. It must be possible to shut down and restart the error reporting system without affecting normal DAQ operation. Conversely, the purpose of the error reporting system is to monitor the DAQ and record
the errors which have occurred. It must therefore continue to function normally as elements of the DAQ start, reset or encounter any "reasonable" failure.

Ideally, error reports should be sent directly from a DAQ element to the central monitor. This eliminates the possibility that error reports may be lost because an intermediate node in the error reporting chain has failed. In other words, we wish the network topology to be that of a "star".

The error monitoring system should also provide a log of the complete error history of the detector for a given run. The complete error history is essential for diagnosing faults in the DAQ. This error log should be centralized so that all DAQ errors are in a single file in the sequence in which they occurred.

Errors must be brought to the operators' attention as they happen; the monitoring system must therefore operate in real time.

As described in chapter 2, the CDF detector can be operated in a partitioned mode with control of the different elements handled by separate run control processes. In such cases it is valuable to have an error display window available for each run control while maintaining a central log so that the complete system health can be easily monitored.
3.1.1 Summary of Requirements
We provide a summary of the requirements here for later reference. The error reporting system must be able to:
• handle error messages from the new DAQ elements
• centralize error reports from the remaining DAQ elements where possible
• provide an "at a glance" summary of the state of the DAQ
• allow specific error reports to be accessed quickly and easily
• permit disabling of error reports from nodes
• never interfere with data taking
• be impervious to "standard" DAQ failures
• use a "star" topology
• maintain a central log file
• allow multiple error displays
• perform real-time monitoring
3.2 Desirable Features
In this section we present a number of features (in no particular order) which are desirable in an error reporting system but are not required.
(i) The meaning of error messages from the DAQ may not be clear to the operators of the detector. The ability to access help text for a specific error message would be valuable.

(ii) For some errors the corrective action required will be known. In such cases it is desirable to allow the receipt of the error message to trigger the corrective action.

(iii) Troubleshooting faults in DAQ elements frequently involves soliciting the opinions of people who have detailed knowledge of a specific DAQ element. If the error monitoring system supported monitoring of the DAQ from remote displays then these experts could provide assistance to the operators in a faster and more convenient fashion. Adding and deleting such displays should not affect the displays in the control room in any way.

(iv) In some cases it is not the error message, but rather the frequency with which it occurs, that is an indication of an element's status. The capability to alert an operator only when an error message exceeds a frequency threshold may be useful.
Chapter 4
The CDF Error Reporting System
4.1 Overview
To provide a centralized error reporting monitor for the diverse elements of the CDF DAQ it was decided that each DAQ element would use Ethernet to report errors to a central process. This produces a logical star configuration, although it is of course a physical bus by virtue of the Ethernet standard. This also eliminates the need to send error messages on any of the buses used for moving event data through the system. The use of Ethernet allows the error reporting system to be developed and run on any processor with an Ethernet connection. CDF elected to use a SUN workstation for this purpose.

The CDF error reporting system for run 1B is composed of two software packages: Murmur and DAQERI. Murmur [5, 6] is a software package developed by the on-line software group at Fermilab. DAQERI [7, 8] is an extension to Murmur written by the CDF collaboration. Murmur handles the generation of error reports and their transmission over Ethernet, and provides a central server for collecting all the error reports. It allows for logging and display of the error messages, but does not provide any summary of the current status of the DAQ. This task is performed by DAQERI. DAQERI interprets the error messages from Murmur and uses them to keep a record of the status of the DAQ and provide a graphical summary. The inter-relationship of the software components of the error reporting system is illustrated in Figure 4.1.

In this chapter we describe these software packages in general terms, discussing what portions of the error reporting problem they solve.
Figure 4.1: CDF error reporting system
4.2 Murmur
Murmur is a collection of software routines and tools to handle the generation and recording of error messages over Ethernet. The error sources (termed Murmur clients) make use of Murmur subroutines to package their errors in a form the central logging entity (the Murmur server) will recognize. We now describe the capabilities and operation of Murmur in more detail, beginning with the specification and generation of errors.
4.2.1 Error Generation
A DAQ element sends error messages to the Murmur server via calls to one of a variety of Murmur client subroutines. Subroutine libraries exist for all the software environments found in the CDF DAQ (VAX VMS, VxWorks and assorted unix flavors). Prior to using these messaging subroutines the client programmer must predefine the error messages in a Murmur message file. This file is used by Murmur to produce unique 32-bit error codes. Each error code carries information about the logical type of the client (facility number), the error number and the severity of the error. These message files are used to produce files to be included in the client code as well as records for the central error database. Each error in the message file may have associated with it several additional pieces of information. The programmers provide a one-line descriptive text string to indicate the meaning of the error. Optionally they may elect to provide a reference to a help file for the error, or a script file containing commands to be executed when the error is received. This information is appended to the database used by the Murmur server. This approach relieves the client of the task of sending error text with each message. It suffices for the client to send the error code and, optionally, parameters to be inserted in the error message text.
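As a concrete illustration, a code of this kind can be packed into and recovered from a single 32-bit word with simple bit operations. The field widths and positions below are an illustrative assumption, not Murmur's actual encoding.

```python
# Hypothetical layout: facility number, error number and severity
# packed into one 32-bit word (field widths assumed for illustration).
FACILITY_SHIFT, FACILITY_MASK = 16, 0xFFF   # 12-bit facility number
NUMBER_SHIFT, NUMBER_MASK = 3, 0x1FFF       # 13-bit error number
SEVERITY_MASK = 0x7                         # 3-bit severity

def pack_code(facility, number, severity):
    """Combine the three fields into a single 32-bit error code."""
    return ((facility & FACILITY_MASK) << FACILITY_SHIFT) | \
           ((number & NUMBER_MASK) << NUMBER_SHIFT) | \
           (severity & SEVERITY_MASK)

def unpack_code(word):
    """Recover (facility, number, severity) from an error code."""
    return ((word >> FACILITY_SHIFT) & FACILITY_MASK,
            (word >> NUMBER_SHIFT) & NUMBER_MASK,
            word & SEVERITY_MASK)
```

With such a scheme the client need transmit only the packed word; the server looks up the message text, help file and script by code.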
Each error described in a message file contains a field to indicate the severity of the message. Murmur provides four severity types:

• informational
• success
• warning
• severe
When the error messages are displayed by Murmur, the display can be configured to display only messages of a certain type or to display the various types in different colors. Apart from these display options Murmur handles all error types in the same manner.

Murmur also provides for stacks of error messages. A client may wish to send a group of errors which are associated with a common cause. To ensure these are entered in the logfile as a contiguous group they are sent as a message stack. This ensures that errors from other nodes will not be interleaved with messages from the error stack.
4.2.2 Error Monitoring
The central error monitoring process, the Murmur server, performs a number of functions. It records all error messages in a central logfile and optionally may direct error messages to one of a number of windows, based on preferences indicated by the user prior to starting the server. These preferences are specified using a Murmur tool, murgui (Murmur Graphical User Interface). Murgui allows users to specify the number and location of display windows for error text. Routing of errors can also be specified: for example, one window can be dedicated to only severe errors and another to errors from only FRC nodes. The error information fields can also be configured by murgui, allowing a user to display only a portion of the error record. This display and routing information is then stored in the central error database. An example of a Murmur error display window is provided in Figure 4.2. Displays can be added or altered while Murmur is running, but the changes do not take effect until all the existing displays are closed and the new display configuration used to re-open them.

In addition to display configurations the user may also specify how the logfiles are to be handled. The current logfile always has the same name and, if Murmur is run for extended periods of time, the file will become unmanageably large. Murmur allows for the logfile to be "recycled": the user can specify a time or size interval after which the contents of the existing logfile will be preserved in a separate file and
the current logfile cleared. This also reduces the chance that an error in writing to the current logfile will destroy the complete error history of the system.

Figure 4.2: Murmur error display window
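A minimal sketch of such recycling, using only a size threshold (the file names and threshold are assumptions for illustration; Murmur's actual mechanism also supports a time interval):

```python
import os

def maybe_recycle(logfile, max_bytes, archive):
    """If the current logfile has grown past max_bytes, preserve its
    contents in a separate archive file and clear the current logfile."""
    if os.path.getsize(logfile) >= max_bytes:
        os.replace(logfile, archive)    # preserve the existing history
        open(logfile, "w").close()      # start a fresh, empty log
```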
A typical Murmur logfile entry is of the form:

FRC_S_SUCCESS Successful Execution
NODE: tor02, APPNAME: FRC, CTIME: 14:07:26,
CZONE: EDT, PID: 6049, STIME: 14:07:26,
SDATE: Wed Sep 22, CODE: 80008009
Only the error code, time and parameters (to be inserted in the text string) are sent when a client reports an error; the error text is retrieved from the central error database. The Ethernet address, process ID and other information which is fixed for the client is sent in the initial communication with Murmur. This initial communication consists of a request from the client to connect to the Murmur server, combined with the client information which needs to be sent only once. The server responds by allocating a TCP/IP connection for the client. All subsequent communication takes place on this new connection.
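Because the record format is regular, an entry like the one shown above can be split into its fields mechanically. The following is a sketch, not DAQERI's actual parser:

```python
def parse_murmur_entry(entry):
    """Split a logfile entry into the message line and a dict of the
    comma-separated KEY: value fields that follow it."""
    lines = entry.strip().splitlines()
    message, fields = lines[0], {}
    for part in " ".join(lines[1:]).split(","):
        key, sep, value = part.partition(":")
        if sep:                         # skip empty fragments
            fields[key.strip()] = value.strip()
    return message, fields
```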
Murmur plays an important role in the new error monitoring system for the CDF DAQ. It can be thought of as providing the "link layer" for the error reports as well as central error logging. Its one shortfall is that it provides only error text and the
status of the system must be deduced by reading back through the error history. It does not provide "status at a glance". In order to provide this, CDF developed an extension to Murmur, DAQERI.
4.3 DAQERI
DAQERI (Data Acquisition Error Reporting Interface) is a software package developed to provide an intuitive "at a glance" summary of the error state and status of the CDF DAQ. It relies on error information passed from Murmur. This required making custom modifications to Murmur so that the Murmur server would start DAQERI and then send error information through a unix pipe to the DAQERI kernel. This pipe also provides a means for sending supplemental information indicating when logfiles have been recycled and when client connect/disconnect events occur. DAQERI and Murmur run on the same platform to allow all this information to be passed via a pipe, instead of echoing all the error reports over Ethernet.

The DAQERI system has two components (see Figure 4.1): a kernel process which examines the error messages from Murmur and keeps track of the DAQ status, and Graphical User Interface (GUI) processes which provide this information to detector operators in a graphical format. The DAQERI kernel allows multiple GUI processes so that several operators can monitor the DAQ performance simultaneously.
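The server-to-kernel pipe can be sketched as below; the one-report-per-line record format is an assumption for illustration, not DAQERI's actual wire format.

```python
import os

# One end of the pipe plays the Murmur server, the other the DAQERI kernel.
read_fd, write_fd = os.pipe()

# "Server": write newline-terminated reports into the pipe (the record
# format here is assumed; it includes supplemental events such as
# logfile recycling alongside error reports).
os.write(write_fd, b"FRC_S_SUCCESS tor02\n")
os.write(write_fd, b"LOGFILE_RECYCLED murmur.log\n")
os.close(write_fd)

# "Kernel": read the reports back and update its picture of the DAQ.
with os.fdopen(read_fd) as kernel_input:
    reports = [line.rstrip("\n") for line in kernel_input]
```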
The status on the GUI display of DAQ elements is extracted from the type of the Murmur message. The status of nodes is indicated by color. DAQERI supports the four Murmur error types presented above, and allows an additional error category: frequency checked errors. A frequency checked error has an associated time constant and thresholds for warning and error states. The errors which occur are integrated over the time period specified by the time constant and then compared to the error thresholds to determine the state of the node. If the number of errors exceeds the warning threshold then the node is treated as being in a warning state. If no further errors arrive after entering the warning state, then the previous errors will "age out" and the node status will revert to normal.
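One way to realize this behavior is a sliding-window count, sketched below; the actual integration scheme used by DAQERI may differ.

```python
from collections import deque

class FrequencyCheckedError:
    """Count errors inside the last `time_constant` seconds and compare
    the count to the warning and error thresholds; older errors age out."""

    def __init__(self, time_constant, warn_threshold, error_threshold):
        self.time_constant = time_constant
        self.warn_threshold = warn_threshold
        self.error_threshold = error_threshold
        self.times = deque()

    def report(self, now):
        """Record one occurrence of the error at time `now` (seconds)."""
        self.times.append(now)

    def state(self, now):
        # Discard errors older than the time constant ("age out").
        while self.times and now - self.times[0] > self.time_constant:
            self.times.popleft()
        count = len(self.times)
        if count >= self.error_threshold:
            return "ERROR"
        if count >= self.warn_threshold:
            return "WARNING"
        return "OK"
```

With no new reports, each call to `state` drops aged-out errors, so the node drifts back to normal exactly as described above.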
In the graphical display the following node states are distinguished (listed in
order of increasing severity, with color indicated in parentheses):
Chapter 4: The CDF Error Reporting System 19
• GHOST: (grey) the node has not yet reported to Murmur, and no information
on it is available.
• DISABLED: (blue) a user has disabled DAQERI error reports for this node.
The errors will still be processed by Murmur and written to the logfile.
• OK: (light green) the node is operating normally. Informational messages have
been received or previous errors were cleared in DAQGUI.
• SUCCESS: (bright green) a Murmur success message has been received. This
clears any previous errors or warnings.
• WARNING: (yellow) a Murmur warning message has been received, or a
frequency checked error message has exceeded its warning threshold.
• ERROR: (red) a severe Murmur error has occurred or the error threshold of a
frequency checked message has been exceeded.
If a given node has received several messages then the color of the node reflects
the most serious of these.
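This "most severe wins" rule can be sketched directly from the ordering and colors listed above; the function name and data layout are invented for illustration.

```python
# Severity ordering and colors follow the list above, least to most severe.
SEVERITY = ["GHOST", "DISABLED", "OK", "SUCCESS", "WARNING", "ERROR"]
COLOR = {"GHOST": "grey", "DISABLED": "blue", "OK": "light green",
         "SUCCESS": "bright green", "WARNING": "yellow", "ERROR": "red"}

def node_color(active_states):
    # The displayed color is that of the most severe active state.
    worst = max(active_states, key=SEVERITY.index)
    return COLOR[worst]
```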
A warning or error state of a node in the DAQERI display can be cleared in one
of two ways: the node recovers and sends a success message, or the operator, via
DAQGUI, instructs DAQERI to clear the status of the node.
The DAQERI main window (Figure 4.3) contains a number of icons which represent
all the nodes in the CDF DAQ. Below this a status window holds other information
of interest to the operator (tape drive status, percent deadtime etc.). The
colors of the icons in the main window reflect the error status of the underlying nodes.
Clicking on an icon results in a new window, the node window (Figure 4.4). This
window shows the nodes represented by the icon, and the color of the node buttons
reflects their status.
Clicking on a node button within the node window produces another window
providing more detailed information (Figure 4.5). The resulting node error window lists
all the possible errors which could be generated by the node in question. These errors
Figure 4.3: DAQERI main window
Figure 4.4: DAQERI node window
Figure 4.5: DAQERI node error window
typically require several pages. The user can select a page of errors by clicking on
the corresponding page button. The color of the page button reflects the most severe
errors on that page. In addition to providing detailed information on the current
error status of the node, this window provides buttons which allow for the disabling
or clearing of errors either individually or collectively. This window also provides a
button to launch a logfile browser.
The logfile browser (Figure 4.6) is a general purpose utility for browsing through
Murmur logfiles to extract and display errors satisfying selection cuts. For example,
it can be used to examine only errors from FRC 20 after 8:00 a.m. When launched
from the DAQGUI node error window the browser selection criteria are automatically
set to extract only those errors for the node for which the error window was opened.
These cuts may be changed interactively to either narrow or widen the search for
Figure 4.6: Logfile browser
errors to be displayed. The browser displays the most recent 100 errors meeting the
specified criteria.
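The selection cuts can be illustrated with a small Python sketch. The entry fields (`node`, `time`, `text`) are assumptions for illustration, not Murmur's actual logfile layout.

```python
from datetime import datetime

# Sketch of browser-style cuts: filter parsed logfile entries by node and
# time, keeping the most recent `limit` matches (100 in DAQERI's browser).
def browse(entries, node=None, after=None, limit=100):
    selected = [e for e in entries
                if (node is None or e["node"] == node)
                and (after is None or e["time"] >= after)]
    return selected[-limit:]          # entries assumed in time order

log = [{"node": "FRC 20", "time": datetime(1993, 11, 1, 7, 59), "text": "early"},
       {"node": "FRC 20", "time": datetime(1993, 11, 1, 8, 30), "text": "late"},
       {"node": "FRC 21", "time": datetime(1993, 11, 1, 9, 0), "text": "other"}]
hits = browse(log, node="FRC 20", after=datetime(1993, 11, 1, 8, 0))
```

This reproduces the example above: only FRC 20 errors after 8:00 a.m. survive the cuts.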
The DAQERI kernel supports multiple GUI processes. The error history of each
of the GUIs is independent in the sense that if one GUI clears or disables errors this
does not affect the displays of any other GUIs. Requests to open a GUI session can
be made while the DAQ monitoring system is in operation and (unlike Murmur) do
not have to be configured in advance.
Chapter 5
DAQERI
DAQERI consists of three independent code modules: the kernel (DAQKER), the
graphical user interface (DAQGUI) and the logfile browser. While DAQERI was
developed specifically for the CDF DAQ, an effort was made to keep the program as
general as possible. The description of the DAQ elements and the topology of the
icon/node structure in the GUI is read into DAQERI from configuration files. The
error information for the nodes within DAQERI is read from files derived from the
Murmur error files. The code to handle status messages and update the tape drive
displays is specific to CDF. To maintain generality this code can be excluded by
setting an appropriate compile-time constant.
This chapter is intended to provide an overview of the implementation of the
modules. We first provide an overview of the functionality of each module, and then
discuss how the modules handle requests as a system.
5.1 DAQKER
The DAQERI kernel performs the error bookkeeping for all the DAQGUI processes.
It receives error messages from the Murmur server, processes them and compares the
resulting state of the node to the state presently displayed in each of the GUI
sessions. If the error state of a node in a GUI is different from its previous value, then
the kernel sends an update message to the GUI via a dedicated pipe. The kernel also
listens to pipes from the GUIs and handles requests for specific error information,
requests to clear or disable errors etc.
The kernel interfaces via unix pipes to Murmur and multiple DAQGUI processes.
The information exchanged is detailed in the sections below.
5.1.1 The Murmur Interface
The error messages sent by Murmur are copies of those sent to the logfile,
supplemented with a prefix and length code. The prefix code is used to indicate whether
an error message is a standard Murmur message, part of a message stack, or some
special information from the Murmur server. Special information messages are used
to communicate events in the Murmur server which are of interest to DAQKER. At
the present time the following prefix types are defined:
• standard error message or head of a stack of messages
• stacked error message
• stop request
• logfile recycled
• client connect
• client disconnect
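One plausible way to frame such messages is a single prefix byte followed by a length code and the message text. The exact wire format used between Murmur and DAQKER is not reproduced here, so this encoding is purely illustrative.

```python
import struct

# Illustrative framing only: one byte of prefix, two bytes of length
# (network byte order), then the message text.
PREFIXES = {0: "standard", 1: "stacked", 2: "stop", 3: "logfile_recycled",
            4: "client_connect", 5: "client_disconnect"}

def frame(prefix, payload):
    body = payload.encode()
    return struct.pack("!BH", prefix, len(body)) + body

def unframe(data):
    prefix, length = struct.unpack("!BH", data[:3])
    return PREFIXES[prefix], data[3:3 + length].decode()

kind, text = unframe(frame(4, "node bosy06 connected"))
```

The length code lets the reader pull complete messages off the pipe even when several arrive back to back.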
Note that there is no communication from the DAQERI kernel to the Murmur
server.
5.1.2 DAQGUI Interface
DAQGUI processes are started by the DAQERI kernel. Each DAQGUI is passed
the names of two pipes as command line arguments. These pipes allow for two-way
communication between the DAQKER process and a DAQGUI.
Communication from GUI to kernel consists of the following messages:
• REQ_NODE_INFO: request the complete error status of a particular node and
enable immediate updates from the kernel if new errors for this node arrive.
• CANCEL_MONITOR: stop update messages for a given node.
•
•
•
DAQERI 26
• DISABLE_NODE: disable error recording for a node.
• ENABLE_NODE: enable error recording for a node.
• CLEAR_NODE: clear all errors for a node.
• RESET_NODE: reset a node to the GHOST state.
• CLEAR_ERROR: clear the status of a particular error for a given node.
• CHANGE_DISABLE: toggle the enable/disable state of a specific error of a
node.
• BROWSE_NODE: request that a browser be started for a specified node.
• QUIT: terminate the GUI session in response to a quit request from the user.
In addition to these messages there exist messages to perform each of the node
actions on collections of nodes, e.g. clear all nodes.
Kernel to GUI communication is via the following messages:
• NODE_STATUS: change the status of a node.
• NODE_RECORD: send a complete record of the status of each error for a
node.
• NODE_UPDATE: update the error count and status of an error for a node
which is being monitored by a node error window.
• KER_EXIT: the kernel is stopping. Stop the GUI process.
• TAPE_UPDATE: change the status of a tape monitor.
The kernel sends a variety of update messages to the GUI in response to both
changing error states and requests from the GUI. The GUI makes requests to the
kernel in response to actions by the user, for example to clear a particular error.
5.1.3 The DAQKER Data Structure
In order to get a clear understanding of how the kernel operates, a knowledge of its
central data structure is important. This structure is illustrated in Figure 5.1. The
major organizing principle of this data structure is the fact that the status of a node
varies from GUI to GUI, since users running separate GUIs may independently
clear or disable nodes.
When an error arrives from Murmur we are faced with the problem of mapping
it to a specific DAQ element. The Murmur message provides the error number
(containing the facility number), the node's Ethernet address and its process id (PID), so
these must form the basis for the search. Some DAQ elements are uniquely specified
by their Ethernet ID and facility number (if we know a priori that there will only be
one client of that facility type at that address). In such cases we search using only
the Ethernet address and facility number. If there are multiple instances of a facility
at a given Ethernet address then we must distinguish them by PID. Since we have no
way of knowing the PIDs ahead of time, we must expect some number of nodes with
this (Ethernet address, facility number) pair and assign them logical node numbers
as they send their first message.
The mapping from a Murmur message to a node record is handled by the physical
node table. This table consists of pointers to node records ordered by facility
number, Ethernet address and PID. Information on PIDs is not available at the time the
table is initialized. In cases where the PID is required to distinguish between multiple
instances of a facility, the PID is added when DAQERI receives the first message from
the node and the physical node table is then reordered. The DAQ description in the
configuration files may also specify that some number of a particular facility is
expected but that the Ethernet addresses are not known ahead of time. In such cases the
Ethernet address is also entered into the search table when the first message arrives.
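The two lookup paths can be sketched as follows. This sketch uses dictionaries rather than the ordered pointer table of the actual implementation, and all names and record contents are illustrative.

```python
# Sketch of the physical-node lookup: nodes that are unique at an address
# are keyed by (facility, Ethernet address); multiple instances of a
# facility at one address are distinguished by PID, learned on the first
# message from the node.
class PhysicalNodeTable:
    def __init__(self):
        self.by_addr = {}   # (facility, ethernet) -> node record
        self.by_pid = {}    # (facility, ethernet, pid) -> node record

    def add_unique(self, facility, ethernet, record):
        self.by_addr[(facility, ethernet)] = record

    def lookup(self, facility, ethernet, pid, make_record):
        rec = self.by_addr.get((facility, ethernet))
        if rec is not None:
            return rec
        key = (facility, ethernet, pid)
        if key not in self.by_pid:      # first message: bind the PID now
            self.by_pid[key] = make_record()
        return self.by_pid[key]
```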
Each node in the DAQ is also assigned a logical address which consists of the
facility number and a unique node number. Requests from the DAQGUI processes
indicate nodes by facility number and node number. The kernel must provide a
mapping from this logical address to a node record. Toward this end it maintains a table
of pointers to node records ordered by logical address.
Figure 5.1: DAQERI kernel data structure
For a given node there is some information common to all GUIs. In particular, the
GUI has a logical (instead of physical) view of the DAQ system and references nodes
by facility and node number. The node record in the kernel holds the facility and
node number for the (Ethernet address, facility, PID) triplet so that in communication
with the GUI the kernel can send the logical address of the node. The node record
also contains an array of pointers to GUI records, indexed by GUI number. Since
each GUI can independently disable or clear error information from a node, the error
status information must be kept independently for each GUI. The GUI record holds
a summary of the node's status. It keeps track of the overall status and maintains
counters of the number of distinct error messages which contribute to each of the
node status levels. For example, it keeps a count of the number of distinct warning
messages which have been received and remain active. As a particular error for a
node is cleared, the appropriate counter is decremented and the overall status of the
node is re-evaluated.
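The per-GUI bookkeeping might look like the following sketch; the severity levels, method names and re-evaluation rule are illustrative.

```python
# Sketch of a GUI record: count distinct active messages per severity;
# clearing one decrements its counter and the overall status is the most
# severe level with a non-zero count.
SEVERITY_ORDER = ["OK", "WARNING", "ERROR"]

class GuiNodeRecord:
    def __init__(self):
        self.status_cnt = {s: 0 for s in SEVERITY_ORDER}

    def report(self, severity):
        self.status_cnt[severity] += 1

    def clear(self, severity):
        if self.status_cnt[severity] > 0:
            self.status_cnt[severity] -= 1

    def overall(self):
        for s in reversed(SEVERITY_ORDER):   # most severe active level wins
            if self.status_cnt[s] > 0:
                return s
        return "OK"
```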
In addition to the overall status information, the kernel must retain information
for each node on a per-error basis. Each node record points to a table of errors which
has an entry for each error the node may generate. This table is indexed by Murmur
error code. The entries in this table point to a node error record which holds pointers
to records for each GUI and (for frequency based errors) an error chain. The GUI
node error record holds the present status of the error, the number of errors received
since it was last cleared and the time it was last cleared. The last cleared time is
used for frequency checked errors.
For frequency checked errors a linked list is used to hold the past error history. The
elements of the linked list are allocated dynamically and there is only one error chain
per frequency checked error. All GUIs refer to this error chain to determine their
status. Each element in the list is a "time bin" holding errors which occurred within
the time interval specified in the node's error configuration file. Frequency based
errors arise when the number of errors in a given time interval exceeds the threshold
for that error. This is determined on a per-GUI basis by starting from the tail of the
error chain (the oldest errors), pruning off those errors which have "aged out" and
then traversing the list until the bin time exceeds the last time the GUI cleared this
error. Errors are accumulated from this point to the head of the list. The total is then
compared to the error thresholds. The common information for each error, such as
frequency limits, is kept in a generic error table and is not duplicated for each node.
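The traversal described above can be sketched with a deque of time bins shared by all GUIs. The bin layout, parameter names and values are illustrative assumptions.

```python
from collections import deque

# Sketch: prune aged-out bins from the tail (oldest), then sum only the
# bins newer than this GUI's last-cleared time; the caller compares the
# total to the warning/error thresholds.
def gui_error_count(bins, now, time_constant, last_cleared):
    while bins and now - bins[0]["time"] > time_constant:
        bins.popleft()                       # this bin has aged out
    return sum(b["count"] for b in bins if b["time"] > last_cleared)

chain = deque([{"time": 1, "count": 3},      # oldest bin (the tail)
               {"time": 5, "count": 2},
               {"time": 9, "count": 4}])     # newest bin (the head)
total = gui_error_count(chain, now=12, time_constant=10, last_cleared=2)
```

Because pruning mutates the shared chain while the last-cleared cut is applied per call, one chain can serve several GUIs with different clear histories, as in the kernel.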
All updates in the data structure occur in response to an error from Murmur.
This leads to a problem with frequency checked errors: they are expected to age out
as time passes with no further errors, but the lists are only checked when a new error
arrives. This is handled by having the DAQERI kernel generate errors at the
appropriate (future) time. As a Murmur frequency checked error arrives, the time of the
oldest time bin is noted, and the error handling routine requests that a "non-Murmur"
error be scheduled to coincide with the time when this time bin should age out. The
DAQKER main loop injects it at the appropriate time. The error is distinguished
internally as DAQKER generated, so that the error counters are not incremented.
DAQERI necessarily interprets the error messages it receives to extract node and
facility information. In addition to receiving error information, commands to the
kernel can be sent via Murmur messages. Commands to start new GUI sessions are
handled in this manner.
5.2 DAQGUI
The DAQERI GUI is an interactive display which provides a graphical summary of
the status information retained by DAQKER. It makes use of the Motif library of
X window routines. The main loop of the DAQGUI process alternately handles X
events (button clicks etc.) and polls the pipe from the kernel for status updates. The
X events result either in more X events being created (e.g. a button click results in
a new window being opened) or in the generation of a message to the kernel (e.g.
clear a node). The DAQGUI changes node colors in response to updates from the
kernel. This means that a request to disable an error results in a message to the
kernel, followed by an update from the kernel to change the node status to disabled.
The DAQGUI maintains a data structure similar in concept to that used by
DAQKER, with one significant difference: the GUI only needs to be concerned with
the status of its own nodes and does not need to keep multiple, independent copies
of the status of nodes. Messages from the kernel contain a message type, facility and
node numbers and a parameter. Depending on the message, subsequent information
may also be sent. References in the DAQGUI data structure are based on facility and
node number.
The GUI does not keep information on a per-error basis, unless the user has
expanded a node into its error display window. Normally the only information a GUI
retains is the overall status of the node. Each GUI node record also contains a pointer
to an icon record, indicating the icon that the node is represented by. This icon record
keeps count of how many nodes are in each error state and this information is used
to determine the icon color. Likewise, each icon has a pointer to the icon above it in
the hierarchy to allow changes in a node's status to be propagated up through
the hierarchy of icons to the top level.
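This upward propagation can be sketched as follows; the class layout, state names and color rule are illustrative.

```python
# Sketch of icon records: each icon counts how many of its nodes are in
# each state, and a node's state change ripples up the parent chain.
class Icon:
    def __init__(self, parent=None):
        self.parent = parent
        self.counts = {"OK": 0, "WARNING": 0, "ERROR": 0}

    def node_changed(self, old, new):
        if old is not None:
            self.counts[old] -= 1
        self.counts[new] += 1
        if self.parent:                  # propagate up to the top level
            self.parent.node_changed(old, new)

    def color(self):
        if self.counts["ERROR"]:
            return "red"
        if self.counts["WARNING"]:
            return "yellow"
        return "light green"
```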
When a user expands a node into an error list, a request is sent to the kernel for
full error information. The kernel responds by sending the error count and status of
each error table entry for that node. In addition the kernel also records the fact that
the node is being monitored, so that as new error information arrives the information
in the GUI window is updated immediately. The GUI sends a message to disable this
monitoring when the user closes the window.
In addition to the node error information, the GUI processes also provide status
information. Information is extracted from Murmur messages by the kernel and
updates are sent to the GUIs. Status information is kept in a separate status box below
the DAQERI main window (see e.g. Figure 4.2).
5.3 Browse
The browser is a simple program which reads through the logfiles in reverse time
order and compares the error entries to the user-specified selection cuts. The most
recent 100 errors which pass the cuts are displayed in the main window. The user
may specify which fields of the error messages are to be displayed by means of buttons
along the bottom of the browser main window.
When launched from within DAQERI, the browser is started by the DAQERI
kernel on behalf of a specific DAQERI GUI. The "Browse Logfile" button is located
in a window for a specific DAQ node, and when the browser is started in response to a
click on this button it is passed cuts for node number and PID so that by default it will
extract only logfile messages associated with that node. These cuts can be changed
interactively within the browser to either narrow or widen the group of errors selected
by the browser.
5.4 How an Error is Handled
We now describe how an error from a DAQ element is processed by the complete
error handling system. We also use this opportunity to present some details about
the operation of Murmur.
A DAQ client realizes that an error condition has arisen. It responds by sending
an error message via a Murmur routine. If this is the first error message from this
source then the error routine attempts to connect to the Murmur server via a TCP/IP
connect socket. As part of the connect message the client sends its Ethernet address,
process id and other fixed information, so that this does not have to be repeated with
each subsequent error message. The Murmur server accepts the request and
establishes a new socket which provides a dedicated TCP/IP connection from the client to
the Murmur server. The client then sends all subsequent error messages to this socket.
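The handshake can be illustrated with a toy TCP server and client; the identity string, message contents and newline framing are invented for the example, not Murmur's actual protocol.

```python
import socket
import threading

# Toy sketch: the client sends its fixed identity once on connect; later
# error messages reuse the dedicated per-client connection.
def serve_one_client(listener, results):
    conn, _ = listener.accept()              # dedicated socket for this client
    with conn, conn.makefile() as stream:
        identity = stream.readline().strip() # fixed info, sent only on connect
        error = stream.readline().strip()    # a subsequent error message
        results.append((identity, error))

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
results = []
server = threading.Thread(target=serve_one_client, args=(listener, results))
server.start()

client = socket.create_connection(listener.getsockname())
client.sendall(b"eth=08:00:2b:aa:bb:cc pid=123\n")
client.sendall(b"FRC readout timeout\n")
client.close()
server.join()
listener.close()
```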
The Murmur server polls its receive sockets and finds the message from the client.
This message is unpacked and the error code is looked up in the central Murmur
database. This database provides the full error text and indicates if the message has
any parameters, help text or action scripts associated with it. The expanded error is
written to the Murmur logfile and sent to any Murmur display windows which have
their routing set to accept it. The message is also sent down a pipe to the DAQERI
kernel.
The DAQERI kernel extracts the Ethernet address, facility number and PID from
the error. It uses this information to get a pointer to a node record. The error table
indicated by the node record is searched to get to the node error record. The error
count for all GUIs which have not disabled this error is then updated. Any GUI
sessions which have a window open for this node are sent a message indicating that
the error count (and perhaps status) have changed. For each GUI in the node record
the overall node status is updated. If the node state in any of the GUI records changes
as a result then a message is sent to the corresponding GUI indicating the new status.
The GUI receives a message indicating that the node status has changed and it
responds by changing the color of the node button. It then checks the icon which
represents this node and determines if the change in the node state results in a change
in the icon state. If this node message results in a new icon state then the colors of
the icons above this node in the hierarchy are altered. This alerts the user that an
error has occurred.
The user now clicks on the icon and gets a window of node buttons. Clicking
on the node button results in a display indicating all the errors for the node, with
those contributing to the error state distinguished by color. The user may clear or
disable the error, resulting in a message to perform the action being sent to the
kernel. Alternately, the user may request to browse the logfile. This results in a
browse request being sent to the kernel, which then starts a browser. By selecting
"update" in the browser the most recent error messages from this node are displayed
and the full error text is available to the user.
Chapter 6
Conclusions
The error reporting system described here has been installed in the CDF control room
and is being integrated into the system. It is anticipated that it will be used as part of
run 1B. The system meets all the requirements set for it, but at the present time not
all elements of the DAQ report errors via Murmur. This is due to an understandable
reluctance to rewrite the working code in existing DAQ elements. The level 3 system
in particular continues to make use of its custom graphical error display, and it is our
opinion that this display should continue to be used (although it could in time be
augmented by error messages to Murmur).
The DAQERI/Murmur system fulfills all of the requirements outlined in Section 3.1.
Murmur alone satisfies many of these by providing a central error logging system and error
sending routines which can run on all the new DAQ elements. The extensions
provided by DAQERI provide for the graphical monitoring of the DAQ in an intuitive
fashion.
As with any software system of this kind, the evolution of both Murmur and
DAQERI is an ongoing process. As detector operators become familiar with the error
reporting system, requests for changes in functionality and feature enhancements are
inevitable. However, we believe our solution provides the essential elements for error
monitoring of the detector and foresee this system being used in one form or another
for the remainder of the CDF experiment.
Bibliography
[1] F. Abe et al., Nucl. Instr. and Meth. A271 (1988) 387-403.
[2] The CDF Collaboration, "CDF Data Acquisition System Upgrade", CDF Internal
Document, April 30, 1992.
[3] G. Drake et al., Nucl. Instr. and Meth. A269 (1988) 68-81.
[4] E. Barsotti et al., Nucl. Instr. and Meth. A269 (1988) 82-92.
[5] L. Appleton, C. Moore, G. Oleynik, G. Sergey and L. Udumula, "Murmur User
Guide", Fermilab Document PN457, 1993.
[6] C. Moore, "Murmur Quick Reference Guide", Fermilab Document PN472, 1993.
[7] K. Biery, P. Musgrave, K. Ragan, K. Strahl, A. Hölscher and P. Sinervo, "An
error reporting interface for the upgraded CDF DAQ system", Conference Record
of the 8th Conference on Real Time Computer Applications in Nuclear, Particle
and Plasma Physics, June 8-11, 1993, Editor R. Poutisson, Vancouver, Canada.
[8] K. Biery, P. Musgrave, K. Ragan and K. Strahl, "DAQERI Expert Manual",
CDF Note CDF/DOC/ONLINE/CDFR 2334, 1993.