update of ibl rod proposal based on rce/cim concept and atca platform rainer bartoldus, andy haas,...
1
Update of IBL ROD proposal based on RCE/CIM concept and ATCA platform
Rainer Bartoldus, Andy Haas, Mike Huffer, Martin Kocian,
Su Dong, Emanuel Strauss, Matthias Wittgen (SLAC)
Erik Devetak, Dmitri Tsybychev (Stony Brook)
2
IBL ROD Upgrade Scheme
Initial mode: pure ROD behavior to output via S-link to ROS from each ROM.
Upgrade Mode: combined ROD+ROS behavior directly output to Ethernet.
Read Out Module
3
Essential Features of RCE on ATCA
• Generic DAQ concept with RCE born out of analysis of previous HEP DAQ systems to establish basic building blocks serving common needs of broad range of applications.
• Explore modern System-On-Chip technology, e.g. Virtex-4 FPGAs with versatile integrated resources.
• High speed I/O capabilities for multi Gb/s transmissions to fully utilize FPGA processing power and reduce system footprint.
• Implementation over ATCA based crate infrastructure to benefit from modern telecommunication technology.
• A system consists of RCE processing boards and Cluster Interconnect Modules (CIM) to utilize ATCA point-point serial backplane connections for high bandwidth data movements and 10GE ethernet access.
• Rear Transition Modules (RTM) to facilitate custom user I/O.
• Extensive software infrastructure and utilities are an integral part of the design.
4
Reconfigurable Cluster Element (RCE)
[Block diagram labels: MGTs; DX ports; DSP tiles (192 MAC units); combinatoric logic; processor (450 MHz PPC-405) with cross-bar; memory subsystem (512 MByte RLDRAM-II); configuration (128 MByte flash); boot options (reset & bootstrap options); core; resources]
Current implementation on Virtex-4 FPGA
Next generation with Virtex 5: RCE memory 1-2 GBytes
+ Extensive associated software infrastructure and utilities
5
RCE Software & Development
• Cross-development…
– GNU cross-development environment (C & C++)
– remote (network) GDB debugger
– network console
• Operating system support…
– Bootstrap loader
– Open Source Real-Time kernel (RTEMS)
  • POSIX compliant interfaces
  • Standard IP network stack
– Exception handling support
• Object-Oriented emphasis:
– Class libraries (C++)
  • Plugin support
  • Configuration Interface
6
A 48 channel ROM (Read Out Module)
2x6
Need update for S-link from CIM switch to P3
7
RCE Development Lab at CERN
HSIO & pixel module at back of rack
8
RCE Development Status
• An open collaboration of anyone interested in exploring the RCE platform for ATLAS upgrades.
• Training workshop June/09 at CERN:
http://indico.cern.ch/conferenceOtherViews.py?view=standard&confId=57836
contains documentation, online examples, mailing list instructions, RCE lab account signup etc. Everyone is welcome to explore!
• Work is underway to port current pixel calibrations to RCEs with modern pixlib+TDAQ code, aiming at FE-I4 tests, test beam and IBL stave-0.
• A compact RCE+HSIO test stand board is planned for Feb.
• Full set of prototypes in coming months to demonstrate integration of IBL I/O, S-link, TTC interface.
• A significantly upgraded generation-2 RCE with Xilinx Virtex 5 is envisioned this year (more memory and user firmware space).
9
IBL Upgrade Hardware Components (I)
• ROM
– Regular ROM assumes all functionalities of the present ROD, with room to host ROS functionalities.
– Each ROM has 6 RCEs (FPGAs) hosting 12 cores to process 48 x 160 Mb/s FE inputs (only ~1/10 to 1/5 of the RCE I/O capacity).
– RCE includes all resources for data formatting, DAQ data flow, calibration + memory in present ROD.
– Each ROM has one additional CIM FPGA hosting network switch, S-link data gathering/formatting, TTC interface.
• RTM (ROM)
– Assumes only the front-end communication roles of the present BOC, while only hosting simple drivers for S-links.
– 48 channel compact optical components for TX/RX functionalities.
– No need to deal with 8b/10b encoding as the RCE has embedded native utilities to encode/decode.
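The channel and bandwidth figures quoted for the ROM can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the constants are the numbers on this slide (2 cores per RCE follows from 6 RCEs hosting 12 cores), while the function name and dictionary keys are made up for illustration.

```python
# Illustrative arithmetic only; constants are the figures quoted on the slide.
RCES_PER_ROM = 6        # RCEs (FPGAs) per ROM
CORES_PER_RCE = 2       # 12 cores spread over 6 RCEs
CHANNELS_PER_CORE = 4   # 48 channels spread over 12 cores
FE_LINK_MBPS = 160      # input rate per front-end channel


def rom_input_budget(rces=RCES_PER_ROM, cores_per_rce=CORES_PER_RCE,
                     channels_per_core=CHANNELS_PER_CORE,
                     link_mbps=FE_LINK_MBPS):
    """Return core/channel counts and input bandwidth per RCE and per ROM."""
    cores = rces * cores_per_rce
    channels = cores * channels_per_core
    return {
        "cores": cores,
        "channels": channels,
        "per_rce_mbps": cores_per_rce * channels_per_core * link_mbps,
        "per_rom_mbps": channels * link_mbps,
    }


budget = rom_input_budget()
print(budget)
```

With the slide's numbers this gives 12 cores, 48 channels, 1280 Mb/s of front-end input per RCE and 7680 Mb/s per ROM.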
10
IBL Upgrade Hardware Components (II)
• CIM
– Assumes the network interconnect management and external interface roles to cover present SBC and TIM functionalities.
– RCE master + 2 Fulcrum FM224 ASICs for 10-GE network switching.
• RTM (CIM)
– Ethernet I/O connections.
– Some functionalities of present TIM and drivers for I/O with the pixel system TTC crate.
• There is no longer a dedicated TIM module in the system. TTC interface is distributed with TTCrx ASIC next to each RCE.
• There is no longer an SBC in the system. The role of the CPU in SBC for interfacing TDAQ to VME board is taken up by the distributed RCE CPUs (therefore avoiding the limitations of single Ethernet port and VME backplane bandwidth at the SBC).
11
TTC Distribution in RCE/CIM crate
Distributed interface with TTCrx ASIC paired with each RCE
QPLL will be absorbed into RCE
Half crate
12
Interface to Pixel Detector
13
ROM RTM Interface
• General strategy is to bring plenty of signals from each RCE to the P3 connector in a uniform way, keeping the ROM completely generic while the RTM can be changed for different applications.
• Connection count for IBL RTM:
– 1 pair of wires for clock40 from each RCE (2 cores)
– 4 wires (2 up + 2 down) per channel per core (2 cores/RCE, 4 channels/core)
– 3+1 wires of I2C per core (+1 = spare)
– additional 8 spare wires per core?
7 RCE x (2 + 4x4x2 + 4x2 + 8x2) = 406
Current RCE board already has 500 pin P3
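The connection count can be reproduced term by term. This is a sketch of the slide's arithmetic only: the factor of 7 is taken as written in the formula, and the function and variable names are illustrative.

```python
# Reproduces the P3 wire count from the slide; names are illustrative.
RCE_COUNT = 7            # as written in the slide's formula
CORES_PER_RCE = 2
CHANNELS_PER_CORE = 4
P3_PINS = 500            # pins on the current RCE board's P3 connector


def p3_wires_per_rce():
    """Wires one RCE brings to P3 for the IBL RTM."""
    clock = 2                                      # one pair for clock40
    data = 4 * CHANNELS_PER_CORE * CORES_PER_RCE   # 2 up + 2 down per channel
    i2c = (3 + 1) * CORES_PER_RCE                  # 3 I2C wires + 1 spare per core
    spare = 8 * CORES_PER_RCE                      # additional spares per core
    return clock + data + i2c + spare


per_rce = p3_wires_per_rce()
total = RCE_COUNT * per_rce
margin = P3_PINS - total
print(per_rce, total, margin)
```

Each RCE contributes 58 wires, for 406 in total, which leaves 94 of the 500 P3 pins unused.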
14
Upgrade ROM Benefits
• Allow more frequent/extensive/faster calibration
– Calibration histogram data output path via 10-GE Ethernet will completely remove data shipping timing concerns.
– 4x (12x) more memory per pixel than current VME ROD for IBL (current outer layer), and the memories are internal to the RCE with much faster access.
– PowerPC programming environment is much easier than DSPs for complex algorithms, while the 192 DSP tiles/RCE offer large processing power for repetitive simple processing.
• Smaller-footprint modern hardware for easier production, installation and maintenance.
• Simpler variation of the ROM with present RCEs offers prototype and test stand boards to meet FE-I4 test and stave test needs, with the same software preserved into the full system.
• Has built-in architecture evolution flexibility to explore upgrade schemes such as integrated ROD+ROS and potential services to trigger with the very high bandwidth.
15
Backward Compatibility & Commissioning
• Despite the different look of the hardware, the user interface will be no different from that of the existing pixel detector, and the interface to the rest of pixel DAQ and TDAQ will also look like just another pixel crate (until we try to become ROD+ROS).
• New system can also be made to run on the present b-layer so that fiber splitting can be done early on with the real system as parasitic DAQ commissioning (as extensively used in BaBar/Tevatron). Old b-layer RODs can become (plenty of) spares for outer layers.
• The most important issue is software compatibility:
– Most existing calibration DSP code is adaptable to the RCE CPU, which is a much easier environment.
– The SBC TDAQ interface and infrastructure code can also run on the RCE.
VME is not the magic word to guarantee software backward compatibility, while flexible modern hardware may do better than naïve prejudice...
16
Application of RCE to Pixel Calibration
Pixel Digital Calibration Demo by Martin Kocian
[Panels: "After a few mask stages"; "End of calibration"]
Demonstrated at RCE training workshop Jun/15-16/2009 at CERN
[Setup diagram labels: existing pixel module; HSIO; 3 Gb/s; /CIM; 10-GE Ethernet]
17
Calibration Software Progress
• June setup was a bit of a hack on old pixlib, while we would like to have software written once for test stand -> full DAQ system.
– Martin created a new framework adapting the current Pixel Action Server based on the TDAQ IS/IPC infrastructure. This now runs on RCE. Compatibility with full system DAQ/calibration!
• June setup had only the digital calibration
– Matthias ported the threshold scan DSP code to RCE and had it running after a couple of weeks. Needed to convert a few floating point ops to i64 for the fit.
• June setup data formatting software was a bit slow
– Got some magic code from JJ Russell that sped it up by ~200x! Erik and Martin debugged and commissioned the fast formatter.
• June setup configuration was a bit slow
– Martin and Erik commissioning blockwrite RCE firmware update
• June setup was using a command line control
– Emanuel/Andy integrating ST control interface
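To illustrate what converting floating point ops to 64-bit integers can look like, here is a minimal sketch. The slides do not show the ported code or the fit itself, so everything here is an assumption: the estimator is a simple stand-in (S-curve midpoint from summed hit counts, threshold = x_last + step/2 - step * sum(hits) / n_inj), and the Q16 fixed-point format and all names are hypothetical.

```python
# Hypothetical stand-in, not the ported DSP code: S-curve midpoint estimated
# from summed hit counts, once in floating point and once in integers only.
FRAC_BITS = 16                 # assumed Q16 fixed-point format
INT64_MAX = (1 << 63) - 1      # range a signed 64-bit integer can hold


def scurve_threshold_float(x_last, step, hits, n_inj):
    """Floating-point reference: midpoint of a rising S-curve."""
    return x_last + step / 2 - step * sum(hits) / n_inj


def scurve_threshold_fixed(x_last, step, hits, n_inj):
    """Same estimate in integer arithmetic, returned in Q16 fixed point."""
    scaled = (step * sum(hits)) << FRAC_BITS
    if scaled > INT64_MAX:
        raise OverflowError("intermediate would not fit in a signed 64-bit integer")
    # step/2 becomes a shift by one bit less; the division happens last
    # so that precision is kept in the scaled numerator.
    return (x_last << FRAC_BITS) + (step << (FRAC_BITS - 1)) - scaled // n_inj


# Made-up scan: 10 charge steps from 0 to 90, 100 injections per step.
hits = [0, 0, 0, 2, 16, 50, 84, 98, 100, 100]
fixed = scurve_threshold_fixed(90, 10, hits, 100)
reference = scurve_threshold_float(90, 10, hits, 100)
print(fixed / (1 << FRAC_BITS), reference)
```

The point of the sketch is the pattern: scale once, keep every intermediate in integers, and divide as late as possible.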
18
Software Time Line
• On track to demonstrate the full calibration chain for digital and threshold scans by the IBL general meeting in Feb., with a modern pixel TDAQ code infrastructure and hopefully the ST control interface.
• Full set of calibrations can be ready by late summer this year for FE-I4 testing.
• Multi-channel readout with existing RCE+HSIO is already being explored and can be expected to be in operation sometime this year for test beam and cosmic telescope if priority is assigned.
19
Hardware Time Line
• Existing RCE + HSIO
– Can already run all calibrations for FE-I4/sensor tests without optical link. HSIO serves the “eBOC” function (and much more) compared to the old system.
– With a simple new RTM can also run multi-channel test beam and cosmic telescope and even stave-0 (may be a bit slow).
• +Prototype RTM (spring this year?):
– IBL and current pixel optical link validation
– S-link RCE plugin validation
• +Gen-1 RCE prototype ROM and CIM with TTC (summer?)
– TTC interface validation
• +Gen-2 RCE full ATLAS ROM/CIM prototypes (early 2011)
– Full blown multi-channel DAQ/calibration tests.
– Ready for stave-0.
20
RCE+HSIO combined board
• RCE boards have a strong software base for flexible and fast development, but are rather bulky with the ATCA crate infrastructure and have excess resources not needed for a test stand.
• HSIO has the large variety and multiplicity of I/O channels to serve a wide range of applications, but the vast FPGA resources are not easy to explore with coding only in firmware.
• Dave Nelson is working on a combined test stand board merging RCE and HSIO:
– A slimmed down single FPGA RCE and software support
– A separate Virtex-5 FPGA plays the original HSIO role
– Same variety of I/O channels as HSIO
– Same simple stand alone bench operation as HSIO with just an external 48V, but can also just plug into an ATCA crate
• Expecting to roll for production in Feb/Mar with a rather large demand (~15) from strip upgrade.
21
Summary
• RCE/ATCA based ROD for IBL can easily meet the IBL ROD requirements and offers extra margin for much improved performance.
• The project is very much realizable on the IBL time frame owing to the well advanced R&D already carried out at SLAC for other projects, so that the manpower needs are not excessive.
• The upgrade system has a small hardware footprint and moderate cost.
• The application software effort can benefit from integrated core software utilities and now has a clear path forward for fast progress in calibration implementation.
• This can be a very beneficial forward looking step for ATLAS pixel and DAQ in general to evolve smoothly into a modern architecture with extra capacity to allow potentially more innovative use of the pixel system (e.g. in trigger).
Additional collaborating efforts are very much welcome!
22
Backup
23
RCE Hardware Resources
• Multi-Gigabit Transceivers (MGTs)
– up to 12 channels of:
  • SER/DES
  • input/output buffering
  • clock recovery
  • 8b/10b encoder/decoder
  • 64b/66b encoder/decoder
– each channel can operate at up to 6.5 Gb/s
– channels may be bound together for greater aggregate speed
• Combinatoric logic
  • gates
  • flip-flops (block RAM)
  • I/O pins
• DSP support
– contains up to 192 Multiply-Accumulate (MAC) units
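A short sketch of what the two line encodings cost in payload bandwidth. It assumes the quoted 6.5 Gb/s is the serial line rate (the slide does not say whether it is line rate or payload rate); the 8/10 and 64/66 efficiencies follow directly from the encoding names. Function and variable names are illustrative.

```python
# Payload bandwidth after line encoding, assuming 6.5 Gb/s is the line rate.
from fractions import Fraction

MGT_CHANNELS = 12                      # "up to 12 channels"
LINE_RATE_GBPS = Fraction(13, 2)       # 6.5 Gb/s, kept exact
ENCODINGS = {
    "8b/10b": Fraction(8, 10),         # 8 payload bits per 10 line bits
    "64b/66b": Fraction(64, 66),       # 64 payload bits per 66 line bits
}


def payload_rate_gbps(encoding, channels=1, line_rate=LINE_RATE_GBPS):
    """Payload rate in Gb/s for the given encoding and number of bonded channels."""
    return channels * line_rate * ENCODINGS[encoding]


for name in ENCODINGS:
    single = payload_rate_gbps(name)
    bonded = payload_rate_gbps(name, channels=MGT_CHANNELS)
    print(f"{name}: {float(single):.2f} Gb/s per channel, "
          f"{float(bonded):.2f} Gb/s over {MGT_CHANNELS} channels")
```

Under that assumption, 8b/10b leaves 5.2 Gb/s of payload per channel (62.4 Gb/s over 12 channels), and 64b/66b leaves about 6.30 Gb/s per channel.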
24
The Cluster Interconnect (CI)
• Based on two Fulcrum FM224s
– 24 port 10-GE switch
– is an ASIC (packaging in 1433-ball BGA)
– XAUI interface (supports multiple speeds including 100-BaseT, 1-GE & 2.5 Gb/s)
– less than 24 watts at full capacity
– cut-through architecture (packet ingress/egress < 200 ns)
– full Layer-2 functionality (VLAN, multiple spanning tree etc.)
– configuration can be managed or unmanaged
[Diagram labels: management bus; RCE; 10-GE L2 switch; Q0, Q1, Q2, Q3]
25
Cluster Interconnect board + RTM (Block diagram)
[Block diagram labels: MFD; CI; Q0, Q1, Q2, Q3; P2; P3; Payload RTM; XFP; 1-GE; 10-GE; base; fabric]
26
sLHC Upgrade Read-Out-Crate (ROC)
[Crate diagram labels: ROMs; CIM; Rear Transition Module; P3; backplane; 10-GE switch; switch management; L1 fanout; shelf management; from L1; to monitoring & control; to L2 & Event Building; (x12) 10 Gb/s; (x4) 10 Gb/s]
27
ROD busy