LECC2003 Amsterdam Matthias Müller
A RobIn Prototype for a PCI-Bus based
Atlas Readout-System
B. Gorini, M. Joos, J. Petersen (CERN, Geneva)
A. Kugel, R. Männer, M. Müller, M. Yu (University of Mannheim)
B. Green (Royal Holloway University London)
G. Kieft (NIKHEF, Amsterdam)
Outline
• Overview
• The Atlas Readout Sub-System (ROS)
• PCI based Atlas ROS
• The RobIn Prototype
• Measurements
• Conclusions
Overview
• The PCI-based ROS is one of the two implementation options for the Atlas ROS.
  • Uses a custom PCI board (RobIn) for receiving and buffering data
  • Host is a PC with multiple PCI buses
  • Gigabit Ethernet connection to LVL2 and EF
  • PC runs multithreaded software and a Master-DMA based PCI messaging scheme
• Data request rates of 170kHz@1kB measured
• Full scale system achieves LVL1 Rate of 130kHz@1kB (with GE Net I/O)
Atlas Readout Subsystem Overview
• Buffers detector data while LVL2 computes the trigger decision
• 1600 links from detector
• up to 160 MB/s input bandwidth, 100kHz input rate.
• 2 kHz output to LVL2 on request via Gigabit Ethernet
• Output to Event Filter on event accept (~3kHz)
[Figure: block diagram of the ROS between the ATLAS Detector and LVL2/EF, with arrows labelled "data" and "decision"]
Atlas Readout Subsystem
[Figure: ROD crates (VME bus, RODs, ROD Crate Processor; config & control, event sampling & calibration data) send data over ROLs to ROS PCs (PCI bus, RobIns, NIC); Gigabit Ethernet links lead to the LVL2 & Event Builder networks; alternative data paths are indicated. Location labels: in USA15 (underground), in SDX15 (at surface)]
• 90 crates (~40 racks)
• 144 4U PCs (~15 racks)
• 1600 links (HOLA S-link, 160 MByte/s per link)
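The link and PC counts quoted for the readout subsystem allow a quick sizing check. The sketch below is only an illustration: the function names are hypothetical, and the figure of 4 ROLs per RobIn is an assumption taken from the emulated MPRACE configuration used in the measurements, not a property of the final system.

```python
import math


def robins_per_pc(links, ros_pcs, rols_per_robin):
    """Number of RobIn boards a ROS PC needs to host its share of the links."""
    links_per_pc = links / ros_pcs  # 1600 / 144 is about 11.1 links per PC
    return math.ceil(links_per_pc / rols_per_robin)


def max_mean_fragment_bytes(link_bandwidth_bytes_s, input_rate_hz):
    """Largest mean fragment size a link can carry at the given input rate."""
    return link_bandwidth_bytes_s / input_rate_hz


# Assumption: 4 ROLs per RobIn, as in the emulated measurement setup.
print(robins_per_pc(1600, 144, 4))
# 160 MB/s per link at a 100 kHz input rate.
print(max_mean_fragment_bytes(160e6, 100e3))
```

Under these assumptions a ROS PC serves about 11 links, which fits the configuration of 12 ROLs on 3 RobIns used in the measurements, and a link running at 100 kHz can carry fragments of up to 1600 bytes on average.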
PCI based Atlas ROS: Hardware
• Available: 2 GHz, 2.4 GHz and 3 GHz Xeon PCs
• OS: Linux CERN RedHat 7.3, 2.4.18 kernel (patched)
[Figure: ROS PC block diagram. Labels: CPU (2.4GHz), DDR RAM memory, ~2GB/s, four PCI 64bit/66MHz buses at 532MB/s each, SCSI, 2xFE/GE, Slots 1 to 6, five RobIns, GEth]
PCI based Atlas ROS: Software
• ROS software is multi-threaded
• Fragment Manager interface for RobIn hardware abstraction
[Figure: ROS software structure (legend: thread, process). Requests (L2, EB, Delete) enter a Request Queue served by Request Handlers, which reach the RobIns over the PCI bus through the Fragment Manager; further elements: Trigger, Control/error, Fragments]
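The structure on this slide (a request queue served by several request handler threads that reach the hardware through a Fragment Manager) can be sketched as follows. This is a minimal illustration, not the ROS software itself: `FakeFragmentManager`, `run_handlers` and the request tuples are hypothetical names, and the fragments live in an in-memory dictionary instead of a RobIn.

```python
import queue
import threading


class FakeFragmentManager:
    """Hypothetical stand-in for the Fragment Manager; keeps fragments in memory."""

    def __init__(self, fragments):
        self._fragments = dict(fragments)
        self._lock = threading.Lock()

    def get(self, event_id):
        with self._lock:
            return self._fragments.get(event_id)

    def delete(self, event_id):
        with self._lock:
            self._fragments.pop(event_id, None)


def run_handlers(requests, manager, n_handlers=4):
    """Serve (kind, event_id) requests with n_handlers worker threads."""
    request_queue = queue.Queue()
    responses = []
    responses_lock = threading.Lock()

    def handler():
        while True:
            item = request_queue.get()
            if item is None:  # sentinel: no more requests for this thread
                break
            kind, event_id = item
            if kind == "delete":
                manager.delete(event_id)
            else:  # "l2" or "eb" data request
                data = manager.get(event_id)
                with responses_lock:
                    responses.append((kind, event_id, data))

    threads = [threading.Thread(target=handler) for _ in range(n_handlers)]
    for t in threads:
        t.start()
    for request in requests:
        request_queue.put(request)
    for _ in threads:
        request_queue.put(None)  # one sentinel per handler thread
    for t in threads:
        t.join()
    return responses
```

The number of handler threads is a parameter here because the measurements vary it; the order of the returned responses is not deterministic when more than one handler runs.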
The RobIn Prototype
[Figure: RobIn Prototype board and its position in the ROS data flow between the ATLAS Detector and LVL2/EF]
Board components:
• 2 optical HOLA S-Link input channels
• Xilinx XC2V1500 FPGA
• 128MB SDRAM Buffer
• 4MB SRAM Management RAM
• IBM PowerPC 405CR Processor
• PLX 9656 PCI Bridge
• Gigabit Ethernet Interface
The RobIn Prototype (2)
[Figure: message passing across the PCI bus between the ROS software (Fragment Manager, DMA Buffer holding SOH and Event Data) and the RobIn (PowerPC, PLX9656, FPGA with DMA Engine, Request FIFO, DMA FIFO and Event Data Buffer); messages: Clear Request, Data Request, Data Response, Event Data]
• Requests to the RobIn: data requests are sent by PCI single cycles, clear requests by PLX Bus Master DMA
• Event data from the RobIn: the FPGA sends the fragment without its first word; the first word is transmitted last to signal end-of-transfer
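The end-of-transfer signalling described above (fragment body first, first word last) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the RobIn firmware: `EMPTY`, `device_writes` and `host_receive` are hypothetical names, the host is assumed to pre-fill word 0 of its DMA buffer with a marker value that never occurs as a valid first word, and writes are assumed to arrive in the order they were issued.

```python
EMPTY = 0  # assumed marker the host stores in word 0 before issuing a request


def device_writes(fragment):
    """Order in which the device writes (index, word) pairs: word 0 goes last."""
    body = [(i, word) for i, word in enumerate(fragment) if i > 0]
    return body + [(0, fragment[0])]


def host_receive(fragment):
    """Simulate one transfer into a host buffer.

    Returns (number of writes seen when word 0 became valid, buffer content).
    The fragment must be non-empty and its first word must differ from EMPTY.
    """
    buffer = [EMPTY] * len(fragment)
    for n, (index, word) in enumerate(device_writes(fragment), start=1):
        buffer[index] = word
        if buffer[0] != EMPTY:  # host poll: first word present, transfer done
            return n, buffer
    raise RuntimeError("first word never arrived")


fragment = [0xEE1234EE, 10, 20, 30]  # made-up header word plus three data words
writes_seen, received = host_receive(fragment)
print(writes_seen, received == fragment)
```

Because the first word is the last one written, the host only has to poll a single memory location: once it changes, the rest of the fragment is already in the buffer, and no separate completion message or interrupt is needed.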
Measurements
• Initially the RobIn Prototype was not available
• All presented measurements (except one) were made with alternative RobIn hardware, MPRACE1
• MPRACE1: general-purpose PCI-based FPGA co-processor
  • FPGA and PCI bridge identical to the RobIn Prototype
  • FPGA-only board, no PowerPC processor available
  • Implements the same PCI messaging as the RobIn Prototype
• Measurements on three different PCs: a 2GHz Xeon, a 2.4GHz Xeon and a 3GHz Xeon
Measurements - Multi-threading -
• Bare data request performance with 1 RobIn, no I/O to Gigabit Ethernet
• Variation of Request Handler threads shows maximum at 14
[Figure: Variation of Request Handlers (2.4 GHz PC, 1 MPRACE RobIn, no Net I/O, 1kB fragments); x-axis: # of Request Handlers (0 to 35), y-axis: Request Rate [kHz] (0 to 180)]
Measurements - Fragment Size Dependency -
• MPRACE: up to 512 bytes, the fixed request overheads overlap with the transmission of the returning fragment data from the RobIn, so the fragment size dependency is very small
• RobIn Prototype: the comparison with MPRACE seems to be valid; up to 1kB there is no fragment size dependency
[Figure: Fragment Size Dependency (2.4 GHz PC, no Network I/O, 1 RobIn); x-axis: Fragment Size [bytes] (0 to 1800), y-axis: Request Rate [kHz] (0 to 250); series: MPRACE, RobIn Prototype]
Measurements - Influence of DC I/O -
• 4 ROLs per RobIn (MPRACE) emulated
• Network I/O to LVL2 and EF reduces performance by a factor of 3.
[Figure: Effect of Network I/O; x-axis: Fraction of LVL1 events accepted by LVL2 trigger (%) (0 to 12), y-axis: Maximum sustainable LVL1 rate (kHz) (0 to 600); series: With Net I/O, Without Net I/O]
Rate of LVL2 data requests per ROS = 16% of LVL1 rate; LVL2 mean data size = 1.4 KByte
Measurements - DC I/O and CPU scalability -
• 4 ROLs per RobIn (MPRACE) emulated
• Moving towards a 3 GHz PC improves performance by ~25%.
[Figure: CPU scalability; x-axis: Fraction of LVL1 events accepted by LVL2 trigger (%) (0 to 12), y-axis: Maximum sustainable LVL1 rate (kHz) (0 to 200); series: ROS PC @ 3 GHz, ROS PC @ 2 GHz]
Rate of LVL2 data requests per ROS = 12% of LVL1 rate; LVL2 mean data size = 1 KByte
Conclusions
• Max. request performance per RobIn is 170 kHz (1kB fragment size).
• “Standalone” ROS can handle 12 ROLs on 3 RobIns with 300 kHz LVL1 input rate.
• Full scale ROS System (3GHz Xeon PC) handles 130 kHz LVL1 input rate (> Atlas requirements)
• First measurements with RobIn Prototype confirm the results obtained with an earlier prototype (MPRACE).
RobIn (MPRACE1)
[Figure: MPRACE1 board. Labels: PLX9656 (PCI connection), Xilinx Virtex-II FPGA, Control PLD, two Expansion Connectors, four ZBT SRAM banks of 2MB each, SDRAM Socket, Local Bus 32bit/66MHz, PCI Bus 64bit/66MHz]
• Parts common to the RobIn Prototype: PLX PCI bridge, local bus, FPGA
• Firmware implements the RobIn Prototype message passing protocol
• On-board “local” bus limited to 266MB/s (half of max. PCI throughput)
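The factor of two between the local bus and the PCI bus follows directly from the bus widths at equal clock frequency. The sketch below works through the arithmetic; `peak_bandwidth_mb_s` is a hypothetical helper, and the nominal 66 MHz clock is an assumption, which is why the results (264 and 528 MB/s) come out slightly below the 266 and 532 MB/s quoted on the slides.

```python
def peak_bandwidth_mb_s(width_bits, clock_mhz):
    """Theoretical peak throughput of a parallel bus: one transfer per clock."""
    return width_bits / 8 * clock_mhz


local_bus = peak_bandwidth_mb_s(32, 66)  # MPRACE1 local bus, 32bit/66MHz
pci_bus = peak_bandwidth_mb_s(64, 66)    # PCI bus, 64bit/66MHz
print(local_bus, pci_bus, local_bus / pci_bus)
```

These are peak figures only; the sustained rate on either bus is lower because of arbitration and protocol overhead.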
Measurements - Influence of DC I/O -
• 4 ROLs per RobIn (MPRACE) emulated
• Network I/O to LVL2 and EF reduces performance to 1/3
• Large EB fractions: performance limited by GE line speed
• Small EB fractions: performance limited by PC’s computing power
[Figure: Max. LVL1 Rate for 12 ROLs on 3 RobIns (2% LVL2 rate, 1kB fragments, DC I/O uses UDP); x-axis: EB Fraction (%) (0 to 12), y-axis: Max L1 rate (kHz) (0 to 300); series: No DC I/O, DC IO 2.4 GHz PC, DC IO 3GHz PC; annotations: 100 kHz, Gigabit Ethernet Line Speed, 3 kHz Atlas Baseline]
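The Gigabit Ethernet limit at large EB fractions can be estimated with a simple model. The sketch below is a rough calculation and does not reproduce the measured curves: `max_l1_rate_khz` is a hypothetical helper, the usable line speed is assumed to be 125 MB/s (1 Gbit/s with protocol overhead ignored), 1 kB is taken as 1024 bytes, and the LVL2 request traffic is neglected.

```python
def max_l1_rate_khz(eb_fraction, n_rols=12, fragment_bytes=1024,
                    line_speed_bytes_s=125e6):
    """LVL1 rate (kHz) at which Event Builder output alone fills the link.

    Each event sent to the Event Builder carries one fragment per ROL, so the
    output volume per LVL1 event is eb_fraction * n_rols * fragment_bytes.
    """
    bytes_per_l1_event = eb_fraction * n_rols * fragment_bytes
    return line_speed_bytes_s / bytes_per_l1_event / 1e3


for fraction in (0.03, 0.05, 0.10):
    print(f"EB fraction {fraction:.0%}: {max_l1_rate_khz(fraction):.0f} kHz")
```

In this model the link allows roughly 100 kHz at an EB fraction of 10%, while at 3% the link limit lies well above the rates reached without any network I/O, which fits the statement that small EB fractions are limited by the PC's computing power.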
Measurements - Multiple PCI Buses -
• Request rate decreases, even though the PCI bus is not saturated. Low parallelism in software?
[Figure: Request Rate Depending on the # of Boards (2.4 GHz PC, MPRACE on different buses, 8 Request Handlers, 1kB fragments); x-axis: # of Boards (0 to 5), left y-axis: Request Rate [kHz] (0 to 180), right y-axis: Data Volume [MB/s] (0 to 180)]