20.10.08, NSS 2008, Dresden, High Performance Event Building over InfiniBand Networks
GSI Helmholtzzentrum für Schwerionenforschung GmbH http://dabc.gsi.de
Work supported by EU RP6 project JRA1 FutureDAQ RII3-CT-2004-506078
High Performance Event Building over InfiniBand Networks
J.Adamczewski-Musch, H.G.Essel, N.Kurz, S.Linev
GSI Helmholtzzentrum für Schwerionenforschung GmbH, Experiment Electronics, Data Processing group
• CBM data acquisition
• Event building network
• InfiniBand & OFED
• Performance tests
• Data Acquisition Backbone Core DABC
CBM DAQ features summary
Complex trigger algorithms on full data:
Self-triggered front-end electronics.
Time stamped data channels.
Transport full data into filter farm.
Data sorting over switched network at full data rate of ~1 TB/s.
Sorting network: ~1000 nodes.
Is that possible (in 2012)?
Data flow principle
Merge channels
Sort over switched network; units are not events, but time slice data!
Distribute complete data
Detector electronics, time stamped data channels
Processor farms, event definition, filtering, archiving
Up to 1000 lines, 1 GByte/sec each
Partial data
Complete data
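The sorting step can be illustrated with a minimal sketch, assuming fixed-length time slices and a round-robin mapping of slices to receiver nodes. The names `SLICE_LENGTH_NS`, `slice_id`, `destination_node` and `sort_by_slice`, as well as all numeric values, are hypothetical and not taken from the CBM design.

```python
from collections import defaultdict

SLICE_LENGTH_NS = 1_000  # hypothetical time slice length
N_NODES = 4              # hypothetical number of receiver nodes


def slice_id(timestamp_ns: int) -> int:
    """Index of the time slice that contains the time stamp."""
    return timestamp_ns // SLICE_LENGTH_NS


def destination_node(timestamp_ns: int) -> int:
    """Round-robin mapping of time slices to receiver nodes."""
    return slice_id(timestamp_ns) % N_NODES


def sort_by_slice(channels):
    """Group (timestamp, payload) messages from all channels by time slice."""
    slices = defaultdict(list)
    for channel, messages in enumerate(channels):
        for timestamp_ns, payload in messages:
            slices[slice_id(timestamp_ns)].append((timestamp_ns, channel, payload))
    # Order the messages inside each slice by time stamp.
    return {sid: sorted(msgs) for sid, msgs in slices.items()}


# Two data channels, each delivering only partial data for a time slice.
channels = [
    [(100, "a0"), (1500, "a1"), (4200, "a2")],
    [(900, "b0"), (1100, "b1"), (4999, "b2")],
]
complete = sort_by_slice(channels)
for sid in sorted(complete):
    print(sid, destination_node(sid * SLICE_LENGTH_NS), complete[sid])
```

Each receiver node ends up with the complete data of the time slices assigned to it, which is the unit a processor farm would then search for events.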
Software developments
Software packages developed:
1. Simulation with SystemC (flow control, scheduling)
   • Meta data on data network
2. Real dataflow core (round robin, with/without synchronization)
   • Linux, InfiniBand, Gigabit Ethernet
   • Simulates data sources
3. Data Acquisition Backbone Core DABC (includes dataflow core)
   • Controls, configuration, monitoring, GUI ...
   • Real data sources
   • General purpose DAQ framework
Let's have a look at these.
Poster N30-98: (Wednesday, 10:30) DABC, a Data Acquisition Backbone Core Library.
SystemC simulations 100x100 (1)
data
meta data
No traffic shaping: 40%
Traffic shaping: 95%
Optimization of network utilization for variable sized buffers. Enhancement up to factor two.
Same network used for data and meta data. Occupancy of network by different kinds of data.
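Why traffic shaping helps can be seen in a toy model. This is a sketch under strong assumptions and not the SystemC simulation: time is divided into slots, every sender offers one equally sized buffer per slot, and every receiver accepts at most one buffer per slot. The function `utilization` and its parameters are hypothetical, and the model is not expected to reproduce the 40% and 95% figures quoted above.

```python
import random


def utilization(n_nodes: int, n_slots: int, shaped: bool, seed: int = 1) -> float:
    """Fraction of receiver slots that carry a buffer in the toy model."""
    rng = random.Random(seed)
    delivered = 0
    for slot in range(n_slots):
        if shaped:
            # Collision-free schedule: sender i sends to node (i + slot) mod N.
            targets = [(sender + slot) % n_nodes for sender in range(n_nodes)]
        else:
            # Unshaped: every sender picks its destination independently.
            targets = [rng.randrange(n_nodes) for _ in range(n_nodes)]
        # Each receiver that is targeted at least once accepts one buffer.
        delivered += len(set(targets))
    return delivered / (n_nodes * n_slots)


shaped = utilization(100, 1000, shaped=True)
unshaped = utilization(100, 1000, shaped=False)
print(f"shaped: {shaped:.2f}, unshaped: {unshaped:.2f}")
```

In this model the shaped schedule uses every receiver slot, while independent random destinations leave a sizeable fraction of receivers idle in each slot because several senders collide on the same target.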
Real implementation: InfiniBand (2)
High-performance interconnect technology
– switched fabric architecture
– up to 20 GBit/s bidirectional serial link
– Remote Direct Memory Access (RDMA)
– quality of service
– zero-copy data transfer
– low latency (few microseconds)
– common in HPC
Software library:
– OpenFabrics (OFED) package
see www.openfabrics.org for more information
InfiniBand switch of the GSI cluster
OpenFabrics software stack*
SA Subnet Administrator
MAD Management Datagram
SMA Subnet Manager Agent
PMA Performance Manager Agent
IPoIB IP over InfiniBand
SDP Sockets Direct Protocol
SRP SCSI RDMA Protocol (Initiator)
iSER iSCSI RDMA Protocol (Initiator)
RDS Reliable Datagram Service
UDAPL User Direct Access Programming Lib
HCA Host Channel Adapter
R-NIC RDMA NIC
[Figure: OpenFabrics software stack diagram. Layers from bottom to top: Hardware (InfiniBand HCA, iWARP R-NIC), Provider (hardware specific drivers), Mid-Layer (kernel level verbs / API, connection managers, Connection Manager Abstraction (CMA), MAD, SA client, SMA), Upper Layer Protocol (IPoIB, SDP, SRP, iSER, RDS, NFS-RDMA RPC, cluster file systems), User APIs (user level verbs / API, UDAPL, SDP Lib, user level MAD API, OpenSM, diag tools) and Application Level (IP based app access, sockets based access, various MPIs, block storage access, clustered DB access, access to file systems). The diagram separates kernel space from user space and marks the kernel bypass paths; its key distinguishes common, InfiniBand and iWARP components.]
* slide from www.openfabrics.org
Subnet Administrator: Multicast
Subnet Manager: Configuration
InfiniBand basic tests
Hardware for testing:
• GSI (November 2006): 4 nodes, single data rate (SDR)
• Forschungszentrum Karlsruhe* (March 2007): 23 nodes, double data rate (DDR)
• Uni Mainz** (August 2007): 110 nodes, DDR

Point-to-point tests
                  SDR (GSI)    DDR (Mainz)
  Unidirectional  0.98 GB/s    1.65 GB/s
  Bidirectional   0.95 GB/s    1.3 GB/s

Multicast tests
  2 KByte            Rate per node
  GSI (4 nodes)      0.625 GB/s
  Mainz (110 nodes)  0.225 GB/s

* thanks to Frank Schmitz, Ivan Kondov and Project CampusGrid in FZK
** thanks to Klaus Merle and Markus Tacke at the Zentrum für Datenverarbeitung in Uni Mainz
[Bar chart: data rates in GByte/s (axis from 0 to 2.5) for nominal, unidirectional and bidirectional SDR, nominal, unidirectional and bidirectional DDR, and multicast on 4 and on 110 nodes.]
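To put the point-to-point rates listed above into perspective, they can be compared with the nominal link rate. The sketch below assumes 4x links with 8b/10b encoding, i.e. 10 Gbit/s signalling for SDR and 20 Gbit/s for DDR; this link configuration is an assumption and is not stated on the slide. The names `NOMINAL_GB_S`, `MEASURED_GB_S` and `efficiency` are hypothetical.

```python
# Assumed nominal payload rates in GB/s: signalling rate in Gbit/s,
# times 0.8 for 8b/10b encoding, divided by 8 bits per byte.
NOMINAL_GB_S = {"SDR": 10 * 0.8 / 8, "DDR": 20 * 0.8 / 8}

# Measured point-to-point rates from the tests above, in GB/s.
MEASURED_GB_S = {
    ("SDR", "unidirectional"): 0.98,
    ("SDR", "bidirectional"): 0.95,
    ("DDR", "unidirectional"): 1.65,
    ("DDR", "bidirectional"): 1.3,
}


def efficiency(link: str, mode: str) -> float:
    """Measured rate as a fraction of the assumed nominal payload rate."""
    return MEASURED_GB_S[(link, mode)] / NOMINAL_GB_S[link]


for (link, mode), rate in MEASURED_GB_S.items():
    print(f"{link} {mode}: {rate} GB/s, {efficiency(link, mode):.1%} of nominal")
```

Under these assumptions the SDR links run close to their nominal rate, while the DDR results leave more headroom.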
Scaling of non-synchronized round robin traffic
Effect of synchronization and topology optimization
+ 25%
Topology optimization for 110 nodes would have required different cabling of switches!
Motivation for DABC (3)
2004 → EU RP6 project JRA1 FutureDAQ*
2004 → CBM FutureDAQ for FAIR
* RII3-CT-2004-506078
1996 → MBS future 50 installations at GSI, 50 external http://daq.gsi.de
Use cases
• Detector tests
• FE equipment tests
• Data transport
• Time distribution
• Switched event building
• Software evaluation
• MBS event builder
• General purpose DAQ
Data Acquisition Backbone Core
Intermediate demonstrator
Requirements
• build events over fast networks
• handle triggered or self-triggered front-ends
• process time stamped data streams
• provide data flow control (to front-ends)
• connect (nearly) any front-ends
• provide interfaces to plug in application codes
• connect MBS readout or collector nodes
• be controllable by several controls frameworks
Poster N30-98: (Wednesday, 10:30) DABC, a Data Acquisition Backbone Core Library.
DABC development branches
From this, DABC evolved into a general purpose DAQ framework. The main application areas of DABC are:
1. High speed event building over fast networks
2. Front-end readout chain tests (CBM, September 2008)
3. DABC as MBS* event builder (ready)
4. DABC with mixed triggered (MBS) and time stamped data channels
   • Needs synchronization between both
   • Insert event number from trigger into time stamped data stream
   • Read out time stamp from MBS via VME module (to be built)
* MultiBranchSystem: standard DAQ system at GSI
GE: Gigabit Ethernet, IB: InfiniBand
DABC design: global overview
[Figure: DABC global overview. Labels: front-ends (DataCombiner, Readout, other); Linux PCs with data input, sorting, tagging, filter and analysis; scheduler; PCs for analysis, filter and archive; links GE, IB, TCP and PCIe; DABC; application codes through plug-ins.]
DABC design: Data flow engine
A module processes data of one or several data streams. Data streams propagate through ports, which are connected by transports and devices.
[Figure: two DABC modules, each with ports and a process function. Ports are connected through transport and device objects, either over the network or locally (by reference). Further elements: object manager, central data manager with memory pools, buffer queues, threads.]
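The module, port and transport roles described above can be sketched in a few lines. This is an illustration only and not the DABC API: the classes `Port`, `LocalTransport` and `Module` and all their methods are hypothetical, devices and threads are left out, and only the local by-reference case is shown.

```python
from collections import deque


class Port:
    """Endpoint of a module; incoming buffers wait in its queue."""

    def __init__(self):
        self.queue = deque()
        self.transport = None

    def send(self, buffer):
        self.transport.transfer(buffer)

    def receive(self):
        return self.queue.popleft() if self.queue else None


class LocalTransport:
    """Connects two ports on one node; buffers are passed by reference."""

    def __init__(self, source: Port, target: Port):
        self.target = target
        source.transport = self

    def transfer(self, buffer):
        self.target.queue.append(buffer)  # no copy, same object


class Module:
    """Processes buffers from its input port and forwards the results."""

    def __init__(self, function):
        self.input = Port()
        self.output = Port()
        self.function = function

    def process(self):
        while (buffer := self.input.receive()) is not None:
            self.output.send(self.function(buffer))


# A module that sorts each incoming buffer, connected to a sink port.
sorter = Module(sorted)
sink = Port()
LocalTransport(sorter.output, sink)
sorter.input.queue.append([3, 1, 2])
sorter.process()
print(sink.receive())
```

A network transport would expose the same `transfer` call but move the buffer to another node, so modules would not need to know whether their peer is local or remote.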
DABC and InfiniBand
DABC fully supports InfiniBand as data transport between nodes:
– connection establishment
– memory management for zero-copy transfer
– back-pressure
– error handling
– thread sharing
GSI InfiniBand cluster with four nodes SDR
All to all test
Conclusion
• InfiniBand is a good candidate for a fast event building network
• Up to 520 MB/s bidirectional data rate is achieved on 110 nodes without optimization.
• Mixture of point-to-point and multicast traffic is possible
• Further investigation of scheduling mechanisms is required
• Further investigation of network topology is required
Thank You!