A Time-predictable Memory Network-on-Chip
Martin Schoeberl, David VH Chong, Wolfgang Puffitsch, Jens Sparsø
Technical University of Denmark
Overview
- T-CREST Project
- Multi-core issues
- Network-on-Chip
- TDM based memory tree
- Implementation results
- Conclusions
Martin Schoeberl 2 Time-predictable Memory NoC
T-CREST Project/Platform
- Time-predictable Multi-Core Architecture for Embedded Systems
- Processor Patmos
- Network-on-chip
- Memory controller and arbitration tree
- WCET optimizing compiler
- WCET analysis with aiT
- Most artifacts open source
T-CREST Multi-Core
[Figure: multicore block diagram. Labels: four processor cores, core-to-core NoC, memory NoC, memory controller, memory.]
Issues with Multi-Core
- Shared resources
  - Main memory
  - L2 caches
- Reasons for sharing
  - Save resources
    - Memory chips and processor pins
  - Task/thread communication
    - Shared data in shared memory
Interferences
- Interferences between threads on different cores
  - We now have true concurrency
- Modeling all possible interleavings is impossible
- Memory access time/latency is needed for WCET analysis
- Arbitration determines the maximum latency
  - Dynamic arbitration introduces interferences
Time-predictable Arbitration
- Time-division multiplexing (TDM)
- Decouples processor cores
- Allows WCET analysis of multi-cores
- Centralized (bus) arbitration does not scale
- Use a network-on-chip (NoC)
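The decoupling property can be illustrated with a minimal sketch. It assumes one slot per core, a fixed slot length, and that a request can only start at the beginning of its core's slot. The function names and the numbers in the usage lines are hypothetical and are not taken from the T-CREST implementation.

```python
def slot_owner(cycle: int, n_cores: int, slot_len: int) -> int:
    """Core that owns the memory at `cycle` in a static TDM schedule."""
    return (cycle // slot_len) % n_cores


def wait_cycles(cycle: int, core: int, n_cores: int, slot_len: int) -> int:
    """Cycles a request issued at `cycle` waits for the start of `core`'s next slot."""
    period = n_cores * slot_len
    slot_start = core * slot_len
    return (slot_start - cycle) % period


def worst_case_wait(n_cores: int, slot_len: int) -> int:
    """Worst case: the request arrives one cycle after its slot has started."""
    return n_cores * slot_len - 1


# Hypothetical configuration: 4 cores, 6-cycle slots.
print(slot_owner(13, n_cores=4, slot_len=6))
print(worst_case_wait(n_cores=4, slot_len=6))
```

The wait depends only on the current cycle and the static schedule, never on what the other cores are doing, which is the property a WCET analysis can rely on.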
Typical NoC Organization
[Figure: NoC organization. Labels: Proc. node, MEM ctl, Patmos CPU, I$, D$, SPM, DMA, Network Interface, SDRAM.]
Typical NoC
- Mesh or torus layout
  - Regular topology
  - Many-to-many communication
- One node is ‘special’: the memory controller
  - This is a many-to-one relationship
  - Does not fit into this grid/mesh structure
  - A tree structure fits better
Two NoCs in T-CREST
- Core to core
  - Message passing between cores
  - Most traffic shall stay on-chip
  - Torus
  - TDM based arbitration
- Cores to memory
  - Read and write (cache line) burst
  - Shared memory
  - Tree architecture
  - TDM based arbitration
Distributed TDM Memory NoC
[Figure: distributed TDM memory NoC. Labels: multicore with four processor cores, network interfaces (NI), MI, OCP links, memory controller, memory.]
TDM Tree
- The TDM schedule at the memory interface
  - No buffering in the memory controller
- Pipelined tree and response channel
- Pipeline delay in tree is known
  - No buffering in tree nodes
- Distributed TDM arbitration at nodes
  - Just with the right offset
  - The packet knows when to go
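The "right offset" idea can be sketched as follows. This is an illustrative model only: it assumes each core has a local slot counter, a known fixed pipeline delay to the memory interface, and one slot per core. The names `send_offset` and `arrivals` and the delay values are hypothetical.

```python
from typing import Dict, List


def send_offset(core: int, slot_len: int, n_cores: int, delay: int) -> int:
    """Local counter value at which `core` injects so that its packet
    reaches the memory interface exactly at the start of its own slot."""
    period = n_cores * slot_len
    return (core * slot_len - delay) % period


def arrivals(n_cores: int, slot_len: int, delays: List[int], cycles: int) -> Dict[int, int]:
    """Let every core inject a full-slot packet at its offset in each period.
    Returns {cycle at the memory interface: core occupying it}."""
    period = n_cores * slot_len
    occupied: Dict[int, int] = {}
    for core in range(n_cores):
        offset = send_offset(core, slot_len, n_cores, delays[core])
        for start in range(offset, cycles, period):
            for beat in range(slot_len):
                t = start + delays[core] + beat
                if t in occupied:
                    # Would indicate two packets meeting in the tree.
                    raise RuntimeError(f"collision at cycle {t}")
                occupied[t] = core
    return occupied


# Hypothetical: 4 cores, 6-cycle slots, pipeline delays of 3 or 5 cycles.
schedule = arrivals(4, 6, delays=[3, 3, 5, 5], cycles=100)
print(all(core == (t // 6) % 4 for t, core in schedule.items()))
```

Because every injection time is shifted by the known pipeline delay, the packets line up with the schedule at the memory interface without any node having to buffer or arbitrate.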
Distributed TDM Arbitration
- No buffering and arbitration in the network interface
  - No handshake
  - No credits or rate control
- The processor local memories are the only ‘buffers’
  - Cache or SPM
- A packet can ‘freely’ flow through the tree
- End to end TDM without buffering
Pipelining & Moving TDM Slots
[Figure: timing diagram of pipelined TDM slots. Rows: Core 0 command, Core 1 command, NI command, Controller command, Memory, Controller response, Core 0 response, Core 1 response. Read requests r0 and r1 are each followed by four data words (D). Annotated intervals: T − 1, L_down, t_slot, L_up. Slots are numbered 0 to 3.]
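Reading the intervals annotated in the timing diagram as consecutive segments of one memory read, a worst-case latency can be sketched as their sum. This decomposition is an assumption made for illustration, as are the function name and all numeric values; the slides do not state the formula explicitly.

```python
def worst_case_read_latency(period: int, l_down: int, t_slot: int, l_up: int) -> int:
    """Sum of the four segments, all in clock cycles (assumed decomposition).

    period -- length T of the TDM period
    l_down -- pipeline delay from the core down to the memory controller
    t_slot -- length of one TDM slot (the memory access itself)
    l_up   -- pipeline delay of the response channel back to the core
    """
    wait_for_slot = period - 1  # the request just missed its injection point
    return wait_for_slot + l_down + t_slot + l_up


# Hypothetical numbers: 4 cores, 6-cycle slots, 2 pipeline stages each way.
n_cores, t_slot = 4, 6
print(worst_case_read_latency(n_cores * t_slot, l_down=2, t_slot=t_slot, l_up=2))
```

Under this reading, only the period grows with the number of cores, while the pipeline delays grow with the depth of the tree.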
Implementation
- We implemented three arbiters:
  - Round robin
  - Centralized TDM arbiter
  - Distributed TDM arbiter
- Using a multi-core of Patmos
- Chisel as HW description language
- Implementation in an FPGA
  - Constrained to 200 MHz
  - Processor about 80 MHz in this FPGA
Resources in LCs
- LC is an FPGA logic cell
- RISC pipeline 2000-5000 LCs
- Dist. TDM
  - Has local slot counters (replicated)
  - Still just ~100-120 LCs per node
Cores        4    8    16    32    64    128
RR         139  324   614  1267  2549   5114
Cent. TDM  235  446   969  1943  3893   7764
Dist. TDM  470  980  1894  3827  7575  10277
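The per-node cost quoted above can be checked against the table with a few lines of arithmetic. The sketch divides each total by the number of cores, which assumes one arbitration node per core; the variable and function names are hypothetical.

```python
cores = [4, 8, 16, 32, 64, 128]
lcs = {
    "RR": [139, 324, 614, 1267, 2549, 5114],
    "Cent. TDM": [235, 446, 969, 1943, 3893, 7764],
    "Dist. TDM": [470, 980, 1894, 3827, 7575, 10277],
}


def lcs_per_core(name: str) -> list:
    """Logic cells per core for one arbiter, in the column order of the table."""
    return [total / n for total, n in zip(lcs[name], cores)]


for name in lcs:
    print(name, [round(x, 1) for x in lcs_per_core(name)])
```

With the figures as transcribed, the distributed arbiter comes to roughly 117 to 123 LCs per core up to 64 cores, while the 128-core entry works out to about 80.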
Maximum Frequency
[Figure: maximum frequency Fmax (MHz, axis 0 to 200) versus number of nodes (axis 0 to 140) for the distributed TDM, round-robin, and central TDM arbiters.]