David M. Zar, Block Design Review: PlanetLab Line Card Header Format
David M. Zar
[email protected]
http://www.arl.wustl.edu/projects/techX
Block Design Review:
PlanetLab Line Card Header Format
2 - David M. Zar - 05/14/23
Revision History
10/31/06 (DMZ):
»Initial Draft
11/04/06 (DMZ):
»Updates for performance issues
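Line Card Centric Overview
[Figure: line card block diagram around the SWITCH. Ingress blocks shown: Phy Int Rx, Key Extract, Lookup, Hdr Format, Port Splitter, QM/Schd, Switch Tx. Egress blocks shown: Switch Rx, Key Extract, Lookup, Hdr Format, Port Splitter, QM/Schd, Phy Int Tx.]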
Port Splitter (Ingress and Egress):
»Accepts packets on a NN ring
»Based on the physical destination port number:
  0-4 go to QM1 on a scratch ring
  5-9 go to QM2 on a scratch ring
»Measured delay is about 120 cycles, including memory latency
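The port-to-QM rule above can be sketched in a few lines. This is a Python illustration only, not the microengine code; the function name `select_qm` and the string labels it returns are hypothetical.

```python
def select_qm(port: int) -> str:
    """Return the queue manager that serves a physical destination port.

    Ports 0-4 map to QM1 and ports 5-9 map to QM2, as stated on the slide.
    Any other port number is rejected (the slide does not say what happens).
    """
    if 0 <= port <= 4:
        return "QM1"
    if 5 <= port <= 9:
        return "QM2"
    raise ValueError(f"port {port} is outside the 0-9 range")


if __name__ == "__main__":
    for p in range(10):
        print(p, select_qm(p))
```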
Ingress Header Format
Ingress Header Format
Microengine Usage
»One microengine
»Eight identical threads
»NN ring input from Lookup
»NN ring output to Port Splitter
Main functions:
»Using data from Lookup, modify packet header in DRAM for proper routing to PE:
  Destination MAC address
    First five bytes are same as source MAC address
  Source MAC address
    Address of this LC
  VLAN tag
»Adjust pre-queue stats counters
»Format input data for QM
  QID
  Port Number
  Ethernet Frame Length
LC Ingress Functional Blocks
[Figure: ingress pipeline (Phy Int Rx, Key Extract, Lookup, Hdr Format, QM/Schd, Switch Tx) annotated with the Hdr Format input and output data and with the packet formats listed below.]
Data into Hdr Format (from Lookup):
»Buf Handle (32b), IP Pkt Length (16b), QID (20b), VLAN (16b), Stats Index (16b), DAddr (8b), Port (4b), Reserved (8b), Eth Hdr Len (8b)
Data out of Hdr Format (to QM):
»Stats Index (16b), Buffer Handle (32b), Frame Length (16b), QID (20b), Rsv (4b), Port (4b), Rsv (4b)
Possible Input Packet Formats:
»VLAN-tagged frame
  Ethernet Header: DstAddr (6B), SrcAddr (6B), Type=802.1Q (2B), VLAN (2B), Type=IP (2B)
  IP Header: Ver/HLen/Tos/Len (4B), ID/Flags/FragOff (4B), TTL (1B), Protocol = UDP (1B), Hdr Cksum (2B), Src Addr (4B), Dst Addr (4B), IP Options (0-40B)
  UDP Header: Src Port (2B), Dst Port (2B), UDP length (2B), UDP checksum (2B)
  UDP Payload (MN Packet)
  Ethernet Trailer: PAD (nB), CRC (4B)
»Untagged frame
  Ethernet Header: DstAddr (6B), SrcAddr (6B), Type=IP (2B)
  IP Header, UDP Header, UDP Payload and Ethernet Trailer as in the VLAN-tagged frame
Output Packet Format:
»VLAN-tagged frame
  Ethernet Header: DstAddr (6B), SrcAddr (6B), Type=802.1Q (2B), VLAN (2B), Type=IP (2B)
  IP Header, UDP Header, UDP Payload (MN Packet) and PAD (nB)/CRC (4B) as in the input
MAC Address and VLAN Tag (Ingress)
The source MAC address is fixed and set at boot time (_WU_get_mac_address)
The destination MAC address will only differ in the last byte and this byte is obtained from the Lookup data.
The VLAN tag is obtained from the Lookup data.
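As a rough illustration of the header rewrite described above, the sketch below builds the 18-byte VLAN Ethernet header from the fixed source MAC, the last destination byte, and the VLAN tag. It is Python rather than microcode; `build_vlan_header` and its parameter names are hypothetical, and the sample values are taken from the output packet in the Ingress Validation example of this deck.

```python
import struct

TPID_8021Q = 0x8100    # EtherType value that announces an 802.1Q tag
ETHERTYPE_IP = 0x0800  # EtherType for IPv4


def build_vlan_header(src_mac: bytes, daddr: int, vlan: int) -> bytes:
    """Build DstAddr(6B) SrcAddr(6B) Type=802.1Q(2B) VLAN(2B) Type=IP(2B).

    The destination MAC reuses the first five bytes of the source MAC;
    only its last byte (daddr, from the Lookup data) differs.
    """
    if len(src_mac) != 6:
        raise ValueError("a MAC address is 6 bytes")
    dst_mac = src_mac[:5] + bytes([daddr & 0xFF])
    tail = struct.pack("!HHH", TPID_8021Q, vlan & 0xFFFF, ETHERTYPE_IP)
    return dst_mac + src_mac + tail


if __name__ == "__main__":
    hdr = build_vlan_header(bytes.fromhex("010203040a0b"), 0x02, 0x0002)
    print(hdr.hex())
```

With these inputs the header bytes are the same as the first 18 bytes of the validation example's output packet.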
Stats/Counters (Ingress/Egress)
The Stats Index is obtained from the Lookup Data
The pre-queue packet and byte counters are updated (_WU_update_counters)
» Packet counter is incremented (atomic SRAM)
» Byte count is incremented by the number of bytes in the entire Ethernet frame (_WU_get_enet_frame_length).
  Frame_length = IP_pkt_len + 18
  18 is the VLAN Ethernet header length
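The counter update above amounts to the following bookkeeping. This is a minimal Python sketch, assuming a dictionary stands in for the SRAM counter table; the names `enet_frame_length`, `update_counters`, `pkts` and `bytes` are hypothetical and do not mirror the `_WU_` macros' real signatures.

```python
VLAN_ETH_HDR_LEN = 18  # DstAddr 6B + SrcAddr 6B + 802.1Q type 2B + VLAN 2B + IP type 2B


def enet_frame_length(ip_pkt_len: int) -> int:
    """Frame_length = IP_pkt_len + 18, as given on the slide."""
    return ip_pkt_len + VLAN_ETH_HDR_LEN


def update_counters(counters: dict, stats_index: int, ip_pkt_len: int) -> dict:
    """Bump the pre-queue packet and byte counters for one packet."""
    entry = counters.setdefault(stats_index, {"pkts": 0, "bytes": 0})
    entry["pkts"] += 1                               # packet counter: +1
    entry["bytes"] += enet_frame_length(ip_pkt_len)  # byte counter: + frame length
    return entry


if __name__ == "__main__":
    table = {}
    # IP length 0x38 (56 bytes) is the value in the deck's validation packet.
    print(update_counters(table, stats_index=7, ip_pkt_len=0x38))
```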
QM Data Formatting (Ingress and Egress)
QID is extracted from Lookup data
Port number is extracted from Lookup data
Total Ethernet frame length is passed to QM
Stats index is passed on for post-queue counters
[Figure: data passed to QM: Stats Index (16b), Buffer Handle (32b), Frame Length (16b), QID (20b), Rsv (4b), Port (4b), Rsv (4b)]
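The fields in the figure add up to 96 bits, i.e. three 32-bit words. The sketch below shows one way to pack them in Python. The word order and the bit positions inside each word are assumptions made for illustration (the slide gives field widths only), and `pack_qm_words` is a hypothetical name.

```python
def pack_qm_words(buf_handle: int, frame_len: int, stats_index: int,
                  qid: int, port: int) -> tuple:
    """Pack the QM input fields into three 32-bit words.

    Assumed layout (not confirmed by the slides):
      word0 = Buffer Handle (32b)
      word1 = Frame Length (16b) | Stats Index (16b)
      word2 = QID (20b) | Rsv (4b) | Port (4b) | Rsv (4b)
    """
    if not 0 <= buf_handle < (1 << 32):
        raise ValueError("buffer handle must fit in 32 bits")
    if not 0 <= frame_len < (1 << 16) or not 0 <= stats_index < (1 << 16):
        raise ValueError("frame length and stats index must fit in 16 bits")
    if not 0 <= qid < (1 << 20):
        raise ValueError("QID must fit in 20 bits")
    if not 0 <= port < (1 << 4):
        raise ValueError("port must fit in 4 bits")
    word0 = buf_handle
    word1 = (frame_len << 16) | stats_index
    word2 = (qid << 12) | (port << 4)  # reserved nibbles stay zero
    return word0, word1, word2


if __name__ == "__main__":
    print([hex(w) for w in pack_qm_words(0x12345678, 74, 7, 0xABCDE, 0x3)])
```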
Ingress HF Block Diagram
[Figure: per-thread flow. Blocks shown: init, signal, dl_source() (Wait for prev ctx, NN Dequeue, Signal next ctx), _WU_get_enet_frame_length, _WU_write_vlan_header, _WU_update_counters, _WU_update_buffer_descriptor, dl_sink() (Wait for prev ctx, NN Enqueue, Signal next ctx).]
Memory accesses and cycle counts annotated on the figure:
»DRAM: 4|5 4B writes, Cycles: 26
»SRAM: 1 read 1 write, Cycles: 10
»SRAM: 3 writes, Cycles: 12
»Other annotations: Cycles: 10, 5, 2, 1 (with the three above, 66 in total) and Cycles: 17, 16 (33 in total)
Total cycles: 33+66=99
Budget: 1400 MHz / (10 Gb/s / (8*90)) = 100.8 => 100 cycles
Measured Latency: 745
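The budget arithmetic on this slide can be reproduced as below. This is a Python sketch that reads the 90 in the formula as bytes per packet, which is my interpretation of the slide; the function name `cycle_budget` is hypothetical.

```python
def cycle_budget(clock_hz: float, line_rate_bps: float, pkt_bytes: int) -> float:
    """Clock cycles available per packet at a given line rate.

    packets/s = line rate / (8 bits * bytes per packet)
    budget    = clock / packets per second
    """
    packets_per_second = line_rate_bps / (8 * pkt_bytes)
    return clock_hz / packets_per_second


if __name__ == "__main__":
    budget = cycle_budget(1.4e9, 10e9, 90)  # 1400 MHz, 10 Gb/s, 90 bytes (assumed unit)
    estimated = 33 + 66                     # per-block cycle counts from the slide
    print(f"budget {budget:.1f} cycles, estimated {estimated}, "
          f"headroom {int(budget) - estimated}")
```

The single cycle of headroom is why the Future Work slide mentions instruction optimization.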
Ingress Validation
Send in non-tunneled packets and check output packets to see they are our internal, tunneled, packets.
» Worked during development but not tested in integrated system at this point.
Send in tunneled packets and check output packets to see they are our internal, tunneled, packets.
» Example:
  Input (the bracketed CRC is stripped by RX):
  01020304 05060708 090a0b0c 81000aaa 08004500 00380000 0000ff11 3a61c0a8 0001c0a8 00020001 00010024 ffbd4500 001c0000 0000ff11 3a7dc0a8 0001c0a8 00020001 00020008 7e87 [6d7e d5be]
  Output:
  01020304 0a020102 03040a0b 81000002 08004500 00380000 0000ff11 3a61c0a8 0001c0a8 00020001 00010024 ffbd4500 001c0000 0000ff11 3a7dc0a8 0001c0a8 00020001 00020008 7e87
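A check like the one described above could be scripted along these lines. This is a hedged Python sketch, not the actual validation harness: the hex strings are the example packets from this slide (input without the stripped CRC), while the helper `parse` and the names of the checks are hypothetical.

```python
INPUT_HEX = (
    "01020304 05060708 090a0b0c 81000aaa 08004500 00380000 0000ff11 "
    "3a61c0a8 0001c0a8 00020001 00010024 ffbd4500 001c0000 0000ff11 "
    "3a7dc0a8 0001c0a8 00020001 00020008 7e87"
)
OUTPUT_HEX = (
    "01020304 0a020102 03040a0b 81000002 08004500 00380000 0000ff11 "
    "3a61c0a8 0001c0a8 00020001 00010024 ffbd4500 001c0000 0000ff11 "
    "3a7dc0a8 0001c0a8 00020001 00020008 7e87"
)


def parse(hex_words: str) -> bytes:
    """Turn a space-separated hex dump into raw bytes."""
    return bytes.fromhex(hex_words.replace(" ", ""))


frame_in, frame_out = parse(INPUT_HEX), parse(OUTPUT_HEX)
src_mac = frame_out[6:12]

checks = {
    "length preserved": len(frame_in) == len(frame_out),
    "dst MAC shares first five bytes with src MAC": frame_out[0:5] == src_mac[0:5],
    "802.1Q type present": frame_out[12:14] == b"\x81\x00",
    "inner type is IP": frame_out[16:18] == b"\x08\x00",
    "IP packet untouched": frame_in[18:] == frame_out[18:],
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(("PASS" if ok else "FAIL"), name)
```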
Egress Header Format
Egress Header Format
Microengine Usage
»One microengine
»Eight identical threads
»NN ring input from Lookup
»NN ring output to Port Splitter
Main functions:
»Using data from Lookup, modify packet header in DRAM for proper routing to Switch:
  Destination MAC address
    First five bytes are same as source MAC address
    Destination MAC address is looked up based on IP address from lookup
  Source MAC address
    Address of this LC
  VLAN tag
»Adjust pre-queue stats counters
»Format input data for QM
  QID
  Port Number
  Ethernet Frame Length
LC Egress Functional Blocks
[Figure: egress pipeline (Switch Rx, Key Extract, Lookup, Hdr Format, QM/Schd, Phy Int Tx) annotated with the Hdr Format input and output data and with the packet formats listed below.]
Data into Hdr Format (from Lookup):
»Buf Handle (32b), IP Pkt Length (16b), Reserved (8b), Eth Hdr Len (8b), VLAN (12b), QID (20b), Rsvd (4b), Port (4b), Rsvd (4b), Stats Index (16b), Rsvd (4b), IP DAddr (32b)
Data out of Hdr Format (to QM):
»Ethernet Frame Length (16b), Buffer Handle (32b), Stats Index (16b), QID (20b), Rsv (4b), Port (4b), Rsv (4b)
Input Packet Format:
»VLAN-tagged frame
  Ethernet Header: DstAddr (6B), SrcAddr (6B), Type=802.1Q (2B), VLAN (2B), Type=IP (2B)
  IP Header: Ver/HLen/Tos/Len (4B), ID/Flags/FragOff (4B), TTL (1B), Protocol = UDP (1B), Hdr Cksum (2B), Src Addr (4B), Dst Addr (4B), IP Options (0-40B)
  UDP Header: Src Port (2B), Dst Port (2B), UDP length (2B), UDP checksum (2B)
  UDP Payload (MN Packet)
  Ethernet Trailer: PAD (nB), CRC (4B)
Output Packet Format:
»VLAN-tagged frame with the same field layout as the input packet format
MAC Address and VLAN Tag (Egress)
The source MAC address is fixed and set at boot time (_WU_get_mac_address)
The destination MAC address will only differ in the last nibble and this nibble is obtained from the Lookup data.
» _WU_ip_lookup will take 32 bits from the destination IP address and use the local CAM to obtain the least significant 4 bits of the MAC address.
» The CAM state bits are used for this, which is why only 4 bits of data are returned
The VLAN tag is obtained from the Lookup data.
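The egress lookup described above can be modeled in Python as a small table keyed by the 32-bit destination IP address that returns a 4-bit value. This is a toy stand-in, not the behavior of `_WU_ip_lookup` itself: the class `IpCam`, the function `egress_dst_mac`, the 16-entry capacity, and the handling of a miss are all assumptions for illustration.

```python
class IpCam:
    """Toy model of the local CAM: 32-bit tag -> 4 state bits."""

    def __init__(self, max_entries: int = 16):  # capacity is an assumption
        self._entries = {}
        self._max_entries = max_entries

    def load(self, ip_addr: int, mac_nibble: int) -> None:
        if ip_addr not in self._entries and len(self._entries) >= self._max_entries:
            raise OverflowError("CAM is full")
        self._entries[ip_addr & 0xFFFFFFFF] = mac_nibble & 0xF

    def lookup(self, ip_addr: int):
        """Return the 4-bit value on a hit, or None on a miss."""
        return self._entries.get(ip_addr & 0xFFFFFFFF)


def egress_dst_mac(src_mac: bytes, mac_nibble: int) -> bytes:
    """Copy the source MAC and replace only its least significant nibble."""
    last = (src_mac[5] & 0xF0) | (mac_nibble & 0x0F)
    return src_mac[:5] + bytes([last])


if __name__ == "__main__":
    cam = IpCam()
    cam.load(0xC0A80002, 0x2)  # 192.168.0.2 -> nibble 2 (sample values)
    nibble = cam.lookup(0xC0A80002)
    print(egress_dst_mac(bytes.fromhex("010203040a0b"), nibble).hex())
```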
Egress HF Block Diagram
[Figure: per-thread flow. Blocks shown: init, signal, dl_source() (Wait for prev ctx, NN Dequeue, Signal next ctx), _WU_get_enet_frame_length, _WU_ip_lookup, _WU_write_vlan_header, _WU_update_counters, _WU_update_buffer_descriptor, dl_sink() (Wait for prev ctx, NN Enqueue, Signal next ctx).]
Memory accesses and cycle counts annotated on the figure:
»DRAM: 1 4B read 4 4B writes, Cycles: 32
»SRAM: 1 add 1 incr, Cycles: 6
»SRAM: 3 writes, Cycles: 10
»Other annotations: Cycles: 10, 2, 2, 1, 1, 1
Total cycles: 65
Measured Latency*: ~660
Egress Validation
Send in our internal, tunneled packets and check output packets to see they are our valid IP, tunneled, packets.
» For the PlanetLab demo, there are no non-tunneled output packets
Check packet and byte counters for valid updates
Check CAM for proper initialization (data watch)
HF Initialization (Ingress/Egress)
All memory locations defined in dl_system.h:
»Base address for HF
  LC[I/E]_HF_SRAM_INIT_BASE
    MAC_ADDR_HI32
    MAC_ADDR_LO16
»Pre-Queue Counters
  LC[I/E]_LU_COUNTERS_SRAM_INIT_BASE
  LC[I/E]_LU_PRE_Q_PKT_CNT_OFFSET – offset into counters structure for packet counter
  LC[I/E]_LU_PRE_Q_BYTE_CNT_OFFSET – offset into counters structure for byte counter.
Thread 0 waits for signal from rx
For Egress, the CAM is filled (_WU_hfe_initialize_ip_lookup) with data from LCE_HF_SRAM_INIT_BASE + 8:
  each entry is 64 bits: cam_entry (32b), RSVD (28b), MAC_DEST (4b)
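To make the 64-bit entry format concrete, the Python sketch below decodes a block of such entries. It assumes the fields are listed most significant first and that the block is big-endian; neither is stated on the slide. The function name `parse_cam_init_entries` is hypothetical.

```python
import struct


def parse_cam_init_entries(blob: bytes) -> list:
    """Decode 64-bit CAM init entries into (cam_entry, mac_dest) pairs.

    Assumed layout per entry, most significant bits first:
      cam_entry (32b) | RSVD (28b) | MAC_DEST (4b)
    """
    entries = []
    # iter_unpack raises struct.error if blob is not a multiple of 8 bytes.
    for (raw,) in struct.iter_unpack("!Q", blob):
        cam_entry = raw >> 32  # upper 32 bits: the tag to match
        mac_dest = raw & 0xF   # lowest 4 bits: last MAC nibble
        entries.append((cam_entry, mac_dest))
    return entries


if __name__ == "__main__":
    sample = struct.pack("!QQ", (0xC0A80001 << 32) | 0x1, (0xC0A80002 << 32) | 0x2)
    for tag, nibble in parse_cam_init_entries(sample):
        print(hex(tag), nibble)
```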
File Locations (Ingress and Egress)
Main code
» Applications/LC_Ingress/src/hdr_format/PL/hdr_format.uc
» Applications/LC_Egress/src/hdr_format/PL/hdr_format.uc
Library
» library/DataPlane/hdr_format_util.uc
Required Includes (Ingress and Egress)
Files
»build/PL/dispatch_loop/dl_system.h
  memory locations
»IXA_SDK_4.0/src/library/microblocks_library/
  dl_meta – for metadata macros
»IXA_SDK_4.0/src/library/dataplane_library/
  dram – for DRAM read/write macros
  sram – for SRAM read/write/add/incr macros
  xbuf – for transfer buffer macros
Performance Issues
Ingress Performance Anomalies
These stalls are in various SRAM and DRAM accesses – the command FIFO is FULL!
Ingress Anomalies (Explanation)
[Figure annotations:]
These bus arbiters are shared across all memory interfaces
The SRAM Controllers have a command FIFO
Ingress/Egress SRAM Issues
It seems that using atomic ADD/INCR instructions is expensive at the SRAM controller
If I remove them and instead read the SRAM, do the add myself, and write the SRAM back, this is quicker and consumes less of the SRAM controller's time; thus, the command queue never backs up.
In this new design, there are more instructions executed, but there may be a few I could eliminate with some code optimization.
No stalling in the WU microblocks (well, QM does, and RX and TX still do, but these look normal).
Ingress/Egress Performance
~99 CPU cycles
~745 cycles latency
Expected performance
»Should have no trouble going at 10 Gb/s but does…
Simulated performance (as of 11/06/2006)
»~10 Gb/s
»With all other microengines in place (i.e. real simulation)
Future Work
Ingress/Egress Future Work
Determine source of I/O stalls
Update Stubs projects for validation of Ingress/Egress blocks (done for Ingress)
Extend both blocks for all possible packet formats
»Ingress – inputs
»Egress – outputs
Possible instruction optimization to give a little headroom (99 cycles out of 100).
Currently, design will not work for standard IPv4 packets; PlanetLab VLAN packets are OK.