Development and Verification of Soft IP Core of USB 3.0 in Verilog HDL

B.E. (EL) PROJECT REPORT

Prepared by:

Hasan Baig

Project Advisors:

Mr. Muhammad Nauman (Internal Advisor) Mr. Fasahat Hussain (External Advisor)

Department of Electronics Engineering

N.E.D. University of Engineering and Technology Karachi-75270

Development and Verification of Soft IP Core of USB 3.0 in Verilog HDL

B.E. (EL) PROJECT REPORT

Prepared by:

Muhammad Obaid Khalid (EL-025) Hasan Baig (EL-034)

Syed Taha Munir (EL-048) Muhammad Asrar Alam (EL-068)

Project Advisors:

Mr. Muhammad Nauman (Internal Advisor) Mr. Fasahat Hussain (External Advisor)

Department of Electronics Engineering

N.E.D. University of Engineering and Technology Karachi-75270

Abstract

The Universal Serial Bus (USB) is a new way of attaching devices to personal computers. The bus architecture features two-way communication and was developed in response to devices becoming smarter and requiring more interaction with the host. USB support is included in all current PC chipsets and is therefore available in all recently built PCs. As a protocol, USB has also been adopted for many non-traditional applications such as industrial automation and control.

USB supports a wide variety of devices, from keyboards, mice, flash memory devices, game peripherals and imaging devices up to high-speed broadband devices. In addition, user applications demand a higher-performance connection between the PC and increasingly sophisticated peripherals. USB 3.0 addresses this need with a faster transfer rate: a raw signaling rate of 5 Gbps, compared with the 480 Mbps raw data rate of its predecessor, USB 2.0.

This work proposes an implementation of a SuperSpeed USB memory device that processes packets in a pipelined fashion to support the high transfer rate and high throughput. In addition, efficient handshaking signals between the modules help the overall device achieve its best performance. The implementation meets all the required specifications with high reliability.

Acknowledgments

We would first of all thank ALLAH Almighty, Who enabled us to carry out this project with full devotion and consistency. It was only because of His blessings that we could find our way to the completion of this task. Next, we would like to pay our regards to Mr. Khursheed Hassan, the CEO of Eonsil LLC, Texas, U.S.A., for giving us the opportunity to work on a project involving the cutting-edge technology of USB 3.0, which brought us up to date with the latest technology and gave us unrivalled experience in the field of digital design and FPGAs. We would also like to express our gratitude to Mr. Fasahat Hussain of Digitech, Karachi, Pakistan, for guiding us and leading us out of trouble at each and every stage of this project. Mr. Fasahat coordinated with us on every aspect and steered us through the details.

We would also like to thank our internal advisor, Mr. Muhammad Nauman, for cooperating with us throughout the course of this project and for coordinating from time to time with Mr. Fasahat and Mr. Khursheed about our progress and problems.

Last but not least, we would like to acknowledge the efforts of our group members, who worked tirelessly for the accomplishment of this project alongside their studies and other co-curricular activities. This project was not the achievement of any individual; its credit is deserved by all the hands and minds that contributed devotedly towards its success.

Table of Contents

1. The Implemented Architecture of USB 3.0 Memory Device
    1.1 Architecture’s Overview
    1.2 Dual Simplex Requirement Of USB 3.0 SuperSpeed Protocols
    1.3 PHY Chip Behavioral Model
    1.4 The Pipe Line Operation

2. The Physical Layer Controller
    2.1 Physical Layer Overview
    2.2 USB Physical Layer
    2.3 MAC – PHY Interface
        2.3.1 MAC – PHY Interface Signals
        2.3.2 MAC – LTSSM Interface Signals
        2.3.3 MAC – Link Layer Controller Interface Signals
        2.3.4 MAC – Master Controller interface signals
    2.4 MAC Layer intermediate signals
    2.5 Algorithmic States Machine Description of Phy Encoder
    2.6 Algorithmic States Machine Description of Phy Decoder

3. The Link Layer
    3.1 Link Layer Overview
    3.2 Hardware Implementation of Link Layer
        3.2.1 Packet Disassembler
        3.2.2 Packet Assembler
    3.3 Buffer controllers (buffer interfaces)
    3.4 Link Layer Controller
        3.4.1 Link Commands
        3.4.2 Header Packet Exchange Control
            3.4.2.1 IDLE STATE
            3.4.2.2 Initialization for HP integrity and flow control
            3.4.2.3 Valid Header Packets Exchange
            3.4.2.4 Header Packet Retry Process
        3.4.3 Link Power Management
            3.4.3.1 Rules for a port to request or accept a low power link state
    3.5 Error Detection Algorithm (CRC)
        3.5.1 Method of Parallel CRC Computation

4. The Link Training and Status State Machine
    4.1 Introduction
    4.2 LTSSM’s Interconnections with other Layers
        4.2.1 LTSSM AND MAC LAYER
        4.2.2 LTSSM AND PHY
        4.2.3 LTSSM AND DATA LINK LAYER
    4.3 An Overview Of The LTSSM State Machine
    4.4 Detailed Description Of LTSSM States
        4.4.1 SS. DISABLED
        4.4.2 SS. INACTIVE
        4.4.3 RX.DETECT
        4.4.4 POLLING
        4.4.5 U0 – LINK ACTIVE
        4.4.6 U1 – LINK IDLE WITH FAST EXIT
        4.4.7 U2 – LINK IDLE WITH SLOW EXIT
        4.4.8 U3 – LINK SUSPENDED
        4.4.9 RECOVERY
        4.4.10 LOOPBACK
        4.4.11 COMPLIANCE
        4.4.12 HOT RESET
    4.5 Brief Description Of LTSSM’s Functionalities
        4.5.1 Link Training & Initialization
        4.5.2 POWER MANAGEMENT
        4.5.3 ERROR RECOVERY

5. The Protocol Layer
    5.1 Protocol Layer Overview
    5.2 Types of Packets
    5.3 Hardware Implementation of Protocol Layer
        5.3.1 Registers bank for Descriptors and Device Configuration
        5.3.2 Packet assembler
        5.3.3 Packet-disassembler
        5.3.4 Protocol layer controller
            5.3.4.1 IN Transfers
            5.3.4.2 OUT Transfer
        5.3.5 Buffers for packet storage
        5.3.6 Buffer controllers (buffer interfaces)

6. The Master Controller
    6.1 Master Controller Overview
    6.2 Decoding Path Controller
    6.3 Encoding Path Controller

7. Functional Simulation of Implemented Device
    7.1 Functional Verification of LTSSM
    7.2 Functional Verification of Phy Layer Controller
    7.3 Functional Verification of Link Layer
    7.4 Functional Verification of Protocol Layer

Bibliography

Table of Figures

Fig 1.1: The Over-all block diagram of the Architecture
Fig 2.1: PHY/MAC Interface
Fig 2.2: Top Level Block Diagram of PHY Layer Controller
Fig. 2.3: ASMD of Phy Encoder
Fig. 2.4: Standard packet with maximum of 1024 data bytes
Fig. 2.5: ASMD of Phy Decoder
Fig. 3.1: Block Diagram of the Link Layer
Fig. 3.2: Header packet with HPSTART, Packet Header and Link Control Word
Fig. 3.3: Link Control Word
Fig. 3.4: Link Command Structure
Fig. 3.5: Initial Condition and Idle State
Fig. 3.6: Successful Transmission & Reception of HP
Fig. 3.7: Transmission of a corrupted HP
Fig. 3.8: CRC5 Serial Remainder Generation
Fig. 4.1: LTSSM placement in USB 3.0 device
Fig. 4.2: LTSSM State Machine Diagram
Fig. 4.3: SS.INACTIVE Sub state Machine
Fig. 4.4: RX.DETECT Substate Machine
Fig. 4.5: POLLING Substate Machine
Fig. 4.6: U1 Exit Conditions State Diagram
Fig. 4.7: U2 EXIT Conditions State Diagram
Fig. 4.8: U3 EXIT Conditions State Diagram
Fig. 4.9: Recovery Substate Machine
Fig. 4.10: Hot Reset Substate Machine
Fig. 4.11: Link Initialization & Training Flow Chart
Fig. 4.12: Power Management Flowchart
Fig. 4.13: Error Recovery Flow Charts
Fig. 5.1: Block Diagram of the Protocol Layer
Fig. 5.2: SuperSpeed IN transfer sequence
Fig. 5.3: SuperSpeed OUT transfer sequence
Fig. 6.1: Top Level Block Diagram of Master Controller
Fig. 6.2 Timing diagram of decoding process.
Fig. 6.3 Timing diagram of encoding process.

Chapter # 1

The Implemented Architecture of USB 3.0 Memory Device

1.1 Architecture’s Overview

The USB 3.0 specification describes an extensively complicated hardware interface. The emphasis was on producing an architecture that can be easily comprehended, integrated and implemented without extraordinary knowledge of how the different layers of the device entity are interfaced. To achieve this, each layer is kept separate from the others by placing dual-port memory banks between consecutive layers. As seen by the device, first comes the physical layer (shown by the PHY Chip entity, which converts the serial interface to a parallel interface), then the MAC layer, the link layer and lastly the protocol layer. These layers are separated by intermediate memories, which act as slaves to the layers they are connected to. When one of the layers is done with an intermediate (dual-port) memory, the next layer concerned must be notified to begin execution and process the valid contents of that memory. This is accomplished by the Master Controller, which schedules the execution of the layers in a pre-determined sequence given in Chapter 6.

Buffer interfaces are used extensively. Their purpose is to avoid incorporating a memory controller into each layer’s main controller. For instance, if the protocol layer is instructed to start processing some valid memory contents, there are two possibilities. First, the protocol layer controller can fetch the memory contents itself by driving the address, data and enable ports of the memory, incrementing the address for each next valid word and asserting the enable port. Second, a separate module can be notified of the number of bytes to be fetched from memory and made responsible for incrementing the address each time it gets valid data from memory. To simplify the implementation, separate entities are recommended so that the hardware can be easily comprehended and debugged; in other words, the main controller is freed from the extra burden of dealing with the memory. The second approach is therefore adopted, and buffer interfaces (memory controllers) are placed in the architecture next to the intermediate dual-port memories.
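The second approach can be illustrated with a short behavioral sketch in Python. It is not the project's Verilog: the class name `BufferInterface`, the method `fetch` and the assumption that the memory is a list of 32-bit words addressed by word are hypothetical and chosen only for illustration.

```python
class BufferInterface:
    """Behavioral stand-in for a read buffer interface next to a dual-port memory."""

    def __init__(self, memory):
        # 'memory' is a list of 32-bit words standing in for the dual-port memory
        # (an assumption made for this sketch).
        self.memory = memory

    def fetch(self, base_addr, num_bytes):
        """Return the words that cover num_bytes, starting at word address base_addr."""
        num_words = (num_bytes + 3) // 4  # round up to whole 32-bit words
        words = []
        addr = base_addr
        for _ in range(num_words):
            words.append(self.memory[addr])  # enable the port and sample one word
            addr += 1                        # the address is incremented here, not in the layer controller
        return words


memory = [0x1000 + i for i in range(16)]
read_if = BufferInterface(memory)
packet_words = read_if.fetch(base_addr=4, num_bytes=10)  # 10 bytes span 3 words
```

The layer controller only supplies a base address and a byte count; all address bookkeeping stays inside the buffer interface, which is the separation of concerns argued for above.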

For simulation purposes, SRAM is shown as the ultimate source or sink of data in the device. It is also a dual-port SRAM, with one read-only port and one write-only port, and is generated using the Coregen facility provided by the Xilinx ISE Design Suite.

Fig 1.1: The Over-all block diagram of the Architecture

1.2 Dual Simplex Requirement Of USB 3.0 SuperSpeed Protocols

Since SuperSpeed protocols are meant for dual-simplex transmission lines, which carry transmit and receive transactions independently, the architecture must support such protocols. To meet this requirement, separate encode and decode paths work concurrently and independently. The encode path is associated with the packet assemblers (encoders), while the decode path is associated with the packet disassemblers (decoders). Both paths are executed by the Master Controller state machine so as to fulfil the dual-simplex capability of the bus.

1.3 PHY Chip Behavioral Model

Having discussed all the fundamental and higher-level entities of the USB device, the PHY Chip behavioral model needs some explanation. Although the whole of the USB device is written in synthesizable RTL code, this entity represents the behavior of the host plus the behavior of the PHY chip. It is meant only for simulation and cannot infer any hardware. It sets aside the concept of separate layers and models the host entity and the PHY chip as a single entity, which is needed to drive the MAC layer at the front line of the upstream port (the USB device). It tests the whole device by looping back, from the device, the data that it sends.

1.4 The Pipe Line Operation

Dual-port memory banks no. 3 and 4 have two buffers, whereas dual-port memory banks no. 1 and 2 have four buffers. This arrangement enables high-speed pipelining: each layer fills a buffer while the following layer fetches the previously written memory contents at the same time.
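The overlap between filling and fetching can be sketched with a small Python model. This is only an illustration under simplifying assumptions: one packet is written and one is read per step, and the function name `run_pipeline` and its arguments are hypothetical rather than taken from the design.

```python
def run_pipeline(packets, num_buffers=2):
    """Producer fills one buffer per step while the consumer drains an earlier one."""
    buffers = [None] * num_buffers
    consumed = []
    timeline = []  # (step, packet written, packet read) for each step
    for step in range(len(packets) + 1):
        # Consumer side: read the buffer that was filled in the previous step.
        drained = None
        if step >= 1:
            drained = buffers[(step - 1) % num_buffers]
            consumed.append(drained)
        # Producer side: fill the next buffer during the same step.
        filled = None
        if step < len(packets):
            buffers[step % num_buffers] = packets[step]
            filled = packets[step]
        timeline.append((step, filled, drained))
    return consumed, timeline


consumed, timeline = run_pipeline(["A", "B", "C"])
```

In this model three packets pass through in four steps, because from the second step onwards a write and a read happen together on different buffers; handling them strictly one after the other would need six steps.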

Chapter # 2

The Physical Layer Controller (Media Access (MAC) layer)


2.1 Physical Layer Overview

The physical layer defines the PHY portion of a port and the physical connection between a downstream-facing port (on a host or hub) and an upstream-facing port on a device. The SuperSpeed physical connection comprises two differential data pairs, one transmit path and one receive path (Fig. 2.1). The nominal signaling data rate is 5 Gbps.

The electrical aspects of each path are characterized as a transmitter, channel, and

receiver; these collectively represent a unidirectional differential link. Each differential

link is AC-coupled with the capacitors located on the transmitter side of the differential

link. The channel includes the electrical characteristics of the cables and connectors.

At an electrical level, each differential link is initialized by enabling its receiver

termination. The transmitter is responsible for detecting the far end receiver termination

as an indicator of a bus connection and informing the link layer so the connect status

can be factored into link operation and management.

When receiver termination is present but no signaling is occurring on the differential

link, it is considered to be in the electrical idle state. When in this state, Low Frequency

Periodic Signaling (LFPS) is used to signal initialization and power management

information. The LFPS is relatively simple to generate and detect and uses very little

power.

Each PHY has its own clock domain with Spread Spectrum Clocking (SSC)

modulation. The USB 3.0 cable doesn’t include a reference clock so the clock domains

on each end of the physical connection are not explicitly connected. Bit-level timing

synchronization relies on the local receiver aligning its bit recovery clock to the remote

transmitter’s clock by phase-locking to the signal transitions in the received bit stream.

The receiver needs enough transitions to reliably recover clock and data from the bit stream. To ensure that adequate transitions occur in the bit stream independent of the data content being transmitted, the transmitter encodes data and control characters into symbols using the 8b/10b code. Control symbols are used to achieve byte alignment, to frame data and to manage the link. Special characteristics make control symbols uniquely identifiable from data symbols.

The physical layer receives 8-bit data from the link layer and scrambles the data to

reduce Electromagnetic Interference (EMI) emissions. The bit stream is recovered from

the differential link by the receiver, assembled into 10-bit symbols, decoded and

descrambled, producing 8-bit data that are then sent to the link layer for further

processing.
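The scrambling step can be illustrated with a minimal Python sketch of an additive (XOR) scrambler driven by a 16-bit LFSR. The polynomial X^16 + X^5 + X^4 + X^3 + 1 and the seed 0xFFFF are assumptions stated here rather than taken from this report, and the bit ordering and the rules for symbols that bypass scrambling are simplified, so the sketch should be checked against the USB 3.0 specification before being relied on. The names `scramble`, `LFSR_SEED` and `LFSR_TAPS` are hypothetical.

```python
LFSR_SEED = 0xFFFF   # assumed initial LFSR value
LFSR_TAPS = 0x0039   # x^5 + x^4 + x^3 + 1 (the x^16 term is implicit)


def scramble(data, state=LFSR_SEED):
    """XOR each data bit with the LFSR output, least significant bit first."""
    out = bytearray()
    for byte in data:
        result = 0
        for bit in range(8):
            key = (state >> 15) & 1                    # LFSR output bit
            result |= (((byte >> bit) & 1) ^ key) << bit
            state = (state << 1) & 0xFFFF              # advance the LFSR by one bit
            if key:
                state ^= LFSR_TAPS
        out.append(result)
    return bytes(out)


payload = b"USB 3.0"
scrambled = scramble(payload)
restored = scramble(scrambled)  # the same operation descrambles
```

Because the scrambler only XORs the data with a sequence that depends on the LFSR state, applying it twice from the same starting state returns the original bytes, which is why the receiver can descramble with an identical circuit.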

2.2 USB Physical Layer

The USB PHY layer handles the low-level USB protocol and signalling. This includes features such as data serialization and deserialization, 8b/10b encoding, analog buffers, elastic buffers and receiver detection. The primary focus of this block is to shift the clock domain of the data from the USB rate to one that is compatible with the general logic in the ASIC.

Some key features of the USB PHY are:

o Standard PHY interface enables multiple IP sources for USB Link Layer and provides a target interface for USB PHY vendors
o Supports 5.0 GT/s serial data transmission rate
o Utilizes 8-bit or 16-bit parallel interface to transmit and receive PCI Express data
o Allows integration of high speed components into a single functional block as seen by the device designer
o Data and clock recovery from serial stream on the USB SuperSpeed bus
o Holding registers to stage transmit and receive data
o Supports direct disparity control for use in transmitting compliance pattern(s)
o 8b/10b encode/decode and error indication
o Receiver detection
o Low Frequency Periodic Signalling (LFPS) Transmission
o Selectable Tx margining


2.3 MAC – PHY Interface

Fig. 2.1 shows the implemented data and logical command/status signals between the PHY and the MAC layer (or PHY Layer Controller; see footnote 1). These signals are described in the next section. Full support of USB mode requires 16 control signals and 7 status signals.

Fig 2.1: PHY/MAC Interface

Since the PIPE (PHY Interface for PCI Express) is implemented for USB mode, which supports 5.0 GT/s, we have chosen 32-bit data paths with PCLK running at 125 MHz. The MAC layer governs the communication of the PHY layer with the link layer and the LTSSM. The PHY Layer Controller itself is commanded by the Master Controller. The top-level block diagram of the PHY Layer Controller is shown in Fig. 2.2.
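The 125 MHz figure follows from the line rate, the 8b/10b code and the data path width. The arithmetic can be checked with a short Python sketch; the function name `pclk_mhz` and its parameters are hypothetical helpers introduced for this illustration.

```python
def pclk_mhz(line_rate_gtps=5.0, data_width_bits=32):
    """Parallel clock needed to keep up with an 8b/10b-coded serial link."""
    # With 8b/10b coding, every 8-bit symbol occupies 10 bits on the line.
    symbols_per_second = line_rate_gtps * 1e9 / 10
    # A 32-bit data path carries four 8-bit symbols per clock.
    symbols_per_clock = data_width_bits // 8
    return symbols_per_second / symbols_per_clock / 1e6


pclk_32 = pclk_mhz(5.0, 32)
pclk_16 = pclk_mhz(5.0, 16)
pclk_8 = pclk_mhz(5.0, 8)
```

At 5.0 GT/s the link carries 500 million symbols per second, so a 32-bit path needs 125 MHz, while 16-bit and 8-bit paths would need 250 MHz and 500 MHz respectively. The wider path is what keeps the parallel clock at a rate that is comfortable for general logic.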

It can be observed that the PHY Layer Controller itself comprises several modules, of which the DPRF and the read and write buffer interfaces are discussed in Chapter 5. The I/O signals of the PHY Layer Controller are described in the following sections.

Footnote 1: The PHY Layer Controller is also called the Media Access (MAC) Layer. We will use these terms interchangeably throughout this document.


Fig 2.2: Top Level Block Diagram of PHY Layer Controller


2.3.1 MAC – PHY Interface Signals

The MAC-PHY input and output signals are described in Table 2-1. The signals described here and later are defined from the perspective of the PHY Layer Controller (MAC layer). Thus a signal described as an “output” is driven by the MAC, and a signal described as an “input” is received by the MAC.

Legends: Encoder, Decoder, Data Signals, Command Signals, Status Signals, External Signals

Table 2-1: MAC-PHY I/O Interface signals

Group | Name | Direction | Active Level | Description
Encoder | [31:0] Txdata | Output | N/A | Parallel USB data output bus. The 32 bits represent 4 symbols of transmit data: bits [7:0] are the first symbol to be transmitted, bits [15:8] the second, bits [23:16] the third and bits [31:24] the fourth.
Encoder | [3:0] TxdataK | Output | N/A | Data/control bit for the symbols of transmitted data. For 32-bit interfaces, bit 0 corresponds to the low byte of Txdata (bits [7:0]) and bit 3 to the upper byte (bits [31:24]). A value of “0” indicates a data byte and a value of “1” indicates a control byte.
Encoder | TxDetectRx | Output | High | Tells the PHY to begin a receiver detection operation.
Encoder | Tx_elec_idle | Output | High | Forces the Tx output to electrical idle when asserted, in all power states. When deasserted while in P0 (as indicated by the PowerDownLTSSM signals), indicates that valid data is present on the Txdata and TxdataK pins and must be transmitted.
Encoder | [1:0] PowerDown | Output | N/A | Powers the transceiver up or down by selecting its power state.
Encoder | phy_status | Input | High | Communicates completion of several PHY functions, including power management state transitions, rate change and receiver detection.
Decoder | [2:0] Rx_status | Input | N/A | Encodes receiver status and error codes for the received data stream when receiving data. A receiver is detected when Rx_status = 011.
Decoder | [31:0] RxData | Input | N/A | Parallel USB data input bus. The 32 bits represent 4 symbols of receive data: bits [7:0] are the first symbol received, bits [15:8] the second, bits [23:16] the third and bits [31:24] the fourth.
Decoder | [3:0] RxDataK | Input | N/A | Data/control bit for the symbols of received data. For 32-bit interfaces, bit 0 corresponds to the low byte of RxData (bits [7:0]) and bit 3 to the upper byte (bits [31:24]). A value of “0” indicates a data byte and a value of “1” indicates a control byte.
Decoder | reset_rx_tx | Output | Low | Resets the transmitter and receiver.
Decoder | RxPolarity | Output | High | Tells the PHY to do a polarity inversion on the received data. 0: PHY does no polarity inversion; 1: PHY does polarity inversion.
Decoder | RX_Termination | Output | High | Controls the presence of receiver terminations. 0: terminations removed; 1: terminations present.
Decoder | RxValid | Input | High | Indicates valid data on RxData and RxDataK.
Decoder | Rx_elec_idle | Input | High | Indicates receiver detection of an electrical idle. While deasserted with the PHY in P0, P1, P2 or P3, indicates the detection of LFPS.
External Signals | PowerPresent | Input | High | Indicates the presence of VBUS.
External Signals | PCLK | Input | Rising edge | Parallel interface differential data clock. All data movement across the parallel data interface is synchronized to this clock, which operates at 125 MHz in our case.
External Signals | Phy_mode | Output | N/A | Selects the PHY operating mode. 0: PCI Express; 1: USB mode. It should therefore always be kept high.
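The byte-lane ordering given for Txdata and TxdataK in Table 2-1 can be made concrete with a small Python sketch. The helper names `pack_word` and `unpack_word` and the symbol values used are hypothetical and serve only to show which bits each symbol occupies.

```python
def pack_word(symbols, is_control):
    """Pack four 8-bit symbols into a 32-bit word plus a 4-bit data/control field."""
    word = 0
    k_bits = 0
    for lane, (symbol, control) in enumerate(zip(symbols, is_control)):
        word |= (symbol & 0xFF) << (8 * lane)     # lane 0 -> bits [7:0], first on the wire
        k_bits |= (1 if control else 0) << lane   # bit 0 flags the low byte
    return word, k_bits


def unpack_word(word, k_bits):
    """Inverse of pack_word, as the receive side would apply to RxData and RxDataK."""
    symbols = [(word >> (8 * lane)) & 0xFF for lane in range(4)]
    is_control = [bool((k_bits >> lane) & 1) for lane in range(4)]
    return symbols, is_control


# Arbitrary example: the first symbol is flagged as a control symbol.
word, k_bits = pack_word([0xBC, 0x11, 0x22, 0x33], [True, False, False, False])
```

Here `word` is 0x332211BC and `k_bits` is 0b0001: the first symbol to be transmitted sits in the lowest byte, and its control flag sits in bit 0 of the data/control field.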

2.3.2 MAC – LTSSM Interface Signals

The MAC – LTSSM I/O signals are described in Table 2-2. The signals described as inputs are received by the MAC and those described as outputs are driven by the MAC.

Legends: Encoder, Decoder

Table 2-2: MAC – LTSSM I/O Interface signals

Name | Direction | Active Level | Description
[1:0] PowerDownLTSSM | Input | N/A | Instructs the MAC to take the PHY chip into the power state (P0, P1, P2 or P3) requested by the LTSSM.
transmit_LFPS | Input | High | Instructs the MAC to transmit Low Frequency Periodic Signaling (LFPS) when the PHY is in the P1, P2 or P3 state.
transmit | Input | High | Instructs the MAC to begin a transmission operation following the proper protocols.
receiver_DO | Input | High | Instructs the MAC to perform a receiver detection operation.
[2:0] Rx_status_2LTSSM | Output | N/A | Sends the encoded receiver status back to the LTSSM.
LTSSM_phy_status | Output | High | Informs the LTSSM of the completion of several PHY functions, including power management state transitions, rate change and receiver detection.
do_rx_termination | Input | High | Controls the presence of receiver terminations, as commanded by the LTSSM.
VBUS | Output | High | Indicates the presence of VBUS to the LTSSM.
LFPS_detected | Output | High | Indicates to the LTSSM that Low Frequency Periodic Signaling (LFPS) is being detected.
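Tables 2-1 and 2-2 together suggest how the MAC relays PHY status to the LTSSM. The Python sketch below shows one possible mapping; it is an assumption built from the signal descriptions, not the report's RTL. In particular, deriving `LFPS_detected` directly from a deasserted `Rx_elec_idle` is a simplification, and the `receiver_detected` field is a hypothetical convenience value.

```python
RX_STATUS_RECEIVER_DETECTED = 0b011  # value quoted in Table 2-1


def mac_status_to_ltssm(phy_status, rx_status, rx_elec_idle, power_present):
    """Map PHY-side status inputs of the MAC to its LTSSM-side outputs."""
    return {
        "LTSSM_phy_status": phy_status,
        "Rx_status_2LTSSM": rx_status & 0b111,
        "VBUS": power_present,
        # Assumed: the PHY is in a state where a deasserted Rx_elec_idle means LFPS.
        "LFPS_detected": not rx_elec_idle,
        # Hypothetical helper: detection completes when phy_status is asserted
        # together with the receiver-detected status code.
        "receiver_detected": phy_status and rx_status == RX_STATUS_RECEIVER_DETECTED,
    }


status = mac_status_to_ltssm(phy_status=True, rx_status=0b011,
                             rx_elec_idle=True, power_present=True)
```

With these inputs the LTSSM would see a completed receiver detection, VBUS present and no LFPS activity.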

2.3.3 MAC – Link Layer Controller Interface Signals

The MAC-Link Layer Controller I/O signals are described in Table 2-3. The signals described as inputs are received by the MAC and those described as outputs are driven by the MAC. The DPRF (for Encoder) is used by the Link Layer Controller to write data into dual-port memory. That data is read by the Read Buffer Interface and sent to the PHY chip by the Phy Encoder (see Fig. 2.2). Similarly, data coming from the PHY chip is received by the Phy Decoder and written through the Write Buffer Interface into the DPRF (for Decoder), from which it is read by the Link Layer Controller (see Fig. 2.2).

Legends: DPRF (for Encoder), DPRF (for Decoder), PHY Decoder

Table 2-3: MAC – Link Layer Controller I/O Interface signals

Name | Direction | Active Level | Description
[8:0] ll_wr_dprf_addr | Input | N/A | Address from which the Link Layer Controller starts writing data into the DPRF.
[31:0] ll_wr_dprf_din | Input | N/A | 32-bit data input bus.
[31:0] ll_wr_dprf_wem | Input | N/A | Write enable mask.
ll_wr_dprf_en | Input | High | Enables the DPRF for writing data.
[8:0] ll_rd_dprf_addr | Input | N/A | Address from which the Link Layer Controller wants to read data from the DPRF.
ll_rd_dprf_en | Input | High | Enables the DPRF for reading data.
[31:0] ll_rd_dprf_dout | Output | N/A | 32-bit data output bus.
Ignore | Input | High | Forces the Phy Decoder to ignore incoming data packets until lrty is found.
lrty_found | Output | High | Informs the Link Layer Controller that a header packet is being resent.

2.3.4 MAC – Master Controller interface signals

The signals used to monitor and control the PHY Layer Controller are described in the

Table 2-4. The signals are described from the perspective of Master Controller. Thus the

signals described as “input” are received by the Master and signals described as

“output” are driven by the Master.


Table 2-4: MAC – Master Controller I/O Interface signals

Name | Direction | Active Level | Description
--- | --- | --- | ---
clk | Input | N/A | Pclk coming from PHY chip.
done | Input | High | Asserts after a complete transaction of packet from Read Buffer interface to PHY chip.
phy_active_tx | Input | High | Indicates that Encoder is active and fetching data from Read Buffer interface.
rx_done | Input | High | Informs the Master controller that one packet has been fetched from PHY chip.
phy_active_rx | Input | High | Indicates that decoder is in active state and reading data from PHY chip.
[10:0]packet_size | Input | N/A | Size of packet (calculated by Phy Decoder) received from the PHY chip.
[8:0] pld_base_addr | Input | N/A | Base address of next packet generated by Phy decoder.
[8:0] pld_base_addr_en | Output | N/A | Base address from which Phy Encoder needs to read data.
[10:0] pack_size | Output | N/A | Instruct Phy Encoder to fetch the given size of packet.
reset_n | Output | Low | Master reset
start_en | Output | High | Starts encoding operation.

2.4 MAC Layer intermediate signals

Communication flow between intermediate modules of PHY Layer Controller is shown

in Fig. 2.2. The Phy Encoder – Read Buffer interface signals and Phy Decoder – Write

Buffer interface signals are described in the Table 2-5. The signals described here are

from the perspective of Phy Encoder and Phy Decoder. Thus a signal described as an

“output” is driven by Phy Encoder/Decoder and the signal described as an “input” is

received by the Phy Encoder/Decoder.

Legend: Phy Encoder – Read Buffer interface signals, Phy Decoder – Write Buffer interface signals

Table 2-5: Intermediate I/O Interface signals

Name | Direction | Active Level | Description
--- | --- | --- | ---
ready_en | Output | High | Signal used to inquire the Read Buffer Interface whether it is ready to send data to Phy Encoder.
ack_en | Input | High | Acknowledgment of “ready_en” signal from Read Buffer interface.
[31:0]rd_data | Input | N/A | 32-bits data bus used to fetch data from Read Buffer Interface.
EOP | Input | High | End Of Packet: indicates last packet from Read Buffer interface.
valid | Input | High | Indicates valid data at “rd_data” bus of Read Buffer interface
buf_if_active_en | Input | High | It signifies that Read Buffer is in active state and fetching data from dprf.
phy_rd_valid | Output | High | Indicates valid data, at 32-bits RxData bus, to Write Buffer Interface.
ready_de | Output | High | Signal used to inquire the Write Buffer Interface whether it is ready to receive data from Phy Decoder.
[31:0]phy_wr_data_bus | Output | N/A | 32-bits data bus; used to write data into Write Buffer interface.
phy_data_last | Output | High | Indicates last packet from PHY chip
buf_if_active_de | Input | High | It signifies that Write Buffer is in active state and writing data into dprf.
ack_de | Input | High | Acknowledgment of “ready_de” signal from Write Buffer interface.

2.5 Algorithmic States Machine Description of Phy Encoder

The ASMD of the Phy Encoder is shown in Fig. 2.3. When the Link Layer Controller finishes an encoding process, it asserts “ll_enc_done” (see Chapter 6, The Master Controller), informing the Master Controller that valid data has been placed in the DPRF and must be fetched by the Phy Encoder. The Master Controller then asserts “start_en” to initialize the Phy Encoder and waits for an acknowledgment from it.

The LTSSM controls the power state of the PHY chip through the Phy Encoder. The PHY chip remains idle in the P1 and P3 power states. In the P2 state, the encoder waits for an instruction from the LTSSM either to force the PHY chip to transmit LFPS or to perform a receiver detection operation (Fig. 2.3). When valid data is present in the buffers, the LTSSM instructs the Phy Encoder to take the PHY chip into the P0 state. The encoder starts fetching data from the buffer only when it sees a positive edge on “transmit”.

When the LTSSM asserts “transmit”, the encoder requests the data and waits for the acknowledgment from the Read Buffer Interface. When the transaction begins, the encoder obtains the data payload size from the packet size (given by the Master Controller in bytes) and stores it in the register “[10:0] data_pld_size”. The data payload size is calculated to determine how many transactions are required to send the complete packet to the PHY chip. Since each transaction carries 4 symbols of transmit data (32-bit bus), the packet size is divided by 4 to obtain the number of transactions required.
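The transaction count described above can be sketched in a few lines. This is only an illustration, written in Python rather than the project's Verilog. The function name `transactions_needed` is hypothetical, and rounding up for sizes that are not a multiple of 4 is an assumption, not something taken from the RTL.

```python
BYTES_PER_TRANSACTION = 4  # four 8-bit symbols on the 32-bit bus


def transactions_needed(size_bytes):
    """Return (transactions, leftover_bytes) for a size given in bytes.

    leftover_bytes equals the two least significant bits of the size.
    Rounding up when leftover_bytes != 0 is an assumption of this sketch.
    """
    if not 0 <= size_bytes < 2 ** 11:  # must fit an 11-bit register
        raise ValueError("size must fit in 11 bits")
    full, leftover = divmod(size_bytes, BYTES_PER_TRANSACTION)
    return full + (1 if leftover else 0), leftover


if __name__ == "__main__":
    for size in (1024, 10, 3):
        print(size, transactions_needed(size))
```

A size that is a multiple of 4 yields a leftover of 0, so every transaction is fully occupied.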


As mentioned in Section 2.3.1 (Table 2.1), the TxDataK bus indicates whether each byte in the current transaction is a control byte or a data byte. The encoder RTL determines which bytes of the current transaction are control bytes and which are data bytes. Fig. 2.4 shows two transactions (the 1st and the 6th) that consist entirely of control symbols (bytes). The last transaction should also consist entirely of control bytes, but this depends on the data payload size. If the data payload size is not a multiple of 4, the 2nd last transaction contains a mix of data and control bytes. The two least significant bits of “[10:0] data_pld_size” indicate the position of the data bytes in the 2nd last transaction (Fig. 2.3).


Fig. 2.3: ASMD of Phy Encoder


Fig. 2.4: Standard packet with maximum of 1024 data bytes

2.6 Algorithmic States Machine Description of Phy Decoder

The ASMD of the Phy Decoder is shown in Fig. 2.5. The “PowerState” of the Phy Decoder is again under the control of the LTSSM. The Phy Decoder remains idle in the P1 and P2 states. In P3, the LTSSM asserts “receiver_DO” when it requires a receiver detection operation. The Phy Decoder in turn asserts “TxDetectRx”, requesting the PHY chip to begin receiver detection. This signal must remain high until the PHY chip asserts “phy_status”, which it does when the receiver detection operation is complete. The Phy Decoder then deasserts “TxDetectRx” and reports the receiver status to the LTSSM on the “Rx_status_2LTSSM” bus.

As soon as the LTSSM instructs the Phy Decoder to take the PHY chip into power state P0, the decoder starts monitoring “Rx_elec_idle” and informs the LTSSM about LFPS on the basis of this signal. It then goes into the “idle” state until valid data is present on the “RxData” bus. When valid data is present, the decoder asks the Write Buffer Interface whether it is ready to accept the incoming data and jumps to the “ackldg” (acknowledge) state, where it waits for the acknowledgment. As soon as the buffer acknowledges, the decoder starts fetching data from the PHY chip and sending it to the Write Buffer Interface (see Fig. 2.2).


Fig. 2.5: ASMD of Phy Decoder


The Phy Decoder keeps transferring packets from the PHY chip to the Write Buffer Interface unless the Link Layer Controller asserts “ignore”. When “ignore” is asserted, the Phy Decoder discards the incoming data from the PHY chip and starts looking for LRTY.

The decoder also calculates the size of the packet while transferring data from the PHY chip to the Write Buffer Interface. Fig. 2.4 shows that a packet can have a maximum size of 1024 bytes (maximum data payload) + 28 bytes (standard protocol of each packet). The packet size is calculated by incrementing a counter each time a transaction occurs.

The decoder continuously monitors the RxDataK lines. A non-zero value on the RxDataK bus indicates that the transaction contains at least one control byte, and each time this happens a second counter is incremented to track the number of control byte transactions. Referring to Fig. 2.4, a packet contains only 3 or 4 such transactions: the first transaction, the sixth transaction and the last transaction, plus a fourth one when the data payload size is not a multiple of 4 (in which case the first three of the last four control bytes can be part of the second last transaction).

Since the first and the sixth transactions consist entirely of control bytes, they need no special handling. The difficulty arises after the data payload, because the payload size varies.

Fig. 2.5 shows that the decoder repeatedly checks whether “rxdataK_count” has become equal to 2. When it has, the decoder checks the value of RxDataK. RxDataK = 4’hF means that all four bytes are control bytes and that the current transaction is the End of Packet. Any other value means that the data payload size is not a multiple of 4, that the present transaction contains data byte(s) as well as control byte(s), and that there will be a fourth control byte transaction.

If RxDataK = 4’h8 (4’b1000), the transaction contains 3 data bytes and 1 control byte. This control byte is one of the last four control bytes (shown in Fig. 2.4). The next transaction therefore carries only the 3 remaining control bytes and its last byte remains empty, so a value of 1’b1 is subtracted from the packet size (shown in Fig. 2.5). The same method is used for RxDataK = 4’hC and 4’hE.
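The RxDataK handling above can be illustrated with a small sketch in Python (not the project's Verilog). The helper names are hypothetical. Two points are assumptions rather than statements from the report: that the packet size is the transaction count times 4 before any correction, and that the corrections for 4’hC and 4’hE are 2 and 3, extrapolated from the 4’h8 case.

```python
def classify_rxdatak(rxdatak):
    """Return (control_bytes, data_bytes) for one 4-bit RxDataK value.

    Each set bit marks one control byte in the 32-bit transaction.
    """
    if not 0 <= rxdatak <= 0xF:
        raise ValueError("RxDataK is a 4-bit value")
    control = bin(rxdatak).count("1")
    return control, 4 - control


def packet_size_bytes(transactions, tail_rxdatak):
    """Estimate the packet size from the transaction count.

    tail_rxdatak is the RxDataK value seen after the data payload.
    0xF means the payload was a multiple of 4, so nothing is subtracted.
    Otherwise the control bytes already seen leave the same number of
    empty bytes in the final transaction (assumed for 0xC and 0xE).
    """
    control, _ = classify_rxdatak(tail_rxdatak)
    size = transactions * 4
    if tail_rxdatak != 0xF:
        size -= control
    return size


if __name__ == "__main__":
    for k in (0xF, 0x8, 0xC, 0xE):
        print(hex(k), classify_rxdatak(k), packet_size_bytes(10, k))
```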

Chapter # 3

The Link Layer


3.1 Link Layer Overview

A SuperSpeed link is a logical and physical connection between two ports. The connected ports are called link partners. A port has a physical part and a logical part. The link layer defines the logical portion of a port and the communication between link partners. Its responsibilities include successful data transfer with the link partner and link training control. Link flow control is based on packets and link commands: link management between the partners and flow control are governed entirely by the link command words.

3.2 Hardware Implementation of Link Layer

Fig. 3.1: Block Diagram of the Link Layer


3.2.1 Packet Disassembler

This module decodes the packets received from the link partner. Both the link layer and the protocol layer contain a disassembler, but they differ in the information they extract. The link layer is concerned with error-free transactions, so it checks the CRC-16 and CRC-5 of each header packet (HP) to validate it and the Header Sequence # to verify the flow of the transaction. It then asks the Link Layer Controller to acknowledge the HP with LGOOD or to reject a corrupted HP with LBAD.

3.2.2 Packet Assembler

The Link Layer Controller fetches the data from dual-port memory bank 1, where the protocol layer has already placed the assembled data, and places the Link Control Word in the header packet. An HP is 20 bytes long. The link layer constructs it by adding the Link Control Word (2 bytes) and the header packet framing, the HPSTART ordered set, defined as three consecutive SHP (Start Header Packet) symbols followed by one EPF (End Packet Framing) symbol. The CRC-16 and the header information, collectively called the Packet Header, are added in the protocol layer. The assembled packet is then written into dual-port memory bank 3. The complete HP structure is shown in Fig. 3.2.

Fig. 3.2: Header packet with HPSTART, Packet Header and Link Control Word

The Link Control Word is also generated in the link layer. It is 16 bits long and contains 11 bits of information and 5 bits of CRC-5, as shown in Fig. 3.3. The Header Sequence #, which is unique to each link, is used to assure the integrity of the link and to detect a missing HP, while the CRC-5 protects the integrity of the other 11 bits. The Delayed, Deferred and Hub Depth bits are used for hub forwarding support. Data packet payloads are generated and framed in the protocol layer, but they are transferred to the link partner, each following its HP, through the link layer under the governance of link command words.

Fig. 3.3: Link Control Word
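As a rough illustration of the 11 + 5 bit split of the Link Control Word, the sketch below packs and unpacks a 16-bit word in Python. The bit positions used here (Header Sequence # in bits 2:0, three reserved bits, Hub Depth in bits 8:6, Delayed in bit 9, Deferred in bit 10, CRC-5 in bits 15:11) are an assumption and should be checked against Fig. 3.3 and the USB 3.0 specification. The function names are hypothetical, and the CRC-5 value is supplied by the caller rather than computed.

```python
def pack_link_control_word(hseq, hub_depth, delayed, deferred, crc5):
    """Pack the fields into a 16-bit word (assumed layout, see above)."""
    if not (0 <= hseq < 8 and 0 <= hub_depth < 8):
        raise ValueError("hseq and hub_depth are 3-bit fields")
    if delayed not in (0, 1) or deferred not in (0, 1):
        raise ValueError("delayed and deferred are single bits")
    if not 0 <= crc5 < 32:
        raise ValueError("crc5 is a 5-bit field")
    info = hseq | (hub_depth << 6) | (delayed << 9) | (deferred << 10)
    return info | (crc5 << 11)  # 11 information bits, then 5 CRC bits


def unpack_link_control_word(word):
    """Split a 16-bit word back into its fields."""
    return {
        "hseq": word & 0x7,
        "hub_depth": (word >> 6) & 0x7,
        "delayed": (word >> 9) & 0x1,
        "deferred": (word >> 10) & 0x1,
        "crc5": (word >> 11) & 0x1F,
    }


if __name__ == "__main__":
    w = pack_link_control_word(5, 2, 1, 0, 0b10110)
    print(f"{w:016b}", unpack_link_control_word(w))
```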

3.3 Buffer controllers (buffer interfaces)

These controllers fetch and write packets of a specified data packet size from and to the temporary storage buffers in the device’s link layer or the dual-port memory banks. Their purpose is to relieve the Link Layer Controller of the burden of fetching and writing packets in the buffer storage area. They use handshaking signals to coordinate each transfer. Data is fetched from the buffers through the read buffer interfaces and written through the write buffer interfaces.

3.4 Link Layer Controller

The Link Layer Controller is the central unit that controls all processing in the link layer. It informs the link partner whether or not the link layer is receiving valid header packets, keeps a record of the HP sequence, detects any missing or corrupted header packet and asks the link partner to send it again. The controller handles all of these tasks using link command words.

3.4.1 Link Commands

Link commands enable all link layer functions other than link training control. Their three basic purposes are link-level data integrity, data flow control and power management between link partners. The link command structure is 8 bytes long, as shown in Fig. 3.4. It contains 4 bytes of the framing ordered set LCSTART, consisting of three consecutive SLC (Start Link Command) symbols followed by an EPF; no end framing is used. The link command word is repeated twice in the link command structure. Valid reception of a link command requires 3 of the 4 K-symbols, and either both link command words are valid and identical or one link command word is valid and the other is invalid. USB 3.0 uses three types of link command words for different purposes, as shown in Table 3-1.

Fig. 3.4: Link Command Structure

Table 3-1: Types of Link Commands

Usage Cases | Link Commands
--- | ---
Ensure Successful Transfer Of HP (Acknowledgement) | LGOOD_n (n = 0 to 7), LBAD, LRTY
Flow Control | LCRD_x (x = A, B, C, D)
Power Management | LGO_Ux (x = 1, 2, 3), LAU, LXU, LPMA
Special (For presence in U0) | LUP

LGOOD:

This is an acknowledgement from a link partner that a header packet with the Header

Sequence Number of “n” is received properly. LGOOD_n uses an explicit numerical

index called Header Sequence Number to represent the sequencing of a header packet.

The Header Sequence Number starts from 0 and is incremented by one based on

modulo-8 addition with each header packet. The index corresponds to the received

Header Sequence Number and is used for flow control and detection of lost or corrupted

header packets.

LBAD:

LBAD is used for a bad header packet. It is sent by the port receiving the header packet in response to an invalid header packet, that is, one received with a corrupted CRC-5 and/or CRC-16.

Receipt of LBAD causes a port to resend all header packets after the last header packet that has been acknowledged with LGOOD_n.

LRTY:

Sent by a port before resending the first header packet in response to receipt of LBAD.

LCRD:

Sent by a port after receiving a header packet that meets the following criteria:

• LGOOD_n is sent.

• The header packet has been processed, and an Rx Header Buffer Credit is available.

LCRD_x is sent in the alphabetical order of A, B, C, D, and back to A without skipping. A missing LCRD_x will cause the link to transition to Recovery.
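The two cyclic indices described so far, the modulo-8 Header Sequence Number of LGOOD_n and the A, B, C, D order of LCRD_x, can be sketched as follows. This is an illustrative Python sketch with hypothetical helper names, not code from the project.

```python
LCRD_ORDER = "ABCD"


def next_hseq(n):
    """Header Sequence Number after n, using modulo-8 addition."""
    return (n + 1) % 8


def next_lcrd(x):
    """LCRD index after x, wrapping from D back to A."""
    return LCRD_ORDER[(LCRD_ORDER.index(x) + 1) % len(LCRD_ORDER)]


def lcrd_in_order(expected, received):
    """True when the received LCRD_x is the one expected next.

    A False result corresponds to a missing LCRD_x, which the text
    says drives the link to Recovery.
    """
    return expected == received


if __name__ == "__main__":
    print([next_hseq(n) for n in range(8)])
    print([next_lcrd(x) for x in LCRD_ORDER])
```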

LGO_Ux:

LGO_U1 Sent by a port requesting entry to U1.

LGO_U2 Sent by a port requesting entry to U2.

LGO_U3 Sent by a downstream port (Host) requesting entry to U3. An upstream

port (Device) shall accept the request.

LAU:

Sent by a port accepting the request to enter U1, U2, or U3.

LXU:

Sent by a port rejecting the request to enter U1 or U2.

LPMA:

Sent by a port upon receiving LAU. Used in conjunction with LGO_Ux and LAU

handshakes to guarantee both ports are in the same state.

LUP:

It is a special link command used to signify that the device is present in U0. It is sent by an upstream port every 10 μs when there are no packets or other link commands to be transmitted.

3.4.2 Header Packet Exchange Control

Both link partners contain four buffers in each transmitter and receiver, so each can hold four HPs at a time. Before the HP exchange procedure between two link partners can be explained, the following terms must be defined.

RX Header Sequence Number: The sequence number of the expected HP from TX.

TX Header Sequence Number: The sequence number assigned to new HP.

ACK_TX Header Sequence Number: Oldest unacknowledged HP in TX buffer.

Local RX Header Buffer Credit: Space available in RX HP buffer.

Remote RX Header Buffer Credit: Credit to transmit HPs.

RX LCRD Index: Assigned to next LCRD_x.

Remote RX LCRD index: Expected index of next LCRD_x from RX.

Link Partner: Device at one end of the link.

Link Partner TX: HP sender.

Link Partner RX: HP receiver.

3.4.2.1 IDLE STATE

[Figure content: Link Partner TX and Link Partner RX in the idle state. TX HP buffer: Empty, Empty, Empty, Empty. RX HP buffer: Empty, Empty, Empty, Empty. Link Partner TX: TX Header Sequence # 0, ACK Header Sequence # 0, Remote RX Header Buffer Credit 4, Remote RX LCRD Index A. Link Partner RX: RX Header Sequence # 0, LCRD Index A, Local RX Header Buffer Credit 4. Traffic on the link: LUP, LUP, LUP.]

Fig. 3.5: Initial Condition and Idle State

The link has just entered the U0 state from reset, so the parameters described above take the initial values shown in Fig. 3.5. No HP has yet been transmitted or received between the link partners, so there is no exchange of packets or link command words. Because the link is in U0, however, this is an idle state, and the upstream port transmits the LUP link command every 10 μs to aid disconnect detection. LUP transmission ends automatically when the TX link partner transmits a header packet.

3.4.2.2 Initialization for HP integrity and flow control

Whenever the link state transitions back to U0, initialization by advertisement is performed to inform the link partner of the header sequence number and the header buffer credit:

• Advertise the expected RX Header Sequence Number: send LGOOD_n, where n = HSEQ# of the last good received HP otherwise.

• Advertise the Local RX Header Buffer Credit: send credit for all available buffers using LCRD_x.

3.4.2.3 Valid Header Packets Exchange

[Figure content: Link Partner TX and Link Partner RX after three header packets. TX HP buffer: Empty; HSEQ#=1, Acknowledged; HSEQ#=2, Unacknowledged; Empty. RX HP buffer: Empty; HSEQ#=1, Processing Done; HSEQ#=2, Checking OK; Empty. Link Partner TX: TX Header Sequence # 3, ACK Header Sequence # 2, Remote RX Header Buffer Credit 2, Remote RX LCRD Index B. Link Partner RX: RX Header Sequence # 3, LCRD Index B, Local RX Header Buffer Credit 2. Traffic on the link: HP HSEQ#0, LGOOD_0, HP HSEQ#1, LGOOD_1, LCRD_A, HP HSEQ#2.]

Fig. 3.6: Successful Transmission & Reception of HP

Fig. 3.6 shows the successful transmission and reception of HPs. All the parameter values are updated accordingly. Three HPs are sent by Link Partner TX: HP#0, HP#1 and HP#2. LGOOD has been received for HP#0 and HP#1, so these packets are acknowledged. HP#0 has been removed from the buffer because LCRD was received for it. HP#2 is still unacknowledged because it is being checked for errors; since that check has also passed, the LGOOD for HP#2 is about to be sent.

3.4.2.4 Header Packet Retry Process

[Figure content: Link Partner TX and Link Partner RX during a retry. TX HP buffer: HSEQ#=0, Acknowledged; HSEQ#=1, Unacknowledged; HSEQ#=2, Unacknowledged; Empty. RX HP buffer: HSEQ#=0, Processing Done; HSEQ#=1, CRC5 Error; HSEQ#=2, Checking OK; Empty. Link Partner TX: TX Header Sequence # 1, ACK Header Sequence # 1, Remote RX Header Buffer Credit 3, Remote RX LCRD Index B. Link Partner RX: RX Header Sequence # 1, LCRD Index B, Local RX Header Buffer Credit 3. Traffic on the link: HP HSEQ#0, HP HSEQ#1, LBAD, HP HSEQ#2, LGOOD_0, LRTY.]

Fig. 3.7: Transmission of a corrupted HP

Fig. 3.7 explains how the Link Layer Controller detects a corrupted header packet and asks the link partner to resend it. The link partner has transmitted three header packets. HP#0 has been acknowledged but is still waiting for its LCRD before it can be removed from the buffer. HP#1 and HP#2 are still unacknowledged and waiting for their LGOOD. HP#1 is corrupted because it produced a CRC-5 error, so LBAD is sent from Link Partner RX to Link Partner TX. After receiving LBAD, TX sends LRTY to RX to announce that it is about to resend. It resends not only the corrupted HP but all unacknowledged HPs: HP#2 was valid, as it passed error checking, but it has to be resent because of HP#1. Until it receives LRTY, Link Partner RX ignores all received HPs.

Several types of header packet error are detected. They are either resolved by different methods or cause the link to transition to Recovery. They are:

1. Missing of a header packet

2. Invalid header packet due to CRC errors

3. Mismatch of a Rx Header Sequence Number

4. Mismatch of ACK Header Sequence Number.

5. Missing of frames.
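The receiver side of the retry process described in this section can be sketched as a small event loop. This is an illustrative Python model, not the project's Verilog. The event format and the function name `rx_responses` are hypothetical, LCRD handling is left out, and treating a sequence number mismatch as an immediate transition to Recovery is an assumption made for brevity.

```python
def rx_responses(events):
    """Return the link commands Link Partner RX sends for a list of events.

    Each event is ("HP", hseq, crc_ok) or ("LRTY",).
    """
    expected = 0               # RX Header Sequence Number
    waiting_for_lrty = False   # set after LBAD, cleared by LRTY
    responses = []
    for event in events:
        if event[0] == "LRTY":
            waiting_for_lrty = False
            continue
        _, hseq, crc_ok = event
        if waiting_for_lrty:
            continue           # ignore every HP until LRTY arrives
        if not crc_ok:
            responses.append("LBAD")
            waiting_for_lrty = True
        elif hseq != expected:
            responses.append("RECOVERY")  # assumed handling
            break
        else:
            responses.append(f"LGOOD_{hseq}")
            expected = (expected + 1) % 8
    return responses


if __name__ == "__main__":
    # Scenario of Fig. 3.7: HP#1 arrives with a CRC-5 error.
    scenario = [("HP", 0, True), ("HP", 1, False), ("HP", 2, True),
                ("LRTY",), ("HP", 1, True), ("HP", 2, True)]
    print(rx_responses(scenario))
```

In this scenario the first copy of HP#2 produces no response because it arrives between LBAD and LRTY, which matches the description of Fig. 3.7.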

3.4.3 Link Power Management

Requests to transition to low power link states are done at the link level during U0. Link

commands LGO_U1, LGO_U2, and LGO_U3 are sent by a port as a request to enter a

low power link state. LAU or LXU is sent by the other port as the response. LPMA is

sent by a port in response only to LAU.

3.4.3.1 Rules for a port to request or accept a low power link state

• It has transmitted LGOOD_n and LCRD_x for all packets received.

• It has received an LGOOD_n, LCRD_x sequence for all packets transmitted.

• It has no pending packets for transmission.

• It is permitted to request or accept by the highest layer.

• U3 is initiated only by a software request to a downstream port; it must be accepted.
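The LGO_Ux, LAU/LXU and LPMA handshake together with the rules above can be sketched as two small functions. This is an illustrative Python sketch with hypothetical names. The flags `idle` (all packets acknowledged and none pending) and `permitted` (allowed by the higher layer) are simplifications of the listed rules, not signals from the design.

```python
def respond_to_lgo(target, idle, permitted):
    """Link command sent by the port that receives LGO_Ux.

    target is "U1", "U2" or "U3". A U3 request must be accepted, so
    only U1 and U2 requests can be rejected with LXU.
    """
    if target not in ("U1", "U2", "U3"):
        raise ValueError("unknown low power link state")
    if target == "U3":
        return "LAU"
    return "LAU" if (idle and permitted) else "LXU"


def requester_reply(response):
    """The requesting port answers LAU with LPMA and nothing else."""
    return "LPMA" if response == "LAU" else None


if __name__ == "__main__":
    for target, idle, permitted in (("U1", True, True),
                                    ("U2", False, True),
                                    ("U3", False, False)):
        answer = respond_to_lgo(target, idle, permitted)
        print(target, answer, requester_reply(answer))
```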

3.5 Error Detection Algorithm (CRC)

Most modern communication protocols use one or more error detection algorithms, and the Cyclic Redundancy Check (CRC) is among the most widely used. The properties of a CRC are defined by the length and coefficients of its generator polynomial. A protocol specification usually defines the CRC in hex or polynomial notation. For example, the CRC5 used in the USB 3.0 protocol is represented as 0x5 in hex notation or as G(x) = x^5 + x^2 + 1 in polynomial notation. The other CRCs used in USB 3.0 are CRC16 and CRC32. A CRC is typically implemented in hardware as a linear feedback shift register (LFSR) with a serial data input, as shown in Fig. 3.8.

Fig. 3.8: CRC5 Serial Remainder Generation

The serial LFSR implementation of the CRC is suboptimal because its serial data input allows the CRC to be calculated on only one data bit per clock. If a design has a 32-bit wide data path, meaning that the CRC module has to calculate the CRC on 32 bits of data every clock, this approach does not work. To achieve higher throughput, the serial LFSR implementation has to be converted into a parallel N-bit wide circuit, where N is the data path width of the design, so that N bits are processed every clock. This is called a parallel CRC implementation.

3.5.1 Method of Parallel CRC Computation

The method is described step by step and is accompanied by an example of parallel CRC generation for the USB CRC5 polynomial G(x) = x^5 + x^2 + 1 with an 11-bit data width.

1. Let N = data width and M = CRC polynomial width. For example, to generate a parallel USB CRC5 for an 11-bit data path, N = 11 and M = 5.

2. Implement a serial CRC generator for the polynomial (00101b) in any language or in Multisim.

3. Parallel CRC implementation is a function of N-bit data input as well as M-bit

current CRC state. In the following steps we’re going to build two matrices:

a) Mout (next CRC state) as a function of Min (current CRC state) when Nin=0.

This matrix is of size [MxM].

b) Mout as a function of Nin when Min=0. This matrix is of size [NxM]

4. Using the routine from step (2) calculate the CRC for the N values of Nin when

Min=0. Each of the Nin values is one-hot encoded, that is there is only one bit set.

Mout = CRC (Nin, Min=0).

5. Build the following [NxM] matrix. Each row contains the results from step (4) in increasing order. For example, the 1st row contains the result for Nin = 0x1, the 2nd row for Nin = 0x2, and so on. The output is M bits wide, which is the desired CRC width. Table 3-2 shows the matrix for USB CRC5 with N = 11.

Table 3-2: NxM Matrix

Min=0 | Mout[4] | Mout[3] | Mout[2] | Mout[1] | Mout[0]
--- | --- | --- | --- | --- | ---
Nin[0] | 1 | 1 | 1 | 1 | 1
Nin[1] | 1 | 1 | 1 | 0 | 1
Nin[2] | 1 | 1 | 1 | 0 | 0
Nin[3] | 0 | 1 | 1 | 1 | 0
Nin[4] | 0 | 0 | 1 | 1 | 1
Nin[5] | 1 | 0 | 0 | 0 | 1
Nin[6] | 1 | 1 | 0 | 1 | 0
Nin[7] | 0 | 1 | 1 | 0 | 1
Nin[8] | 1 | 0 | 1 | 0 | 0
Nin[9] | 0 | 1 | 0 | 1 | 0
Nin[10] | 0 | 0 | 1 | 0 | 1

6. Each column in this matrix represents an output bit Mout[i] as a function of Nin.

7. Using the routine from step (2), calculate the CRC for the M values of Min when Nin = 0. Each value is one-hot encoded, that is, only one bit is set. For M = 5 the values are 0x1, 0x2, 0x4, 0x8, 0x10. Mout = CRC (Nin=0, Min).

8. Build the following [MxM] matrix, each row contains the results from (7) in

increasing order.

Table 3-3: MxM Matrix

Nin=0 | Mout[0] | Mout[1] | Mout[2] | Mout[3] | Mout[4]
--- | --- | --- | --- | --- | ---
Min[0] | 1 | 0 | 0 | 0 | 0
Min[1] | 0 | 0 | 1 | 0 | 1
Min[2] | 0 | 1 | 0 | 1 | 0
Min[3] | 1 | 0 | 1 | 0 | 0
Min[4] | 0 | 0 | 1 | 0 | 1

9. Build an equation for each Mout[i] bit: all Nin[j] and Min[k] bits in column [i] that

are set are polynomial coefficients and participate in the parallel CRC equation of

bit [i]. The participating inputs are XOR-ed together.
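The nine steps can be tried out with a short script. The sketch below is written in Python for illustration only (the project itself targets Verilog) and all function names are hypothetical. It assumes a plain LFSR for G(x) = x^5 + x^2 + 1 with Nin[0] shifted in first, and it leaves out any initial value or final inversion that the USB specification may require. Under these assumptions the rows of `data_matrix(11)`, read as Mout[4] down to Mout[0], agree with Table 3-2. The [MxM] rows depend on the number of shifts N, so they are worth regenerating whenever N changes.

```python
M = 5            # CRC width
POLY = 0b00101   # x^5 + x^2 + 1 with the x^5 term implied
MASK = (1 << M) - 1


def crc_serial(bits, state=0):
    """Step 2: shift bits (first element first) through the serial LFSR."""
    for d in bits:
        feedback = ((state >> (M - 1)) & 1) ^ d
        state = (state << 1) & MASK
        if feedback:
            state ^= POLY
    return state


def data_matrix(n):
    """Steps 4-5: row j is Mout for one-hot Nin[j] with Min = 0."""
    return [crc_serial([int(i == j) for i in range(n)], 0) for j in range(n)]


def state_matrix(n):
    """Steps 7-8: row k is Mout for one-hot Min[k] with Nin = 0."""
    return [crc_serial([0] * n, 1 << k) for k in range(M)]


def crc_parallel(data, state, n):
    """Step 9: XOR the rows selected by the set bits of data and state."""
    out = 0
    for j, row in enumerate(data_matrix(n)):
        if (data >> j) & 1:
            out ^= row
    for k, row in enumerate(state_matrix(n)):
        if (state >> k) & 1:
            out ^= row
    return out


if __name__ == "__main__":
    for j, row in enumerate(data_matrix(11)):
        print(f"Nin[{j}] -> {row:05b}")
    for k, row in enumerate(state_matrix(11)):
        print(f"Min[{k}] -> {row:05b}")
```

Because the LFSR is linear, the parallel result equals the serial result for every combination of data and state, which is what makes the XOR equations of step 9 valid.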

Chapter # 4

The Link Training and Status State Machine

(The LTSSM)


4.1 Introduction

The USB 3.0 architecture relies on dedicated algorithms to maintain a reliable link, keep power consumption low and sustain a high data transfer rate. The Link Training and Status State Machine (LTSSM) is the main mechanism for all three, and its functions contribute directly to the SuperSpeed performance delivered by USB 3.0. The LTSSM tunes and trains the USB link for reliable data transfer. It implements algorithms that maintain the link’s reliability and is responsible for recovering the link from any errors that arise. It also plays a key role in power management by reducing the link’s power consumption and eliminating conditions that waste power. The LTSSM allows or disallows data transactions over the USB 3.0 link based on the device state and the adopted power saving scheme. It blocks the data path when the device encounters a serious error and then performs the error recovery itself. When the link is ready to send or receive data, it is the LTSSM that allows the data to be communicated over the link. The LTSSM also prepares the link for data transactions when the device is first plugged in. Hence the LTSSM is the “DATA FLOW GATEWAY CONTROL” for the device.

The core responsibilities of the LTSSM include:

• Link Training & Initialization

• Power Management

• Error Recovery

For the sake of performing these duties, it communicates and co-ordinates with almost

all the layers of the device namely the PHY, the MAC, the link layer and also the master

controller. Thus it behaves as the central traffic controller for the device. A brief

overview of the placement and interconnection of LTSSM with other layers is shown in

the Fig. 4.1.

[Figure content: block diagram with the blocks USB 3.0 Device, Host’s Physical Layer, Data Link Layer, MAC Layer, Physical Layer, Master Controller and LTSSM.]

Fig. 4.1: LTSSM placement in USB 3.0 device

4.2 LTSSM’s Interconnections with other Layers

The LTSSM shares signals with the PHY layer, the MAC layer, the Data Link Layer (DLL) and the master controller. The LTSSM must inform all the layers before opening or closing the gates for data transactions, so as to save the device from unnecessary resending or loss of data. Each signal has its own significance and functionality. The signals have been designed according to the requirements of the USB 3.0 specification.

4.2.1 LTSSM and MAC Layer

The MAC layer acts as an interface between the PHY and the LTSSM. It processes signals coming from the LTSSM and transmits the corresponding signals to the PHY, and vice versa. It also passes these signals on to other layers if needed. The signals are needed for a variety of operations such as receiver detection, power management, and LFPS reception and transmission. A detailed list of the LTSSM – MAC interface signals is given in Table 4-1.

Table 4-1: LTSSM – MAC Interface Signals

Name of Signal | Direction | Purpose
--- | --- | ---
POWER DOWN[1:0] | OUT | Signals the PHY layer to adopt the appropriate power mode based upon the device usage and power saving scheme
RX_STATUS[2:0] | IN | Set by the PHY to signal the LTSSM about presence or absence of a far end receiver
PHY_STATUS | IN | Set by the PHY to signal completion of receiver detection
LFPS_RECEIVED | IN | Set by the MAC to signal the LTSSM about the reception of LFPS signals.
TRANSMIT_LFPS | OUT | Set by the LTSSM when it desires to send the LFPS (Low Frequency Periodic Signal)
RX_TERMINATION | OUT | Set by the LTSSM to enable or disable the PHY’s termination resistors.
TRANSMIT | OUT | Set by the LTSSM when it is ready to allow MAC to carry out packet transactions.
VBUS | IN | Set by MAC to signal the LTSSM about presence or absence of power signal (vbus).
RECEIVER_DO | OUT | Set by the LTSSM to signal the PHY to initiate receiver detection
SEND_IDLE | IN | Set by the MAC to signal the sending of IDLEs

4.2.2 LTSSM and PHY Chip

The LTSSM also shares some signals directly with the PHY chip where the MAC’s intervention is not necessary and might overload the MAC with extra tasks. These signals enable the LTSSM to control the PHY directly for its own transactions when the MAC or other layers are inactive, thereby reducing the device’s power consumption and increasing efficiency. Such needs arise during link training and error recovery.

Table 4-2: LTSSM – PHY Interface Signals

Name of Signal | Direction | Purpose
TX[31:0] | OUT | Used to send 32-bit data to the host when needed
TX_K[3:0] | OUT | Used to send the 4-bit K word to the host in accordance with the “Tx” signal
RX[31:0] | IN | Used to receive 32-bit data from the host when needed
RX_K[3:0] | IN | Used to receive the 4-bit K word from the host in accordance with the “Rx” signal
RX_VALID | IN | Set by host to signal valid data reception
RX_EQ_TRAIN | OUT | Used to direct the PHY to bypass normal operations and perform receiver equalization
RESET_B | IN | Used to reset the device registers and buffers
CLK | IN | Provided by PHY as the operational clock frequency
TX_ELEC_IDLE | OUT | Set by LTSSM to start or stop data transaction

4.2.3 LTSSM and Data Link Layer

The Data Link layer frequently needs to communicate with the LTSSM during various

operations. It signals the LTSSM about the emergence of any link error that must drive

the LTSSM into error recovery procedures. The link layer also needs to tell the LTSSM

about any link power management commands from the host that might need the LTSSM’s

intervention, since almost all power optimization is handled by the LTSSM. The

LTSSM – DLL interface signals are listed in Table 4-3.

Table 4-3: LTSSM – DLL Interface Signals

Name of Signal | Direction | Purpose
LGO_U1 | IN | Used to send the LTSSM to U1 from U0
LGO_U2 | IN | Used to send the LTSSM to U2 from U0
LGO_U3 | IN | Used to send the LTSSM to U3 from U0
LL_ADV_DONE | IN | Used to tell the LTSSM that U0 advertisement has been done
COMM_SENT | IN | To tell the LTSSM that command or data packets are being sent
SEND_LUP_EN | OUT | Used to instruct the DLL to send LUP link command word
ERROR_LL | IN | Used to tell the LTSSM to proceed to error recovery mode
U0 | OUT | Used to tell the device that device is in U0
RECOVERY_U0 | OUT | Used to tell DLL that U0 has been entered via recovery

4.3 An Overview Of The LTSSM State Machine

The LTSSM is specified with 12 main states that carry out these responsibilities. Four of them, U0, U1, U2 and U3, are dedicated to power management and provide different levels of energy saving. U0 is the fully active state in which all modules of the device are running, while U3 is the most dormant state, offering maximum power saving at the cost of the highest exit latency. U1 and U2 are intermediate power states; compared with U1, U2 allows further power saving with a penalty of increased exit latency. U3 is a link suspended state where aggressive power saving is possible. USB 3.0 has been designed for maximum power saving, rendering the device completely inactive or “sleeping” when not in use. This extends the battery life of portable devices such as laptops.

There are two link training and initialization states, RX.DETECT and POLLING. RX.DETECT represents the initial power-on link state where a port attempts to determine whether its SuperSpeed link partner is present, that is, it performs far-end receiver detection. POLLING is a link state defined for the two link partners to have their SuperSpeed transmitters and receivers trained, synchronized, and ready for packet transfer.

The error recovery states are RECOVERY and COMPLIANCE. These states are entered in case of a synchronization mismatch or a malfunction. Another state, LOOPBACK, is used like a ping to check the reliability and operation of the receiver and transmitter. Two further states result from a recovery failure or a disconnection: SS.DISABLED, which is entered when the device is rendered completely inactive or is disconnected, and SS.INACTIVE, which is entered when the device cannot operate in SuperSpeed mode. Finally, HOT RESET is entered when the host wishes to reset the device. The LTSSM state machine is shown as a state diagram in Fig. 4.2.
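The twelve states and their grouping can be written down compactly. The following is a minimal Python sketch for illustration only; the names `LtssmState` and `STATE_GROUPS` are hypothetical and are not taken from the implemented Verilog design.

```python
from enum import Enum


class LtssmState(Enum):
    """The 12 main LTSSM states named in this section."""
    U0 = "U0"
    U1 = "U1"
    U2 = "U2"
    U3 = "U3"
    RX_DETECT = "RX.DETECT"
    POLLING = "POLLING"
    RECOVERY = "RECOVERY"
    COMPLIANCE = "COMPLIANCE"
    LOOPBACK = "LOOPBACK"
    SS_DISABLED = "SS.DISABLED"
    SS_INACTIVE = "SS.INACTIVE"
    HOT_RESET = "HOT RESET"


# Grouping follows the prose of this section; the group labels are illustrative.
STATE_GROUPS = {
    "power management": [LtssmState.U0, LtssmState.U1, LtssmState.U2, LtssmState.U3],
    "link training and initialization": [LtssmState.RX_DETECT, LtssmState.POLLING],
    "error recovery": [LtssmState.RECOVERY, LtssmState.COMPLIANCE],
    "link test": [LtssmState.LOOPBACK],
    "failure or disconnect": [LtssmState.SS_DISABLED, LtssmState.SS_INACTIVE],
    "reset": [LtssmState.HOT_RESET],
}
```

In an HDL implementation the same list would typically become a set of state encodings for the top-level state register.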

4.4 Detailed Description of LTSSM States

4.4.1 SS.DISABLED

SS.Disabled is a state in which a port’s SuperSpeed connectivity is disabled and its receiver termination is removed. It is also the logical power-off state for a self-powered USB device. The port neither receives nor transmits any USB signals in this state; only VBUS is detectable.

4.4.2 SS.INACTIVE

This state is entered as a result of far-end receiver removal or other non-recoverable errors. During SS.Inactive, a port periodically performs far-end link partner detection. If a link partner is not detected, the device returns to RX.DETECT. Otherwise, the link stays in SS.Inactive until software intervenes by issuing a warm reset. The substates of SS.INACTIVE are shown in the SS.INACTIVE substate machine (Fig. 4.3).

Fig. 4.2: LTSSM State Machine Diagram.

Fig. 4.3: SS.INACTIVE Sub state Machine

4.4.3 RX.DETECT

Rx.Detect is the power-on state of the LTSSM for a USB device. It is entered after PowerOn Reset and Warm Reset and is used to detect the impedance of the far-end receiver. A port performs far-end receiver termination detection periodically during Rx.Detect. If the link partner is detected, the LTSSM transitions to the link training state, POLLING; otherwise it stays in RX.DETECT. The RX.DETECT substate machine is shown in Fig. 4.4.

Fig. 4.4: RX.DETECT Substate Machine

4.4.4 POLLING

Polling is the state for link training. During Polling, a Polling.LFPS handshake shall take place between the two ports before SuperSpeed training is started. Bit lock, symbol lock, and Rx equalization training are achieved using the TSEQ, TS1, and TS2 training ordered sets. The POLLING state contains several substates. During POLLING.LFPS, the LFPS handshake is carried out to set the link’s DC operating point. Upon successful completion of the handshake, POLLING.RXEQ is entered, during which more than 50,000 TSEQ ordered sets are sent to perform receiver equalization. Next comes POLLING.ACTIVE, where TS1 ordered sets are sent and received; a specified number of TS1 ordered sets must be exchanged for a successful TS1 handshake. Then, in the POLLING.CONFIG substate, TS2 ordered sets are exchanged, which carry different configuration settings. These configurations are decoded in the next substate, POLLING.IDLE, where it is decided whether the next state is the active U0 state or some other possible transition. IDLE symbols are exchanged during this substate. Upon successful completion of all these steps, the LTSSM is ready to put the USB link into the SuperSpeed packet transfer mode, U0, where all types of packet transfer are available and the link is fully active. The POLLING substate machine is shown in Fig. 4.5.

Fig. 4.5: POLLING Substate Machine
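The order of the POLLING substates described above can be sketched as a simple walk through a list. This Python fragment is only an illustration: `walk_polling` and the `step_ok` callback are hypothetical names, and what happens after a failed step is deliberately left out because it is defined by the substate machine in Fig. 4.5, not by this sketch.

```python
# Substates in the order given in the text, with the activity of each one.
POLLING_SUBSTATES = [
    ("POLLING.LFPS", "LFPS handshake"),
    ("POLLING.RXEQ", "TSEQ ordered sets sent for receiver equalization"),
    ("POLLING.ACTIVE", "TS1 ordered sets sent and received"),
    ("POLLING.CONFIG", "TS2 ordered sets exchanged"),
    ("POLLING.IDLE", "IDLE symbols exchanged"),
]


def walk_polling(step_ok):
    """Return the substates visited; 'U0' is appended only if every step succeeds.

    step_ok is a callable taking a substate name and returning True on success.
    The walk simply stops at the first failing substate.
    """
    visited = []
    for name, _activity in POLLING_SUBSTATES:
        visited.append(name)
        if not step_ok(name):
            return visited
    visited.append("U0")
    return visited
```

For example, `walk_polling(lambda name: True)` visits all five substates and ends in `"U0"`.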

4.4.5 U0 – LINK ACTIVE

U0 is the normal operational state where packets can be transmitted and received. All layers are active and working in this state. Because this state consumes the most power, it is sustained only as long as SuperSpeed packet transfer continues or is scheduled for the very near future. The link moves to a lower power state as soon as no packet transfer has been made or needed for a specified time.

4.4.6 U1 – LINK IDLE WITH FAST EXIT

U1 is a low power state where no packets are to be transmitted. This mode is a “light sleep”, so it provides the fastest transition back to the active state. There are two ways of exiting this state. If a packet transfer is needed again within a specified timeout period (the U2 inactivity timeout), the link moves back to the active state in response to a U1 exit LFPS signal. If the timeout expires and no activity is needed, it moves to an even lower power mode, U2.

Fig. 4.6: U1 Exit Conditions State Diagram
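The two exit conditions of U1 amount to a small decision rule. The sketch below is an assumption-laden illustration: the function name, the tick-based time unit and the parameter names are hypothetical, and the intermediate pass through RECOVERY (mentioned in the LFPS list later in this chapter) is collapsed into a direct return to U0.

```python
def u1_next_state(exit_lfps_seen, idle_ticks, u2_inactivity_timeout):
    """Decide where the link goes from U1.

    exit_lfps_seen: True if a U1 exit LFPS signal has been observed.
    idle_ticks: how long the link has been idle in U1 (arbitrary time unit).
    u2_inactivity_timeout: idle time after which the link drops to U2.
    """
    if exit_lfps_seen:
        return "U0"   # traffic is needed again: wake up
    if idle_ticks >= u2_inactivity_timeout:
        return "U2"   # timeout expired with no activity: deeper sleep
    return "U1"       # keep waiting
```

Giving the exit LFPS priority over the timeout is a design choice of this sketch; the source does not say which wins if both occur together.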

4.4.7 U2 – LINK IDLE WITH SLOW EXIT

U2 is an even lower power mode that provides deeper power saving but with an increased wake-up time. The device goes into a “moderate sleep”, so it takes a bit longer than U1 to wake up. U2 can only exit to U0, and either link partner can initiate the exit when a packet needs to be transmitted.

Fig. 4.7: U2 EXIT Conditions State Diagram

4.4.8 U3 – LINK SUSPENDED

U3 is the deepest low power link state, where aggressive power saving is provided but the exit latency is much higher than in the other two modes. The device goes into a “deep sleep”, so it takes the longest to wake up. U3 is entered by the host from U0 and can only exit to U0.

Fig. 4.8: U3 EXIT Conditions State Diagram

4.4.9 RECOVERY

The Recovery link state is entered to retrain the link after undergoing a serious error, or

to perform Hot Reset. The process of retraining is almost the same as initial training in

POLLING. However, in this case only TS1 and TS2 ordered sets are transmitted and

not TSEQ. The substate machine for RECOVERY is defined in Fig. 4.9.

Fig. 4.9: Recovery Substate Machine

4.4.10 LOOPBACK

Loopback is intended for testing the accuracy and compatibility of SuperSpeed receiver

and transmitter and also for fault isolation. Loopback includes a bit error rate test

(BERT) state machine. Loopback master is the port that starts loopback and slave is the

port that replies back.

4.4.11 COMPLIANCE

Compliance Mode is used to test the transmitter for compliance to voltage and timing specifications. Several different test patterns, designed for tuning different parameters of the physical layer, are transmitted during compliance mode. The LTSSM transitions from this state to RX.DETECT upon the issuance of a Warm Reset.

4.4.12 HOT RESET

The hot reset mode is used by either the device or the host to reset its own and its partner’s registers and timers, as required during active data transmission. When the host initiates the reset, it shall transmit TS2 ordered sets with the Reset bit asserted, and the device then does the same. Once both ports receive TS2 ordered sets with the Reset bit de-asserted, they shall exit from Hot Reset.Active and return to U0 after exchanging IDLE symbols.

Fig. 4.10: Hot Reset Substate Machine

4.5 Brief Description Of LTSSM’s Functionalities

4.5.1 Link Training & Initialization

One of the core tasks of the LTSSM is to train the USB 3.0 link and make it ready for data transactions. This process starts with the detection of a link partner at the far end of the link in the RX.DETECT state. Detection starts as soon as the partner is plugged into the bus. Once far-end receiver detection is complete, the LTSSM starts training the link in the POLLING state, achieving clock synchronization and bit lock through the transmission and exchange of TSEQ, TS1 and TS2. These training sequences contain data bits that are designed to train and align the receivers of the two link partners. First, TSEQ is sent a specified number of times; then TS1 sequences are both sent and received, since the far-end partner is also designed to detect and send back these sequences. Upon a successful handshake of all these link training stages between the link partners, the link is brought to the active power state, U0, where it is ready to carry out all SuperSpeed data transmissions and receptions.

Fig. 4.11: Link Initialization & Training Flow Chart

4.5.2 Power Management

The USB 3.0 architecture manages power consumption in a distinctive and efficient manner that minimizes the power drawn from the host. U0 is the state where data packets are exchanged at SuperSpeed and all other communication takes place. As soon as a data transfer operation is completed and no more data transfer is expected or scheduled, the system immediately puts the device into the lower power mode, U1. If the link remains idle for a specified time, it moves to U2, which provides even more power saving. Upon further idle behavior the device is brought into deep sleep, or “U3” mode, which provides maximum power saving by turning off even the internal clocks for most modules. The device then stays in this mode until re-triggered by the host.

4.5.3 Error Recovery

The RECOVERY state defined in the LTSSM is entered whenever link operation fails or the link encounters errors or mismatches. This state retrains the link and resets the device so that it can return to the data transfer mode it was formerly in. The COMPLIANCE mode is chiefly meant to check whether the receiver and transmitter are in proper alignment.

Fig. 4.12: Power Management Flowchart

Fig. 4.13: Error Recovery Flow Charts

There are also various timers, counters and sequence senders associated with the LTSSM that are constantly used in its operation. All these timers work within the tolerances allowed by the specification. All timeout values must be set to the specified values after PowerOn Reset or Inband Reset.

The LTSSM is also associated with a very useful provision of USB 3.0, the low frequency periodic signals (LFPS). These signals are of very low power and perform very important tasks such as handshakes, reset generation and device keep-alive pinging, which enable power saving and even higher speeds. The LFPS signals are characterized on the basis of their timings and repetitions as:

• Polling.LFPS – Sent during POLLING.LFPS as keep alive signal.

• Ping.LFPS – Sent during U1, U2 and COMPLIANCE as keep alive signal.

• U1/U2_EXIT_LFPS – Sent during U1/U2 to transit to recovery and then to U0.

• U3_WAKEUP – Sent during U3 to transit to recovery and then to U0.

• Warm Reset – Used to reset device registers and counters.

Chapter # 5

The Protocol Layer

5.1 Protocol Layer Overview

The protocol layer manages the end-to-end flow of data between a device and its host. This layer is built on the assumption that the link layer guarantees delivery of certain types of packets, and it adds end-to-end reliability for the rest of the packets depending on the transfer type. It is responsible for the vital decisions involved in managing a link: controlling data flow, managing the end-to-end connection to ensure error-free end-to-end transactions, and sending data to, or sinking data sent by, the remote protocol layer.

5.2 Types of Packets

SuperSpeed USB uses four basic packet types each with one or more subtypes. The four

packet types are:

• Link Management Packets (LMP) only travel between a pair of links (e.g., a pair of directly connected ports) and are primarily used to manage that link.

• Transaction Packets (TP) traverse all the links directly connecting the host to a device. They are used to control the flow of data packets, configure devices and hubs, etc. Transaction packets have no data payload.

• Data Packets (DP) traverse all the links directly connecting the host to a device. Data packets have two parts: a Data Packet Header (DPH) and a Data Packet Payload (DPP).

• Isochronous Timestamp Packets (ITP) are multicast on all the active links from the host to one or more devices.

NOTE: A detailed description of the packet formats is given in the USB 3.0 specification.
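The four packet types can be summarized as a small lookup table. This Python sketch is illustrative only; `PacketType` and `PACKET_TYPES` are hypothetical names. The text above states explicitly that TPs carry no payload and that DPs consist of a DPH and a DPP; treating LMPs and ITPs as header-only is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketType:
    name: str
    scope: str          # which links the packet travels over
    has_payload: bool   # True only when a Data Packet Payload follows the header


PACKET_TYPES = {
    "LMP": PacketType("Link Management Packet", "a single link", False),
    "TP": PacketType("Transaction Packet", "all links between host and device", False),
    "DP": PacketType("Data Packet", "all links between host and device", True),
    "ITP": PacketType("Isochronous Timestamp Packet",
                      "multicast from the host on all active links", False),
}

# Only data packets carry a payload after their header.
payload_carriers = [key for key, p in PACKET_TYPES.items() if p.has_payload]
```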

5.3 Hardware Implementation of Protocol Layer

Fig. 5.1 shows the basic building blocks of the implemented protocol layer.

 Fig. 5.1: Block Diagram of the Protocol Layer

5.3.1 Register banks for Descriptors and Device Configuration

A device needs to be configured before its functionality can be used. The host can read the device configuration to determine its capabilities and may set alternate settings for configurations. A device descriptor describes general information about a device, including information that applies globally to the device and all of its configurations. These register banks are accessed by the host through control transfers using setup packets. Setup packets are decoded by the packet disassembler, while the protocol layer controller is responsible for reading or writing these register banks.

5.3.2 Packet assembler

This module assembles packets. This involves encoding header packets (placing the header field contents as directed by the protocol layer controller, generating and appending the CRC-16, and allocating one 16-bit word for the link control word) and, for a DP, placing the desired data in the Data Packet Payload (DPP), followed by its CRC-32. The assembler fetches the DPP from the SRAM using the read buffer interface and places the assembled packets into dual-port-memory-bank-1 using the write buffer interface.
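The assembly order described above (header fields, CRC-16, link control word, payload, CRC-32) can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration and is not taken from the implemented design: the 12-byte header field region, the CRC-16 polynomial `0x100B` with seed `0xFFFF`, the MSB-first bit order, the little-endian byte order, the zeroed link control word, and the use of `zlib.crc32` as a stand-in CRC-32 generator. The sketch shows where each field goes, not bit-exact compliance with the specification.

```python
import struct
import zlib


def crc16(data, poly=0x100B, init=0xFFFF):
    """Bitwise MSB-first CRC-16. Polynomial and seed are assumptions of this sketch."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def assemble_data_packet(header_fields, payload):
    """Header fields + CRC-16 + 16-bit link control word, then payload + CRC-32."""
    if len(header_fields) != 12:
        raise ValueError("this sketch assumes 12 bytes of header fields")
    link_control_word = b"\x00\x00"  # placeholder; filled in by a lower layer
    header = header_fields + struct.pack("<H", crc16(header_fields)) + link_control_word
    return header + payload + struct.pack("<I", zlib.crc32(payload))


def check_data_packet(packet):
    """Recompute both CRCs, as a disassembler would, and report whether they match."""
    header_fields = packet[:12]
    (header_crc,) = struct.unpack("<H", packet[12:14])
    payload = packet[16:-4]
    (payload_crc,) = struct.unpack("<I", packet[-4:])
    return crc16(header_fields) == header_crc and zlib.crc32(payload) == payload_crc
```

A packet built with `assemble_data_packet(bytes(12), b"hello")` passes `check_data_packet`, while flipping any single byte of the header fields or payload makes the check fail.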

Packet assembler is capable of assembling following packets:

1. Link Management Packets

Following subtypes are supported in the packet assembler

Set link function

U2 inactivity timeout

Port capability

Port configuration response.

2. Transaction Packets

Following subtypes are supported in the packet assembler

ACK

NRDY

ERDY

STALL

DEV_NOTIFICATION

PING_RESPONSE

3. Data Packets

Data packets are responsible for end-to-end data transfer. They don’t have subtypes.

The packet assembler latches the fields required for assembling a packet each time the controller requests it to assemble one, so the controller does not need to keep driving a valid configuration until the packet assembler is done. It provides the necessary handshaking signals to the protocol layer controller for efficient performance.

NOTE: The assembled packets meet all the requirements of the specification.

5.3.3 Packet-disassembler

This module disassembles, or decodes, packets received from the remote protocol layer. This involves extracting the packet’s description (packet header type, subtype, sequence number, etc.) and detecting CRC-32 errors, aborted DPPs, missing DPPs and data length errors. The header information extracted each time a valid packet is received is provided to the protocol layer controller to initiate or resume the appropriate transactions. On reception of a valid data packet, the packet disassembler either fills the data buffers with the data in the DPP or returns an appropriate response to the host (as specified in the USB 3.0 specification). It communicates with dual-port-memory-bank-2 through the read buffer interface to fetch the packets stored there, and places the DPP extracted from a valid DP into the SRAM. It continues to provide the description extracted from the previous valid disassembled packet until it is instructed to decode another packet. It is capable of decoding the following packets:

1. Link Management Packets:

Following subtypes are supported in the packet disassembler

Set link function

U2 inactivity timeout

Port capability

Port configuration.

2. Transaction Packets

Following subtypes are supported in the packet disassembler

ACK

STATUS

3. Data Packets

Data packets are responsible for end-to-end data transfer. They don’t have subtypes.

5.3.4 Protocol layer controller

This module controls the overall operation of the protocol layer, making decisions on transactions received from and sent to the remote protocol layer. It provides the packet assembler with the configuration to be sent in header packets and, for a DP, the number of bytes to be placed in the DPP. The packet assembler must fetch the configuration when it is indicated as valid and the master controller directs it to assemble the packet. The protocol layer controller also informs the master controller about the scheduling of the encode section. When a valid packet is disassembled by the packet disassembler, the protocol layer controller uses the newly extracted information together with the previously extracted information to furnish configurations to the packet assembler. It also tells the packet assembler and disassembler the base address for fetching the desired data from the SRAM, as well as the base address when data is to be written into the SRAM. The protocol layer controller supports control transfers and IN and OUT transfers with a maximum burst size of four.

5.3.4.1 IN Transfers

The protocol layer supports IN transfers of different burst sizes, as specified by the first ACK TP it receives in each transaction. The host can thus change the burst size dynamically if it wishes. In these transfers data flows from the device and is sunk by the host. The host initiates the transfer by sending an ACK TP with the appropriate sequence number. On reception of a valid ACK TP, the device responds with data packets carrying the requested sequence numbers.
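The relation between an ACK TP and the data packets sent in reply can be sketched as follows. This is a hedged illustration: `dps_for_ack` is a hypothetical helper, `num_p` stands for the number of packets the host says it can accept, the burst limit of four comes from this report, and the wrap-around at 32 assumes the 5-bit sequence number field of USB 3.0.

```python
from typing import List

SEQ_MODULO = 32   # assumes a 5-bit sequence number field
MAX_BURST = 4     # maximum burst size supported by the implemented device


def dps_for_ack(seq_num: int, num_p: int, max_burst: int = MAX_BURST) -> List[int]:
    """Sequence numbers of the DPs the device may send in reply to one ACK TP.

    seq_num: sequence number requested by the host in the ACK TP.
    num_p: number of packets the host is prepared to accept.
    """
    count = min(num_p, max_burst)
    return [(seq_num + i) % SEQ_MODULO for i in range(count)]
```

For instance, an ACK TP requesting sequence number 30 with room for four packets yields DPs numbered 30, 31, 0 and 1.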

This transfer sequence is implemented as follows. When the protocol layer receives an ACK TP from the host, it fetches the specified number of bytes (as defined by the maximum packet size of the device endpoint companion descriptor) from the SRAM and computes their CRC-32, having already prepared the header with its CRC-16 field. The packet assembler then pushes the assembled packet into dual-port-memory-bank-1. The protocol layer controller may wait for the next ACK TP before sending a new DP, as determined by NumP and the number of DPs it has sent since the last ACK TP. ACK TPs are fetched from dual-port-memory-bank-2. The appropriate responses during IN transactions are listed in Table 5-1:

Fig. 5.2: SuperSpeed IN transfer sequence

Table 5-1: Responses to the TP requesting Data.

Invalid TP Received | TP Received with Deferred Bit Set | Device Tx Endpoint Halt Feature Set | Device Ready to Transmit Data | Action Taken
Yes | Do not care | Do not care | Do not care | The device shall ignore the TP.
No | Yes | Yes | Do not care | The device shall send an ERDY TP.
No | Yes | No | No | The device shall not respond. It shall send an ERDY TP when it is ready to resume.
No | Yes | No | Yes | The device shall send an ERDY TP indicating that it is ready to send data.
No | No | Yes | Do not care | Issue STALL TP
No | No | No | No | Issue NRDY TP
No | No | No | Yes | Start transmitting DPs with sequence numbers requested by the host

NOTE: Since this is a memory device with only one master accessing the storage area,

device TX endpoint halt feature is a “don’t care”.
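Table 5-1 is a priority-ordered decision table, so it maps directly onto a chain of conditions. The function below is a minimal Python sketch of that mapping; the function name and the returned labels are hypothetical shorthand for the actions in the table, not signals of the implemented controller.

```python
def respond_to_data_request(invalid_tp, deferred, endpoint_halted, ready):
    """Return the action from Table 5-1 for a TP that requests data.

    Conditions are checked in the table's column order, so each
    'Do not care' entry is simply a condition that is never reached.
    """
    if invalid_tp:
        return "IGNORE_TP"
    if deferred:
        if endpoint_halted or ready:
            return "SEND_ERDY"
        return "NO_RESPONSE_ERDY_WHEN_READY"
    if endpoint_halted:
        return "SEND_STALL"
    if ready:
        return "SEND_DPS"
    return "SEND_NRDY"
```

Because the conditions are evaluated in a fixed order, every one of the sixteen input combinations resolves to exactly one row of the table.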

5.3.4.2 OUT Transfer

During this transfer, the host sources the data packets while the device endpoint consumes them.

Table 5-2: Host responses to the DP it receives from the device.

DPH has Invalid Values | Data Packet Payload Error | Host Can Accept Data | TP Returned by Host
Yes | Do not care | Do not care | Discard data and do not send any TP.
No | Yes | Do not care | Discard data and send an ACK TP with the Retry bit set requesting for one or more DPs with the Sequence Number field set to the sequence number of the DP that was corrupted.
No | No | No | Discard data; send an ACK TP with the Retry bit set requesting for one or more DPs with the Sequence Number field set to the sequence number of the DP that the host was unable to receive. The ACK TP shall have the Host Error bit set to one to indicate that the host was unable to accept the data.
No | No | Yes | Accept data and send an ACK TP requesting for zero or more DPs with the Sequence Number field set to the sequence number of the next DP expected. This is also an implicit acknowledgement that this DP was received successfully.

Each DP is acknowledged by the device with the specified sequence number. Since the device supports data bursting with a maximum burst size of four, the host can send up to four DPs before it waits for an ACK TP for the first DP it sent. During OUT transfers, DPs are fetched from dual-port-memory-bank-2 and their header information is extracted by the packet disassembler. If the received packet is a valid DP with no CRC-32 errors, its payload is written into the SRAM. The header information is provided to the protocol layer controller so that it can give the appropriate response, as listed in Table 5-3.

Table 5-3: Device responses to the DP received from the host.

DPH has Invalid Values | DPH has Deferred Bit Set | Receiver Halt Feature Set | Data Packet Payload Error | Device Can Accept Data | TP Returned by Device
Yes | Do not care | Do not care | Do not care | Do not care | Discard DP.
No | Yes | Yes | Do not care | Do not care | The device shall send an ERDY TP.
No | Yes | No | Do not care | No | The device shall not respond. It shall send an ERDY TP when it is ready to resume.
No | Yes | No | Do not care | Yes | The device shall send an ERDY TP.
No | No | Yes | Do not care | Do not care | The device shall send a STALL TP.
No | No | No | Do not care | No | Discard DP, send an NRDY TP.
No | No | No | Yes | Yes | Discard DP, send an ACK TP with the sequence number of the DP expected (thereby indicating that the DP was not received), the Retry bit set and the number of DPs that the device can receive for this endpoint.
No | No | No | No | Yes | Send an ACK TP indicating the sequence number of the next DP expected (thereby indicating that this DP was received successfully) and the number of DPs that the device can receive for this endpoint.

NOTE: Conditions for DPH to have deferred bit set or receiver halt feature are “don’t

care”.
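A table with many "Do not care" entries, such as Table 5-3, can also be encoded as data and matched row by row. The Python sketch below takes that table-driven approach; `RULES` and `device_response` are hypothetical names, `None` stands for "Do not care", and the action strings are abbreviated paraphrases of the table cells.

```python
# Column order: invalid DPH, deferred bit set, receiver halted,
# payload error, device can accept data.  None means "Do not care".
RULES = [
    ((True,  None,  None,  None,  None),  "discard DP"),
    ((False, True,  True,  None,  None),  "send ERDY TP"),
    ((False, True,  False, None,  False), "no response; send ERDY TP when ready"),
    ((False, True,  False, None,  True),  "send ERDY TP"),
    ((False, False, True,  None,  None),  "send STALL TP"),
    ((False, False, False, None,  False), "discard DP; send NRDY TP"),
    ((False, False, False, True,  True),  "discard DP; send ACK TP with Retry bit set"),
    ((False, False, False, False, True),  "send ACK TP for next expected DP"),
]


def device_response(invalid_dph, deferred, halted, payload_error, can_accept):
    """Return the action of the first row of Table 5-3 that matches the inputs."""
    conditions = (invalid_dph, deferred, halted, payload_error, can_accept)
    for pattern, action in RULES:
        if all(p is None or p == c for p, c in zip(pattern, conditions)):
            return action
    raise ValueError("no matching row")  # unreachable if the table is complete
```

Keeping the rows as data makes it easy to check that the table covers all 32 input combinations.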

 

Fig. 5.3: SuperSpeed OUT transfer sequence

5.3.5 Buffers for packet storage

Since the implemented USB 3.0 device supports burst transactions with a maximum burst size of four, the protocol layer is implemented with four data-packet buffers for OUT transactions. In this implementation the buffers are realized in the SRAM entity, because the intended system has nothing to do with the USB device operation. In the intended system, however, there would be another dual-port-memory-bank serving as temporary storage buffers, and these buffers would be dumped to the hard drive once they are acknowledged.

5.3.6 Buffer controllers (buffer interfaces)

These controllers are meant for fetching and writing packets of a specified data packet size on temporary storage buffers in the device’s protocol layer or in the dual-port-memory-banks. The concept behind these controllers is to relieve the protocol layer controller of the burden of fetching and writing the packets in the buffer storage area. They provide handshaking signals for efficient performance. Data is fetched from the buffers via read buffer interfaces and written via write buffer interfaces.

Chapter # 6

The Master Controller

6.1 Master Controller Overview

The Master Controller is developed to command the communication flow between the modules. The centralized master controller monitors and controls the decoding and encoding operations separately. Fig. 6.1 depicts the IO interface of the master controller with the LTSSM and the Physical Layer, Link Layer and Protocol Layer controllers.

Fig. 6.1: Top Level Block Diagram of Master Controller; showing IO interface with each layer and LTSSM.

The control flow for the encoding and decoding processes is described in the following sections.

6.2 Decoding Path Controller

The decoding process takes a packet from the Phy chip and passes it to the link layer controller (decoder) and onward. The master controller follows the sequence below.

1. When the Phy Layer controller (Phy decoder) has received the complete packet, it generates an indication signal to the master controller, which in turn initializes the Link Layer (LL) decoder, provided that the LL decoder is not already busy. At the same time, the master sends the LL decoder the packet size that it received from the Phy Layer decoder on complete reception of the packet.

2. When the packet has been processed by the LL decoder, it generates an indication signal to the master controller, which in turn initializes the Protocol Layer (PL) decoder, provided that it is not already busy. The link layer decoder disassembles the received packet (Chapter 2) and sends the new packet size (the packet size changes after passing through the packet disassembler) to the master controller. The master then sends this new packet size to the protocol layer decoder at the time of its initialization.

Note: The master must deassert the initializing signal of the Link Layer and Protocol Layer decoders as soon as they acknowledge it.

Fig. 6.2 depicts the timing diagram of decoding process.
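The hand-off rule of the decoding path (start the next stage only if it is idle, and pass the packet size along) can be modelled behaviourally. The Python sketch below is a simplified illustration and not a translation of the Verilog master controller: the class and method names are hypothetical, the initialization pulse and its acknowledgement are collapsed into one method call, and how a stage later becomes idle again is left outside the sketch.

```python
class DecoderStage:
    """A decoder that can be initialized with a packet size when it is idle."""

    def __init__(self, name):
        self.name = name
        self.busy = False
        self.packet_size = None

    def initialize(self, packet_size):
        self.busy = True
        self.packet_size = packet_size


class MasterDecodeController:
    """Chains Phy decoder -> LL decoder -> PL decoder as described in the steps above."""

    def __init__(self):
        self.ll_decoder = DecoderStage("LL decoder")
        self.pl_decoder = DecoderStage("PL decoder")

    @staticmethod
    def _start_if_idle(stage, packet_size):
        if stage.busy:
            return False          # stage still working: do not initialize it
        stage.initialize(packet_size)
        return True

    def on_phy_done(self, packet_size):
        """Step 1: Phy decoder has a complete packet of packet_size bytes."""
        return self._start_if_idle(self.ll_decoder, packet_size)

    def on_ll_done(self, new_packet_size):
        """Step 2: LL decoder finished; the size changed after disassembly."""
        self.ll_decoder.busy = False
        return self._start_if_idle(self.pl_decoder, new_packet_size)
```

A second `on_phy_done` call while the LL decoder is still busy returns `False`, which corresponds to the master holding back the initialization signal.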

 

 

 

 

 

 

6.3 Encoding Path Controller

The master controller follows the sequence below in order to encode a packet.

1. The Protocol Layer (PL) encoder is initialized when the master-configuration valid signal is received by the master controller, provided that the PL encoder is not already busy. As soon as the complete packet is encoded, the PL encoder generates a “pl_enc_done” signal (shown in Fig. 6.3) to the master, informing it that the packet has been transferred into the buffer and is ready to be fetched by the Link Layer controller. The master controller then generates a signal to initialize the Link Layer (LL) encoder, provided that the LL encoder is not already busy. At the same time, the master sends the LL encoder the packet size that it received from the PL encoder on completion of the packet.

2. After processing, assembling and transferring the complete packet into the buffer (Chapter 2), the LL encoder generates an indication signal to the master controller, which in turn initializes the Phy Layer encoder, provided that it is not already busy. The link layer encoder also sends the new packet size (the packet size changes after passing through the packet assembler) to the master controller. The master then sends this new packet size to the Phy layer encoder at the time of its initialization.

Note: The master must deassert the initializing signal of the Protocol Layer, Link Layer and Phy Layer encoders as soon as they acknowledge it.

Fig. 6.3 depicts the timing diagram of encoding process.

Chapter # 7

Functional Simulation of Implemented Device

Simulation of the memory device is demonstrated by looping back the data from the SRAM. In the first phase of simulation, the SRAM is completely filled with data sent from the behavioral model of the host. In the second phase, the behavioral model reads the filled SRAM and checks whether the correct data is fetched. In this way, operation from the PHY controller and link layer up to the protocol layer is verified. To meet this goal, several compulsory features of each layer are also checked and verified.

7.1 Functional Verification of LTSSM

To prepare the device for data transmission and reception, the link must be initialized
and trained. This process starts with the detection of a link partner at the far end of
the link in the Rx.Detect state. Detection starts as soon as the partner is plugged into
the bus. Once far-end receiver detection is complete, the LTSSM moves to the Polling
state, where it trains the link for clock-frequency synchronization and bit lock by
transmitting and exchanging the TSEQ, TS1 and TS2 training sequences. These training
sequences contain data bits designed to train and align the receivers of the two link
partners. First, TSEQ is sent a specified number of times; then TS1 sequences are both
sent and received, since the far-end partner is also designed to detect and send back
these sequences. Upon a successful handshake of all these link training stages between
the link partners, the link is brought to the active power state, U0, where it is ready
to carry out all SuperSpeed data transmissions and receptions.
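The training sequence described above can be modelled as a heavily simplified state machine. The sketch below is a Python illustration under stated assumptions: the state names, the `Ltssm` class and the counts `tseq_count` and `ts_needed` are placeholders chosen for readability, not the values or substates required by the USB 3.0 specification, and LFPS signalling, timeouts and error paths are left out.

```python
from enum import Enum, auto


class State(Enum):
    RX_DETECT = auto()
    POLLING_TSEQ = auto()
    POLLING_TS1 = auto()
    POLLING_TS2 = auto()
    U0 = auto()


class Ltssm:
    def __init__(self, tseq_count=4, ts_needed=2):
        self.state = State.RX_DETECT
        self.tseq_count = tseq_count  # illustrative, not the specified count
        self.ts_needed = ts_needed    # consecutive sets required to advance
        self.sent = 0
        self.seen = 0

    def step(self, partner_present=False, received=None):
        """Advance one step; return the ordered set transmitted, or None."""
        if self.state is State.RX_DETECT:
            if partner_present:
                self.state = State.POLLING_TSEQ
            return None
        if self.state is State.POLLING_TSEQ:
            self.sent += 1
            if self.sent >= self.tseq_count:
                self.state = State.POLLING_TS1
            return "TSEQ"
        if self.state is State.POLLING_TS1:
            self.seen = self.seen + 1 if received in ("TS1", "TS2") else 0
            if self.seen >= self.ts_needed:
                self.state, self.seen = State.POLLING_TS2, 0
            return "TS1"
        if self.state is State.POLLING_TS2:
            self.seen = self.seen + 1 if received == "TS2" else 0
            if self.seen >= self.ts_needed:
                self.state = State.U0
            return "TS2"
        return None  # U0: link is trained, no training sets are sent


# Two link partners wired back to back.
a, b = Ltssm(), Ltssm()
tx_a = tx_b = None
for _ in range(20):
    tx_a, tx_b = a.step(True, tx_b), b.step(True, tx_a)
```

With both partners starting together, each reaches `State.U0` after exchanging TS1 and TS2; without a partner the model stays in `State.RX_DETECT`.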

7.2 Functional Verification of Phy Layer Controller

The Phy decoder plays its role during the first phase of simulation, as the data is coming
from the host. The Phy decoder remains idle until the RxValid signal (from the behavioral
model of the host, Fig. 2.2) is asserted. As soon as the rising edge of RxValid is sensed,
the decoder requests the Write Buffer Interface to receive the data coming from the host.
As soon as the interface acknowledges, the Phy decoder starts fetching the data and places
it on the ports facing the Write Buffer Interface, which in turn places the data into
buffer # 4 (Fig. 1.1). Meanwhile, the decoder also looks for the control bytes (on the
RxData bus), from which it determines the size of the packet (see Section 2.6). The
operation of the Phy encoder and decoder is described in detail in Sections 2.5 and 2.6,
respectively.
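The decoder's reaction to RxValid can be illustrated with a small cycle-based model. This is a hedged Python sketch: `WriteBufferInterface`, `PhyDecoder` and the immediate acknowledge are assumptions, and the control-byte parsing of Section 2.6 is not modelled.

```python
class WriteBufferInterface:
    """Hypothetical interface that grants every request immediately."""

    def __init__(self):
        self.buffer = []  # stands in for buffer # 4

    def request(self):
        return True       # assumption: acknowledge in the same cycle

    def write(self, byte):
        self.buffer.append(byte)


class PhyDecoder:
    def __init__(self, wbi):
        self.wbi = wbi
        self.prev_rx_valid = False
        self.receiving = False

    def tick(self, rx_valid, rx_data):
        rx_valid = bool(rx_valid)
        rising = rx_valid and not self.prev_rx_valid
        self.prev_rx_valid = rx_valid
        if rising:
            # Rising edge of RxValid: ask the write buffer interface for access.
            self.receiving = self.wbi.request()
        if not rx_valid:
            # No valid data on the bus: fall back to idle.
            self.receiving = False
        if self.receiving:
            self.wbi.write(rx_data)


wbi = WriteBufferInterface()
decoder = PhyDecoder(wbi)
for valid, data in [(0, 0x00), (1, 0xAA), (1, 0xBB), (0, 0x00), (1, 0xCC)]:
    decoder.tick(valid, data)
```

After this stimulus `wbi.buffer` holds only the bytes presented while RxValid was high. A real interface would normally take one or more cycles to acknowledge, so the first byte would have to be held until then.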


7.3 Functional Verification of Link Layer

When the link layer initializes to U0, it starts by advertising the last acknowledged
Header Sequence Number and the Buffer Credit to its link partner. After the advertisement,
the link layer keeps sending LUP to its link partner as long as no transaction is being
carried out on the encode or decode path. When the encode path initializes and the
protocol layer places the assembled data in dual-port memory bank 1, the link layer
controller fetches the data, adds the link control word and the already prepared and
framed RX and TX link commands, and writes the encoded data to dual-port memory bank 3
for the PHY layer. Similarly, when the decode path initializes, the link layer reads the
data from dual-port memory bank 4, tests it for CRC errors, matches the Header Sequence
Number, generates link command words, decodes the link commands received from the link
partner and updates its parameters accordingly. Reception of LBAD stops the transaction
of data until LRTY is received or transmitted by the link partners. Link commands for
link power management are accepted only if all header and data packets have been
acknowledged by the link partner and removed from the buffers; however, LGO_U3
(requesting entry to U3) from the host cannot be rejected.
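The decode-path checks and the power-management rule can be sketched as follows. This Python model is illustrative only: `LinkReceiver` and `power_request_reply` are hypothetical names, `zlib.crc32` is a stand-in for the link layer's actual CRC fields, the `"SEQ_MISMATCH"` result is a placeholder because the text does not state how the design reacts to a sequence mismatch, and the LAU/LXU reply names are taken from the USB 3.0 link command set rather than from this report.

```python
import zlib

SEQ_MODULO = 8  # Header Sequence Numbers run from 0 to 7 in USB 3.0


class LinkReceiver:
    def __init__(self):
        self.expected_seq = 0
        self.waiting_for_retry = False

    def on_link_command(self, command):
        if command == "LRTY":
            # The partner announces a retry; resume normal reception.
            self.waiting_for_retry = False

    def on_header(self, seq, payload, crc):
        """Return the link command to send back, or None if the header is dropped."""
        if self.waiting_for_retry:
            return None
        if zlib.crc32(payload) != crc:  # stand-in CRC check
            self.waiting_for_retry = True
            return "LBAD"
        if seq != self.expected_seq:
            return "SEQ_MISMATCH"       # placeholder, reaction not modelled
        self.expected_seq = (seq + 1) % SEQ_MODULO
        return f"LGOOD_{seq}"


def power_request_reply(command, unacked_packets):
    """Accept (LAU) or reject (LXU) a link power management request."""
    if command == "LGO_U3":
        return "LAU"  # entry to U3 requested by the host cannot be rejected
    return "LAU" if unacked_packets == 0 else "LXU"


rx = LinkReceiver()
header = b"header"
good_crc = zlib.crc32(header)
replies = [
    rx.on_header(0, header, good_crc),      # accepted
    rx.on_header(1, header, good_crc ^ 1),  # corrupted, answered with LBAD
    rx.on_header(1, header, good_crc),      # dropped until LRTY arrives
]
rx.on_link_command("LRTY")
replies.append(rx.on_header(1, header, good_crc))
```

The model covers only the receiving side: after answering with LBAD it ignores headers until LRTY arrives, which corresponds to the stop-and-retry behaviour described above.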

7.4 Functional Verification of Protocol Layer

The master controller strobes the protocol layer dis-assembler to initiate its operation.
The dis-assembler decodes the packet stored in dual-port memory bank # 3, extracts the
descriptions stored in the packet and routes them to the protocol layer controller. The
controller decides which packet is to be sent in response, evaluates its configuration
and provides it to the protocol layer packet assembler, while indicating to the master
controller that the packet assembler has a valid configuration with which to initiate a
valid packet. The master controller then strobes the packet assembler, which fetches the
required amount of data from the SRAM along with its crc32 field (if it is an IN
transaction) and appends it to the header packet, thus assembling the packet and writing
it into dual-port memory bank # 1. This is the most basic state machine of the protocol
layer. Depending on the configurations fetched from the decoded packets and on the packet
assembled, the protocol layer either resumes the transaction with successive sequence
numbers or waits for a new packet to be decoded. The protocol layer assembler and
dis-assembler, however, work concurrently.
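Packet assembly for an IN transaction, and the matching check on the dis-assembler side, can be illustrated in a few lines. This Python sketch rests on several assumptions: the function names and the header length are hypothetical, the CRC is computed here rather than fetched as a stored field, and `zlib.crc32` with little-endian byte order is only a stand-in for the data-payload CRC-32 defined by the USB 3.0 specification.

```python
import zlib


def assemble_data_packet(header, sram, offset, length):
    """Header followed by `length` payload bytes from the SRAM and a 4-byte CRC."""
    payload = bytes(sram[offset:offset + length])
    crc = zlib.crc32(payload).to_bytes(4, "little")  # stand-in CRC-32
    return bytes(header) + payload + crc


def disassemble_data_packet(packet, header_len):
    """Split a packet into header and payload, and report whether the CRC matches."""
    header = packet[:header_len]
    payload = packet[header_len:-4]
    crc_ok = zlib.crc32(payload).to_bytes(4, "little") == packet[-4:]
    return header, payload, crc_ok


sram = bytes(range(32))  # pretend SRAM contents
packet = assemble_data_packet(b"\x01\x02", sram, offset=4, length=8)
header, payload, crc_ok = disassemble_data_packet(packet, header_len=2)
```

Here `payload` equals the eight SRAM bytes starting at offset 4 and `crc_ok` is true; flipping any payload bit before dis-assembly makes the check fail.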

Bibliography

[1]. Universal Serial Bus 3.0 Specification, Revision 1.0, November 12, 2008.

[2]. Universal Serial Bus Specification, Revision 2.0, April 27, 2000.

[3]. PHY Interface for the PCI Express™ and USB Architectures, Version 2.90, Intel Corporation, 2007-08.

[4]. On-The-Go Supplement to the USB 2.0 Specification, Revision 1.3, December 5, 2006.

[5]. Inter-Chip USB Supplement to the USB 2.0 Specification, Revision 1.0, March 13, 2006.

[6]. High-Speed Inter-Chip USB Electrical Specification, Version 1.0, September 23, 2007.

[7]. UTMI+ Low Pin Interface (ULPI) Specification, Revision 1.1, October 20, 2004.

[8]. UTMI+ Specification, Revision 1.0, February 25, 2004.

[9]. USB System Architecture (USB 2.0), MindShare, Inc., Don Anderson.

[10]. Samir Palnitkar, Verilog HDL: A Guide to Digital Design and Synthesis, Second Edition, Prentice Hall, 2003.

[11]. Pong P. Chu, FPGA Prototyping by Verilog Examples, John Wiley & Sons, Inc., 2008.

[12]. Janick Bergeron, Writing Testbenches: Functional Verification of HDL Models.

[13]. Peter J. Ashenden, Digital Design: An Embedded System Approach using Verilog, Elsevier, 2008.