Our ‘xmit1000.c’ driver

Our ‘xmit1000.c’ driver: Implementing a ‘packet-transmit’ capability with the Intel 82573L network interface controller

Upload: abbott

Post on 12-Jan-2016



Page 1: Our ‘xmit1000.c’ driver

Our ‘xmit1000.c’ driver

Implementing a ‘packet-transmit’ capability with the Intel 82573L network interface controller

Page 2: Our ‘xmit1000.c’ driver

Remember ‘echo’ and ‘cat’?

• Your device-driver module (named ‘uart.c’) was supposed to allow two programs that are running on a pair of adjacent PCs to communicate via a “null-modem” cable

Transmitting…

$ echo Hello > /dev/uart
$ _

Receiving…

$ cat /dev/uart
Hello
_

Page 3: Our ‘xmit1000.c’ driver

‘keep it simple’

• Let’s try to implement a ‘write()’ routine for our Intel Pro/1000 ethernet controllers that will provide the same basic functionality as we achieved with our serial UART driver

• It should allow us to transmit a message by using the familiar UNIX ‘cat’ command to redirect output to a character device-file

• Our device-file will be named ‘/dev/nic’

Page 4: Our ‘xmit1000.c’ driver

Driver’s components

• my_fops: this ‘struct’ holds one function-pointer (‘write’), which points to my_write()

• my_write(): This function will program the actual data-transfer

• my_get_info(): This function will allow us to inspect the transmit-descriptors

• module_init(): This function will detect and configure the hardware, define page-mappings, allocate and initialize the descriptors, start the ‘transmit’ engine, create the pseudo-file and register ‘my_fops’

• module_exit(): This function will do needed ‘cleanup’ when it’s time to unload our driver: turn off the ‘transmit’ engine, free the memory, delete page-table entries, the pseudo-file, and the ‘my_fops’

Page 5: Our ‘xmit1000.c’ driver

kzalloc()

• Linux kernels since 2.6.13 offer this convenient function for allocating pre-zeroed kernel memory

• It has the same syntax as the ‘kmalloc()’ function (described in our texts), but adds the after-effect of zeroing out the newly-allocated memory-area

• Thus it does two logically distinct actions (often coupled anyway) within a single function-call

void *kmem = kmalloc( region_size, GFP_KERNEL );
memset( kmem, 0x00, region_size );

/* can be replaced with */
void *kmem = kzalloc( region_size, GFP_KERNEL );
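One detail the two-line version glosses over is that the allocator can fail and return NULL, in which case the memset() must be skipped. The sketch below is a userspace analogy only: malloc() and calloc() stand in for kmalloc() and kzalloc(), and the function names zalloc_two_step(), zalloc_one_step() and is_all_zero() are hypothetical, not part of the driver.

```c
#include <stdlib.h>
#include <string.h>

/* Two-step version: allocate, then clear.
   Userspace stand-in for kmalloc() followed by memset(). */
void *zalloc_two_step(size_t region_size)
{
    void *mem = malloc(region_size);
    if (mem != NULL)                 /* never clear a failed allocation */
        memset(mem, 0x00, region_size);
    return mem;
}

/* One-step version: calloc() plays the role of kzalloc() here. */
void *zalloc_one_step(size_t region_size)
{
    return calloc(1, region_size);
}

/* Checking helper: returns 1 if every byte in the region is zero. */
int is_all_zero(const void *mem, size_t region_size)
{
    const unsigned char *p = mem;
    for (size_t i = 0; i < region_size; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Both functions hand back either NULL or a fully zeroed region, which is the property the driver relies on when it allocates its descriptors.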

Page 6: Our ‘xmit1000.c’ driver

Single page-frame option

4KB Page-Frame:

• Packet-Buffer (3-KB) (reused for successive transmissions)

• Descriptor-Buffer (1-KB) (room for up to 64 descriptors of 16 bytes each)
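The arithmetic behind this layout can be sketched in a few lines. This is only an illustration under two assumptions: that the packet-buffer occupies the start of the page-frame with the descriptor area after it, and that each descriptor is 16 bytes long (the Tx-descriptor layout recalled later in these slides). With 16-byte descriptors, a 1-KB region works out to 64 descriptors. The macro and function names are hypothetical.

```c
#include <stddef.h>

#define PAGE_FRAME_SIZE  4096u                 /* one 4-KB page-frame          */
#define PKT_BUF_SIZE     3072u                 /* 3-KB packet-buffer           */
#define DESC_BUF_SIZE    (PAGE_FRAME_SIZE - PKT_BUF_SIZE)   /* remaining 1 KB */
#define DESC_SIZE        16u                   /* assumed descriptor size      */

/* Offset of the descriptor area from the start of the page-frame
   (assumes the packet-buffer comes first). */
size_t desc_area_offset(void)
{
    return PKT_BUF_SIZE;
}

/* How many whole descriptors fit in the descriptor area. */
size_t max_descriptors(void)
{
    return DESC_BUF_SIZE / DESC_SIZE;
}

/* Returns 1 if the descriptor area starts on a 16-byte boundary. */
int desc_area_is_aligned(void)
{
    return (desc_area_offset() % DESC_SIZE) == 0;
}
```

Placing the descriptor area at offset 3072 keeps it on a 16-byte boundary whenever the page-frame itself is page-aligned.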

Page 7: Our ‘xmit1000.c’ driver

Our Tx-Descriptor ring

Array of 8 transmit-descriptors (descriptor 0 through descriptor 7), with the TAIL and HEAD indices pointing into the array

One packet-buffer: our ‘reusable’ transmit-buffer (1536 bytes)

After writing the data into our packet-buffer, and writing its length to the current TAIL descriptor, our driver will advance the TAIL index; the NIC responds by reading the current HEAD descriptor, fetching its data, then advancing the HEAD index as it sends our data out over the wire.
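The HEAD/TAIL hand-off can be modeled in plain C without any hardware. The sketch below is a simulation, not driver code: the TxRing type and the ring_* function names are hypothetical, the ‘NIC side’ is just a function call, and the only fact taken from the slide is that both indices walk around an array of 8 descriptors.

```c
#define N_TX_DESC 8              /* ring size used on this slide */

typedef struct {
    unsigned int head;           /* next descriptor the NIC will process (models TDH) */
    unsigned int tail;           /* next descriptor the driver will fill (models TDT) */
} TxRing;

/* Driver side: hand the current TAIL descriptor to the NIC by
   advancing TAIL (with wrap-around); returns the index just used. */
unsigned int ring_submit(TxRing *r)
{
    unsigned int used = r->tail;
    r->tail = (r->tail + 1) % N_TX_DESC;
    return used;
}

/* Simulated NIC side: if a descriptor is pending, 'send' it and
   advance HEAD. Returns 1 if something was sent, 0 if the ring is empty. */
int ring_nic_step(TxRing *r)
{
    if (r->head == r->tail)
        return 0;
    r->head = (r->head + 1) % N_TX_DESC;
    return 1;
}

/* Number of descriptors handed over but not yet processed. */
unsigned int ring_pending(const TxRing *r)
{
    return (r->tail + N_TX_DESC - r->head) % N_TX_DESC;
}
```

When HEAD equals TAIL the ring is empty, which is the state the NIC returns to after each single-packet transmission in our driver.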

Page 8: Our ‘xmit1000.c’ driver

‘/proc/xmit1000’

• This pseudo-file can be examined anytime to find out what values (if any) the NIC has ‘written back’ into the transmit-descriptors (i.e., the descriptor-status information) and current values in registers TDH and TDT:

$ cat /proc/xmit1000

Page 9: Our ‘xmit1000.c’ driver

Direct Memory Access

• The NIC is able to ‘fetch’ descriptors from the host system’s memory (and can also read the data from our packet-buffer), as well as ‘store’ a status-report back into the host’s memory, by temporarily becoming the Bus Master: it takes control of the system-bus away from the CPU so that it can perform the ‘fetch’ and ‘store’ operations directly, without CPU involvement or interference

Page 10: Our ‘xmit1000.c’ driver

Configuration registers

CTRL        Device Control
CTRL_EXT    Extended Device Control
TIPG        Transmit Inter-Packet Gap
TCTL        Transmit Control
TDBAL       Transmit Descriptor-queue Base-Address (LOW)
TDBAH       Transmit Descriptor-queue Base-Address (HIGH)
TDLEN       Transmit Descriptor-queue Length
TDH         Transmit Descriptor-queue HEAD
TDT         Transmit Descriptor-queue TAIL
TXDCTL      Transmit Descriptor-queue Control
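A driver typically names these registers by their byte offsets from the start of the NIC’s i/o-memory region. In the sketch below only the TCTL (0x0400) and TIPG (0x0410) offsets and the name E1000_TCTL appear in these slides; the other names follow the same pattern and the other offsets are taken from Intel’s published register map as an assumption, so check them against the datasheet for your controller. The two helper functions are hypothetical and simply show how a 64-bit physical address would be split across TDBAL and TDBAH.

```c
#include <stdint.h>

/* Register offsets (in bytes) from the base of the NIC's i/o-memory.
   Only TCTL and TIPG are confirmed by these slides. */
enum {
    E1000_CTRL     = 0x0000,   /* Device Control                      */
    E1000_CTRL_EXT = 0x0018,   /* Extended Device Control             */
    E1000_TCTL     = 0x0400,   /* Transmit Control                    */
    E1000_TIPG     = 0x0410,   /* Transmit Inter-Packet Gap           */
    E1000_TDBAL    = 0x3800,   /* Tx Descriptor Base-Address (LOW)    */
    E1000_TDBAH    = 0x3804,   /* Tx Descriptor Base-Address (HIGH)   */
    E1000_TDLEN    = 0x3808,   /* Tx Descriptor-queue Length          */
    E1000_TDH      = 0x3810,   /* Tx Descriptor-queue HEAD            */
    E1000_TDT      = 0x3818,   /* Tx Descriptor-queue TAIL            */
    E1000_TXDCTL   = 0x3828    /* Tx Descriptor-queue Control         */
};

/* Low 32 bits of the descriptor array's physical address (for TDBAL). */
uint32_t tdbal_value(uint64_t phys_addr)
{
    return (uint32_t)(phys_addr & 0xFFFFFFFFu);
}

/* High 32 bits of the descriptor array's physical address (for TDBAH). */
uint32_t tdbah_value(uint64_t phys_addr)
{
    return (uint32_t)(phys_addr >> 32);
}
```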

Page 11: Our ‘xmit1000.c’ driver

The ‘initialization’ sequence

• Detect the network interface controller

• Obtain its i/o-memory address and size

• Remap the i/o-memory into kernel-space

• Allocate memory for buffer and descriptors

• Initialize the array of transmit-descriptors

• Reset the NIC and configure its operations

• Create the ‘/proc/xmit1000’ pseudo-file

• Register our ‘write()’ driver-method

Page 12: Our ‘xmit1000.c’ driver

The ‘cleanup’ sequence

• Usually the steps here follow those in the initialization sequence -- but in backwards order:

• Unregister the device-driver’s file-operations

• Delete the ‘/proc/xmit1000’ pseudo-file

• Disable the NIC’s ‘transmit’ engine

• Release the allocated kernel-memory

• Unmap the NIC’s i/o-memory region
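The ‘backwards order’ rule also governs what to undo when initialization fails partway through. A common C idiom for this is a chain of goto labels that unwinds only the steps already completed. The sketch below is a userspace simulation with hypothetical names (step_log, try_step(), sim_init(), sim_cleanup()); each ‘step’ merely records a letter, so the order of setup (upper-case) and teardown (lower-case) can be inspected. It models only three of the steps and is not taken from the actual driver.

```c
#include <stddef.h>
#include <string.h>

char step_log[32];               /* records the order in which steps ran */

void log_step(char tag)
{
    size_t n = strlen(step_log);
    if (n + 1 < sizeof step_log) {
        step_log[n] = tag;
        step_log[n + 1] = '\0';
    }
}

/* Pretend to perform one setup step; it fails when step_no == fail_at. */
int try_step(char tag, int step_no, int fail_at)
{
    if (step_no == fail_at)
        return -1;
    log_step(tag);
    return 0;
}

/* Simulated module_init(): pass fail_at = 0 for a run with no failure. */
int sim_init(int fail_at)
{
    step_log[0] = '\0';
    if (try_step('M', 1, fail_at) < 0) goto fail;        /* remap i/o-memory   */
    if (try_step('A', 2, fail_at) < 0) goto undo_map;    /* allocate memory    */
    if (try_step('P', 3, fail_at) < 0) goto undo_alloc;  /* create pseudo-file */
    return 0;

undo_alloc:
    log_step('a');               /* release the memory   */
undo_map:
    log_step('m');               /* unmap the i/o-memory */
fail:
    return -1;
}

/* Simulated module_exit(): undo every step, last one first. */
void sim_cleanup(void)
{
    log_step('p');
    log_step('a');
    log_step('m');
}
```

A successful sim_init(0) followed by sim_cleanup() leaves the log reading MAPpam, the mirror-image ordering described on this slide.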

Page 13: Our ‘xmit1000.c’ driver

Our ‘write()’ algorithm

• Get index of the current TAIL descriptor

• Confine the amount of user-data

• Copy user-data into the packet-buffer

• Setup the packet’s Ethernet Header

• Setup packet-length in the TAIL descriptor

• Now hand over this descriptor to the NIC (by advancing the value in register TDT)

• Tell the kernel how many bytes were sent
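The seven steps above can be traced in a userspace simulation. Everything here is an assumption made for illustration: the SimNic type stands in for the real hardware and descriptor array, memcpy() stands in for the kernel’s copy-from-user step, the header occupies the first 14 bytes of the buffer, and the type field 0x88B5 (an IEEE ‘local experimental’ EtherType) is simply a placeholder. The 1536-byte buffer and the 8 descriptors are the sizes shown on the Tx-Descriptor ring slide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN  14          /* 6 dst + 6 src + 2 type/length bytes */
#define PKT_BUF_LEN  1536        /* reusable transmit-buffer size       */
#define N_TX_DESC    8           /* descriptors in the ring             */

typedef struct {
    uint8_t      pkt_buf[PKT_BUF_LEN];
    uint16_t     desc_len[N_TX_DESC];  /* stand-in for each descriptor's length field */
    unsigned int tdt;                  /* simulated TDT register                      */
} SimNic;

/* Simulated write(): returns the number of data bytes accepted. */
long sim_write(SimNic *nic, const uint8_t *dst_mac, const uint8_t *src_mac,
               const void *data, size_t len)
{
    unsigned int tail = nic->tdt;                        /* 1: current TAIL index  */

    if (len > PKT_BUF_LEN - ETH_HDR_LEN)                 /* 2: confine the amount  */
        len = PKT_BUF_LEN - ETH_HDR_LEN;

    memcpy(nic->pkt_buf + ETH_HDR_LEN, data, len);       /* 3: copy the user-data  */

    memcpy(nic->pkt_buf, dst_mac, 6);                    /* 4: Ethernet header     */
    memcpy(nic->pkt_buf + 6, src_mac, 6);
    nic->pkt_buf[12] = 0x88;                             /*    placeholder type    */
    nic->pkt_buf[13] = 0xB5;

    nic->desc_len[tail] = (uint16_t)(ETH_HDR_LEN + len); /* 5: packet-length       */
    nic->tdt = (tail + 1) % N_TX_DESC;                   /* 6: advance TDT         */
    return (long)len;                                    /* 7: bytes 'sent'        */
}
```

Returning the confined length rather than the requested length lets the caller see when a message was truncated and retry with the remainder.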

Page 14: Our ‘xmit1000.c’ driver

Recall Tx-Descriptor Layout

Offset   Contents (listed from bit 31 down to bit 0)
0x0      Buffer-Address low (bits 31..0)
0x4      Buffer-Address high (bits 63..32)
0x8      CMD | CSO | Packet Length (in bytes)
0xC      special | CSS | reserved=0 | status

Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet-Length = number of bytes in the data-packet to be transmitted
CMD = Command-field
CSO/CSS = Checksum Offset/Start (in bytes)
STA = Status-field

Page 15: Our ‘xmit1000.c’ driver

Suggested C syntax

typedef struct {
	unsigned long long	base_addr;
	unsigned short		pkt_length;
	unsigned char		cksum_off;
	unsigned char		desc_cmd;
	unsigned char		desc_stat;
	unsigned char		cksum_org;
	unsigned short		special;
} TX_DESCRIPTOR;
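Because the NIC reads this structure directly from memory, its size and field offsets must match the hardware layout exactly. One way to let the compiler confirm that is sketched below, using fixed-width types and C11 static assertions. The type name TX_DESCRIPTOR_FIXED is hypothetical (chosen so it does not clash with the suggested typedef), and the expected offsets are those implied by the 16-byte layout recalled on the previous slide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof            */
#include <stdint.h>   /* fixed-width types   */

typedef struct {
    uint64_t base_addr;    /* 0x0..0x7: packet-buffer's physical address */
    uint16_t pkt_length;   /* 0x8..0x9: packet length in bytes           */
    uint8_t  cksum_off;    /* 0xA:      checksum offset (CSO)            */
    uint8_t  desc_cmd;     /* 0xB:      command-field (CMD)              */
    uint8_t  desc_stat;    /* 0xC:      status-field (STA)               */
    uint8_t  cksum_org;    /* 0xD:      checksum start (CSS)             */
    uint16_t special;      /* 0xE..0xF: special field                    */
} TX_DESCRIPTOR_FIXED;

/* Compilation fails if the compiler's layout differs from the expected one. */
static_assert(sizeof(TX_DESCRIPTOR_FIXED) == 16, "descriptor must be 16 bytes");
static_assert(offsetof(TX_DESCRIPTOR_FIXED, pkt_length) == 8,  "length at 0x8");
static_assert(offsetof(TX_DESCRIPTOR_FIXED, desc_cmd)   == 11, "CMD at 0xB");
static_assert(offsetof(TX_DESCRIPTOR_FIXED, desc_stat)  == 12, "STA at 0xC");
static_assert(offsetof(TX_DESCRIPTOR_FIXED, special)    == 14, "special at 0xE");
```

Every field here falls on its natural alignment, so no padding or packing attribute should be needed on common compilers; the assertions are there to catch the exceptions.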

Page 16: Our ‘xmit1000.c’ driver

Transmit IPG (0x0410)

82573L TIPG register layout (bits 31..0):

bits 31..30   R=0
bits 29..20   IPG After Deferral (Recommended value = 7)
bits 19..10   IPG Part 1 (Recommended value = 8)
bits 9..0     IPG Back-To-Back (Recommended value = 8)

IPG = Inter-Packet Gap

This register controls the Inter-Packet Gap timer for the Ethernet controller.

Note that the recommended TIPG register-value to achieve IEEE 802.3 compliant minimum transfer IPG values in full- and half-duplex operations would be 00702008 (hexadecimal), equal to (7<<20) | (8<<10) | (8<<0).
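The recommended value can be checked by assembling it from its three fields. In this sketch the function name tipg_value() is hypothetical, and the 10-bit masks are an assumption based on the shift positions (0, 10 and 20) given in the note above.

```c
#include <stdint.h>

/* Assemble a TIPG register value from its three fields.
   Each field is assumed to be 10 bits wide. */
uint32_t tipg_value(uint32_t back_to_back, uint32_t part1, uint32_t after_deferral)
{
    return ((after_deferral & 0x3FFu) << 20)   /* IPG After Deferral */
         | ((part1          & 0x3FFu) << 10)   /* IPG Part 1         */
         | ((back_to_back   & 0x3FFu) <<  0);  /* IPG Back-To-Back   */
}
```

Calling tipg_value( 8, 8, 7 ) yields 0x00702008, the value quoted above.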

Page 17: Our ‘xmit1000.c’ driver

Transmit Control (0x0400)

82573L TCTL register layout (bits 31..0):

bits 31..29   R=0
bit 28        MULR
bits 27..26   TXCSCMT
bit 25        UNORTX
bit 24        RTLC
bit 23        R=0
bit 22        SWXOFF
bits 21..12   COLD (COLLISION DISTANCE)
bits 11..4    CT (COLLISION THRESHOLD)
bit 3         PSP
bit 2         R=0
bit 1         EN
bit 0         R=0

EN = Transmit Enable
PSP = Pad Short Packets
CT = Collision Threshold (=0xF)
COLD = Collision Distance (=0x3F)
SWXOFF = Software XOFF Transmission
RTLC = Retransmit on Late Collision
UNORTX = Underrun No Re-Transmit
TXCSCMT = TxDescriptor Minimum Threshold
MULR = Multiple Request Support

Page 18: Our ‘xmit1000.c’ driver

Our driver’s elections

int tx_control = 0;

tx_control |= (0<<1);	// EN-bit (Enable Transmit Engine)
tx_control |= (1<<3);	// PSP-bit (Pad Short Packets)
tx_control |= (15<<4);	// CT=15 (Collision Threshold)
tx_control |= (63<<12);	// COLD=63 (Collision Distance)
tx_control |= (0<<22);	// SWXOFF-bit (Software XOFF Tx)
tx_control |= (1<<24);	// RTLC-bit (Re-Transmit on Late Collision)
tx_control |= (0<<25);	// UNORTX-bit (Underrun No Re-Transmit)
tx_control |= (0<<26);	// TXCSMT=0 (Tx-descriptor Min Threshold)
tx_control |= (0<<28);	// MULR-bit (Multiple Request Support)

iowrite32( tx_control, io + E1000_TCTL ); // Transmit Control register

82573L

Here’s a C programming style that ‘documents’ the programmer’s choices.
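It can help to see the single number these elections add up to. The sketch below repeats the same choices in a standalone function and adds a second helper that sets the EN-bit; both function names are hypothetical. Note that the slide’s code leaves EN at zero, presumably so that the transmit engine can be switched on in a later step once the descriptor registers have been programmed; that reading is an assumption, not something the slide states.

```c
#include <stdint.h>

/* Build the TCTL value from the same elections shown on the slide. */
uint32_t tctl_config(void)
{
    uint32_t tx_control = 0;

    tx_control |= (0u  << 1);    /* EN-bit left clear for now        */
    tx_control |= (1u  << 3);    /* PSP-bit (Pad Short Packets)      */
    tx_control |= (15u << 4);    /* CT=15 (Collision Threshold)      */
    tx_control |= (63u << 12);   /* COLD=63 (Collision Distance)     */
    tx_control |= (0u  << 22);   /* SWXOFF-bit                       */
    tx_control |= (1u  << 24);   /* RTLC-bit                         */
    tx_control |= (0u  << 25);   /* UNORTX-bit                       */
    tx_control |= (0u  << 26);   /* Tx-descriptor Min Threshold = 0  */
    tx_control |= (0u  << 28);   /* MULR-bit                         */
    return tx_control;
}

/* Same value with the EN-bit (bit 1) turned on. */
uint32_t tctl_enabled(uint32_t tx_control)
{
    return tx_control | (1u << 1);
}
```

With these elections tctl_config() evaluates to 0x0103F0F8, and setting the EN-bit gives 0x0103F0FA.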

Page 19: Our ‘xmit1000.c’ driver

An ‘e1000.c’ anomaly?

• The official Linux kernel is delivered with a device-driver (‘e1000.ko’) that supports several of Intel’s ‘Pro/1000’ gigabit Ethernet controllers

• Often this driver will get loaded by default during the system’s startup procedures

• But it will interfere with your own driver if you try to write a substitute for ‘e1000.ko’

• So you will want to remove it with ‘rmmod’

Page 20: Our ‘xmit1000.c’ driver

Side-effect of ‘rmmod’

• We’ve observed an unexpected side-effect of ‘unloading’ the ‘e1000.ko’ device-driver

• The PCI Configuration Space’s command register gets modified in a way that keeps the NIC from working with your own driver

• Specifically, the Bus Mastering capability gets disabled (by clearing bit #2 in the PCI Configuration Space’s word at address 4)

Page 21: Our ‘xmit1000.c’ driver

What to do about it?

• This effect doesn’t arise on our ‘anchor’ cluster machines, but you may encounter it when you try using our demo elsewhere

• Here’s the simple “fix” to turn Bus Master capability back on (in your ‘module_init()’)

u16 pci_cmd;	// declares a 16-bit variable

pci_read_config_word( devp, 4, &pci_cmd );	// read current word
pci_cmd |= (1<<2);	// turn on the Bus Master enabled-bit
pci_write_config_word( devp, 4, pci_cmd );	// write modification
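The read-modify-write above only touches bit #2 and leaves the other command bits as they were. That bit manipulation can be tried in isolation with the userspace sketch below, in which the macro and function names are hypothetical. Inside a kernel module, the kernel’s pci_set_master() helper is the usual way to request the same thing.

```c
#include <stdint.h>

#define PCI_CMD_OFFSET      4           /* command word's address in config space */
#define PCI_CMD_BUS_MASTER  (1u << 2)   /* Bus Master enable-bit                  */

/* Return the command word with the Bus Master bit turned on;
   all other bits are preserved. */
uint16_t with_bus_master(uint16_t pci_cmd)
{
    return (uint16_t)(pci_cmd | PCI_CMD_BUS_MASTER);
}

/* Returns 1 if the Bus Master bit is set in the command word. */
int bus_master_enabled(uint16_t pci_cmd)
{
    return (pci_cmd & PCI_CMD_BUS_MASTER) != 0;
}
```

Checking bus_master_enabled() on the word read back from the device is a quick way to confirm whether the ‘rmmod’ side-effect is present on a given machine.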

Page 22: Our ‘xmit1000.c’ driver

In-class demo

• We demonstrate our ‘xmit1000.c’ driver on an ‘anchor’ machine, with some help from a companion-module (named ‘recv1000.c’) which is soon-to-be discussed in class

Transmitting…

$ echo Hello > /dev/nic
$ _

Receiving…

$ cat /dev/nic
Hello
_

(The two stations, anchor01 and anchor05, are connected via the LAN)

Page 23: Our ‘xmit1000.c’ driver

In-class exercise

• Open three or more terminal-windows on your PC’s graphical desktop, and log in to a different ‘anchor’ machine in each one

• Install the ‘xmit1000.ko’ module on one of the anchor machines, and then install our ‘recv1000.ko’ module on the other stations

• Execute the ‘cat /dev/nic’ command on the receiver-stations, and then run an ‘echo’ command on the transmitter-station