
Post on 21-Dec-2015


Chapter 6
The Transport Layer

© All rights reserved. No part of these slides may be reproduced, in any form or by any means, without permission in writing from Professor Wen-Tsuen Chen (email: [email protected]).


A transport address: a port in the Internet transport layer, a socket in Berkeley UNIX.


Internet Transport Protocols

User Datagram Protocol (UDP)
  Provides an unreliable connectionless delivery service.
  Uses IP to transport messages between machines, but adds the ability to distinguish among multiple destinations within a given host computer.
  RFC 768.

Transmission Control Protocol (TCP)
  Provides a reliable stream delivery service.
  Virtual circuit connection.
  Unstructured byte stream.
  Full duplex connection.
  RFC 793, RFC 1122, RFC 1323.


UDP Header

UDP accepts incoming datagrams from IP and demultiplexes them based on the UDP destination port.

UDP length contains the number of bytes in the datagram, including the UDP header and the user data.

To compute the UDP checksum, first store zero in the checksum field, then accumulate a 16-bit one's complement sum over the following pseudo-header, the UDP header, and the user data. The checksum is the one's complement of that sum.

UDP pseudo-header (32 bits per row, bits 0 to 31):
  Source IP Address
  Destination IP Address
  00000000 | Protocol = 17 | UDP length
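The checksum procedure above can be sketched in Python as follows. This is a minimal illustration for IPv4 only; the function names `ones_complement_sum16` and `udp_checksum` are hypothetical, and the substitution of 0xFFFF for a computed zero follows RFC 768 rather than anything stated on the slide.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's complement sum of data (a zero byte is appended if the length is odd)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def udp_checksum(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                 payload: bytes) -> int:
    """Checksum over pseudo-header + UDP header (checksum field zero) + user data."""
    length = 8 + len(payload)  # UDP header is 8 bytes
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))  # zero byte, protocol 17, UDP length
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = ~ones_complement_sum16(pseudo + header + payload) & 0xFFFF
    # RFC 768: a computed zero is transmitted as all ones (zero means "no checksum").
    return checksum or 0xFFFF
```

A receiver can verify a datagram by summing the same fields with the received checksum in place; an undamaged datagram sums to 0xFFFF.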


TCP: Transmission Control Protocol

TCP uses the connection, not the protocol port, as its fundamental abstraction.

A connection is identified by a pair of endpoints. An endpoint is a pair (host, port), where host is the IP address of a host and port is a TCP port on that host.

A given TCP port number on a host can be shared by multiple connections.
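As a small illustration of why one port can serve several connections, the sketch below keys a connection table by the pair of endpoints. The table layout, the `demultiplex` helper, and the example addresses are assumptions made for this example, not part of any real TCP implementation.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]                 # (host IP address, TCP port)
ConnectionId = Tuple[Endpoint, Endpoint]   # (local endpoint, remote endpoint)


def demultiplex(table: Dict[ConnectionId, str],
                local: Endpoint, remote: Endpoint) -> Optional[str]:
    """Find the connection for a segment by its pair of endpoints (hypothetical helper)."""
    return table.get((local, remote))


# Two connections share local port 80 because their remote endpoints differ.
server: Endpoint = ("192.0.2.10", 80)
table: Dict[ConnectionId, str] = {
    (server, ("198.51.100.7", 40001)): "connection A",
    (server, ("203.0.113.9", 40001)): "connection B",
}
```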


TCP views the data stream as a sequence of bytes, not a message stream, and divides it into segments for transmission, each carried in a single IP datagram.

The basic TCP protocol is the sliding window protocol.

TCP assumes little about the underlying communication system. TCP can be implemented to use the IP datagram delivery service, dialup telephone lines, a LAN, or a WAN.


TCP Segment Format

Source port and destination port contain the TCP port numbers that identify the application programs at the ends of the connection.

Sequence number identifies the position in the sender's byte stream of the data in the segment.

Acknowledgement number specifies the next byte expected.

Window size tells how many bytes may be sent, starting at the byte acknowledged.


TCP header length contains the length of the segment header in 32-bit words.

URG is set if the Urgent pointer is in use. The Urgent pointer indicates a byte offset from the current sequence number at which urgent data is located. Urgent data should be processed as quickly as possible, regardless of its position in the stream.

ACK is set if the Acknowledgement number is valid.

PSH indicates a push segment. The segment is delivered to the application program upon arrival and is not buffered.

RST is used to reset a connection.

SYN=1 and ACK=0 indicate a connection request; SYN=1 and ACK=1 indicate a connection reply.

FIN is used to indicate the end of the byte stream and to release a connection.
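The fields and flags described above can be read out of the fixed 20-byte part of a segment header roughly as follows. This is a sketch assuming the standard RFC 793 layout; `parse_tcp_header`, `FLAG_BITS`, and the returned dictionary keys are names invented for this example, and options are ignored.

```python
import struct
from typing import Dict

# Flag bit positions in the 16-bit word that also carries the header length.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> Dict[str, object]:
    """Decode the fixed part of a TCP header (hypothetical helper, options ignored)."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    flags = {name: bool(off_flags & bit) for name, bit in FLAG_BITS.items()}
    if flags["SYN"] and not flags["ACK"]:
        kind = "connection request"
    elif flags["SYN"] and flags["ACK"]:
        kind = "connection reply"
    else:
        kind = "other"
    return {
        "source_port": src, "destination_port": dst,
        "sequence_number": seq, "acknowledgement_number": ack,
        "header_length_bytes": (off_flags >> 12) * 4,  # field counts 32-bit words
        "flags": flags, "window_size": window,
        "checksum": checksum, "urgent_pointer": urgent, "kind": kind,
    }
```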


The algorithm to compute the TCP checksum is the same as that for the UDP checksum, with the following pseudo-header.

[Figure: TCP pseudo-header, including the source and destination IP addresses]

Options provide a way to add extra facilities not covered by the regular header.
  Window scale option for negotiating a window scale factor, allowing windows of up to $2^{30}$ bytes (RFC 1323 limits the shift count to 14).
  An option proposed in RFC 1106 allows the use of selective repeat instead of the go-back-n protocol.


TCP Connection Establishment

The initial sequence number should be chosen carefully to prevent the delayed duplicate problem.

A three-way handshake is used. The two sides need not start with the same sequence number.
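The three segments of the handshake can be written out for arbitrary initial sequence numbers x and y. The sketch below is illustrative only: the `Segment` tuple and `three_way_handshake` function are hypothetical names, and the sequence numbers are passed in rather than chosen by any real selection scheme.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    sender: str
    syn: bool
    ack_flag: bool
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Segments exchanged when Host1 (initial seq x) connects to Host2 (initial seq y)."""
    mod = 2 ** 32  # sequence numbers are 32 bits and wrap around
    return [
        Segment("Host1", True, False, x, None),                       # connection request
        Segment("Host2", True, True, y, (x + 1) % mod),               # connection reply
        Segment("Host1", False, True, (x + 1) % mod, (y + 1) % mod),  # final ACK
    ]
```

Each acknowledgement number is the next byte expected, which is why the SYN, although it carries no data, advances the number by one.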


TCP Connection Release

Four TCP segments are needed to release a connection: one FIN and one ACK for each direction.

However, the first ACK and the second FIN may be combined, reducing the total to three.

[Figure: connection release between Host1 and Host2 over time]
  Host1 -> Host2: FIN(SEQ=X)
  Host2 -> Host1: ACK X+1
  Host2 -> Host1: FIN(SEQ=Y, ACK=X+1)
  Host1 -> Host2: ACK Y+1


TCP Finite State Machine


Berkeley UNIX Socket Primitives


TCP Transmission Policy

Uses a variable sliding window protocol.
  The "receiver's window" is sent back to advertise the number of bytes the receiver is prepared to accept.


Design Issues

How long should the retransmission timer be to accommodate varying delays in the Internet?
How to respond to congestion?
Silly Window Syndrome.


TCP Timer Management

How long should the TCP retransmission timer interval be?


TCP Congestion Control

Congestion is controlled by dynamically adjusting the window size. The number of bytes that may be sent is min{congestion window, receiver's window}.

The slow start algorithm adjusts the congestion window (CW).
  Initially, the congestion window is the size of one maximum segment.
  The window size is doubled each round trip until either a timeout occurs or the receiver's window is reached (i.e., the window size is increased by 1 segment each time an acknowledgement arrives).

Thresholding
  The threshold is initially 64K.
  When a timeout occurs, the threshold is set to half of the current congestion window and the CW is reset to one maximum segment. Then the slow start algorithm is used.
  When the threshold is hit, the CW grows linearly (the CW is increased by 1 segment only if all segments in the window have been acknowledged).
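The window growth described above can be traced with a coarse simulation that advances one round trip at a time. This is a sketch under simplifying assumptions: `simulate_cwnd` is a hypothetical function, all sizes are in the same unit (segments in the example), and timeouts are injected at chosen rounds rather than detected.

```python
from typing import Iterable, List


def simulate_cwnd(mss: int, threshold: int, receiver_window: int,
                  rounds: int, timeout_rounds: Iterable[int] = ()) -> List[int]:
    """Return the amount sendable in each round trip: min(CW, receiver's window)."""
    timeouts = set(timeout_rounds)
    cwnd = mss                     # slow start begins at one maximum segment
    history = []
    for rnd in range(rounds):
        history.append(min(cwnd, receiver_window))
        if rnd in timeouts:
            threshold = max(cwnd // 2, mss)  # half of the current congestion window
            cwnd = mss                       # restart slow start
        elif cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)  # slow start: double per round trip
        else:
            cwnd += mss                      # congestion avoidance: linear growth
    return history
```

For example, `simulate_cwnd(1, 8, 100, 8, timeout_rounds=[5])` gives `[1, 2, 4, 8, 9, 10, 1, 2]`: exponential growth up to the threshold of 8, linear growth after it, and a restart from one segment after the timeout in round 5.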


[Figure: congestion window growth, showing the slow start and congestion avoidance phases]


Various Implementations of TCP

Tahoe TCP
Reno TCP
New-Reno TCP
Selective ACK TCP


Tahoe TCP

Early TCP implementations use cumulative positive ACKs and require a retransmission timer expiration to resend packets lost during transport.
  These TCPs do little to minimize network congestion.

Tahoe TCP adds a number of new algorithms and refinements to earlier implementations.
  Slow-Start, Congestion Avoidance, and Fast Retransmit.
  Fast Retransmit: after receiving a small number of duplicate ACKs for the same TCP connection, the data sender infers that a packet has been lost and retransmits the packet without waiting for a retransmission timer to expire.
  Higher network utilization and throughput.
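Fast Retransmit amounts to counting duplicate ACKs. The sketch below shows one way to do that; the class name `FastRetransmitDetector` is hypothetical, and the default threshold of three duplicates is an assumption taken from common practice, since the slide only says "a small number".

```python
from typing import Optional


class FastRetransmitDetector:
    """Counts duplicate ACKs and signals when a retransmission should be sent."""

    def __init__(self, dup_threshold: int = 3) -> None:  # threshold is an assumption
        self.dup_threshold = dup_threshold
        self.last_ack: Optional[int] = None
        self.dup_count = 0

    def on_ack(self, ack_no: int) -> Optional[int]:
        """Return the sequence number to retransmit, or None if no action is needed."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.dup_threshold:
                # The ACK number is the next byte expected, i.e. the start of the lost data.
                return ack_no
            return None
        self.last_ack = ack_no  # ACK for new data: reset the duplicate count
        self.dup_count = 0
        return None
```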


Reno TCP

Reno TCP retains the enhancements incorporated into Tahoe, but modifies the Fast Retransmit procedure to include Fast Recovery.
  Fast Recovery avoids the need to Slow-Start after a single packet loss.
  The sender enters the Fast Recovery phase after receiving three duplicate ACKs; it retransmits the lost packet and reduces its congestion window by one half.
  Upon receipt of an ACK for new data, the sender exits the Fast Recovery phase.


New-Reno TCP

New-Reno TCP includes a small change to Reno at the sender when a partial ACK is received during the Fast Recovery phase.
  Partial ACK: acknowledges some but not all of the packets that were outstanding at the start of Fast Recovery.
  In Reno: a partial ACK takes TCP out of the Fast Recovery phase by deflating the size of the congestion window.
  In New-Reno: a partial ACK does not take TCP out of the Fast Recovery phase.
  A partial ACK indicates that the packet immediately following the acknowledged packet has been lost and should be retransmitted.
  Thus, New-Reno can recover multiple lost packets from a single window of data without a retransmission timeout.
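The difference between Reno and New-Reno on a partial ACK can be captured in a few lines. This is a simplified sketch: `on_ack_in_fast_recovery` and `recovery_point` (the next sequence number to be sent when Fast Recovery began) are hypothetical names, the ACK is assumed to be for new data, and sequence number wraparound and window adjustments are left out.

```python
from typing import Optional, Tuple


def on_ack_in_fast_recovery(ack_no: int, recovery_point: int,
                            new_reno: bool) -> Tuple[bool, Optional[int]]:
    """Return (still_in_fast_recovery, sequence_number_to_retransmit)."""
    if ack_no >= recovery_point:
        # Full ACK: everything outstanding at the start of Fast Recovery is acknowledged.
        return (False, None)
    if new_reno:
        # Partial ACK: stay in Fast Recovery and resend the packet right after the
        # acknowledged data, which starts at the ACK number.
        return (True, ack_no)
    # Reno: a partial ACK ends Fast Recovery; further losses wait for other mechanisms.
    return (False, None)
```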


Selective ACK TCP

RFC 2018
  Each ACK contains information about up to three noncontiguous blocks of data that have been received by the receiver.
  Each block of data is described by its starting and ending sequence numbers.
  The sender can then retransmit only the missing data packets.
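Given the cumulative ACK and the reported blocks, the sender can work out which ranges are missing. The sketch below assumes each block is a half-open range (start, end), where end is the sequence number just past the last received byte; `missing_ranges` is a hypothetical helper and sequence number wraparound is ignored.

```python
from typing import Iterable, List, Tuple

Block = Tuple[int, int]  # (start, end), end exclusive: an assumption of this sketch


def missing_ranges(cumulative_ack: int, sack_blocks: Iterable[Block]) -> List[Block]:
    """Return the holes between the cumulative ACK and the highest reported block."""
    holes: List[Block] = []
    next_needed = cumulative_ack  # first byte not yet known to be received
    for start, end in sorted(sack_blocks):
        if start > next_needed:
            holes.append((next_needed, start))
        next_needed = max(next_needed, end)
    return holes
```

For example, with a cumulative ACK of 1000 and blocks (1500, 2000) and (3000, 4000), only the ranges (1000, 1500) and (2000, 3000) need to be retransmitted.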


Wireless TCP and UDP

In wired networks, timeouts are mostly caused by congestion, not by packets lost to transmission errors. => The sender should slow down.

In wireless networks, transmission links are highly unreliable. => The sender should try to send harder.

A wireless link is often characterized by a high bit-error rate and intermittent connectivity due to handoff.
  The resulting losses will be mistaken for congestion.
  This results in significant throughput degradation and high interactive delay.

The path from the sender to the receiver is inhomogeneous.


Silly Window Syndrome

Receiver-side:

To avoid receiver-side Silly Window Syndrome, before sending an updated window advertisement after advertising a zero window, wait for space to become available that is either at least 50% of the total buffer size or equal to a maximum-sized segment.
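The receiver-side rule reduces to a single comparison when deciding what window to advertise. In this sketch the function name `window_to_advertise` and its parameters are hypothetical, and all quantities are in bytes.

```python
def window_to_advertise(free_space: int, buffer_size: int, mss: int) -> int:
    """Advertise free space only once it is worth the sender's while.

    Keeps advertising zero until the free space reaches half of the total
    buffer or one maximum-sized segment, whichever comes first.
    """
    if 2 * free_space >= buffer_size or free_space >= mss:
        return free_space
    return 0
```

With an 8192-byte buffer and a 1460-byte maximum segment, 100 free bytes still yields an advertisement of 0, while 1460 free bytes is advertised in full.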


Sender-side:

Silly Window Syndrome occurs when a TCP implementation aggressively sends data whenever it is available (e.g., a TELNET connection to an interactive editor that reacts to every keystroke).

To avoid sender-side Silly Window Syndrome, use Nagle's algorithm: when a sending application generates additional data to be sent over a connection for which previous data has been transmitted but not acknowledged, place the new data in the output buffer and do not send additional segments until there is sufficient data to fill a maximum-sized segment. When an acknowledgement arrives, send all data accumulated in the buffer.
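Nagle's algorithm as stated above can be modelled with a small sender object. This is a toy sketch, not a real TCP stack: the class `NagleSender` is hypothetical, "sending" just appends to a list, and a single ACK is assumed to acknowledge everything outstanding.

```python
from typing import List


class NagleSender:
    """Toy model of Nagle's algorithm; sent segments are collected in self.sent."""

    def __init__(self, mss: int) -> None:
        self.mss = mss
        self.buffer = bytearray()    # output buffer for data not yet sent
        self.unacked = False         # True while sent data awaits acknowledgement
        self.sent: List[bytes] = []

    def _send(self, n: int) -> None:
        self.sent.append(bytes(self.buffer[:n]))
        del self.buffer[:n]
        self.unacked = True

    def _send_all(self) -> None:
        while self.buffer:
            self._send(min(self.mss, len(self.buffer)))

    def write(self, data: bytes) -> None:
        """Called when the application generates data."""
        self.buffer += data
        if not self.unacked:
            self._send_all()               # nothing outstanding: send at once
        else:
            while len(self.buffer) >= self.mss:
                self._send(self.mss)       # enough to fill a maximum-sized segment

    def on_ack(self) -> None:
        """Called when an acknowledgement arrives: flush the accumulated data."""
        self.unacked = False
        self._send_all()
```

Typing one character at a time into this sender produces one small segment for the first keystroke and then one segment per acknowledgement, instead of one segment per keystroke.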


A dynamic algorithm adjusts the timeout interval (Jacobson, "Congestion Avoidance and Control", Proc. of SIGCOMM '88).

For each successful transmission,
  $RTT_{new} = a \cdot RTT_{old} + (1-a) M$
where $M$ is the new measurement of the round-trip time (RTT); typically $a = 7/8$.

Deviation of the RTT:
  $D_{new} = b \cdot D_{old} + (1-b) |RTT_{old} - M|$, typically $b = 3/4$.

  $Timeout = RTT_{new} + c \cdot D_{new}$, typically $c = 3$.

To estimate the RTT, use Karn's algorithm with a timer backoff strategy: on an unsuccessful transmission, do not update the RTT; instead, multiply the Timeout by $r$ on each failure until the segment gets through. Typically $r = 2$.
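The update rules above translate directly into a small estimator. In this sketch the class `RttEstimator` and its method names are hypothetical, the starting deviation of zero and the starting timeout equal to the initial RTT are assumptions, and the defaults for a, b, c, and r are the typical values quoted above.

```python
class RttEstimator:
    """Smoothed RTT, deviation, and timeout, with Karn-style backoff on failure."""

    def __init__(self, initial_rtt: float, a: float = 7 / 8, b: float = 3 / 4,
                 c: float = 3.0, r: float = 2.0) -> None:
        self.a, self.b, self.c, self.r = a, b, c, r
        self.rtt = initial_rtt
        self.dev = 0.0               # assumed starting deviation
        self.timeout = initial_rtt   # assumed starting timeout

    def on_measurement(self, m: float) -> None:
        """Successful transmission with measured round-trip time m."""
        # The deviation uses RTT_old, so it is updated before the RTT itself.
        self.dev = self.b * self.dev + (1 - self.b) * abs(self.rtt - m)
        self.rtt = self.a * self.rtt + (1 - self.a) * m
        self.timeout = self.rtt + self.c * self.dev

    def on_retransmission(self) -> None:
        """Unsuccessful transmission: leave the RTT alone and back off the timer."""
        self.timeout *= self.r
```

Starting from an RTT of 100 ms, a measurement of 180 ms gives a deviation of 20 ms, a new RTT of 110 ms, and a timeout of 170 ms; a subsequent failure doubles the timeout to 340 ms without touching the RTT.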


Protocols for Gigabit Networks

Problems in current protocols:
  sequence number
  protocol CPU processing time
  go-back-n protocol
  bandwidth limited vs. delay limited
  new applications, real-time, ...

High-speed transport protocols: XTP, ...
  reduce header processing
  increase the sequence number space
  rate-based instead of credit-based
  reserve resources at connection setup time
  connection-oriented operation
  ...


ATM Adaptation Layer


AAL1, 2, 3, and 4 are for classes A, B, C, and D, respectively.
AAL3 and AAL4 are combined as AAL3/4.
AAL5, originally called SEAL (Simple Efficient Adaptation Layer), was adopted by the ATM Forum to replace AAL3/4.


The message is transmitted by passing it to the SAR sublayer, which does not add any headers or trailers. It breaks the message into 48-byte units and passes them to the ATM layer for transmission.
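The segmentation step can be sketched as a plain split into 48-byte units. The function `sar_segment` is a hypothetical name, and the sketch assumes the message handed to the SAR sublayer has already been brought to a multiple of 48 bytes by the layer above, a step this slide does not describe.

```python
from typing import List

CELL_PAYLOAD = 48  # bytes carried in each ATM cell payload


def sar_segment(message: bytes) -> List[bytes]:
    """Split a message into 48-byte units without adding headers or trailers."""
    if len(message) % CELL_PAYLOAD:
        # Assumption of this sketch: padding is the job of the layer above SAR.
        raise ValueError("message length must be a multiple of 48 bytes")
    return [message[i:i + CELL_PAYLOAD]
            for i in range(0, len(message), CELL_PAYLOAD)]
```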

The Internet is expected to transport IP packets over ATM networks in the AAL5 payload field (RFC 1483 and RFC 1577).
