© d.zinchin [[email protected]] introduction to network programming in unix & linux © daniel...

58
© D.Zinchin [[email protected]] Introduction to Network Programming in UNIX & LINUX © Daniel Zinchin Tel-Ran Ltd 2008- 2009

Upload: garret-colton

Post on 16-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

© D.Zinchin [[email protected]]

Introduction to Network Programming in

UNIX & LINUX

© Daniel Zinchin

Tel-Ran Ltd 2008-2009

Introduction to Network Programming in UNIX & LINUX 1-2© D.Zinchin [[email protected]]

Rabbi Avraham Yaakov 100 years ago taught his Chassids:Rabbi Avraham Yaakov 100 years ago taught his Chassids:- It is possible to learn from anything.  Everything in this - It is possible to learn from anything.  Everything in this world exists to edify us. world exists to edify us. Not only what the Lord has made, but also what people have Not only what the Lord has made, but also what people have made, makes us wise. made, makes us wise. - What, for example, do we learn from a railway? - one - What, for example, do we learn from a railway? - one Chassid asked in doubt Chassid asked in doubt - That, having been late for an instant, it is possible to miss - That, having been late for an instant, it is possible to miss entirely. entirely. - And telegraph? - And telegraph? - That each word is taken into account.- That each word is taken into account.- And phone? - And phone? - That everything told by you here, is audible there. - That everything told by you here, is audible there.

And what could we learn And what could we learn from thefrom the

INTERNET ? INTERNET ?

Instead of Instead of Preamble…Preamble…

בס"דבס"ד

Introduction to Network Programming in UNIX & LINUX 1-3© D.Zinchin [[email protected]]

Course Contents

Process

Thread

Host

Independent Network

INTERNET•Network Primes

•Inter-Process

Communication (IPC)

•Inter-Host

Communication (IHC)

•Multi-Threading and

Synchronization

Introduction to Network Programming in UNIX & LINUX 1-4© D.Zinchin [[email protected]]

Recommended Literature and References:

1. W. Richard Stevens.UNIX Network Programming, Prentice Hall, 1990.

3. W. Richard Stevens. TCP/IP Illustrated. Addison-Wesley, 1994-1996 Vol.1 The Protocols Vol.2 The Implementation Vol.3 TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols

2. W. Richard Stevens. UNIX Network Programming, 2nd Edition, Prentice Hall, 1998-99. Vol.1 Networking APIs Vol.2 Interprocess Communications

4. Douglas E. Cormer. Internetworking with TCP/IP

W. Richard Stevens personal site:

www.kohala.com

Kolman Shkolnik.Introduction to Internetworking in UNIX.

Tel-Ran 1995

With gratitude to my Tel-Ran teacher Kolman Shkolnik, whose book was used during the preparation

of materials for this course.

Introduction to Network Programming in UNIX & LINUX 1-5© D.Zinchin [[email protected]]

The History<1960 - Data transportation in the form of bits1960 - Transportation of data packets1965 - Bell Labs, General Electrics and MIT develop Multix

(Multiplexed Information and Computing Service)1969 - Ken Thompson and Dennis Ritchie in AT&T begin to develop

UNICS (UNiplexed Information and Computing Service) 1971-75 - First releases of UNIX in AT&T1976 - First UNIX network application UUCP (UNIX-to-UNIX copy program) .

File transfer and electronic mail. AT&T Version 7 UNIX (1978)1978 - Berkley Version 7 UNIX (Eric Schmidt, Berkley University)

File transfer, e-mail, remote printing1980 - DARPA (Defense Advanced Research Projects Agency) ARPANET,

TCP/IP protocols development for Berkley Unix1983 - 4.2BSD UNIX system (Berkley Software Distribution) with Socket Interface1983 - Richard Stallman announces the GNU Project, an attempt to create a free operating system 1984-86 - AT&T System V UNIX with Transport Layer Interface (TLI)

File transfer, e-mail, remote printing, remote command execution.1986 - NSFNET (National Science Foundation) – remote access to super-computer network.1988 - 4.3 BSD UNIX. Available source code of DARPA TCP/IP, Xerox NS protocols.End 80s - Microsoft Xenix, System V for Intel 8086, 80286, 80386 processors1988 - IEEE (Institute of Electrical & Electronical Engineers) defines

POSIX – standard operation system interface.1989 - Unix System V Release 4.0 (SVR4). Merge of AT&T System V with SunOS (4.xBSD derivative)

Provides ANSI compliant C compiler.1991 - Linus Torvalds (Finland) introduces Linux – freeware OS for PC.1992 - Sun introduces Solaris based on SVR4.1994 - Red Hat Linux released.2000 - Sun releases Solaris 8 having big success>2000 - Multiple releases and popularity growth of Linux derivatives

Introduction to Network Programming in UNIX & LINUX 1-6© D.Zinchin [[email protected]]

Basic Terms

Computer NetworkCommunication system for connecting end-systems. Enables to

share data, programs and resources (distributed systems).There are physical networks and logical networks.

HostSingle computer, end-system. Could be personal computer, dedicated

system (print or file server) or time-sharing system.Process

Any program which is executed by computer’s operation system.Thread

Separate part of process, providing it’s specific working flow,and sharing the process data and resources with other threads.

Inter-Process CommunicationSharing of information and resources by two or more different processes.

Inter-Host CommunicationCommunication between two or more processes, running on different

hosts in the network.

Communication ProtocolSet of rules and conventions that communication participants must

follow.

Introduction to Network Programming in UNIX & LINUX 1-7© D.Zinchin [[email protected]]

Internet

•Set of interconnected independent computer networks. •Uses common suite of protocols called TCP/IP.•Managed by groups of representatives:

•IAB -Internet Activity Board•NIC -National Information Center•FNC -Federal Network Center

Internet Services

Transport Level:•Unreliable packet delivery•Reliable stream transport

Application Level:•File Transfer (FTP, TFTP)•Electronic Mail (SMTP)•Remote Login (TELNET)•Network File System (NFS)•Remote Program Execution•Shared peripheral devices

RFC - Request For Comments

The name of the result and the process for creating a standard on the Internet.

The proposals are reviewed by the Internet Engineering Task Force.

New standards are proposed and published on the Internet, as a Request For Comments with acronym RFC <#>.

What is Internet

Introduction to Network Programming in UNIX & LINUX 1-8© D.Zinchin [[email protected]]

Layering and Protocol Family

Knowledge

Concepts

Speech

Voice

Transmitter

Knowledge

Concepts

Speech

Voice

Receiver

Sound Waves

Transfer of Knowledge by Means of Inter-Person Verbal Communication

Example of Layering in Communication System.

Layering is decomposition of task into subsystems (pieces), designed as sequence of horizontal layers.

As result, each layer:• concentrate on providing a particular function• is built in terms of one layer below• provides means to building various types of upper neighbor layer.

Protocol Family (Suite) is set of interfaces between layers or inside layer.

Peer-to-Peer Protocol is the protocol used between two entities of the same layer.

Introduction to Network Programming in UNIX & LINUX 1-9© D.Zinchin [[email protected]]

7-Layer OSI Model

Application

Presentation

Session

Transport

Network

Data Link

Physical

7

6

5

4

3

2

1

(inter-host level)

(inter-process level)

(dialog management)

(kinds of compression)

bits

frames

packets

datagrams

messages

messages

messages

(network topology)

OSI Model

Open System Interconnection Model

ISO

International Standard Organization

The 7-Layer OSI Model, developed by ISO

(1984), is the guide, providing a detailed

standard for describing of a network .

Advantage of layering is to provide well-defined

interfaces between the layers, when change

in one layer doesn’t affect an adjacent layer.

Introduction to Network Programming in UNIX & LINUX 1-10© D.Zinchin [[email protected]]

4-Layer Simplified Model of TCP/IP

TCP/IP Protocol Suite actually was developed (1980-83) before formulation of 7-layer ISO OSI Model (1984).

It implements 4-layer Simplified Communication Model:

(inter-host level)

Physical

Data Link

Network

Transport

Session

Presentation

Application7

6

5

4

3

2

1

(inter-host level)

(inter-process level)

(dialog management)

(kinds of compression)

bits

frames

packets

datagrams

messages

(network topology)

Data Link

Network

Transport

Application

4

3

2

1

(inter-process level)

(communication

end-point)

(network topology & physical connection)

(inter-process level)

Introduction to Network Programming in UNIX & LINUX 1-11© D.Zinchin [[email protected]]

Protocol Suite For 4-Layer TCP/IP Model

Application

Transport

Network

Data Link

physical connection

Application

Transport

Network

Data Link

application layer protocols

data link protocols

network layer protocols

transport layer protocols

Protocol Suite for 4-Layer TCP/IP Model contains 4 types of peer-to-peer protocols:

• Application Layer protocols – end-point communication• Transport Layer protocols – inter-process communication• Network layer protocols – inter-host communication• Data Link protocols – topology-specific interface with physical network

bits

frames

packets

datagrams

messages

Introduction to Network Programming in UNIX & LINUX 1-12© D.Zinchin [[email protected]]

TCP/IP Model and Protocol Suite

Application 0

Transport

Network

Data Link

TFTP

UDP TCP

IP ARPICMP RARP

Ethernet Token Ring

bits

frames

packets

datagrams

messages/stream

Application 1 user application

Ph

ysic

al A

dd

ress

IP A

dd

ress

+ P

ort

physical connection

Provided by kernel of OS

UNIX

TFTF

Trivial File Transfer Protocol

UDP

User Datagram Protocol

TCP

Transmission Control Protocol

IP

Internet Protocol

ICMP

Internet Control Message Protocol

ARP

Address Resolution Protocol

RARP

Reverse Address Resolution Protocol

Ethernet

A local area network architecture with broadcast bus topology

Token Ring

A local area network architecture with ring topology and token passing scheme

Introduction to Network Programming in UNIX & LINUX 1-13© D.Zinchin [[email protected]]

Network

LAN Local Area Network is a computer network comprises a local area, like a home, office, or group of buildings.

MAN Metropolitan Area Network is large computer network usually spanning a city.

WAN Wide Area Network is a computer network covering a broad geographical area. Largest and most well-known example of a WAN is the Internet.

Network Type

Technology Speed

LAN

Coaxial cable,

fiber optics,

Wi-Fi (wireless

technology)

4 Mbit/s –

2 Gbit/s

MANCoaxial cable,

microwave link

56 Kbit/s –

155 Mbit/s

WAN

Telephone lines,

microwave link,

satellite channels

9.6 Kbit/s –

45 Mbit/s

Gateway is a system, that interconnects two or more networks

Introduction to Network Programming in UNIX & LINUX 1-14© D.Zinchin [[email protected]]

Communication Activities

• Data TransmissionCommunication Networks can be divided into two basic types by method of data transmission: circuit-switched and packet-switched.

• EncapsulationEncapsulation is hiding of object data from rest of the world. For protocol suite this means adding of control information to data when going one layer down.

• Multiplexing and DemultiplexingMultiplexing means “to combine many into one”. For network this means combining of data accepted from different functionalities of neighbor layer.Demultiplexing is reverse of multiplexing.

• RoutingRouting is making decision, what route the packet should take. Static Routing is based on precomputed information.Dynamic Routing is depends on state of network configuration in the specific moment of time.

• Fragmentation and ReassemblingFragmentation (or segmentation) is breaking up of a packet into smaller pieces (MTU – maximal transmission unit)Reassembling is reverse of fragmentation, it is restoring of original packet from smaller pieces used for transmission.

Introduction to Network Programming in UNIX & LINUX 1-15© D.Zinchin [[email protected]]

Data Transmission MethodCommunication Networks can be divided into two basic types by method of data transmission: circuit-switched and packet-switched.

Circuit-Switched Data Transmission

Packet-Switched Data Transmission

hop hop hop hop hop

• Non-shared dedicated communication line is established• Information transmitted without division• Connection established once, then all data transmitted through this connection.

• Shared communication links are used instead of dedicated line• Information is divided into pieces – packets. • Each packet contains the address of destination and separately routed over shared data links.

circuit

The TCP/IP Internet uses packet-switched data transmission, provided by IP (Network) layer, responsible for forwarding of IP packets.

Introduction to Network Programming in UNIX & LINUX 1-16© D.Zinchin [[email protected]]

Data Encapsulation in TCP/IP

Encapsulation is hiding of object data from rest of the world. For protocol suite this means adding of control information to data when going one layer down.

Example. Data encapsulation during mail delivery

address

address

veryinteresting

letter

address

Data

Data

Data

DataEthernet

trailer

TFTP header

TFTP header

TFTP header

TFTP header

Data

UDP header

UDP header

IP header

IP header

Ethernetheader

Ethernet frame

IP packet

UDP datagram

TFTP message

Application data

Bytes: 22 20 8 4 400 4

UDP header

Introduction to Network Programming in UNIX & LINUX 1-17© D.Zinchin [[email protected]]

Multiplexing and Demultiplexing in TCP/IP.Example.

Process 1 Process 2 Process 3 Process 4

UDP TCP

IP

Multi-homed Multi-user Host

Ethernetinterface

Ethernetinterface

Ethernet cable 1

Ethernet cable 2

TCP/IP Suite XNS Suite

Process 5

Multiplexing means “to combine many into one”. For network this means combining of data accepted from different functionalities of neighbor layer.Demultiplexing is reverse of multiplexing.

Multiplexing

Demultiplexing

Introduction to Network Programming in UNIX & LINUX 1-18© D.Zinchin [[email protected]]

Routing

Router is “intelligent” gateway, making a decision, what route (path) the packet should take.

Data Link

IP

TCP

Application

Data Link

IP

TCP

Application

Data Link

IP(router)

Data Link

IP (router)

LAN1 LAN2 LAN3

Gateway 1 Gateway 2Host 1 Host 2

hop 1 hop 2 hop 3

Remote packetis sent to Router

Local packetis sent to Recipient

In TCP/IP routing is made on IP Layer. Each packet could have its own route.

The TCP/IP Internet uses Distributed Dynamic Routing.

Distributed Dynamic Routing

uses a mixture of global and local information to make routing decision.

Introduction to Network Programming in UNIX & LINUX 1-19© D.Zinchin [[email protected]]

Fragmentation and Reassembling

Fragmentation is breaking up of a packet into smaller pieces (transmission units).

Maximal Transmission Unit (MTU) is maximal packet size held by network layer, depending on Data Link characteristics

Reassembling is reverse of fragmentation.

Data Link

IP (router)fragmentation

Data Link

IP (router)reassembling

LAN 1 WAN connection LAN 2

Gateway 1 Gateway 2

MTU=2 Kb MTU=2 KbMTU=128 b

Packet 1 Kb

Packet 1 Kb

128b

128b

128b

128b

128b

128b

In TCP/IP fragmentation and reassembling is done at IP layer.

The IP layer performs these activities depending on requirement of specific Data Link layers,

hiding the technological differences between the networks.

. . .

Introduction to Network Programming in UNIX & LINUX 1-20© D.Zinchin [[email protected]]

Client-Server Model

Client-Server Model is standard model for network applications.

Typical Server scenario:

• Starts and connects to network

• Waits for request from potential clients

• Accepts requests and provides service

• Waits for the next potential client…

Typical Client scenario:

• Starts and connects to network

• Sends service request to known server

• Accepts the service

Client Server

Servicerequests provides

Protocol

uses uses

is described by

Introduction to Network Programming in UNIX & LINUX 1-21© D.Zinchin [[email protected]]

Modes of Communication Service

Datagram delivery service

hop-by-hop

Connectionless Service

• Provides hop-by-hop transmission of separate messages.

• Each message transmitted independently and contains all the information (address) required for delivery.

Connection-Oriented Service• Provides establishment of dedicated end-to-end

virtual circuit for data transmission.

• Connection-oriented data exchange involves three following steps:o Connection establishment (performed once, requires overhead activity)o Data transfer (can be lengthy)o Connection termination

The dedicated circuit is called virtual, because it could be provided even on network with packet-switched data transmission. A connection-oriented service is often used when more than one message is to be exchanged between the two peer entities.

Virtual circuit service

end-to-end

Communication Service provided between two peer entities at any layer of the OSI Model.

Connection Mode

Introduction to Network Programming in UNIX & LINUX 1-22© D.Zinchin [[email protected]]

Service is Reliable if it provides Sequencing and Error Control. Most of reliable services provide also Flow Control.

Sequencing• Means that the data is received by the receiver in the same order as it is transmitted by the sender.

In a packet-switched network, it is possible for two consecutive packets to take different routes, and thusarrive at their destination in a different order from the order in which they were sent.

Error Control • Guarantees that error-free data is received at the destination.

There are two conditions that can generate errors: - the data gets corrupted (modified during transmission), - the data gets lost.The network implementation has to provide for recovery from both these situations.

Flow control (pacing)•Assures that the sender does not send data at a rate faster than the receiver can process the data.

If Flow Control is not provided, it is possible for the receiver to lose data because of lack of resources.

Message Service•Provides record boundaries .

Byte Stream•Does not provide record boundaries.

Full-duplex - connection allows data to be transferred in both directions in the same time.Half-duplex - connection allows data to be transferred in both direction, but only one side to transfer at a time

Simplex - connection allows data to be transferred only in one direction (one end can only transmit and the other end can only receive.)

Data Stream Format

Reliability

Modes of Communication Service (continuation)

Introduction to Network Programming in UNIX & LINUX 1-23© D.Zinchin [[email protected]]

Summary: Communication Activities and Service Modes

     

combine/split of data from different Network (north) protocols

hide Network layer data

frame transmission in LAN

Ethernet/ Token Ring

Data Link

unreliablepacket delivery

connection-less

brake up (/recompose) packet into (/from) transition units

distributed dynamic routing

combine/split of data from different Transport (north) and Data Link (south) protocols

hide Transport layer data

Packet -switched (hop-by-hop)

IPNetwork

unreliabledatagram delivery

connection-less

 UDP

reliable (sequencing, error control, flow control)

full-duplex (bi-directional) byte stream

connection-oriented

brake up (/recompose) stream into (from) IP packets 

combine/split of data from different Application (north) processes

hide Application layer data

 

TCP

Transport

(application-depended)

(application-depended)

(application- depended)

(application-depended)

 

combine/split of data from different Transport (south) protocols

(application-depended)

 application protocols

Application

ReliabilityData Stream Format

Connection Mode

Fragmentation/ Reassembling

RoutingMultiplexing/ Demultiplexing

EncapsulationData Transmission

ProtocolModel Layer

Communication Service ModesCommunication Activities

Introduction to Network Programming in UNIX & LINUX 1-24© D.Zinchin [[email protected]]

IP – Internet ProtocolProvides unreliable connectionless packet delivery service, containing:

• Routing,• Fragmentation / Reassembling,• Multiplexing / Demultiplexing.

Works with different Data Link protocols and topologies, hiding the technological differences between the networks.

Communication ServicesProvided by TCP/IP Protocol Suite

Network Layer

UDP – User Datagram ProtocolProvides unreliable connectionless datagram delivery service.

TCP – Transmission Control ProtocolProvides reliable connection-oriented full-duplex byte stream serviceover unreliable connectionless packet-switched IP Network.

Transport Layer

Introduction to Network Programming in UNIX & LINUX 1-25© D.Zinchin [[email protected]]

Data Link Connection and Topology.

Transceiver

Host

Host Interface

Coaxial cable (ether)

Transceiver

This is communication device capable of transmitting and receiving of signals.

Performs analog - digital - analog translation.

Host Interface

It provides physical address associated with interface hardware and filters incoming packets

Bus Topology

The main characteristic of this topology is that it is a passive structure: when a node is down, the network is not affected.

Ring Topology

This kind of topology is less efficient and reliable but it is quite cheap. As soon as two lines are cut the network no longer works.

Star Topology

This topology is quite efficient and cheap. Most small local networks is built on this model by using a central Hub that connects computers together. A hub can imitate different network topology configurations.

Introduction to Network Programming in UNIX & LINUX 1-26© D.Zinchin [[email protected]]

Data Link Equipment

Repeater is a hardware component that transmits frames from one wire and places them on another.

Repeaters are a simple way to extend a LAN segment.

Bridge is a hardware component that filters frames according to destination address.

Bridges are used to connect multiple segments.

Hub is a common connection point for devices in a network.

Hub can imitate a bus or a ring or could be more sophisticated. In this case it called Switch.

I

I O

O

O

O II

O

I

I

HubI O

I

O

IO

O

I

Hub

Ring Imitation

Buss Imitation

A B

C

Repeater

A to B

A to C

All frames are transmitted to both segments

A B

C

Bridge

A to B

A to C

Local frames are filtered out

Introduction to Network Programming in UNIX & LINUX 1-27© D.Zinchin [[email protected]]

Data Link Protocols: Ethernet

Ethernet is broadcast bus technology with best effort delivery.

Was developed in the beginning of 70-s by Xerox corporation. Following development and standardizing was performed by DIX (Digital, Intel, Xerox) in 1979-1980. In 1983 it was approved as standard by IEEE .

Currently Fast Ethernet is one of most popular LAN technologies.

IEEE (Institute of Electrical and Electronics Engineers) - professional world-wide society for electronics and electrical engineers)

Technology description:• Each host interface has preset unique 48 bit physical address.• Transceiver senses when ether is in use and detects collisions.• When data is transmitted, all hosts connected to the bus can hear the transmission. • In case of collision both hosts wait for a random amount of time, before sending the information again.

Collision occurs when two devices on a network try to transmit information at the same time.

Ethernet Frame

P DA SA L/T Data CRC

Bytes: 8 6 6 2 46-1500 4

P – Preamble for synchronizing. The last

byte is SFD – Start of Frame Delimiter

DA – Destination Address

SA – Source Address

L/T – (Length/Type) Length of Data and

optional ID of upper level protocol

CRC – Cyclic Redundancy Check sum

Introduction to Network Programming in UNIX & LINUX 1-28© D.Zinchin [[email protected]]

BeginFrame Sending

1. Prepare Frame2. Trial Counter = 0

Wait untilIPG is passed

Send the 1-st bit of Frame

Frame is SentSuccessfully

Frame Sending Failure.Too many collisions.

Send the 32-bit JAM

Trial Counter ++

CalculateRandom Delay Time

WaitRandom Delay Time

Delay

Waiting

Transmission

Recovery After Collision

YES

NO

NO

YES

YES

NO

YES

NO

YES

NO

Send the Next bit of Frame

YES

NO

CSMA/CD

Carrier Sense Multiple Access with Collision Detection

IPG

Inter-Packet Gap

JAM

32 bit frame for collision signaling

Ethernet CSMA/CD algorithm

Is Trial Counter Greater then 16 ?

Is other node transmitting ?

Is IPG intervalPassed ?

Is Collisiondetected ?

Is Last bitsent ?

Is other node transmission

finished ?

Introduction to Network Programming in UNIX & LINUX 1-29© D.Zinchin [[email protected]]

Token Ring is deterministic technology with predictable delay.

Was developed in 1980 – 1985 by IBM. Approved as standard by IEEE.

Technology description:• This is not continuous wire, consists of connections among host interfaces, connecting to Ring by means of Multiple Access Units (MAU). (No more than 8 hosts per MAU).• Physical address is configurable by means of switches.• Control frame named “Token” is passed from one host to another, allowing to this host (and only this) to send the packet.• To send the frame, host performs the following steps:

• waits for the arriving Token• converts it to data frame and copies to the next host in the Ring• waits for the frame to return after delivery• deletes the frame and sends out a new Token

• Each moment of time no more than 1 host sends the data, all other hosts copy the data by chain.• Host interface could be in following modes:

• transmit mode – sending host• copying mode – all other hosts in ring• recovery mode – recovery in case of token loss

• When copying host recognizes its destination address, it cleans refuse bit.• Sender, accepting the original frame, detects if frame was delivered, checking the refuse bit.

Fist connected to Ring host accepts the status of Active Monitor. It responsible for:• Token creation and recovery• Check frame delivery timeout• Deletion of frames not deleted by other hosts.• Notification of other hosts about its presence in Ring (sends “Active Monitor Present” frame)In case of Active Monitor problem, other hosts compete to accept its status.

Data Link Protocols: Token Ring

Multiple Access Unit

In Token Ring collisions never occur. This ensures good performance of network under big loads (30%-40%)

Introduction to Network Programming in UNIX & LINUX 1-30© D.Zinchin [[email protected]]

SD AC FC DA SA Data CRC ED FS

SD - Start Delimiter

AC - Access Control - packet priority, type (token/data), active monitor bit

FC - Frame Control

DA - Destination Address

SA - Source Address

CRC - Cyclic Redundancy Check sum

ED - End Delimiter - contains 1 bit – Last Packet bit, Error flag bit

FS - Frame Status - contains Parity bit (copy indicator),

Refuse bit (destination reached)

Bytes: 1 1 1 6 6 0..4096 4 1 1

Token Ring Data Frame

ETR – Early Token Release technology

After sending of frame, the same host generates new Token. As result, many sequential frames could circulate in the Ring in the same time. But no more than one Token could present in each moment of time.

This technology improves the performance of Token Ring.

SD AC ED

Token Ring “Token”

1 1 1

Token Ring (continuation)

Introduction to Network Programming in UNIX & LINUX 1-31© D.Zinchin [[email protected]]

Network Layer: Internet Protocol

Ethernet

IP

(host)

TCP/UDP

application

TokenRing

IP

(host)

TCP/UDP

application

IP(router)

Ether net

IP (router)

LAN LAN LAN

hop 1 hop 2 hop 3

Internet Protocol (IP) Provides unreliable connectionless packet delivery service, containing:

• Routing,• Fragmentation / Reassembling,• Multiplexing / Demultiplexing.

Works with different Data Link layers, hiding the technological differences between the networks.

Data Link

Network

Transport

Application

Token Ring

Ether net

Ether net

Physical address

IP address

Introduction to Network Programming in UNIX & LINUX 1-32© D.Zinchin [[email protected]]

Internet Protocol: IP(v4) Packet Structure

• Version – IP Version• Header length – IP Header total length in 32bit words (max = 15*4=60 bytes)• DSCP, ECN – type of service fields, used by upper protocols• Total Length – total packet length (header + data).• Identification, DF (don't fragment), MF (more fragments), Fragment Offset – used for fragmentation and reassembly. • TTL - time-to-live – maximal number of hops, set by the sender, decremented by each router.• Protocol - upper layer protocol (1=ICMP, 2=IGMP, 6=TCP, 17=UDP). • Header Checksum - calculated over just the IP header including any options. • Source IP address (32 bit)• Destination IP address (32-bit)• Options (<=40 bytes) , used by upper protocols

1-st byte

5-th byte

9-th byte

13-th byte

17-th byte

bits:

Introduction to Network Programming in UNIX & LINUX 1-33© D.Zinchin [[email protected]]

Internet Address FormatsInternet Address (IP address) is mandatory unique logical address which must have every host in Internet.IP Address (IPv4) is 32-bit number. Decimal Dotted Notation is human-oriented representation of IP Address as sequence of 4 decimal numbers separated by dot.

The IP Address has internal structure. There are 5 classes of IP Address:

Class A Host IDNetwork ID0

Bits: 1 7 24

Class B Host IDNetwork ID1

Bits: 1 1 14 16

0

Class C Host IDNetwork ID1

Bits: 1 1 1 21 8

1 0

Class D Multicast address1

Bits: 1 1 1 1 28

1 01

Class E (reserved experimental format)1

Bits: 1 1 1 1 1 27

1 011

*Note:The 2 types of bit sequences: • All Bits equal 1• All Bits equal 0are not used as Network IDs andHost IDs

Class Range

A 0.0.0.0 to 127.255.255.255

B 128.0.0.0 to 191.255.255.255

C 192.0.0.0 to 223.255.255.255

D 224.0.0.0 to 239.255.255.255

E 240.0.0.0 to 247.255.255.255

Networks Per Class

2^7 – 2 = 126

2^14 – 2 = 16,382

2^21 – 2 = 2,097,150

N/A

N/A

Hosts Per Network

2^24 – 2 = 16,777,214

2^16 – 2 = 65,534

2^8 – 2 = 254

N/A

N/A

* *

Introduction to Network Programming in UNIX & LINUX 1-34© D.Zinchin [[email protected]]

NetmaskGateways, to locate the network, need only Network ID part of IP Address and don’t need to know the locationof every host. This is important concept of routing.To calculate Network ID from IP Address, Gateways use Netmask.

NETWORK_ID = IP_ADDRESS & NETMASK

Example. Calculation of Network ID for IP Address = 145.11.99.243 (Class B)

10001001 11010000 11000110 11001111

3F36B019

24399.11.145.

Host IDNetwork IDclass B

0x

11111111 11111111 00000000 00000000

0000FFFF

00.255.255.

0x

10001001 11010000 00000000 00000000

0000B019

00.11.145.

0x

IP Address

Netmask of Class B

Network ID

Class

A

B

C

Network ID

1 byte

2 bytes

3 bytes

Netmask

255. 0. 0. 0

255. 255. 0. 0

255. 255. 255. 0

Introduction to Network Programming in UNIX & LINUX 1-35© D.Zinchin [[email protected]]

IP Addresses are assigned by specific authority:

InterNIC- Internet Network Information Center. The InterNIC assigns only Network IDs. The assignment of Host IDs is responsibility of local site system administrator.

IP addresses are often subnetted. Subnet adds additional level to the address hierarchy:• Network ID (assigned to site)• Subnet ID (chosen by site)• Host ID (chosen by site)

In the example above local gateway needs only 8 bits of Subnet ID for routing. Adding new host to existing sub-network will not require any changes to the internal gateways.

Network ID and Subnet ID

Example. Subnetting of Network with Class B address, using 8 bit Subnet ID.

All the hosts on a given subnet share a common Subnet Mask, and this mask specifies the boundary between the subnet ID and the host ID. Bits of 1 in the subnet mask cover the network ID and subnet ID, and bits of 0 cover the host ID.

SUBNETWORK_ADDR = IP_ADDRESS & SUBNET_MASK

Subnet ID0

0Class B Host IDNetwork ID1

Bits: 1 1 14 16

Class B Host IDNetwork ID1

Bits: 1 1 14 8 8

Network mask:

Subnet mask:

255. 255. 0. 0

255. 255. 255. 0

=0xFFFF0000

=0xFFFFFF00

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Introduction to Network Programming in UNIX & LINUX 1-36© D.Zinchin [[email protected]]

There are three Destination Types of IP addresses: • Unicast (destined for a single host)

• Broadcast (destined for all hosts on a given network)

• Limited Broadcast

to all hosts inside local network

(Network ID = all 1, Host ID = all 1)

• Net-Directed Broadcast

to all hosts of other local network

(Network ID = netID, Host ID = all 1)

• Subnet-Directed Broadcast

to all hosts in specific sub-network

(Network ID = netID, Subnet ID = subnetId, Host ID = all 1)

• Multicast (destined for a set of hosts that belong to a multicast group or sub-network).

Multicast Address is a special type of address that is recognizable by multiple hosts joined a Multicast Group .

A Multicast Address is sometimes known as a Functional Address or a Group Address.

Hosts that are interested in receiving data flowing to a particular group must join the group using:

IGMP - Internet Group Management Protocol .

• Broadcasting and Multicasting needs support by hardware (Data Link layer).

• Broadcast and Multicast addresses could not be used as Source IP Address

multicast group

Types of Destination IP Address

Introduction to Network Programming in UNIX & LINUX 1-37© D.Zinchin [[email protected]]

Loopback AddressesBy convention, the address 127.0.0.1 is assigned to the Loopback Interface.Anything sent to this IP address loops around and becomes IP input without ever leaving the machine.This address id often used when testing a client and server on the same host. Any address on the network 127/8 can be assigned to the loopback interface, but 127.0.0.1 is and is often configured automatically by the IP stack. (This address is known as INADDR_LOOPBACK )

Unspecified AddressThe address consisting of 32 zero bits is Unspecified Address.It is only permitted to appear as the source address in packets sent by a node that is bootstrapping before the node learns its IP address. (This address is known as INADDR_ANY).

Private AddressesThree address ranges are set aside for “Private Internets“.These are the networks that do not connect directly to the public Internet.Small sites use these private addresses and Network Address Translation (NAT) to a single public IP address visible to the Internet.

NAT - Network Address TranslationAlso known as Network Masquerading or IP-masquerading is a technique in which the source and/or destination addresses of IP packets are rewritten as they pass through a router or firewall.

It is most commonly used to enable multiple hosts on a private network to access the Internet using a single public IP address.

Class RangeNumber of addresses

A 10.0.0.0 to 10.255.255.255 16,777,216

B 172.16.0.0 to 172.31.255.255 1,048,576

C 192.168.0.0 to 192.168.255.255 65,536

Special Case IP Addresses

Introduction to Network Programming in UNIX & LINUX 1-38© D.Zinchin [[email protected]]

Multihomed Host This is a host with multiple interfaces.Each interface must have a unique IP address. (Loopback interface is not counted)

A router, by definition, is multihomed since it forwards packets from one interface to another one.But, a multihomed host is not a router unless it forwards packets.

There are two types of Multihoming:• Physical Multihoming

Host has multiple physical interfaces, each interface has its own IP address. • Logical Multihoming

Newer hosts have the capability to assigning multiple IP addresses to the same physical interface.Each additional IP address, after the first (primary), is called an Alias or Logical Interface.

Multihomed NetworkThis is a network that has multiple connections to the Internet.For example, some sites have two connections to the Internet instead of one, providing a backup capability.

ifconfig

UNIX/LINUX utility for configuring network interface parameters

$ ifconfig -aeth0 Link encap:Ethernet HWaddr 00:11:25:0C:DE:88 inet addr:145.9.228.95 Bcast:145.9.228.255 Mask:255.255.255.0 inet6 addr: fe80::211:25ff:fe0c:de88/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:2295264 errors:0 dropped:0 overruns:0 frame:0 TX packets:783513 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:319104166 (304.3 MiB) TX bytes:75720636 (72.2 MiB) Base address:0x2000 Memory:e8100000-e8120000

lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:804848 errors:0 dropped:0 overruns:0 frame:0 TX packets:804848 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:49311882 (47.0 MiB) TX bytes:49311882 (47.0 MiB)

Multihoming and Address Aliases

Introduction to Network Programming in UNIX & LINUX 1-39© D.Zinchin [[email protected]]

Domain Name System (DNS) This is a distributed database that provides the mapping between IP addresses and hostnames.Hostnames are more suitable for human use, than IP addresses.

The DNS is distributed because no single site on the Internet knows all the information. Each site maintains its own database of information and runs a DNS Server program that other systems across the Internet (clients) can query. The DNS provides the protocol that allows clients and servers to communicate with each other.

Applications access the DNS through a Resolver. The resolver contacts one or more name servers to do the mapping.

The DNS Name Space is Hierarchical Tree. The InterNIC maintains the top-level domains:

• Generic Domains (com, edu, gov, int, mil, net, org)• Country Domains (us, uk, il, ru, etc.)

InterNIC delegates responsibility to others for specific Zones. A Zone is a subtree of the DNS tree that is administered separately. Many second-level domains then divide their zone into smaller zones.

To accept the DNS information, every Name Server must know how to contact the Root Name Servers. The root server tells the requesting server to contact another server, and so on.

A fundamental property of the DNS is Caching. Name Server caches accepted {IP Address ; Hostname} information for following reuse.

$nslookup gate88.mot.comServer: abcde.mot.comAddress: 145.19.17.68

Name: gate88.mot.comAddress: 145.19.238.87

nslookupUNIX utility for DNS information access

Domain Name System

Introduction to Network Programming in UNIX & LINUX 1-40© D.Zinchin [[email protected]]

Address Resolution

Data Link

Ethernet

Network

IP

Data Link

Ethernet

Network

IP

physical connection

IP packet

Ethernet frame

IP Address

(logical)

Ethernet Address

(physical)

Address Resolution is translation between Network Layer logical address and Data Link physical address.

Address Resolution Problem

I want to send IP packet to another host with known IP Address. What is Physical Address of that host?

• Known: IPA, PHA, IPB.

• Unknown: PHB

IPA IPB

PHA PHB

Host A Host B

Reverse Address Resolution Problem

I’m diskless workstation. What is my own IP address ?

• Known: PHA.

• Unknown: IPA.

Introduction to Network Programming in UNIX & LINUX 1-41© D.Zinchin [[email protected]]

ARPARP – Address Resolution Protocol

Solves Address Resolution Problem.

Provides dynamic mapping from an IP Address to the corresponding Physical Address on the same physical network.

X A Y B Z

( [ IPA , PHA] , [ IPB , ? ] )

X A Y B Z

( [ IPA , PHA] , [ IPB , PHB ] )ARP Reply

ARP Request

ARP Request is Ethernet Broadcast frame (Destination Ph Address = 0xFFFFFFFFFFFF) is sent to all hosts in physical network.

Only the host, recognizing its own IP Address in the Request, sends the ARP Reply, containing its Physical address.

ARP Cache

• Maintains the recent mappings from Internet addresses to Physical addresses.

• Removes its entries after expiration time

Proxy ARP

Lets to Router to answer ARP requests, addressed to another physical network, substituting router’s physical address instead of target foreign host address.

Than, accepting the frame, router forwards it to the target foreign host.

$arp –a

sun (140.252.13.33) at 8:0:20:3:f6:42svr4 (140.252.13.34) at 0:0:c0:c2:9b:26

arpUNIX utility for ARP Cache access

Introduction to Network Programming in UNIX & LINUX 1-42© D.Zinchin [[email protected]]

RARP Server

RARP ProtocolsRARP – Reverse Address Resolution Protocol

Solves Reverse Address Resolution Problem. Used by

diskless hosts during its initialization (bootstrap).

RARP Request is Ethernet Broadcast frame ,sent to all

hosts in physical network.

Only the RARP Server, containing the required information,

sends RARP Reply, containing IP Address of requestor.

RARP Server serves single LAN. It could be multiple RARP

Servers in LAN. They handle Distributed Data Base of IP

Addresses.

RARP Servers provide delay mechanism to avoid

simultaneous response to requestor from multiple RARP

Servers in the same time.

X A Y RARP Server

Z

( [ ? , PHA ] )

X A Y Z

( [ IPA , PHA ] )RARP Reply

RARP Request

BOOTP – Bootstrap Protocol • Enables a diskless workstation to discover its own IP address, BOOTP Server IP address, and a file to be

loaded into memory to boot the machine.

• Needs manual pre-configuration of the host information.

• Could be routed and serve more than one LAN.

DHCP – Dynamic Host Configuration Protocol• Allows dynamic allocation of network addresses and configurations to newly attached hosts.

• Allows recovery and reallocation of network addresses through a leasing mechanism.

• Does not require manual pre-configuration of the host information.

• Could be routed and serve more than one LAN.

Introduction to Network Programming in UNIX & LINUX 1-43© D.Zinchin [[email protected]]

ICMP ICMP - Internet Control Message Protocol.

ICMP handles Error and Control information messages between routers and hosts.

These messages are normally generated by and processed by the TCP/IP networking software.

Example of usage: ping and traceroute programs use ICMP.

ICMP DataICMP HeaderIP HeaderCRCCodeType

Type Description Query Error

0 echo reply (ping reply) *  

3 destination unreachable   *

4 quench (flow control)   *

5 redirect (use another router)   *

8 echo request (ping request) *  

9 router advertisement (reply to solicitation) *  

10 router solicitation (request for advertisement) *  

11 time exceeded (TTL=0)   *

12 parameter problem (bad IP header)   *

13 time stamp request (what time is it now) *  

14 timestamp reply (current time is…) *  

17 address mask request (give my subnet mask) *  

16 address mask reply (your subnet mask is …) *  

ICMP Message

Introduction to Network Programming in UNIX & LINUX 1-44© D.Zinchin [[email protected]]

istanbul% traceroute sanfranciscotraceroute to sanfrancisco (172.29.64.39), 30 hops max, 40 byte packets 1 frbldg7c-86 (172.31.86.1) 1.516 ms 1.283 ms 1.362 ms 2 bldg1a-001 (172.31.1.211) 2.277 ms 1.773 ms 2.186 ms 3 bldg4-bldg1 (172.30.4.42) 1.978 ms 1.986 ms 13.996 ms 4 bldg6-bldg4 (172.30.4.49) 2.655 ms 3.042 ms 2.344 ms 5 ferbldg11a-001 (172.29.1.236) 2.636 ms 3.432 ms 3.830 ms 6 frbldg12b-153 (172.29.153.72) 3.452 ms 3.146 ms 2.962 ms 7 sanfrancisco (172.29.64.39) 3.430 ms 3.312 ms 3.451 ms

Ethernet

IP (host)

TCP/UDP

application

TokenRing

IP (host)

TCP/UDP

application

IP(router)

Ether net

IP (router)

LAN LAN LAN

hop 1 hop 2 hop 3

Token RingEther net

Ether net

Remote packetis sent to

Router

Local packetis sent to Recipient

IP RoutingIP Routing is distributed. Each hop of the specific packet is calculated separately.

tracerouteUNIX utility, printing the route which packets take to network host

G G

hop1

hop2

hop3

Introduction to Network Programming in UNIX & LINUX 1-45© D.Zinchin [[email protected]]

IP Routing: Routing TableRouting TableThe IP layer has a Routing Table in memory that it searches each time it receives a datagram to send.The Routing Table contains the following information:• Destination – IP Address of Destination Host (flag “H”) or Destination Network (no flag ”H”)• Gateway – IP Address of Hop Router for Remote Network (flag “G”)

or IP Address of Local Host in Directly Connected Network (no flag “G”)• Flags – “U”- route is up, “G”- route to Remote Network via Gateway, “H” - route to specific Host,

“D”- route was added because of an ICMP Redirect Message.• Ref – The number of times the route used to establish a connection.• Use – The number of transmitted packages• Interface – The Network Interface used by route

UNIX Utilities:route - Utility for manual manipulation with Routing Table

netstat - Utility, showing network status

# netstat -nr# netstat -nr

Routing Table: IPv4Routing Table: IPv4

Destination Gateway Flags Ref Use InterfaceDestination Gateway Flags Ref Use Interface

------------- ------------ ----- --- ------- ---------------------- ------------ ----- --- ------- ---------

127.0.0.1 127.0.0.1 UH 1 298 lo0 127.0.0.1 127.0.0.1 UH 1 298 lo0

default 175.16.12.1 UG 2 50360 default 175.16.12.1 UG 2 50360

175.16.12.0 175.16.12.2 U 40 111379 bge0 175.16.12.0 175.16.12.2 U 40 111379 bge0

175.16.2.0 175.16.12.3 UG 4 1179 175.16.2.0 175.16.12.3 UG 4 1179

175.16.1.0 175.16.12.3 UG 10 1113 175.16.1.0 175.16.12.3 UG 10 1113

175.16.3.0 175.16.12.3 UG 2 1379 175.16.3.0 175.16.12.3 UG 2 1379

175.10.4.3 175.16.12.5 UGH 1 1119175.10.4.3 175.16.12.5 UGH 1 1119

Host Gateway

Destination

interface

- Loopback Route (Local Host) via interface lo0

- Default Router 175.16.12.1

- Directly Connected Network 175.16.12.0 via interface bge0

- Route to Remote Network 175.16.2.0 via Gateway 175.16.12.3

- Route to Remote Network 175.16.1.0 via Gateway 175.16.12.3

- Route to Remote Network 175.16.3.0 via Gateway 175.16.12.3

- Route to Remote Host 175.10.4.3 via Gateway 175.16.12.5

Introduction to Network Programming in UNIX & LINUX 1-46© D.Zinchin [[email protected]]

IP Routing AlgorithmSingle IP Routing algorithm for Host an Router

Most multi-user systems today (almost every Unix system),

can be configured to act as a Router. This means, that Host

and Router could have the same routing algorithm.

Note: Unlike Router, the Host never forwards packets

from one of its interfaces to another. yes

no

no

yes

yes

yes

yes

no

no

no

no

yes

yes

Discard IP Packet.Send ICMP Error.

Is Destination IP Address Found ?

Extract Network ID

Is Local Network ?

Is Subnet Mask Specified ?

Extract Sub-Network ID

Is Sub-Network ID Found ?

Is Network ID Found ?

Is Default Route Found ?

Is Directly Connected Network ?

Get Gateway IP Address

Find Directly Connected Network

Get Interface

Send IP Packet

no

Make Routing Decision

Introduction to Network Programming in UNIX & LINUX 1-47© D.Zinchin [[email protected]]

Static and Dynamic IP RoutingStatic Routing

During Static Routing the Routing Table:• created during interface configuration • added by the route command • or created by an ICMP redirect (if the wrong “default” was used)

It is fine if the network is small, has a single connection point to other networks

and does not have redundant routes (which could be used if a primary route fails)

Dynamic Routing

During Dynamic Routing the Routing Table also updated by Routing Daemon process.• the Routing Daemon is running on the Router• communicates with another Routers using a Routing Protocol • dynamically updates the kernel's routing table with information it receives from neighbor

routers

Routing Protocols are separated to:• IGP – Interior (Intra-Domain) Gateway Protocols

Used between the Routers of one autonomous system.

Examples: RIP – Routing Information Protocol , OSPF - Open Shortest Path First protocol. • EGP – Exterior (Inter-Domain) Gateway Protocols

Used between the Routers of different autonomous systems.

Example: BGP – Border Gateway Protocol

Introduction to Network Programming in UNIX & LINUX 1-48© D.Zinchin [[email protected]]

IP Routing Schema

UDP TCP

ICMP IGMP

IP Layer

Physical Network Interfaces

ICMP Redirect

Updates fromadjacent Routers

yes

no

Routing Daemon

route

command

netstatcommand

Process IP options.Is this Source Routing(prescribed by sender) ?

Manual updatesby Sysadmin

yes

no

yes

Is HopCalculated ?

IP output: Calculate Next Hop(to Destination Host or Hop Router)

yes

Discard IP Packet.Send ICMP Error.

Is IP ForwardingEnabled ?

no

no

Receive IP Packet.Put it to IP Input Queue

Deliver Message

RoutingTable

IP Input

IP Output

Routing Table write

Routing Table read

Is this My Packet ?(Destined to my IP Address

or Broadcast Address)

Send IP Packet.

Introduction to Network Programming in UNIX & LINUX 1-49© D.Zinchin [[email protected]]

Transport Layer: Port Numbers

Ethernet/ TokenRing

IP

TCP/UDP

application

Data Link

Network

Transport

Application

Physical address

IP address

Port number

Port Number is unique 16-bit identifier of Application Communication Endpoint, used by Transport Protocols.

TCP ports and UDP ports are independent.

Servers are normally known by their well-known port number. For example:

• FTP server always uses TCP port 21. • Telnet server is on TCP port 23. • TFTP server is on UDP port 69.

Clients usually need any unique free port. Client port numbers are called ephemeral ports.

Port number reservation:• 1 – 1023 Well-Known Ports.

Used by well-known servers across all Internet.• 1024 – 49151 Registered Ports. Used by servers known across specific networks.• 49152 – 65535 Dynamic Ports.

Used as ephemeral port numbers.

The ranges of Registered and Dynamic Ports are configurable.Port 1 Port 2 Port 3

Transport Protocol(UDP/TCP)

IP

Transport Protocols provide multiplexing / demultiplexing

based on Port Number

File /etc/services

on most Unix systems the well-known port numbers are contained in this file.

#grep telnet /etc/services telnet 23/tcp

#grep domain /etc/servicesdomain 53/udpdomain 53/tcp

Introduction to Network Programming in UNIX & LINUX 1-50© D.Zinchin [[email protected]]

Association and Transport Address

Communication link between two processes is completely specified by 5-tuple Association:

• Transport Protocol (UDP / TCP)

• Local IP Address

• Local Port

• Foreign IP Address

• Foreign Port

Transport Address (or Socket) is half-Association:

• Transport Protocol (UDP / TCP)

• Local IP Address

• Local Port

Socket Pair defines the Association

data link

IP

UDP

application

IP Addr B

Port B

data link

IP

UDP

application

IP Addr A

Port A

UDP Association Example

Socket A = { UDP, Port A, IP Address A }

Socket B = { UDP, Port B, IP Address B }

Association = { UDP, Port A, IP Address A, Port B , IP Address B }

Process A Process B

Introduction to Network Programming in UNIX & LINUX 1-51© D.Zinchin [[email protected]]

Transport Layer: UDP ProtocolUDP – User Datagram Protocol

Provides unreliable connectionless datagram delivery service.

UDP is a simple protocol: each output operation by a process produces exactly one UDP datagram, which causes one IP datagram to be sent.

UDP Datagram Structure

•Source and Destination Port Numbers - identify the sending process and the receiving process.

• UDP length - the length of the UDP header and the UDP data in bytes.

(it is redundant, because could be calculated from IP header)

• UDP checksum - covers the UDP Pseudo-Header (see below) and the UDP data.

To let UDP double-check that the data has arrived at the correct destination, the UDP checksum is calculated on UDP Pseudo-Header, containing:

UDP header itself

IP Header fields: Source IP Address, Destination IP Address, Protocol type.

• DATA field may be empty

Introduction to Network Programming in UNIX & LINUX 1-52© D.Zinchin [[email protected]]

TCP provides reliability by doing the following: • The application data is broken into TCP Segments - the best sized chunks passed by TCP to IP. Each TCP Segment has its Sequence Number.• When TCP receives data from the other end of the connection, it sends an Acknowledgment.• When TCP sends a segment it maintains a Timer. If an acknowledgment isn't received in time, the segment is retransmitted. • TCP maintains a Checksum on its header and data. If a segment arrives with an invalid checksum, TCP discards and doesn't acknowledge it, expecting the retransmission from sender after expiration of its timeout.• TCP Re-Sequences the data if necessary, passing the received data in the correct order to the application. • TCP provides discarding of duplicate data. • TCP provides Flow Control. Each end of a TCP connection has a finite amount of buffer space. A receiving TCP allows the sender to send as much data as the receiver has buffers for.

Transport Layer: TCP ProtocolTCP – Transmission Control Protocol

Provides reliable connection-oriented full-duplex byte stream serviceover unreliable connectionless packet-switched IP Network

SYN m

SYN n, ACK m+1

ACK n+1

active side

passive side

TCP Connection Establishing.Three – Way Handshake.

FIN m

ACK m+1

ACK n+1

TCP Connection Termination.Modified Three – Way Handshake.

FIN n

ACK n

data n-1

data m

ACK m+1

ACK n+1

Simple data exchange example.

data m+1

data n, ACK m+2

Introduction to Network Programming in UNIX & LINUX 1-53© D.Zinchin [[email protected]]

TCP Segment Structure

• Source and Destination port numbers - identify the sending process and the receiving process.

• Sequence Number – unique identifier of TCP Segment, equal to the sequence number of segment 1 st data byte in

stream between TCP sender and TCP receiver. Byte numeration begins from ISN (initial sequence number)

chosen by TCP sender. Wraps back around to 0 after reaching 2^32 - 1.

• Acknowledgement Number – used with ACK flag. The number of next byte, which TCP receiver is ready to receive.

• Header Length - the length of the header in 32-bit words.

• Flags: URG – urgent data (used with Urgent Pointer), ACK – acknowledge (used with Acknowledge Number)

PSH - flush receiver data from its cache to process, RST – reset the connection (“port unreachable” reply)

SYN – connect (used with Sequence Number = ISN), FIN – finalize the connection

• Window Size – number of bytes, which receiver is ready to accept (flow control)

• TCP Checksum - covers TCP Pseudo-Header (TCP Header + IP Addresses and Protocol fields) + TCP Data

• Urgent Pointer – used with URG flag. Offset from Sequence Number to last byte of urgent data.

• Options (<=40 bytes). Maximal Segment Size (MSS) announcement, Timestamp, Window scale, etc.

Introduction to Network Programming in UNIX & LINUX 1-54© D.Zinchin [[email protected]]

TCP Sliding WindowSliding Window is algorithm of positive acknowledgement with re-transmission.

• Receive Window – consists of any space in Receive Buffer that is not occupied by data. Data remains in Receive buffer until target application will accept it. The size of Receive Window is advertised by TCP Receiver to TCP Sender.

• Send Buffer - begins from first un-acknowledged segment. Has the size of advertised Receive Window Size.

• Send Window - covers unused part of the Send Buffer.

1 2 3 4 … … … … … … … …

ACKed,

accepted

by application

TCP Segments

1 2 3 4 5 6 7 8 9 10 11 …

Sent, ACKed

Sent, NOT ACKed

Can Sent immediately

Can’t Sent, until window moves

Send Window

Send Buffer

TCP Segments

ACKed,

NOT accepted

by application

Receive Window

Receive BufferTCP Receiver

TCP Sender

Slidingdirection

Introduction to Network Programming in UNIX & LINUX 1-55© D.Zinchin [[email protected]]

TCP State Transition Diagram

normal transition for Client

normal transition for Server

appl: state transition when application issues operation

recv: state transition taken when segment received

send: what is sent for this transaction

The Legend:

TIME_WAIT State

The endpoint that initiates termination of connection, goes through TIME_WAIT state and remains there for the period:

2 * MSL (maximum segment lifetime)

MSL is configurable life time of IP packet in the network. The actual duration of 2MSL varies from 30 sec up to 4 min.

The TIME_WAIT state has two reasons:

1.Reliable termination of TCP connection (ability to reply ACK for resent FIN)

2.To allow old duplicate segments to expire in the network (1MSL for original duplicate segments + 1 MSL for replies to be lost from network)

For the period of 2MSL the TCP prevents the incarnation (the restart from the same port) of the connection.

Introduction to Network Programming in UNIX & LINUX 1-56© D.Zinchin [[email protected]]

Modern protocols in TCP/IP Protocol Suite

IPv6 - Internet Protocol version 6.

Network Layer protocol, designed in the mid-1990s as a replacement for IPv4. The major change is a larger address comprising 128 bits to deal with the explosive growth of the Internet in the 1990s.

IPv6 addresses are 128 bits long and are usually written as eight 16-bit hexadecimal numbers.

IPv6 provides packet delivery service for TCP, UDP, SCTP, and ICMPv6.

ICMPv6 - Internet Control Message Protocol version 6.

Is integral part of IPv6 implementation.

ICMPv6 combines the functionality of ICMPv4, IGMP, and ARP.

SCTP - Stream Control Transmission Protocol.

Transport Layer protocol, providing Reliable Connection-Oriented Full-Duplex Message Service.

Designed in 2000-2002.

SCTP connection called Association, because SCTP is Multihomed and involves a set of IP addresses and a single port for each side of an Association.

SCTP provides a Message Service, which maintains record boundaries. As with TCP and UDP, SCTP can use either IPv4 or IPv6, but it can also use both IPv4 and IPv6 simultaneously on the same association.

SCTP can provide multiple streams between connection endpoints, each with its own reliable sequenced delivery of messages. A lost message in one of these streams does not block delivery of messages in any of the other streams.

Introduction to Network Programming in UNIX & LINUX 1-57© D.Zinchin [[email protected]]

Virtual Network and Tunneling. The 6bone.

The 6bone is a virtual network that was created in 1996 for users with islands of IPv6-capable hosts wanted to connect them together using a virtual network without waiting for all the intermediate routers to become IPv6-capable. The 6bone is established on top of the existing IPv4 Internet using tunnels.

Introduction to Network Programming in UNIX & LINUX 1-58© D.Zinchin [[email protected]]

IPv4

Internet Protocol version 4.

IPv6

Internet Protocol version 6.

TCP

Transmission Control Protocol.

UDP

User Datagram Protocol.

SCTP

Stream Control Transmission Protocol.

ICMP

Internet Control Message Protocol

(version 4).

IGMP

Internet Group Management Protocol.

ARP

Address Resolution Protocol.

RARP

Reverse Address Resolution Protocol.

ICMPv6

Internet Control Message Protocol version 6.

BPF

BSD Packet Filter. Provides access to the datalink layer in Berkeley-derived kernels.

DLPI

Datalink Provider Interface. Provides access to the datalink layer in System V R4.

TCP/IP Summary

ping

ICMP v6

IPv4 Applications IPv6 Applications

tcp-dump

m-routed

trace-route

appl.ping appl.appl.appl.appl.appl. trace-route

data-link

BPFDLPI

ARPRARP

IPv4 IPv6IGMP

ICMP

TCP SCTP UDP

32-bitaddress

128-bitaddress

API

ICMP v6

ping