Avishai Wool lecture 11 - 1 Introduction to Systems Programming Lecture 11 TCP/IP

Post on 22-Dec-2015



Introduction to Systems Programming Lecture 11

TCP/IP


Internet Protocol (IP) – Cont.

Layer 3


Network layer functions

• transport packets from sending to receiving hosts
• network layer protocols run in every host and router

Important functions:

• path determination: the route taken by packets from source to destination (routing algorithms)
• switching: move packets from a router's input to the appropriate router output

[Figure: protocol stacks along a path. The two end hosts run all five layers (application, transport, network, data link, physical); each router in between runs only network, data link, and physical.]


Issues in layer 3

• Point-to-point, datagram service
• Connect multiple LANs to each other
• Addressing: 32-bit IP addresses
  – Must be unique in the entire network (the whole Internet!)
• Get the packets to their destination (routing)
• Connectionless
  – each packet carries its source & destination IP addresses
  – each packet is routed independently through the network
• Unreliable
  – packets can arrive out of order or be dropped entirely


Routing

Routing protocol

Goal: determine a “good” path (sequence of routers) through the network from source to destination.

Graph abstraction for routing algorithms:

• graph nodes are routers
• graph edges are physical links
  – link cost: delay, $ cost, or congestion level
• “good” path:
  – typically means minimum-cost path
  – other definitions possible

[Figure: example graph with six routers, A to F, joined by ten weighted links with costs between 1 and 5.]
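The minimum-cost path idea can be sketched with Dijkstra's algorithm, one of the classic shortest-path algorithms. This is only an illustration: the slide's figure did not survive extraction, so the assignment of costs to links below is an assumption (it reuses the cost values 1, 2, 3, and 5 visible on the slide), and the names `link_cost` and `min_cost_paths` are made up for this sketch.

```c
#include <limits.h>

#define N_ROUTERS 6     /* routers A..F mapped to indices 0..5 */
#define NO_LINK 0       /* 0 in the matrix means "no direct link" */

/* Assumed link costs; the slide's exact edge layout is not recoverable. */
static const int link_cost[N_ROUTERS][N_ROUTERS] = {
    /*        A  B  C  D  E  F */
    /* A */ { 0, 2, 5, 1, 0, 0 },
    /* B */ { 2, 0, 3, 2, 0, 0 },
    /* C */ { 5, 3, 0, 3, 1, 5 },
    /* D */ { 1, 2, 3, 0, 1, 0 },
    /* E */ { 0, 0, 1, 1, 0, 2 },
    /* F */ { 0, 0, 5, 0, 2, 0 },
};

/* Fill dist[i] with the minimum path cost from src to router i. */
void min_cost_paths(int src, int dist[N_ROUTERS])
{
    int done[N_ROUTERS] = { 0 };

    for (int i = 0; i < N_ROUTERS; i++)
        dist[i] = INT_MAX;
    dist[src] = 0;

    for (int round = 0; round < N_ROUTERS; round++) {
        /* Pick the closest router that is not finalized yet. */
        int u = -1;
        for (int i = 0; i < N_ROUTERS; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break;      /* remaining routers are unreachable */
        done[u] = 1;

        /* Relax every link leaving u. */
        for (int v = 0; v < N_ROUTERS; v++) {
            int c = link_cost[u][v];
            if (c != NO_LINK && dist[u] + c < dist[v])
                dist[v] = dist[u] + c;
        }
    }
}
```

With these assumed costs, calling `min_cost_paths(0, dist)` computes the cheapest cost from router A to every other router; a real routing protocol would also record the next hop on each path.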


Hierarchical Routing

• Algorithms to find “shortest paths” through a graph exist
• They usually assume:
  – all nodes are equal
  – the network is “flat”
• This does not work well because:

Scale: millions of IPs
• can’t store all destinations in routing tables!
• the network changes all the time: routing table exchange would swamp links!

Administrative autonomy
• internet = network of networks
• each network admin may want to control routing in its own network


Routing and Subnets

• A router needs to aggregate routes (“all traffic to IP addresses 132.66.*.* goes that way”) - to keep routing tables small.

• How to aggregate? Use subnets

• What is a subnet (IP perspective)?
  – A set of IP addresses that share the same “Most Significant Bits” (the “network part” of the IP address)


Subnets: the old days

“Classful” addressing: only 4 possible lengths (“classes”) of the “network part” of the 32-bit address.

class  leading bits  rest of the 32 bits  address range
A      0             network, host        1.0.0.0 to 127.255.255.255
B      10            network, host        128.0.0.0 to 191.255.255.255
C      110           network, host        192.0.0.0 to 223.255.255.255
D      1110          multicast address    224.0.0.0 to 239.255.255.255


Subnets: CIDR blocks

• Classful addressing:
  – inefficient use of address space, e.g., a class B net is allocated enough addresses for 65K hosts, even if there are only 2K hosts in that network
• CIDR: Classless InterDomain Routing
  – Allow the network part of the address to have any length
  – One address format: a.b.c.d/x, where x is the prefix length (# of bits) in the network portion of the address
  – Alternative address format: IP=a.b.c.d netmask=w.x.y.z


CIDR blocks - example

200.23.16.0/23

11001000 00010111 00010000 00000000
network part: the first 23 bits; host part: the last 9 bits

• Can be written as:
  – 200.23.16.0 netmask 255.255.254.0
• The range of IP addresses in the subnet:
  – low:  11001000 00010111 00010000 00000000
  – high: 11001000 00010111 00010001 11111111
  – In dotted-decimal: 200.23.16.0 – 200.23.17.255
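The arithmetic behind this example is a pair of bit operations on the 32-bit address. The sketch below is one possible way to compute the netmask and the address range of a CIDR block; the helper names (`ipv4`, `netmask`, `cidr_low`, `cidr_high`) are invented for illustration and addresses are kept in host byte order.

```c
#include <stdint.h>

/* Pack four dotted-decimal octets into one 32-bit address (host byte order). */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Netmask with the top prefix_len bits set; prefix_len must be 0..32. */
uint32_t netmask(unsigned prefix_len)
{
    /* Shifting a 32-bit value by 32 is undefined in C, so handle /0 separately. */
    return prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Lowest address of the block: host bits all 0. */
uint32_t cidr_low(uint32_t addr, unsigned prefix_len)
{
    return addr & netmask(prefix_len);
}

/* Highest address of the block: host bits all 1. */
uint32_t cidr_high(uint32_t addr, unsigned prefix_len)
{
    return addr | ~netmask(prefix_len);
}
```

For 200.23.16.0/23, `netmask(23)` yields 255.255.254.0 and `cidr_high` yields 200.23.17.255, matching the slide.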


Hierarchical addressing: Route aggregation

Hierarchical addressing allows efficient advertisement of routing information:

[Figure: Fly-By-Night-ISP connects Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) and advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”. ISPs-R-Us advertises: “Send me anything with addresses beginning 199.31.0.0/16”.]


Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

[Figure: the same topology, except that Organization 1 (200.23.18.0/23) is now connected to ISPs-R-Us. Fly-By-Night-ISP still advertises “Send me anything with addresses beginning 200.23.16.0/20”; ISPs-R-Us advertises “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”.]


IP Routing Semantics

• Most specific route is used

• Same as route with the longest prefix

• Sometimes called “longest prefix match”

• When you read a routing table – order does not matter!
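A longest prefix match can be sketched as a linear scan that remembers the matching route with the largest prefix length. The code below is an illustrative sketch, not how production routers do it (they typically use tries or hardware tables); `struct route`, `IP`, and `lookup_next_hop` are names assumed for this example, and `next_hop` is just an integer label.

```c
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t prefix;        /* network part, host byte order */
    unsigned prefix_len;    /* number of leading bits that must match, 0..32 */
    int next_hop;           /* illustrative identifier of the neighbor to use */
};

/* Build a 32-bit address from dotted-decimal octets. */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Does dst fall inside the route's prefix? */
static int route_matches(const struct route *r, uint32_t dst)
{
    uint32_t mask = r->prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - r->prefix_len);
    return (dst & mask) == (r->prefix & mask);
}

/* Return the next hop of the most specific matching route, or -1 if none. */
int lookup_next_hop(const struct route *table, size_t n, uint32_t dst)
{
    const struct route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!route_matches(&table[i], dst))
            continue;
        /* Keep the longest prefix; table order plays no role. */
        if (best == NULL || table[i].prefix_len > best->prefix_len)
            best = &table[i];
    }
    return best ? best->next_hop : -1;
}
```

Given routes for 200.23.16.0/20 and 200.23.18.0/23, a packet to 200.23.18.5 matches both, and the /23 route wins regardless of where it sits in the table.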


Special IP addresses

• 127.0.0.1 is always “this computer” / “self loop”

• On a subnet, the “all-1” IP address (all host bits set to 1) is “broadcast” (all hosts on that subnet receive the traffic).
  – In the “Class C”-sized (/24) subnet 132.66.10.*, 132.66.10.255 is broadcast
  – In 192.168.1.0/26, 192.168.1.63 is broadcast


Interaction between IP and layer 2

• Directly connected computer:
  – Use ARP to find the MAC address, encapsulate and send via layer 2
• How does a layer 3 device (router) know the computer is directly connected?
  – Hosts on a LAN must have IP addresses in one subnet
  – That subnet is somehow marked as “directly connected” in the routing table
    • Example: list the network interface name


Example of a routing table

bakara:~ 591 > netstat -v -rn

Destination        Mask            Gateway              Device
------------------ --------------- -------------------- ------
default            0.0.0.0         132.66.48.1
132.66.0.0         255.255.0.0     132.66.48.12         hme0
224.0.0.0          240.0.0.0       132.66.48.12         hme0
127.0.0.1          255.255.255.255 127.0.0.1            lo0

IP address of bakara is 132.66.48.12. Class B network, /16 mask.

• Routing to 132.66.100.1 is done directly (via layer 2)
• Routing to 64.236.24.4 goes through router 132.66.48.1
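The two bullets above boil down to a single test: is the destination in the host's own subnet? The sketch below shows that decision under simplifying assumptions (one interface, one default gateway, and the multicast and loopback rows of the table ignored); `struct host_config`, `ADDR`, and `layer2_target` are hypothetical names.

```c
#include <stdint.h>

/* Pack dotted-decimal octets into a 32-bit address (host byte order). */
#define ADDR(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

struct host_config {
    uint32_t my_ip;     /* address of this host's interface */
    uint32_t mask;      /* netmask of the directly connected subnet */
    uint32_t gateway;   /* default router for everything else */
};

/* Address whose MAC should be resolved with ARP in order to reach dst. */
uint32_t layer2_target(const struct host_config *h, uint32_t dst)
{
    if ((dst & h->mask) == (h->my_ip & h->mask))
        return dst;         /* same subnet: deliver directly via layer 2 */
    return h->gateway;      /* different subnet: hand the packet to the router */
}
```

With bakara's values (132.66.48.12, mask 255.255.0.0, gateway 132.66.48.1), a packet to 132.66.100.1 is sent to the destination itself, while a packet to 64.236.24.4 is sent to the router.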


TCP

Transport Layer (layer 4)


Transport services and protocols

• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
• transport vs network layer services:
  – network layer: data transfer between end systems
  – transport layer: data transfer between processes
    – relies on, enhances, network layer services

[Figure: logical end-end transport. The two end hosts run all five layers (application, transport, network, data link, physical); the routers between them run only network, data link, and physical.]


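TCP service model

• End-point computers are identified by IP addresses.
• To allow multiple connections to/from the same end-points, TCP has port numbers:
  – 16-bit numbers (0-65535)
• Between two given end-points, a TCP connection is identified by 2 ports: source port and destination port. Overall, a connection is identified by the 4-tuple (source IP, source port, destination IP, destination port).
• Provides a stream of bytes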


Multiplexing/demultiplexing

• segment: unit of data exchanged between transport layer entities
• Demultiplexing: delivering received segments to the correct app layer processes

[Figure: hosts with application, transport, and network layers running processes P1 to P4. Each message M travels as a segment (segment header Ht followed by application-layer data) inside a network-layer packet (header Hn); the receiver's transport layer hands each segment to the right process.]


Multiplexing/demultiplexing

Multiplexing: gathering data from multiple app processes, enveloping data with a header (later used for demultiplexing)

• based on sender, receiver port numbers, IP addresses
  – source, dest port #s in each segment

TCP/UDP segment format (each row is 32 bits wide):

  source port # | dest port #
  other header fields
  application data (message)
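Because both TCP and UDP place the two port numbers in the first 32 bits of the segment, a receiver can read them before it knows anything else about the payload. The sketch below shows one way to extract them from raw bytes; `struct ports` and `read_ports` are names assumed for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct ports {
    uint16_t src;
    uint16_t dst;
};

/*
 * Read the source and destination ports from the start of a TCP or UDP
 * segment. Returns 0 on success, -1 if the buffer is shorter than 4 bytes.
 */
int read_ports(const uint8_t *segment, size_t len, struct ports *out)
{
    if (len < 4)
        return -1;
    /* Both fields are 16-bit big-endian ("network byte order") integers. */
    out->src = (uint16_t)((segment[0] << 8) | segment[1]);
    out->dst = (uint16_t)((segment[2] << 8) | segment[3]);
    return 0;
}
```

A demultiplexer would then use `dst` (together with the IP addresses and, for TCP, `src`) to pick the socket that should receive the data.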


TCP segment structure

Each row is 32 bits wide:

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | rcvr window size
  checksum | ptr urgent data
  Options (variable length)
  application data (variable length)

• sequence and acknowledgement numbers: counting by bytes of data (not segments!)
• ACK: ACK # valid
• RST, SYN, FIN: connection establishment (setup, teardown commands)
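To make the layout concrete, the following sketch pulls the sequence number, acknowledgement number, header length, and flag bits out of a raw TCP header. The byte offsets follow the standard TCP header layout; the type and function names (`struct tcp_summary`, `summarize_tcp_header`, the `FLAG_*` constants) are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    FLAG_FIN = 0x01, FLAG_SYN = 0x02, FLAG_RST = 0x04,
    FLAG_PSH = 0x08, FLAG_ACK = 0x10, FLAG_URG = 0x20
};

struct tcp_summary {
    uint32_t seq;           /* sequence number */
    uint32_t ack;           /* acknowledgement number (meaningful if FLAG_ACK set) */
    unsigned header_len;    /* header length in bytes, options included */
    unsigned flags;         /* bitwise OR of the FLAG_* values above */
};

/* Read a 32-bit big-endian integer. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];
}

/* Returns 0 on success, -1 if the buffer cannot hold a fixed 20-byte header. */
int summarize_tcp_header(const uint8_t *seg, size_t len, struct tcp_summary *out)
{
    if (len < 20)
        return -1;
    out->seq = be32(seg + 4);
    out->ack = be32(seg + 8);
    out->header_len = (unsigned)(seg[12] >> 4) * 4;  /* field counts 32-bit words */
    out->flags = seg[13] & 0x3F;                     /* the U A P R S F bits */
    return 0;
}
```

A segment whose flags equal `FLAG_SYN | FLAG_ACK` is the second step of connection setup; application data starts `header_len` bytes into the segment.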


3-way handshake

• Server listens on a port
• Client knows the IP address of the server and the port number.

1. Client → Server: SYN packet (header bit); includes the client’s IP address (return address) and port #. Connection is “half open”.
2. Server → Client: SYN+ACK (both bits set)
3. Client → Server: ACK (header bit)

• At this point the connection is considered OPEN.
• Client can already send data with the ACK segment


Well known port numbers

• Port numbers 0-1023 are considered “well known”
• Unix: only a process with superuser (root) privilege can listen on a well-known port
• Allocated by IANA
• Examples:
  – ftp (file transfer): port 21
  – smtp (email): port 25
  – http (web): port 80
• Important for inter-operability, and for firewalls!


Sequence numbers

• Sequence numbers count bytes.

• Receiver ACKs (the bytes in) each segment: the ACK flag is set in the header and the ACK# field is filled in.

• If sender doesn’t get ACK within timeout it retransmits the packet.

• Message delivered to receiver’s code only when all the pieces are available


TCP seq. #’s and ACKs

Seq. #’s:
  – byte stream “number” of first byte in segment’s data

ACKs:
  – seq # of next byte expected from other side
  – cumulative ACK

Q: how does the receiver handle out-of-order segments?
  – A: the TCP spec doesn’t say; it is up to the implementor

Simple telnet scenario (time runs downward):

  1. Host A → Host B: Seq=42, ACK=79, data = ‘C’  (user types ‘C’)
  2. Host B → Host A: Seq=79, ACK=43, data = ‘C’  (host ACKs receipt of ‘C’, echoes back ‘C’)
  3. Host A → Host B: Seq=43, ACK=80              (host ACKs receipt of echoed ‘C’)
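The numbers in the telnet scenario follow from two rules: a reply's sequence number is whatever the peer said it expects next, and its ACK is the peer's sequence number plus the bytes just received. The sketch below encodes only that arithmetic, assuming in-order delivery and no other data in flight; `struct segment` and `reply_to` are illustrative names.

```c
#include <stdint.h>

struct segment {
    uint32_t seq;       /* number of the first data byte in this segment */
    uint32_t ack;       /* next byte expected from the other side */
    uint32_t data_len;  /* bytes of application data carried */
};

/*
 * Build the segment a host sends in response to `in`, carrying reply_len
 * bytes of its own data. Unsigned arithmetic wraps modulo 2^32, as TCP
 * sequence numbers do.
 */
struct segment reply_to(const struct segment *in, uint32_t reply_len)
{
    struct segment out;
    out.seq = in->ack;                  /* continue where the peer expects us */
    out.ack = in->seq + in->data_len;   /* cumulative ACK: next byte we expect */
    out.data_len = reply_len;
    return out;
}
```

Starting from Host A's segment (Seq=42, ACK=79, one byte), `reply_to` produces Host B's echo (Seq=79, ACK=43), and applying it again with no data produces the final ACK (Seq=43, ACK=80).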


When does TCP send a segment?

• TCP provides a “stream of bytes”
• When does it “finish” a segment and send it?
• RFC 793 (standard) says to send the data
  – “… in segments at its own convenience”
• Most implementations send a segment per write() system call
• If the write() is too big then several segments are sent.


Buffering and Congestion in IP

• Congestion:
  – suppose a router has 3 connections, A, B, C
  – both A & B have a segment for C at the same time
  – only one segment can leave on link C, the other has to wait
• To hide congestion, routers have buffers.
• If A & B continue sending to C, buffer space will run out and the router drops packets
• Also: if the incoming link is faster than the outgoing link, buffers will overflow → segments dropped.


Causes/costs of congestion

• one router, finite buffers
• sender retransmission of lost packet


TCP congestion control

• Idea: if segments are starting to drop (timeout before ACK is received)
  → congestion somewhere between sender & receiver
  → sender should slow down

• Assumption is that loss is not due to transmission errors (noise)


The congestion window

• Sender side keeps a window of how many segments it is allowed to send without waiting for an ACK.

• Window size behavior:
  – starts at 1
  – for each successful window sent, the window increases
  – a timeout drops the window size back to 1
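The window behavior can be sketched as a small update function. The slide only says the window "increases", so the growth rule here (doubling per successful window, as in TCP's slow-start phase) and the cap `MAX_WINDOW` are assumptions chosen for illustration; real TCP stacks use more elaborate rules.

```c
/* Largest window this sketch will grow to (hypothetical cap, in segments). */
#define MAX_WINDOW 64

/*
 * Window size to use for the next round.
 * timed_out != 0 means an ACK failed to arrive before the timeout.
 */
unsigned next_window(unsigned window, int timed_out)
{
    if (timed_out)
        return 1;               /* loss is taken as a sign of congestion */
    if (window >= MAX_WINDOW / 2)
        return MAX_WINDOW;      /* do not grow past the cap */
    return window * 2;          /* assumed growth rule: double per round */
}
```

Starting from 1, successive successful rounds give 2, 4, 8, and so on, and a single timeout sends the sender back to a window of 1.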


Domain Name Service


Internet Naming

• IP addresses are hard for humans to remember:
  – Better: symbolic names, with hierarchy (“domain names”, e.g., www.eng.tau.ac.il).
  – Hierarchy (and dots) usually not related to dotted decimal notation
• To facilitate: DNS servers translate from domain names to IP addresses.
• DNS == Domain Name System (often called Domain Name Service)


DNS servers

• Layer 5 application (on top of TCP/UDP)
• Hierarchy of servers across the Internet
• Each server is authoritative for its domain
• If it doesn’t know the IP address of a name, it forwards the translation query to a higher-up server.
• 10-15 root DNS servers: critical Internet infrastructure.
• DNS traffic is always on port 53
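From a program, the name-to-address translation is normally requested through the system resolver rather than by talking to a DNS server directly. The sketch below uses the POSIX `getaddrinfo()` call for that; the wrapper name `first_ipv4` and the choice to return only the first IPv4 result are assumptions made for this example.

```c
/* Ask the C library to expose POSIX declarations such as getaddrinfo(). */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Translate `name` to its first IPv4 address, written to `out` in
 * dotted-decimal form. Returns 0 on success, -1 on failure.
 */
int first_ipv4(const char *name, char *out, size_t out_len)
{
    struct addrinfo hints, *results;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;          /* IPv4 only, as in the lecture */
    hints.ai_socktype = SOCK_STREAM;    /* avoid one result per socket type */

    if (getaddrinfo(name, NULL, &hints, &results) != 0)
        return -1;

    const struct sockaddr_in *sa = (const struct sockaddr_in *)results->ai_addr;
    const char *text = inet_ntop(AF_INET, &sa->sin_addr, out, (socklen_t)out_len);
    freeaddrinfo(results);
    return text ? 0 : -1;
}
```

Passing a host name such as the one on the earlier slide triggers a DNS query on most systems; passing a string that is already a dotted-decimal address returns it without any lookup.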


Useful programs to know about

• ping – sends a packet to a host (by name or IP address), calculates round-trip time

• traceroute (tracert on MS-Win) – discovers all the routers between your computer and another

• nslookup – translates a name to an IP address or vice-versa. Shows the name of your DNS server.


Programming with TCP

The BSD socket API


Server/Client Model

• Server knows how to do something.

    forever {
        getRequest();
        doRequest();
        sendReply();
    }

• Client has something to do.

    sendRequest();
    getReply();


BSD Sockets API

• socket(TCP): creates the data structure, returns a socket descriptor
• Both server & client create sockets
• After the 3-way handshake, the socket can be used as if it is a file descriptor:
  – write() and read()


Server side

• s = socket(…): create data structure
• bind(s, port): associate port with socket
• listen(): starts listening on port
• accept(): blocks, waits for the 3-way handshake to complete, returns a new socket that is initialized with client-side information.
  – Original socket still open, waiting for more connections


Client side

• s = socket(…): create data structure
• connect(): specifies IP address of server, and port number. Starts 3-way handshake.
• Client’s port number chosen dynamically by OS, in the range 1024-65535
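The client-side steps can be sketched as one helper function. This is a minimal illustration limited to IPv4 and a dotted-decimal server address; the function name `connect_to_server` is an assumption, while `socket()`, `inet_pton()`, and `connect()` are the standard calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a TCP connection to the server at dotted-decimal address `ip`,
 * port `port`. Returns a connected socket descriptor, or -1 on any error.
 */
int connect_to_server(const char *ip, uint16_t port)
{
    struct sockaddr_in server;
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_port = htons(port);              /* port in network byte order */
    if (inet_pton(AF_INET, ip, &server.sin_addr) != 1)
        return -1;                              /* not a valid IPv4 address */

    int s = socket(AF_INET, SOCK_STREAM, 0);    /* SOCK_STREAM selects TCP */
    if (s < 0)
        return -1;

    /* connect() starts the 3-way handshake; the OS picks the client port. */
    if (connect(s, (struct sockaddr *)&server, sizeof server) < 0) {
        close(s);
        return -1;
    }
    return s;   /* usable with read() and write() like a file descriptor */
}
```

Note that the client never calls bind(): leaving the local port unset is what lets the OS choose it dynamically.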


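Server with BSD sockets

main() {
    s = socket(...);            /* create the listening socket */
    bind(s, portnum);           /* associate the port with it */
    listen(s, ...);
    for (;;) {
        s1 = accept(s);         /* blocks until a client connects */
        if (pid = fork()) {     /* parent */
            close(s1);          /* parent keeps only the listening socket */
        } else {                /* child */
            close(s);           /* child keeps only the connected socket */
            doOperation(s1);
            exit(0);            /* child must not fall back into the accept loop */
        }
    }
}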
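The outline above leaves out the address setup and error handling that a compilable program needs. The sketch below fills those in under stated assumptions: IPv4 only, a backlog of 16, an echo loop standing in for doOperation(), and SIGCHLD ignored so finished children are reaped automatically. The names `make_listener`, `echo_client`, and `serve_forever` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <netinet/in.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP socket listening on `port` on all local addresses; -1 on error. */
int make_listener(uint16_t port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(s, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(s, 16) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Stand-in for doOperation(): echo bytes back until the client closes. */
static void echo_client(int s1)
{
    char buf[512];
    ssize_t n;
    while ((n = read(s1, buf, sizeof buf)) > 0)
        if (write(s1, buf, (size_t)n) != n)
            break;              /* stop if the whole chunk could not be written */
}

/* Accept clients forever, handling each one in its own child process. */
void serve_forever(int s)
{
    signal(SIGCHLD, SIG_IGN);   /* let the kernel reap finished children */
    for (;;) {
        int s1 = accept(s, NULL, NULL);
        if (s1 < 0)
            continue;           /* e.g. interrupted; try again */
        pid_t pid = fork();
        if (pid == 0) {         /* child: serve this client, then exit */
            close(s);
            echo_client(s1);
            close(s1);
            _exit(0);
        }
        close(s1);              /* parent (or failed fork): drop its copy */
    }
}
```

A program would call `serve_forever(make_listener(port))` after checking that the listener is not -1; passing port 0 asks the OS to pick any free port, which is convenient for experiments.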


Example in pictures

1. Machine A: listening to port 555, blocked in “accept”.
2. Machine B: connects from its port 1342 to 555 on A. Machine A: accepting the connection from B on port 555.
3. Machine A: the original socket is still listening to port 555, blocked in “accept”; a second socket on port 555 is connected to 1342 on B. Machine B: port 1342 is connected to 555 on A.


Concepts for review

• Routing Protocol
• IP Subnet
• CIDR
• Route aggregation
• Most-specific route
• Longest prefix match
• Self-loop IP address
• IP Broadcast address
• TCP port numbers
• 3-way handshake
• Well-known ports
• Seq# and ACK
• Congestion
• Domain-Name-Service (DNS)
• Sockets