Post on 22-Dec-2015
Avishai Wool, Lecture 11
Introduction to Systems Programming Lecture 11
TCP/IP
Internet Protocol (IP) – Cont.
Layer 3
Network layer functions
• transport packets from sending to receiving hosts
• network layer protocols run in every host and router
Important functions:
• path determination: the route taken by packets from source to destination (routing algorithms)
• switching: move packets from a router’s input to the appropriate router output
[Figure: two end hosts with full protocol stacks (application, transport, network, data link, physical) connected through routers that implement only the network, data link, and physical layers]
Issues in layer 3
• Point-to-point, datagram service
• Connect multiple LANs to each other
• Addressing: 32-bit IP addresses
  – Must be unique in the entire network (the whole Internet!)
• Get the packets to their destination (routing)
• Connectionless
  – each packet carries its source & destination IP addresses
  – each packet is routed independently through the network
• Unreliable
  – packets can arrive out of order or be dropped entirely
Routing
Routing protocol goal: determine a “good” path (sequence of routers) through the network from source to destination.
Graph abstraction for routing algorithms:
• graph nodes are routers
• graph edges are physical links
  – link cost: delay, $ cost, or congestion level
• “good” path:
  – typically means minimum cost path
  – other definitions possible
[Figure: example graph with six routers A to F; each link is labeled with a cost between 1 and 5]
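As an illustration of "minimum cost path", the following is a minimal sketch of Dijkstra's algorithm over a six-router graph. The slide does not name an algorithm, so the choice of Dijkstra is an assumption. The link costs in the `cost` table are also hypothetical, because the exact edges of the slide's figure cannot be recovered from the transcript.

```c
#include <limits.h>

#define N_ROUTERS 6
#define NO_LINK 0

/* Hypothetical link costs for routers A..F (indices 0..5); 0 means no direct link. */
static const int cost[N_ROUTERS][N_ROUTERS] = {
    /*        A  B  C  D  E  F */
    /* A */ { 0, 2, 5, 1, 0, 0 },
    /* B */ { 2, 0, 3, 2, 0, 0 },
    /* C */ { 5, 3, 0, 3, 1, 5 },
    /* D */ { 1, 2, 3, 0, 1, 0 },
    /* E */ { 0, 0, 1, 1, 0, 2 },
    /* F */ { 0, 0, 5, 0, 2, 0 },
};

/* Fill dist[i] with the minimum total link cost from router src to router i. */
void min_cost_paths(int src, int dist[N_ROUTERS])
{
    int done[N_ROUTERS] = {0};

    for (int i = 0; i < N_ROUTERS; i++)
        dist[i] = INT_MAX;
    dist[src] = 0;

    for (int round = 0; round < N_ROUTERS; round++) {
        /* Pick the closest router whose distance is not final yet. */
        int u = -1;
        for (int i = 0; i < N_ROUTERS; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break;              /* remaining routers are unreachable */
        done[u] = 1;

        /* Relax every link leaving u. */
        for (int v = 0; v < N_ROUTERS; v++)
            if (cost[u][v] != NO_LINK && dist[u] + cost[u][v] < dist[v])
                dist[v] = dist[u] + cost[u][v];
    }
}
```

Calling `min_cost_paths(0, dist)` computes the cost of the cheapest path from router A to every other router in this hypothetical graph.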
Hierarchical Routing
• Algorithms to find “shortest paths” through a graph exist
• They usually assume:
  – all nodes are equal
  – the network is “flat”
• This does not work well because of:
Scale (millions of IPs):
• can’t store all destinations in routing tables!
• the network changes all the time: routing table exchange would swamp links!
Administrative autonomy:
• internet = network of networks
• each network admin may want to control routing in its own network
Routing and Subnets
• A router needs to aggregate routes (“all traffic to IP addresses 132.66.*.* goes that way”) to keep routing tables small.
• How to aggregate? Use subnets
• What is a subnet (IP perspective)?
  – A set of IP addresses that have the same “Most Significant Bits” of the IP address (the “network part” of the IP address)
Subnets: the old days
“Class-full” addressing: only 4 possible lengths (“classes”) of the “network part” of the 32-bit address.
class  leading bits  rest of the address  address range
A      0             network, host        1.0.0.0 to 127.255.255.255
B      10            network, host        128.0.0.0 to 191.255.255.255
C      110           network, host        192.0.0.0 to 223.255.255.255
D      1110          multicast address    224.0.0.0 to 239.255.255.255
Subnets: CIDR blocks
• Class-full addressing:
  – inefficient use of address space, e.g., a class B net is allocated enough addresses for 65K hosts, even if there are only 2K hosts in that network
• CIDR: Classless InterDomain Routing
  – Allow the network part of the address to have any length
  – One address format: a.b.c.d/x, where x is the prefix (# bits) in the network portion of the address
  – Alternative address format: IP=a.b.c.d netmask=w.x.y.z
CIDR blocks - example
200.23.16.0/23
• In binary: 11001000 00010111 00010000 00000000
  – network part: the first 23 bits
  – host part: the last 9 bits
• Can be written as:
  – 200.23.16.0 netmask 255.255.254.0
• The range of IP addresses in the subnet:
  – low:  11001000 00010111 00010000 00000000
  – high: 11001000 00010111 00010001 11111111
  – In dotted-decimal: 200.23.16.0 to 200.23.17.255
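The arithmetic behind this example can be sketched in a few lines of C. The names `IPV4`, `cidr_netmask`, `cidr_low`, and `cidr_high` are made up for this illustration and are not part of any standard API. Addresses are held as 32-bit integers with the first octet in the most significant byte.

```c
#include <stdint.h>

/* Pack four dotted-decimal octets a.b.c.d into one 32-bit value. */
#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Netmask with the top prefix_len bits set (prefix_len must be in 0..32). */
uint32_t cidr_netmask(int prefix_len)
{
    /* Shifting a 32-bit value by 32 is undefined in C, so /0 is handled separately. */
    return prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Lowest address in the block: all host bits cleared. */
uint32_t cidr_low(uint32_t ip, int prefix_len)
{
    return ip & cidr_netmask(prefix_len);
}

/* Highest address in the block: all host bits set. */
uint32_t cidr_high(uint32_t ip, int prefix_len)
{
    return ip | ~cidr_netmask(prefix_len);
}
```

For the block above, `cidr_netmask(23)` is 255.255.254.0, and `cidr_low` and `cidr_high` applied to 200.23.16.0 with prefix 23 give 200.23.16.0 and 200.23.17.255.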
Hierarchical addressing: Route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
• Fly-By-Night-ISP connects Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23)
• Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16”
Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1:
• Organization 1 (200.23.18.0/23) is reached through ISPs-R-Us; Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) are reached through Fly-By-Night-ISP
• Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
IP Routing Semantics
• Most specific route is used
• Same as the route with the longest prefix
• Sometimes called “longest prefix match”
• When you read a routing table, order does not matter!
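A minimal sketch of longest prefix match, using the three advertisements from the Fly-By-Night-ISP / ISPs-R-Us example as the routing table. The `struct route` layout, the `example_routes` table, and the `lookup` function are hypothetical names for this illustration. Real routers use faster data structures than a linear scan.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four dotted-decimal octets a.b.c.d into one 32-bit value. */
#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

struct route {
    uint32_t prefix;        /* network part, host bits zero */
    int prefix_len;         /* the x in a.b.c.d/x */
    const char *next_hop;   /* who to hand the packet to */
};

/* The order of the entries is deliberately arbitrary. */
static const struct route example_routes[] = {
    { IPV4(200, 23, 16, 0), 20, "Fly-By-Night-ISP" },
    { IPV4(199, 31, 0, 0),  16, "ISPs-R-Us" },
    { IPV4(200, 23, 18, 0), 23, "ISPs-R-Us" },
};

static uint32_t netmask(int prefix_len)
{
    return prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Return the matching route with the longest prefix, or NULL if none matches. */
const struct route *lookup(const struct route *table, size_t n, uint32_t dst)
{
    const struct route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if ((dst & netmask(table[i].prefix_len)) != table[i].prefix)
            continue;                       /* dst is not inside this block */
        if (best == NULL || table[i].prefix_len > best->prefix_len)
            best = &table[i];               /* more specific than what we had */
    }
    return best;
}
```

A packet for 200.23.18.5 matches both the /20 and the /23 entry, and the /23 entry wins regardless of where it sits in the table.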
Special IP addresses
• 127.0.0.1 is always “this computer” / “self loop”
• On a subnet, the “all-1” IP address is “broadcast” (all hosts on that subnet receive the traffic).
  – In the “Class C”-sized (/24) subnet 132.66.10.*, 132.66.10.255 is broadcast
  – In 192.168.1.0/26, 192.168.1.63 is broadcast
Interaction between IP and layer 2
• Directly connected computer:
  – Use ARP to find the MAC address, encapsulate, and send via layer 2
• How does a layer 3 device (router) know the computer is directly connected?
  – Hosts on a LAN must have IP addresses in one subnet
  – That subnet is somehow marked as “directly connected” in the routing table
    • Example: list the network interface name
Example of a routing table
bakara:~ 591 > netstat -v -rn
Destination        Mask            Gateway              Device
------------------ --------------- -------------------- ------
default            0.0.0.0         132.66.48.1
132.66.0.0         255.255.0.0     132.66.48.12         hme0
224.0.0.0          240.0.0.0       132.66.48.12         hme0
127.0.0.1          255.255.255.255 127.0.0.1            lo0
IP address of bakara is 132.66.48.12
Class B network, /16 mask
• Routing to 132.66.100.1 is done directly (via layer 2)
• Routing to 64.236.24.4 goes through router 132.66.48.1
TCP
Transport Layer (layer 4)
Transport services and protocols
• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
• transport vs network layer services:
  – network layer: data transfer between end systems
  – transport layer: data transfer between processes; relies on, and enhances, network layer services
[Figure: logical end-end transport between two end hosts with full protocol stacks (application, transport, network, data link, physical); the routers in between implement only the network, data link, and physical layers]
TCP service model
• End point computers are identified by IP addresses.
• To allow multiple connections to/from the same end-points, TCP has port numbers:
  – 16-bit numbers (0-65535)
• Between two end points, a TCP connection is identified by 2 ports: source port and destination port.
• Provides a stream of bytes
Multiplexing/demultiplexing
• segment: unit of data exchanged between transport layer entities
• Demultiplexing: delivering received segments to the correct app layer processes
[Figure: application processes P1 to P4 on hosts with application/transport/network stacks; each segment consists of a segment header (Ht) followed by application-layer data (M), and travels inside a network-layer header (Hn) to the receiver]
Multiplexing/demultiplexing
Multiplexing: gathering data from multiple app processes, enveloping the data with a header (later used for demultiplexing)
• based on sender, receiver port numbers, IP addresses
  – source, dest port #s in each segment
TCP/UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  other header fields
  application data (message)
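A minimal sketch of the demultiplexing step, assuming a raw segment is available as a byte array. Both TCP and UDP place the 16-bit source port in the first two bytes and the destination port in the next two, most significant byte first. The `port_binding` table and the `demultiplex` function are hypothetical. The sketch matches on the destination port only, which is a simplification of the slide's "port numbers, IP addresses".

```c
#include <stddef.h>
#include <stdint.h>

/* Source port: bytes 0-1 of the segment, most significant byte first. */
uint16_t segment_src_port(const uint8_t *segment)
{
    return (uint16_t)((segment[0] << 8) | segment[1]);
}

/* Destination port: bytes 2-3 of the segment, most significant byte first. */
uint16_t segment_dst_port(const uint8_t *segment)
{
    return (uint16_t)((segment[2] << 8) | segment[3]);
}

/* Hypothetical demultiplexing table entry: which process owns which local port. */
struct port_binding {
    uint16_t port;
    int process_id;
};

/* Return the process that should receive the segment, or -1 if no process owns the port. */
int demultiplex(const struct port_binding *bindings, size_t n, const uint8_t *segment)
{
    uint16_t dst = segment_dst_port(segment);

    for (size_t i = 0; i < n; i++)
        if (bindings[i].port == dst)
            return bindings[i].process_id;
    return -1;
}
```

A segment whose first four bytes are 0x05 0x3E 0x00 0x50 has source port 1342 and destination port 80, so it is handed to whichever process is bound to port 80.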
TCP segment structure
Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | rcvr window size
  checksum | ptr urgent data
  Options (variable length)
  application data (variable length)
• sequence and acknowledgement numbers: counting by bytes of data (not segments!)
• ACK: ACK # valid
• RST, SYN, FIN: connection establishment (setup, teardown commands)
3-way handshake
• Server listens on a port
• Client knows the IP address of the server and the port number.
1. Client → Server: SYN packet (header bit); includes client’s IP address (return address) and port #. Connection is “half open”.
2. Server → Client: SYN+ACK (both bits set)
3. Client → Server: ACK (header bit)
• At this point the connection is considered OPEN.
• Client can already send data with the ACK segment
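The three steps differ only in which header bits are set, which can be sketched as a small classifier over the TCP flags byte. The bit values for FIN, SYN, RST, and ACK are the standard ones from the TCP header. The enum and the function `classify_handshake` are hypothetical names. A real implementation also tracks connection state, since a plain ACK appears on every later segment as well.

```c
#include <stdint.h>

/* Standard TCP header flag bits. */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_ACK 0x10

enum handshake_step {
    NOT_HANDSHAKE = 0,
    STEP_SYN = 1,       /* 1. client to server */
    STEP_SYN_ACK = 2,   /* 2. server to client */
    STEP_ACK = 3        /* 3. client to server */
};

/* Guess which handshake step a segment could be, looking at its flags only. */
enum handshake_step classify_handshake(uint8_t flags)
{
    if (flags & (TCP_FIN | TCP_RST))
        return NOT_HANDSHAKE;               /* teardown or reset, not setup */
    if ((flags & TCP_SYN) && (flags & TCP_ACK))
        return STEP_SYN_ACK;
    if (flags & TCP_SYN)
        return STEP_SYN;
    if (flags & TCP_ACK)
        return STEP_ACK;
    return NOT_HANDSHAKE;
}
```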
Well known port numbers
• Port numbers 0-1023 are considered “well known”
• Unix: only a privileged (root) process can listen on a well-known port
• Allocated by IANA
• Examples:
  – ftp (file transfer): port 21
  – smtp (email): port 25
  – http (web): port 80
• Important for inter-operability, and for firewalls!
Sequence numbers
• Sequence numbers count bytes.
• Receiver ACKs (the bytes in) each segment: the ACK flag is set in the header and the ACK# field is filled in.
• If sender doesn’t get ACK within timeout it retransmits the packet.
• Message delivered to receiver’s code only when all the pieces are available
TCP seq. #’s and ACKs
Seq. #’s:
  – byte stream “number” of the first byte in the segment’s data
ACKs:
  – seq # of the next byte expected from the other side
  – cumulative ACK
Q: how does the receiver handle out-of-order segments?
  – A: the TCP spec doesn’t say; it is up to the implementor
Simple telnet scenario (in time order):
1. Host A → Host B: Seq=42, ACK=79, data = ‘C’ (user types ‘C’)
2. Host B → Host A: Seq=79, ACK=43, data = ‘C’ (host ACKs receipt of ‘C’, echoes back ‘C’)
3. Host A → Host B: Seq=43, ACK=80 (host ACKs receipt of echoed ‘C’)
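The bookkeeping in the telnet scenario can be reproduced with two counters per host. This is a minimal sketch that assumes in-order, loss-free delivery. The `tcp_end` and `seg` structs and the `send_data` function are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-host state: next byte we will send, next byte we expect to receive. */
struct tcp_end {
    uint32_t snd_next;
    uint32_t rcv_next;
};

/* The header fields of interest in one segment. */
struct seg {
    uint32_t seq;   /* number of the first data byte in this segment */
    uint32_t ack;   /* next byte the sender expects from the other side */
    size_t len;     /* bytes of data carried */
};

/* Send len bytes from one host to the other and update both counters. */
struct seg send_data(struct tcp_end *from, struct tcp_end *to, size_t len)
{
    struct seg s = { from->snd_next, from->rcv_next, len };

    from->snd_next += (uint32_t)len;        /* sequence numbers count bytes */
    to->rcv_next = s.seq + (uint32_t)len;   /* assumes the segment arrives in order */
    return s;
}
```

Starting with host A at `{42, 79}` and host B at `{79, 42}`, sending one byte from A to B, one byte back, and then an empty segment from A reproduces the three Seq/ACK pairs of the scenario.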
When does TCP send a segment?
• TCP provides a “stream of bytes”
• When does it “finish” a segment and send it?
• RFC 793 (standard) says to send the data
  – “… in segments at its own convenience”
• Most implementations send a segment per write() system call
• If the write() is too big then several segments are sent.
Buffering and Congestion in IP
• Congestion:
  – suppose a router has 3 connections, A, B, C
  – both A & B have a segment for C at the same time
  – only one segment can leave on link C, the other has to wait
• To hide congestion, routers have buffers.
• If A & B continue sending to C, buffer space will run out and the router drops packets
• Also: if the incoming link is faster than the outgoing link, buffers will overflow → segments dropped.
Causes/costs of congestion
• one router, finite buffers
• sender retransmission of lost packet
TCP congestion control
• Idea: if segments are starting to drop (timeout before ACK is received)
  → Congestion somewhere between sender & receiver
  → Sender should slow down
• Assumption is that loss is not due to transmission errors (noise)
The congestion window
• Sender side keeps a window of how many segments it is allowed to send without waiting for an ACK.
• Window size behavior:
  – starts at 1
  – for each successful window sent, the window increases
  – a timeout drops the window size back to 1
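The window behavior can be sketched as a single update function. The slide does not say by how much the window increases. This sketch assumes it doubles after each successful window, as in TCP slow start, and caps it at a hypothetical `max_window`. The function name `next_window` is also made up for this illustration.

```c
/*
 * Compute the next congestion window size, in segments.
 * Assumption: the window doubles after each successfully ACKed window.
 */
unsigned next_window(unsigned window, int timed_out, unsigned max_window)
{
    if (timed_out)
        return 1;                   /* timeout: back to the starting size */
    if (window > max_window / 2)
        return max_window;          /* doubling would exceed the cap */
    return window * 2;
}
```

Starting at 1, three successful windows give 2, 4, 8, and a timeout at any point brings the window back to 1.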
Domain Name Service
Internet Naming
• IP addresses are hard for humans to remember:
  – Better: symbolic names, with hierarchy (“domain names”, e.g., www.eng.tau.ac.il).
  – Hierarchy (and dots) usually not related to dotted decimal notation
• To facilitate: DNS servers translate from domain names to IP addresses.
• DNS == Domain Name Service
DNS servers
• Layer 5 application (on top of TCP/UDP)
• Hierarchy of servers across the Internet
• Each server is authoritative for its domain
• If it doesn’t know the IP address of a name, it forwards the translation query to a higher-up server.
• 10-15 root DNS servers: critical Internet infrastructure.
• DNS traffic is always on port 53
Useful programs to know about
• ping – sends a packet to a host (by name or IP address), calculates round-trip time
• traceroute (tracert on MS-Win) – discovers all the routers between your computer and another
• nslookup – translates a name to an IP or vice-versa. Shows the name of your DNS server.
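A program can do the same name-to-address translation as nslookup through the standard resolver call `getaddrinfo()`. The sketch below is limited to IPv4 and returns only the first address. The wrapper name `resolve_ipv4` is hypothetical.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Translate a host name (or dotted-decimal string) to a dotted-decimal IPv4
 * address. Writes the first address found into out.
 * Returns 0 on success, -1 on failure.
 */
int resolve_ipv4(const char *name, char *out, size_t out_len)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;          /* IPv4 only, as in the lecture */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(name, NULL, &hints, &result) != 0 || result == NULL)
        return -1;

    const struct sockaddr_in *addr = (const struct sockaddr_in *)result->ai_addr;
    const char *text = inet_ntop(AF_INET, &addr->sin_addr, out, (socklen_t)out_len);

    freeaddrinfo(result);
    return text != NULL ? 0 : -1;
}
```

With a buffer of `INET_ADDRSTRLEN` bytes, `resolve_ipv4("www.eng.tau.ac.il", buf, sizeof buf)` would ask the configured DNS server for that name. The answer depends on the network the program runs on.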
Programming with TCP
The BSD socket API
Server/Client Model
• Server knows how to do something.
    forever {
        getRequest();
        doRequest();
        sendReply();
    }
• Client has something to do.
    sendRequest();
    getReply();
BSD Sockets API
• socket(TCP): creates the data structure, returns a socket descriptor
• Both server & client create sockets
• After the 3-way handshake, the socket can be used as if it is a file descriptor:
  – write() and read()
Server side
• s = socket(…): create data structure
• bind(s, port): associate port with socket
• listen(): starts listening on port
• accept(): blocks, waits for the 3-way handshake to complete, returns a new socket that is initialized with client-side information.
  – Original socket still open, waiting for more connections
Client side
• s = socket(…): create data structure
• connect(): specifies the IP address of the server, and the port number. Starts the 3-way handshake.
• Client’s port number is chosen dynamically by the OS, in the range 1024-65535
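The two client-side calls can be sketched as one helper, assuming IPv4 and a server address given in dotted-decimal form. The function name `connect_to_server` is hypothetical. The client never calls bind(), which is why the OS picks its port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns a connected socket descriptor, or -1 on failure. */
int connect_to_server(const char *server_ip, uint16_t server_port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(server_port);         /* ports travel in network byte order */
    if (inet_pton(AF_INET, server_ip, &addr.sin_addr) != 1)
        return -1;                              /* not a valid dotted-decimal address */

    int s = socket(AF_INET, SOCK_STREAM, 0);    /* SOCK_STREAM selects TCP */
    if (s < 0)
        return -1;

    /* connect() starts the 3-way handshake and returns once it completes or fails. */
    if (connect(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

After a successful return, the descriptor can be passed to write() and read() like a file descriptor, and closed with close() when the exchange is over.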
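Server with BSD sockets
#include <netinet/in.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>
void doOperation(int s1);                    /* application-specific, defined elsewhere */
int main(int argc, char *argv[]) {
    if (argc < 2) return 1;                  /* usage: server portnum */
    int portnum = atoi(argv[1]);             /* port to listen on */
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons((unsigned short)portnum);
    int s = socket(AF_INET, SOCK_STREAM, 0);
    bind(s, (struct sockaddr *)&addr, sizeof addr);
    listen(s, 5);
    for (;;) {
        int s1 = accept(s, NULL, NULL);      /* blocks until a client connects */
        if (fork() != 0) {                   /* parent: keeps listening */
            close(s1);
        } else {                             /* child: serves this client */
            close(s);
            doOperation(s1);
            exit(0);                         /* child must not return to accept() */
        }
    }
}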
Example in pictures
1. Machine A: the server is listening to port 555, blocked in “accept”.
2. Machine B, from port 1342, connects to 555 on A; machine A is accepting the connection from B on port 555.
3. Machine B (port 1342) is connected to 555 on A. On machine A, one socket on port 555 is connected to 1342 on B, while the original socket is listening to the port again, blocked in “accept”.
Concepts for review
• Routing Protocol
• IP Subnet
• CIDR
• Route aggregation
• Most-specific route
• Longest prefix match
• Self-loop IP address
• IP Broadcast address
• TCP port numbers
• 3-way handshake
• Well-known ports
• Seq# and ACK
• Congestion
• Domain-Name-Service (DNS)
• Sockets