how internet works emc 165 computer and communication networks feb 3, 2004
TRANSCRIPT
How Internet Works
EMC 165 Computer and Communication Networks
Feb 3, 2004
Outline
How Internet Instrastructure Works How Routers Work How TCP/IP networks work How Routing Algorithms Work How NAT works
What is the Internet?
It is a global collection of networks, both big and small.
Recall in Lecture 2, we mentioned that one of the greatest things about the Internet is that nobody really owns it.
These networks connect together in many different ways to form the single entity that we know as the Internet. In fact, the very name comes from this idea of interconnected networks.
The Internet Concept
The Internet Concept (Cont’d)
Internet: Network of Networks
Every computer that is connected to the Internet is part of a network, even the one in your home.
For example, you may use a modem and dial a local number to connect to an Internet Service Provider (ISP).
At school/work, you may be part of a local area network (LAN), but you most likely still connect to the Internet using an ISP that your school/company has contracted with.
Internet: Network of Networks (Cont’d)
When you connect to your ISP, you become part of their network.
The ISP may then connect to a larger network and become part of their network.
The Internet is simply a network of networks.
Connecting Network of Networks
The amazing thing here is that there is no overall controlling network.
Instead, there are several high-level networks connecting to each other through Network Access Points or NAPs.
All the networks that make up the Internet rely on NAPs, backbones and routers to talk to each other.
History of Internet 1962, Paul Baran of the RAND Corporation was commissioned by the
US Air Force to do a study on how it could maintain its command and control over its missiles and bombers, after a nuclear attack.
Baran’s final proposal was a packet switched network. 1968, Advanced Research Project Agency (ARPA) awarded the
APRPANET contract to BBN. The physical network was constructed in 1969, linking 4 nodes: UCLA, SRI (Stanford), UCSB, University of Utah via 50 Kbps circuits.
The 1st email program was created by Ray Tomlison of BBN in 1972. ARPA was later renamed the Defense Advanced Research Projects
Agency (DARPA) in 1972. In 1973, developments began on the protocol later to be called TCP/IP
by a group headed by Vinton Cerf from Stanford and Bob Kahn from DARPA.
The term Internet was coined by Vint Cerf and Bob Kahn in their paper on TCP in 1974
History of Internet - contd
Dr R. Metcalfe developed Ethernet in 1976, which allowed coaxiable cable to move data extremely fast.
Dept of Defense begain experimenting with the TCP/IP protocol in 1976 and soon decided to require it for use on ARPANET.
Total number of hosts on the backbones in 1976: 111+
History of Internet - contd
National Science Foundation (NSF) created the 1st high-speed backbone in 1987 called the NSFNET.
NSFNET is a T1 line that connected 170 smaller networks together and operated at 1.544 Mbps.
IBM, MCI, and Merit worked with NSF to create the backbone and developed a T3 (45 Mbps) backbone the following year
Total number of hosts in the Internet: 56,000 in 1988. In 1990, this number has jumped up to 313,000
In 1992, World-Wide Web was released by CERN. NSFNET backbone completely upgraded to T3.
Total number of hosts in the Internet in 1992 – 1.136 millions
History of Internet - contd
In 1994, ATM (145 Mbps) backbone is installed on NSFNET.
Total number of hosts has increased to 3.864 millions in 1994
Most Internet traffic is carried by backbones of independent ISPs including MCI, AT&T, Sprint, UUNet, BBN planet etc.
The total number of hosts in 1999 was around 15 millions and growing rapidly.
Backbones
Backbones are typically fiber optic trunk lines The trunk line has multiple fiber optic cables
combined together to increase the capacity. Fiber optic cables are designated OC for
optical carrier such as OC-3, OC-12 or OC-48.
An OC-3 line is capable of transmitting 155 Mbps while an OC-48 can transmit 2,488 Mbps (2.488 Gbps).
Logical addresses Every piece of equipment that connects to a network
has a physical address. This is an address unique to the piece of equipment.
The physical address is also called the Medium Access Control (MAC) address. It has 2 parts each 3 bytes long. The 1st 3 bytes identify the company that made the Network Interface Card (NIC), and the 2nd 3 bytes are the serial number of the NIC itself.
The interesting thing to note is a computer can have several logical addresses at the same time.
Logical addresses like IP address are assigned statically or dynamically.
Internet Protocol: IP addresses Every machine on the Internet has a unique underlying number, called
an IP address. The IP stands for Internet Protocol which is the language that
computers use to communicate over the Internet. A protocol is a pre-defined way that someone who wants to use a
service talks with that service. That someone could be a person, but more often it is a computer program like a Web-browser.
A typical IP address looks like this 216.27.61.137The four numbers in an IP address are called octets, because they each have
eight bits.Each octet can contain any value between zero and 255.So, combining 4 octets give us 232 possible unique values.
Certain values are restricted from use as typical IP addresses e.g. 0.0.0.0 is reserved for the default network and 255.255.255.255 is reserved for broadcasts
How TCP/IP network works.
IPv4 Header
Version HLength Type of Service Total Length
Header Checksum
Identification
(Next) ProtocolTime-to-Live
Flags Fragment Offset
Source Address
Destination Address
IP Options
Data
Payload up to 65,535 bytes
0 31
IP Addresses - Motivation Key aspect of a virtual network is a single, uniform
address format Can't use hardware addresses because different
technologies have different address formats Format must be independent of any particular
hardware address format
Sending host puts destination internet address in packet
Destination address can be interpreted by any intermediate router Routers examine address and forward packet on
to the destination
Properties 32-bit number globally unique (with a few exceptions!) hierarchical: network + host Classes of addresses for specific types of networks
Classfull Addresses
Generally assigned by authorities
except from: A-class net: 10.0.0.0 B-class net 172.16.0.0 C-class net 192.16.8.0
Some college have a B-class net e.g.134.226.0.0 Can arrange for Dept. of Comp. Science. to have a
number of subnets in this domain e.g. 134.226.32.0, 134.226.51.0
Classfull Addresses
Summary
Virtual network needs uniform addressing scheme, independent of hardware
IP address is a 32-bit address IP address is composed of a network address
and a host address Network addresses are divided into classes e.g.
A, B and C Dotted decimal notation is a standard format for
Internet addresses: 134.226.32.57
IP Address & Ethernet Address
Computer A
Computer B
Computer X
Computer Y
172.16.1.100-08-74-32-24-89
IP Address
MAC Address
Address Resolution Protocol (ARP)
Computer A
Computer B
Computer X
Computer Y
172.16.1.100-08-74-32-24-89
172.16.1.2MAC ADDRESS???
forComputer B
ARP’s “Who has…?” Packet
Computer A
Computer B
Computer X
Computer Y
172.16.1.100-08-74-32-24-89
Who has 172.16.1.2
ARP’s Reply Packet
Computer A
Computer B
Computer X
Computer Y
172.16.1.100-08-74-32-24-89
172.16.1.200-08-74-21-20-D7
172.16.1.200-08-74-21-20-D7
Routed (Sub-)Networks
Router
Packet Size Matters!!!
Packet Size=
7000 bytes7000
Network-specific MTU*
*Maximum Transfer Unit
Fragmentation One technique - limit datagram size to smallest MTU
of any network
However: This approach requires knowledge about all networks involved in communication
IP uses fragmentation - datagrams can be split into
pieces to fit in network with small MTU Router detects datagram larger than network MTU
Splits into pieces Each piece smaller than outbound network MTU
Fragmentation (details) Each fragment is an independent datagram
Includes all header fields Bit in header indicates datagram is a fragment Other fields have information for reconstructing original datagram FRAGMENT OFFSET gives original location of fragment
Router uses local MTU to compute size of each fragment Puts part of data from original datagram in each fragment Puts other information into header
Fields for Fragmentation
Version HLength Type of Service Total Length
Header Checksum
Identification
(Next) ProtocolTime-to-Live
Flags Fragment Offset
Source Address
Destination Address
IP Options
Data
Payload up to 65,535 bytes
0 31
Ethernet to Tokenring
Tokenring to Ethernet
Tokenring to Ethernet
Each network has a Maximum Transmission Unit (MTU) IP datagrams can be larger than most hardware MTUs
IP: 216 - 1 Ethernet: 1500 Token ring: 2048 or 4096
Strategy fragment when necessary (Datagram > MTU) try to avoid fragmentation at source host re-fragmentation is possible fragments are self-contained datagrams delay reassembly until destination host do not recover from lost fragments
Fragmentation & Reassembly
IP may drop fragment What happens to original datagram?
Destination drops entire original datagram How does destination identify lost fragment?
Sets timer with each fragment If timer expires before all fragments arrive, fragment
assumed lost Datagram dropped
Source (application layer protocol) assumed to retransmit
Best Effort Delivery
Fragment Loss
Internet Protocol: Domain Name System
If there are only a few hosts, then working with IP addresses is fine but with more and more hosts that came online, it becomes unwieldly.
The first solution is a simple text file maintained by the Network Information Center that mapped names to IP addresses.
But soon this text file became so large that it was too cumbersome to manage.
So, in 1983, University of Wisconsin created the Domain Name System which maps a hostname to an IP address automatically.
Uniform Resource Locators When you use the Web or send an email message, you use a
domain name to do it. For example, the URL http://www.howstuffworks.com contains the domain name howstuffworks.com. So does the email address: [email protected].
Everytime we use a domain name, we use the Internet’s DNS servers to translate the human-readable domain name into the machine-readable IP address.
Top-level domain names include .com, .org, .net, .edu, .gov. Within every top-level domain, there is a huge list of 2nd-level domains. For example, in the .com 1st-level domain, there is Yahoo Microsoft AmazonEvery name in the .com top-level domain must be unique.
Internet Naming Hierarchy
.com .net .ie .uk
.co.tcd
www
.ac
The silent dot at theend of all addresses
How to find www.cse.lehigh.edu?
www
Name server in Berkeley, CA
1. Ask top-level server for edu-server
2. Ask .edu server for lehigh-server
3. Ask .lehigh server for cse-server
4. Ask .cse server for “www” machine
cse
lehigh
edu
Domains
DNS server
134.226.32.57
DNS DNS servers accept requests from programs, and
other name servers, to convert domain names into IP addresses. When a request comes in, the DNS server can do one of the 4 things with it: It can answer the request with an IP address because it
already knows the IP address for the requested domain. It can contact another DNS server and try to find the IP
address for the name requested. It may have to do this multiple times
It can say, “I don’t know the IP address for the domain but here’s the IP address for a DNS server that knows more than I do”
It can return an error message because the requested domain name is invalid or does not exist.
Name agent (Resolver) Interface with the local user programs Identifies objects based on symbolic names
Name server Converts symbolic names to addresses Queries other name servers if the name is
unknown
Name Server Architecture
Name agent
Name server
Name server
Name serverName server
Recursive Name Server
Iterative Name Server
Name agent
Name server
Name serverName server
Name server
Transitive Name Server
Name agent
Name serverName server
Name serverName server
Domain Name System (DNS)
Name server Serves a hierarchical name space Maps names to addresses Stores auxiliary information
Authoritative name serverMail exchangerRound robin (load balancing)
Putting it together
Router in CSE
Router in Lehigh
Router at AT&T
Router in New York
Router at Berkeley
Computer B
www
knows that 134.226.0.0is routed into direction of east coast
has an agreement with AT&T
has an agreement with Lehigh
knows aboutthe Dept. ofComp. Sc.
knows that 134.226.32.57 is onthe local ethernet and uses ARPto get its ethernet address
uses a DNS query to find out what IP addresswww.cse.lehigh.edu has
Berkeley DNS
replies with134.226.36.57
cse.lehigh.edu
Computer B in Berkeley, CS wants to find a web page at “www.cse.lehigh.edu”
Best-Effort Delivery
D2
D1
• Transfer of datagrams D1 & D2• Possible deliveries:
D2 D1
D1 D2
D1
D2
nothing
How Routers Work Assume that there is a small company with 10 employees, each
with a computer. 4 of the employees are animators, while the rest are in sales, accounting and management.
The animators send many very large files back and forth to one another. To do this, they will need a network.
When one animator sends a file to another, every one sees the traffic if the network used is Ethernet. Each computer checks to see if the packet is meant for its address. But since the file is big, this makes the network run very slowly for other users.
So, to keep the animators’ work from interfering with others, the company sets up 2 separate networks, one for the animators and one for the rest of the company. A router links the two networks and connects both networks to the Internet.
How Routers Work - contd Router is the only device that sees every message sent by any
computer. When the animator sends a huge file to another animator, the
router looks at the recipient’s address and keeps the traffic on the animators’ network.
When the animator sends a message to the bookeeper, the router sees the recipient’s address and forwards the message between the two networks.
One of the tools a router uses to decide where to forward a packet is a configuration table. Such a table contains the following information Information on which connections lead to particular groups of
addresses Priorities for connections to be used Rules for handling both routine and special cases of traffic.
How Routers Work – contd A router has 2 separate but related tasks
It ensures that information does not go where it is not needed.
It makes sure that information does make it to the intended destination.
As the number of networks attached to one another grows, the configuration table for handling traffic among them grows, and the processing power of the router is increased.
Recall that the Internet is a packet-switched network which means each packet may take a different route to reach its destination. Each packet contains a header that tells its source and destination address.
Routing packets: An example Consider a medium-sized router in a company’s office network
with 50 computers and devices and the Internet. The office network connects to the router through an Ethernet
connection (e.g. 100 base-T connection meaning 100 Mbps). There are 2 connections between the router and the ISP. One is
a T1 connection (1.5 Mbps) and the other is an ISDN line (128 Kbps).
The configuration table tells it that all out-bound packets are to use the T1 line, unless it is not available. If T1 is not available, then the ISDN line will be used.
The router also has rules limiting how computers from outside the network can connect to computers inside the network and how the office network appears to the outside world, and other security functions.
Routing packets: An example One of the crucial tasks for any router is knowing when a packet
of information stays on the local network. For this, a router uses a mechanism called a subnet mask. The subnet mask looks like an IP address but usually reads
“255.255.255.0”. This tells the router all the messages with the sender and receiver having an address sharing the 1st 3 groups of numbers are on the same network, and shouldn’t be sent to another network.
Here is an example: The computer at address 15.57.31.40 sends a request to the computer at 15.57.31.52. The router, which sees all packets, matches the 1st 3 groups in the address of both sender and receiver (15.57.31), and keeps the packet on the local network.
Service Model Connectionless (datagram-based) Best-effort delivery (unreliable service)
packets may be lost packets may be delivered out of order duplicate copies of a packet may be delivered packets can be delayed for a long time
Datagram format Version HLen TOS Length
Ident Flags Offset
TTL Protocol Checksum
SourceAddr
DestinationAddr
Options (variable) Pad(variable)
0 4 8 16 19 31
Data
Basics of Routing Algorithms Routers use routing algorithms to find the best route to
the destination. What does it mean by best route?
Based on some metrics e.g. the number of hops, time delay and communication cost of packet transmission
Two categories of routing algorithms Global routing algorithms
Every router has complete info about all other routers in the network and the traffic status of the network
Sometimes known as link state (LS) algorithms. Decentralized routing algorithms
Every router has information about the routers it is directly connected to.
Sometimes known as distance vector (DV) algorithms.
LS Algorithms Every router follows the following steps
Identify the routers that are physically connected to them and get their IP addresses. When a router starts working, it first sends a “hello” packet over network. Each router that receives the packet replies with a message that contains its IP address
Measure the delay time for neighbor routers. In order to do that, routers send echo packets over the network. Every router that receives these packets replies with an echo reply packet. By dividing round trip time by 2, routers can count the delay time. This time includes both transmission and processing times – the time it takes the packets to reach the destination and the time it takes the receiver to process it and reply.
Broadcast its information over the network for other routers and receive the other routers’ information.
Using an appropriate algorithm, identify the best router between 2 nodes of the network. A well known algorithm is called the Dijkstra shortest path algorithm. In this algorithm, a router based on the information that has been collected from other routers, builds a graph of the network. This graph shows the location of routers in the network and their links to each other. Every link is labeled with a number called the weight or cost. This number is a function of delay time, average traffic, and sometimes simply the number of hops between nodes. The router chooses the link with the lowest weight.
Example: Dijkstra Algorithm
Let’s try to find the best route between A and E. There are 6 possible routes between A and E
(ABE,ACE,ABDE,ACDE,ABDCE,ACDBE). Let start from A.
Example: Dijkstra Algorithm
A know that it is directly linked to (B,C). Since B has less weight, it has been chosen as the next hop.
The status record set of tentative nodes that have a direct link to B is (D,E). Since D has less weight, it has been chosen as the next hop.
Example: Dijkstra Algorithm At D, there is only one node E but it is also the destination. So, we find the route ABDE.
In general, each router builds a graph of the network and identifies the source and destination nodes.
Then, it builds a matrix, called the adjacency matrix. In this matrix, a coordinate indicates weight. For example, [I,j] is the weight of a link between Vi and Vj. If there is no direct link between Vi and Vj, then the weight is set to infinity
The router builds a status record set for every node on the network. The record contains 3 fields: predecessor, length, label field. The length shows the sum of the weights from the source to that node. The label shows the status of node – whether it is permanent or tentative.
The router initializes the parameters of the status record set (for all nodes) and sets their length to infinity and label to tentative.
If a node is picked as the next hop,its label is changed to permanent and update its length. The router updates the status record set for all tentative nodes that are directly linked to the next
hop once this next hop node is identified and repeat the procedure until the destination node is reached.
Distance Vector (DV) algorithms
DV algorithms are also known as Bellman-Ford routing algorithms and Ford-Fulkerson routing algorithms.
Every router has a routing table that shows the best route for any destination. A typical graph and routing table for router J is shown in the next slide.
DV algorithms
Destination Weight Line
A 8 A
B 20 A
C 28 I
D 20 H
E 17 I
F 30 I
G 18 H
H 12 H
I 10 I
J 0 ---
K 6 K
L 15 K
Routing Table forRouter J
DV algorithms – contd
In DV algorithms, each router has to follow these steps It counts the weight of the links directly connected
to it and saves the information to its table In a specific period of time, it sends its table to its
neighbor routers and receive the routing table of each of its neighbors
Based on the info in its neighbors’ routing tables, it updates its own.
Counting to Infinity Problem
Imagine that the link between A and B is cut. B corrects its table. After a specific amount of time, routers
exchange their tables, and so B receives C’s routing table. Since C doesn’t know what has happened to the link between A
and B, it says that it has a link to A with the weight of 2. B thinks that there is a separate link between C and A, so it corrects its table and changes infinity to 3.
A B C D
A 0,- 1,A 2,B 3,C
B 1,B 0,- 2,C 3,D
C 2,B 1,C 0,- 1,C
D 3,B 2,C 1,D 0,-
Counting to Infinity Problem – contd
Once again the routers exchange their tables. When C receives B’s routing table, it sees that B has changed the weight of its link to A from 1 to 3, so C updates its table and change the weight of the link to A to 4.
This process loops until all nodes find out that the weight of link to A is infinity. This situation shows that DV algorithms have a slow convergence rate.
One way to solve this problem is for routers to send information only to neighbors that are not exclusive links to the destination. For example, C shouldn’t send any information to B about A because B is the only way to A.
Hierarchical Routing
When the network size grows, the number of routers in the network increases.
The size of routing tables increases as well, and routers can’t handle network traffic efficiently.
Thus, hierarchical routing is used to overcome this scaling problem.
Let us see an example
Hierarchical Routing Example
If we use DV algorithms to find best routes between nodes, each node has to save a routing table with 17
records.
Hierarchical Routing Example
Destination Line Weight
A --- ---
B B 1
C C 1
D B 2
E B 3
F B 3
G B 4
H B 5
I C 5
J C 6
K C 5
L C 4
M C 4
N C 3
O C 4
P C 2
Q C 3
Node A’s routingtable
Hierarchical Routing Example In hierarchical routing, routes are classified in groups known as
regions. Each router has only the information about the routers in its own region and has no information about routers in other regions.
So routers just save one record in their table for every other region. In the example before, we have 5 regions.
Hierarchical Routing
Hierarchical Routing
Destination Line Weight
A --- ---
B B 1
C C 1
Region 2 B 2
Region 3 C 2
Region 4 C 3
Region 5 C 4
Routing TableAt Node A
References
How Internet Infrastructure works at www.howstuffwork.com
How routing algorithms work at www.howstuffwork.com
How routers work at www.howstuffwork.com