how internet works emc 165 computer and communication networks feb 3, 2004

How Internet Works

EMC 165 Computer and Communication Networks

Feb 3, 2004

Outline

How Internet Instrastructure Works How Routers Work How TCP/IP networks work How Routing Algorithms Work How NAT works

What is the Internet?

It is a global collection of networks, both big and small.

Recall in Lecture 2, we mentioned that one of the greatest things about the Internet is that nobody really owns it.

These networks connect together in many different ways to form the single entity that we know as the Internet. In fact, the very name comes from this idea of interconnected networks.

The Internet Concept

The Internet Concept (Cont’d)

Internet: Network of Networks

Every computer that is connected to the Internet is part of a network, even the one in your home.

For example, you may use a modem and dial a local number to connect to an Internet Service Provider (ISP).

At school/work, you may be part of a local area network (LAN), but you most likely still connect to the Internet using an ISP that your school/company has contracted with.

Internet: Network of Networks (Cont’d)

When you connect to your ISP, you become part of their network.

The ISP may then connect to a larger network and become part of their network.

The Internet is simply a network of networks.

Connecting Network of Networks

The amazing thing here is that there is no overall controlling network.

Instead, there are several high-level networks connecting to each other through Network Access Points or NAPs.

All the networks that make up the Internet rely on NAPs, backbones and routers to talk to each other.

History of Internet 1962, Paul Baran of the RAND Corporation was commissioned by the

US Air Force to do a study on how it could maintain its command and control over its missiles and bombers, after a nuclear attack.

Baran’s final proposal was a packet switched network. 1968, Advanced Research Project Agency (ARPA) awarded the

APRPANET contract to BBN. The physical network was constructed in 1969, linking 4 nodes: UCLA, SRI (Stanford), UCSB, University of Utah via 50 Kbps circuits.

The 1st email program was created by Ray Tomlison of BBN in 1972. ARPA was later renamed the Defense Advanced Research Projects

Agency (DARPA) in 1972. In 1973, developments began on the protocol later to be called TCP/IP

by a group headed by Vinton Cerf from Stanford and Bob Kahn from DARPA.

The term Internet was coined by Vint Cerf and Bob Kahn in their paper on TCP in 1974

History of Internet - contd

Dr R. Metcalfe developed Ethernet in 1976, which allowed coaxiable cable to move data extremely fast.

Dept of Defense begain experimenting with the TCP/IP protocol in 1976 and soon decided to require it for use on ARPANET.

Total number of hosts on the backbones in 1976: 111+


National Science Foundation (NSF) created the 1st high-speed backbone in 1987 called the NSFNET.

NSFNET is a T1 line that connected 170 smaller networks together and operated at 1.544 Mbps.

IBM, MCI, and Merit worked with NSF to create the backbone and developed a T3 (45 Mbps) backbone the following year

Total number of hosts in the Internet: 56,000 in 1988. In 1990, this number has jumped up to 313,000

In 1992, World-Wide Web was released by CERN. NSFNET backbone completely upgraded to T3.

Total number of hosts in the Internet in 1992 – 1.136 millions


In 1994, ATM (145 Mbps) backbone is installed on NSFNET.

Total number of hosts has increased to 3.864 millions in 1994

Most Internet traffic is carried by backbones of independent ISPs including MCI, AT&T, Sprint, UUNet, BBN planet etc.

The total number of hosts in 1999 was around 15 millions and growing rapidly.

Backbones

Backbones are typically fiber optic trunk lines The trunk line has multiple fiber optic cables

combined together to increase the capacity. Fiber optic cables are designated OC for

optical carrier such as OC-3, OC-12 or OC-48.

An OC-3 line is capable of transmitting 155 Mbps while an OC-48 can transmit 2,488 Mbps (2.488 Gbps).

Logical addresses Every piece of equipment that connects to a network

has a physical address. This is an address unique to the piece of equipment.

The physical address is also called the Medium Access Control (MAC) address. It has 2 parts each 3 bytes long. The 1st 3 bytes identify the company that made the Network Interface Card (NIC), and the 2nd 3 bytes are the serial number of the NIC itself.

The interesting thing to note is a computer can have several logical addresses at the same time.

Logical addresses like IP address are assigned statically or dynamically.

Internet Protocol: IP addresses Every machine on the Internet has a unique underlying number, called

an IP address. The IP stands for Internet Protocol which is the language that

computers use to communicate over the Internet. A protocol is a pre-defined way that someone who wants to use a

service talks with that service. That someone could be a person, but more often it is a computer program like a Web-browser.

A typical IP address looks like this 216.27.61.137The four numbers in an IP address are called octets, because they each have

eight bits.Each octet can contain any value between zero and 255.So, combining 4 octets give us 232 possible unique values.

Certain values are restricted from use as typical IP addresses e.g. 0.0.0.0 is reserved for the default network and 255.255.255.255 is reserved for broadcasts

How TCP/IP network works.

IPv4 Header

Version HLength Type of Service Total Length

Header Checksum

Identification

(Next) ProtocolTime-to-Live

Flags Fragment Offset

Source Address

Destination Address

IP Options

Data

Payload up to 65,535 bytes

0 31

IP Addresses - Motivation Key aspect of a virtual network is a single, uniform

address format Can't use hardware addresses because different

technologies have different address formats Format must be independent of any particular

hardware address format

Sending host puts destination internet address in packet

Destination address can be interpreted by any intermediate router Routers examine address and forward packet on

to the destination

Properties 32-bit number globally unique (with a few exceptions!) hierarchical: network + host Classes of addresses for specific types of networks

Classfull Addresses

Generally assigned by authorities

except from: A-class net: 10.0.0.0 B-class net 172.16.0.0 C-class net 192.16.8.0

Some college have a B-class net e.g.134.226.0.0 Can arrange for Dept. of Comp. Science. to have a

number of subnets in this domain e.g. 134.226.32.0, 134.226.51.0

Classfull Addresses

Summary

Virtual network needs uniform addressing scheme, independent of hardware

IP address is a 32-bit address IP address is composed of a network address

and a host address Network addresses are divided into classes e.g.

A, B and C Dotted decimal notation is a standard format for

Internet addresses: 134.226.32.57

IP Address & Ethernet Address

Computer A

Computer B

Computer X

Computer Y

172.16.1.100-08-74-32-24-89

IP Address

MAC Address

Address Resolution Protocol (ARP)

Computer A

Computer B

Computer X

Computer Y

172.16.1.100-08-74-32-24-89

172.16.1.2MAC ADDRESS???

forComputer B

ARP’s “Who has…?” Packet

Computer A

Computer B

Computer X

Computer Y

172.16.1.100-08-74-32-24-89

Who has 172.16.1.2

ARP’s Reply Packet

Computer A

Computer B

Computer X

Computer Y

172.16.1.100-08-74-32-24-89

172.16.1.200-08-74-21-20-D7

172.16.1.200-08-74-21-20-D7

Routed (Sub-)Networks

Router

Packet Size Matters!!!

Packet Size=

7000 bytes7000

Network-specific MTU*

*Maximum Transfer Unit

Fragmentation One technique - limit datagram size to smallest MTU

of any network

However: This approach requires knowledge about all networks involved in communication

IP uses fragmentation - datagrams can be split into

pieces to fit in network with small MTU Router detects datagram larger than network MTU

Splits into pieces Each piece smaller than outbound network MTU

Fragmentation (details) Each fragment is an independent datagram

Includes all header fields Bit in header indicates datagram is a fragment Other fields have information for reconstructing original datagram FRAGMENT OFFSET gives original location of fragment

Router uses local MTU to compute size of each fragment Puts part of data from original datagram in each fragment Puts other information into header

Fields for Fragmentation

Version HLength Type of Service Total Length

Header Checksum

Identification

(Next) ProtocolTime-to-Live

Flags Fragment Offset

Source Address

Destination Address

IP Options

Data

Payload up to 65,535 bytes

0 31

Ethernet to Tokenring

Tokenring to Ethernet

Each network has a Maximum Transmission Unit (MTU) IP datagrams can be larger than most hardware MTUs

IP: 216 - 1 Ethernet: 1500 Token ring: 2048 or 4096

Strategy fragment when necessary (Datagram > MTU) try to avoid fragmentation at source host re-fragmentation is possible fragments are self-contained datagrams delay reassembly until destination host do not recover from lost fragments

Fragmentation & Reassembly

IP may drop fragment What happens to original datagram?

Destination drops entire original datagram How does destination identify lost fragment?

Sets timer with each fragment If timer expires before all fragments arrive, fragment

assumed lost Datagram dropped

Source (application layer protocol) assumed to retransmit

Best Effort Delivery

Fragment Loss

Internet Protocol: Domain Name System

If there are only a few hosts, then working with IP addresses is fine but with more and more hosts that came online, it becomes unwieldly.

The first solution is a simple text file maintained by the Network Information Center that mapped names to IP addresses.

But soon this text file became so large that it was too cumbersome to manage.

So, in 1983, University of Wisconsin created the Domain Name System which maps a hostname to an IP address automatically.

Uniform Resource Locators When you use the Web or send an email message, you use a

domain name to do it. For example, the URL http://www.howstuffworks.com contains the domain name howstuffworks.com. So does the email address: [email protected].

Everytime we use a domain name, we use the Internet’s DNS servers to translate the human-readable domain name into the machine-readable IP address.

Top-level domain names include .com, .org, .net, .edu, .gov. Within every top-level domain, there is a huge list of 2nd-level domains. For example, in the .com 1st-level domain, there is Yahoo Microsoft AmazonEvery name in the .com top-level domain must be unique.

Internet Naming Hierarchy

.com .net .ie .uk

.co.tcd

www

.ac

The silent dot at theend of all addresses

How to find www.cse.lehigh.edu?

www

Name server in Berkeley, CA

1. Ask top-level server for edu-server

2. Ask .edu server for lehigh-server

3. Ask .lehigh server for cse-server

4. Ask .cse server for “www” machine

cse

lehigh

edu

Domains

DNS server

134.226.32.57

DNS DNS servers accept requests from programs, and

other name servers, to convert domain names into IP addresses. When a request comes in, the DNS server can do one of the 4 things with it: It can answer the request with an IP address because it

already knows the IP address for the requested domain. It can contact another DNS server and try to find the IP

address for the name requested. It may have to do this multiple times

It can say, “I don’t know the IP address for the domain but here’s the IP address for a DNS server that knows more than I do”

It can return an error message because the requested domain name is invalid or does not exist.

Name agent (Resolver) Interface with the local user programs Identifies objects based on symbolic names

Name server Converts symbolic names to addresses Queries other name servers if the name is

unknown

Name Server Architecture

Name agent

Name server

Name server

Name serverName server

Recursive Name Server

Iterative Name Server

Name agent

Name server


Name server

Transitive Name Server

Name agent



Domain Name System (DNS)

Name server Serves a hierarchical name space Maps names to addresses Stores auxiliary information

Authoritative name serverMail exchangerRound robin (load balancing)

Putting it together

Router in CSE

Router in Lehigh

Router at AT&T

Router in New York

Router at Berkeley

Computer B

www

knows that 134.226.0.0is routed into direction of east coast

has an agreement with AT&T

has an agreement with Lehigh

knows aboutthe Dept. ofComp. Sc.

knows that 134.226.32.57 is onthe local ethernet and uses ARPto get its ethernet address

uses a DNS query to find out what IP addresswww.cse.lehigh.edu has

Berkeley DNS

replies with134.226.36.57

cse.lehigh.edu

Computer B in Berkeley, CS wants to find a web page at “www.cse.lehigh.edu”

Best-Effort Delivery

D2

D1

• Transfer of datagrams D1 & D2• Possible deliveries:

D2 D1

D1 D2

D1

D2

nothing

How Routers Work Assume that there is a small company with 10 employees, each

with a computer. 4 of the employees are animators, while the rest are in sales, accounting and management.

The animators send many very large files back and forth to one another. To do this, they will need a network.

When one animator sends a file to another, every one sees the traffic if the network used is Ethernet. Each computer checks to see if the packet is meant for its address. But since the file is big, this makes the network run very slowly for other users.

So, to keep the animators’ work from interfering with others, the company sets up 2 separate networks, one for the animators and one for the rest of the company. A router links the two networks and connects both networks to the Internet.

How Routers Work - contd Router is the only device that sees every message sent by any

computer. When the animator sends a huge file to another animator, the

router looks at the recipient’s address and keeps the traffic on the animators’ network.

When the animator sends a message to the bookeeper, the router sees the recipient’s address and forwards the message between the two networks.

One of the tools a router uses to decide where to forward a packet is a configuration table. Such a table contains the following information Information on which connections lead to particular groups of

addresses Priorities for connections to be used Rules for handling both routine and special cases of traffic.

How Routers Work – contd A router has 2 separate but related tasks

It ensures that information does not go where it is not needed.

It makes sure that information does make it to the intended destination.

As the number of networks attached to one another grows, the configuration table for handling traffic among them grows, and the processing power of the router is increased.

Recall that the Internet is a packet-switched network which means each packet may take a different route to reach its destination. Each packet contains a header that tells its source and destination address.

Routing packets: An example Consider a medium-sized router in a company’s office network

with 50 computers and devices and the Internet. The office network connects to the router through an Ethernet

connection (e.g. 100 base-T connection meaning 100 Mbps). There are 2 connections between the router and the ISP. One is

a T1 connection (1.5 Mbps) and the other is an ISDN line (128 Kbps).

The configuration table tells it that all out-bound packets are to use the T1 line, unless it is not available. If T1 is not available, then the ISDN line will be used.

The router also has rules limiting how computers from outside the network can connect to computers inside the network and how the office network appears to the outside world, and other security functions.

Routing packets: An example One of the crucial tasks for any router is knowing when a packet

of information stays on the local network. For this, a router uses a mechanism called a subnet mask. The subnet mask looks like an IP address but usually reads

“255.255.255.0”. This tells the router all the messages with the sender and receiver having an address sharing the 1st 3 groups of numbers are on the same network, and shouldn’t be sent to another network.

Here is an example: The computer at address 15.57.31.40 sends a request to the computer at 15.57.31.52. The router, which sees all packets, matches the 1st 3 groups in the address of both sender and receiver (15.57.31), and keeps the packet on the local network.

Service Model Connectionless (datagram-based) Best-effort delivery (unreliable service)

packets may be lost packets may be delivered out of order duplicate copies of a packet may be delivered packets can be delayed for a long time

Datagram format Version HLen TOS Length

Ident Flags Offset

TTL Protocol Checksum

SourceAddr

DestinationAddr

Options (variable) Pad(variable)

0 4 8 16 19 31

Data

Basics of Routing Algorithms Routers use routing algorithms to find the best route to

the destination. What does it mean by best route?

Based on some metrics e.g. the number of hops, time delay and communication cost of packet transmission

Two categories of routing algorithms Global routing algorithms

Every router has complete info about all other routers in the network and the traffic status of the network

Sometimes known as link state (LS) algorithms. Decentralized routing algorithms

Every router has information about the routers it is directly connected to.

Sometimes known as distance vector (DV) algorithms.

LS Algorithms Every router follows the following steps

Identify the routers that are physically connected to them and get their IP addresses. When a router starts working, it first sends a “hello” packet over network. Each router that receives the packet replies with a message that contains its IP address

Measure the delay time for neighbor routers. In order to do that, routers send echo packets over the network. Every router that receives these packets replies with an echo reply packet. By dividing round trip time by 2, routers can count the delay time. This time includes both transmission and processing times – the time it takes the packets to reach the destination and the time it takes the receiver to process it and reply.

Broadcast its information over the network for other routers and receive the other routers’ information.

Using an appropriate algorithm, identify the best router between 2 nodes of the network. A well known algorithm is called the Dijkstra shortest path algorithm. In this algorithm, a router based on the information that has been collected from other routers, builds a graph of the network. This graph shows the location of routers in the network and their links to each other. Every link is labeled with a number called the weight or cost. This number is a function of delay time, average traffic, and sometimes simply the number of hops between nodes. The router chooses the link with the lowest weight.

Example: Dijkstra Algorithm

Let’s try to find the best route between A and E. There are 6 possible routes between A and E

(ABE,ACE,ABDE,ACDE,ABDCE,ACDBE). Let start from A.

Example: Dijkstra Algorithm

A know that it is directly linked to (B,C). Since B has less weight, it has been chosen as the next hop.

The status record set of tentative nodes that have a direct link to B is (D,E). Since D has less weight, it has been chosen as the next hop.

Example: Dijkstra Algorithm At D, there is only one node E but it is also the destination. So, we find the route ABDE.

In general, each router builds a graph of the network and identifies the source and destination nodes.

Then, it builds a matrix, called the adjacency matrix. In this matrix, a coordinate indicates weight. For example, [I,j] is the weight of a link between Vi and Vj. If there is no direct link between Vi and Vj, then the weight is set to infinity

The router builds a status record set for every node on the network. The record contains 3 fields: predecessor, length, label field. The length shows the sum of the weights from the source to that node. The label shows the status of node – whether it is permanent or tentative.

The router initializes the parameters of the status record set (for all nodes) and sets their length to infinity and label to tentative.

If a node is picked as the next hop,its label is changed to permanent and update its length. The router updates the status record set for all tentative nodes that are directly linked to the next

hop once this next hop node is identified and repeat the procedure until the destination node is reached.

Distance Vector (DV) algorithms

DV algorithms are also known as Bellman-Ford routing algorithms and Ford-Fulkerson routing algorithms.

Every router has a routing table that shows the best route for any destination. A typical graph and routing table for router J is shown in the next slide.

DV algorithms

Destination Weight Line

A 8 A

B 20 A

C 28 I

D 20 H

E 17 I

F 30 I

G 18 H

H 12 H

I 10 I

J 0 ---

K 6 K

L 15 K

Routing Table forRouter J

DV algorithms – contd

In DV algorithms, each router has to follow these steps It counts the weight of the links directly connected

to it and saves the information to its table In a specific period of time, it sends its table to its

neighbor routers and receive the routing table of each of its neighbors

Based on the info in its neighbors’ routing tables, it updates its own.

Counting to Infinity Problem

Imagine that the link between A and B is cut. B corrects its table. After a specific amount of time, routers

exchange their tables, and so B receives C’s routing table. Since C doesn’t know what has happened to the link between A

and B, it says that it has a link to A with the weight of 2. B thinks that there is a separate link between C and A, so it corrects its table and changes infinity to 3.

A B C D

A 0,- 1,A 2,B 3,C

B 1,B 0,- 2,C 3,D

C 2,B 1,C 0,- 1,C

D 3,B 2,C 1,D 0,-

Counting to Infinity Problem – contd

Once again the routers exchange their tables. When C receives B’s routing table, it sees that B has changed the weight of its link to A from 1 to 3, so C updates its table and change the weight of the link to A to 4.

This process loops until all nodes find out that the weight of link to A is infinity. This situation shows that DV algorithms have a slow convergence rate.

One way to solve this problem is for routers to send information only to neighbors that are not exclusive links to the destination. For example, C shouldn’t send any information to B about A because B is the only way to A.

Hierarchical Routing

When the network size grows, the number of routers in the network increases.

The size of routing tables increases as well, and routers can’t handle network traffic efficiently.

Thus, hierarchical routing is used to overcome this scaling problem.

Let us see an example

Hierarchical Routing Example

If we use DV algorithms to find best routes between nodes, each node has to save a routing table with 17

records.

Hierarchical Routing Example

Destination Line Weight

A --- ---

B B 1

C C 1

D B 2

E B 3

F B 3

G B 4

H B 5

I C 5

J C 6

K C 5

L C 4

M C 4

N C 3

O C 4

P C 2

Q C 3

Node A’s routingtable

Hierarchical Routing Example In hierarchical routing, routes are classified in groups known as

regions. Each router has only the information about the routers in its own region and has no information about routers in other regions.

So routers just save one record in their table for every other region. In the example before, we have 5 regions.


Destination Line Weight

A --- ---

B B 1

C C 1

Region 2 B 2

Region 3 C 2

Region 4 C 3

Region 5 C 4

Routing TableAt Node A

References

How Internet Infrastructure works at www.howstuffwork.com

How routing algorithms work at www.howstuffwork.com

How routers work at www.howstuffwork.com

how internet works emc 165 computer and communication networks feb 3, 2004

Documents

internet worksemc

term internet

network of networks

larger network

physical network

internet service provider

smaller networks

highlevel networks