Network Basics and VPN, Edward Bosworth
Anonymity
Published in the New Yorker Magazine, July 5, 1993
An Overview of the Internet
The global Internet is best seen as a mechanism that allows computers to communicate.
The Internet is a collection of interconnected networks, each with its own protocol.
It provides the illusion of a single network, but has considerable internal structure.
For our purposes, we shall divide networks into two general classes:
1. Local Area Networks (LAN)
2. Wide Area Networks (WAN)
What Is a Router?
Basically, a router is a computer that is connected to at least two different networks and
passes information between those networks. Naturally, there is more to it than this.
Some routers connect hosts on a local area network to the global Internet.
Some routers connect only to other routers; these are definitely in the network core.
An Example of Early Network Routers
The basic function of a router is to accept a message, determine its destination
address, and forward the message to the router appropriate for that address.
Here is a picture of an early “torn tape switch”. The center had a large number of tape
readers and punches, each connected to another routing center.
Incoming messages would be punched onto paper tape. The human would tear the tape
from the punch on the incoming line, determine the address, and insert it into the tape
reader appropriate for the next routing center in its path to its destination.
Another Example of Early Network Routers
The transmitter–distributor (TD) bank in the Canadian Communications Centre.
In the top left corner are Telex machines with rotary dials which were used to connect to
other Telex machines overseas. When calling New Delhi, India as an example, it would
take minutes to get an answerback – a confirmation that the line was still connected and
assurance that the message got through.
Simple Graph Theory
Graph theory is one of the subjects studied by mathematicians. There is a formal
mathematical graph theory, based on set theory, but we ignore that.
Definition: A graph G is a finite non-empty set of objects called vertices together with
a (possibly empty) finite set of unordered pairs of distinct vertices of G called edges.
The vertex set of G is commonly denoted by V(G), and
the edge set commonly denoted by E(G).
An (n, m)-graph G is a graph with n vertices and m edges; |V(G)| = n and |E(G)| = m.
Any communication network, such as the global Internet, can be represented by
mathematical graph theory. In this model
the communicating nodes (hosts or routers) are considered to be vertices
the links between the nodes are considered to be edges.
Many of the definitions in graph theory translate naturally to concepts in network theory.
In particular, we are interested in connectivity, which considers the effect of the loss
of either communicating nodes (hosts or routers) or the links between nodes.
We want to design networks that are robust in the event of device failure.
Graph theory helps us in this design.
Network Topologies and Graph Theory
Network theorists use the term topology to refer to the structure of a network.
Mathematicians find that graph theory is well suited to a description of the network
structure. We can say that network topology is well expressed by graph theory.
Often the only difference between “network topology” and “graph theory”
is the terminology used.
Consider a network built with N communicating nodes. We say that the network is
connected if any node can communicate with any other node.
Mathematical graph theory tells us that the number of links between these N nodes
must be at least (N – 1) and cannot be greater than N(N–1)/2.
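The two bounds are easy to tabulate. The short Python sketch below is only an illustration of the formulas just stated; the function name link_bounds is our own choice.

```python
def link_bounds(n):
    """Return (minimum, maximum) link counts for a connected network on n nodes."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    # Minimum: a tree on n nodes. Maximum: every pair of nodes linked directly.
    return n - 1, n * (n - 1) // 2

for n in (2, 5, 10, 100):
    low, high = link_bounds(n)
    print(f"N = {n:3d}: at least {low} links, at most {high} links")
```

Note how quickly the upper bound grows: the cost of full interconnection rises with the square of the number of nodes.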
One should note that communication links, normally being implemented with cables
or microwave links, cost money. One then addresses the issue of how to get
acceptable reliability at acceptable cost. Again, mathematical graph theory can help.
We begin by investigating a number of network topologies.
Star Topology
One common network form is the star topology, which can be drawn in at least two
ways, as is shown below. The figure on the left is the depiction most commonly used by
network theorists, while the figure on the right is the one favored by graph theorists.
In terms of graph theory, this structure is a tree. It is a connected graph without cycles.
The star topology features one central node that communicates with all other nodes and
through which all other nodes must communicate. This topology is a realization of a
specific management strategy: centralized control.
This topology may be considered the most efficient, in that it connects a number of
nodes with the least number of communication links.
Problems with the Star Topology
In terms of graph theory, it is easy to prove the following assertions:
1. If a network on N nodes is connected, it must have at least (N – 1) links.
2. A connected network with N nodes and (N – 1) links is a tree: it has no
cycles, so the loss of any one link disconnects it.
There are three significant failure modes associated with the star topology; these are
shown in the figure below. It should be obvious that a link failure isolates one node,
while a node failure may either disconnect the entire network or leave the remaining
nodes connected.
Note that the most devastating failure is that of the center node. In that case,
none of the other nodes can communicate.
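The three failure modes can be checked with a small simulation. This is a minimal sketch, assuming a star with node 0 as the center and four outer nodes; the helper names star and can_all_communicate are hypothetical.

```python
from collections import deque

def star(n_leaves):
    """Adjacency sets for a star: node 0 is the center, 1..n_leaves are the outer nodes."""
    graph = {0: set()}
    for leaf in range(1, n_leaves + 1):
        graph[0].add(leaf)
        graph[leaf] = {0}
    return graph

def can_all_communicate(graph, failed_nodes=(), failed_links=()):
    """True if every surviving node can still reach every other surviving node."""
    dead = set(failed_nodes)
    cut = {frozenset(link) for link in failed_links}
    alive = [v for v in graph if v not in dead]
    if not alive:
        return True
    # Breadth-first search from one surviving node over surviving links.
    seen = {alive[0]}
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w in dead or w in seen or frozenset((v, w)) in cut:
                continue
            seen.add(w)
            queue.append(w)
    return len(seen) == len(alive)

net = star(4)
print("no failure:        ", can_all_communicate(net))
print("link 0-3 fails:    ", can_all_communicate(net, failed_links=[(0, 3)]))
print("outer node 3 fails:", can_all_communicate(net, failed_nodes=[3]))
print("center node fails: ", can_all_communicate(net, failed_nodes=[0]))
```

A failed outer node leaves the survivors connected, while a failed link or a failed center does not.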
Questions: What is the probability of a node failure?
What is the probability of a link failure?
Bus Topology
The bus topology is commonly used because it is rather cheap and easily implemented.
Here are two depictions of the bus topology, the one on the left being favored by network
theorists and the one on the right by graph theorists. Admittedly, these two are not quite
equivalent. Nevertheless they share common vulnerability characteristics.
There are two failure modes associated with the bus topology. In each variant, a link
failure will result in the network being partitioned into non–communicating segments.
Ring Topology
One obvious way to fix the single–point failure problem of the bus topology is to connect
the ends of the bus. This gives rise to the ring topology, more resistant to single–point
failures. In graph theory terminology, this structure is a cycle.
A ring topology network is shown at left, and at right with a failed link.
Note that all five nodes can still communicate unless there is a second link failure.
On the left is C5, the cycle on five vertices. On the right is P5, a path on five vertices.
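The same point can be made by counting segments. The sketch below builds the cycle on five nodes, removes one link and then a second, and counts the groups of nodes that can still reach one another. The node numbering and the function names are assumptions made for the illustration.

```python
def ring_links(n):
    """Links of the cycle on nodes 0..n-1 (n >= 3), each link a frozenset of two nodes."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}

def count_segments(n, links):
    """Number of groups of mutually reachable nodes, found with a simple union-find."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # shorten the chain as we walk it
            v = parent[v]
        return v

    for link in links:
        a, b = tuple(link)
        parent[find(a)] = find(b)
    return len({find(v) for v in range(n)})

c5 = ring_links(5)
one_failure = c5 - {frozenset((0, 1))}            # the ring becomes a path
two_failures = one_failure - {frozenset((2, 3))}  # the path is cut in two
for name, links in (("ring", c5), ("one failure", one_failure), ("two failures", two_failures)):
    print(name, "->", count_segments(5, links), "segment(s)")
```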
Fully Interconnected: the Complete Graph
In graph theory terms, the fully interconnected network on N nodes is called
K_N, the complete graph on N vertices.
It should be apparent that the most reliable network is one in which each node has a
direct connection to every other node. As an example, we show the fully connected
network on five nodes. Note that it will take four link failures to isolate even one node.
While such a network offers the maximum reliability, it is also the most expensive.
A fully connected network on N nodes will require N(N – 1)/2 links.
Characterization of Nodes in a Network
For any node in a communications network, we are interested in the number of other
communication nodes to which this node is connected directly. This idea is expressed in
graph theory as the open neighborhood of a vertex.
For a vertex v in a graph G, define N(v), the open neighborhood of v, as the set of
vertices adjacent to v.
The degree of a vertex v in G, denoted either as dv or d(v), is the number of edges
incident on the vertex v. Since each edge incident on the vertex v causes another vertex
to be adjacent to v, we might say that the degree of the vertex is the number of vertices
adjacent to it. The two definitions are entirely equivalent.
In network terms, we consider a communicating node v. The set N(v) is the set of
communicating nodes that connect directly to the node v.
The degree of the communicating node v is the size of N(v); the number
of nodes that communicate directly with v.
A vertex (communicating node) of degree 0 is said to be an isolated vertex (node); it
has no ability to communicate. In network studies, we ignore isolated vertices.
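These definitions translate directly into code. Here is a minimal sketch, assuming a small made-up network of five nodes labeled A through E; the edge list is invented for the example.

```python
def build_graph(vertices, edges):
    """Adjacency sets for a simple undirected graph."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def open_neighborhood(graph, v):
    """N(v): the set of vertices adjacent to v."""
    return set(graph[v])

def degree(graph, v):
    """d(v): in a simple graph, the size of N(v)."""
    return len(graph[v])

nodes = ["A", "B", "C", "D", "E"]
links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
g = build_graph(nodes, links)
for v in nodes:
    print(v, "N(v) =", sorted(open_neighborhood(g, v)), "d(v) =", degree(g, v))
isolated = [v for v in nodes if degree(g, v) == 0]
print("isolated nodes:", isolated)
```

Node E has degree 0 and is therefore isolated. The degrees sum to twice the number of links, since each link is counted once at each end.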
What Is Connected to the Internet?
We consider our computers to be connected to the Internet.
Technically, it is the NIC (Network Interface Card) that is connected. This is connected
to a LAN (Local Area Network) that itself is connected to the Internet via a router.
The Network Interface Card is an Input / Output device attached to the computer.
It communicates with the computer using Direct Memory Access (DMA).
The physical network address, called the MAC (Media Access Control) address,
is a 48–bit address that identifies the NIC, not the computer.
Addressing the NIC
Each computer on a LAN is addressed using the MAC (Media Access Control) address
of its NIC (Network Interface Card).
The MAC address is a 48–bit address, written as six bytes or twelve hexadecimal digits.
We show two typical MAC addresses;
00:20:C5:00:5F:C1 often written as EagleTec_00:5F:C1
00:40:05:3C:3D:8B often written as AniCommu_3C:3D:8B
MAC address assignment is coordinated by the IEEE. Each manufacturer of NICs is
allocated blocks of 2^24 addresses (three bytes). In the above, we see that
Eagle Tec has been allocated the block of addresses beginning with 00:20:C5,
Affinity Communications has been allocated the block beginning with 00:40:05.
Remember that each block of 2^24 addresses allows a manufacturer to produce 2^24
(16,777,216) network interface cards with unique addresses.
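The split between the manufacturer block and the device part can be shown in a few lines of Python. This is a sketch only; the function name parse_mac is our own, and the two addresses are the ones shown above.

```python
def parse_mac(text):
    """Split a colon-separated MAC address into (manufacturer block, device part)."""
    octets = bytes(int(part, 16) for part in text.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly six bytes (48 bits)")
    return octets[:3], octets[3:]

BLOCK_SIZE = 2 ** 24   # addresses available in one three-byte manufacturer block

for mac in ("00:20:C5:00:5F:C1", "00:40:05:3C:3D:8B"):
    block, device = parse_mac(mac)
    print(mac, "block", block.hex(":").upper(),
          "device number", int.from_bytes(device, "big"))
print("addresses per block:", BLOCK_SIZE)
```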
It is likely that the major manufacturers have two or more blocks of MAC addresses
assigned to them.
IP Addresses and MAC Addresses
In discussing the Internet, we normally use 32–bit IP addresses and not 48–bit MAC
addresses. For the examples discussed above, we have:
MAC Address 00:20:C5:00:5F:C1 IP Address 130.57.20.10
MAC Address 00:40:05:3C:3D:8B IP Address 130.57.20.1
The IP addresses are used for communication over the global Internet.
The MAC addresses are used for communication over the LAN.
The router that connects the LAN to the global Internet handles translation between
the two forms of addresses.
As each client joins the LAN, it broadcasts its MAC address to all others on the LAN.
The router detects this broadcast, and replies with a message assigning an IP address
to the new client. The router then handles the translation.
Structure of an IP Address
Technically, we are discussing the structure of an IP–v4 (IP, version 4) address.
This is a 32–bit number, represented as four groups in a dotted decimal notation.
Consider the IP address 168.192.250.3.
Note that each of these four numbers will be in the range 0 through 255 inclusive;
in other words, it can be represented as an eight–bit binary number.
Decimal 168 = 0xA8
Decimal 192 = 0xC0
Decimal 250 = 0xFA
Decimal 3 = 0x03
So IP address 168.192.250.3 can be written as 0xA8 C0 FA 03
It can be written in binary as 1010 1000 1100 0000 1111 1010 0000 0011
It can also be written as a single decimal number 2 831 219 203
The dotted decimal notation is generally agreed to be the easiest to use.
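The conversions above can be reproduced with the ipaddress module from the Python standard library. The sketch uses the same example address.

```python
import ipaddress

addr = ipaddress.IPv4Address("168.192.250.3")
as_int = int(addr)   # the address as a single 32-bit number

print("hexadecimal:", f"0x{as_int:08X}")
print("binary:     ", " ".join(f"{byte:08b}" for byte in addr.packed))
print("decimal:    ", as_int)
print("round trip: ", ipaddress.IPv4Address(as_int))
```

All four forms name the same 32 bits; only the notation differs.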
The Domain Name System
The DNS (Domain Name System) allows us to use mnemonic hostnames in place of the
IP addresses that are required by the IP network layer.
Typical mnemonic hostnames include:
www.colstate.edu
www.csna.org
The hierarchy of the hostnames is from right to left, with the part on the right denoting
the high–level group to which the address belongs. Some high–level domains are:
.com commercial firms within the United States (usually)
.edu educational institutions
.mil organizations and branches of the U.S. Department of Defense
.org non–profit organizations
The last two parts of this address are “owned” by a specific organization.
Any hostname address ending in “ColumbusState.edu” will belong to CSU.
Examples of this are www.ColumbusState.edu, cs.ColumbusState.edu,
csc.ColumbusState.edu, lsm.colstate.edu
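The right-to-left reading of a hostname can be illustrated by splitting it at the dots. This sketch only manipulates the text of the name and performs no actual DNS lookup. The function names are hypothetical. The names are converted to lower case because DNS names are not case sensitive.

```python
def domain_hierarchy(hostname):
    """List the domains a hostname belongs to, from the high-level domain down."""
    labels = hostname.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]

def owning_domain(hostname):
    """The last two parts of the name, which belong to a specific organization."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])

for name in ("www.ColumbusState.edu", "cs.ColumbusState.edu", "www.csna.org"):
    print(name, "->", domain_hierarchy(name), "owner:", owning_domain(name))
```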
PING
Pinging www.colstate.edu [168.26.193.117] with 32 bytes of data:
Reply from 168.26.193.117: bytes=32 time<1ms TTL=127
Reply from 168.26.193.117: bytes=32 time<1ms TTL=127
Reply from 168.26.193.117: bytes=32 time<1ms TTL=127
Reply from 168.26.193.117: bytes=32 time<1ms TTL=127
Ping statistics for 168.26.193.117:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
Pinging csna.org [67.63.199.206] with 32 bytes of data:
Reply from 67.63.199.206: bytes=32 time=48ms TTL=41
Reply from 67.63.199.206: bytes=32 time=52ms TTL=41
Reply from 67.63.199.206: bytes=32 time=54ms TTL=41
Reply from 67.63.199.206: bytes=32 time=50ms TTL=41
Ping statistics for 67.63.199.206:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 48ms, Maximum = 54ms, Average = 51ms
The TCP/IP Protocol Stack
A protocol stack divides the process of communication into several logical layers. The
TCP/IP protocol calls for five logical layers. We list them from highest to lowest level.
Application: at this level, we are dealing with applications such as e–mail, web
browsing, file transfer, etc.
Logically, we have two applications directly communicating.
Transport: this layer provides communication between two applications,
probably on different computers.
This level uses IP addresses and port numbers.
Network: this layer provides communication between two computers,
probably on different Local Area Networks.
This level uses IP addresses.
Link: this layer provides communication between two computers
on the same Local Area Network.
This level uses MAC addresses.
Physical: this layer provides for actual transport of bits over either an
electrical network or an optical fiber network.
Connection–Oriented vs. Datagram Networks
The Internet is a bit unusual in that it is not a connection–oriented network.
The U.S. telephone network is the best example of a connection–oriented network.
One dials a phone number, establishes a connection, and then keeps that connection open
for the duration of the conversation. This might include a lot of time “on hold”.
Most networks break the messages and other data being passed into a number of packets,
also called “datagrams”. Each packet can be routed independently from the source to
the destination, leading to a great flexibility in the network.
At the link level, a packet is embedded in an Ethernet frame as the frame payload.
The frame carries a 48–bit MAC address for the source node and another for the
destination node, a frame type, and a CRC for error detection. For IP version 4,
the frame type is 0x0800.
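The frame layout can be sketched with the struct and zlib modules. This is a simplified illustration, not a complete Ethernet implementation: it omits the preamble and the padding of short payloads. The byte order chosen for the CRC trailer is an assumption of the sketch. The function names are our own, and the MAC addresses are the two examples used earlier.

```python
import struct
import zlib

ETHERTYPE_IPV4 = 0x0800

def mac_bytes(text):
    """Convert a colon-separated MAC address to its six bytes."""
    return bytes(int(part, 16) for part in text.split(":"))

def build_frame(dst_mac, src_mac, payload, frame_type=ETHERTYPE_IPV4):
    """Destination MAC, source MAC, frame type, payload, then a CRC-32 trailer."""
    header = struct.pack("!6s6sH", mac_bytes(dst_mac), mac_bytes(src_mac), frame_type)
    body = header + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", crc)   # trailer byte order is an assumption here

def check_frame(frame):
    """Recompute the CRC over everything except the trailer and compare."""
    body, trailer = frame[:-4], frame[-4:]
    return struct.unpack("<I", trailer)[0] == (zlib.crc32(body) & 0xFFFFFFFF)

payload = b"IP packet bytes go here"
frame = build_frame("00:40:05:3C:3D:8B", "00:20:C5:00:5F:C1", payload)
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit in transit
print("frame length:", len(frame))
print("intact frame passes check: ", check_frame(frame))
print("damaged frame passes check:", check_frame(damaged))
```

Note that the CRC detects accidental damage only. Anyone who alters the frame deliberately can simply recompute it.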
Basic Definitions
A Firewall is a device that filters all traffic between a protected or “inside”
network and a less trustworthy or “outside” network.
The firewall mediates all access to a network, allowing and disallowing
certain types of access depending on a configured security policy.
An IDS (Intrusion Detection System) is a device, typically a stand–alone
computer, that monitors activity on a network in order to detect and record
any malicious or suspicious activity. An IDS raises an alarm when specific
activities, defined by a security policy, are detected.
A Honeypot is a computer system that appears open to attackers. It is
designed to be attractive to malicious attackers. It has several uses:
1. to attract attacks that can later be analyzed to predict new attacks.
2. to lure an attacker to a place that can be used to identify and stop him.
3. to provide an “attractive but diversionary playground”, in the hope
that the attacker will overlook your production systems.
Security Goals for Firewalls
Here is the standard list of goals for computer security. These goals can
be used as design goals for firewalls.
Confidentiality
Computer–based assets are accessed only by those specifically authorized
to access these protected assets.
Integrity
These assets can be modified only by authorized parties and only in
authorized ways. It is desired to protect against both accidental and
malicious activities.
Availability
These assets are accessible to those parties authorized to use them. In particular
the data are to be available in a timely manner without undue delays in
transmission through the firewall.
Attacks on Computer Networks
Let us define the goals of any network security system, including a firewall,
in the terms above. Basically the system should make the network more secure.
A vulnerability is a weakness in a security system, for example, in procedures,
design, or implementation, that might be exploited to cause harm or data loss.
The goal for many of the scanning methods, such as port scans, is to detect
one of the system’s vulnerabilities so that it can be exploited.
A threat to a network is a set of circumstances that might cause harm or loss.
An attack is a threat applied to a vulnerability. In other words, it is the
actual exploitation of a vulnerability.
A control is an action, device, procedure or technique that removes or
reduces a vulnerability. A firewall is an example of a control.
A threat is blocked by control of a vulnerability.
Basic Firewall
A firewall provides protection by inspecting the contents of each datagram,
and acting in accord with a stated business security policy.
In filtering incoming packets, the firewall serves to protect an “inner network”
with sensitive assets, from the “wild world” of Internet malware.
In filtering outgoing packets, the firewall serves to restrict dissemination
of a company’s sensitive documents and other materials.
There are a variety of firewalls, depending on the detail in which each packet
is examined and degree of correlation of a given packet with other packets.
Routers are often fitted with firewall functionality.
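The simplest kind of firewall examines only the header fields of each packet. Here is a minimal sketch of such a packet filter. The policy, the rule format and the addresses (taken from ranges reserved for documentation) are all hypothetical, and real firewalls examine far more than the two fields used here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass(frozen=True)
class Rule:
    action: str                # "allow" or "deny"
    source: str                # source network in CIDR notation
    dest_port: Optional[int]   # None matches any destination port

# Hypothetical policy: the first matching rule wins; anything unmatched is denied.
POLICY = [
    Rule("deny", "203.0.113.0/24", None),   # an outside network we refuse entirely
    Rule("allow", "0.0.0.0/0", 80),         # web traffic from anywhere
    Rule("allow", "0.0.0.0/0", 25),         # mail traffic from anywhere
]

def decide(source_ip, dest_port, policy=POLICY):
    """Return the action of the first rule that matches the packet."""
    for rule in policy:
        in_network = ip_address(source_ip) in ip_network(rule.source)
        port_matches = rule.dest_port is None or rule.dest_port == dest_port
        if in_network and port_matches:
            return rule.action
    return "deny"

print(decide("198.51.100.7", 80))   # web request from an ordinary outside host
print(decide("203.0.113.9", 80))    # web request from the refused network
print(decide("198.51.100.7", 23))   # a port the policy never mentions
```

The final "deny" expresses a common design choice: whatever the policy does not explicitly permit is refused.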
Company Networks: DMZ and Internal
The term “DMZ”, for “Demilitarized Zone”, arose first in military usage.
For example, there is a strip of land between North Korea and South Korea,
called the DMZ, that is designed to be free from military activity.
Each company maintains two classes of network and computer assets:
1. those that are strictly internal and contain sensitive company data.
2. those that represent the “public face” of the company and contain
data that are approved for unlimited public release.
This division of assets by functionality suggests a division of the
company’s network into two parts.
The DMZ, which contains servers and other assets that should be visible
and accessible to the public. This zone might include various web servers,
a mail server, and other assets that do not hold sensitive material.
The Internal Network, which contains sensitive material and should not
be visible or accessible to the public.
Structure of the DMZ
While the DMZ is less sensitive than the Internal Network, it does
require some protection from the global Internet.
In addition, the Internal Network requires protection from the DMZ as there
is some visibility from the DMZ to the outside world.
For this reason, the DMZ solution uses two firewalls, each implementing
a distinct security policy.
Three Network Types
A public network is one that is shared by many users, representing many
organizations. The prime example of this is the global Internet.
Transmissions of packets in this network are over shared communication lines.
A private network is one that is dedicated to one user. These are designed
using private leased telephone lines, which are accessible only to the company
using the network.
A private network is, by definition, a point–to–point connection.
By definition, a private network is carried by a land line. Use of a microwave
link would make the network public, as the transmissions could be monitored.
True private networks are expensive to acquire and maintain.
A VPN (Virtual Private Network) is a “technology that allows two or more
locations to communicate securely over a public network while maintaining
the security and privacy of a private network” [page 313, Ref. 2].
A VPN offers many of the advantages of a private network at much lower cost.
Private Networks and Leased Lines
As noted above, a leased line is a point–to–point communication between two
points designated by the company leasing the telephone line.
The line can carry a number of data streams, all owned by the lessee, through a
mechanism called TDM (Time Division Multiplexing).
TDM is essentially a round–robin sharing of a single line among a number of
pairs of communicating entities; each takes its turn communicating.
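The round–robin idea can be sketched in Python. Each data stream contributes one unit per turn, and a stream with nothing to send leaves its slot idle. The function names and the sample streams are invented for the illustration.

```python
from itertools import zip_longest

def tdm_multiplex(streams, idle=None):
    """Interleave the streams round-robin: one unit from each stream per turn."""
    line = []
    for turn in zip_longest(*streams, fillvalue=idle):
        line.extend(turn)
    return line

def tdm_demultiplex(line, n_channels, idle=None):
    """Recover each stream by taking every n-th slot and dropping idle slots."""
    return [[unit for unit in line[ch::n_channels] if unit is not idle]
            for ch in range(n_channels)]

streams = [list("AAAA"), list("BB"), list("CCC")]
line = tdm_multiplex(streams)
print(line)
print(tdm_demultiplex(line, len(streams)))
```

The receiver needs no addressing information: the position of a slot on the line tells it which stream the unit belongs to.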
One early standard is called “T1”. It provides 24 channels, allowing 24 different
communications to be carried at the same time. Here are some standards:
Standard   Capacity        Data rate
T1         24 channels     1.544 Mb/sec
E1         32 channels     2.048 Mb/sec
T2         4 T1 streams    6.312 Mb/sec
T3         7 T2 streams    44.736 Mb/sec
T4         6 T3 streams    274.176 Mb/sec
Optical fiber channels have different standards, the STS series.
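The T1 and E1 rates can be checked by arithmetic. The sketch below assumes the usual voice channel format, which is not stated above: 8000 frames per second, with one 8–bit sample per channel in each frame. A T1 frame also carries one extra framing bit. For the higher levels, the sketch simply computes how much of each rate is left over after the component streams are accounted for.

```python
FRAMES_PER_SECOND = 8000   # assumption: standard voice sampling rate
BITS_PER_SAMPLE = 8        # assumption: one 8-bit sample per channel per frame

t1_bits_per_frame = 24 * BITS_PER_SAMPLE + 1    # 24 channels plus one framing bit
t1_rate = t1_bits_per_frame * FRAMES_PER_SECOND
e1_rate = 32 * BITS_PER_SAMPLE * FRAMES_PER_SECOND
print("T1:", t1_rate, "b/s   E1:", e1_rate, "b/s")

# Rates from the table, in bits per second, and how each level is built.
RATES = {"T1": 1_544_000, "T2": 6_312_000, "T3": 44_736_000, "T4": 274_176_000}
BUILT_FROM = {"T2": (4, "T1"), "T3": (7, "T2"), "T4": (6, "T3")}

overhead = {level: RATES[level] - count * RATES[lower]
            for level, (count, lower) in BUILT_FROM.items()}
for level, extra in overhead.items():
    count, lower = BUILT_FROM[level]
    print(f"{level}: {count} x {lower} plus {extra} b/s of multiplexing overhead")
```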
Characteristics of Virtual Private Networks
Virtual Private Networks have a number of characteristics.
1. The connection is point–to–point.
2. The traffic is encrypted in order to prevent eavesdropping.
3. The end–point sites are authenticated.
4. Multiple protocols are supported over the VPN.
The last characteristic shows that a VPN is more general than a dedicated
secure protocol such as HTTPS and SSH.
Each of the latter two protocols provides a dedicated service.
A VPN provides security services to a wide variety of protocols.
VPN Types
One classification of virtual private networks divides them into two types:
User VPNs and Site VPNs.
A User VPN is a virtual private network between an individual user computer
and an organizational site or network.
A Site VPN is used by an organization as a replacement for a private
point–to–point network.
The defining characteristics of a site VPN include the following.
1. The site does not move around, as is the common case
with individual users.
2. The link to the remote site will carry much more traffic than
the link to any individual user computer.
3. The remote site will most likely employ a true firewall as
a part of setting up the VPN.
Benefits of Virtual Private Networks
There are two primary benefits of a User VPN:
1. Employees who travel can have access to e–mail, files, and
other sensitive internal assets.
2. Employees who work from home can have the same access to network
assets as those who work on–site.
User VPN connections are far superior to the standard alternatives, which use
a modem and a (usually slower) dial–up connection. Admittedly, one can
configure a firewall to handle modem traffic, but that is just one more issue.
The benefit of a Site VPN is that it offers a secure connection to each remote
office at significantly less cost than would be associated with a leased T1 line.
Security Issues with a User VPN
The biggest security issue with the use of a VPN by an employee is the
possible existence of a simultaneous connection to another Internet site.
The figure on the left shows the intended setup. That on the right shows the
situation in which the user computer has been infected with a Trojan Horse that
is used to initiate a VPN session with the corporate host.
This class of attack is thought to require some sophistication, but that claim
ignores “script kiddies”, those who use sophisticated scripts written by others.
One obvious remedy is to disallow other Internet connections when the VPN is
active. This would require all Internet access to go through the corporate site.
VPNs and Firewalls
A Virtual Private Network is a logical construct, based on implementation
of a few key ideas: encryption, authentication, and packet integrity.
A VPN can be implemented either in software on the user computer (as in a
User VPN) or a separate computer.
Consider the VPN device in another light: it can be seen as a gatekeeper
between the public networks and the confidential assets of the company.
What else can be considered as a gatekeeper? The answer is a firewall.
This leads to the obvious decision to build VPN capabilities into a firewall.
The firewall’s rule base can be used to augment the function of the VPN.
One obvious use of the firewall rules would be to limit access to the VPN
server. Who can access the VPN server from the outside?
As always, configuration of the firewall/VPN server represents a compromise
between security and availability.
The VPN Server
The VPN Server is that computer that acts as one of the end points of the VPN.
The VPN Server is most commonly placed on a firewall. If the company
uses a DMZ, the VPN Server will be placed on the Inner Firewall.
If the VPN Server is a stand–alone system, it is often placed in a dedicated
VPN DMZ, separate from the Internet DMZ that contains the Web server
and E–Mail server. The VPN Server is more trusted than other DMZ servers.
The VPN Server must be protected by a firewall rule set that is much more
strict than the rule set for the semi–public servers.
The Fail–Over Problem
Some vendors allow for redundant VPN Servers, with the option of “fail over”;
this removes one possible single point of failure.
The idea of redundant VPN Servers also allows for load balancing, in case
the traffic becomes heavy.
There are a number of problems associated with redundant servers.
1. This violates the desirable property of having a server that is both a
single entry point to the protected network and a single exit point from it.
2. The idea of a set of “peer VPN Servers” opens a possible avenue of
hacking the internal server. I just get my computer placed on the list of
approved VPN servers, and then proceed to commit mischief.
Encryption for the VPN
The design of encryption for a VPN mimics that used for HTTPS/SSL.
Any encryption algorithm used for the VPN should be one that is
well known and thoroughly tested. Options include DES, 3–DES, AES, etc.
Any cryptographic hash function for data integrity verification should also
be one that is well known and thoroughly tested. Options: SHA–1, MD5, etc.
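A data integrity check might be sketched as follows. The sketch uses HMAC, a keyed construction built on top of a hash function; that is our choice for the illustration, not something specified above. SHA–1 is used because it is named above. The key and the message are made up.

```python
import hashlib
import hmac

def make_tag(key, message, hash_name="sha1"):
    """Keyed digest (HMAC) of the message under the shared session key."""
    return hmac.new(key, message, hash_name).digest()

def verify(key, message, tag, hash_name="sha1"):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message, hash_name), tag)

key = b"hypothetical shared session key"
packet = b"payload carried through the VPN"
tag = make_tag(key, packet)

print("tag length in bits:", len(tag) * 8)
print("unmodified packet verifies:", verify(key, packet, tag))
print("modified packet verifies:  ", verify(key, packet + b"!", tag))
```

The hash function is a parameter, so a stronger one such as SHA–256 can be substituted by passing hash_name="sha256".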
We may borrow some terminology from IPSec and claim that establishing a
VPN between a remote site and the corporate site is equivalent to establishing
an SA (Security Association).
1. The first step is to set up a TCP session.
2. The two end–points then authenticate each other.
3. The two end–points then use a standard protocol, such as IKE, to
exchange secret keys to be used for the duration of the session.
The key exchange will often use public key encryption to send the short data
strings representing the secret keys. While public key encryption could be
used to manage the entire VPN session, it is generally too slow to be practical.
IKE
IKE (Internet Key Exchange) is a key exchange method used by
ISAKMP (Internet Security Association and Key Management Protocol).
ISAKMP and IKE are defined by four IETF Publications:
RFC 2407, RFC 2408, RFC 2409, and RFC 4306.
The IKE protocol uses UDP packets, usually on port 500, and generally requires
2 or 3 round trip communications (4 or 6 packets) to set up an SA.
The IKE protocol calls for two phases in the establishment of a VPN.
1. The two ISAKMP peers establish a secure, authenticated channel
with which to communicate.
2. The Security Associations used by the VPN are negotiated, and
keys are exchanged.
Key exchange in IKE uses the Diffie–Hellman key exchange to set up a
shared session secret, from which the cryptographic keys are derived.
This protocol is based on public key cryptography, with public keys and
private keys. It is considered secure against eavesdroppers.
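The arithmetic behind the Diffie–Hellman exchange can be shown with toy numbers. This is a sketch of the mathematics only, not of IKE itself. The prime 23 and the base 5 are textbook values chosen to keep the numbers small; real exchanges use far larger primes. The function names are our own.

```python
import secrets

P, G = 23, 5   # toy public parameters: a small prime and a base

def keypair(p=P, g=G, private=None):
    """Return (private value, public value g^private mod p)."""
    if private is None:
        private = secrets.randbelow(p - 3) + 2   # random value in the range 2 .. p-2
    return private, pow(g, private, p)

def shared_secret(their_public, my_private, p=P):
    """Each side raises the other's public value to its own private value."""
    return pow(their_public, my_private, p)

# Fixed private values so the run is repeatable; normally they are random.
a_private, a_public = keypair(private=6)
b_private, b_public = keypair(private=15)
print("public values sent over the network:", a_public, b_public)
print("secret computed by each side:",
      shared_secret(b_public, a_private), shared_secret(a_public, b_private))
```

Only the two public values cross the network. An eavesdropper who sees them must recover one of the private values to compute the shared secret, which is believed to be infeasible when the prime is large.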
Authentication
One of the key points in setting up a VPN Security Association is the
authentication of the end points.
This is especially important for a User VPN.
The method most often used is the standard pair (user name, password).
One might include the MAC address of the user NIC as part of authenticating
the remote user. This would limit each user to a small number of computers.
When used in conjunction with the standard authentication, this additional
authentication might offer either some advantages or a big loophole in security.
A company might consider keeping a current list of employees on travel and
allowing connections from only those employees who are “out” at the time.
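The three checks just described can be combined in one function. This is a minimal sketch under stated assumptions: the user names, passwords, salt and travel list are all made up, and the MAC addresses are the two examples used earlier. Passwords are stored as PBKDF2 digests, which is our choice for the illustration.

```python
import hashlib
import hmac

SALT = b"hypothetical-salt"

def password_hash(password, salt=SALT):
    """Store a slow salted digest rather than the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

# Hypothetical records: password digest plus the NICs registered to each user.
USERS = {
    "alice": {"hash": password_hash("correct horse"), "macs": {"00:20:C5:00:5F:C1"}},
    "bob": {"hash": password_hash("s3cret"), "macs": {"00:40:05:3C:3D:8B"}},
}
ON_TRAVEL = {"alice"}   # employees currently "out"

def authenticate(user, password, mac):
    """Accept only a known user with the right password, a registered NIC, on travel."""
    record = USERS.get(user)
    if record is None:
        return False
    password_ok = hmac.compare_digest(record["hash"], password_hash(password))
    return password_ok and mac.upper() in record["macs"] and user in ON_TRAVEL

print(authenticate("alice", "correct horse", "00:20:c5:00:5f:c1"))
print(authenticate("alice", "correct horse", "00:40:05:3C:3D:8B"))
print(authenticate("bob", "s3cret", "00:40:05:3C:3D:8B"))
```

The MAC address is reported by software on the remote computer and can be altered there, so this check should only supplement the password, never replace it.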
References
1. Network Security: A Beginner’s Guide, Eric Maiwald,
Osborne/McGraw–Hill, 2001, ISBN 0–07–213324–4.
2. Essential Check Point Firewall–1, Dameon D. Welch–Abernathy
(a.k.a. Phoneboy), Addison–Wesley, 2000, ISBN 0–201–69950–8.
5. Computer Networks, Andrew Tanenbaum & David Wetherall,
(Fifth Edition), Prentice Hall, 2011, ISBN 978–0–13–212695–3.
Web References
3. Wikipedia,
http://en.wikipedia.org/wiki/Internet_Security_Association_and_Key_Management_Protocol
4. Wikipedia, http://en.wikipedia.org/wiki/Internet_Key_Exchange