The Internet


Upload: lara

Post on 10-Jan-2016


Overview

An introduction to HTML

Dynamic HTML

Encryption

Public Key Infrastructure

Development of the Internet

Web Browsers

Top 10 uses of Internet at Work (2000)

1. E-mail: 73%

2. Business related research: 35%

3. Academic research: 23%

4. General browsing/surfing: 17%

5. IT information: 11%

6. Downloading software: 11%

7. News information: 10%

8. Searching for personal information: 9%

9. Reading magazines/newspapers: 7%

10. Sports information: 7%

Overall Structure of Internet

How does the World Wide Web work?

1. The user must have a program called a "browser" running on the computer, such as Internet Explorer (IE) or Netscape.

2. The user establishes a connection with an ISP (Internet Service Provider) via dial-up or a LAN (local area network).

3. The user types a URL (Uniform Resource Locator), the address of the target web page, into the browser's address field. For example, http://www.csd.uwo.ca/~cs031

4. (Steps 4-6 happen behind the scenes.) The host name in the URL is translated into a numerical IP (Internet Protocol) address, e.g., 130.100.11.3. The Domain Name System (DNS), reached through the ISP, performs this lookup.

5. The user's browser uses the IP address to establish a connection, via local, regional, and/or national ISPs, with the target computer (a web server).

6. The web page that the user wants, an HTML page, is sent back to the user's browser.

7. The user's browser interprets the HTML commands and displays the formatted page to the user. HTML pages can contain

Formatting information (text formatting, framing, etc.)

Hyperlinks (when the user clicks one, the browser repeats steps 3-6)

Multimedia (pictures, audio, video, animations)
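The request that the browser sends in steps 5 and 6 can be sketched in a few lines of Python. This is only an illustration: the slides do not show the request itself, so the HTTP/1.0 request format and the function name `build_get_request` are assumptions added here, and the sketch builds the request text without opening a network connection.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a minimal HTTP/1.0 GET request for the given URL."""
    parts = urlsplit(url)
    host = parts.hostname        # the name that step 4 translates into an IP address
    path = parts.path or "/"     # the page requested from the web server
    # A request is plain text: a request line, headers, then an empty line.
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"


if __name__ == "__main__":
    print(build_get_request("http://www.csd.uwo.ca/~cs031"))
```

A real browser would send this text over the connection from step 5 and then read the HTML page that comes back.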

A Simple Example (simple.html)

<h1>A very simple web page</h1>
My name is <b>Charles Ling</b> <br><br>
Here is a picture of mine <br>
<IMG SRC="../c_ling2.jpg" width="150"><br><br>
My favourite thing to do is
<ul>
<li>Adventurous travelling around the world</li>
<li>Watching good movies</li>
<li>Reading news at <a href="http://www.cnn.com">CNN</a></li>
</ul>
For more info about me, click my <a href="http://www.csd.uwo.ca/faculty/ling">home page at UWO.</a>

Building Webpages

Writing HTML files directly (using Notepad or other text editors)

Using MS Word and saving as HTML

Using specialized software: MS FrontPage, Dreamweaver, etc.

Adding animations, forms, Java, JavaScript, database functionality, …

Writing Simple HTML Pages

Start Notepad and write HTML code directly

Save it as an HTML file (e.g., my.html)

Start a browser (e.g., Internet Explorer)

Click File > Open, then click Browse to locate and open the HTML file (e.g., my.html)

You will see how the HTML file is displayed!

HTML

HTML – HyperText Markup Language

A language used to define the content of, and the presentation instructions for, a Web document

When a browser presents a Web document, the browser scans the document and applies the presentation instructions to the content

Content that does not have presentation instructions will be presented using default instructions built into the browser

HTML documents must employ a simple format so anyone can create documents

HTML documents are stored in text (ASCII) files

This type of document can be created using any editor that allows you to save the document as a text file

To combine the content and the presentation instructions in the same file, there must be a way to distinguish between these two components

In HTML, the presentation instructions are inserted as “tags”

Anything that isn’t a presentation instruction is content

HTML tags normally occur in pairs

The pair of tags surrounds the content to which they apply

A start tag is indicated with angle brackets: <TAG>

An end tag is indicated with a slash after the opening angle bracket: </TAG>

HTML has a set of predefined tags

These tags can be used to

Control how the text in the document is displayed

Insert images into the document

Insert links to other documents
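As a rough illustration of how a program can tell tags apart from content, the sketch below uses Python's standard `html.parser` module to scan a fragment. The class name `TagCollector` and the function `scan` are made up for this example; a real browser does far more work than this.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record start tags, end tags and content in the order they appear."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Anything that is not a tag is content; skip whitespace-only runs.
        if data.strip():
            self.events.append(("content", data.strip()))


def scan(html):
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.events


if __name__ == "__main__":
    for event in scan("My name is <B>Charles Ling</B>"):
        print(event)
```

Note that the parser reports tag names in lower case, so `<B>` and `<b>` are treated as the same tag.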

Document Tags

HTML documents are enclosed within <HTML> and </HTML> tags

Every HTML document will have a head and a body

The document head is enclosed within the <HEAD> and </HEAD> tags

The body is enclosed within the <BODY> and </BODY> tags

The basic structure of an HTML document is a head followed by a body, both enclosed within the <HTML> tags

The <TITLE> within the <HEAD> is displayed in the title bar of the browser

The <HEAD> of the document contains information used by the browser

All of the content for the document and the associated presentation instructions are placed inside the <BODY> tags
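The document structure described above can be sketched as a small Python function that assembles a page from a title and some body content. The function name `make_page` and the sample text are assumptions for illustration only.

```python
def make_page(title, body):
    """Assemble a minimal HTML document: a head with a title, then a body."""
    return (
        "<HTML>\n"
        "<HEAD>\n"
        f"<TITLE>{title}</TITLE>\n"   # shown in the browser's title bar
        "</HEAD>\n"
        "<BODY>\n"
        f"{body}\n"                   # content and presentation instructions
        "</BODY>\n"
        "</HTML>\n"
    )


if __name__ == "__main__":
    print(make_page("My First Web Page", "<H2>Hello</H2>"))
```

Saving the printed text as an .html file and opening it in a browser should show the title in the title bar and the heading in the page.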

Formatting Tags

HTML contains tag definitions that allow you to control

Headings

Style

Ordered Lists

Unordered Lists

Definition Lists

etc.

Heading Tags

There are six heading levels

The levels are named H1, H2, H3, … H6, where H1 is the largest and H6 is the smallest

To create a heading, you enclose the text of the heading inside the opening and closing tags for the heading level

Heading Examples

Physical Style Tags

Used to control the display of text

<B> - bold

<I> - italics

<U> - underline

<TT> - typewriter typeface

Physical Style Tag Example

Logical Style Tags

Examples of logical style tags

<EM> - for emphasis

<STRONG> - stronger emphasis

<CITE> - citation

<CODE> - computer code

Logical Style Tag Example

Layout Style Tags

Used to control text layout

<CENTER> - center the text

<P> - new paragraph

<BR> - break, start a new line

<HR> - horizontal rule, draw a line

Layout Style Tag Example

Lists

Lists of data can be defined using

Ordered List – enumerated lists

Unordered List – bulleted lists

Definition List – lists that are made of terms and their associated definitions

Ordered List

Use the <OL> and </OL> tags to start and end an ordered list

Within the ordered list, the list item (<LI>) tag is used to indicate the items on the list

The VALUE parameter of the <LI> tag can be used to set the value of a list item

The START parameter is used to control the value of the first item

The TYPE parameter controls what enumeration scheme is used

The types are:

1 – numbers (default)

a – lower case letters

A – upper case letters

i – small Roman numerals

I – large Roman numerals
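To make the effect of TYPE and START concrete, here is a small Python sketch that produces the labels a browser would place in front of each list item. The function names are invented for this example, and letter types are limited to 26 items to keep the sketch short.

```python
import string

ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]


def to_roman(number):
    """Convert a positive integer to an upper case Roman numeral."""
    result = ""
    for value, symbol in ROMAN:
        while number >= value:
            result += symbol
            number -= value
    return result


def marker(number, list_type="1"):
    """Return the label for item `number` under the given TYPE."""
    if list_type == "1":
        return str(number)
    if list_type in ("a", "A"):
        if not 1 <= number <= 26:
            raise ValueError("this sketch only handles letters a to z")
        letter = string.ascii_lowercase[number - 1]
        return letter if list_type == "a" else letter.upper()
    if list_type in ("i", "I"):
        roman = to_roman(number)
        return roman.lower() if list_type == "i" else roman
    raise ValueError(f"unknown list type: {list_type!r}")


def number_items(items, list_type="1", start=1):
    """Label each item the way <OL TYPE=... START=...> would."""
    return [f"{marker(start + offset, list_type)}. {item}"
            for offset, item in enumerate(items)]


if __name__ == "__main__":
    for line in number_items(["HTML", "DHTML", "Encryption"], list_type="i", start=3):
        print(line)
```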


Unordered List

Use the <UL> and </UL> tags to start and end an unordered list

Within the unordered list, the list item (<LI>) tag is used to identify the items on the list

The TYPE parameter can be used to control the look of the list

The types are:

Disc – a solid disc

Circle – a hollow circle

Square – a square symbol


Definition List

The <DL> and </DL> tags define the Definition List

The <DT> tag is used to indicate a definition term

The <DD> tag is used to indicate a definition


URL

A URL is a Uniform Resource Locator

A URL contains information about

The address of a document on the Internet

The protocol that will be used to access the document

Protocols

HTTP – HyperText Transfer Protocol: designed to transmit files on the World Wide Web

FTP – File Transfer Protocol: designed to transmit files over the Internet (before the Web developed), e.g., ftp://ftp.csd.uwo.ca

Email: mailto:[email protected]

These protocols are sets of rules that dictate how files are transmitted between computers

URL Example

In the following URL example, the protocol to be used is HTTP (before the “://”)

The document is “browse.html” and it is located in the “selected” folder at the World Wide Web site for UWO in Canada
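The parts of a URL can be pulled apart with Python's standard `urllib.parse` module. Since the URL from the slide is not reproduced in this transcript, the sketch below uses another URL that appears in these notes; the variable names are chosen for illustration.

```python
from urllib.parse import urlsplit

url = "http://www.csd.uwo.ca/faculty/ling/cs031/simple.html"
parts = urlsplit(url)

protocol = parts.scheme                           # the part before "://"
site = parts.hostname                             # the web site's host name
folder, _, document = parts.path.rpartition("/")  # folder path and file name

if __name__ == "__main__":
    print("protocol:", protocol)
    print("site:    ", site)
    print("folder:  ", folder)
    print("document:", document)
```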

Images

Images are added to documents using the <IMG> tag

A </IMG> tag is not required

The SRC parameter is used to indicate the SouRCe of the image

Image Formats

Standard image formats are needed so images can be stored, retrieved, and transmitted over the Web

Examples of image formats used on the Web are:

GIF – Graphics Interchange Format

JPG (JPEG) – Joint Photographic Experts Group

PNG – Portable Network Graphics

BMP – Windows Bitmap

Graphics Interchange Format

Uses the Lempel-Ziv-Welch (LZW) compression algorithm

The algorithm searches the image for repeated patterns, such as big blocks of the same color, and then compresses these blocks

This compression reduces the size of the image

The format also uses an indexed color scheme, in which a custom color palette for the image is selected using only 256 of the over 16 million available colors

This format is used when the image does not contain a wide range of colors or color shades
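The idea behind LZW can be sketched in Python. This is a textbook version of the algorithm working on a row of palette indexes, not the exact variant used inside GIF files (which packs codes into variable-width bit fields and uses special clear and end codes); the function names are chosen for this example.

```python
def lzw_compress(data):
    """Compress bytes into a list of integer codes using basic LZW."""
    table = {bytes([i]): i for i in range(256)}   # start with every single byte
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                   # keep growing the known pattern
        else:
            codes.append(table[current])          # emit the longest known pattern
            table[candidate] = next_code          # remember the new, longer pattern
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes from the list of codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            entry = previous + previous[:1]       # pattern defined by this very step
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


if __name__ == "__main__":
    row = bytes([7]) * 20 + bytes([3]) * 20       # two blocks of a single color
    codes = lzw_compress(row)
    print(len(row), "pixels stored as", len(codes), "codes")
```

Long runs of one color collapse into a handful of codes, and because the decompressor rebuilds the same table, nothing is lost.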

Joint Photographic Experts Group

Images can contain millions of colors

Uses a lossy compression algorithm

When the image is compressed it permanently loses some of its quality

The algorithm looks for similar colors (like a range of reds) and chooses the same red for very close shades

If the original image had 1,000 shades of red, the compressed image may have only 500 shades

The human eye cannot detect all the shades, so in general the loss will not be noticed

This format is used when the image contains many colors and many color shades

Portable Network Graphics

Portable Network Graphics format was designed to replace GIF

Uses lossless compression like GIF

Provides better resolution and more colors like JPG

Generates smaller files like GIF

Is not supported by all versions of browsers

Windows Bitmap

Every pixel in the image is represented by a piece of data

The data represents the color of the pixel

Bitmap images are very large

Rarely used on Web pages because of the time required to download the image
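A quick calculation shows why uncompressed bitmaps are impractical on the Web. The image size and the 56,000 bits-per-second modem speed below are assumptions picked for illustration, and the sketch ignores the small file header.

```python
def bitmap_bytes(width, height, bits_per_pixel=24):
    """Size of the pixel data when every pixel is stored individually."""
    return width * height * bits_per_pixel // 8


def download_seconds(size_bytes, bits_per_second):
    """Time to transfer the data at a given line speed."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    size = bitmap_bytes(640, 480)
    print(size, "bytes")
    print(round(download_seconds(size, 56_000)), "seconds over a dial-up modem")
```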

Image Tag

Anchors

Anchor tags (<A> and </A>) are used to insert hyperlinks and bookmarks into HTML documents

A hyperlink is a link to another document on the World Wide Web

A bookmark is a named location within an HTML document

No. 1 use of anchors: Anchors as Hyperlinks

An example of a link to the UWO home page

When the HTML is rendered the document will contain a link to UWO

The Link Item is the text or image that you click on to activate the link

The HREF parameter is the Hypertext REFerence parameter

The HREF parameter is used to define the link destination

An Image as a Link

No. 2 use: Anchors as Bookmarks (in the same document)

An example of the definition of an (invisible) bookmark using the NAME parameter (normally in a long HTML file)

<A NAME="top"> </A>…

<A NAME="Conclusions_bookmark"> </A><h2>Conclusions</h2>…

An example of a link to a bookmark within the same document

Note the use of #

You can see conclusions <a href="#Conclusions_bookmark">here</a>

…<a href="#top">Back to top</a>…

In a long HTML file (say papers.html), the link and the bookmark are laid out like this:

…You can see conclusions <a href="#Conclusions_bookmark">here</a>

…

<A NAME="Conclusions_bookmark"> </A><h2>Conclusions</h2>…

The form of the anchor tags used as a hypertext link is

<A HREF="URL#Bookmark">Link Item</A>

No. 3 use (combining 1 and 2): Anchors as Hyperlinks to a bookmark in a different document

An example of a link to a bookmark within another HTML document

Click <a href="http://www.uwo.ca/papers.html#Conclusions_bookmark">here</a> to jump to conclusions in that document.

See a real example in http://www.csd.uwo.ca/faculty/ling/cs031/simple.html

If linking to a bookmark in the same document, the URL is omitted

Web Page Example 1

Create a Web page with

“My First Web Page” as the title

Your name as a level 2 heading

An enumerated list of your three favorite University courses

An image for the University. Try “http://www.uwo.ca/gifs/uwologo4.gif” as the source URL. If this URL doesn’t work, look at the HTML source for the University’s home page to find a URL

Web Page Example 2

Create a Web page with

A TV show name as a level 1 heading at the top of the page

A paragraph of text about the show

Bold the stars' names and italicize the night that the show is broadcast within this text

A horizontal line

A link to a Web page for the show. Use the name of the show as the link text

A horizontal line

A link to the heading at the top of the page, using “Top” as the link text

DHTML

Dynamic HTML

Supported by fourth generation and later browsers (Netscape and IE)

DHTML allows the user to interact with a web page

The user can enter values and select buttons

The user of a DHTML page can enter data and then have the data sent (posted) to a web site

The computer hosting the web site can then process the data

DHTML Example 1

DHTML Example 2

Encryption

Encryption involves encoding a message to conceal the meaning

Consider the name

NORMA JEAN BAKER

The name has been encrypted as

OPSNB!KFBO!CBLFS

What is the encryption algorithm?

How would you decode the message?
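One answer to the question above: each character has been replaced by the next character in the character set, so the space becomes "!". A minimal Python sketch of that reading, with the shift amount playing the role of the key (the function name `shift` is made up for this example):

```python
def shift(text, key):
    """Replace every character by the one `key` positions later in the character set."""
    return "".join(chr(ord(character) + key) for character in text)


if __name__ == "__main__":
    secret = shift("NORMA JEAN BAKER", 1)
    print(secret)               # encrypt with key 1
    print(shift(secret, -1))    # decrypt by shifting back
```

Decoding simply applies the same algorithm with the opposite shift.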

For encryption to work there must be an algorithm that is applied to the original message

There must also be a way to decode the encrypted message to obtain the original message

Encryption algorithms use a binary “key” to encrypt and decrypt messages

There are two types of encryption algorithms used to secure Internet transmissions

Symmetric Key Encryption

Asymmetric (Public) Key Encryption

Symmetric Key Encryption

Symmetric Key Encryption uses the same key for both encryption and decryption

The sender and the receiver must both know the key

Both must ensure that the key is kept secret

If the key becomes public then others can decrypt valid messages and create fake messages

Key Length

For Symmetric Key Encryption, the typical key lengths are 40, 56 and 128 bits

Key length is one measure of encryption strength

Longer keys provide stronger encryption

Each additional bit in the key doubles the number of possible keys, and so doubles the work needed to try them all
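The doubling can be checked with a short calculation: a key of n bits has 2^n possible values. The function name below is chosen for this example.

```python
def key_count(bits):
    """Number of different keys that fit in the given number of bits."""
    return 2 ** bits


if __name__ == "__main__":
    for bits in (40, 56, 128):
        print(f"{bits:3d} bits: {key_count(bits):,} possible keys")
    # One more bit always means twice as many keys to try.
    print(key_count(41) // key_count(40))
```

Going from 40 to 56 bits adds 16 bits, which multiplies the number of keys by 2^16 = 65,536.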

Data Encryption Standard

The Data Encryption Standard (DES) is the U.S. government’s standard for data encryption

Uses the Data Encryption Algorithm (DEA) to encrypt/decrypt the message

An improvement on the Lucifer algorithm developed by IBM in the early 1970s

Uses a 56 bit key

Triple DES

Uses a key three times as long as Standard DES: a 168 bit key

Used for banks and other organizations that transmit highly sensitive data

Public Key Encryption

Asymmetric Key Encryption

Uses a pair of keys, one public and one private

Key length is at least 512 bits

The public key is published so any sender can obtain it

The private key is kept secret

Messages encrypted using the public key can only be decrypted by using the private key

The reverse is also true: messages encrypted using the private key can only be decrypted using the public key

This is one way to generate a digital signature (to sign a message)

Rhonda wants to send an email to Rick

Rhonda finds Rick’s Public Key through a Public Key directory

She encrypts the message using Rick’s Public Key and sends the message

Rick uses his Private Key to decrypt the message (his Public Key will NOT decrypt the message)

For Rick to respond, he must use Rhonda’s Public Key to encrypt the message
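The exchange between Rhonda and Rick can be imitated with a toy version of RSA in Python. The tiny primes and the numeric "message" below are assumptions chosen so the numbers stay readable; real keys are hundreds of digits long and real systems add padding, so this is a sketch of the idea only. It needs Python 3.8 or later for the modular inverse.

```python
# Toy key generation with deliberately tiny primes (illustration only).
p, q = 61, 53
n = p * q                    # part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: the inverse of e modulo phi

rick_public = (e, n)         # published in a directory
rick_private = (d, n)        # known only to Rick


def apply_key(number, key):
    """Encrypt or decrypt a number with one half of the key pair."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65                                       # a message encoded as a number below n
ciphertext = apply_key(message, rick_public)       # Rhonda encrypts with Rick's public key
recovered = apply_key(ciphertext, rick_private)    # Rick decrypts with his private key

signature = apply_key(message, rick_private)       # Rick signs with his private key
verified = apply_key(signature, rick_public)       # anyone can check with the public key

if __name__ == "__main__":
    print("ciphertext:", ciphertext)
    print("recovered: ", recovered)
    print("signature verifies:", verified == message)
```

Applying the public key a second time does not undo the encryption, which is why only the holder of the private key can read the message.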

Encryption Strength

The strength of an encryption depends on the algorithm used and the length of the key

The algorithms used in most implementations of Public Key Encryption are patented by RSA Data Security Inc.

RSA Algorithms

The RSA Public Key Cryptosystem was developed in 1977 by

Ronald Rivest

Adi Shamir

Leonard Adleman

They have created a number of 128 bit key algorithms

For example, RC2 and RC4

Code Breaking

For Symmetric Key Encryption, the typical key lengths are 40, 56 and 128 bits

Tests have been conducted to determine how long it will take to break messages encoded using various key lengths

The 128 bit encryption has not been broken yet!

“The sun will burn out first” is a frequent estimate of how long it will take!

Key Length Broken in …

40 bits 3 hours

48 bits 13 days

56 bits 40 days

128 bits ???

US vs International Security

Under current U.S. policy, software manufacturers can only sell 40 bit key encryption systems overseas

Some exceptions, such as international banks, can use 56 bit keys

In the U.S., 128 bit keys are recommended to ensure secure communications

Why would the U.S. want to restrict key length in software used in other countries?

Public Key Infrastructure

A Public Key Infrastructure is an encryption and digital certificate delivery system which makes secure electronic transactions possible

The X.509 Standard

PKI uses Digital Certificates

A digital signature

Digital Certificates carry the same legal weight as a written signature

Provides a way for others to verify your identity

Uses Public Key Encryption

A Digital Certificate relates you to a set of public and private keys

Digital Certificates are used to provide secure transactions through the Secure Sockets Layer Protocol (SSL)

SSL Protocol

Developed by Netscape

Goal is to provide secure and reliable communication between applications

For example, between a Web application (your browser) and a Web site

Public Key Encryption is used by each application to establish the identity of the other application

Symmetric Encryption is used for data encryption

Public Key Encryption is used to exchange the key used by the Symmetric Encryption of the data

The reliability of the message is ensured by including a Message Authentication Code (MAC) as part of the data

SSL takes the message to be transmitted and

fragments the data into manageable blocks

optionally, compresses the data

performs a message integrity check

encrypts the data

transmits the result

Received data is

decrypted

verified

decompressed

reassembled

delivered to the client
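The sending and receiving steps above can be traced in a short Python sketch. Everything specific here is an assumption for illustration: the block size, the keys, the use of HMAC-SHA256 for the integrity check, and above all the `xor_cipher` placeholder, which stands in for a real symmetric cipher and is NOT secure. The real SSL record format differs in many details.

```python
import hashlib
import hmac
import zlib

BLOCK_SIZE = 16   # unrealistically small, so a short message gives several blocks
MAC_LENGTH = 32   # length of an HMAC-SHA256 value in bytes


def xor_cipher(data, key):
    """Placeholder for symmetric encryption (NOT secure, illustration only)."""
    return bytes(value ^ key[i % len(key)] for i, value in enumerate(data))


def send(message, mac_key, enc_key):
    """Fragment, compress, add an integrity check, then encrypt each block."""
    records = []
    for start in range(0, len(message), BLOCK_SIZE):
        fragment = message[start:start + BLOCK_SIZE]
        compressed = zlib.compress(fragment)
        mac = hmac.new(mac_key, compressed, hashlib.sha256).digest()
        records.append(xor_cipher(compressed + mac, enc_key))
    return records


def receive(records, mac_key, enc_key):
    """Decrypt, verify, decompress and reassemble the blocks."""
    fragments = []
    for record in records:
        plain = xor_cipher(record, enc_key)
        compressed, mac = plain[:-MAC_LENGTH], plain[-MAC_LENGTH:]
        expected = hmac.new(mac_key, compressed, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            raise ValueError("integrity check failed")
        fragments.append(zlib.decompress(compressed))
    return b"".join(fragments)


if __name__ == "__main__":
    records = send(b"Buy 100 shares for the client today", b"mac-key", b"enc-key")
    print(len(records), "records sent")
    print(receive(records, b"mac-key", b"enc-key"))
```

If even one byte of a record is changed in transit, the recomputed check no longer matches and the receiver rejects the data.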

Digital Trust

Public Key Infrastructure (PKI) manages all aspects of Digital Trust

In the digital world, trust requires

Privacy

Integrity

Non-repudiation

Authentication

Privacy

To ensure privacy, messages are encrypted

Encryption ensures that the message cannot be read in transit or by anyone except the recipient

Integrity

Verify the integrity of the message

Ensure that the message that is received is exactly what was sent

Non-repudiation

The sender cannot deny or repudiate a valid message

For example, when a stock broker receives an order for stock trades, the client cannot later claim that they didn’t send the message

Authentication

Verify that the sender is who they claim to be

The Internet

A network of networks

Tens of thousands of computer networks

Reaches hundreds of millions of people

How did the Internet develop?

Started with ARPANET, an experimental project of the U.S. Department of Defense Advanced Research Projects Agency (DARPA) in 1969

The original purpose was to explore experimental networking technologies for the military

How large is the Internet?

Nobody knows for sure!

According to the Internet Society (ISOC), a professional organization of Internet developers, influencers, and users, the Internet reaches more than 170 countries

Internet Growth

Year Number of Hosts

1969 4

1974 62

1979 188

1984 1,024

1989 80,000

1994 2,217,000

1999 43,230,000

One of the reasons the Internet has been so successful is the commitment of its developers to producing “open” standards

The specifications or rules that computers need to communicate are publicly and freely available; published so that everyone can obtain them

TCP/IP

The standards that the Internet uses are known as TCP/IP

Transmission Control Protocol/Internet Protocol suite

Without open standards, only computers from the same vendor could talk to one another

Computers and networks that conform to the same communications standards are able to “interoperate”, regardless of the manufacturer

All of the networks and computers act as peers in the exchange of information and communication

Packets

Communication on the Internet revolves around the concept of a packet, a basic building block

All information and communications transmitted on the Internet are broken into packets, each of which is considered to be an independent entity

The packets are individually routed from network to network until they reach their destination, where they are reassembled and presented to the user

This method of networking is very flexible and robust

It allows diverse computers and systems to communicate by means of network software, not proprietary hardware

If a network goes “down” (breaks down), then the packets can be rerouted through other parts of the network of networks

This dynamic alternate routing of information creates a very persistent means of communication
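The packet idea can be imitated in a few lines of Python: a message is cut into numbered pieces, the pieces arrive in any order, and the receiver puts them back together. The packet size and the tuple layout are assumptions made for this sketch; real IP packets carry addresses and other header fields as well.

```python
import random


def to_packets(message, size):
    """Cut a message into (sequence number, payload) packets."""
    return [(sequence, message[start:start + size])
            for sequence, start in enumerate(range(0, len(message), size))]


def reassemble(packets):
    """Sort the packets by sequence number and join the payloads."""
    return "".join(payload for _, payload in sorted(packets))


if __name__ == "__main__":
    packets = to_packets("Packets may travel by different routes.", 8)
    random.shuffle(packets)   # simulate packets arriving out of order
    print(packets)
    print(reassemble(packets))
```

Because each packet carries its own sequence number, the route it took and the order in which it arrived do not matter.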

Internet Development

There have been three generations of Internet development

They characterize the evolution of the Internet

First Generation

There were three main First Generation Tools

Electronic mail

Remote logon

File transfer

These tools are still available on all parts of the Internet

Electronic Mail

Uses Simple Mail Transfer Protocol (SMTP)

Standardized in 1983

Originally designed to transmit plain text

Printable characters

NOT binary files, graphics or sound

Current systems use Multipurpose Internet Mail Extensions (MIME)

MIME allows the email system to transport plain text, binary files, graphics and sound

MIME encodes and decodes complex messages into a simpler form that SMTP can transport
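One encoding that MIME commonly uses for this purpose is Base64, which rewrites arbitrary bytes using only printable characters. The sketch below shows that single step with Python's standard `base64` module; the function names are invented here, and a complete MIME message also adds headers that describe the content.

```python
import base64


def encode_attachment(data):
    """Turn arbitrary bytes into printable text that plain-text mail can carry."""
    return base64.b64encode(data).decode("ascii")


def decode_attachment(text):
    """Recover the original bytes at the receiving end."""
    return base64.b64decode(text)


if __name__ == "__main__":
    binary = bytes(range(256))            # every possible byte value
    text = encode_attachment(binary)
    print(text[:40] + "...")
    print(decode_attachment(text) == binary)
```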

Characteristics of email programs

Composition

Response

Read

Delete

Organize

Filter

Email Address

An email address consists of a local part and a host part

For example,

csdept@csd.uwo.ca

The local part is a user name, mailbox, login name or user id: csdept

The host part is the name of an email server on the Internet: csd.uwo.ca
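Splitting an address into its two parts takes one line of Python. The function name is made up for this sketch, and it splits at the last "@" without attempting full address validation.

```python
def split_address(address):
    """Return the (local part, host part) of an email address."""
    local_part, separator, host_part = address.rpartition("@")
    if not separator or not local_part or not host_part:
        raise ValueError(f"not an email address: {address!r}")
    return local_part, host_part


if __name__ == "__main__":
    print(split_address("csdept@csd.uwo.ca"))
```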

POP and IMAP

Protocols like the Post Office Protocol (POP) and the Internet Message Access Protocol (IMAP) are used to retrieve email from your email server to your computer

Mail travelling in the other direction, from your computer to your email server, is sent using SMTP

Simple Mail Transfer Protocol

The Simple Mail Transfer Protocol (SMTP) is used to transmit email between email servers

To send an email

Construct the message on your computer

When you click on “Send”, the message is moved using SMTP to your email server

The email server uses the host part of the address to determine where to send the message

When the message arrives at the destination email server, it is stored and the recipient is notified of its arrival

When the recipient wants to read the message, it is moved using POP or IMAP to their computer

Remote Logon

Allows you to logon to a computer over the Internet

A utility that handles remote logon is Telnet

To remotely connect to a computer, you must know the address of the computer For example, mccarthy.csd.uwo.ca

On most host computers, you must have an account on the computer

Some host computers allow you to logon as “Anonymous” or “Guest” with your email address as the password

This is called anonymous logon

File Transfer

The File Transfer Protocol (FTP)

Used to copy (download) files over the Internet

FTP was designed to copy plain text files

HTTP was designed to transmit text files, graphics, sound, etc.

FTP is faster than HTTP because FTP doesn't perform as many checks on the data during the download process

FTP allows you to

connect to another computer

list the files in a folder on the other computer

copy files back and forth between the two computers

Anonymous FTP allows you to logon as “Anonymous” or “Guest” with your email address as the password

Second Generation

The Second Generation saw large increases in

The amount of data being made public

The number of Internet users

There was an increasing need for tools that would aid users in finding resources

Tools

The first tool was Gopher

Developed at the University of Minnesota, where the mascot is a Golden Gopher!

Gopher was a hierarchical system of menus

The top level menu contained general categories

The information became more specific as you drilled down

Looked a lot like Yahoo!

Veronica

Very Easy Rodent-Oriented Net-wide Index to Computerized Archives

Developed at the University of Nevada

Gopher allowed you to search through the categories looking for interesting resources

But it was a manual search

Veronica allowed the user to submit keywords and the utility did a search of gopher space

Archie

Archie is derived from the word archive

Developed at the McGill University School of Computer Science

Maintained a database of all the names of files stored at known public FTP sites

Helped find files at FTP sites

Network News - USENET

USENET is a network within the Internet

Divided into newsgroups

Each newsgroup is devoted to a topic

To read or post to a newsgroup you need a news reader application

Newsgroups

More than 80,000 newsgroups

Newsgroups are divided into hierarchies

alt – 10,159 alternate groups

microsoft – 991 groups

bionet – 94 groups

biz – 48 groups

Newsgroups are added daily so these numbers are out of date!

Third Generation

The World Wide Web

Tools

Browsers

Search engines

Directories

World Wide Web

Originally developed at the European Laboratory for Particle Physics (also known as CERN) in Switzerland by Tim Berners-Lee

He developed a system to link together scholarly references

The links from one document to another are imagined to form a web!

The World Wide Web is a browsing and searching system

Built on the concept of hypertext and hypermedia

The Web is a continuous distributed information construction project

Tens of thousands of people are adding knowledge to it daily by bringing up their own servers or posting documents on existing servers

Browsers

A browser is application software

Browsers use HTML documents as their input

The HTML tags in the document are applied to the content and the result is displayed in the browser

Mosaic

The first popular graphical browser

It was developed at the National Center for Supercomputing Applications (NCSA) in Champaign, Illinois by Marc Andreessen

Allows a user to click on text, graphics, buttons or icons that link to other resources

Netscape

Developed by Netscape Communications Corporation

The company was founded in April of 1994 by Marc Andreessen, creator of the NCSA Mosaic software and Dr. James H. Clark, the founder of Silicon Graphics, Inc.

Microsoft’s Web browser is Internet Explorer

All browsers have the same basic functionality; they just have a slightly different “look and feel”

Browser Functionality

Typical functionality

Display HTML documents

Create bookmarks

Send and read email

Read news

Display and create the source HTML for documents

Debug script on DHTML pages

Search Engines

One of the most difficult tasks for a Web browser is to make it easy for the user to find resources

Search engines allow users to do keyword searches

These searches are actually database searches

Search engines keep databases that match keywords to document URLs
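A database that matches keywords to URLs can be sketched as a Python dictionary. The page addresses and texts below are hypothetical, and real search engines also rank results and handle punctuation, word forms and much larger volumes of data.

```python
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# Hypothetical pages, used only to demonstrate the lookup.
PAGES = {
    "http://example.org/html": "An introduction to HTML tags",
    "http://example.org/images": "Images and HTML image tags",
}

if __name__ == "__main__":
    index = build_index(PAGES)
    print(sorted(index["html"]))
    print(sorted(index["images"]))
```

A keyword search then becomes a single dictionary lookup instead of a scan of every document.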

Directories

The top level of directories indicate general categories

As the user drills down into a category, they are presented with more specific categories

Consider WebCrawler and Google

These two are typical World Wide Web tools

They both provide basic and advanced search capabilities as well as directories

Advanced Searches

Each search engine has its own syntax for describing a search

Most engines AND together keywords

The document must have all of the keywords

The search engine should also support OR, NOT and exact phrases
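AND, OR and NOT map directly onto set operations. The sketch below assumes a small keyword index with made-up page names; it does not reproduce the query syntax of any particular search engine, and exact-phrase search is left out.

```python
# Hypothetical keyword index: each word maps to the pages that contain it.
INDEX = {
    "html": {"page1", "page2", "page3"},
    "images": {"page2", "page3"},
    "encryption": {"page3", "page4"},
}
ALL_PAGES = set().union(*INDEX.values())


def search_and(index, *words):
    """Pages that contain every one of the words."""
    sets = [index.get(word, set()) for word in words]
    return set.intersection(*sets) if sets else set()


def search_or(index, *words):
    """Pages that contain at least one of the words."""
    return set().union(*(index.get(word, set()) for word in words))


def search_not(index, all_pages, word):
    """Pages that do not contain the word."""
    return all_pages - index.get(word, set())


if __name__ == "__main__":
    print(sorted(search_and(INDEX, "html", "images")))
    print(sorted(search_or(INDEX, "images", "encryption")))
    print(sorted(search_and(INDEX, "html") & search_not(INDEX, ALL_PAGES, "encryption")))
```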

Check out the WebCrawler and Google advanced search pages for examples of typical advanced search strategies

You can submit a page to be included in searches and directories at WebCrawler and Google

Search engine databases also get information about documents from programs called robots that explore the Web looking for documents to add to their database
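A robot of this kind can be sketched in Python as a loop that reads a page, collects its links and visits each new link once. To keep the example runnable without a network connection, the "Web" here is a small dictionary of made-up pages, and the `fetch` argument is an assumption standing in for a real download step.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the HREF value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def crawl(start_url, fetch):
    """Visit pages breadth-first, following links and skipping repeats."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:          # page could not be retrieved
            continue
        visited.append(url)       # a search engine would index the page here
        for link in extract_links(html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical pages standing in for the real Web.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<A HREF="http://a.example/">back</A>',
    "http://c.example/": "no links here",
}

if __name__ == "__main__":
    print(crawl("http://a.example/", FAKE_WEB.get))
```

Keeping a set of pages already seen is what stops the robot from looping forever when two pages link to each other.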