html and http

Post on 06-Jan-2016

CSIT 220 (Blum) 1

HTML and HTTP

Based on Computer Networks and Internets, Comer

Hypertext

• HTML stands for HyperText Markup Language and HTTP stands for HyperText Transfer Protocol, so that raises the question: what is hypertext?

• Hypertext is “a method of storing data through a computer program that allows a user to create and link fields of information at will and to retrieve the data non-sequentially.” (Webster’s)

• A hyperlink is a region of one document (page) that, when clicked, brings up another document for the user.

• The idea was developed by Ted Nelson, who coined the term "hypertext" in the 1960s.

URL

• The “resources” (data or program files) are located on many computers throughout an internet or the Internet; hence this is a “distributed” system.

• The location of a resource is given by its URL (Uniform Resource Locator).
– http://www.lasalle.edu:1234/it/fake.htm#attach
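
As a rough illustration of how the example URL breaks into its parts, the sketch below uses Python's standard urllib.parse module. The variable names are illustrative and are not part of the original slides.

```python
from urllib.parse import urlsplit

# The example URL from the slide.
url = "http://www.lasalle.edu:1234/it/fake.htm#attach"
parts = urlsplit(url)

print(parts.scheme)    # protocol used to fetch the resource: http
print(parts.hostname)  # computer holding the resource: www.lasalle.edu
print(parts.port)      # port the server listens on: 1234
print(parts.path)      # resource on that server: /it/fake.htm
print(parts.fragment)  # location within the page: attach
```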

Browser

• Hypertext is generally viewed in a web browser, an application used to locate web pages (linked or otherwise) and display them.

• Some browsers, such as Lynx, display only text.

• But when most people think of browsers, they think of Netscape Navigator and/or Microsoft Internet Explorer, which support more than just text.

Hypermedia

• Modern browsers link information in non-textual format (graphics, sound, video, etc.) and so are “multimedia” or “hypermedia” programs.

• The browser may need a plug-in to support some formats. A plug-in adds a particular feature or service to a larger system.

• Browser plug-ins are selected based on MIME types.

Mosaic

• The first widely used multimedia browser was Mosaic.

• Marc Andreessen is credited with initiating the development of Mosaic.

• Mosaic moved the Internet out of the realm of academics and computer hobbyists by making it accessible to a much more general audience.

• It helped the Internet maintain its exponential growth in number of users.

Fig. 2.1: Computers connected to the Internet vs. Year (plot annotated “mosaic”)

Mosaic (Cont.)

• Andreessen started Mosaic while working for the National Center for Supercomputing Applications (NCSA) at the University of Illinois.

• Andreessen helped found Netscape Communications, which was originally called Mosaic Communications.

• Mosaic is distinct from Netscape. In fact, Mosaic is also licensed for commercial use and is provided to users by some Internet access providers.

HTML

• Browsers interpret web documents, especially HTML documents.

• HyperText Markup Language is an “authoring” scheme for creating documents for the World Wide Web.

• The World Wide Web (WWW) is the collection of resources available through HTTP to users on the Internet.

Markup

• The M in HTML stands for “Markup”.

• Markup refers to the sequence of characters (or symbols) inserted in a document to indicate how the file should look when it is printed or displayed and/or to describe the document's logical structure.

• The markup indicators are often called "tags."

Tags

• These formatting instructions must be distinguishable from the text they are in.

• In HTML, angle brackets < and > are used as delimiters to indicate the beginning and end of a tag.
– This gives <b>bold</b> type.

• As with the byte stuffing we saw in Ethernet frames (where soh and eot were special characters), angle brackets that are meant to appear as ordinary text in an HTML document must be replaced with &lt; and &gt;.
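
The replacement of angle brackets can be sketched with Python's standard html module. This is only an illustration of the escaping idea; the sample strings are made up for the example.

```python
import html

# Text that should be displayed literally, not interpreted as markup.
text = "if a < b and b > c"
escaped = html.escape(text)
print(escaped)            # if a &lt; b and b &gt; c

# A tag that should be shown rather than obeyed.
shown_tag = html.escape("<b>bold</b>")
print(shown_tag)          # &lt;b&gt;bold&lt;/b&gt;

# A browser reverses the substitution when it displays the text.
restored = html.unescape(escaped)
```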

Tags (Cont.)

• The formatting or structure the tag indicates often refers to an entire region, so many HTML tags occur in pairs (leading and trailing). The trailing tag includes a slash.

• An HTML document begins with an <HTML> tag and ends with an </HTML> tag.

• An HTML document is broken into two pieces: the head and the body.
– The head is the part between the head tags <head> and </head>.
– The body is the part between the body tags <body> and </body>.
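
To make the pairing of leading and trailing tags concrete, here is a minimal sketch that walks a small page with Python's standard html.parser module and records each tag as it opens and closes. The page text and the TagLogger class are assumptions made for the example, not material from the slides.

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Record every leading and trailing tag in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

# A tiny document with a head and a body (made up for this sketch).
page = ("<html><head><title>Demo</title></head>"
        "<body>This gives <b>bold</b> type.</body></html>")

logger = TagLogger()
logger.feed(page)
logger.close()

for kind, tag in logger.events:
    print(kind, tag)
```

The first event is the opening html tag and the last is its matching closing tag, with the head and body pairs nested inside.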

Example from W3

HTML 4.01 versions

HTML tag

Page from my site

A space

Cascading Style Sheet

• An HTML document can be written to work in conjunction with a CSS file – a cascading style sheet.

• A cascading style sheet separates out instructions about look and layout so that they can be reused – that is, referred to many times, either within the same document or by different documents.

• XSL (Extensible Stylesheet Language) is a newer style-sheet language, aimed mainly at XML documents, but CSS is still popular.

HTML (Cont.)

• There are many other tags used to format and lay out the information in a Web page.

• For instance, <P> is used to make paragraphs and <I> … </I> is used to italicize text.

• Tags are also used to specify hypertext links.
– <a href="http://www.lasalle.edu">La Salle</a>

• HTML is not the only Markup Language.

SGML

• HTML (through version 4.01) is defined as an application of SGML, Standard Generalized Markup Language, a generic system for organizing and tagging elements of a document.

• GML was started by IBM and became SGML when it was taken over by the International Organization for Standardization (ISO).

• SGML is not about formatting; it is more general. SGML provides rules for tagging elements.

• Those tags might be interpreted as formatting, as is done in HTML, but can be interpreted in other ways as well.

XML

• Extensible Markup Language

• “Extensible” means capable of being extended, and markup language involves tags, so XML is a scheme in which the user can define his or her own tags.

• For example, a company may elect to designate a social security number by placing it in tags defined for that purpose.
– <ssn>123456789</ssn>

• This data can be transported from application to application and from system to system, carrying a self-identifying tag with it.

XML (Cont.)

• Unlike HTML tags, XML tags are not necessarily about formatting and presentation.

• However, a presentation application can be instructed to represent a certain type of data (as identified by its XML tags) in a particular way.

• On the other hand, a database interface program can be instructed to place the information into the appropriate field.
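
The two uses described above (presentation versus database loading) can be sketched with Python's standard xml.etree.ElementTree module. The <ssn> tag comes from the earlier slide; the enclosing <employee> record, the <name> tag, and both helper functions are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; only the <ssn> tag appears in the slides.
record = ET.fromstring(
    "<employee><name>Pat Doe</name><ssn>123456789</ssn></employee>"
)

def present(elem):
    """Presentation application: choose a display format from the tag."""
    if elem.tag == "ssn":
        digits = elem.text
        return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"
    return elem.text

def to_fields(rec):
    """Database interface: map each tag name to a field of the same name."""
    return {child.tag: child.text for child in rec}

shown = present(record.find("ssn"))
fields = to_fields(record)
print(shown)    # 123-45-6789
print(fields)   # {'name': 'Pat Doe', 'ssn': '123456789'}
```

The same tagged data is consumed twice: once for display and once as named fields, which is the sense in which the tag is self-identifying.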

XHTML

• Extensible Hypertext Markup Language is a mixture of HTML and XML designed for network display devices.

• XHTML is written in XML; therefore, it is an XML application.

CFML

• ColdFusion Markup Language is a markup language developed by Allaire (which has since merged with Macromedia) for use with ColdFusion, a product that helps web pages work with databases.
– CFML is proprietary.

• ColdFusion tags are placed in HTML files.
– The HTML tags determine the page's look (layout).
– The CFML tags bring in information (content) that results from user input and/or database queries.

• Files created with CFML have the file extension .cfm

DOM

• The Document Object Model is a set of rules about how the objects on a web page (for instance, text, images, textboxes, buttons) look and function.

• The DOM specifies the properties of each object as well as the events associated with it.
– A button’s properties are its height, width, position, color, etc.
– A button’s events are click, right click, mouse down, etc.

• Dynamic HTML (DHTML) uses the DOM to change the appearance of Web pages after they have been downloaded (that is, on the client side).

DOM (Cont.)

• Alas, Netscape Navigator and Microsoft Internet Explorer use different DOMs.

• This is why their implementations of DHTML are so different.

• Both companies have submitted their DOMs to the World Wide Web Consortium (W3C) for standardization.

HTTP

• HTML and other web documents are sent across the network using HTTP (Hypertext Transfer Protocol), which was originally developed by Dr. Tim Berners-Lee.
– It was developed while he worked at CERN, a center for particle physics, so that scientists from all over the world could share information.

• HTTP defines rules for how messages are formatted and transmitted, what actions are allowed by Web servers, what actions are allowed by clients, etc.

HTTP message
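
The slide image is not reproduced in this transcript. As a stand-in, the following sketch shows roughly what HTTP messages look like on the wire: a request line or status line, header lines, a blank line, then an optional body. The helper names and the sample response are assumptions made for the example.

```python
def build_request(host, path):
    """Return a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, resource, version
        f"Host: {host}",         # header required by HTTP/1.1
        "Connection: close",
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_response(raw):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body

request = build_request("www.lasalle.edu", "/it/fake.htm")

# A made-up response of the kind a server might send back.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 13\r\n"
          b"\r\n"
          b"<html></html>")
status, headers, body = parse_response(sample)
print(request.decode("ascii"))
print(status, headers, body)
```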

HTTP

• A Web server has an HTTP daemon that waits for HTTP requests and handles them when they arrive.

• A Web browser is an HTTP client, sending requests to server machines.

• For example, entering a URL in the location field of a browser (client) sends an HTTP request to the appropriate Web server, which responds with the page.
– Of course, if a domain name is entered, the browser may have to go to a DNS server first.
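
The daemon-and-client arrangement can be tried on a single machine. The sketch below, which uses only Python's standard http.server and http.client modules, starts a tiny server on the loopback address and then plays the role of the browser. The page contents, handler name and path are made up, and a real web server is of course far more elaborate.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello from the server</body></html>"

class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the same small page.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet

# The "daemon": port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), PageHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client: connect, send a GET request, read the response.
conn = HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/it/fake.htm")
response = conn.getresponse()
status, body = response.status, response.read()
conn.close()

server.shutdown()
server.server_close()
print(status, body)
```

No DNS lookup happens here because the client is given a numeric address directly.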

HTTP

• HTTP is a stateless protocol. Each command is executed without knowing anything about any preceding commands.
– This is good for keeping transmission lines available, since there are no ongoing sessions tying up resources.
– This is bad for having a web site respond in an intelligent way to a user.

• This problem of HTTP is addressed in a number of ways, including ActiveX, Java, JavaScript and cookies.
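
Of the approaches listed, cookies are the easiest to sketch. The example below uses Python's standard http.cookies module to show how a server could hand a client an identifier and recognize it on a later request. The cookie name, its value and the visits dictionary are hypothetical.

```python
from http.cookies import SimpleCookie

# Server side, first response: attach an identifier to the reply.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)   # Set-Cookie: session_id=abc123

# Client side, later request: the browser sends the value back.
cookie_header = "session_id=abc123"

# Server side, later request: recover the identifier and look up state.
incoming = SimpleCookie()
incoming.load(cookie_header)
session_id = incoming["session_id"].value

visits = {}  # hypothetical server-side store keyed by session
visits[session_id] = visits.get(session_id, 0) + 1
print(session_id, visits[session_id])
```

The protocol itself stays stateless; the memory lives in the server's own store, indexed by the value the client returns.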

HTTP 1.1

• Most modern browsers support HTTP 1.1.

• Instead of opening and closing a connection for each application request, HTTP 1.1 provides a persistent connection that allows multiple requests to be batched or pipelined to an output buffer.

• The underlying TCP layer can put multiple requests (and responses to requests) into one TCP segment.

• Fewer segments, less overhead.
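
A persistent connection can be observed with a small loopback experiment. In this sketch (standard library only, with made-up paths and handler name) the server records the client's TCP port for each request; two requests sent over one HTTP 1.1 connection arrive from the same port, which indicates that no second connection was opened. It shows sequential reuse of a connection, not pipelining.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

client_ports = []  # TCP source port the server sees for each request

class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows the connection to stay open

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
for path in ("/page.htm", "/logo.gif"):
    conn.request("GET", path)
    conn.getresponse().read()  # finish one response before the next request
conn.close()

server.shutdown()
server.server_close()
print(len(client_ports), "requests over", len(set(client_ports)), "connection(s)")
```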

HTTP 1.1 (Cont.)

• Compression: If a browser (client) indicates that it can decompress HTML files (via its Accept-Encoding request header), then a server can compress them for transport across the Internet.

• Standard image files are already in a compressed format, so this improvement applies only to HTML and other non-image data types.
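
The negotiation can be sketched as follows, using Python's standard gzip module. The encode_body helper and the sample page are assumptions for illustration; a real server would also set a Content-Encoding response header.

```python
import gzip

# A repetitive HTML page, which compresses well (made up for this sketch).
html_page = (
    "<html><body>" + "<p>Hello, hypertext!</p>" * 200 + "</body></html>"
).encode("utf-8")

def encode_body(body, accept_encoding):
    """Compress the body only if the client advertised gzip support."""
    offered = [token.split(";")[0].strip() for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), "gzip"
    return body, "identity"

sent, encoding = encode_body(html_page, "gzip, deflate")
plain, plain_encoding = encode_body(html_page, "")
print(len(html_page), len(sent), encoding)

# What the browser does on receipt.
restored = gzip.decompress(sent)
```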

sHTTP

• Secure HTTP (S-HTTP) is an extension to the HTTP protocol for sending data securely over the Web.

• Not all browsers and servers support S-HTTP.

• Another technology for secure communications over the Web is Secure Sockets Layer (SSL).

• SSL and S-HTTP have different designs and goals. SSL is designed to establish a secure connection between two computers, whereas S-HTTP is designed to send individual messages securely.

Cache

• To increase speed, browsers cache web page documents locally.

• There are also cache servers, machines on the local network that cache web page documents.

• First, the page is looked for on the local machine, then on the local network (cache server) and then at the remote location.
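
The lookup order can be modelled with three dictionaries standing in for the browser cache, the cache server and the remote web server. This is a simplified sketch under that assumption: real caches also check whether a stored copy is still fresh.

```python
def fetch(url, browser_cache, proxy_cache, origin):
    """Return (page, source), trying the nearest copy first."""
    if url in browser_cache:
        return browser_cache[url], "local machine"
    if url in proxy_cache:
        page = proxy_cache[url]
        browser_cache[url] = page      # keep a local copy for next time
        return page, "cache server"
    page = origin[url]                 # fall back to the remote location
    proxy_cache[url] = page
    browser_cache[url] = page
    return page, "remote location"

origin = {"http://www.lasalle.edu/": "<html>La Salle</html>"}
proxy_cache = {}
my_browser, other_browser = {}, {}

first = fetch("http://www.lasalle.edu/", my_browser, proxy_cache, origin)
second = fetch("http://www.lasalle.edu/", my_browser, proxy_cache, origin)
neighbor = fetch("http://www.lasalle.edu/", other_browser, proxy_cache, origin)
print(first[1], second[1], neighbor[1], sep=" / ")
```

The second request from the same machine is served locally, and a first request from a neighboring machine on the same network is served by the cache server.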

Refresh if you don’t want the cached version

FTP

Based on Computer Networks and Internets, Comer

FTP

• File Transfer Protocol is a set of rules for moving files around on an internet or the Internet.

• One common use of FTP is to move web-page files from the computer on which they were created to the web server where they are accessible to people on the web.

• Another common use is to download programs and other files to one’s computer.

• One can also download files using HTTP, but FTP can be faster.

Versions

• There is a command-line version of FTP.
– This is a fairly standard utility, but the user must know a set of commands to use it.
– A user can put a file into a directory at a remote location or get a file from there.

• There is also a GUI version.
– This version is easier to operate (with its listboxes, scrollbars and buttons).
– But it must be downloaded.

• One can also use a browser to get files from FTP sites.

Access and Capability

• Access to FTP services typically requires authenticating the user (username and password).

• In such cases, the user can typically delete, rename and move files, in addition to copying them.

• Anonymous FTP does not authenticate a user but allows the user to do less; typically one can only get files.
– It is used as a means to distribute files.

Anonymous FTP

• In anonymous FTP, one enters "anonymous" for the username.

• The password may not matter, the server may request an email address, or, in older versions, the password may be “guest”.

• This is a way of giving the public access to a server so that files can be downloaded.

Data and Control

• Local machine must have an FTP client.

• Remote machine must have an FTP server.

• Transferring a file using FTP actually involves two connections.

• An FTP daemon listens at TCP port 21.
– (UDP has its own set of ports.)

• Port 21 is for initiating a control connection.

Fig.: Data and Control connections

Data and Control

• Before a transfer, the client sends a control message (the PORT command) that includes the port number at which the client expects to receive data.

• The server’s port 20 initiates a data connection to that port on the client.

• The control connection indicates what files will be transferred in which direction; the actual transferring takes place on the data connection.

• There is one control connection during an FTP session, but each data connection closes when its transfer is complete; thus an FTP session may have several data connections.
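
The control message that announces the client's data port is the FTP PORT command, whose argument lists the four bytes of the client's IP address followed by the port split into a high and a low byte. A minimal sketch of that encoding follows; the address and port number are invented for the example.

```python
def port_command(ip, port):
    """Format the PORT command telling the server where the client listens."""
    high, low = divmod(port, 256)   # the port travels as two bytes
    return "PORT {},{},{}\r\n".format(ip.replace(".", ","), high, low)

def parse_port_argument(arg):
    """Inverse operation: recover (ip, port) from 'h1,h2,h3,h4,p1,p2'."""
    fields = [int(x) for x in arg.split(",")]
    ip = ".".join(str(x) for x in fields[:4])
    return ip, fields[4] * 256 + fields[5]

command = port_command("192.168.1.10", 5001)
print(repr(command))   # 'PORT 192,168,1,10,19,137\r\n'

# The server decodes the argument and connects from its port 20.
target = parse_port_argument("192,168,1,10,19,137")
print(target)          # ('192.168.1.10', 5001)
```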

FTP Client and Server

Command-line FTP: Start/Run/cmd

For older operating systems, use command instead of cmd.

Command-line FTP: ftp <domain name>

Enter the username and password; the password is not echoed to the screen.

Command-line FTP: ls

Shows the contents of the current directory (folder) on the remote machine

Command-line FTP: cd <directory name>

Moves one into the specified folder on the remote machine

Command-line FTP: wildcard

* is the wildcard. Here it stands in for anything that might follow, so we are listing any files that begin with f.

Command-line FTP: wildcard

* is the wildcard. Here it stands in for anything that might precede, so we are listing any files that end with .jpg.
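
The two wildcard patterns can be tried locally with Python's standard fnmatch module. In a real FTP session the matching is done as part of the listing; the file names below are invented for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical contents of a remote directory.
remote_files = ["fake.htm", "figure1.jpg", "index.htm", "photo.jpg", "notes.txt"]

# "f*": an f followed by anything.
starts_with_f = [name for name in remote_files if fnmatchcase(name, "f*")]

# "*.jpg": anything followed by .jpg.
ends_with_jpg = [name for name in remote_files if fnmatchcase(name, "*.jpg")]

print(starts_with_f)   # ['fake.htm', 'figure1.jpg']
print(ends_with_jpg)   # ['figure1.jpg', 'photo.jpg']
```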

Command-line FTP: get <filename>

Transfers a copy of a remote file to the local machine

Overwriting

• Most versions of FTP simply overwrite a file of the same name when one uses the get or put commands.

• Unlike many applications, FTP will not warn the user that he or she is about to overwrite a file.

Command-line FTP: put <filename>

Places a copy of a local file onto a remote computer

Command-line FTP: binary

By default, get and put assume files are in ASCII; the binary command switches the mode to binary for transferring other types of files.

While the first get looks like it worked, the PowerPoint file could not be opened; the second get, made in binary mode, provided a usable ppt file.

Command-line FTP: ascii

Puts FTP back into ASCII mode

Htm file transferred in ASCII mode

Htm file transferred in Binary mode

“Returns” in the original document can be lost or replaced with unprintable characters
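
The difference between the two modes comes down to line endings. In ASCII mode the sender rewrites its own newline as carriage return plus line feed for the wire, and the receiver rewrites that as its own newline; in binary mode the bytes are copied untouched. The sketch below emulates a transfer from a Unix-style server to a Windows-style client. The function names and the sample bytes are assumptions for illustration.

```python
UNIX, WINDOWS = b"\n", b"\r\n"

def ascii_transfer(data, sender_newline, receiver_newline):
    """Emulate ASCII mode: newlines are translated at both ends."""
    on_wire = data.replace(sender_newline, b"\r\n")
    return on_wire.replace(b"\r\n", receiver_newline)

def binary_transfer(data):
    """Emulate binary mode: every byte arrives exactly as sent."""
    return data

text = b"<html>\n<body>\n</body>\n</html>\n"
blob = bytes([0x50, 0x4B, 0x0A, 0x00, 0x0D, 0x0A, 0xFF])  # arbitrary non-text bytes

# Text in ASCII mode: each line ending becomes the client's own.
text_ascii = ascii_transfer(text, UNIX, WINDOWS)

# Text in binary mode: the bare line feeds survive, so a Windows editor
# that expects CR LF may not show the "returns" as line breaks.
text_binary = binary_transfer(text)

# Non-text data in ASCII mode: bytes that merely look like newlines are
# rewritten, so the file changes size and no longer opens.
blob_ascii = ascii_transfer(blob, UNIX, WINDOWS)
blob_binary = binary_transfer(blob)

print(len(blob), len(blob_ascii), blob_binary == blob)
```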

FTP commands

A GUI version: ws_ftp le

ws_ftp le

Le: limited edition

Establishing a session

Startup

Remote folder you want to start in; you must have permission. This doesn’t always work, because the FTP server may not allow you to specify a folder.

Local folder to start in

The well known port for FTP control is 21

ws_ftp le

Local file directory

ws_ftp le

Remote file directory

ws_ftp le

Modes: ASCII or Binary

ws_ftp le

get put

Can also rename files, delete files and refresh the directory

An FTP site: FTP service using a browser

ftp (not http) as the protocol

Passive FTP

• Passive FTP is a more secure form of data transfer in which the flow of data is set up and initiated by the File Transfer Protocol (FTP) client rather than by the FTP server program.

• FTP client programs sometimes allow the user to select passive FTP.

• Most Web browsers (which act as FTP clients) use passive FTP by default.

Passive FTP (Cont.)

• Recall that FTP consists of two connections. In normal FTP, the client initiates the control connection, but the server establishes the data connection.

• Some networks have firewalls that allow only connections initiated from within; this would rule out the data connection of a normal FTP session.

“Normal” vs Passive FTP

• Normal: The client initiates the control connection and gives a port number to the server, which then initiates the data connection.

• Passive: The client initiates the control connection and asks the server to return, over the control connection, the port it intends to use for data; the client then initiates a data connection to the port number supplied by the server.
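
In passive mode the server's answer on the control connection is a reply with code 227 that carries its address and data port in the same six-number form. Here is a hedged sketch of how a client might decode it; the reply text, address and port are invented for the example.

```python
import re

def parse_pasv_reply(reply):
    """Extract (ip, port) from '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)'."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None or not reply.startswith("227"):
        raise ValueError("not a passive-mode reply: " + reply)
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

ip, port = parse_pasv_reply("227 Entering Passive Mode (192,168,1,20,195,80)")
print(ip, port)   # 192.168.1.20 50000

# The client would now open the data connection itself, to (ip, port),
# so both connections are initiated from inside the client's firewall.
```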

TFTP

• Trivial File Transfer Protocol is a simple version of FTP, but TFTP uses the User Datagram Protocol (UDP) instead of TCP.
– It is simpler, faster and requires less code.
– But it is less capable and less secure.

• It is used where user authentication and directory visibility are not required.

• It is often used by servers to boot diskless workstations, X-terminals, and routers.
– Diskless workstations need operating systems too.
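
Part of what makes TFTP simple is its packet format. A read request, for instance, is a single UDP datagram holding a two-byte opcode, the file name and the transfer mode, with no login step. The sketch below builds such a packet following the layout in RFC 1350; the file name is hypothetical and nothing is actually sent.

```python
import struct

OP_RRQ = 1  # opcode for a read request

def build_rrq(filename, mode="octet"):
    """Opcode (2 bytes), then filename and mode, each ended by a zero byte."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")

packet = build_rrq("boot.img")
print(packet)   # b'\x00\x01boot.img\x00octet\x00'

# A diskless workstation would send this datagram to UDP port 69 on the
# boot server and then receive the file in numbered data blocks.
```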
