2010 1 The Application Layer WWW Chapter 7

Post on 21-Dec-2015


The Application Layer: WWW

Chapter 7

WWW: HTTP

HyperText Transfer Protocol: transfers pages between a client and a server.
Stateless: the server maintains no information about the client.

•Communication consists of pairs of 1 request and 1 response
•Originally 1 TCP connection was established and closed per pair
•Now several pairs share one connection (a persistent connection)
•This gives less overhead and better settings of TCP's self-learning parameters
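A persistent connection can be tried out locally. The sketch below is only an illustration under stated assumptions: the function name `fetch_over_one_connection` and the tiny loopback server are hypothetical stand-ins for a real browser and web server. It sends several request/response pairs over a single TCP connection and counts how many distinct client ports the server saw.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_over_one_connection(paths):
    """Fetch every path over ONE TCP connection to a throwaway local server."""
    client_ports = set()  # one entry per TCP connection the server has seen

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps the connection open by default

        def do_GET(self):
            client_ports.add(self.client_address[1])
            body = self.path.encode("ascii")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            # Content-Length tells the client where this response ends,
            # which is what allows the next response on the same connection.
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        bodies = []
        for path in paths:
            connection.request("GET", path)
            bodies.append(connection.getresponse().read().decode("ascii"))
    finally:
        connection.close()
        server.shutdown()
        server.server_close()
    return bodies, len(client_ports)
```

Calling `fetch_over_one_connection(["/index.html", "/a.gif", "/b.gif"])` returns the three bodies together with the number of connections used, which should be 1 here.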

HTTP request message

Requests are in ASCII text, e.g.

GET /somedir/page.html HTTP/1.1
Host: www.cucg.gh          (the server name; there can be more than 1 per IP address)
Connection: close
User-agent: Mozilla/4.0
Accept-language: fr
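As a minimal sketch of how such a request is put together, the hypothetical helper `build_request` below (not part of any library) assembles the request line and the header lines. It assumes a GET request without a body; each line ends with CRLF and an empty line closes the header section.

```python
def build_request(path, host, headers=None):
    """Build the bytes of a simple HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines are separated by CRLF; the extra CRLF pair ends the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "/somedir/page.html",
    "www.cucg.gh",
    {"Connection: close".split(": ")[0]: "close",
     "User-agent": "Mozilla/4.0",
     "Accept-language": "fr"},
)
```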

HTTP Methods

•Conditional GET: a GET with an If-Modified-Since header
•PUT, POST and DELETE allow changing a site using HTTP
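The server-side decision behind a conditional GET can be sketched as follows. The function name `conditional_get_status` is hypothetical, and the sketch assumes `last_modified` is a timezone-aware datetime: the server answers 304 (Not Modified) when the page has not changed since the date the client sent, and 200 with the full page otherwise.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None):
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # plain GET: always send the page
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparsable date: treat the request as a plain GET
    # HTTP dates have one-second resolution, so drop the microseconds.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200


page_changed = datetime(2015, 12, 21, 10, 0, 0, tzinfo=timezone.utc)
header_value = format_datetime(page_changed, usegmt=True)  # value for If-Modified-Since
```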

HTTP Message Headers

The Accept headers tell the server what the client is willing to accept, in case the client has a limited repertoire of what it can handle. They also allow the server to send back a page in a certain language, if it has a choice.
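How a server might use an Accept-language header to pick a page can be sketched like this. Both function names are hypothetical, and the sketch makes simplifying assumptions: tags are matched exactly (no "fr-CH" to "fr" fallback) and a malformed q-value is treated as 0.

```python
def parse_accept_language(header):
    """Return the language tags from the header, most preferred first."""
    weighted = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # the default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        weighted.append((quality, tag))
    # sorted() is stable, so tags with equal weight keep their header order.
    ordered = sorted(weighted, key=lambda pair: -pair[0])
    return [tag for quality, tag in ordered if quality > 0]


def choose_language(header, available, default):
    """Pick the first acceptable language the server has, else the default."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
        if tag == "*":
            return default
    return default
```

For the request header `Accept-language: fr` and a server that has the page in English and French, `choose_language("fr", {"en", "fr"}, "en")` selects the French version.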

Browser

•A web page may contain HTML code, images in GIF or JPEG format, sound in MP3 format, video in MPEG format, documents in PDF, MSWord or other formats, or information in many other formats.
•Some are handled directly by a browser.
•Some by a plug-in, a code module that the browser fetches from disk and installs as an extension to itself.
•For others the browser starts up a helper application as a separate process.

Client side actions

Suppose a user clicks in a browser on “http://www.cs.ru.nl/~ths/index.html”. The steps that then occur are:

1. The browser determines the URL (by seeing what was selected)
2. The browser asks DNS for the IP address of www.cs.ru.nl
3. DNS answers with the IP address
4. The browser makes a TCP connection to that address on port 80
5. It then sends a GET /~ths/index.html command
6. The www.cs.ru.nl server sends the file index.html
7. The TCP connection is released
8. The browser displays all the text in index.html
9. The browser fetches all images indicated in index.html, by establishing a TCP connection for each of them, and displays them.
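Steps 1 to 7 above can be sketched with plain sockets. This is an illustration only, not how a real browser is written: the names `plan_fetch` and `fetch` are hypothetical, query strings are ignored, and a `Host` header and `Connection: close` are added because HTTP/1.1 servers expect them.

```python
import socket
from urllib.parse import urlsplit


def plan_fetch(url):
    """Steps 1 and 5: split the URL and prepare the GET request (no network use)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # default HTTP port
    path = parts.path or "/"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")


def fetch(url):
    """Steps 2 to 7: resolve the name, connect, send the request, read the reply."""
    host, port, request = plan_fetch(url)
    address = socket.gethostbyname(host)                 # steps 2-3: ask DNS
    with socket.create_connection((address, port), timeout=10) as sock:  # step 4
        sock.sendall(request)                            # step 5
        chunks = []
        while True:                                      # step 6: read until the
            chunk = sock.recv(4096)                      # server closes the connection
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)                              # step 7: socket is closed here
```

`fetch` needs network access; `plan_fetch` can be inspected offline to see exactly which bytes would be sent.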

URLs – Uniform Resource Locators

A URL consists of 3 parts: a protocol, the DNS name of the host, and the file name.
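The three parts can be pulled apart with Python's standard `urllib.parse` module. The helper name `url_parts` is hypothetical and only wraps `urlsplit` for illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return (protocol, DNS name of the host, file name) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path
```

For example, `url_parts("http://www.cs.ru.nl/~ths/index.html")` gives `("http", "www.cs.ru.nl", "/~ths/index.html")`.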

Server side actions

The server performs the following steps in its main loop:

1. Accept a TCP connection from a client.
2. Resolve the name of the page requested.
3. Authenticate the client if needed.
4. Perform access control on the client: can the requested page be sent given the client's identity and location?
5. Perform access control on the web page: some pages may only be sent to clients in particular domains, e.g. inside the company.
6. Check the cache to see if the page is there, otherwise get it from disk.
7. Determine the MIME type and include it in the header of the reply.
8. Other possible tasks, like building a user profile, gathering statistics or making an entry in a logfile.
9. Return a reply, either the requested file or error information.
10. Release the TCP connection.
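A few of these steps can be sketched in code. The function `serve_page` below is hypothetical and heavily simplified: it covers name resolution, a crude access check, the cache lookup, the MIME type and the reply (steps 2, 4 to 7 and 9), and leaves out the TCP handling, authentication and logging. The cache is assumed to be an ordinary dictionary.

```python
import mimetypes
from pathlib import Path


def serve_page(doc_root, url_path, cache):
    """Return (status, headers, body) for one requested page."""
    root = Path(doc_root).resolve()
    # Step 2: resolve the requested name; an empty path means index.html.
    relative = url_path.lstrip("/") or "index.html"
    target = (root / relative).resolve()
    # Steps 4-5 (simplified): refuse anything outside the document root.
    if root not in target.parents:
        return 403, {}, b"Forbidden"
    # Step 6: look in the cache first, otherwise read the file from disk.
    body = cache.get(target)
    if body is None:
        if not target.is_file():
            return 404, {}, b"Not Found"
        body = target.read_bytes()
        cache[target] = body
    # Step 7: determine the MIME type from the file name.
    mime_type, _ = mimetypes.guess_type(target.name)
    headers = {
        "Content-Type": mime_type or "application/octet-stream",
        "Content-Length": str(len(body)),
    }
    # Step 9: the reply is the requested file (or the error information above).
    return 200, headers, body
```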

Statelessness and Cookies

•For newer applications the server likes to know more about the user requesting pages, e.g. to keep information between requests.
•IP addresses are not suitable for that, because of dynamic IP addresses and NAT, and because there may be more than one user on a computer.
•When a client requests a page, the server may send in the reply header a “cookie”: a small text string of at most 4 KB.
•Browsers may accept it. When the browser later sends a request, it checks whether it has cookies for the domain the request is for. If so, it includes them in the request so the server can use them.
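Both directions of the cookie exchange can be sketched with Python's standard `http.cookies` module. The helper names, the cookie name `customer` and the domain `example.com` are assumptions made for this illustration.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value, domain):
    """Server side: build the value of a Set-Cookie reply header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["domain"] = domain  # the browser returns it only to this domain
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


def read_cookies(cookie_header):
    """Server side: parse the Cookie header of a later request into a dict."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return {name: morsel.value for name, morsel in cookie.items()}
```

The server would send `make_set_cookie("customer", "497793521", "example.com")` in its reply, and on a later request recover the value with `read_cookies("customer=497793521")`.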

HTML – HyperText Markup Language

The designer of a web page can indicate how the page can best be displayed, but the client can override these settings, in contrast to Adobe Acrobat and Flash.

HTML is an application of SGML (Standard Generalized Markup Language); XHTML uses XML (Extensible Markup Language).

By embedding the markup commands within each HTML file, a browser may reformat any web page. A web page can be shown full screen on a 1024 x 768 display with 24-bit color but also in a small window on a 640 x 480 screen with 8-bit color.

Cascading style sheets

HTML is constantly changing. Version 1.0 was the de facto standard used in the Mosaic browser. When new browsers came along, version 2.0 became an Internet standard. Version 3.0 added many new features, including tables, toolbars and cascading style sheets.

Style sheets give page designers more control over the desired appearance of pages in browsers. The semantics of a text are defined in the HTML file, while a style sheet defines the appearance:

h1 { color: #FF0000; }
h2 { color: #0000FF; }
body { color: #000000; background: #ffffff; }
.red { color: #FF0000; }

e.g. <p class="red">…</p>

HTML 4.01 is now the current version.

XML

XML (eXtensible Markup Language) describes Web content in a structured way. On the left a structure called book_list, a list of books, each having 3 fields, is defined.

The structure could have repeated fields (e.g. multiple authors), optional fields (e.g. title of included CD-ROM) and alternative fields.
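Since the figure itself is not reproduced in this transcript, the sketch below uses an assumed layout for book_list: the element names `book`, `title`, `author`, `year` and `cd_title`, and all the sample values, are hypothetical. It shows how a program can read such a structure, including a repeated field and an optional field.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element names are assumptions.
BOOK_LIST = """<book_list>
  <book>
    <title>Example Title A</title>
    <author>First Author</author>
    <author>Second Author</author>
    <year>2001</year>
  </book>
  <book>
    <title>Example Title B</title>
    <author>Third Author</author>
    <year>2005</year>
    <cd_title>Example CD</cd_title>
  </book>
</book_list>"""


def read_books(xml_text):
    """Turn a book_list document into a list of dictionaries."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            "title": book.findtext("title"),
            # Repeated field: collect every <author> element.
            "authors": [author.text for author in book.findall("author")],
            "year": book.findtext("year"),
            # Optional field: findtext returns None when the element is absent.
            "cd_title": book.findtext("cd_title"),
        })
    return books
```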

XSL

How the XML page is to be formatted and displayed on a screen is determined by an XSL (eXtensible Style Language) file. It looks like HTML but has stricter syntax requirements; a browser should reject it if, for instance, a closing tag like </th> is missing.

XSL commands are given with an xsl tag, like <xsl:xxxx>. The for-each command iterates over the given structure, the list of books.

XHTML (X from eXtensible) is essentially HTML 4 reformulated in XML. It needs an XSL file to provide display meaning to its tags. Strict conformance to the syntax is required: closing tags, tags and attributes in lower case, attributes in quotation marks and proper nesting of tags.

Forms for interaction

On the server the CGI (Common Gateway Interface) starts the script (or program) 'query' with the string after the ? as its parameter. The script does its work, e.g. searching a database, and returns its result as an HTML page.

Input is returned in a string added to the URL:
http://www.ru.nl/cgi-bin/query?name=jan&city=a…
A + indicates a space, a %2B indicates a typed-in +, etc.
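The encoding rules and the work of a script like 'query' can be sketched as follows. The function names and the sample field values are hypothetical; the sketch assumes the usual CGI convention that the string after the ? reaches the script in the `QUERY_STRING` environment variable, and it returns the fields as a small HTML list instead of searching a database.

```python
import html
from urllib.parse import parse_qs, urlencode


def encode_form(fields):
    """Browser side: turn form fields into the string placed after the ?."""
    return urlencode(fields)


def decode_query(query):
    """Server side: recover the form fields (first value of each name)."""
    return {name: values[0] for name, values in parse_qs(query).items()}


def query_script(environ):
    """A stand-in for the 'query' script: echo the fields as an HTML page."""
    fields = decode_query(environ.get("QUERY_STRING", ""))
    rows = "".join(
        f"<li>{html.escape(name)}: {html.escape(value)}</li>"
        for name, value in fields.items()
    )
    return f"<html><body><ul>{rows}</ul></body></html>"
```

For example, `encode_form({"name": "jan de vries", "sum": "1+1"})` yields `name=jan+de+vries&sum=1%2B1`: the spaces become + and the typed-in + becomes %2B.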

Server-side Dynamic Web Pages

Another way to generate dynamic content is to embed little scripts inside HTML pages, to be executed by the server to generate the page. A popular language for this is PHP (PHP: Hypertext Preprocessor). To use it the server has to understand PHP; pages containing PHP usually have the file extension 'php' rather than 'html' or 'htm'.

JSP (Java Server Pages) is similar to PHP, except that the dynamic part is written in the Java programming language.

ASP (Active Server Pages) is Microsoft's version, using Visual Basic Script for generating the dynamic content.

Server-side Dynamic Web Pages with PHP

The PHP commands are included in the HTML tag <?php ... ?>.

On the top is a form with 2 entry fields. Below it is the 'action.php' file with the PHP commands. They have access to the information filled in on the form through the names of the fields, e.g. $age. They produce a text string which is included in the output sent to the client.

PHP is a powerful programming language oriented towards interfacing between the Web and a server database. It is open source and freely available, and specially designed to work well with Apache, which is also open source and is the world's most widely used Web server.

Client-Side Dynamic Web Pages

Here a program contained in a web page is executed by the browser and the result is displayed. No information is sent to the server.

JavaScript, a scripting language very loosely inspired by some ideas from Java, can be used for this. It is a full-blown programming language, with variables, strings, arrays, objects, functions, and all the usual control structures.

Another way to make web pages highly interactive is through the use of applets. These are small Java programs embedded with the 'applet' tag and executed by a Java Virtual Machine. As they are interpreted, the interpreter can prevent them from doing Bad Things. In theory at least; in practice many bugs were found.

Microsoft's answer to Sun's applets was to allow web pages to hold ActiveX controls. They are faster than applets, but only run on Windows machines.

Client-Side JavaScript

JavaScript has the ability to manage windows and frames, set and get cookies, deal with forms and handle hyperlinks.

As these things are rather internal to browsers, and often different for different browsers and versions, it is difficult to write JavaScript programs which work correctly for all browsers, versions and platforms.

It can also track mouse movements and actions. For example, when the mouse is over a link, a window with a certain image can be displayed.

It is embedded in an HTML page using the 'script' tag or inline at certain locations.

Client-Server overview

Cascading Style Sheets are part of HTML.
Plug-ins or helpers can display other contents, such as ps, pdf, video, sound and images, e.g. SVG (scalable vector graphics).
CGI scripts can be in various languages: Perl, Python, C, etc.