HTTP CS587x Lecture Department of Computer Science Iowa State University

Upload: rodney-king

Post on 19-Jan-2016


Page 1: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP

CS587x Lecture

Department of Computer Science

Iowa State University

Page 2: HTTP CS587x Lecture Department of Computer Science Iowa State University

What to Cover

- WWW
- HTTP/1.0
  - Protocol highlights
  - Problems
- HTTP/1.1
  - Highlights of improvement

Page 3: HTTP CS587x Lecture Department of Computer Science Iowa State University

World Wide Web (WWW)

- Core components
  - Servers: store files and execute remote commands
  - Browsers (i.e., clients): retrieve and display “pages” of content linked by hypertext
  - Networks: send information back and forth upon request
- Problems
  - How to identify an object
  - How to retrieve an object
  - How to interpret an object

Page 4: HTTP CS587x Lecture Department of Computer Science Iowa State University

Semantic Parts of WWW

- URI (Uniform Resource Identifier)
  - protocol://hostname:port/directory/object
    - http://www.cs.iastate.edu/index.html
    - ftp://popeye.cs.iastate.edu/welcome.txt
    - https://finance.yahoo.com/q/cq?s=ibm&d=v1
  - Implementation: extend the hierarchical namespace to include
    - anything in a file system
    - server-side processing
- HTTP (Hypertext Transfer Protocol)
  - An application protocol for sending and receiving information
- HTML (Hypertext Markup Language)
  - A language specification used to interpret the information received from the server
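The URI layout above can be taken apart with the standard `java.net.URI` class. This is only a minimal sketch; the class name `UriParts` and the method `describe` are hypothetical names chosen for illustration.

```java
import java.net.URI;

class UriParts {
    // Splits a URI into the pieces named on the slide:
    // protocol, hostname, port, path (directory/object) and query string.
    static String describe(String text) {
        URI uri = URI.create(text);
        int port = uri.getPort();   // -1 when the URI does not state a port
        return "protocol=" + uri.getScheme()
             + " host=" + uri.getHost()
             + " port=" + (port == -1 ? "default" : String.valueOf(port))
             + " path=" + uri.getPath()
             + " query=" + uri.getQuery();
    }
}
```

Calling `UriParts.describe` on the third example URI reports `https` as the protocol, `finance.yahoo.com` as the host, `/q/cq` as the path, and `s=ibm&d=v1` as the server-side processing input.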

Page 5: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Properties

- Request-response exchange
  - Server runs over TCP, port 80
  - Client sends HTTP requests and gets responses from the server
  - Synchronous request/reply protocol
- Stateless
  - No state is maintained by clients or servers across requests and responses
  - Each request/response pair is treated as an independent message exchange
- Resource metadata
  - Information about resources is often included in web transfers and can be used in several ways

Page 6: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Commands

- GET: transfer the resource from the given URL
- HEAD: get resource metadata (headers) only
- PUT: store or modify a resource under the given URL
- DELETE: remove the resource
- POST: provide input for a process identified by the given URL (usually used to post CGI parameters)

Page 7: HTTP CS587x Lecture Department of Computer Science Iowa State University

Response Codes of HTTP 1.0

- 2xx: success
- 3xx: redirection
- 4xx: client error in request
- 5xx: server error; can’t satisfy the request
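Because the class of a response is carried by the first digit of the status code, a client can branch on it with integer division. A small sketch, with `StatusClass` and `describe` as hypothetical names:

```java
class StatusClass {
    // Maps a three-digit status code to the class listed on the slide.
    static String describe(int code) {
        switch (code / 100) {
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";   // classes not listed on the slide
        }
    }
}
```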

Page 8: HTTP CS587x Lecture Department of Computer Science Iowa State University

Steps of Processing an HTTP Request

http://www.cs.iastate.edu/index.html

The client
1. Contacts its local DNS to find out the IP address of www.cs.iastate.edu
2. Initiates a TCP connection on port 80
3. Sends the GET request via the established socket
   GET /index.html HTTP/1.0

The server
4. Sends its response containing the required file
5. Tells TCP to terminate the connection

The browser
6. Parses the file and displays it accordingly
7. Repeats the same steps for any embedded objects
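Steps 1 to 5 can be sketched from the client's side with plain sockets. This is an illustration under assumptions, not the lecture's own client: `SimpleHttpClient`, `buildGetRequest` and `fetch` are hypothetical names, and the sketch does no parsing or error handling beyond propagating `IOException`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class SimpleHttpClient {
    // Step 3: the request line followed by the blank line that ends the headers.
    static String buildGetRequest(String path) {
        return "GET " + path + " HTTP/1.0\r\n\r\n";
    }

    // Steps 1-5 as seen by the client; returns the raw response text.
    static String fetch(String host, String path) throws IOException {
        InetAddress address = InetAddress.getByName(host);      // step 1: DNS lookup
        try (Socket socket = new Socket(address, 80)) {         // step 2: TCP connection, port 80
            OutputStream out = socket.getOutputStream();
            out.write(buildGetRequest(path).getBytes(StandardCharsets.US_ASCII)); // step 3
            out.flush();

            ByteArrayOutputStream response = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {                // steps 4-5: read until the server closes
                response.write(buffer, 0, n);
            }
            return response.toString("ISO-8859-1");
        }
    }
}
```

A browser would then parse the returned text (step 6) and call `fetch` again for each embedded object (step 7), which in HTTP/1.0 means one new connection per object.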

Page 9: HTTP CS587x Lecture Department of Computer Science Iowa State University

Server Response

HTTP/1.0 200 OK
Content-Type: text/html
Content-Length: 1234
Last-Modified: Mon, 19 Nov 2001 15:31:20 GMT

<HTML><HEAD><TITLE>CS Home Page</TITLE></HEAD>…</BODY></HTML>
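A client reads a response like the one above as a status line, then header lines, then a blank line, then the body. The following sketch parses only the status code and headers; `ResponseHead` is a hypothetical helper, and it assumes lines end in CRLF and that no header is folded across lines.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class ResponseHead {
    final int status;
    final Map<String, String> headers = new LinkedHashMap<>();

    ResponseHead(String raw) {
        String[] lines = raw.split("\r\n");
        // Status line: HTTP-version SP status-code SP reason-phrase
        status = Integer.parseInt(lines[0].split(" ", 3)[1]);
        // Header lines run until the first empty line; the body follows it.
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, lines[i].substring(colon + 1).trim());
            }
        }
    }
}
```

Header names are stored in lower case because HTTP header names are case-insensitive.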

Page 10: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP/1.0 Example

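Client                          Server
  |------ Request file 1 ------->|
  |<----- Transfer file 1 -------|
  |------ Request file 2 ------->|
  |<----- Transfer file 2 -------|
  |             ...              |
  |------ Request file n ------->|
  |<----- Transfer file n -------|
Finish displaying page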

Page 11: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server Implementation

import java.net.ServerSocket;
import java.net.Socket;

public class WebServerDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket ss = new ServerSocket(80);

        for (;;) {
            // accept connection
            Socket accept = ss.accept();

            // Start a thread to process the request
            new Handler(accept).start();
        }
    }
}

Page 12: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server Implementation

import java.io.*;
import java.net.Socket;

class Handler extends Thread {  // Handler for an HTTP request
    Socket socket;
    BufferedReader br;
    PrintWriter pw;

    public Handler(Socket _socket) {
        socket = _socket;
    }

    public void run() {
        try {
            br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            pw = new PrintWriter(new OutputStreamWriter(socket.getOutputStream()));

            String line = br.readLine();  // Read HTTP request from user
            if (line != null && line.toUpperCase().startsWith("GET")) {
                // parse the string to find the file name
                // locate the file and send it back
                // :::::
            }
            // other commands: post, delete, put, etc.
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Page 13: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP/1.0 Caching

- CLIENT
  - GET request: If-Modified-Since – return a “not modified” response if the resource was not modified since the specified time
  - Request header: no-cache (sent as Pragma: no-cache in HTTP/1.0) – ignore all caches and get the resource directly from the server
- SERVER
  - Response header: Expires – tell the client until when it is safe to cache the resource
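The If-Modified-Since check on the server comes down to comparing two HTTP dates. A sketch of that decision, assuming both dates use the RFC 1123 format shown in the earlier Last-Modified example; `ConditionalGet` and `statusFor` are hypothetical names:

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class ConditionalGet {
    // Returns 304 (not modified) when the resource has not changed since the
    // time the client supplied, otherwise 200 (send the full resource).
    static int statusFor(String ifModifiedSince, String lastModified) {
        if (ifModifiedSince == null) {
            return 200;                       // unconditional GET
        }
        DateTimeFormatter format = DateTimeFormatter.RFC_1123_DATE_TIME;
        ZonedDateTime since = ZonedDateTime.parse(ifModifiedSince, format);
        ZonedDateTime modified = ZonedDateTime.parse(lastModified, format);
        return modified.isAfter(since) ? 200 : 304;
    }
}
```

With a 304 the server sends headers only, so the client keeps using its cached copy and the body is not transferred again.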

Page 14: HTTP CS587x Lecture Department of Computer Science Iowa State University

Issues with HTTP/1.0

- Each resource requires a new connection
  - Large number of embedded objects in a web page
  - Many short-lived connections
- Serial vs. parallel connections
  - A serial connection downloads one object at a time (e.g., MOSAIC), causing long latency to display a whole page
  - Parallel connections (e.g., NETSCAPE) open several connections (typically 4), contributing to network congestion
- HTTP uses TCP as the transport protocol
  - TCP is not optimized for the typical short-lived connections
  - Most Internet transfers fit in 10 packets (overhead: 7 out of 17)
    - Too slow for small objects
    - May never exit the slow-start phase
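The slow-start point can be made concrete with a toy model. The sketch below assumes an idealized sender whose congestion window starts at `initialWindow` segments and doubles every round trip with no loss; these are simplifying assumptions for illustration, not a model of any particular TCP implementation, and `SlowStart` is a hypothetical name.

```java
class SlowStart {
    // Counts the round trips needed to send `segments` segments when the
    // window starts at `initialWindow` (must be >= 1) and doubles each round.
    static int roundTrips(int segments, int initialWindow) {
        int sent = 0;
        int window = initialWindow;
        int rounds = 0;
        while (sent < segments) {
            sent += window;   // send one full window this round trip
            window *= 2;      // slow start: exponential window growth
            rounds++;
        }
        return rounds;
    }
}
```

Under these assumptions a 10-packet transfer starting from a window of 1 needs 4 round trips (1 + 2 + 4 + 8), and the connection closes while the window is still growing.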

Page 15: HTTP CS587x Lecture Department of Computer Science Iowa State University

Highlights of HTTP/1.1

- Persistent connections
- Pipelined requests/responses
- Support for virtual hosting
- More explicit support for caching
- Internet Cache Protocol (ICP)
- Content negotiation/adaptation
- Range request

Page 16: HTTP CS587x Lecture Department of Computer Science Iowa State University

Persistent Connections

- The basic idea was reducing the number of TCP connections opened and closed
  - Reduces TCP connection costs
  - Reduces latency by avoiding multiple TCP slow-starts
  - Avoids bandwidth wastage and reduces overall congestion
  - A longer TCP connection knows better about network conditions (Why?)
- New GET methods (proposed extensions, not part of the HTTP/1.1 specification)
  - GETALL
  - GETLIST

Page 17: HTTP CS587x Lecture Department of Computer Science Iowa State University

Pipelined Requests/Responses

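- Buffer requests and responses to reduce the number of packets
- Multiple requests can be contained in one TCP segment
- Note: the order of responses has to be maintained

Client                                      Server
  |--- Request 1, Request 2, Request 3 ----->|
  |<-- Transfer 1 ---------------------------|
  |<-- Transfer 2 ---------------------------|
  |<-- Transfer 3 ---------------------------|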
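On the client side, pipelining amounts to writing several complete requests back to back before reading any response. A sketch of the buffering step only; `PipelinedRequests` and `build` are hypothetical names, and the resulting string would be written to an already open persistent connection in a single call.

```java
import java.util.List;

class PipelinedRequests {
    // Concatenates one HTTP/1.1 GET request per path into a single buffer.
    // Responses must then be read back in the same order as the paths.
    static String build(String host, List<String> paths) {
        StringBuilder batch = new StringBuilder();
        for (String path : paths) {
            batch.append("GET ").append(path).append(" HTTP/1.1\r\n")
                 .append("Host: ").append(host).append("\r\n")
                 .append("\r\n");                 // blank line ends each request
        }
        return batch.toString();
    }
}
```

Small requests built this way may fit in a single TCP segment, which is where the packet saving comes from.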

Page 18: HTTP CS587x Lecture Department of Computer Science Iowa State University

Support for Virtual Hosting

- Problem: outsourcing web content to some company
  - http://www.hostmany.com/A    http://www.A.com
  - http://www.hostmany.com/B    http://www.B.com
- In HTTP/1.0, a request for http://www.A.com/index.html has in its header only:
    GET /index.html HTTP/1.0
- It is not possible to run two web servers at the same IP address, because the GET is ambiguous
- HTTP/1.1 addresses this by adding the “Host” header:
    GET /index.html HTTP/1.1
    Host: www.A.com
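With the Host header available, one server process can map each hostname to its own document root. A minimal sketch; `VirtualHosts`, `add` and `resolve` are hypothetical names, the directory paths are made up for illustration, and IPv6 literal hosts are not handled.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class VirtualHosts {
    private final Map<String, String> docroots = new HashMap<>();

    // Registers the document root that serves the given hostname.
    void add(String host, String docroot) {
        docroots.put(host.toLowerCase(Locale.ROOT), docroot);
    }

    // Returns the document root for a Host header value, or null if unknown.
    String resolve(String hostHeader) {
        if (hostHeader == null) {
            return null;                        // HTTP/1.0 request without Host
        }
        String host = hostHeader.trim().toLowerCase(Locale.ROOT);
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon);    // drop an optional ":port" suffix
        }
        return docroots.get(host);
    }
}
```

A request for `/index.html` with `Host: www.A.com` and one with `Host: www.B.com` then resolve to different directories even though they arrive at the same IP address.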

Page 19: HTTP CS587x Lecture Department of Computer Science Iowa State University

Content Negotiation/Adaptation

- A resource may have more than one representation
  - Different languages
  - Different sizes of images, etc.
- Example
    GET /index.html HTTP/1.1
    Host: www.getbelix.com
    Accept-Language: en-us, fr-BE
- Two approaches
  - Agent-driven: the client receives a set of alternative representations of the response, chooses the best representation, and indicates its choice in a second request
  - Server-driven: the server chooses the representation based on what is available at the server, the headers in the request message, or information about the client, such as its IP address
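Server-driven negotiation on language can be sketched as matching the Accept-Language list against what the server has. The code below is a simplified illustration: `LanguageNegotiator` and `choose` are hypothetical names, tags are matched exactly (no prefix matching such as `en` for `en-us`), and the `available` list is assumed to hold lower-case tags.

```java
import java.util.List;
import java.util.Locale;

class LanguageNegotiator {
    // Picks the available language with the highest q-value in the header;
    // earlier entries win ties. Returns `fallback` when nothing matches.
    static String choose(String acceptLanguage, List<String> available, String fallback) {
        String best = fallback;
        double bestQ = 0.0;
        for (String item : acceptLanguage.split(",")) {
            String[] parts = item.trim().split(";");
            String tag = parts[0].trim().toLowerCase(Locale.ROOT);
            double q = 1.0;                               // default weight
            for (int i = 1; i < parts.length; i++) {
                String p = parts[i].trim();
                if (p.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(p.substring(2));
                    } catch (NumberFormatException e) {
                        q = 0.0;                          // ignore malformed weights
                    }
                }
            }
            if (q > bestQ && available.contains(tag)) {
                best = tag;
                bestQ = q;
            }
        }
        return best;
    }
}
```

For the example header `en-us, fr-BE`, a server that only holds French (Belgium) and German pages would pick `fr-be`.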

Page 20: HTTP CS587x Lecture Department of Computer Science Iowa State University

Range Request

- A user may want to load only some portion of the content
  - E.g., retrieve only the newly appended portion
  - E.g., load some pages of a PDF file
- Example
    GET /bigfile.html HTTP/1.1
    Host: www.justwhatiwant.com
    Range: bytes=2000-3999
- Other forms
  - Range: bytes=-1000 (the last 1000 bytes)
  - Range: bytes=2000- (from byte 2000 to the end)
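A server has to turn each of these range forms into concrete byte offsets for a resource of known length. A sketch handling a single byte range only; `ByteRange` and `resolve` are hypothetical names, and multi-range requests are deliberately rejected.

```java
class ByteRange {
    // Resolves a Range header value against a resource of `length` bytes.
    // Returns {first, last} (inclusive offsets), or null if the range
    // cannot be satisfied or is not a single byte range.
    static long[] resolve(String header, long length) {
        if (header == null || !header.startsWith("bytes=") || length <= 0) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            return null;                                   // malformed or multi-range
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            if (first.isEmpty()) {                         // "-N": the last N bytes
                if (last.isEmpty()) return null;
                long n = Long.parseLong(last);
                if (n <= 0) return null;
                return new long[] { Math.max(0, length - n), length - 1 };
            }
            long start = Long.parseLong(first);
            long end = last.isEmpty()                      // "N-": from N to the end
                    ? length - 1
                    : Math.min(Long.parseLong(last), length - 1);
            if (start > end) return null;                  // starts past the end
            return new long[] { start, end };
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

A satisfiable range is normally answered with status 206 (Partial Content) rather than 200.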

Page 21: HTTP CS587x Lecture Department of Computer Science Iowa State University

Cache-Control Request Directives

- no-cache: force revalidation with the origin server
- only-if-cached: obtain the resource only from a cache
- no-store: don’t allow caches to store the request/response
- max-age: the response’s age should be no greater than this value
- max-stale: an expired response is OK, but it must not be stale by more than the stated value
- min-fresh: the response should remain fresh for at least the stated value
- no-transform: proxy should not change the media type

Page 22: HTTP CS587x Lecture Department of Computer Science Iowa State University

Cache-Control Response Directives

- public: OK to cache the response anywhere
- private: response is for a specific user only
- no-cache: do not serve from cache without prior revalidation
  - Must revalidate regardless of when the response becomes stale
- no-store: caches are not permitted to store the response or the request
- no-transform: proxy should not change the media type
- must-revalidate: can be cached, but must be revalidated if stale
  - A file may be associated with an age (expiration)
- proxy-revalidate: forces shared caches to revalidate the cached response
- max-age: the response’s age should be no greater than this value
- s-maxage: shared caches use this value as the response’s maximum age (overrides max-age)
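A cache can apply the age-related directives above with a small parser. This is a simplified sketch: `CacheControl`, `parse` and `isFresh` are hypothetical names, only no-cache, no-store, max-age and s-maxage are considered, and quoted directive values containing commas are not handled.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CacheControl {
    // Splits "a, b=1, c=2" into a map of directive name -> value ("" if none).
    static Map<String, String> parse(String header) {
        Map<String, String> directives = new HashMap<>();
        for (String item : header.split(",")) {
            String[] kv = item.trim().split("=", 2);
            if (!kv[0].isEmpty()) {
                directives.put(kv[0].trim().toLowerCase(Locale.ROOT),
                               kv.length > 1 ? kv[1].trim() : "");
            }
        }
        return directives;
    }

    // Decides whether a stored response of the given age may be served
    // without revalidation. s-maxage overrides max-age for shared caches.
    static boolean isFresh(String cacheControl, long ageSeconds, boolean sharedCache) {
        Map<String, String> d = parse(cacheControl);
        if (d.containsKey("no-cache") || d.containsKey("no-store")) {
            return false;
        }
        String limit = (sharedCache && d.containsKey("s-maxage"))
                ? d.get("s-maxage")
                : d.get("max-age");
        if (limit == null) {
            return false;            // no explicit lifetime: revalidate in this sketch
        }
        try {
            return ageSeconds < Long.parseLong(limit);
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

For `max-age=60, s-maxage=10` and a 30-second-old response, a browser cache may serve it directly while a shared proxy cache has to revalidate.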

Page 23: HTTP CS587x Lecture Department of Computer Science Iowa State University

Factors to Consider for Cache Replacement

- Cost of storing the resource (size)
- Cost of fetching the resource (size + distance)
- The time since the last modification of the resource
- The number of accesses to the resource in the past
- The probability of the resource being accessed in the near future
  - May be known a priori or based on the past access pattern
- The heuristic expiration time
  - If there is no server-specified expiration time, the cache decides on a heuristic expiration time.
  - If no expired resources are available as candidates, then resources that are close to their expiration time are prioritized as candidates for replacement
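The last factor, expiration time, can be sketched as a victim-selection rule. The code covers only that one factor and ignores size, fetch cost and access frequency; `CacheEntry` and `ReplacementPolicy` are hypothetical names, and `expiresAt` is assumed to hold either the server-specified or the heuristic expiration time in seconds.

```java
import java.util.List;

class CacheEntry {
    final String url;
    final long expiresAt;   // expiration time, seconds since some epoch

    CacheEntry(String url, long expiresAt) {
        this.url = url;
        this.expiresAt = expiresAt;
    }
}

class ReplacementPolicy {
    // Picks the entry with the earliest expiration time. Any already expired
    // entry has an earlier time than every unexpired one, so expired entries
    // are chosen first; otherwise the entry closest to expiring is chosen.
    static CacheEntry pickVictim(List<CacheEntry> entries) {
        CacheEntry victim = null;
        for (CacheEntry e : entries) {
            if (victim == null || e.expiresAt < victim.expiresAt) {
                victim = e;
            }
        }
        return victim;   // null when the cache is empty
    }
}
```

A fuller policy would combine this ordering with the other factors listed, for example by weighting size and past access counts.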

Page 24: HTTP CS587x Lecture Department of Computer Science Iowa State University

Summary

- HTTP 1.0
- HTTP 1.1

Page 25: HTTP CS587x Lecture Department of Computer Science Iowa State University

What We Have Covered So Far

HTTP DNS

TCP UDP

IP

Ethernet FDDI Token Etc.

Page 26: HTTP CS587x Lecture Department of Computer Science Iowa State University

FYI

[Figure: Internet domain survey host count worldwide. SOURCE: National Science Board, Science and Engineering Indicators-2002]

Page 27: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server (1)

import java.io.*;
import java.net.*;
import java.util.*;

public class WebServerDemo {
    protected String docroot;   // Directory of HTML pages and other files
    protected int port;         // Port number of web server
    protected ServerSocket ss;  // Socket for the web server

    class Handler extends Thread {  // Handler for an HTTP request
        protected Socket socket;
        protected PrintWriter pw;
        protected BufferedOutputStream bos;
        protected BufferedReader br;
        protected File docroot;

        public Handler(Socket _socket, String _docroot) throws Exception {
            socket = _socket;
            docroot = new File(_docroot).getCanonicalFile();  // Absolute dir of the filepath
        }

Page 28: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server (2)

        public void run() {
            try {
                // Prepare our readers and writers
                br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                bos = new BufferedOutputStream(socket.getOutputStream());
                pw = new PrintWriter(new OutputStreamWriter(bos));
                String line = br.readLine();  // Read HTTP request from user
                socket.shutdownInput();       // Shutdown any further input
                if (line == null) {
                    socket.close();
                    return;
                }
                if (line.toUpperCase().startsWith("GET")) {
                    // Eliminate any trailing ? data, such as for a CGI GET request
                    StringTokenizer tokens = new StringTokenizer(line, " ?");
                    tokens.nextToken();
                    String req = tokens.nextToken();
                    String name;
                    // ... form a full filename
                    if (req.startsWith("/") || req.startsWith("\\"))
                        name = this.docroot + req;
                    else
                        name = this.docroot + File.separator + req;
                    File file = new File(name).getCanonicalFile();  // Get absolute file path
                    // Check to see if request doesn't start with our document root ....
                    if (!file.getAbsolutePath().startsWith(this.docroot.getAbsolutePath())) {
                        pw.println("HTTP/1.0 403 Forbidden");
                        pw.println();
                    }

Page 29: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server (3)

                    // run() continued
                    else if (!file.canRead()) {  // No access
                        pw.println("HTTP/1.0 403 Forbidden");
                        pw.println();
                    } else if (file.isDirectory()) {  // Directory, not file
                        sendDir(bos, pw, file, req);
                    } else {
                        sendFile(bos, pw, file.getAbsolutePath());
                    }
                } else {  // Unsupported command
                    pw.println("HTTP/1.0 501 Not Implemented");
                    pw.println();
                }
                pw.flush();
                bos.flush();
            } catch (Exception e) {
                e.printStackTrace();
            }
            try {
                socket.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }  // run()

        protected void sendFile(BufferedOutputStream bos, PrintWriter pw, String filename) throws Exception {
            try {
                BufferedInputStream bis = new BufferedInputStream(new FileInputStream(filename));
                byte[] data = new byte[10 * 1024];
                int read = bis.read(data);
                pw.println("HTTP/1.0 200 Okay");
                pw.println();
                pw.flush();
                bos.flush();
                while (read != -1) {
                    bos.write(data, 0, read);
                    read = bis.read(data);
                }
                bos.flush();
            } catch (Exception e) {
                pw.flush();
                bos.flush();
            }
        }

Page 30: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server (4)

        protected void sendDir(BufferedOutputStream bos, PrintWriter pw, File dir, String req) throws Exception {
            try {
                pw.println("HTTP/1.0 200 Okay");
                pw.println();
                pw.flush();
                pw.print("<html><head><title>Directory of " + req + "</title></head><body><h1>Directory of " + req
                        + "</h1><table border=\"0\">");
                File[] contents = dir.listFiles();
                for (int i = 0; i < contents.length; i++) {
                    pw.print("<tr><td><a href=\"" + req + contents[i].getName());
                    if (contents[i].isDirectory()) pw.print("/");
                    pw.print("\">");
                    if (contents[i].isDirectory()) pw.print("Dir -> ");
                    pw.println(contents[i].getName() + "</a></td></tr>");
                }
                pw.println("</table></body></html>");
                pw.flush();
            } catch (Exception e) {
                pw.flush();
                bos.flush();
            }
        }
    }

    protected void parseParams(String[] args) throws Exception {
        switch (args.length) {  // Check that a filepath has been specified and a port number
            case 1:
            case 0:
                System.err.println("Syntax: <jvm> " + this.getClass().getName() + " docroot port");
                System.exit(0);
            default:
                this.docroot = args[0];
                this.port = Integer.parseInt(args[1]);
                break;
        }
    }

Page 31: HTTP CS587x Lecture Department of Computer Science Iowa State University

HTTP Server (5)

    public WebServerDemo(String[] args) throws Exception {
        System.out.println("Checking for parameters");
        parseParams(args);  // Check for command line parameters
        System.out.print("Starting web server...... ");
        this.ss = new ServerSocket(this.port);  // Create a new server socket
        System.out.println("OK");

        for (;;) {  // Forever
            Socket accept = ss.accept();  // Accept connection via server socket
            // Start a new handler instance to process the request
            new Handler(accept, docroot).start();
        }
    }

    // Start an instance of the web server
    public static void main(String[] args) throws Exception {
        WebServerDemo webServerDemo = new WebServerDemo(args);
    }
}