an introduction to xml and web technologies the http protocol file2 an introduction to xml and web...
TRANSCRIPT
1
1
An Introduction to XML and Web Technologies
The HTTP Protocol
Anders Møller & Michael I. Schwartzbach© 2006 Addison-Wesley
An Introduction to XML and Web Technologies2/53
Objectives
How the HTTP protocol works
The SSL security extension from a
programmer's point of view
How to write servers and clients inJava
An Introduction to XML and Web Technologies3/53
HTTP
HTTP: HyperText Transfer Protocol
Client-Server model
Request-Response pattern
client
URL
HTML
server
2
An Introduction to XML and Web Technologies4/53
Network Layers
OUR APPLICATIONS
THE APPLICATION LAYER
THE TRANSPORT LAYER
THE INTERNET LAYER
THE NETWORK INTERFACE LAYER
HTTP, FTP, SMTP, DNS
TCP, UDP
IP
Ethernet
An Introduction to XML and Web Technologies5/53
The Internet
An Introduction to XML and Web Technologies6/53
IP
IP: Internet Protocol
Unreliable communication of limited size data
packets (datagrams)
IP addresses (e.g. 165.193.130.107) identifymachines
Handles routing using the underlying physical
network (e.g. Ethernet)
3
An Introduction to XML and Web Technologies7/53
TCP
TCP: Transmission Control Protocol Layer on top of IP Data is transmitted in streams Reliability ensured by retransmitting lost datagrams,
reordering, etc. Connection-oriented
• establish connection between client and server• data streaming in both directions (full-duplex)• close connection
Socket: end point of connection, associated a pairof (IP address, port number)
An Introduction to XML and Web Technologies8/53
Ports
0-1023 well-known ports and are assigned toserver applications with root privilege
80: Web Server20, 21: FTP servers
22: SSH server25: SMTP servers
1024-49151 registered ports Remaining ports are private ports that can be
used freely
An Introduction to XML and Web Technologies9/53
UDP
UDP: user datagram protocol Just makes IP communication available to upper-
level processes Unreliable and datagram-oriented Faster than TCP Used for video and voice transmission where speed
is essential and occasional losses are acceptable Often use a multicasting mechanism in IP Basis for DNS
4
An Introduction to XML and Web Technologies10/53
HTTP
HTTP: HyperText Transfer Protocol
Layer on top of TCP
Request and response sent using TCPstreams
http://www.google.com/search?q=Introduction+to+XML+and+Web+Technologies
An Introduction to XML and Web Technologies11/53
HTTP Requests
Request line (methods: GET, POST, ...) Header lines of the form: field: value Request body (empty here)
GET /search?q=Introduction+to+XML+and+Web+Technologies HTTP/1.1Host: www.google.comUser-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2)→ Gecko/20040803Accept: text/xml,application/xml,application/xhtml+xml,→ text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5 */* all media typesAccept-Language: da,en-us;q=0.8,en;q=0.5,sw;q=0.3 q: qualityfactorAccept-Encoding: gzip,deflate 0: not acceptableAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 1: defaultKeep-Alive: 300 da: danishConnection: keep-aliveReferer: http://www.google.com/
An Introduction to XML and Web Technologies12/53
HTTP Responses
Status line Header lines Response body
HTTP/1.1 200 OKDate: Fri, 17 Sep 2009 07:59:01 GMTServer: Apache/2.0.50 (Unix) mod_perl/1.99_10 Perl/v5.8.4→ mod_ssl/2.0.50 OpenSSL/0.9.7d DAV/2 PHP/4.3.8 mod_bigwig/2.1-3Last-Modified: Tue, 24 Feb 2009 08:32:26 GMTETag: "ec002-afa-fd67ba80“ for cache managementAccept-Ranges: bytes file size and last modification timeContent-Length: 2810Content-Type: text/html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html>...</html>
5
An Introduction to XML and Web Technologies13/53
Status Codes
200 OK 301 Moved Permanently 400 Bad Request 401 Unauthorized 403 Forbidden 404 Not Found 500 Internal Server Error 503 Service Unavailable ...
An Introduction to XML and Web Technologies14/53
HTML Forms<h3>The Poll Service</h3><form action="http://freewig.brics.dk/users/laudrup/soccer.jsp“ method="post">Who wins the World Cup 2006?<select name="bet"><option value="br">Brazil!</option><option selected value="dk">Denmark!</option><option value="other country">someone else?</option></select><br>Please enter your email address:<input type="text" name="email"><br><input type="submit" name="send" value="Go!"></from>
An Introduction to XML and Web Technologies15/53
Encoding of Form Data
[email protected] countrybetValueName
Encoding to query string (URL encoding):bet=other+country&email=zacharias_doe%40notmail.com&send=Go%21
GET: place query string in request URIGET /users/laudrup/soccer.jsp?bet=other+country&email=zacharias_doe%40notmail.com&send=Go%21HTTP/1.1
6
An Introduction to XML and Web Technologies16/53
POST
The query string is placed in the body ofHTTP request
POST /users/laudrup/soccer.jsp HTTP/1.1HOST: freewig.brics.dkContent-Type: application/x-www-form-urlencodedContent-Length: 62
bet=other+country&email=zacharias_doe%40notmail.com&send=Go%21
An Introduction to XML and Web Technologies17/53
GET vs. POST?
GET requests should be safe to the client (should have no side-effects)
GET is useful for retrieving data, not for submitting orders to anonline shop (POST)
GET is Idempotent: same results for the same requests Cachability: server log request URIs not the body. GET requests
are cachable POST requests (bodies) are not cachable, good forsensitive information.
HTML links result in GET request. For Forms GET and POST arepossible and GET is the default
Limits on request URI length (2K in IE) so GET can not be usedfor large input
POST allows other encodings (e.g. for file upload): content-disposition header: field name, the content type, character set ofthe value, its transfer encoding, and the file name
An Introduction to XML and Web Technologies18/53
Authentication
Restricting access to authorized users
Common techniques:
• IP-address
• Form (with username/password fields)
• HTTP Basic
• HTTP Digest
7
An Introduction to XML and Web Technologies19/53
HTTP Basic Authentication Server responds with
HTTP/1.1 401 Authorization RequiredWWW-Authenticate: Basic realm="The Doe Family Site"
Then the browser pops up a window for username and password(called challenge):
If Cancel, an HTML document is shown to the user with the message inthe body
Otherwise, Client send the base64 encoding of username:passwordAuthentication: Basic emFjaGFyaWFzOmFwcGxlcGllcg==
An Introduction to XML and Web Technologies20/53
Cache Control Caches used in clients, servers, and network
(proxy servers, content delivery networks) Cache-Control
• no-store• no-cache• public• private• max-age• must-revalidate
never cache this messageno-may cache but need revalidationmay cacheintended for single userset expirationrequire revalidation
Cache-Control: public, no-cache, max-age=3600 Cache-Control: no-store, no-cache, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMTPragma: no-cache
An Introduction to XML and Web Technologies21/53
Range requests
Range: bytes=100-200,300-
only the bytes from 100-200 and 300to the end of the resource arerequested
The Server will send the requestedranges together with206 partial Content status code
8
An Introduction to XML and Web Technologies22/53
Persistent Connections Early versions of HTTP, every request was on a TCP connection
that was closed when the response had been received Establishing a TCP connection takes a relatively long time and
normally have a number of requests to the same sever for HTMLdocument, images and stylesheets
HTTP/1.1: Multiple request-response pairs on a single TCPconnection• Content-Length (now important to determine the boundary of each message!)• Connection: close (persistent by default in HTTP/1.1)• Connection: keep-alive (compatibility)• Keep-Alive: 300 (control timeout, compatibility)
Pipelining• send multiple requests before receiving the responses• fewer TCP/IP packets• only for idempotent requests (e.g. GET)• supported by newer browsers
An Introduction to XML and Web Technologies23/53
Limitations of HTTP
Stateless, no built-in support for trackingclients (session management)• log in to an online shop• put products into a shopping cart• enter address and payment information• log out
Must be managed at a level above HTTP
No built-in security mechanisms (can readand make sense of the contents of TCP/IPpackets)
An Introduction to XML and Web Technologies24/53
Session Management
Problem: how toidentify the clientin the server
Techniques• URL rewriting• Hidden form fields• Cookies• SSL sessions
9
An Introduction to XML and Web Technologies25/53
URL Rewriting
http://www.widget.com/buy Server rewrite it to
http://www.widget.com/buy;21A9A809... Along string that is hard to guess by others
Session URL: use the same URL for theentire duration of the session and point tothe newest HTML page but the serverneeds to store the HTML page for eachsession plus the overhead for redirectingthe client to its HTML page
An Introduction to XML and Web Technologies26/53
Hidden Form Fields
Place either a session id or the entiresession state in hidden form fields• not displayed in the browser• but are automatically submitted alongwith other fields
Only works with forms
<form action=http://www.widget.com/buy method=“POST”> <input type=“hidden” name=“sessionid” value=“21A9A8089…”></form>
An Introduction to XML and Web Technologies27/53
Cookies Extension of HTTP that allows servers to store data on
the clients• limited size and number• may be disabled by the client
Cookies are small data packages that are stored on theclients but managed by the servers
Used to track session state, but also user identity Controlled via HTTP header fields without modifying the
HTML pages Server Set-Cookie: sessionid=21A9A8089C305319; path=/
Client Cookie: sessionid=21A9A8089C305319
10
An Introduction to XML and Web Technologies28/53
Set-Cookie header field
Expire=time: specify the lifetime of the cookieMon, 02-05-2005 21:37:45 GMT
Domain=pattern: domain names of serversthat the cookie can be returned to
Path=path: cookie will only be returnedtogether with URLs whose path componentbegins with path
secure: will only be send over SSLconnection
An Introduction to XML and Web Technologies29/53
Limitations of Cookies
Max 4KB per cookie Max 20 cookies per domain name Max 300 cookies in total Stored somewhere by the browser and cannot
easily be moved to another person’s browser Shared for all windows so cannot have several
sessions from the same browser Threat to privacy as servers may learn user’s
behavior on the web May be disabled by the user in most browsers
An Introduction to XML and Web Technologies30/53
Security
Desirable properties: Confidentiality: just between client & server
Integrity: cannot be altered during transmission
Authenticity: the receiver is able to identify thesender and vice versa
non-repudiation: the sender cannot denyhaving sending and the server cannot denyhaving received it
11
An Introduction to XML and Web Technologies31/53
Security
Desirable properties: confidentiality
Integrity SSL/TLS
authenticity
non-repuditation digital signature
An Introduction to XML and Web Technologies32/53
SSL
SSL: Secure Sockets Layer (netscape) TLS: Transport Layer Security (newer version
by Internet Engineering Task Force)
Layer between HTTP and TCP,accessed by https://...
Based on public-key cryptography• private key + public key• certificate (usually for server authentication only)used to associate the identity of the user, such asname and address, to the public key
An Introduction to XML and Web Technologies33/53
Web Programming with Java
Why Java? platform independence safe runtime model multi-threading for both servers and clients sandboxing: safely execute non-trusted code Unicode serialization, dynamic class loading powerful standard libraries
• java.net: classes for TCP/IP and HTTP• java.nio.channels: TCP with non-blocking I/O and efficient buffers• javax.net.ssl: support for SSL
12
An Introduction to XML and Web Technologies34/53
TCP/IP: DomainName2IPNumbersimport java.net.*;
public class DomainName2IPNumbers { public static void main(String[] args) { try {
InetAddress[] a = InetAddress.getAllByName(args[0]);for (int i = 0; i<a.length; i++) System.out.println(a[i].getHostAddress());
} catch (UnknownHostException e) {System.out.println("Unknown host!");
} }} java DomainName2IPNumbers www.google.com
66.102.9.10466.102.9.99
An Introduction to XML and Web Technologies35/53
TCP/IP: SimpleServerimport java.net.*;import java.io.*;public class SimpleServer { public static void main(String[] args) { try { ServerSocket ss = new ServerSocket(Integer.parseInt(args[0])); while (true) { Socket con = ss.accept(); InputStreamReader in = new InputStreamReader(con.getInputStream());
while ((c = in.read())!=0) { msg.append((char)c); PrintWriter out = new PrintWriter(con.getOutputStream()); out.print("Simon says: "+msg); out.flush(); con.close(); } } catch (IOException e) {System.err.println(e); } }}
An Introduction to XML and Web Technologies36/53
TCP/IP: SimpleClientimport java.net.*;import java.io.*;public class SimpleClient { public static void main(String[] args) { try { Socket con = new Socket(args[0], Integer.parseInt(args[1])); PrintStream out = new PrintStream(con.getOutputStream()); out.print(args[2]); out.write(0); out.flush(); InputStreamReader in = new InputStreamReader(con.getInputStream()); int c; while ((c = in.read())!=-1)
System.out.print((char)c);con.close();
} catch (IOException e) {System.err.println(e); }}
java SimpleServer 1234
java SimpleClient localhost 1234 "Hello World"Simon says: Hello World
13
An Introduction to XML and Web Technologies37/53
Non-Blocking I/O
Blocking: a read blocks until data is available. It meansthat each connection needs its own thread.
With New I/O (NIO) packages, a thread can be madeto wait for an event on many sockets using a selectormechanism.
Support for concurrent connections and buffering Packages: java.nio.channels, java.nio Central classes:
• ServerSocketChannel, SocketChannel• Selector: wait for the client (OP_ACCEPT,OP_READ,OP_WRITE)• ByteBuffer: for reading and writing
An Introduction to XML and Web Technologies38/53
Non-Blocking I/Oimport java.util.*;import java.net.*;import java.nio.*;import java.nio.channels.*;import java.io.*;public class NonBlockingServer { public static void main(String[] args) { try { ServerSocketChannel[] ss = new ServerSocketChannel[args.length]; Selector sel = Selector.open(); for (int i= 0; I<args.length; i++) { int port = Integer.parseInt(args[i]); ss[i] = ServerSocketChannel.open(); ss[i].configureBlocking(false); ss[i].socket().bind(new InetSocketAddress(port)); ss[i].register(sel, SelectionKey.OP_ACCEPT); }
An Introduction to XML and Web Technologies39/53
HTTP in Java
Two approaches:
1. “manually” implement HTTP on topof TCP/IP
2. Use the more high-level features ofthe Java libraries
14
An Introduction to XML and Web Technologies40/53
HTTP: ImFeelingLucky (1/2)import java.net.*;import java.io.*;public class ImFeelingLucky { public static void main(String[] args) { try { Socket con = new Socket(www.google.com, 80); String req = “/search?"+ "q="+URLEncoder.encode(args[0], "UTF8")+"&"+ "btnI="+URLEncoder.encode("I'm Feeling Lucky", "UTF8"); BufferedWriter out = new BufferedWriter (new OutputStreamWriter(con.getOutputStream(),”UTF8”)); out.write(“GET “+req+” HTTP/1.1/\r\n”); out.write(“HOST: www.google.com\r\n”); out.write(“User-Agent: IXWT\r\n\r\n”); out.flush();
Send a request to the Google server and extracts the result, manually constructs anHTTP request and parsing the necessary parts of the response.
An Introduction to XML and Web Technologies41/53
HTTP: ImFeelingLucky (2/2)BufferedReader in = new BufferedReader
(new InputStreamReader(con.getInputStream())); String line; System.out.print("The prophet spoke thus: "); while ((line = in.readLine()) != null) { if (line.startsWith(“Location:” )) { System.out.println("Direct your browser to "+ line.substring(9).trim()+ " and you shall find great happiness in life."); break; } else if (line.trim().length() == 0) { System.out.println("I am sorry - my crystal ball is blank."); break; } } con.close(); } catch (IOException e) {System.err.println(e);} }}
An Introduction to XML and Web Technologies42/53
HTTP: ImFeelingLucky2 (1/2)import java.net.*;import java.io.*;
public class ImFeelingLucky2 { public static void main(String[] args) { try { String req = "http://www.google.com/search?"+ "q="+URLEncoder.encode(args[0], "UTF8")+"&"+ "btnI="+URLEncoder.encode("I'm Feeling Lucky", "UTF8"); HttpURLConnection con = (HttpURLConnection) (new URL(req)).openConnection(); con.setRequestProperty("User-Agent", "IXWT"); con.setInstanceFollowRedirects(false);
The class HTTPURLConnection and the related classes in the java.net package makeIt easier to create the HTP requests and to parse the responses.
15
An Introduction to XML and Web Technologies43/53
HTTP: ImFeelingLucky2 (2/2) String loc = con.getHeaderField("Location");
System.out.print("The prophet spoke thus: "); if (loc!=null) System.out.println("Direct your browser to "+loc+ " and you shall find great happiness in life."); else System.out.println("I am sorry - my crystal ball is blank."); } catch (IOException e) { e.printStackTrace(); } }}
java ImFeelingLucky2 W3C
The prophet spoke thus: Direct your browser tohttp://www.w3.org/ and you shall find greathappiness in life.
An Introduction to XML and Web Technologies44/53
Java HTTP features
setRequestMethod: GET(default) or POST setRequestProperty: sets a name-value property of the request setDoInput: set true if we read input from the connection setDoOutput: set true if we write output to the connection connect: establish the TCP connection getOutputStream: gives an output stream for the request body of
POST requests getHeaderFields: returns a field from the response header getInputStream: gives an input stream for reading the response
only setInstanceFollowRedirects: whether to redirect automatically setCaches: cache responses getContent: convert response into Java objects based on its type
An Introduction to XML and Web Technologies45/53
SSL in Java (JSSE)
javax.net.ssl, java.security.cert
SSLServerSocketFactory,SSLServerSocket
SSLSocketFactory, SSLSocket
SSLSession, Certificate,HttpsURLConnection
keytool
16
An Introduction to XML and Web Technologies46/53
Simple Secure Serverimport java.net.*;import java.io.*;Import javax.net.ssl.*;public class SecureSimpleServer { public static void main(String[] args) { try { SSLServerSocketFactory sf =
(SSLServerSocketFactory) SSLServerSocketFactory.getDefault(); SSLServerSocket ss =
(SSLServerSocket) sf.createServerSocket(Integer.parseInt(args[0])); while (true) { SSLSocket con = (SSLSocket) ss.accept(); InputStreamReader in = new InputStreamReader(con.getInputStream());
while ((c = in.read())!=0) { msg.append((char)c); PrintWriter out = new PrintWriter(con.getOutputStream()); out.print("Simon says: "+msg); out.flush(); con.close(); } } catch (IOException e) { e.printStackTrace(); } }}
An Introduction to XML and Web Technologies47/53
Simple Secure Clientimport javax.net.ssl.*;
public class SecureSimpleClient { public static void main(String[] args) { try { SSLSocketFactory sf=(SSLSocketFactory)SSLSocketFactory.getDefault(); SSLSocket con = (SSLSocket)sf.creatSocket(args[0], Integer.parseInt(args[1])); PrintStream out = new PrintStream(con.getOutputStream()); out.print(args[2]); out.write(0); out.flush(); InputStreamReader in = new InputStreamReader(con.getInputStream()); StringBuffer buf = new StringBuffer(); int c; while ((c = in.read())!=-1) System.out.print((char)c); con.close(); } catch (IOException e) {System.err.println(e); }}
java SecureSimpleServer 1234
java SecureSimpleClient localhost 1234 "Hello World"Simon says: Hello World
An Introduction to XML and Web Technologies48/53
Compilation To make this program run, we need a server certificate (called an
X.509 certificate) to obtained as follows: Use the tool keytool to create a new key store and generate a key pair
and a self-signed certificate using RSA algorithm and valid for certain days
Have a certificate authority (one that our future clients hopefully trusts)sign that the information we entered above is correct
Before inserting the signed certificate, import the certificate authority’s owncertificated
Replace the self-signed certificate in the key store with the one signed bythe certificate authority
Start the server
Start the client using the new trust store.
17
An Introduction to XML and Web Technologies49/53
Key Store
Use the tool keytool to create a new key store in a file named serverkey.jks andgenerate a key pair and a self-signed certificate here named widgetorg using RSAalgorithm and valid for 7 dayskeytool –genkey –alias widgetorg –keyalg RSA –validity 7 –keystore serverkey.jksEnter keystore password: ToPsEcReTWhat is your first and last name?
[Unknown]: www.widget.incWhat is the name of your organization unit? [Unknown]: Widget Inc.What is the name of your City or Locality? [Unknown]: PunxsutawneyWhat is the name of your State or Province? [Unknown]: PennsylvaniaWhat is the two-letter country code for this unit? [Unknown]: USIs CN=www.widget.inc, OU=Web Division, O=Widget Inc, L=Punxsutawney, ST=…, C=US, correct? [no]: yesEnter key password for <widgetorg> (RETURN if same as keystore password):
An Introduction to XML and Web Technologies50/53
Certification Authority
Have a certificate authority (one that our future clients hopefullytrusts) sign that the information we entered above is correct
keytool –certreq –alias widgetorg –keystore serverkey.jks–keyalg RSA –file widgetorg.csr
Enter keystore password: ToPsEcReTIt generates a cerficate signing request (widgetorg.csr), whichwe then send to the certificate authority, who returns a signedcertificate authenticating the server’s public key
An Introduction to XML and Web Technologies51/53
Import the certificate
Before inserting the signed certificate, import the certificateauthority’s own certificated (here store in a file named ca.cer,obtained from the authority’s Web site
keytool –import –alias root –keystore serverkey.jks –trustcacerts–file ca.cer
Enter keystore password: ToPsEcReTCertificate was added to keystore
Replace the self-signed certificate in the key store with the onesigned by the certificate authority
keytool –import –alias widgetorg –keystore serverkey.jks–file widgetorg.cer
Enter keystore password: ToPsEcReTCertificate was added to keystore
18
An Introduction to XML and Web Technologies52/53
Start the Server/Client
Start the Serverjava -Djavax.net.ssl.trustStore=serverkey.jks
-Djavax.net.ssl.trustStorePassword=ToPsEcReT SecureSimpleServer 8443
Start the ClientJava -Djavax.net.ssl.trustStore=serverkey.jks
-Djavax.net.ssl.trustStorePassword=ToPsEcReT SecureSimpleClient localhost 8443 “Hello World”
An Introduction to XML and Web Technologies53/53
A Web Server in 145 Lines of Code
Listens for HTTP requests on a port Parses the requests Returns files from the server’s file system