Copyright 2012 & 2015 – Noah Mendelsohn
Introduction to: The Architecture of the World Wide Web
Noah Mendelsohn
Tufts University
Email: [email protected]
http://www.cs.tufts.edu/~noah
COMP 150-IDS: Internet Scale Distributed Systems (Spring 2015)
© 2010 Noah Mendelsohn
What you should get from this session
You should understand at a high level the three pillars of Web Architecture
You should understand what happens when a Web page is retrieved using HTTP
You should know to refer to the TAG’s “Architecture of the World Wide Web” for more information
You should understand the difference between open standards and open source, and why both are important to the Web
History
Early History of the Web
Started out as a system for distributing documents written by scientists at the CERN physics lab in Switzerland – initial proposal in 1989
A chance to realize Tim Berners-Lee’s vision: a system for integrating all the world’s information!
August 6, 1991: announcement and early code made available along with a server you could access (Tim’s server computer, browser screen)
Others start writing code that complies with Web protocols (HTTP) and document formats (HTML)
Mosaic browser provides first widely available graphical interface
April 1993: Tim convinces CERN to give away the Web’s technology and code
Goals and requirements for the Web
Integrate all of the world’s online information
Integrate with other systems
– The Web is implemented on systems ranging from mainframes to traffic lights
Allow references (URIs) to be:
– Memorable
– Conveyed in other systems (like the links in this slide show!)
– Written “on the side of a bus”
Explorable – random browsing should work, and should do no harm
Support all users, regardless of location, spoken language or disability
Extensible to new types of content, new devices, new modalities of interaction, etc.
Open: content, naming and extensions should not require concurrence of a central authority
Safe to use: e.g. should not unduly compromise your privacy
Provide non-discriminatory access
Web Architecture Basics
Demonstration: http://webarch.noahdemo.com/demo1/test.html
A simple Web page retrieval
Architecting a universal Web
Identification: URIs
Interaction: HTTP
Data formats: HTML, JPEG, GIF, etc.
The user clicks on a link
URI is http://webarch.noahdemo.com/demo1/test.html
The “http” scheme tells the client to send an HTTP GET message
HTTP GET
URI is http://webarch.noahdemo.com/demo1/test.html
The server is identified by DNS name in the URI
HTTP GET
Host: webarch.noahdemo.com
URI is http://webarch.noahdemo.com/demo1/test.html
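A minimal Python sketch of this step, assuming the standard library's urllib.parse is a reasonable stand-in for whatever URI parsing a real browser does. The scheme selects the protocol, the host part names the server to look up in DNS, and the path names the resource on that server.

```python
from urllib.parse import urlsplit

# The URI from the demonstration slide.
uri = "http://webarch.noahdemo.com/demo1/test.html"
parts = urlsplit(uri)

print(parts.scheme)    # which protocol the client should speak
print(parts.hostname)  # DNS name identifying the server
print(parts.path)      # resource requested from that server
```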
The client sends an HTTP GET
HTTP GET
Host: webarch.noahdemo.com
GET /demo1/test.html HTTP/1.0
Host: webarch.noahdemo.com
User-Agent: Noahs Demo HttpClient v1.0
Accept: */*
Accept-language: en-us
URI is http://webarch.noahdemo.com/demo1/test.html
demo1/test.html
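As a hedged illustration, the request on this slide could be assembled from the URI roughly as follows. The helper name build_get_request is hypothetical, and the header values are copied from the slide; a real client would typically send more.

```python
from urllib.parse import urlsplit

def build_get_request(uri: str, user_agent: str = "Noahs Demo HttpClient v1.0") -> bytes:
    """Build the bytes of a simple HTTP GET request for the given URI.

    Hypothetical helper for illustration only; it mirrors the headers
    shown on the slide and nothing else.
    """
    parts = urlsplit(uri)
    path = parts.path or "/"          # an empty path means the root resource
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.0",       # request line: method, path, version
        f"Host: {parts.netloc}",      # server name taken from the URI
        f"User-Agent: {user_agent}",
        "Accept: */*",
        "Accept-language: en-us",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("http://webarch.noahdemo.com/demo1/test.html")
print(request.decode("ascii"))
```

Only the path travels in the request line; the server name goes in the Host header.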
The server sends an HTTP Response
HTTP GET
HTTP RESPONSE
Host: webarch.noahdemo.com
demo1/test.html
HTTP/1.1 200 OK
Date: Tue, 28 Aug 2007 01:49:33 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: text/html

<html><head><title>Demo #1</title></head><body><h1>A very simple Web page</h1></body></html>
HTTP status code 200 means success!
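A rough Python sketch of how a client might pull the status code and headers out of such a response. The function name parse_response is hypothetical, and the sample bytes deliberately leave out the Transfer-Encoding: chunked header so that the body can be shown verbatim; decoding chunked bodies is not attempted here.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status, reason, headers, and body.

    Illustrative only: no chunked decoding, no repeated-header handling.
    """
    head, _, body = raw.partition(b"\r\n\r\n")       # blank line separates headers from body
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)  # e.g. "HTTP/1.1 200 OK"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")          # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body

# Sample modeled on the slide (chunked header omitted, see note above).
SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Tue, 28 Aug 2007 01:49:33 GMT\r\n"
    b"Server: Apache\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><head><title>Demo #1</title></head>"
    b"<body><h1>A very simple Web page</h1></body></html>"
)

status, reason, headers, body = parse_response(SAMPLE)
print(status, reason, headers["content-type"])
```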
The server sends an HTTP Response
HTTP GET
HTTP RESPONSE
Host: webarch.noahdemo.com
demo1/test.html
HTTP/1.1 200 OK
Date: Tue, 28 Aug 2007 01:49:33 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: text/html

<html><head><title>Demo #1</title></head><body><h1>A very simple Web page</h1></body></html>
The “representation” returned is an HTML document
The server sends an HTTP Response
HTTP GET
HTTP RESPONSE
Host: webarch.noahdemo.com
demo1/test.html
HTTP/1.1 200 OK
Date: Tue, 28 Aug 2007 01:49:33 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: text/html

<html><head><title>Demo #1</title></head><body><h1>A very simple Web page</h1></body></html>
The HTML for the page.
Three pillars of Web Architecture
HTTP GET
HTTP RESPONSE
URI is http://webarch.noahdemo.com/demo1/test.html
Identification with URIs
demo1/test.html
Host: webarch.noahdemo.com
Three pillars of Web Architecture
URI is http://webarch.noahdemo.com/demo1/test.html
HTTP GET
HTTP RESPONSE
Interaction Using HTTP
demo1/test.html
Host: webarch.noahdemo.com
Three pillars of Web Architecture
URI is http://webarch.noahdemo.com/demo1/test.html
HTTP GET
HTTP RESPONSE
demo1/test.html
Host: webarch.noahdemo.com
HTTP/1.1 200 OK
Date: Tue, 28 Aug 2007 01:49:33 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: text/html

<!DOCTYPE html><html><head><title>Demo #1</title></head><body><h1>A very simple Web page</h1></body></html>
Representations using media types like text/html, image/jpeg, etc.
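To make the role of media types concrete, here is a small Python sketch of a client choosing what to do with a representation based on its Content-Type header. The function name and the returned descriptions are hypothetical; real browsers apply far more elaborate rules.

```python
def describe_representation(content_type: str) -> str:
    """Pick a handling strategy from a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalize case,
    # since media type names are case-insensitive.
    media_type = content_type.split(";", 1)[0].strip().lower()
    major, _, minor = media_type.partition("/")
    if media_type == "text/html":
        return "render as an HTML document"
    if major == "image":
        return f"decode as a {minor} image"
    return "offer to save: unknown media type"

for value in ("text/html", "image/jpeg", "Text/HTML; charset=utf-8"):
    print(value, "->", describe_representation(value))
```

The client decides from the declared media type, not from the URI or a file extension.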
Architecting a universal Web
Identification: URIs
Interaction: HTTP
Data formats: HTML, JPEG, GIF, etc.
Suggested Reading:
The Architecture of the World Wide Web
http://www.w3.org/TR/webarch/
Open Standards and Open Source
Open protocol and format standards
HTTP GET
HTTP RESPONSE
URI is http://webarch.noahdemo.com/demo1/test.html
demo1/test.html
Host: webarch.noahdemo.com
Open protocol and format standards
HTTP GET
HTTP RESPONSE
URI is http://webarch.noahdemo.com/demo1/test.html
demo1/test.html
Host: webarch.noahdemo.com
Open Standards protocols and formats: client doesn’t see resource/server details
Open protocol and format standards
HTTP GET
HTTP RESPONSE
URI is http://webarch.noahdemo.com/demo1/test.html
demo1/test.html
Host: webarch.noahdemo.com
Open Standards protocols and formats: server supports any client
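One way to see "server supports any client" in miniature is the following Python sketch, under the assumption that a loopback server is an acceptable stand-in for the demo server. The handler class and the two fetch functions are hypothetical names. Two unrelated clients, one using a library and one writing raw bytes to a socket, get the same page because both speak HTTP.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = (b"<html><head><title>Demo #1</title></head>"
        b"<body><h1>A very simple Web page</h1></body></html>")

class DemoHandler(BaseHTTPRequestHandler):
    """Serves the same page for every GET; knows nothing about its clients."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet

def fetch_with_library(port):
    """Client 1: the standard library's HTTP client."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/demo1/test.html")
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    return resp.status, body

def fetch_with_raw_socket(port):
    """Client 2: hand-written request bytes over a plain TCP socket."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(b"GET /demo1/test.html HTTP/1.0\r\nHost: localhost\r\n\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closed the connection: response complete
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status = int(head.split(b" ", 2)[1])   # second token of the status line
    return status, body

server = HTTPServer(("127.0.0.1", 0), DemoHandler)   # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
results = [fetch_with_library(port), fetch_with_raw_socket(port)]
server.shutdown()
server.server_close()
print([status for status, _ in results])
```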
Open source software
HTTP GET
HTTP RESPONSE
URI is http://webarch.noahdemo.com/demo1/test.html
demo1/test.html
Host: webarch.noahdemo.com
Open Standards protocols and formats: server supports any client
Open source software is sometimes useful for implementing servers or clients, and promotes open standards protocols and formats
Apache? Firefox?