application layer - central washington university · · 2017-05-20the application layer cn5e by...

Application Layer Chapter 7

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

• DNS – Domain Name System

• Electronic Mail

• The Web

• Streaming Audio and Video

• Content Delivery

Revised: August 2011

The Application Layer


Uses transport services to build

distributed applications

Physical

Link

Network

Transport

Application

DNS – Domain Name System DNS is a world-wide distributed directory service. DNS delegates the

responsibility of assigning domain names and mapping those names

to internet resources by designating authoritative name servers for

each domain (zone). Net admins may delegate authority over sub-

domains within their allocated name space to other name servers.

This provides a distributed and fault tolerant service that was designed

to avoid a single large central database.

Internet has two primary name-spaces:

1. domain name hierarchy (DNS addresses)

2. Internet Protocol Address spaces (IPv4 and IPv6)

The DNS resolves high-level human readable

names for computers to low-level IP addresses

• DNS name space »

• Domain Resource records »

• Name servers »


The DNS Name Space (1) DNS namespace is hierarchical from the root down • Different parts delegated to different organizations

• Can register under multiple top-level domains

• Absolute domain names always end in a period, relative domain names do not − Relative names have to be interpreted in a context to uniquely determine their meaning

The computer robot.cs.washington.edu


250+ Top

Level

Domains:

1. Generic

2. Countries

Generic Second Level

Domains run by Registrars

appointed by ICANN

Question: What is a “domain”?

The DNS Name Space (2)

The Generic top-level

domains are controlled

by ICANN who appoints

registrars to run them

This one was controversial


ICANN = Internet Corporation for

Assigned Names and Numbers

To create a new domain, permission is required of the domain in which it will be included.

Naming follows organizational boundaries, not physical networks.

Domain Resource Records (1) Every domain (all levels) has its own DNS database that is comprised of Resource

Records.

Each Resource Record is a 5 tuple: Domain_Name, TTL, Class, Type, Value (page 616)

The key resource records in the namespace are IP addresses

(A/AAAA) and name servers (NS), but there are others too (e.g.,

MX)


Domain Resource Records (2)

A portion of a possible DNS database for cs.vu.nl. Question: What is cs.vu.nl’s web server’s name and IP address?

Question: What is a reverse lookup?

IP addresses

of computers

Name server

Mail gateways


CNAME and PTR

are aliases; CNAME

allows the same IP

address to have multiple

names; PTR enables

reverse lookups

Name Servers (1)

DNS name space divided into overlapping zones. Admins determine zone

boundaries. Each zone has one or more name servers.

Name servers contain data for portions of the name

space called zones (circled).

One zone


Name Servers (2)


Finding the IP address for a given hostname is called

resolution and is done with the DNS protocol.

Resolution:

• Computer requests local name server to resolve

• Local name server asks the root name server

• Root returns the name server for a lower zone

• Continue down zones until a name server can answer

DNS protocol:

• Runs on UDP port 53, retransmits lost messages

• Caches name server answers for better performance

Name Servers (3)

Example of a computer looking up the IP for a name


Electronic Mail


• Architecture and services »

• The user agent »

• Message formats »

• Message transfer »

• Final delivery »

Architecture and Services (1) The key components and steps (numbered) to send email

− User Agent = Email Reader; a program that supports composing, receiving, and

replying to messages.

− Uses Simple Mail Transfer Protocol (SMTP) over TCP (port 25)

1. Mail submission is method by which user agents send messages into the email

system for delivery

2. Message transfer between MTAs

3. Final delivery to receiving User Agent


Architecture of the email system

Question: Where is the user’s mailbox located?

Architecture and Services (2)


Paper mail Electronic mail

Envelope

Message

(= header

and body)

Question: Who

knows why this is

funny for Tanenbaum?

The User Agent

What users see – interface elements of a typical user agent − Email address format: user@dns-address


Message Formats (1)


Header fields related to message transport; headers are

readable ASCII text

Message Formats (2)


Other header fields useful for user agents

Message Formats (3)


Originally only ASCII text could be sent by email. Multipurpose Internet

Mail Extensions (MIME) was developed to send messages with richer

content (non-Latin alphabets, Chinese, Japanese) as well as audio,

images, or binary documents or programs.

• MIME adds structures to the message body and also defines encoding

rules for the transfer of non-ASCII messages.

MIME header fields used to describe what content is in

the body of the message

Message Headers added by MIME

Message Formats (4)


Common MIME content types and subtypes

Message Formats (5)


Putting it all together:

a multipart message

containing HTML and

audio alternatives.

One part

(HTML)

Another

(audio)

Question: What does this email message do?

Message Transfer (1)


Messages are transferred with SMTP (Simple Mail Transfer

Protocol)

• Readable (4 byte long ASCII) text commands

• Submission from user agent to MTA on port 587 − If message cannot be delivered, an error report containing the first part of

the undeliverable msg is returned to the sender

• One MTA to the next MTA on port 25 usually using TCP − Recipient’s mailbox is within the destination MTA

− The destination MTA is the @dns-address part of the user’s email address

− user@dns-address is the user’s mailbox within the destination MTA

• Other protocols for final delivery (IMAP, POP3, webmail) − IMAP/POP3/webmail get email messages from the mailbox on the user’s

mail server to the user’s email user agent on the user’s machine.

− Webmail systems use web protocols: E.g., Gmail, Hotmail, Yahoo!mail



Sending a message:

• From Alice to Bob

• SMTP commands

are marked [pink]

. . . (rest of message) . . .

C: = from client (sender)

S: == from server (receiver)

HELO – Hello message

RCPT – ID email recipient(s)

EHLO – Hello msg for clients

wanting to use an

extension (next slide)

Question: Does the

Sender (client) or the

Receiver (server)

start the dialog?

If the server is first, how

does it know when to

talk?



Common SMTP extensions (not in simple example)

Final Delivery (1)


User agent uses protocol

like IMAP for final delivery

• Has commands to

manipulate folders /

messages [right]

Alternatively, a Web

interface (with proprietary

protocol) might be used

The World Wide Web (WWW)


Standards body for the web is the W3C (World Wide Web Consortium;

see http://www.w3.org)

Primary web protocol is HTTP, which is a simple text-based protocol

that runs over TCP in the general case.

• Architectural overview »

• Static Web pages »

• Dynamic pages and Web applications »

• HTTP – HyperText Transfer Protocol »

• The mobile Web »

• Web search »

http://www.w3.org/

Architectural Overview (1)

HTTP transfers pages from servers to browsers


Question: Where is the User in this picture?


Pages are named with URLs (Uniform Resource Locators)

• Example: http://www.phdcomics.com/comics.php


Our

focus

Protocol Page on server Server

Common URL protocols



Steps a client (browser) takes to follow a hyperlink:

− Determine the protocol (HTTP)

− Ask DNS for the IP address of server

− Make a TCP connection to server (port 80 or 443)

− Send request for the page; server sends it back

− Fetch other URLs as needed to display the page

− Close idle TCP connections

Steps a server takes to serve pages:

− Accept a TCP connection from client

− Get page request and map it to a resource (e.g., file name)

− Get the resource (e.g., file from disk)

− Send contents of the resource to the client.

− Release idle TCP connections

Architectural Overview (4) -- Clients


When a server returns a page, it also returns some additional info about

that page. This info includes the MIME type of the page that instructs the

client’s browser how to process that info. If the MIME type is not one of the

built-in ones, then the browser consults its table of MIME types which

associates MIME types with Third-Party-provided viewers. Two types of

viewers: 1) plug in or 2) helper app.

Content type is identified by MIME types

• Browser takes the appropriate action to display

• Plug-ins / helper apps extend browser for new types

Question: Where did we see MIME before?

Installed as

an extension

to the browser

(same process as

the browser)

Complete

program

running as

a separate

process

Architectural Overview (5) -- Servers


Web servers are optimized to be able to quickly service numerous requests

To scale performance, Web servers can use:

• Caching to optimize disk retrievals, multiple threads for

parallel processing, and a front end to orchestrate



Server steps, revisited:

1. Resolve name of Web page requested − Incoming request may not contain the actual name of file or

program as a literal string

2. Perform access control on the Web page − Authenticate client to determine if (s)he can access material

3. Check the cache

− Dynamic pages cannot be cached; only current cached content

returned (if so, skip step 4)

4. Fetch requested page from disk or run program

5. Determine the rest of the response that accompany the

contents of the page

6. Return the response to the client

7. Make an entry in the server log

Architectural Overview (7) -- cookies


Even though HTTP uses TCP, it does NOT have any concept of a login session

so each HTTP request has no memory nor context. Cookies provide a memory

or context. Examples: pay-for-view web sites; e-commerce; customized web

portals like Yahoo! Cookies can be abused to learn a user’s browsing habits.

Cookies support stateful client/server interactions − Cookies are a small (4KB or less) named string that the server

associates with a browser

− Browsers store the server-provided cookie for a time period within their

browser directory on the client’s disk. Cookies persist across browser

invocations to that server -- unless the user disables cookies.

• Server sends cookies (state) with page response

• Client stores cookies across page fetches

• Client sends cookies back to server with requests

Example of a browser directory within a client’s disk

Static Web Pages (1)


Static Web pages are simply files

• Have the same contents for each viewing

Can be visually rich and interactive nonetheless:

• HTML that mixes text and images

• Forms that gather user input

• Style sheets that tailor presentation

• Vector graphics, videos, and more (over) . . .

Static Web Pages (2)


Progression of features through HTML 5.0

Dynamic Pages & Web Applications (1)


Applications can now run inside the browser or the server with the web providing the

user interface. Users do not need to install separate apps and user data can be

accessed from different computers; this is a form of cloud computing

Dynamic pages are generated by programs running at the

server (with a database) and/or the client • E.g., PHP, JSP, and CGI at server; JavaScript at client

− Microsoft’s Active Server Pages .NET is a proprietary version of PHP and JSP

− CGI = Common Gateway Interface; JSP = JavaServer Pages

• Pages vary each time like using an application

Question: What does “a form of cloud computing” mean?



Web page that gets

form input and calls

a server program

PHP server program

that creates a

custom Web page

Resulting Web page

(for inputs “Barbara”

and “32”)

PHP calls



JavaScript program

produces result

page in the browser

First page with form,

gets input and calls

program above

Client-side processing occurs inside the user’s browser. Technologies: 1) JavaScript,

2) VBScript, 3) Java applets, and 4) Microsoft ActiveX controls (see pages 677-678)



Client-side and Server-side scripting look the similar in that they both

embed code in HTML files, they are processed totally differently.

The difference between server and client programs

Server-side scripting with PHP Client-side scripting with JavaScript

After user clicks on submit button, the

browser collects info into a long string

and sends it to server as a request for PHP

Page. The server loads the PHP file and

executes the PHP script to produce a new

HTML page that is sent to the browser to

display.

When the user clicks on the submit

button the browser interprets a JavaScript

function contained on the page. All of the

work is done locally within the browser.

Question: when is it preferable to use

server-side and when preferable to use

client-side scripting?



AJAX (Asynchronous Javascript and XML) is a set of technologies that

work together that work together to enable Web applications that are as

responsive and powerful as traditional desktop applications.

Web applications use a set of technologies that work

together, e.g. AJAX: • HTML: present information as pages.

• DOM: change parts of pages while they are viewed. − DOM (Document Object Model) – model of HTML page that is

accessible to programs (e.g., an API to change parts of a page)

• XML: let programs exchange data with the server. − XML (eXtensible Markup Language) is a language for specifying

structural content

− W3C created XML to allow web content to be structured for

automated processing

• Asynchronous way to send and retrieve XML data. − SOAP (Simple Object access Protocol) – language/system-

independent way to do RPC between programs

• JavaScript as a language to bind all this together.



The DOM (Document Object Model) tree represents

Web pages as a structure that programs can alter



XML captures document structure

HTML is concerned with presentation.

Example of XML:

Note: XHTML is

HTML defined

in terms of XML



Web application developers have a library of technologies

available to them to create web content:

HTTP (1)


HTTP is the primary protocol of the web.

HTTP (HyperText Transfer Protocol) is a request-

response protocol that runs on top of TCP

• Fetches pages from server to client

• Server usually runs on port 80

• Headers are given in readable ASCII

• Content is described with MIME types

• Protocol has support for pipelining requests

• Protocol has support for caching

HTTP (2) HTTP 1.0 – connection was set up and released for every response

HTTP 1.1 – Enables connection reuse

HTTP uses persistent connections to improve performance


One connection for

each request

HTTP 1.0

Sequential requests

on one connection

HTTP 1.1

Pipelined requests

on one connection

HTTP 1.1

HTTP (3)


HTTP has several request methods.

Requests server to Fetch

a MIME-encoded page

Used to send

input data

(forms) to a

server program

Both GET and POST are

used for SOAP web services

HTTP (4)


Response codes tell the client how the request fared:

HTTP (5)


A request line (e.g., GET, POST) may be optionally followed by additional

lines with more information, called request headers. Responses also

optionally may have response headers.

Many headers carry key information:

Function Example Headers

Browser capabilities

(client server)

User-Agent, Accept, Accept-Charset, Accept-

Encoding, Accept-Language

Caching related

(mixed directions)

If-Modified-Since, If-None-Match, Date, Last-

Modified, Expires, Cache-Control, ETag

Browser context

(client server)

Cookie, Referer, Authorization, Host

Content delivery

(server client)

Content-Encoding, Content-Length, Content-Type,

Content-Language, Content-Range, Set-Cookie

HTTP (6)


HTTP has built-in support to help clients identify when they can safely

reuse pages

HTTP caching checks to see if the browser has a known

fresh copy, and if not if the server has updated the page − HTTP uses two strategies: 1) cache page validation (step 2),

2) conditional GET – asks server if the cached copy is still valid

• Uses a collection of headers for the checks

• Can include further levels of caching (e.g., proxy caching)

The Mobile Web


Enhancement/extensions made to support web access for small handhelds

and other mobile devices.

Mobiles (phones, tablets) are challenging as clients:

− Relatively small screens

− Limited input capabilities, lengthy input.

− Network bandwidth is limited

− Connectivity may be intermittent.

− Computing power is limited

Strategies to handle them:

• Content: servers provide mobile-friendly versions;

transcoding can also be used

• Protocols: no real need for specialized protocols;

HTTP with header compression sufficient

Web Search


Search has proved hugely popular, in tandem with

advertising that has proved hugely profitable

• A simple interface for users to navigate the Web

Search engine requires:

• Content from all sites, accessed by crawling. Follow

links to new pages, but beware programs. − Each web search engine obtains its database of pages by doing

web crawling – a systematic traversal of all pages and links

− Deep web: Web content that search engines cannot catalog (e.g.,

dynamic content), which is therefore “hidden”

• Indexing, which benefits from known and discovered

structure (such as XML) to increase relevance

Streaming Audio and Video


Real time and streamed multimedia traffic.

• Real time multimedia is usually carried using RTP

• Streamed multimedia may be carried by RTP or HTTP

• Real Time Streaming Protocol – proprietary (RealNetworks) variant of RTCP

• RTCP (real time control protocol) controls remote multimedia playout

Audio and video have become key types of traffic, e.g., voice

over IP, and video streaming.

• Digital audio »

• Digital video »

• Streaming stored media »

• Streaming live media »

• Real-time conferencing »

Digital Audio (1) – review


ADC (Analog-to-Digital Converter using Nyquist’s Sampling Theorem)

produces digital audio from a microphone

• Telephone: 8000 8-bit samples/second (64 Kbps);

computer audio is usually better quality (e.g., 16 bit)

Continuous audio

(sine wave)

Digital audio

(sampled, 4-bit quantized)

ADC

Digital Audio (2) -- review


Digital audio is typically compressed before it is sent

• Lossy encoders (like AAC) exploit human perception

• Large compression ratios (can be >10X)

Sensitivity of the ear

varies with frequency

A loud tone can mask

nearby tones

Digital Video (1)


When the human eyes sees something, the image is retained for some

milliseconds before decaying. If a sequence of images is shown at 50

images/sec, the eye does not notice that it is looking at discrete images. All

video systems exploit this principle to produce moving pictures.

Video is digitized as pixels (sampled, quantized)

• TV quality: 640x480 pixels, 24-bit per pixel color, 30 frames/sec

− Leveraging interlacing – odd number scan lines and even number

cumulatively sent sequentially at 60 frames/sec to fool the eye

• HDTV: 1280X720; Blu-ray: 1080X720

Video is sent compressed due to its large bandwidth

• Lossy compression exploits human perception

− E.g., JPEG for still images, MPEG, H.264 for video

− Large compression ratios (often 50X for video)

• Video is normally > 1 Mbps, versus >10 kbps for speech and

>100 kbps for music

Question: Why is music’s bandwidth higher than speech’s?

Digital Video (2)


JPEG = Joint Photographic Experts Group

JPEG lossy compression sequence for one image:

Step 1 Step 2 Step 3 Step 5

Subsequent slides show each step in the JPEG compression sequence

Question: What does “lossy” mean?

Digital Video (3)


Step 1: Pixels are mapped to luminance/chrominance (matrix:

YCbCr) color space and chrominance is sub-sampled • Y = 16 + .26R + .5G + .09B (luminance);

• Cb = 128 + .15R - .29G - .44B (chrominance)

• Cr = 128 + .44R - .37G + .07B (chrominance)

• The eye is less sensitive to chrominance

Input 24-bit RGB pixels 8-bit luminance

pixels

8-bit chrominances

for every 4 pixels

Digital Video (4)


Step 2: Each component block is transformed to spatial

frequencies with DCT (Discrete Cosine Transformation)

• Captures the key image features

One component block

of the Y matrix

Transformed block

(i.e., the DTC coefficients)

Digital Video (5)


Step 3: DCT coefficients are quantized by dividing by

thresholds; reduces bits in higher spatial frequencies

• Top left element is differenced over blocks (Step 4)

/ = Input Thresholds Output

Digital Video (6)


Step 5: The block is run-length encoded in a zig-zag

order. Then it is Huffman coded before sending (Step 6)

Order in which the block coefficients are sent

Digital Video (7)


MPEG = Motion Picture Experts Group; following describes MPEG-1

MPEG compresses over a sequence of frames, further

using motion tracking to remove temporal redundancy

− I (Intra-coded) frames are self-contained

− P (Predictive) frames use block motion predictions

− B (Bidirectional) frames may base prediction on future frame

Three consecutive frames with stationary and moving components

Who has viewed I-frames in Netflix or other video playout during fast forwards?

Streaming Stored Media (1)


A simple method to stream stored media, e.g., for video

on demand, is to fetch the video as a file download

• But has large startup delay, except for short files

Video-on-Demand Playout can start after the file has downloaded



Effective streaming starts the playout during transport

• Using RTSP (Real-Time Streaming Protocol) or the RTCP (RTP’s Real-Time Control Protocol)

Question: What protocol is being used in step 5?



Key problem: how to handle transmission errors

Strategy Advantage (PRO) Disadvantage (CON)

Use reliable

transport (TCP)

Repairs all errors Increases jitter significantly

Add FEC

(e.g., parity packet)

Repairs most errors Increases overhead, decoding

complexity and jitter

Interleave media Masks most errors Slightly increases decoding

complexity and jitter

Question: Who remembers what FEC is?

Interleave media: mixing up the order of the media before transmission and

unmixing it on reception. In this way if a packet (or a burst of packets) is lost,

the loss will be spread out over time by the unmixing.



Parity packet can repair one lost packet in a group of N

• Decoding is delayed for N packets

An approach for FEC using multimedia



Interleaving spreads nearby media samples over

different transmissions to reduce the impact of loss

Loss reduces temporal resolution; doesn’t leave a gap

Packet

stream

Media

samples



Key problem: media may not arrive in time for playout

due to variable bandwidth and loss/retransmissions

• Client buffers media to absorb jitter; we still need to

pick an achievable media rate

Safety margin,

to avoid a stall Can pause server (or go

ahead and save to disk)



RTSP commands

• Sent from player to server to adjust streaming

Streaming Live Media (1)


Streaming live media is similar to the stored case plus:

• Can’t stream faster than “live rate” to get ahead

− Usually need larger buffer to absorb jitter

• Often have many users viewing at the same time

− UDP with multicast greatly improves efficiency. When not

available, many TCP connections are used instead.

− For very many users, content distribution is used [later

section]



With multicast streaming media, parity is effective

• Clients can each lose a different packet and recover



Comparatively easy today for anyone to set up an Internet radio station.

In this example, recorded content is either stored on a media server as

podcasts for subsequent streaming or sent to for live streaming.

Production side of a student radio station.

As before

Real-Time Conferencing (1)


Real-time conferencing has two or more connected live

media streams, e.g., voice over IP, Skype video call

Key issue over live streaming is low (interactive) latency

• Want to reduce delay to near propagation

• Benefits from network support, e.g., QoS

• Or, benefits from ample bandwidth (no congestion)



H.323 was issued in 1996 for real time conferencing. It was a popular

approach until it became supplanted by the IETF’s SIP, which was

VASTLY simpler and cheaper to use and produced equivalent results.

H.323 architecture for Internet telephony supports calls

between Internet computers and PSTN phones.

Gatekeeper controls

calls for LAN hosts

Internet

VoIP

call

Internet/PSTN



H.323 protocol stack:

• Call is digital audio/video over RTP/UDP/IP

• Call setup is handled by other protocols (Q.931 etc.)



Logical channels that make up an H.323 call



SIP (Session Initiation Protocol) is an alternative to

H.323 to set up real-time calls

• Simple, text-based protocol with URLs for addresses

• Data is carried with RTP / RTCP as before



Setting up a call with the SIP three-way handshake

• Proxy server lets a remote callee be connected

• Call data flows directly between caller/callee



Comparison of H.323 and SIP.

YES

Corrected

by

Instructor

Content Delivery


Majority of Internet bandwidth is currently used to deliver multimedia

content. This resulted in greatly enhanced bandwidth deployments

and Content Delivery Networks (CDN). CDN sets up a distributed

collection of machines at locations inside the Internet and uses them

to serve content to clients. This is used by big players. Alternative is a

peer-to-peer (P2P) network architect, used by little players. This

section also talks about other content delivery types for other data

types.

Delivery of content, especially Web and video, to

users is a major component of Internet traffic

• Content and Internet traffic »

• Server farms and Web proxies »

• Content delivery networks »

• Peer-to-peer networks »

Question: What does “distributed collection of machines” mean?

Content and Internet Traffic


Internet traffic:

1. Shifts seismically (emailFTPWebP2Pvideo) − Internet traffic changes quickly, not only in the details but in the

overall makeup

2. Has many small/unpopular and few large/popular

flows – mice and elephants − Important: note description of “self-similar” traffic on page 737

Zipf popularity distribution, 1/k Shows up as a line on log-log plot

Server Farms and Web Proxies (1)


No matter how much bandwidth one machine has, it can only serve so

many web requests before the load is too great. Server farm is a single

logical machine comprised of many physical machines.

Server farms enable large-scale Web servers:

• Front-end load-balances requests over servers

• Servers access the same backend database

Question: What is the difference between logical and physical?

Server Farms and Web Proxies (2)


Caching improves performance by shortening the response time and reducing

the network load. Web proxy shares the cache among multiple users by

actually fetching web requests on behalf of its users. This also enables an

organization to control the content that is entering its AS.

Proxy caches help organizations to scale the Web

• Caches server content over clients for performance

• Also implements organization policies (e.g., access)

Question: Where is a web proxy usually physically located?

CDNs – Content Delivery Networks (1)


Server farms and web proxies help to build large sites and to improve web

performance, but they are not sufficient for truly popular web sites that

must serve content on a global scale. This is the role of a CDN.

CDNs scale Web servers by having clients get content

from a nearby CDN node (cache) Distributes copy of content to other

CDN nodes and is a Name Server to

resolve DNS queries

Content Delivery Networks (2)


Directing clients to nearby CDN nodes with DNS:

• Client query returns local CDN node as response

• Local CDN node caches content for nearby clients

and reduces load on the origin server

Content Delivery Networks (3)


Akamai pioneered using DNS for content distribution and became the first

major CDN and industry leader (page 745-6)

Origin server rewrites pages to serve content via CDN

Page that distributes content via CDN

Traditional Web page on server

Question: Where have we seen

Akamai before?

Peer-to-Peer Networks (1)


P2P is a file-sharing network that many computers use to pool their

resources to jointly form a CDN together. The computers may simply be

home computers. There is often no dedicated infrastructure and everyone

participates in the task of distributing content, with no central point of

control.

P2P (Peer-to-Peer) is an alternative CDN architecture

with no dedicated infrastructure (i.e., servers)

• Clients serve content to each other as peers

Challenges when servers are removed:

1. How do peers find each other?

2. How do peers support rapid content downloads?

3. How do peers encourage each other to upload?



Two common types of P2P technology:

1. BitTorrent protocol – see pages 750 – 752 – standard: www.bittorent.org

2. Distributed Hash Table (DHT)– see pages 753 - 756 – academic

1) BitTorrent lets peers download torrents, which are a content description

− Peers find each other via Tracker in torrent file

» Tracker is a server with the list of all other peers uploading or downloading content

− Peers swap chunks (parts of content) with partners, preferring those who

send most quickly [2]

− Many peers speed download; preference helps uploads [3]



Distributed Hash Tables (DHTs) are a fully distributed

index that scales to very many clients/entries

• Basic functionality of an index is to map a key to a value, much

like a hash table

− Chord uses SHA-1 for its hash, just like in BitTorrent (and cryptography). It

takes a variable length string and produces a random 160-bit number. It is

used to:

1. Convert any IP address to a 160-bit number called a node identifier.

2. Convert a content name to a key

• Need to follow O(log N) path for N entries

• Each node maintains a finger table that has m entries, each pointing to a

different actual node. Each entry has two fields: start and IP address of

successor.

• Lookup process (next slide): requesting node sends a packet to its

successor containing its IP address and the key it is looking for. The

packet is propagated around the ring until it locates the successor to the

node identifier being sought. That node checks to see if it has any info

matching the key and, if so, returns it directly to the requesting node,

whose IP address it has


A Chord ring of 32 identifiers. Finger tables [at right, and as

arcs] are used to navigate the ring.

• Example: path to look up 16 from 1 is 1 12 15


Identifier

values are

stored at

predecessor

Example: 32 node ID arranged in

circle. Shaded ones are actual machines,

Arcs show fingers and labels on arcs are

table indices.

End

Chapter 7


application layer - central washington university · · 2017-05-20the application layer cn5e by...

Documents