1 content distribution networks. 2 replication issues request distribution: how to transparently...

35
Content Distribution Networks

Upload: cory-johns

Post on 24-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

1

Content Distribution Networks

Page 2: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

2

Replication Issues

• Request distribution: how to transparently distribute requests for content among replication servers.

• Server selection: how to select a server replica for a given request.

• Content placement: how to decide how many replicas of given content to have and on which servers to place these replicas.

Page 3: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

3

Request distribution

• Transparent replication: techniques that require no user involvement or even awareness of the underlying replication scheme.

• Clients use a single logical name when requesting content.

• Basic issue: redirection of logically identical requests to distinct servers.

Page 4: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

4

Request Redirection

• At the client

• At the intermediate proxies

• At the DNS system

• At the primary origin server

• Somewhere in the network

Page 5: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

5

Content-blind request distribution with Full Replication

• Client redirection

• Redirection by a balancing switch

• Redirection through DNS servers

• anycast.

Page 6: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

6

Client redirection (1/2)

• Use the Web client’s DNS.

Page 7: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

7

Client redirection (2/2)

• Client redirection at the client.

• Use Web site DNS to return a list of server replicas.

• The entire list is passed to the client by the client DNS.

• Client selects server replica to contact.

Page 8: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

8

Advantages of client schemes

• Client can make the server selection.• Client can compare routing paths,

measure response times, packet loss rates or other metrics.

• Does not interfere with the DNS caching.

• Problem: server selection in performed by the client (sub-optimal selection of servers).

Page 9: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

9

Balancing switch redirection

• Hardware-based solution to redirect requests.

• “Balancer” is placed in front of a server farm.

• Balancer modifies

addresses.

Page 10: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

10

Switch-based redirection

• L3, L4 switch: IP balancing

• Uses standard clients, DNS and Web servers.

• Low geographical scalability.

• IBM Network dispatcher.

Page 11: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

11

Web Site DNS redirection• Many DNS implementations allow the Web

site DNS to map a host domain name to a set of IP addresses and chose one of them for every query.

Page 12: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

12

Web site DNS redirection: pros and cons

• Scales well geographically.

• No modifications of the client or DNS are needed.

• DNS response caching complicates the management of caching replicas and request distribution.

Page 13: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

13

Anycast

• Multiple physical servers use a single IP addr called anycast addr.

• Each server advertises both the anycast addr and its regular addr.

• Routers build paths that lead to the nearest anycast member-server.

Page 14: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

14

Anycast

Page 15: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

15

Content-blind request distribution with partial replication

• Servers may have the entire replica of the web site.

• Full replication is often used but imposes considerable overhead.

• Typically only a small fraction of content is responsible for most of the requests.

• Partial replication.

Page 16: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

16

Surrogates as server replicas

• Content requests are distributed among surrogates that are distinct from the origin servers.

• A surrogate can fulfill the posed request.• Otherwise the object is retrieved from the

origin server.• DNS redirection is often used for

distributing requests among the surrogates.

Page 17: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

17

Partial replication through surrogates

Page 18: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

18

Back-end distributed file systems

• Use a distributed file system to allow each replication server access the same, shared file set.

Page 19: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

19

Content Delivery Networks

• CDN: agents of content providers

• A CDN signs up individual content providers for scalable content delivery and delivers their content to any client that accesses the respective Web sites.

• CDN customers, CDN clients.

• The latter download content from the CDN that the former provide to the CDN.

Page 20: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

20

Benefits to content providers

• Global reach: A CDN serves content from multiple CDN servers deployed around the globe. By signing up with a CDN, a content provider gains instant presence around the globe at virtually no upfront cost.

• Flash event protection: Sometimes a Web site experiences a sudden increase in demand called a flash event. A CDN offers an easy way to prepare for a predictable flash event or to protect from an unforeseen one.

Page 21: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

21

CDN benefits

• A CDN is an intermediate layer of infrastructure between origin servers and clients (middleware).

• A CDN can achieve scalable content delivery by distributing load among its servers, by serving client requests from servers that are close to requesters, and by bypassing congested network paths.

• CDN infrastructure is shared among multiple content provider sites.

• CDN has close relationship with the underlying networks. • A technical consequence of shared infrastructure is that

CDNs must implement a mechanism for finding the origin server for a given piece of content.

• close relationship with underlying networks: CDNs place their servers within PoPs or backbone nodes of ISPs.

Page 22: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

22

Types of CDNs

• Classification: relationship to ISPs, relationship to customers, mechanisms for delivering requests to CDN servers.

• Multi ISP, single ISP. • CDN with hosting service (CDN servers for

relaying content from origin servers and the origin servers themselves, customer maintains a staging server).

• Relaying CDNs: origin servers remain external to the CDN.

Page 23: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

23

CDN types

Page 24: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

24

Relaying CDN: 1st hit @ origin

Page 25: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

25

Relaying CDN: 1st hit @ CDN

Page 26: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

26

Request delivery mechanisms

• DNS outsourcing.• Customer delegates the

DNS service for its domain to the CDN.

• Customer’s DNS replies (when queried) with a point to the CDN DNS.

• CDN DNS resolves query to one of the CDN servers.

Page 27: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

27

Relaying CDNs with 1st hit @ origin

• Embedded urls that the CDN is supposed to deliver use host names that are controlled by the CDN DNS.

• Alternatively, the embedded content uses domain names that belong to the CDN. Method adopted by Akamai.

Page 28: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

28

Comparing CDN types

• Origin-first CDNs require that embedded urls use distinct domains from container pages.

• Relative urls are avoided.• CDN-first CDNs allow embedded objects

to share the same domain name as the container page.

• Origin-first CDN: two tcp connections are required.

Page 29: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

29

Request distribution in CDNs

• DNS/Balancing switch redirection.

• CDN DNS returns the IP addr of the balancing switch.

Page 30: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

30

Request distribution in CDNs• Two-level DNS redirection

• This architecture distributes the DNS load among leaf DNS servers and allows client DNS servers to use nearby DNS servers for queries while they cache high-level DNS responses.

Page 31: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

31

Problems in DNS-based request distribution

• Originator problem: large networks often have a few DNS servers to handle clients’ requests.

• The client may be far away from the client DNS.

• Hidden load factor problem.

• Client DNS masking problem (recursive resolution of requests).

Page 32: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

32

Server Selection Metrics

• Proximity metrics• Server load metrics• Aggregate metrics• Passive measurements obtain metrics by simply

observing the normal operation of the system.• Active measurements involve actions that the

system performs only for the purpose of obtaining the metric.

• Synchronous/Asynchronous measurements.

Page 33: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

33

Proximity metrics• Geographical distance

• Number of network routers

• Number of AS

http://www.caida.org/home/

Page 34: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

34

Server Load Metrics

• Number of connections

• Number of requests

• Ready queue length

• Response time

>> uptime5:12pm up 30 day(s), 7:01, 3 users, load average:

0.11, 0.09, 0.09

Page 35: 1 Content Distribution Networks. 2 Replication Issues Request distribution: how to transparently distribute requests for content among replication servers

35

Aggregate metrics

• Tcp ping latency

• Icmp ping latency

• http request latency

• Download latency, download time