Unit VI: Overlays
Overlay Networks (with a focus on Content Distribution Networks)
What is an Overlay?
What is the topology of this network?
WHICH network??
(Figure borrowed from www.isi.edu/xbone)
• Networks built using an existing network as substrate
• Also known as Virtual Networks
• Most popular overlay: the Internet, which evolved as an overlay on the POTS (Plain Old Telephone Service) network
• Overlays could consist of routing software installed at selected sites, connected by encapsulation tunnels or direct links
Overlay Networks: Overview
• MBone, 6Bone, ABone
• RON, VNS
• P2P (Napster, FreeNet, Gnutella)
• Content Networks
  - Cooperating Caches
  - Server Farms
  - Content Distribution Networks (CDNs)
Overlay Networks: Examples
• Semi-permanent testbed to carry IP multicast traffic
• Routing of IP multicast traffic is not commonly integrated and deployed in production routers on the Internet
• Hence, layered on the Internet to support routing of IP multicast packets using tunneling
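The tunneling idea in the bullets above can be sketched as follows. This is a toy model under stated assumptions, not MBone software: the `Packet` class and the `encapsulate` / `decapsulate` helpers are hypothetical names, and all addresses are illustrative. A multicast packet is wrapped as the payload of an ordinary unicast packet addressed to the next overlay node, so routers in between can forward it without understanding multicast.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """Toy packet: source, destination, and either raw bytes or a nested packet."""
    src: str
    dst: str
    payload: Union[bytes, "Packet"]


def encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Wrap `inner` in a unicast packet addressed to the far end of the tunnel."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Recover the original packet at the tunnel endpoint."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunneled packet")
    return outer.payload


# A multicast packet travels between two overlay nodes inside a unicast packet.
multicast = Packet(src="10.0.0.1", dst="224.2.0.1", payload=b"audio frame")
tunneled = encapsulate(multicast, tunnel_src="192.0.2.1", tunnel_dst="198.51.100.7")
restored = decapsulate(tunneled)
```

Routers on the path only look at the outer unicast destination; the multicast destination is visible again only after the receiving overlay node unwraps the packet.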
Example Overlays: (1) MBone
[Figure: three MBone nodes overlaid on four Internet routers]
Example Overlays: (2) 6Bone
• 6bone is an IPv6 testbed on the Internet
• Intended to eventually subsume the underlying IPv4 network
• IPv4 tunnels used to overlay the 6bone
• ABone is the Active Networks Backbone, for experimentation in Active networking. Uses tunneling
• Resilient Overlay Network (RON): Provides fault tolerance and faster recovery as compared to conventional routing techniques
• Virtual Network Service (VNS): Infrastructure for provisioning QoS within Virtual Private Networks
• Peer-to-Peer Networks: Infrastructure for distribution and sharing of files (e.g., Napster, Gnutella, Freenet)
• Content Networks:
  - Server Farms, Caching Proxies, Content Distribution Networks (CDNs)
  - Today, we will try to focus on CDNs
  - What are the motivations for Content Networks?
Other known Overlays
• More hops between client and Web server => more congestion!
• Same data flowing repeatedly over links between clients and Web server
Motivations for Content Networks
[Figure: server S and clients C1, C2, C3, and C4 connected through IP routers]
• Origin server is bottleneck as number of users grows
• Flash Crowds (for instance, Sept. 11)
• The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)
Motivations for Content Networks (contd.)
• Arbitrate client requests to servers using an “intelligent” L4-L7 switch
• Pretty widely used today
Example content networks: Server Farms
[Figure: an L4-L7 switch distributing requests from grad.umd.edu and ren.cis.udel.edu across www.cnn.com (Copy 1), (Copy 2), and (Copy 3)]
• Simple solution to the content distribution problem: deploy a large group of servers
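As a rough illustration of how a switch might arbitrate requests across replicas, here is a minimal sketch assuming the simplest possible policy, round robin. The slides do not say which policy real L4-L7 switches use; the `RoundRobinSwitch` class and the replica labels are hypothetical.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinSwitch:
    """Hands each incoming request to the next replica in a fixed rotation."""

    def __init__(self, replicas: Iterable[str]) -> None:
        replicas = list(replicas)
        if not replicas:
            raise ValueError("at least one replica is required")
        self._rotation = cycle(replicas)

    def route(self, client: str) -> str:
        """Return the replica that should serve this client's request.

        The client name is ignored here; a content-aware (L7) policy
        could inspect it or the requested URL instead.
        """
        return next(self._rotation)


switch = RoundRobinSwitch(["copy-1", "copy-2", "copy-3"])
assignments = [
    switch.route(client)
    for client in ["grad.umd.edu", "ren.cis.udel.edu", "grad.umd.edu", "ren.cis.udel.edu"]
]
```

Round robin spreads load evenly only when requests cost about the same; it does nothing for congestion between the clients and the farm.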
Example content networks: Caching Proxies
[Figure: clients ren.cis.udel.edu and merlot.cis.udel.edu inside an ISP; interceptors divert TCP port 80 traffic to a proxy, while other traffic passes on to the Internet and www.cnn.com]
• Mainly motivated by ISP business interests: reducing the bandwidth the ISP consumes from the Internet
• Reduced network traffic
• Reduced user perceived latency
Consider: on September 11, 2001
[Figure: Web server www.cnn.com holds new content ("WTC News!"); user merlot.cis.udel.edu and two groups of 1,000,000 other hosts send requests; a caching proxy at the ISP holds old content. Legend: caching proxy, congestion/bottleneck.]
Problems with discussed approaches:Server farms and Caching proxies
• Server farms do nothing about network congestion, nor do they reduce latency caused by the network
• Caching proxies serve only their clients, not all users on the Internet
• Content providers (say, Web servers) cannot rely on existence and correct implementation of caching proxies
• Accounting issues with caching proxies. For instance, www.cnn.com needs to know the number of hits to the webpage for advertisements displayed on the webpage
Again, on September 11, 2001
[Figure: Web server www.cnn.com sends new content ("WTC News!") through a distribution infrastructure to surrogates in FL, IL, DE, NY, MA, MI, CA, and WA; user merlot.cis.udel.edu and two groups of 1,000,000 other users request the new content. Legend: surrogate, distribution infrastructure.]
• Overlay network to distribute content from origin servers to users
• Avoids large amounts of same data repeatedly traversing potentially congested links on the Internet
• Reduces Web server load
• Reduces user perceived latency
• Tries to route around congested networks
Web replication - CDNs
• Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
• Caches are reactive; CDNs are proactive
• Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to both the content providers (web servers) and the clients
• CDNs give content providers control over the content; caching proxies do not
CDN vs. Caching Proxies
CDN Architecture
[Figure: a CDN containing two surrogates, the Request Routing Infrastructure, and the Distribution and Accounting Infrastructure, placed between the origin server and two clients]
• Content Delivery Infrastructure: Delivering content to clients from surrogates
• Request Routing Infrastructure: Steering or directing a content request from a client to a suitable surrogate
• Distribution Infrastructure: Moving or replicating content from content source (origin server, content provider) to surrogates
• Accounting Infrastructure: Logging and reporting of distribution and delivery activities
CDN Components
Server Interaction with CDN
[Figure: origin server www.cnn.com and a CDN with its Distribution Infrastructure (1), Accounting Infrastructure (2), and Request Routing Infrastructure]
1. Origin server pushes new content to CDN OR CDN pulls content from origin server
2. Origin server requests logs and other accounting info from CDN OR CDN provides logs and other accounting info to origin server
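The push and pull options in step 1 can be contrasted with a small sketch. It assumes an in-memory surrogate; the `Surrogate` class, its methods, and the sample paths are hypothetical and stand in for whatever distribution protocol a real CDN uses.

```python
from typing import Callable, Dict


class Surrogate:
    """Toy surrogate that accepts pushed content and pulls on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_fetches = 0  # how many times the origin had to be contacted

    def push(self, path: str, body: bytes) -> None:
        """Proactive distribution: the origin places content before any request."""
        self._store[path] = body

    def get(self, path: str) -> bytes:
        """Serve from the local store; pull from the origin only on a miss."""
        if path not in self._store:
            self._store[path] = self._fetch(path)
            self.origin_fetches += 1
        return self._store[path]


origin_content = {"/sept11": b"WTC News!", "/weather": b"sunny"}
surrogate = Surrogate(fetch_from_origin=origin_content.__getitem__)

surrogate.push("/sept11", origin_content["/sept11"])  # pushed ahead of demand
pushed_body = surrogate.get("/sept11")                # served without touching the origin
pulled_body = surrogate.get("/weather")               # first request triggers a pull
```

With push, the first client request is already a hit; with pull, the first request pays the cost of a trip to the origin and later requests do not.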
Client Interaction with CDN
[Figure: a client, the CDN's Request Routing Infrastructure, Surrogate (DE) delaware.cnn.akamai.com, and Surrogate (CA) california.cnn.akamai.com]
1. Hi! I need www.cnn.com/sept11
2. Go to surrogate delaware.cnn.akamai.com
3. Hi! I need content /sept11
Q: How did the CDN choose the Delaware surrogate over the California surrogate?
Request Routing Techniques
• Request routing techniques use a set of metrics to direct users to “best” surrogate
• Proprietary, but underlying techniques known:
  • DNS based request routing
  • Content Modification (URL rewriting)
  • Anycast based (how common is anycast?)
  • URL based request routing
  • Transport layer request routing
  • Combination of multiple mechanisms
DNS based Request-Routing
• Common due to the ubiquity of DNS as a directory service
• Specialized DNS server inserted in DNS resolution process
• DNS server is capable of returning a different set of A, NS or CNAME records based on policies/metrics
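A minimal sketch of the policy step, assuming the specialized DNS server keeps a table of round-trip times from each surrogate to each requesting DNS server. The `resolve` function, the `RTT_MS` table, the fallback choice, and all measurement values are hypothetical; only the addresses are taken from the slides.

```python
from typing import Dict

# Hypothetical RTT measurements in milliseconds:
# requesting DNS server address -> {surrogate address -> RTT}.
RTT_MS: Dict[str, Dict[str, float]] = {
    "128.4.4.12": {"145.155.10.15": 10.0, "58.15.100.152": 100.0},
}

# Assumption: used when nothing is known about the requesting DNS server.
DEFAULT_SURROGATE = "58.15.100.152"


def resolve(name: str, requesting_dns: str, ttl: int = 10) -> Dict[str, object]:
    """Return an A record whose address depends on who is asking."""
    measurements = RTT_MS.get(requesting_dns)
    if measurements:
        # Policy: pick the surrogate with the lowest measured RTT.
        address = min(measurements, key=measurements.get)
    else:
        address = DEFAULT_SURROGATE
    return {"name": name, "type": "A", "address": address, "ttl": ttl}


answer_for_udel = resolve("www.cnn.com", requesting_dns="128.4.4.12")
answer_for_unknown = resolve("www.cnn.com", requesting_dns="203.0.113.9")
```

Note that the decision is keyed on the requesting DNS server, not on the client itself, because the DNS server's address is all the CDN sees at resolution time.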
DNS based Request-Routing
[Figure: client merlot.cis.udel.edu (128.4.30.15) sends a DNS query for www.cnn.com to its local DNS server (louie.udel.edu, 128.4.4.12), which queries the Akamai DNS. The DNS response is "A 145.155.10.15", and the client then opens a session with the surrogate at 145.155.10.15. The Akamai CDN contains two surrogates, 145.155.10.15 and 58.15.100.152, with the names delaware.cnn.akamai.com and california.cnn.akamai.com.]
Q: How does the Akamai DNS know which surrogate is closest?
DNS based Request-Routing
[Figure: the surrogates in the Akamai CDN take measurements to the client DNS and report the measurement results to the Akamai DNS. Client merlot.cis.udel.edu (128.4.30.15) sends a DNS query for www.cnn.com through its local DNS server (louie.udel.edu, 128.4.4.12), receives the DNS response, and opens a session with a surrogate.]
DNS based Request Routing: Caching
[Figure: client 76.43.35.53 resolves www.cnn.com through client DNS 76.43.32.4. The Akamai DNS holds measurements from two surrogates (145.155.10.15 and 58.15.100.152) to the requesting DNS 76.43.32.4: one reports available bandwidth = 10 kbps and RTT = 10 ms, the other available bandwidth = 5 kbps and RTT = 100 ms. It selects surrogate 145.155.10.15 and answers "www.cnn.com A 145.155.10.15, TTL = 10s".]
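The effect of the TTL on the DNS answer can be sketched with a toy client-side DNS cache. The `DnsCache` class is hypothetical, and the clock is injectable only so that the behaviour can be shown without waiting; the address and the 10-second TTL come from the slide.

```python
import time
from typing import Callable, Dict, Tuple


class DnsCache:
    """Caches (address, expiry) per name and re-queries upstream after the TTL."""

    def __init__(
        self,
        upstream: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._upstream = upstream  # returns (address, ttl_seconds) for a name
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.upstream_queries = 0

    def lookup(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: the CDN's DNS is not consulted
        address, ttl = self._upstream(name)
        self.upstream_queries += 1
        self._entries[name] = (address, now + ttl)
        return address


fake_now = [0.0]
cache = DnsCache(
    upstream=lambda name: ("145.155.10.15", 10.0),
    clock=lambda: fake_now[0],
)
first = cache.lookup("www.cnn.com")   # miss: asks upstream
fake_now[0] = 5.0
second = cache.lookup("www.cnn.com")  # hit: answered from cache
fake_now[0] = 10.0
third = cache.lookup("www.cnn.com")   # TTL expired: asks upstream again
```

A short TTL lets the CDN change its surrogate choice quickly at the cost of more DNS queries; while an answer is cached, every client behind that DNS server is sent to the same surrogate.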
DNS based Request Routing Techniques: Discussion
• Originator Problem: Client may be far removed from client DNS
• Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
  Q: Which DNS server resolves pel.cis.udel.edu?
  Q: Which DNS server performs the last recursion of the DNS request?
• Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate – issue in load balancing requests, and predicting load on surrogates
Server Selection Metrics
• Network Proximity (Surrogate to Client):
  - Network hops (traceroute)
  - Internet mapping services (NetGeo, IDMaps)
  - …
• Surrogate Load:
  - Number of active TCP connections
  - HTTP request arrival rate
  - Other OS metrics
  - …
• Bandwidth Availability
• …
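One way to combine such metrics is a weighted score, sketched below. The formula, the weights, and the sample numbers are assumptions made for illustration; the slides list the metrics but do not say how a CDN combines them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SurrogateStats:
    name: str
    hops: int                 # network proximity to the client
    active_connections: int   # surrogate load
    bandwidth_kbps: float     # available bandwidth


def score(s: SurrogateStats, w_hops: float = 1.0, w_load: float = 0.1,
          w_bw: float = 0.5) -> float:
    """Lower is better: distance and load add cost, bandwidth reduces it."""
    return w_hops * s.hops + w_load * s.active_connections - w_bw * s.bandwidth_kbps


def select(candidates: Sequence[SurrogateStats]) -> SurrogateStats:
    """Pick the candidate with the lowest score."""
    if not candidates:
        raise ValueError("no surrogates to choose from")
    return min(candidates, key=score)


delaware = SurrogateStats("delaware.cnn.akamai.com", hops=4,
                          active_connections=50, bandwidth_kbps=10.0)
california = SurrogateStats("california.cnn.akamai.com", hops=15,
                            active_connections=50, bandwidth_kbps=5.0)
best = select([delaware, california])

# The same nearby surrogate under heavy load loses to the distant one.
busy_delaware = SurrogateStats("delaware.cnn.akamai.com", hops=4,
                               active_connections=500, bandwidth_kbps=10.0)
best_under_load = select([busy_delaware, california])
```

The second selection shows why proximity alone is not enough: a close surrogate that is overloaded can be a worse choice than a farther, idle one.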
Value of a CDN
• Scale: Aggregate infrastructure size
• Reach: Diversity of content locations (diverse placement of surrogates)
• Request routing efficiency, delivery techniques
Content Distribution Internetworking: CDI
• Interconnection of content networks – collaboration between caching proxies and CDNs, as well as between individual CDNs
• Greater reach, larger scale, higher capacity, increased fault tolerance
• A new area, lots of challenges
• Basic architecture involves gateways between various content networks
CDI: Architecture
[Figure: CDN1, CDN2, CDN3, and CN4 (for instance, the cache network of some ISP) interconnected through Content Peering Gateways]
Traditional networks
• Information processed at layers 1 through 3 of the OSI stack
• Units of transported data are frames and packets
Traditional vs. Overlay Content Networks
Content networks
Overlay "Content Layer" to enable richer services on top of layer 7 protocols (HTTP, RTSP)
• Information processed at layers 4 through 7 of the OSI stack
• Units of transported data in content networks are images, movies, songs
• Overlays are a concept that can be used to:
  - deploy new services on the Internet (Mbone, 6bone, Abone, Peer-to-Peer, Content Networks)
  - get around problems in the underlying technology (Resilient Overlay Networks)
• Further reading - Overlays:
  - www.savetz.com/mbone/
  - www.6bone.net/
  - nms.lcs.mit.edu/projects/ron/
  - www-2.cs.cmu.edu/~hzhang/VNS/
• Further reading - CDNs:
  - www.ietf.org/internet-drafts/draft-ietf-cdi-model-01.txt
  - www.ietf.org/internet-drafts/draft-ietf-cdi-known-request-routing-00.txt
  - Bunch of papers … send me mail if you are interested
• Questions? Answers? Thoughts?
In Summary
• Full-Site delivery is what we have seen so far – entire webpage is delivered from the CDN
• Partial-Site delivery delivers only embedded objects (say, only images on the webpage) from the CDN
• Embedded object redirection can be done using DNS based request routing or URL rewriting
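URL rewriting for partial-site delivery can be sketched as follows. This is a simplified illustration, not any CDN's actual scheme: the `rewrite_embedded` function, the host `cdn.example.net`, and the `http://<cdn host>/<origin host>/<path>` layout are all assumptions. Only `<img>` tags with double-quoted `src` attributes are handled, and relative paths are treated as relative to the site root.

```python
import re

CDN_HOST = "cdn.example.net"  # hypothetical CDN host name

# Matches the src attribute of an <img> tag; group 2 is the URL itself.
_IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def rewrite_embedded(html: str, origin_host: str, cdn_host: str = CDN_HOST) -> str:
    """Point embedded images at the CDN; leave the page's own links alone."""

    def to_cdn(match: "re.Match[str]") -> str:
        url = match.group(2)
        if url.startswith(("http://", "https://", "//")):
            return match.group(0)  # already absolute: leave unchanged
        path = url if url.startswith("/") else "/" + url
        return f"{match.group(1)}http://{cdn_host}/{origin_host}{path}{match.group(3)}"

    return _IMG_SRC.sub(to_cdn, html)


page = (
    '<html><body>'
    '<img src="image1.gif"><img src="/image2.gif">'
    '<a href="index.html">home</a>'
    '</body></html>'
)
rewritten = rewrite_embedded(page, origin_host="www.cnn.com")
```

After rewriting, the client still fetches index.html from the origin server, but its requests for the two images go to the CDN host named in the rewritten URLs.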
Full-Site vs. Partial-Site Content Delivery
Q: How many TCP connections are needed to do a P-HTTP transfer of a webpage with embedded objects using the above 2 techniques?
CDN with Full-Site Delivery
[Figure: the client sends "GET index.html" and "GET image1.gif, image2.gif" to the surrogate server in the CDN; index.html, image1.gif, and image2.gif flow from the origin server, which holds index.html with embedded image1.gif and image2.gif]
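CDN with Partial-Site Delivery
[Figure: the client sends "GET index.html" to the origin server and "GET image1.gif, image2.gif" to the surrogate server in the CDN; image1.gif and image2.gif flow from the origin server, which holds index.html with embedded image1.gif and image2.gif]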
CDN Types (Skeletal)
[Figure: classification tree rooted at "CDNs", with the labels Hosting CDN and Relaying CDN; Partial Site Content Delivery and Full Site Content Delivery; and, under Request Routing Techniques, URL Rewriting and DNS based]