![Page 1: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/1.jpg)
Hui Zhang, Fall 2012 1
15-441 Computer Networking
CDN and P2P
![Page 2: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/2.jpg)
Hui Zhang, Fall 2012 2
Content Distribution Networks(CDNs)
A third-party caching networks for content publishers - proxy caches deployed by network providers
Goals- Load balancing- Fast response- High availability- Handling flash crowd
Commercially successful- Akamai, Limelight, CDNetworks, …
content
replica
replicareplica
replica
replica
replica
Origin Server
2
![Page 3: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/3.jpg)
Hui Zhang, Fall 2012 33
Two Key Questions
How does CDN intercept a web request? How does CDN select a good server for a given
request?
![Page 4: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/4.jpg)
Hui Zhang, Fall 2012 44
Intercepting the Request: DNS Re-Direction
Root server gives NS record for akamai.net Akamai.net name server returns NS record for
g.akamaitech.net- Name server chosen to be in region of client’s name
server
- TTL is large
G.akamaitech.net nameserver chooses server in region
- Should try to chose server that has file in cache - How to choose?
- Uses aXYZ name and hash
- TTL is small why?
![Page 5: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/5.jpg)
Hui Zhang, Fall 2012 55
DNS Re-Direction In Action
End-user
cnn.com (content provider)DNS root server
1 2
3
4
Akamai high-level DNS server
Akamai low-level DNS server
Nearby matchingAkamai server
12
6
78
910
Get index.html
Get foo.jpg
5Local DNS server
11
13
14
Control Plane Call
Data Plane Call
![Page 6: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/6.jpg)
Hui Zhang, Fall 2012 66
P2P System
Leverage the resources of client machines (peers)- Computation, storage, bandwidth
Self-scaling
![Page 7: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/7.jpg)
Hui Zhang, Fall 2012 7
P2P Design Choices
Data plane- whole-file vs. chunks
Control plane- Centralized index (Napster, etc.)
- Flooding (Gnutella, etc.)
- Smarter flooding (KaZaA, …)
- Routing (Freenet, etc.)
![Page 8: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/8.jpg)
Hui Zhang, Fall 2012 8
How Different Solutions Fit In
Central Flood Super-node flood
Route
WholeFile
Napster Gnutella
Freenet
ChunkBased
BitTorrent
KaZaA (bytes, not chunks)
DHTseDonkey2000
![Page 9: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/9.jpg)
Hui Zhang, Fall 2012 9
Searching
Internet
N1
N2 N3
N6N5
N4
Publisher
Key=“title”Value=MP3 data… Client
Lookup(“title”)
?
![Page 10: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/10.jpg)
Hui Zhang, Fall 2012 10
Searching 2
Needles vs. Haystacks- Searching for top 40, or an obscure punk track from
1981 that nobody’s heard of?
Search expressiveness- Whole word? Regular expressions? File names?
Attributes? Whole-text search?
• (e.g., p2p gnutella or p2p google?)
![Page 11: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/11.jpg)
Hui Zhang, Fall 2012 11
Framework
Common Primitives:- Join: how to I begin participating?
- Publish: how do I advertise my file?
- Search: how to I find a file?
- Fetch: how to I retrieve a file?
![Page 12: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/12.jpg)
Hui Zhang, Fall 2012 12
Next Topic...
Centralized Database- Napster
Query Flooding- Gnutella
Intelligent Query Flooding- KaZaA
Swarming- BitTorrent
Unstructured Overlay Routing- Freenet
Structured Overlay Routing- Distributed Hash Tables
![Page 13: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/13.jpg)
Hui Zhang, Fall 2012 13
Napster: History
1999: Sean Fanning launches Napster Peaked at 1.5 million simultaneous users Jul 2001: Napster shuts down
![Page 14: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/14.jpg)
Hui Zhang, Fall 2012 14
Napster: Overiew
Centralized Database:- Join: on startup, client contacts central server- Publish: reports list of files to central server- Search: query the server => return someone that stores the
requested file- Fetch: get the file directly from peer
![Page 15: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/15.jpg)
Hui Zhang, Fall 2012 15
Napster: Publish
I have X, Y, and Z!
Publish
insert(X, 123.2.21.23)
...
123.2.21.23
![Page 16: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/16.jpg)
Hui Zhang, Fall 2012 16
Napster: Search
Where is file A?
Query Reply
search(A)-->
123.2.0.18Fetch
123.2.0.18
![Page 17: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/17.jpg)
Hui Zhang, Fall 2012 17
Napster: Discussion
Pros:- Simple
- Search scope is O(1)
- Controllable (pro or con?)
Cons:- Server maintains O(N) State
- Server does all processing
- Single point of failure
![Page 18: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/18.jpg)
Hui Zhang, Fall 2012 18
Next Topic...
Centralized Database- Napster
Query Flooding- Gnutella
Intelligent Query Flooding- KaZaA
Swarming- BitTorrent
Unstructured Overlay Routing- Freenet
Structured Overlay Routing- Distributed Hash Tables
![Page 19: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/19.jpg)
Hui Zhang, Fall 2012 19
Gnutella: History
In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella
Soon many other clients: Bearshare, Morpheus, LimeWire, etc.
In 2001, many protocol enhancements including “ultrapeers”
![Page 20: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/20.jpg)
Hui Zhang, Fall 2012 20
Gnutella: Overview
Query Flooding:- Join: on startup, client contacts a few other nodes;
these become its “neighbors”- Publish: no need- Search: ask neighbors, who ask their neighbors, and
so on... when/if found, reply to sender.• TTL limits propagation
- Fetch: get the file directly from peer
![Page 21: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/21.jpg)
Hui Zhang, Fall 2012 21
I have file A.
I have file A.
Gnutella: Search
Where is file A?
Query
Reply
![Page 22: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/22.jpg)
Hui Zhang, Fall 2012 22
Gnutella: Discussion
Pros:
- Fully de-centralized
- Search cost distributed
- Processing @ each node permits powerful search semantics Cons:
- Search scope is O(N)
- Search time is O(???)
- Nodes leave often, network unstable TTL-limited search works well for haystacks.
- For scalability, does NOT search every node. May have to re-issue query later
![Page 23: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/23.jpg)
Hui Zhang, Fall 2012 23
KaZaA: History
In 2001, KaZaA created by Dutch company Kazaa BV
Single network called FastTrack used by other clients as well: Morpheus, giFT, etc.
Eventually protocol changed so other clients could no longer talk to it
Most popular file sharing network today with >10 million users (number varies)
![Page 24: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/24.jpg)
Hui Zhang, Fall 2012 24
KaZaA: Overview
“Smart” Query Flooding:- Join: on startup, client contacts a “supernode”
... may at some point become one itself- Publish: send list of files to supernode- Search: send query to supernode,
supernodes flood query amongst themselves.- Fetch: get the file directly from peer(s); can
fetch simultaneously from multiple peers
![Page 25: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/25.jpg)
Hui Zhang, Fall 2012 25
KaZaA: Network Design
“Super Nodes”
![Page 26: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/26.jpg)
Hui Zhang, Fall 2012 26
KaZaA: File Insert
I have X!
Publish
insert(X, 123.2.21.23)
...
123.2.21.23
![Page 27: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/27.jpg)
Hui Zhang, Fall 2012 27
KaZaA: File Search
Where is file A?
Query
search(A)-->
123.2.0.18
search(A)-->
123.2.22.50
Replies
123.2.0.18
123.2.22.50
![Page 28: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/28.jpg)
Hui Zhang, Fall 2012 28
KaZaA: Fetching
More than one node may have requested file... How to tell?
- Must be able to distinguish identical files
- Not necessarily same filename
- Same filename not necessarily same file...
Use Hash of file- KaZaA uses UUHash: fast, but not secure
- Alternatives: MD5, SHA-1
How to fetch?- Get bytes [0..1000] from A, [1001...2000] from B
- Alternative: Erasure Codes
![Page 29: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/29.jpg)
Hui Zhang, Fall 2012 29
KaZaA: Discussion
Pros:- Tries to take into account node heterogeneity:
• Bandwidth
• Host Computational Resources
• Host Availability (?)
- Rumored to take into account network locality
Cons:- Mechanisms easy to circumvent
- Still no real guarantees on search scope or search time
Similar behavior to gnutella, but better.
![Page 30: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/30.jpg)
Hui Zhang, Fall 2012 30
Stability and Superpeers
Why superpeers?- Query consolidation
• Many connected nodes may have only a few files
• Propagating a query to a sub-node would take more b/w than answering it yourself
- Caching effect• Requires network stability
Superpeer selection is time-based- How long you’ve been on is a good predictor
of how long you’ll be around.
![Page 31: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/31.jpg)
Hui Zhang, Fall 2012 31
BitTorrent: History
In 2002, B. Cohen debuted BitTorrent Key Motivation:
- Popularity exhibits temporal locality (Flash Crowds)
- E.g., Slashdot effect, CNN on 9/11, new movie/game release
Focused on Efficient Fetching, not Searching:- Distribute the same file to all peers
- Single publisher, multiple downloaders
Has some “real” publishers:- Blizzard Entertainment using it to distribute the beta of
their new game
![Page 32: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/32.jpg)
Hui Zhang, Fall 2012 32
BitTorrent: Overview
Swarming:
- Join: contact centralized “tracker” server, get a list of peers.
- Publish: Run a tracker server.
- Search: Out-of-band. E.g., use Google to find a tracker for the file you want.
- Fetch: Download chunks of the file from your peers. Upload chunks you have to them.
Big differences from Napster:
- Chunk based downloading (sound familiar? :)
- “few large files” focus
- Anti-freeloading mechanisms
![Page 33: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/33.jpg)
Hui Zhang, Fall 2012 33
BitTorrent: Publish/Join
Tracker
![Page 34: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/34.jpg)
Hui Zhang, Fall 2012 34
BitTorrent: Fetch
![Page 35: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/35.jpg)
Hui Zhang, Fall 2012 35
BitTorrent: Sharing Strategy
Employ “Tit-for-tat” sharing strategy- A is downloading from some other people
• A will let the fastest N of those download from him
- Be optimistic: occasionally let freeloaders download• Otherwise no one would ever start!
• Also allows you to discover better peers to download from when they reciprocate
- Let N peop
Goal: Pareto Efficiency- Game Theory: “No change can make anyone better off
without making others worse off”
- Does it work? (don’t know!)
![Page 36: Hui Zhang, Fall 2012 1 15-441 Computer Networking CDN and P2P](https://reader035.vdocuments.us/reader035/viewer/2022062322/5697bfab1a28abf838c9b30c/html5/thumbnails/36.jpg)
Hui Zhang, Fall 2012 36
BitTorrent: Summary
Pros:- Works reasonably well in practice
- Gives peers incentive to share resources; avoids freeloaders
Cons:- Pareto Efficiency relative weak condition
- Central tracker server needed to bootstrap swarm
- Tracker is a design choice, not a requirement. Could easily combine with other approaches.