
Introduction to The Internet and Peer-to-Peer networks

Dr. Ir. Alexandru Iosup

Parallel and Distributed Systems

Delft University of Technology

[email protected]

The course was mainly developed by Dr. Ir. Johan Pouwelse

http://www.pds.ewi.tudelft.nl/~iosup/Courses/Internet_introduction_and_peer_to_peer_networks_class.pdf

http://www.pds.ewi.tudelft.nl/~iosup/Courses/2009-2010_IN1105.pdf

Content

• Networks
• Protocol layering
• IP numbers
• Internet routing
• Packet loss
• Reliable transport
  • Congestion
  • TCP
• Assignment
  • Traceroute
  • mtr
  • wget
• Part II: Peer-to-Peer networks

Important networks

• The Internet
• Telephony
• Analog TV
• LAN parties

Networking history

• 1800 – Optical Telegraph Network
  • 20 characters/min
  • code book with 25,392 entries
• 1848 – Morse code, 20 bits per second
• 1866 – Atlantic telegraph cable, 7 words/min
• 1890 – 250,000 telephones in the US
• 1942 – First mainframe
• 1960 – First 'dataphone'
• 1965 – Arpanet
• 1969 – 4 Internet computers
• 1972 – 24 computers
• 1977 – 111 computers

Network types

• Connecting computers
• Distance
  • LAN
  • WAN

Networking Layers

4  Transport   Packetizing, sequence numbers, retransmission: e.g. TCP
3  Network     Routing, routing tables: e.g. IP
2  Data Link   Interface to physical media, recovery: e.g. retransmit on collision in Ethernet
1  Physical    Electrical and mechanical characteristics of physical media: e.g. Ethernet

• Building blocks
• Good architecture
• Reduces cost
• Easier to design, analyse, implement, and test

Physical media

Media                      Bandwidth              Maximum distance  Cost/meter  Cost for termination  Labor cost to install  Cost per computer interface
Single-mode optical fiber  2000 Mb/sec            100 km            $1.64       $23.90                $10.00                 ~$1000
Multimode optical fiber    600 Mb/sec             2 km              $1.03       $11.80                $10.00                 ~$1000
Coaxial cable              10 Mb/sec              1 km              $1.64       $220.00               $15.00                 ~$5
Twisted pair copper wire   1 Mb/sec (20 Mb/sec)   2 km (0.1 km)     $0.23       $4.60                 $2.00                  ~$2

(Layer diagram: 3 Network, 2 Data Link, 1 Physical)

The Internet

• World's biggest WAN
• Send and receive packets
• Any Internet connected host
• Any content
• Reliable & Fast & Cheap

(Figure: wlan.ewi.tudelft.nl and Google.com)

IP numbers

• Unique identity for each computer & router
• 4 bytes long = 32 bits: 2^32 different addresses
• TUDelft webserver
  • wlan.ewi.tudelft.nl
  • 130.161.158.72
• Google search engine
  • www.google.com
  • 64.233.183.99
• Foundation for Internet routing
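
The "4 bytes = 32 bits" point can be made concrete with a small sketch. It packs the four decimal fields of a dotted address into one 32-bit number and back. The function names `ip_to_int` and `int_to_ip` are illustrative and not part of the course material; the addresses are the ones listed on the slide.

```python
def ip_to_int(dotted):
    """Pack a dotted IPv4 address into a single 32-bit integer."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not an IPv4 address: {dotted!r}")
    value = 0
    for part in parts:
        # Each field is one byte: shift what we have left by 8 bits, append it.
        value = (value << 8) | part
    return value


def int_to_ip(value):
    """Unpack a 32-bit integer into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    for name, address in [("wlan.ewi.tudelft.nl", "130.161.158.72"),
                          ("www.google.com", "64.233.183.99")]:
        print(name, address, ip_to_int(address))
    print("addresses available:", 2 ** 32)
```

Because every address is just a number between 0 and 2^32 - 1, a router can compare addresses and address ranges with ordinary integer arithmetic.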

(Layer diagram: 4 Transport, 3 Network, 2 Data Link, 1 Physical)

IP number example

Internet routing

• Routes
  • Shortest path
  • Cheapest path
  • Fastest path
• Routing tables

(Figure: wlan.ewi.tudelft.nl and Google.com)

/usr/sbin/traceroute Google.com

• Started from one IP number
• Trace towards one location
• Show Internet route
• Numerous routers
• For each router
  • IP addresses and names
  • Distance in ms (round-trip time)

(Figure: route from wlan.ewi.tudelft.nl towards Google.com with hops numbered 1 to 4. Labelled routers: gateway.its.tudelft.nl (130.161.211.1), dunet1.tudelft.nl (130.161.1.49), 145.145.26.9)

Traceroute Google.com

Traceroute Yahoo.com

Routing tables

• Final destination IP range
• Next IP number (next hop)

(Figure: routes from 130.161.211.1 towards Google.com and Yahoo.com. Hops 1 to 4 are labelled in common; hops 5G, 6G and 5Y, 6Y are labelled per destination. Routers shown: 145.145.164.5, 145.145.160.14, 145.145.160.17)

Routing table: 145.145.164.5

Destination                         Next hop
216.109.112.0 – 216.109.127.255     145.145.160.14
216.239.32.0 – 216.239.63.255       145.145.160.17
etc.
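
A routing-table lookup of this kind can be sketched in a few lines. The two rows below are copied from the routing table of 145.145.164.5 on the slide. The function name `next_hop` and the choice to return `None` for an unknown destination are assumptions made for illustration; a real router would also have further rows and a default route.

```python
import ipaddress

# Rows from the slide: (first address, last address, next hop).
ROUTING_TABLE = [
    ("216.109.112.0", "216.109.127.255", "145.145.160.14"),
    ("216.239.32.0", "216.239.63.255", "145.145.160.17"),
]


def next_hop(destination, table=ROUTING_TABLE):
    """Return the next hop for a destination, or None if no row matches."""
    dest = int(ipaddress.IPv4Address(destination))
    for first, last, hop in table:
        low = int(ipaddress.IPv4Address(first))
        high = int(ipaddress.IPv4Address(last))
        if low <= dest <= high:
            return hop
    return None


if __name__ == "__main__":
    # The two test addresses are hypothetical examples inside each range.
    for address in ("216.109.118.65", "216.239.59.99", "130.161.158.72"):
        print(address, "->", next_hop(address))
```

Each router only needs to know the next step, not the whole path; the packet reaches its destination because every router along the way repeats the same lookup.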

Cisco 12416 Internet router

Dutch Research Internet SURFnet connects ~200 locations (~750K users):

- universities
- academic hospitals
- polytechnic schools
- research centers
- 6,000 km of connections, comparable to the Dutch railway system

DAS-3

Sources: Cees te Laat and Henri Bal

EU Research Internet

ABILENE: Backbone Research Network

• Test: Land Speed Record
  • ~7 Gb/s in a single TCP stream from Geneva to Caltech

Source: MonALISA monitoring framework, 2005

Packet loss

• Overload of routers
• Limited bandwidth
• Example connections
  • Delft -> Amsterdam == 1 Gbps
  • Eindhoven -> Amsterdam == 1 Gbps
  • Twente -> Amsterdam == 1 Gbps
  • Amsterdam -> New York == 1 Gbps
• Traffic jam of The Internet
• Solution
  • Reduce bandwidth usage
  • File transfers take more time
  • TCP
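
The example connections can be turned into a short calculation. This is a worst-case sketch under a stated assumption: all three Dutch links run at their full 1 Gbps and all of that traffic is headed for New York. The function name `drop_fraction` is illustrative.

```python
LINK_GBPS = 1.0

# Links feeding the Amsterdam router, taken from the slide.
INBOUND = {"Delft": LINK_GBPS, "Eindhoven": LINK_GBPS, "Twente": LINK_GBPS}


def drop_fraction(offered_gbps, capacity_gbps):
    """Fraction of traffic a router must drop when offered load exceeds capacity."""
    if offered_gbps <= capacity_gbps:
        return 0.0
    return (offered_gbps - capacity_gbps) / offered_gbps


if __name__ == "__main__":
    offered = sum(INBOUND.values())
    dropped = drop_fraction(offered, LINK_GBPS)
    print(f"offered {offered} Gbps on a {LINK_GBPS} Gbps link, dropped {dropped:.0%}")
```

With 3 Gbps arriving and 1 Gbps leaving, two thirds of the packets cannot be forwarded, which is why senders have to reduce their rate.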

mtr: Matt's Trace Route

• Shows packet loss on a link
• Expansion of traceroute
• Multiple packets
• Latency
  • last
  • average
  • best
  • worst
  • std. dev
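
The per-hop columns listed above can be computed from a list of probe results. The sketch below is not mtr's own code: the function name `summarize`, the use of `None` for a lost probe, and the use of the population standard deviation are all assumptions made for illustration.

```python
import statistics


def summarize(rtts_ms):
    """Summarize probes to one hop; each entry is an RTT in ms or None if lost."""
    if not rtts_ms:
        raise ValueError("need at least one probe")
    answered = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    if not answered:
        return {"loss_pct": loss_pct}
    return {
        "loss_pct": loss_pct,
        "last": answered[-1],
        "avg": statistics.mean(answered),
        "best": min(answered),
        "worst": max(answered),
        "stdev": statistics.pstdev(answered),
    }


if __name__ == "__main__":
    # Hypothetical probe results: four probes, the third one was lost.
    print(summarize([10.0, 12.0, None, 14.0]))
```

A high loss percentage at one hop that persists on all later hops points at an overloaded link; a high standard deviation points at queueing delay.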

mtr BTChina.net

mtr example (from Shanghai)

TCP: Reliable transport

• Slice into packets
• Re-transmissions of packets
• Understands router overload => packet loss
• Conducts rate control
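
The rate-control idea can be illustrated with additive increase, multiplicative decrease (AIMD), the scheme commonly associated with TCP congestion control. This is a heavily simplified sketch, not a TCP implementation: there is no slow start, no timeouts, and the rounds in which loss occurs are supplied by hand. The function name `aimd` and its parameters are illustrative.

```python
def aimd(rounds, loss_rounds, start=1.0, increase=1.0, decrease=0.5):
    """Return the sending window (in packets) used in each round.

    After a round without loss the window grows by `increase`;
    after a round with loss it is multiplied by `decrease`.
    """
    window = start
    history = []
    for round_number in range(rounds):
        history.append(window)
        if round_number in loss_rounds:
            # Packet loss is read as a sign of router overload: back off.
            window = max(1.0, window * decrease)
        else:
            window += increase
    return history


if __name__ == "__main__":
    print(aimd(rounds=6, loss_rounds={3}))
```

The sender probes for spare bandwidth slowly and retreats quickly, so that several senders sharing an overloaded router all back off.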

(Layer diagram: 4 Transport, 3 Network, 2 Data Link, 1 Physical)

WGET tool

• Download speed testing of TCP on web server
• Example locations (use Google):
  • http://62.212.86.199/100mb.bin
  • http://fuller.zen.co.uk/test/100MB_zero.bin
  • http://speedtest2.eastlink.ca/superlarge.bin
  • http://www.vianet.ca/100mb-uncompressed.bin

Assignment commands:

wget -O /dev/null 'http://www2.tele2.nl/100mb.bin'

/usr/sbin/traceroute www2.tele2.nl

MP3 from:

http://www.archive.org/details/on_liberty_librivox

http://www.archive.org/download/ramp315-Stochastic-Midnite-on-Tatooine.mp3
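
When reading wget's numbers it helps to convert between file size, time, and speed. The sketch below assumes decimal units (1 MB = 1,000,000 bytes, 1 Mbps = 1,000,000 bits per second) and assumes a test file called 100mb.bin really is 100 MB; both function names are illustrative.

```python
def throughput_mbps(num_bytes, seconds):
    """Average download speed in megabits per second."""
    return num_bytes * 8 / seconds / 1_000_000


def transfer_seconds(num_bytes, mbps):
    """Time needed to move num_bytes at a steady speed of mbps."""
    return num_bytes * 8 / (mbps * 1_000_000)


if __name__ == "__main__":
    size = 100 * 1_000_000  # assumed size of a 100 MB test file
    print(f"{throughput_mbps(size, 80.0):.1f} Mbps if the download takes 80 s")
    print(f"{transfer_seconds(size, 2.3):.0f} s to download at 2.3 Mbps")
```

Note the factor of 8: file sizes are given in bytes, link speeds in bits per second.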

mtr and wget display

• Locations: Delft, Amsterdam, London, Boston
• Routing: New York <=> Halifax

(Figure label: 2.3 Mbps)

Assignment results

• Draw various routes
• Add WGET performance numbers

Part II

Contents: Peer-to-Peer networks

• P2P file sharing
  • Usage
  • Napster/Gnutella/Kazaa
  • General model
• Bittorrent/Suprnova.org
  • Searching
  • Downloading
  • Bartering
• Understanding uptime
• Assignment
  • MP3 download
  • Spy on other downloaders

Definition: Aggregate use of distributed resources

• Resources at the "edge" of the internet
• PCs are the dark matter of the internet
• Cheap or near-zero cost

• In aggregate, these resources are valuable

• They are impossible to aggregate using traditional models

• P2P applications create novel ways of aggregating these resources

P2P - The new medium

P2P Benefits

• Cost structure
  • Setup
  • Maintain
• Scalability (superior to Client/Server)
• Availability (efficiency + reliability)
• Fault-tolerance (recovers from errors)
• Self-organization (deals with dynamic activities)
• Other application specific benefits

P2P examples

• Skype (Communication)
  • Millions of users
  • Superpeers
  • NAT circumvention/Firewall avoidance
  • Expand to video (Webcam/TV channel)
  • 0.017 Euro per minute
• Wikipedia.org (Information)
  • People collaboration
  • Joined knowledge
• File Sharing (The new medium)
  • Controversial
  • >70% Internet traffic

P2P file sharing

Popular application
• 75% of EU broadband users use P2P every month (Jupiter Media)
• 35M people in the EU have downloaded music from a file sharing service (Forrester Research)

Various generations
• Napster
• Gnutella / Kazaa
• eMule
• Bittorrent

Get files
• Movies / TV shows (DivX)
• Music (MP3)
• Games / Applications (ISO)

Evolution

• -1990 V1.0 Floppy & Tape
• 1990s V2.0 FTP & Web servers
• 1999 V3.0 Napster
• 2001 V4.0 Gnutella / Kazaa
• 2003 V5.0 Bittorrent
• 2005 V6.0 Azureus++
• 2007 V7.0 Tribler ??

Internet bandwidth: HTTP and P2P

Internet P2P backbone Bandwidth 2004

Internet P2P backbone Bandwidth 2006

Generic model of P2P file sharing

(Figure labels: off-line, idle, search, download, inject, Rating & Moderation)

P2P History: Napster

• program for sharing files over the Internet
• a “disruptive” application/technology?
• history:
  • 5/99: Shawn Fanning founds Napster
  • 12/99: first lawsuit
  • 3/00: 25% university traffic
  • 2000: est. 60M users
  • 2/01: US Circuit Court of Appeals: Napster knew users were violating copyright laws
  • 7/01: # simultaneous online users: Napster 160K, Gnutella 40K, Morpheus 300K
  • 1/04: 66% of Internet bandwidth

Napster

(Figure: users connected to napster.com, which holds the central file list)
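
The central file list can be sketched as a single index that maps file names to the users offering them. This is an illustration of the idea only, not Napster's protocol: the class name `CentralIndex`, its methods, and the peer addresses are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """One server-side list of who shares which file."""

    def __init__(self):
        self._files = defaultdict(set)  # file name -> set of peer addresses

    def register(self, peer, file_names):
        """A user comes online and reports the files it shares."""
        for name in file_names:
            self._files[name.lower()].add(peer)

    def unregister(self, peer):
        """A user goes offline; its files are no longer available."""
        for peers in self._files.values():
            peers.discard(peer)

    def search(self, keyword):
        """Return matching file names with the peers that offer them."""
        keyword = keyword.lower()
        return {name: sorted(peers)
                for name, peers in self._files.items()
                if keyword in name and peers}


if __name__ == "__main__":
    index = CentralIndex()
    index.register("10.0.0.1:6699", ["Song_A.mp3"])
    index.register("10.0.0.2:6699", ["song_a.mp3", "talk.mp3"])
    print(index.search("song"))
```

Only the search goes through the server; the file itself is then fetched directly from one of the listed peers. The single index is also a single point of failure and of legal attack.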

Gnutella: Flooding

(Figure: requests and results)
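
Flooding can be sketched as a breadth-first search with a hop limit: each node passes the request on to all of its neighbours until the limit runs out. This is a simplified model and not the Gnutella wire protocol; the function name `flood_search`, the example overlay, and the file name are hypothetical.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a request from `start`; return (nodes with the file, messages sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits = []
    messages = 0
    while queue:
        node, hops_left = queue.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)
        if hops_left == 0:
            continue  # hop limit reached: do not forward any further
        for neighbour in graph.get(node, ()):
            messages += 1  # every forwarded copy costs bandwidth
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return hits, messages


if __name__ == "__main__":
    overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
               "D": ["B", "C", "E"], "E": ["D"]}
    shared = {"D": {"x.mp3"}, "E": {"x.mp3"}}
    print(flood_search(overlay, shared, "A", "x.mp3", ttl=2))
    print(flood_search(overlay, shared, "A", "x.mp3", ttl=3))
```

There is no central server to shut down, but the number of messages grows quickly with the hop limit, and nodes beyond the limit are never found.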

Kazaa users

Date               Users
20 April 2001      20,000
6 June 2001        150,000
11 October 2001    550,000
26 December 2001   515,000
31 March 2003      3,450,000

Kazaa problems

• Every download == upload
  – Bandwidth is a resource
  – 80% of Gnutella users do not share
• Fake files
  – Popular films/MP3s/Games
  – Renames of content
• Limited search capability
  – Supernodes do not scale
  – Napster-like indexing
  – Low-quality metadata

Fake files

Fake Files

• Deliberate attack
• Patented
• Junk files
• Virus (search for .pif .exe .bat etc.)
• Everyone can do it
  • Unmoderated My Shared Folder
  • Multiple machines
• Problems:
  – Hard to find
  – Download and delete
  – Irritates users

Bittorrent & Suprnova.org

Bittorrent & Suprnova.org

• Suprnova.org
  • Search
  • Inject
  • Rating & Moderation
• Bittorrent
  • Download

(Figure labels: off-line, idle, search, download, upload, Rating & Moderation)

Bittorrent overview

(Figure: three peers, a tracker, and Suprnova.org hosting LotR.torrent)

Bartering for bytes

(Figure: one seed and three leechers, labelled with low, medium, and high download speed)

• Download speed: upload-bandwidth constrained
• Peer selection policy: with highest bandwidth first
• Piece selection policy: rarest first
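
The two policies can be sketched as two small selection functions. This is an illustration of the ideas on the slide, not code from any BitTorrent client: the function names, the tie-breaking rules, and the example swarm are assumptions.

```python
from collections import Counter


def rarest_first(needed, peer_pieces):
    """Pick the needed piece that the fewest connected peers have.

    needed: set of piece numbers still missing.
    peer_pieces: dict mapping peer name to the set of pieces it holds.
    Ties are broken by the lowest piece number (an arbitrary choice).
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces & needed)
    if not counts:
        return None  # nobody connected has anything we still need
    return min(counts, key=lambda piece: (counts[piece], piece))


def fastest_peers(rates_kbps, slots):
    """Pick the peers that currently give us the most bandwidth."""
    ranked = sorted(rates_kbps, key=lambda peer: (-rates_kbps[peer], peer))
    return ranked[:slots]


if __name__ == "__main__":
    swarm = {"A": {0, 1}, "B": {0, 2}, "C": {0, 1}}
    print(rarest_first({0, 1, 2}, swarm))
    print(fastest_peers({"A": 50, "B": 200, "C": 120}, slots=2))
```

Fetching rare pieces first keeps every piece available in the swarm even when the seed leaves; favouring the fastest peers rewards those who upload, which is the bartering.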

Bittorrent in the real-world

• 100 DAS-2 nodes (1 GHz Pentium IIIs, 1 GB RAM)

• 8-month traces of more than 2,000 global components

• Complete lifetime of a popular file (90,000 peers)

• Uptime measurement of 55,000 peers

• 150 GB of collected data

Peer availability

(Figure: peer availability. UPTIME marks: 3 min, 3 days, 3 weeks, 3 months. PEERS: reliable to not reliable.)

147.32.223.89

89.141.64.128

212.39.232.10160.120.240.10

121.167.190.43

Peer availability

Overall system activity

The number of active users in the system is strongly influenced by the availability of the global components in BitTorrent/Suprnova.

www.pds.ewi.tudelft.nl/~iosup/p2pview.html

Tribler P2P research

P2P file sharing ecosystem

• Significant social demand
  • 85,000,000 addicted users
  • 24 hours (songs/games/TV shows/movies)
• Legal/technical counter-measures ineffective
  • Insufficient knowledge at content providers
  • Not a single operational DRM system
  • DRM does not work (according to Shamir, the S of RSA)
  • Fake files already ineffective
• Artist to consumer
• Possibly dying businesses
  • MP3
  • MPEG
  • ISO

The big picture

• Skype: phone call <0.03 $/min
• Wikipedia: Encyclopedia Britannica
• Newspapers: blogging versus NRC
• Talpa: reduction in advertisement income
• Bittorrent movies: DVD sales ?
• Tribler: 150,000+ downloads

Hollywood Future: Vision on “Compete with free”

• Understand and follow P2P file sharing
• Be a superior alternative
• Add value
• Stand-alone exploitation ends!
• Be a friend:
  • Music label
  • Lawsuits?
  • Pricing?

MP3 download Assignment

• Assignment
  • Visit http://releases.ubuntu.com
  • Download an ISO image, check download speed
  • Install BT: http://download.bittorrent.com/dl/BitTorrent-4.0.0-GPL.tar.gz
• Downloaders details
  • command: lsof -n | grep 'ktorrent.*TCP'
  • Locate position of downloaders using http://HostIP.info
  • wget -q -O - 'http://www.hostip.info/api/get.html?position=true&ip=130.161.43.249'

./btdownloadheadless.py --url 'http://releases.ubuntu.com/6.06.2/ubuntu-6.06.1-desktop-i386.iso.torrent'
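
The list of other downloaders can also be extracted from the lsof output with a short script. The sketch below assumes the usual `local->remote` form in which lsof prints established TCP connections; the sample lines, process ID, and user name are hypothetical, and the program name `ktorrent` is taken from the grep pattern above.

```python
import re

# Matches "TCP <local>-><remote IPv4>:<port>" and captures the remote address.
REMOTE = re.compile(r"TCP\s+\S+->(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def remote_peers(lsof_output, program="ktorrent"):
    """Return the sorted remote IP addresses the given program is connected to."""
    peers = set()
    for line in lsof_output.splitlines():
        if not line.startswith(program):
            continue
        match = REMOTE.search(line)
        if match:  # listening sockets have no "->" and are skipped
            peers.add(match.group(1))
    return sorted(peers)


if __name__ == "__main__":
    sample = "\n".join([
        "ktorrent 4242 alice 23u IPv4 91011 0t0 TCP 130.161.43.249:6881->89.141.64.128:51413 (ESTABLISHED)",
        "ktorrent 4242 alice 24u IPv4 91012 0t0 TCP *:6881 (LISTEN)",
        "firefox 555 alice 60u IPv4 91013 0t0 TCP 130.161.43.249:40000->64.233.183.99:80 (ESTABLISHED)",
    ])
    print(remote_peers(sample))
```

Each address in the result can then be passed to the HostIP.info query to find out where that downloader is located.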

Exercise Solution

• In Browser
  • Go to http://releases.ubuntu.com
  • Locate a torrent file corresponding to an Ubuntu release, e.g., http://releases.ubuntu.com/6.06.2/ubuntu-6.06.1-desktop-i386.iso.torrent
• In Terminal/Console
  • cd
  • wget http://download.bittorrent.com/dl/BitTorrent-4.0.0-GPL.tar.gz
  • tar zxvf BitTorrent-4.0.0-GPL.tar.gz
  • cd BitTorrent-4.0.0-GPL
  • ./btdownloadheadless.py --url <TorrentURL>, e.g., ./btdownloadheadless.py --url http://releases.ubuntu.com/6.06.2/ubuntu-6.06.1-desktop-i386.iso.torrent

Internet information

History: http://www.isoc.org/internet/history/
TCP: http://pclt.cis.yale.edu/pclt/COMM/TCPIP.HTM
Routing: http://www.cc.gatech.edu/classes/AY2005/cs3251_fall/ROUTING.pdf
Peering: http://www.openpeering.nl/
Size: http://www.firstmonday.org/issues/issue3_10/coffman/
Addresses: http://bgp.potaroo.net/ipv4/
Monitoring the Internet (CAIDA): http://www.caida.org/

P2P information

Tutorial: http://cis.poly.edu/~ross/papers/P2PtutorialInfocom.pdf
Bittorrent: http://www.theregister.co.uk/2004/12/18/bittorrent_measurements_analysis
Kazaa 1: http://cis.poly.edu/~ross/papers/KazaaOverlay.pdf
Kazaa 2: http://www.apcmag.com/apc/v3.nsf/0/AEC8DF4A9B2F3F06CA256DB00010EC9A
P2P blog: http://www.plasticbag.org/archives/2003/09/weblogs_and_the_mass_amateurisation_of_nearly_everything.shtml