![Page 1: Dynamics of Peer-to-Peer Networks or Who is Going to be The Next Pop Star? Yuval Shavitt School of Electrical Engineering shavitt@eng.tau.ac.il shavitt](https://reader036.vdocuments.us/reader036/viewer/2022081602/551bede2550346b4588b6489/html5/thumbnails/1.jpg)
Dynamics of Peer-to-Peer Networks or
Who is Going to be The Next Pop Star?
Yuval Shavitt
School of Electrical Engineering
shavitt@eng.tau.ac.il
http://www.eng.tau.ac.il/~shavitt
Credits
Talk is based on the papers:
• Static and dynamic characterization of the Gnutella network [Shaked-Gish, S, Tankel, IPTPS 2007]
• How to predict the next pop star? [Koenigstein, S, Tankel, KDD 2008]
What are Peer-to-Peer Networks?
• The common computing paradigm is client-server
– Server waits for requests (on a known port)
– Client sends a request
– Server serves the client
– Examples: WWW, FTP, SMTP (e-mail), …
• Peer-to-peer networks:
– Each end-point is both client and server
[Figure: several clients connected to one server]
The Gnutella Network
• Gnutella: The most popular sharing network on the Internet
• According to the Digital Music News Research Group, it held a 40% market share in Q4 2007
• Limewire: The most popular file sharing client in the world. Dominates the Gnutella network.
The Gnutella Protocol
• Originally: a flat peer-to-peer distributed protocol.
– Churn caused instability
• Today: a 2-level tiered system
– Stable nodes are promoted to become ultrapeers
– Queries carry an OOB (out-of-band) address: the originator’s address or, in most cases when the client is firewalled, the ultrapeer’s address
Locating the Origin IP address
IP resolution process:
• Detect the ultrapeer (UP) IP
• Discard queries with more than 2 hops
• Discard queries with 2 hops and the same IP
• Intercept queries with 2 hops and different IPs
[Figure: peers, ultrapeers (UP), and the listener]
Cancels the bias for rare queries
Introduces bias against firewalled clients
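The filtering rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Query` record and its field names are hypothetical (they are not Gnutella protocol field names), and the slide does not say what happens to queries with fewer than 2 hops, so the sketch simply drops them as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    # Hypothetical record; field names are illustrative only.
    text: str
    hops: int           # hop count seen by the listener
    oob_ip: str         # OOB address carried in the query
    ultrapeer_ip: str   # IP of the ultrapeer detected for this query


def origin_ip(query: Query) -> Optional[str]:
    """Return the originator's IP, or None if the query is discarded."""
    if query.hops != 2:
        # More than 2 hops: discarded per the slide.
        # Fewer than 2 hops: not covered by the slide, dropped here by assumption.
        return None
    if query.oob_ip == query.ultrapeer_ip:
        # The OOB address is the ultrapeer's own address (e.g. firewalled client).
        return None
    return query.oob_ip


queries = [
    Query("some song", hops=3, oob_ip="198.51.100.7", ultrapeer_ip="203.0.113.1"),
    Query("some song", hops=2, oob_ip="203.0.113.1", ultrapeer_ip="203.0.113.1"),
    Query("some song", hops=2, oob_ip="198.51.100.7", ultrapeer_ip="203.0.113.1"),
]
kept = [origin_ip(q) for q in queries if origin_ip(q) is not None]
print(kept)
```

Only the last query survives: it arrived with exactly 2 hops and carries an address different from the ultrapeer's.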
Data Sets
• First study:
– Jul 2006 - Nov 2006
– 665,000,000 world-wide geo-identified queries
• Second study:
– Oct 2006 - Jul 2007, Sundays only
– 310,000,000 USA geo-identified queries
• A network crawl of 24 hours
– 1.2M users
– 533,000 different songs
Largest studies ever performed in length and depth
Query Classification in Gnutella
[Pie chart: share of queries by category]
• Music (68.11%)
• Adult (22.01%), 2nd largest category
• Movie (4.1%)
• TV (1.7%)
• Unknown (1.67%)
• Japanese Anime/Comic (1.37%)
• Software (0.54%)
• File Suffix (0.26%)
• Spam (0.23%)
Top Countries
Queries Per Day
Queries Per Hour Per User
Top Queries (constant)
Top Volatile Queries
Temporal Ranking Drift
How to Predict an Artist’s Success?
Noam Koenigstein, Y. Shavitt, and Tomer Tankel. Spotting Out Emerging Artists Using Geo-Aware Analysis of P2P Query Strings. The 2008 ACM SIGKDD Conference, August 2008, Las Vegas, NV, USA.
The Word of Mouth Effect
A successful innovation → formation of adopter clusters around early adopters
An unsuccessful product → a uniform spatial distribution
The divergence from the uniform distribution can be used to predict a new product’s probability of success [Garber et al., Marketing Science 2004]
The divergence
• When measured against the uniform distribution, the maximum is achieved when P is a delta function.
– True for both Kullback-Leibler and Jensen-Shannon
– This is the case when emerging artists are considered
• Non-uniform distribution of potential adopters:
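As a rough illustration of the first bullet, the sketch below computes the standard Kullback-Leibler and Jensen-Shannon divergences (base-2 logarithms) of a few made-up spatial distributions against the uniform one. The distributions, the number of locations, and the function names are assumptions for the example; the slide's formula for a non-uniform population of potential adopters is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits; terms with p_i == 0 add 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of P and Q against their midpoint M."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


n = 8                                   # hypothetical number of locations
uniform = [1 / n] * n                   # queries spread evenly over all locations
delta = [1.0] + [0.0] * (n - 1)         # all queries come from a single location
spread = [0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05]  # partly clustered

for name, p in [("uniform", uniform), ("spread", spread), ("delta", delta)]:
    print(name, kl_divergence(p, uniform), js_divergence(p, uniform))
```

With 8 locations the KL divergence of the single-location distribution equals log2(8) = 3 bits, the largest value any distribution over those locations can reach against the uniform one; the partly clustered distribution falls in between.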
Party Like a Rockstar in 2007
Week 6: The string “party like a rockstar” is detected by the algorithm
Week 8 (Feb 18th): Atlanta’s popularity chart
Week 15: Atlanta-based Shop Boyz sign contract with Universal Recordings
Week 18: The song first enters the Billboard Hot 100 (80th position)
Week 23: Reached 2nd position on Billboard Hot 100
Ranked only 10,156 on the global chart
Party Like a Rockstar
[Figure: Shop Boyz Popularity and Divergence in 2007. X-axis: Week Numbers (2007), weeks 1 to 23. Y-axes: Divergence (0 to 3) and Popularity (0 to 8.00E-02). Series: KL Divergence, Popularity. Inset: Shop Boyz related queries in February 2007]
Soulja Boy
[Figure: Soulja Boy popularity and divergence. X-axis: Week Numbers (2007), weeks 1 to 29. Y-axes: Divergence (0 to 0.8) and Popularity (0 to 6.00E-02). Series: KL Divergence, Popularity]
• Detected by our algorithm already in 2006.
• The string “soulja boy” entered the “Atlanta queries top 100” already in October 2006
• Entered the Bubbling Under R&B/Hip-Hop Singles on the 23rd of June 2007
• Later ranked first in the following Billboard charts: Hot 100, Hot Rap Tracks, Hot Videoclip, Hot RingMasters and Hot Ringtones
Yung Berg
• Active in LA
• Week 2: Entered LA top 100
• Week 15: First appeared on the Billboard charts
• Week 32: Reached 18 on the Billboard Top 100
[Figure: Yung Berg popularity and divergence. X-axis: Week Numbers (2007), weeks 1 to 29. Y-axes: Divergence (0 to 4.5) and Popularity (0 to 1.60E-02). Series: KL Divergence, Popularity]
Madonna
The Detection Algorithm
• Input: A list of geo-identified P2P query strings
• Output: A list of locally popular query strings with a high probability of becoming globally popular
• Build local and global popularity charts
• Local popularity is detected using local and global popularity thresholds
• Look for local popularity growth trends from week to week
• Filtering: non-music-related content and already familiar artists, which are characterized by a uniform distribution
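The steps above might be sketched roughly as follows. This is not the authors' implementation: the popularity measure (share of that week's queries), the threshold values, the growth test, and every name in the code are assumptions made for illustration. The `known` set stands in for whatever filter list removes non-music strings and established artists.

```python
from collections import Counter, defaultdict


def popularity_charts(queries):
    """queries: list of (week, city, query_string) tuples.
    Returns (local, global_): local[(week, city)] and global_[week] map each
    query string to its share of the queries in that chart."""
    local_counts = defaultdict(Counter)
    global_counts = defaultdict(Counter)
    for week, city, q in queries:
        local_counts[(week, city)][q] += 1
        global_counts[week][q] += 1

    def shares(counter):
        total = sum(counter.values())
        return {q: c / total for q, c in counter.items()}

    local = {key: shares(c) for key, c in local_counts.items()}
    global_ = {week: shares(c) for week, c in global_counts.items()}
    return local, global_


def detect(queries, local_min, global_max, known=frozenset()):
    """Return (week, city, query) triples that are locally popular, still
    globally rare, growing since the previous week, and not in `known`.
    local_min and global_max are hypothetical threshold parameters."""
    local, global_ = popularity_charts(queries)
    hits = []
    for (week, city), chart in sorted(local.items()):
        previous = local.get((week - 1, city), {})  # empty if no earlier week
        for q, share in sorted(chart.items()):
            if q in known:
                continue
            locally_popular = share >= local_min
            globally_rare = global_[week][q] <= global_max
            growing = share > previous.get(q, 0.0)
            if locally_popular and globally_rare and growing:
                hits.append((week, city, q))
    return hits


# Made-up sample data: "new song" grows in Atlanta only.
sample = (
    [(1, "Atlanta", "new song")] * 1 + [(1, "Atlanta", "madonna")] * 9
    + [(1, "Boston", "madonna")] * 10
    + [(2, "Atlanta", "new song")] * 4 + [(2, "Atlanta", "madonna")] * 6
    + [(2, "Boston", "madonna")] * 10
)
print(detect(sample, local_min=0.3, global_max=0.25, known={"madonna"}))
```

In the sample, "new song" is flagged for Atlanta in week 2: its local share rises from 0.1 to 0.4 while its global share is only 0.2.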
Local Popularity
• Not all queries are “products”, thus divergence is not effective (e.g., rare typos)
• Detection is based on local popularity:
ATPL - All Times Popular List
• Initialization: All the strings that reached global popularity in 2006
• Weekly aggregation
• Filters non-volatile strings:
– Adult related, e.g., “porn”
– Well established artists, e.g., “madonna”, “avril lavigne”
– Movies, software, etc.
Algorithm's Flow
Detection Time
Local Threshold
Local Threshold
Manual inspection of the Atlanta data
Correlation Between Billboard and Downloads
Correlation Measurements
• Modified time series correlation
• P2P correlation with the Billboard:
Finding The Optimal Time Shift
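One plain way to look for a time shift between two weekly series is to slide one against the other and keep the lag with the highest Pearson correlation. The sketch below does exactly that on made-up numbers; it is not the talk's "modified time series correlation", whose definition is not recoverable from the slides, and it assumes both series are scores where higher means more popular (a chart rank would need to be inverted first).

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def best_shift(p2p, billboard, max_shift):
    """Return (shift, r) maximizing the correlation of p2p[t] with
    billboard[t + shift], for shift in 0..max_shift weeks."""
    best = None
    for shift in range(max_shift + 1):
        a = p2p[: len(p2p) - shift]      # drop the last `shift` weeks
        b = billboard[shift:]            # drop the first `shift` weeks
        n = min(len(a), len(b))
        r = pearson(a[:n], b[:n])
        if best is None or r > best[1]:
            best = (shift, r)
    return best


# Made-up data: the chart score repeats the P2P score two weeks later.
p2p = [1, 3, 7, 9, 6, 4, 2, 1, 1, 1]
billboard = [0, 0] + p2p[:-2]
print(best_shift(p2p, billboard, max_shift=4))
```

On this toy input the best lag is 2 weeks, where the two series line up exactly.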
Prediction Results
• Example: When a song enters the Billboard, will it reach the “top 20”?
• Precision: 89%, Recall: 80%
• On average, songs pass the threshold 2.83 weeks before reaching their top Billboard rank
• More details: Koenigstein, Shavitt, and Zilberman, AdMIRe 2009
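For readers less familiar with the two measures: precision is the fraction of songs predicted to reach the top 20 that actually did, and recall is the fraction of actual top-20 songs that were predicted. A minimal sketch, with made-up song identifiers that have nothing to do with the study's data:

```python
def precision_recall(predicted, actual):
    """predicted, actual: sets of song identifiers.
    Returns (precision, recall); an empty set yields 0.0 for its measure."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: 4 songs predicted, 5 songs actually reached the top 20.
predicted = {"a", "b", "c", "d"}
actual = {"a", "b", "c", "e", "f"}
print(precision_recall(predicted, actual))
```

Here 3 of the 4 predictions are correct (precision 0.75) and 3 of the 5 actual hits were found (recall 0.6).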
Summary
• Following activity on the Internet can help us detect trends before they are visible
– P2P networks
– Social networks
– Blogs
– Talk-backs
– Searches
• More at http://www.eng.tau.ac.il/~shavitt