Analyzing the Video Popularity Characteristics of Large-Scale User Generated Content Systems
Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, and Sue Moon

Post on 07-Jul-2020

Page 1

Analyzing the Video Popularity Characteristics of Large-Scale User Generated Content Systems

• Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, and Sue Moon

Page 2

Why study UGC (user-generated content)?

• “bite-size bits for high-speed munching” [Wired magazine]

• Hundreds of millions of Internet users are content consumers AND publishers

• UGC is very different from VoD

Page 3

UGC vs. VoD

• Decentralization
- UGC: unlimited choice of content and the convenience of the Web
- Early days of TV: same program at the same time

• Scale: 15 days in YouTube to produce 120 years' worth of movies in IMDb

• Publisher: 1,000 uploads over a few years vs. 100 movies over 50 years

• Length: 30 sec - 5 min vs. 100-min movies in LoveFilm

Page 4

Understanding the popularity characteristics of UGC

• Estimate the latent demand that may exist due to bottlenecks

• A lack of editorial control: problems with content aliasing and copyright infringement

Page 5

Data

Summary of User-Generated Video and Non-UGC Traces

Page 6

Goal

• UGC Versus NON-UGC

• Popularity Distribution

• Popularity Evolution over Time

• Aliasing and Illegal Uploads

Page 7

Part 1 : UGC versus NON-UGC

- Content Production Patterns

- Content Consumption Patterns

Page 8

Content Production Patterns

Over 1,000 videos over a few years

Two orders of magnitude

Page 9

Content Production Patterns

Cultural differences may cause Daum uploaders to be more active on Sundays

An off-peak day for YouTube users

Page 10

Content Consumption Patterns (Scale of Popularity)

Netflix: customer ratings are used instead (so a lower bound on the graph)

Zero Viewer (1,782)

Median: YouTube (182), Netflix (561)

Yahoo! (3,843,300)

Many unpopular videos in UGC

Page 11

Content Consumption Patterns

• User Participation
- The video popularity and rating
a. Strong positive linear relationship for both UGC and non-UGC. The correlation coefficient is 0.8 (YouTube) and 0.87 (Yahoo! Movies)
b. The level of active user participation is low. Only 0.22% account for ratings; only 0.16% account for comments

• How Content is Found
- 47% of all videos have incoming links
- Nevertheless, the total clicks from these links are only 3% (external links are not significant)
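
The linear relationship between popularity and rating reported on this slide is summarized by a correlation coefficient. As a minimal sketch of how such a coefficient could be computed, assuming it is the Pearson coefficient over per-video (views, ratings) pairs; the function name and the toy numbers are hypothetical and are not taken from the traces:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-video data: view counts and number of ratings.
toy_views = [10, 200, 3500, 40, 900]
toy_ratings = [0, 1, 9, 0, 2]
r = pearson_r(toy_views, toy_ratings)
```

A value of `r` close to 1 indicates that videos with more views also tend to collect more ratings.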

Page 12

Part 2 : Popularity Distribution

- Pareto Principle

- Statistical Properties

Page 13

Why analyze the popularity distribution

• Helps us understand the underlying mechanism

• Helps us answer important design questions
- The scale-free nature of Web requests: improve search engines and advertising policies
- The distribution of book sales: design better online stores and recommendation engines

Page 14

Pareto Principle

The top 10% most popular videos account for nearly 80% of views
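
The share of views captured by the most popular videos can be checked directly from a list of per-video view counts. The following is a minimal sketch; the function name `top_share` and the toy view counts are hypothetical and chosen only so that the arithmetic is easy to follow:

```python
def top_share(view_counts, top_fraction):
    """Fraction of all views that go to the top `top_fraction` of videos."""
    ranked = sorted(view_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Always include at least one video in the "top" group.
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / total


# Hypothetical toy data: 10 videos, 1,000 views in total.
toy_views = [800, 60, 40, 30, 20, 15, 15, 10, 5, 5]
share = top_share(toy_views, 0.10)  # top 1 of 10 videos -> 800 / 1000
```

In this toy example the single most popular video (10% of the catalog) receives 80% of the views, mirroring the skew described on the slide.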

Page 15

UGC Video Empirical Plot

Straight-line waist (two orders of magnitude), truncated at both ends

Page 16

Most Popular Videos

Fetch-at-most-once behavior

Two types of user populations
- FetchOnce: behavior of fetching each immutable object only once
- Power: behavior of requesting popular videos multiple times

Page 17

Most Popular Videos

Number of videos (V), users (U), average number of requests per user (R)

Fetch-at-most-once: the decay in the tail gets amplified for larger R and smaller V
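
The effect of the two user populations can be illustrated with a small simulation. This is only a sketch under stated assumptions: video popularity is taken to follow a Zipf law with a hypothetical exponent `alpha`, every user issues exactly R requests, and the function name `simulate` and all parameter values are illustrative rather than taken from the traces:

```python
import itertools
import random


def simulate(num_videos, num_users, requests_per_user, fetch_once,
             alpha=1.0, seed=0):
    """Return per-video view counts (index 0 = most popular video)."""
    if fetch_once and requests_per_user > num_videos:
        raise ValueError("a FetchOnce user cannot request more than V videos")
    rng = random.Random(seed)
    # Zipf-like popularity: weight of the rank-r video is 1 / r**alpha.
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_videos + 1)]
    cum_weights = list(itertools.accumulate(weights))
    video_ids = list(range(num_videos))
    counts = [0] * num_videos
    for _ in range(num_users):
        if fetch_once:
            # FetchOnce: redraw until the user holds R distinct videos.
            chosen = set()
            while len(chosen) < requests_per_user:
                chosen.add(rng.choices(video_ids, cum_weights=cum_weights)[0])
        else:
            # Power: repeated requests for the same video are allowed.
            chosen = rng.choices(video_ids, cum_weights=cum_weights,
                                 k=requests_per_user)
        for video in chosen:
            counts[video] += 1
    return counts


V, U, R = 100, 200, 20
power_counts = simulate(V, U, R, fetch_once=False)
once_counts = simulate(V, U, R, fetch_once=True)
```

Under FetchOnce no video can collect more than U views, so the most popular videos are capped, while under Power the top video can far exceed U. Raising R or lowering V makes the cap bite for more videos, which is consistent with the slide's remark about larger R and smaller V.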

Page 18

Long Tail Opportunities in UGC

Two reasons for a decaying tail:

1. The natural shape of the UGC popularity distribution is curved
2. Bottlenecks in the system (information filtering or post-filters)

Page 19

If it is Naturally Curved?

And the decaying tail is due to removable bottlenecks

Potential Benefit from removal of bottlenecks
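
One way to put a number on the potential benefit is to fit a power law to the straight-line waist of the rank versus views plot and extrapolate it into the tail. The sketch below does this under loudly stated assumptions: the waist is specified by hand as a rank range, the fit is an ordinary least-squares line in log-log space, and the function names and toy data are hypothetical. It is not presented as the authors' estimation procedure.

```python
import math


def fit_power_law(ranked_views, first_rank, last_rank):
    """Least-squares fit of log(views) = intercept + slope * log(rank)."""
    points = [
        (math.log(rank), math.log(ranked_views[rank - 1]))
        for rank in range(first_rank, last_rank + 1)
        if ranked_views[rank - 1] > 0
    ]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive points in the waist")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def latent_views(view_counts, first_rank, last_rank):
    """Extra views the tail would get if it followed the waist's power law."""
    ranked = sorted(view_counts, reverse=True)
    intercept, slope = fit_power_law(ranked, first_rank, last_rank)
    extra = 0.0
    for rank in range(last_rank + 1, len(ranked) + 1):
        predicted = math.exp(intercept + slope * math.log(rank))
        extra += max(0.0, predicted - ranked[rank - 1])
    return extra


# Hypothetical toy data: a Zipf waist (1000 / rank) whose tail (ranks 51-100)
# receives only half of what the power law predicts.
toy_views = ([1000.0 / r for r in range(1, 51)]
             + [500.0 / r for r in range(51, 101)])
gain = latent_views(toy_views, first_rank=10, last_rank=50)
```

The result depends heavily on which ranks are treated as the waist, so any such estimate is best read as a rough upper-level indication rather than a precise forecast.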

Page 20

Part 3 : Popularity Evolution over Time

- Popularity distribution Versus Age

- Temporal Focus

- Time Evolution of the Most Popular Videos

Page 21

Popularity Distribution versus Age

Judging by average requests, viewers are mildly more interested in new videos

Page 22

Popularity of individual UGC over Time

If a video did not get multiple requests during its first day, it is unlikely that it will get many requests in the future

The percentage of videos aged up to X days that had no more than V views
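
The quantity in this caption is straightforward to compute from per-video (age, views) records. A minimal sketch, assuming each video is represented as a hypothetical `(age_in_days, views)` tuple; the function name and toy records are illustrative only:

```python
def fraction_at_most(videos, max_age_days, max_views):
    """Fraction of videos aged <= max_age_days that have <= max_views views."""
    eligible = [views for age, views in videos if age <= max_age_days]
    if not eligible:
        return 0.0
    return sum(1 for views in eligible if views <= max_views) / len(eligible)


# Hypothetical toy records: (age in days, total views).
toy_videos = [(1, 0), (1, 5), (2, 100), (7, 3), (30, 2000)]
frac = fraction_at_most(toy_videos, max_age_days=7, max_views=5)  # 3 of 4
```

Multiplying the result by 100 gives the percentage plotted for a given X (age) and V (views).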

Page 23

Video rank changes over a range of video ages

Young videos change many rank positions very fast; old videos have much smaller rank fluctuations

Some of the old videos rose dramatically in rank
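
Rank changes of this kind can be measured by ranking videos in two snapshots and taking the difference. A minimal sketch, assuming two hypothetical dictionaries that map a video ID to its view count at each snapshot; ties are broken by video ID, which is an arbitrary choice made here for determinism:

```python
def rank_changes(views_before, views_after):
    """Rank improvement per video (positive = moved up) between two snapshots."""
    def ranks(views):
        ordered = sorted(views, key=lambda vid: (-views[vid], vid))
        return {vid: position for position, vid in enumerate(ordered, start=1)}

    before = ranks(views_before)
    after = ranks(views_after)
    # Only videos present in both snapshots can be compared.
    return {vid: before[vid] - after[vid] for vid in before if vid in after}


# Hypothetical toy snapshots: video "c" jumps from last to first place.
snapshot_1 = {"a": 100, "b": 50, "c": 10}
snapshot_2 = {"a": 120, "b": 60, "c": 500}
changes = rank_changes(snapshot_1, snapshot_2)
```

Grouping the resulting values by video age would show how rank fluctuation differs between young and old videos.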

Page 24

Part 4 : Aliasing and Illegal Uploads

- Content Aliasing

- Illegal Uploads

Page 25

Content Aliasing

• Content aliasing: multiple identical or very similar copies exist for a single popular event

• Aliasing dilutes the popularity of the corresponding event

• Has a direct impact on the design of recommendation and ranking systems

Page 26

The level of popularity dilution

Recruited 51 volunteers

Identified 1,224 aliases, covering 184 out of the 216 videos

More than two orders of magnitude

Undiluted, the original video would be ranked much higher
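
The undiluted popularity of an event can be approximated by adding the views of all aliases back to the original video. A minimal sketch, assuming a hypothetical `alias_of` mapping from alias ID to original ID (for example, one produced by manual labeling); the names and numbers are illustrative only:

```python
from collections import defaultdict


def undiluted_views(views, alias_of):
    """Sum each alias's views into its original; other videos are unchanged."""
    totals = defaultdict(int)
    for video_id, count in views.items():
        totals[alias_of.get(video_id, video_id)] += count
    return dict(totals)


# Hypothetical toy data: two aliases of "orig" split its audience.
toy_views = {"orig": 100, "copy1": 400, "copy2": 50, "other": 300}
toy_alias_of = {"copy1": "orig", "copy2": "orig"}
merged = undiluted_views(toy_views, toy_alias_of)
```

In this toy example "orig" ranks below "other" on its own views (100 vs. 300) but above it once alias views are merged (550 vs. 300).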

Page 27

Number of aliases versus the age differences

Significant aliases appear within one week

Videos cross-posted over multiple categories received almost 1,000 times more views.

Page 28

Contribution

• An extensive trace-driven analysis of UGC video popularity distributions

- Reveals properties of how users of these systems request UGC videos

- Investigates whether video popularity can be modeled as a power law

- Identifies what characteristics of the system influence the shape of the distribution

- Examines non-stationary properties of UGC video popularity

- Reveals the level of piracy and content duplication