social networks and the diffusion of user-generated ...ryoung/phdseminar/socialnet-diffusion.pdf ·...

19
Information Systems Research Articles in Advance, pp. 1–19 issn 1047-7047 eissn 1526-5536 inf orms ® doi 10.1287/isre.1100.0339 © 2011 INFORMS Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube Anjana Susarla Tepper School of Business, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, [email protected] Jeong-Ha Oh School of Management, University of Texas at Dallas, Richardson, Texas 52242, [email protected] Yong Tan Michael G. Foster School of Business, University of Washington, Seattle, Washington 98195, [email protected] T his paper is motivated by the success of YouTube, which is attractive to content creators as well as corpo- rations for its potential to rapidly disseminate digital content. The networked structure of interactions on YouTube and the tremendous variation in the success of videos posted online lends itself to an inquiry of the role of social influence. Using a unique data set of video information and user information collected from YouTube, we find that social interactions are influential not only in determining which videos become successful but also on the magnitude of that impact. We also find evidence for a number of mechanisms by which social influence is transmitted, such as (i) a preference for conformity and homophily and (ii) the role of social networks in guid- ing opinion formation and directing product search and discovery. Econometrically, the problem in identifying social influence is that individuals’ choices depend in great part upon the choices of other individuals, referred to as the reflection problem. Another problem in identification is to distinguish between social contagion and user heterogeneity in the diffusion process. Our results are in sharp contrast to earlier models of diffusion, such as the Bass model, that do not distinguish between different social processes that are responsible for the process of diffusion. Our results are robust to potential self-selection according to user tastes, temporal heterogeneity and the reflection problem. Implications for researchers and managers are discussed. Key words : diffusion; user-generated content; YouTube; social networks; reflection problem History : Sanjeev Dewan, Senior Editor; Indranil Bardhan, Associate Editor. This paper was received July 21, 2008, and was with the authors for 10 1/4 months for 3 revisions. Published online in Articles in Advance. 1. Introduction The past decade has witnessed a tremendous growth in social computing and user-generated content (Peck et al. 2008), shifting the role of technology from infor- mation processing to actionable social intelligence embedded in computing platforms (Wang et al. 2007). This research is motivated by the tremendous growth of a social computing platform, YouTube. The usabil- ity and functionality of YouTube makes it easy for users to create their own channel and to post content that can be shared almost instantaneously to a wide audience across the world, making this an attractive platform to content creators and media companies alike. YouTube also enables a variety of social interac- tions whereby users can choose to friend or subscribe to other channels, comment on or choose favorite videos, and even post response videos to other chan- nels. This dual nature of user participation, in content creation as well as opinion formation, is in contrast to earlier online communities that did not enable such rich features of social interaction (Parameswaran and Whinston 2007). Ultimately, the explosion of creativity and self-expression unleashed by YouTube promises to transform consumer engagement with popular cul- ture, and has the potential to alter the structure of industries that deal with digital products such as media and entertainment. Compared to other models of user-generated con- tent, the democratic nature of content creation and the lack of formal monitoring and reputation mechanisms foster a self-regulating dynamic of social interaction on YouTube. The nature of product search, content discovery, and consumer preferences on YouTube thus entail different assumptions about behavior and deci- sion making, as distinct from that of atomistic mar- ket agents maximizing individual utility. For instance, economic models of aggregate influence, such as net- work externalities, assume that others’ actions impact one’s own behavior by directly affecting one’s pay- off, rather than changing the information available to market agents through the networked structure of social interactions (e.g., Easley and Kleinberg 2010). One of the hallmarks of YouTube is the tremendous variation in the success of content, where a handful of videos acquire Internet superstar status while most 1 Copyright: INFORMS holds copyright to this Articles in Advance version, which is made available to subscribers. The file may not be posted on any other website, including the author’s site. Please send any questions regarding this policy to [email protected]. Published online ahead of print April 8, 2011

Upload: vuongminh

Post on 04-May-2018

215 views

Category:

Documents


1 download

TRANSCRIPT

Information Systems ResearchArticles in Advance, pp. 1–19issn 1047-7047 �eissn 1526-5536

informs ®

doi 10.1287/isre.1100.0339©2011 INFORMS

Social Networks and the Diffusion of User-GeneratedContent: Evidence from YouTube

Anjana SusarlaTepper School of Business, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, [email protected]

Jeong-Ha OhSchool of Management, University of Texas at Dallas, Richardson, Texas 52242, [email protected]

Yong TanMichael G. Foster School of Business, University of Washington, Seattle, Washington 98195, [email protected]

This paper is motivated by the success of YouTube, which is attractive to content creators as well as corpo-rations for its potential to rapidly disseminate digital content. The networked structure of interactions on

YouTube and the tremendous variation in the success of videos posted online lends itself to an inquiry of the roleof social influence. Using a unique data set of video information and user information collected from YouTube,we find that social interactions are influential not only in determining which videos become successful but alsoon the magnitude of that impact. We also find evidence for a number of mechanisms by which social influenceis transmitted, such as (i) a preference for conformity and homophily and (ii) the role of social networks in guid-ing opinion formation and directing product search and discovery. Econometrically, the problem in identifyingsocial influence is that individuals’ choices depend in great part upon the choices of other individuals, referredto as the reflection problem. Another problem in identification is to distinguish between social contagion and userheterogeneity in the diffusion process. Our results are in sharp contrast to earlier models of diffusion, such asthe Bass model, that do not distinguish between different social processes that are responsible for the processof diffusion. Our results are robust to potential self-selection according to user tastes, temporal heterogeneityand the reflection problem. Implications for researchers and managers are discussed.

Key words : diffusion; user-generated content; YouTube; social networks; reflection problemHistory : Sanjeev Dewan, Senior Editor; Indranil Bardhan, Associate Editor. This paper was received July 21,

2008, and was with the authors for 10 1/4 months for 3 revisions. Published online in Articles in Advance.

1. IntroductionThe past decade has witnessed a tremendous growthin social computing and user-generated content (Pecket al. 2008), shifting the role of technology from infor-mation processing to actionable social intelligenceembedded in computing platforms (Wang et al. 2007).This research is motivated by the tremendous growthof a social computing platform, YouTube. The usabil-ity and functionality of YouTube makes it easy forusers to create their own channel and to post contentthat can be shared almost instantaneously to a wideaudience across the world, making this an attractiveplatform to content creators and media companiesalike. YouTube also enables a variety of social interac-tions whereby users can choose to friend or subscribeto other channels, comment on or choose favoritevideos, and even post response videos to other chan-nels. This dual nature of user participation, in contentcreation as well as opinion formation, is in contrast toearlier online communities that did not enable suchrich features of social interaction (Parameswaran andWhinston 2007). Ultimately, the explosion of creativity

and self-expression unleashed by YouTube promisesto transform consumer engagement with popular cul-ture, and has the potential to alter the structure ofindustries that deal with digital products such asmedia and entertainment.Compared to other models of user-generated con-

tent, the democratic nature of content creation and thelack of formal monitoring and reputation mechanismsfoster a self-regulating dynamic of social interactionon YouTube. The nature of product search, contentdiscovery, and consumer preferences on YouTube thusentail different assumptions about behavior and deci-sion making, as distinct from that of atomistic mar-ket agents maximizing individual utility. For instance,economic models of aggregate influence, such as net-work externalities, assume that others’ actions impactone’s own behavior by directly affecting one’s pay-off, rather than changing the information availableto market agents through the networked structure ofsocial interactions (e.g., Easley and Kleinberg 2010).One of the hallmarks of YouTube is the tremendous

variation in the success of content, where a handfulof videos acquire Internet superstar status while most

1

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Published online ahead of print April 8, 2011

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube2 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

languish in obscurity (e.g., Crane and Sornette 2008).The inequality and unpredictability of success in cul-tural markets has been attributed to social contagion(Salganik et al. 2006). The central question examinedin this paper is the impact of contagion through thenetworked structures of interaction on the diffusion ofdigital products on YouTube. Social contagion broadlydescribes a class of phenomenon where preferencesand actions of individuals are influenced by inter-personal contact, impacting the aggregate diffusionand spread of behaviors, new products, ideas, or epi-demics (Dodds and Watts 2004, 2005). We build upona rich set of explanations for social contagion, suchas the desire for social conformity, homophily, andawareness diffusion. Using a data set of video infor-mation and user information collected from YouTube,we find that social interactions play an important rolenot only in the success of user-generated content butalso on the magnitude of that impact.Identifying social influence underlying aggregate

popularity growth poses a number of econometricchallenges. First, individual user preferences mightbe subject to popularity surges that create a serialcorrelation. Second, it is difficult to infer true socialinfluence, whereby individual’s behavior could resulteither from the prevailing norms or tastes of the socialgroup that she belongs to, from similarity in behav-ior due to common characteristics or similar envi-ronments. The problem in distinguishing true socialinfluence from that of spurious correlation is referredto as the reflection problem (Manski 1993). We buildupon a considerable body of research across severalfields such as marketing, economics and sociology toidentify true social influence or peer effects from alter-nate factors (e.g., Bandiera and Rasul 2006, Bramoulleet al. 2009, Granovetter 1973, Manski 1993, Van denBulte and Lilien 2001, Sacerdote 2001). In particular,the impact of a focal channel in directing user searchbehavior can be confounded by user self-selection. Inother words, we need to examine whether the diffu-sion process is driven users selecting which channelsthey want to view, or whether it is the social struc-ture of interactions that confer an authoritative roleto a central channel. Our empirical approach distin-guishes between endogenous or true social influencefrom contextual or correlated effects arising from userself-selection by conducting exclusion restrictions thatidentify group interactions. Third, aggregate popu-larity of a video might be a result of unobservedattention-gathering efforts by channels that impacts apotential user’s propensity to view particular typesof content or to visit certain channels. We employHausman-Taylor estimation methods and multilevelmodels to consider the potential heterogeneity invideo and channel characteristics on YouTube.

This paper can make the following contributionsto literature. A considerable amount of IS literaturehas examined the impact of diffusion on individu-als’ adoption (Fichman 2000). By contrast, this paperexamines the dynamics of digital content diffusionstructured through a network. We quantify the impactof the social network structure as a pathway in (i) pro-viding information aiding in product search and dis-covery and (ii) spreading social influence that impactspotential experience. Prior models of diffusion, suchas the Bass model (Bass 1969), do not identify themechanism by which the transmission of social influ-ence occurs. Indeed, it has been posited that a limita-tion of prior literature is that the S-shaped diffusioncurves could result from user heterogeneity (Van denBulte and Stremersch 2004). This paper disentanglesthe impact of social contagion on diffusion from thatof (i) endogenous self-selection of users into groupsdictated by tastes and (ii) channel heterogeneity thatcould impact the awareness of potential viewers. Wealso examine whether the diffusion process in the ini-tial phases may be subject to different influences fromthat in the later phases by distinguishing betweensearch and experience characteristics of a video.The structure of the paper is as follows. We dis-

cuss the context of YouTube and describe the tax-onomy of social network structures in §2. Section 3presents the theory and hypotheses. In §4 we discussthe data collection process and operationalization ofmeasures. Section 5 presents the empirical approach,§6 discusses the results, and §7 concludes.

2. The ContextGiven the ease of creating a personalized page orchannel, a user on YouTube can engage in self-expression (Raymond 2001) as well as obtain peerrecognition (Resnick et al. 2000) from social inter-actions with other users. A channel allows usersto display content that they uploaded; videos fromother members; videos favorited by the channel, theirfriends, and subscribers; as well as channels thatthey subscribe to. The ease of creating a personalizedchannel on YouTube therefore blurs the boundariesbetween creators and consumers of content. A friendrelationship on YouTube is initiated by an invitationfrom one channel to another, requiring confirmationfrom the other. Because the friend network is theresult of mutual agreement, we characterize such net-works as an undirected network (e.g., Newman 2003).By contrast, the act of subscription indicates a will-ingness to visit and watch the videos uploaded byanother channel. Because a subscriber relationship isa one-way relationship representative of user tastes,we characterize it as a directional network. When anew video is posted, all the friends and subscribers of

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 3

a channel are alerted through email or RSS feeds. Sub-scribers and friends can rate and comment on videosin addition to adding videos to a list of favorites. Suchchoices can also serve as signals to other users, driv-ing the popularity of content.We therefore identify three distinct mechanisms of

social influence on YouTube. First, there are networksof friends within the community of interest that wecharacterize as a local network of friends. Second,we observe friend ties between users from outsidethe community of interest, which we characterize asnonlocal or long ties (Centola and Macy 2007). Third,we observe that there are networks of subscriberswithin the community of interest, or social networksbased on instrumental ties, i.e., a pattern of affiliationbased on shared interests. In the data collection sec-tion, we explain how we demarcate the boundariesof these social network structures. Given the differentmotives in adding friends or subscribing to anotherchannel, we expect differences in the mechanism ofsocial ties characterizing different types of networks.We do not observe substantial overlap in membershipbetween friend and subscriber networks, which bol-sters our argument about the difference between thesenetworks.The qualitative evidence from YouTube on the

networked structure of interactions summarized inthe online appendix1 highlights several patternsof interest. First, friend ties within the communityof interest are characterized by greater frequencyin interaction compared to subscriber ties, consis-tent with prior theory that group membership isa basis for defining identity and for social interac-tions (Watts et al. 2002). Friend relationships could becharacterized by homophily and affinity while sub-scriber ties could exist for informational purposes toobtain content based on users’ tastes and interests.Second, we find greater interaction between friendswithin the network boundary, local friends, than fromfriends outside the network. We therefore build onprior literature by differentiating between cohesiveties or “local” network effects from that of nonlocalor nonredundant ties (long ties), which have differ-ent strengths from a structural or from an informa-tion transmission perspective (e.g., Burt 1987, Centolaand Macy 2007, Sundararajan 2007). Putnam (2000)differentiates between bridging capital that refer toloose connections between individuals who act asa source of useful information or new perspectives,and bonding social capital that exists in closer rela-tionships, such as kinship or friendships. The sub-scriber networks on YouTube seem to be indicative

1 Additional information is contained is an online appendix to thispaper that is available on the Information Systems Research website(http://isr.pubs.informs.org/ecompanion.html).

of the former, and could be classified as “weak ties”(Granovetter 1973, p. 1362).

3. Theory and Hypotheses3.1. Prior LiteratureThe Bass model (Bass 1969) has been widely usedto study the diffusion process in marketing (e.g.,Mahajan et al. 1990) and other disciplines. Socialnetwork methods have provided an important frame-work to study the diffusion of innovations in thesociology area (e.g., Wejnert 2002) starting with alandmark study by Coleman et al. (1966) that analyzesthe impact of social contagion on medical innova-tion. Strang and Tuma (1993) consider heterogeneityin the susceptibility of individual agents to conta-gion. Another stream of work has also examined theimpact of structural properties of social networks onthe propagation of epidemics (Watts 2002, Watts et al.2002).While the question of social influence has been

explored in earlier literature in IS (e.g., Armstrongand Sambamurthy 1999, Kraut et al. 1998), there aresome crucial distinctions between this study and priorwork. First, the literature on product diffusion dis-tinguishes between well-informed early adopters andlate adopters, wherein early adopters learn by doingwhile late adopters learn from observing others.2 Bycontrast, we posit that the influence of central chan-nels stems from their structural position in the socialnetwork. Second, we consider diffusion in the contextof a community where the diffusion process propa-gates through proximate links in a network. Third, weexplicitly address the reflection problem that poses achallenge in inferring social influence. It has been sug-gested, for instance, that marketing efforts can con-found the role of social contagion (Van den Bulte andLilien 2001). The S-shaped diffusion curves could alsoresult from user heterogeneity in the propensity toadopt (Van den Bulte and Stremersch 2004) ratherthan the impact of social contagion.

3.2. HypothesesThe two dimensions of user participation on YouTube,in both content creation as well as opinion mak-ing, impact two distinct aspects of the diffusion pro-cess. Following Kalish (1985), we distinguish betweensearch attributes and experience attributes of a videothat impact different stages of diffusion. First, giventhe bewildering array of content choices and the lim-itations of keyword search on YouTube (Szabo andHuberman 2009), potential viewers may lack aware-ness about the range of product choices available.

2 Bandiera and Rasul (2006) explicitly distinguish between the twomodels of learning.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube4 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

Figure 1 Conceptual Diagram of Social Influence

Mechanism ofinfluence

Stage ofdiffusion

Impact on videocharacteristics

Type of network

Awarenessdiffusion

Search

Outgoingsubscribers

Incomingsubscribers

Attention

Initialstages

Homophily Local friends

Nonlocalfriends

Experience

Informationtransmission

Laterstages

Unlike other types of entertainment products, it isalso unlikely that a substantial number of channelsengage in promotional efforts or publicity efforts ofany form, which makes it highly uncertain that agiven video can acquire any momentum and reacha wide audience of viewers. The myopic nature ofproduct discovery coupled with the range and depthof offerings and the growth of titles in YouTube sub-stantially increase the uncertainty in searching andlocating content. We consider the role of subscribernetworks as a conduit to transmit awareness aboutnew videos, which is very important in the earlystages of diffusion. Second, a video being an experi-ence good (e.g., Nelson 1970), it is characterized bysubstantial uncertainty in terms of whether viewerswill favorably react to it or not. Because individualpreferences are highly influenced by the tastes andopinions of others (e.g., Bikhchandani et al. 1992),social processes such as conformity and peer pressurestructured through friend networks could impact auser’s perceived experience of a video.Given the uncertainty in search and experience,

this paper examines the impact of the network posi-tion of a channel (posting a video) on the processof diffusion of the video over the aggregate YouTubenetwork. The diffusion process begins when a nodethat is highly influential (measured through degreecentrality in our study) generates the infection byposting a video, which then spreads to other nodesthrough two mechanisms: (i) the ability of the cen-tral node to impact awareness of proximate actorsby directing search, and (ii) the ability of the centralnode to influence the potential experience of prox-imate actors through homophily and cohesion. Thetype of contagion we consider is that of a central actorincreasing the susceptibility of proximate actors byinfluencing the information available to other actors.

In other words, even if proximate actors (alters) donot immediately catch the infection from the focalchannel (ego), the latter is still contagious becauseit is attempting to transmit the infection. The infor-mational role of subscriber networks matters in theearly stage in the life of a video due to uncertainty insearch characteristics, while the influence of a chan-nel in the friend network matters in the later stage ofthe life of a video due to the uncertainty in experi-ence faced by a potential viewer. The conceptual dia-gram of social influence structured through differenttypes of social networks considered in this paper ispresented in Figure 1.

3.2.1. Impact of the Subscriber Network inDirecting Search. We examine the mechanism bywhich interactions structured through a social net-work lower the uncertainty in search associated witha new video. Given the limited attention span ofviewers and fluctuations in daily viewership of videos(Szabo and Huberman 2009), popularity of user-generated content is relatively ephemeral (e.g., Chaet al. 2007). Potential viewers face an informationalbottleneck since they may be aware of only a verysmall fraction of the available set of videos (e.g.,Hendricks and Sorenson 2009). Because that the pro-cess of searching and locating content might be localand myopic,3 a channel with a central position inthe subscriber network serves a conduit of infor-mation to others, guiding proximate nodes throughthe taste neighborhood (Liu et al. 2006). In the tra-dition of the epidemiology literature, infection (inour case the number of views of a video) spreads

3 Google has introduced video search optimization tools andenhanced recommendation tools for videos, which can influencethe search process, but these features were not present at the timeof our data collection.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 5

Figure 2 Subscriber Relationship (Directed Network)

CCC is subscribing toEEE

AAA is subscribing to DDD

AAA is subscribedby abc

ABC23 is subscribingto BBB

AAA’s subscribers’ list

AAA’s subscriptions list

from node to node through interpersonal contactstructured through a network (e.g., Newman 2003);Figure 2 depicts a representative subset of nodesin the directed subscriber network, distinguishingbetween incoming and outgoing connections of achannel.Channels with several incoming connections might

be more popular and might be more likely to dis-seminate information about a new video among awider group of actors. When a channel with a greaternumber of incoming ties from other nodes in asubscriber network posts a video, it lowers the infor-mational bottleneck faced by proximate nodes byincreasing exposure and thereby hastening awarenessdiffusion among the proximate ties (e.g., Kalish 1985).These proximate nodes in turn could be influentialin disseminating information and directing searchfrom other nodes, and eventually the video diffusesover the entire network. Thus, the shared interestin sampling videos characterizing social networks ofsubscriber groups provides an opinion-making roleto channels that have a high degree of incomingconnections.4

Channels with several outgoing connections may bemore gregarious, and more likely to be aware aboutthe types of content posted and viewed by nodes thatare incident upon the focal channel, i.e., more awareof the preferences of incident ties. The greater con-nectedness of the channel could provide an informa-tional advantage in directing awareness of incident

4 This explanation only relies on the centrality of the channel andnot on the directed path: “(infections) do not have targets and donot prefer to take the shortest paths to any node” (Borgatti 2005,p. 61).

ties toward itself. The greater the ability of the cen-tral channel to seek attention, the more the transmis-sibility of the contagion, enhancing the likelihood thatother viewers will actually watch a video, enhancingthe diffusion among the aggregate network. A chan-nel with a large number of outgoing connections isalso more susceptible to social contagion, increasingthe likelihood of infecting other nodes and thus couldbe enormously influential in the dynamics of diffu-sion (Dodds and Watts 2004). Given the importanceof both incoming and outgoing ties, we hypothesizeas follows.

Hypothesis 1A. Channels that are central in the sub-scriber network have a significant impact on the rate ofdiffusion.

A channel with a greater number of incoming oroutgoing subscriptions has access to a larger thepool of early adopters that are vulnerable to beinginfected. Contagion of new videos can be acceler-ated when a central node (channel) is connected toa pool of vulnerable nodes that are easily suscepti-ble (Watts 2002). Channels that are more connectedhave a greater ability to transmit information about avideo to proximate others. Centrally connected chan-nels occupying a position of greater transmissibilityare therefore on the cutting edge (Rogers 1995), act-ing as opinion leaders who influence the spread ofinformation about new videos that have not acquiredrecognition from the overall YouTube audience (e.g.,Bass 1969, Rogers 1995, Mahajan et al. 1990), whichis particularly important in the early stages of dif-fusion. The greater the number of incoming connec-tions of a channel, therefore, we should expect thatthe greater the influence in the early stages of diffu-sion, and the more likely that the information aboutthe video is eventually disseminated beyond the prox-imate actors to the aggregate global network, increas-ing the likelihood that a new video is viewed.5 Thus,we hypothesize the following.

Hypothesis 1B. A channel’s centrality in the incomingsubscriber network has a significant positive impact on therate of diffusion in the initial phase of content diffusion.

3.2.2. Impact of the Friend Network Structure inMitigating Uncertainty in Experience. We suggesttwo mechanisms by which social networks impact theperceived value from a video and thus influence dif-fusion. Friend networks within a community of inter-est on YouTube might arise from similarity in per-sonal characteristics and consistent interests, which

5 Promotional efforts by channels that are central in a subscribernetwork can help a recent video attain a “most viewed” or “mostdiscussed” status, which in turn leads to more awareness anddrives newer views.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube6 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

Figure 3 Friend Relationships Inside and Outside the Group Boundary

12

3

4

5

A community (networkboundary)

Users 1, 2, and 3 are friends, users1 and 4 are friends, and they allalso have memberships to the

same group.

Users 1 and 5 are friends, but user 5 doesnot have the membership to the group 1belongs to. Thus, user 5 is a friend of 1

outside of the group.

promotes homophily and cohesion (Burt 1987). How-ever such cohesion within the local network, could inturn, also lead to redundancy in the social structureof interaction, creating a network-redundancy tradeoff(Reagans and Zuckerman 2008). Another mechanismto activate contagion is that of non-local friendshipties, sometimes characterized as “long ties” (Centolaand Macy 2007, p. 704). Figure 3 depicts the friendrelationships in the local network and outside thegroup boundary.

Conformity Through Local Ties. We consider thedegree centrality of a channel in the (undirected)friend network, which denotes that a channel has agreater number of ties or neighbors. Because an actorcould influence proximate actors’ opinions or prefer-ences, individuals who occupy a key position in anetwork marked by affinity have more power of influ-encing perception and impacting potential experiencefor two reasons. First, individuals care deeply aboutthe opinions of others they interact with, value confor-mity with the choices of others (Bernheim 1994) andface dissonance when they do not adopt the choices ofindividuals whose approval they seek. Central actorscan then influence the perceptions of others (Ibarraand Andrews 1993). The stronger sense of identifica-tion and stronger patterns of interaction within localfriend networks, i.e., networks of friends connectedby an interest group, promotes cohesion that enhanceslocalized conformity (e.g., Bikhchandani et al. 1992).An actor or channel that is central in a cohesive localnetwork has greater influence on proximate actors’decisions through localized conformity, increasing thelikelihood that other nodes connected to the focalchannel view a video that the channel has posted.Second, even if we do not ascribe a higher status orgreater influence to a central node, in a cohesive localnetwork, ties are characterized by homophily that fos-ters trust; thus, we should expect that a channel with

a greater number of proximate connections is moreinfluential in the local network.The influence of a central node matters not only

in the local network but also matters to the overallnetwork. Conformity preferences within a group areenhanced through cohesive social network structuresthat foster a sense of social identity. A channel witha greater degree centrality is in a strong position toinfluence proximate actors’ willingness to experiencecontent posted by the channel. At the aggregate level,the individual influence from a central channel ismagnified when proximate others are induced to viewa video;6 the resulting viewing patterns could diffuseto other agents and to the overall global network.The result is a social multiplier effect that strengthensthe influence of central actors, enhancing the effect ofconformity preferences. Thus, we have the followinghypothesis.

Hypothesis 2A. Channels that are central in the localfriend network significantly affect the rate of diffusion.

We expect that the centrality of the channel and theprestige (Bonacich 1987) of a channel would have dif-ferent impacts during different stages of the diffusionprocess. Due to localized conformity, a channel withmore prestige in the friend network could have sig-nificant ability to persuade proximate actors (Rogersand Kincaid 1981) who value their judgment. Thus,we should expect that the prestige of a channel in thelocal network is more important early in the life of avideo. On the other hand, while centrality of a chan-nel within a local friend network might confer influ-ence in persuading others, the patterns of interactions

6 When a proximate node watches or rates a video, other nodesincident to the proximate node get updated about this behavior,which provides a channel through which social influence can betransmitted.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 7

in a cohesive network structure could constrain con-tagion due to redundancy in information transmis-sion (Burt 1987). A central channel’s influence couldbe limited early in the life of a video when it needsto compete for attention with videos preferred byother actors connected to those proximate to the focalchannel. However, once a video acquires a criticalmass of views, the impact of a central channel in theoverall global network could be greater due to thelower uncertainty associated with search in the laterstage of diffusion. The larger number of proximateties linked to the central channel could be influen-tial in persuading others to view a video, hasteningdiffusion through the overall network. We thereforehypothesize the following.

Hypothesis 2B. A channel’s centrality in the localfriend network has a positive impact on the rate of diffusionin the later stages of the process of diffusion.

Hypothesis 2C (H2C). A channel’s prestige in thelocal friend network has a positive impact on the rate ofdiffusion in the early stage of the process of diffusion.

Information Transmission Through Long Ties.We consider the impact of nonlocal friend ties, orlong ties (Centola and Macy 2007) in transmittingcontagion beyond the boundaries of the local net-work structure.7 Burt (1992) classifies a network con-tact is characterized as nonredundant if an individ-ual does not share ties with other contacts in a per-son’s immediate social network. Within a communityor a local network, greater identification as a result ofhomophily and the consequent localized conformityand cohesion might also increase the redundancy ofinformation (e.g., Burt 1992), which limits the spreadof contagion. We therefore consider the impact of anode with a greater degree of connections outside thelocal network, or long ties, in facilitating contagion.While the tie strength might be structurally weaker,such nodes can still influence long ties through affin-ity. Given the cognitive overload involved in choosingbetween different videos, the time required to samplea variety of videos and the uncertainty in the expe-rience from a video, a channel with a greater num-ber of non-local ties can promote awareness aboutcontent, and activate (infect) a greater number ofnon-local neighbors (e.g., Watts 2002), enhancing theglobal spread of contagion. Thus, we hypothesize thefollowing.

Hypothesis 3A. A channels that has a greater numberof friend connections outside the local network of friendshas a significant impact on the rate of diffusion.

7 Both explanations (local or nonlocal ties) rely on the ability of anode to influence others nodes or to transmit an infection to othersusceptible nodes.

A channel can activate contagion beyond the cohe-sive but redundant pathways in the local networkstructure (e.g., Dodds and Watts 2004) in the follow-ing two ways. First, a channel with greater numberof connections outside the local network can shapeperceptions under conditions of uncertainty or ambi-guity when the nonlocal nodes are unsure about theactual experience offered by a video. Because long tiesare weaker from a relational perspective (Centola andMacy 2007), what matters is not the channel’s abilityto persuade nonlocal friends but to inform them. Theconnection to nonlocal friends becomes more impor-tant once the adoption threshold has already beenreached. Second, the greater number of ties to individ-uals outside the local network increases the ability ofa channel to transmit a diverse, novel, and rich set ofinformational cues (Burt 1992) to the global network.This ability to activate contagion to a nonlocal tie isparticularly important in the later phases of contentdiffusion, when the influence of the central node canbe extended beyond the local network. Thus, connec-tions to nonlocal ties matter more in the later stage ofdiffusion both from a relational as well as an informa-tion transmission perspective. Thus, we hypothesizeas follows.

Hypothesis 3B. Channels that have a greater numberof friend connections outside the local network have a sig-nificant impact on the rate of diffusion in the later phasesof content diffusion.

3.3. Peer Influence and Heterogeneity inUser Tastes

Prior literature suggests that diffusion curves couldalso result from differences in user propensities toadopt a new product, rather than social contagion(Bemmaor and Lee 2002), which poses an identifica-tion challenge to the empirical estimation. In the ter-minology of the literature on diffusion, heterogeneityin user tastes reflects the unobserved social prefer-ences (e.g., Van den Bulte and Stremersch 2004). Suchheterogeneity in tastes might create a self-selectionof users clustered into different channels or inter-est groups. While we do not hypothesize about theimpact of user tastes on group formation, we controlfor this possibility in the empirical estimation.

4. The Data4.1. The Data Collection ApproachWe employ a panel of data consisting of videoinformation and user information collected fromYouTube.com over a period of two months. Our sam-ple focuses on the videos uploaded within the groupof our interest, and the members of that specific groupfor a total of 4,106 videos posted by 913 users. Thedata were collected for 11 observation points in time,

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube8 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

each 5 days apart. At each observation point, theinformation on each video and each user within thegroup was collected by taking screen shots. For eachuser, we collected the complete list of friends, sub-scribers, and subscriptions, which are tracked repeat-edly over time as multiple events. Because this datacollection was repeated, we can get snapshots of thenetwork structure over time.We define the cumulative demand, vijt, for video i,

posted by user j , at a certain time t as the cumula-tive number of clicks8 of video i at time t, includ-ing the total number of views from the aggregateYouTube network. The age of video i, VAge, is thenumber of days since a video has first been postedonline. The average age of a video—the number ofdays a video has been online since it was posted—is212 days, and the average number of times a video iswatched is 14,180 with a standard deviation 247,455.Because there is a high dispersion of the popularityof video clips, to control the skewness we used a log-transformed number of views. We validated that thelog-transformed measure obeys a normal distributionby conducting the Kolmogorov-Smirnov, Cramer-vonMises, and Anderson-Darling tests. While viewersmay repeatedly watch a video, we can assume thatthere is a reasonable limit to the number of times anygiven individual watches a video. Because we takelogs, any bias caused by repeated viewings will beonly a slight downward revision of the estimates. Wealso gather data on the number of most importantexternal links leading to a video clip. Because a num-ber of links to YouTube from prominent sites couldlead to greater number of views (clicks), the numberof outer links provides partial control over the trafficfrom outside YouTube. Table 1 summarizes the videocharacteristics in our sample.

4.2. Social Network Structures on YouTubeTo identify the network structures on YouTube,we define the network boundary by focusing ona community (interest group) on YouTube. In theappendix, we discuss the reasons for this data col-lection approach. A community in YouTube is definedas a group with specific video categories (there area total of thirteen categories9 in YouTube). For exam-ple, a community listed under the “music” categoryis a group of people who share the same interest,and upload videos categorized as “music.” Any per-son interested in “music” and who wants to share

8 This is consistent with reports from industry observers and inde-pendent firms such as TubeMogul (http://www.tubemogul.com).9 The 13 categories are Autos and Vehicles, Comedy, Education,Entertainment, Film and Animation, Howto and Style, Music,News and Politics, People and Blogs, Pets and Animals, Scienceand Technology, Sports, and Travel and Events.

Table 1 Video Characteristics

Variables Mean St. dev. Min Max

Age of video (days) 212 144�26 0 731Number of times a 14�180�80 247,455 4 13,449,210

video is watchedLog number of times a 6�77 1�77 1�39 16�41

video is watchedNumber of external 3�66 1�93 0 5

links to a posted videoVideo rating 3�56 1�59 0 5Age of channel (days) 379�96 171�29 2 846

Table 2 Group and Channel Characteristics

Age of the group 2 yearsNumber of group members 1,558Average growth of the number of videos per day 15Average growth of the number of members per day 5Number of videos 4,106Average length of a video (in seconds) 223.06Number of users (channels that posted videos) 913Data collection duration 2 monthsNumber of observation points in time 11

their videos with other members in the group canjoin the community. Therefore the interest group itselfforms the network boundary. Our sample community(or group) is drawn from the “music” and “peopleand blog” categories, which not only are representa-tive of the characteristics of YouTube as a mediumof sharing video clips but also are representative ofthe social interactions within YouTube.10 Table 2 pro-vides the descriptive statistics of the group and chan-nel characteristics.As described in Table 2, the group is 2 years old

and has 1,558 group members and 4,106 videos. Thereare 15 new videos, on average, uploaded every dayin this group, and on average five new membersjoin the group every day. We calculate social networkmeasures using UCINET 6 (Borgatti et al. 2002). Anindividual user represents a “node” in a social net-work. As described earlier, a friendship network isan undirected network, while a subscriber networkis a directed network. Because a user’s social net-work reaches can reach beyond the boundary of theinterest group, a user’s friends or subscribers out-side the group also get notified when a new videois posted; therefore, we also consider the number offriends and subscribers outside the group boundaryas factors influencing the diffusion of a video. Thenetwork boundary allows us to distinguish betweenwithin-group or local social influence from outside-group or global influence (e.g., Centola and Macy2007, Sundararajan 2007). Because individuals couldinteract more if they share common interests, the

10 On YouTube, the terms group and community are synonyms.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 9

nature of interpersonal influence might be differentwhen individuals exhibit affinity.Because the prestige and status of an actor can be

increased as well as reduced, depending on the struc-ture of the network by connections to powerful oth-ers, we calculate two measures of social capital ofthe friend network. Because we are interested in theimpact of localized conformity and homophily, wecalculate two measures to denote the social influenceof an actor. The degree centrality Frn_NrmDegjt ofuser j in the friend network at time t indicates theimportance of a node (channel) in the social network(Wasserman and Faust 1994), which measures the sizeof the proximate network of the node. The numberof times a node is visited increases with the degreecentrality, and thus indicates the aggregate level ofconnectedness of a node and the ability and theopportunity of a node11 to diffuse information aboutvideos. The centrality measures a node’s involve-ment in the cohesiveness of the network, promotinglocalized conformity (Borgatti and Everett 2006). Wepresent the graph theoretic interpretation of this mea-sure in the appendix. Figure 4 shows a subset of thefriend network.An actor with a high degree centrality denotes

where “the action is” in the network (Wasserman andFaust 1994, p. 179). The more central the networkposition of the actor, the more the actor is a channelof relational information to others (Wasserman andFaust 1994) and occupies a position of social influence(Burt 1987). We also calculate the Bonacich power(Bonacich 1987), which is based on the insight that,while an individual’s status is a function of the statusof other actors he is connected to, i.e., connections tomany prominent others reduce the power of a node.If channel A and channel B have the same numberof connections, but B’s connections are well con-nected with others in the network while A’ connec-tions are more isolated, B might have less influence onhow proximate connections access information andthereby lower aggregate influence. The descriptivestatistics of the friend network for the final period ofdata gathering are presented in Table 3. The minimumdegree centrality is 0 while the maximum degree cen-trality is 521, indicating that some users are highlyconnected while others are not. logNumFrn_outsidejt ,the log-transformed number of friends of user j attime t outside the local network is the number ofproximate ties of user j outside the group boundary.Because our focus is on the spread of information inthe network, we examine the structural properties ofthe various networks that indicate the connectivity

11 A user is an “actor” or “node,” while the relationship betweenusers is a “link” or a “tie.”

Figure 4 Friend Network, Group Boundary, and Nodes Outsidethe Network

and consequently the dynamics of diffusion. The net-work or global-level density is the proportion of ties ina network relative to the total number possible. Thediameter of a network is the largest geodesic distance inthe (connected) network.Figures 5 and 6 show a subset of the subscriber

network and the group boundary for the last phaseof data collection. Because the subscriber relationshipis directional, the connectedness of the focal chan-nel in a subscriber network is measured in termsof the in-degree and out-degree degree centrality,Subs_NrmInDegjt and Subs_NrmOutDegjt of user j inthe subscriber network at time t, which denotes theextent to which individuals are extensively involvedwith or adjacent to many other actors in the socialnetwork (Freeman 1979, Wasserman and Faust 1994).The interpretation is similar to the centrality mea-sures for undirected graphs except that we distin-guish between the directionality of the tie. In-degreecentrality indicates the number of ties directed towarda node, while out-degree is the number of ties that a

Table 3 Friend Network Description

Degree Norm. degreecentrality centrality Bonacich power Norm. BP

Mean 3�852 0�261 −4�83 −17�15St. dev. 20�557 1�395 9�678 34�377Sum 5�682�000 385�482 −7,119.7 −25,291.1Min 0�000 0�000 −143�321 −509�113Max 521�000 35�346 53�135 188�749

Notes. Network centralization = 35.13%; network diameter = 5; networkcentralization= 0.26%.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube10 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

Figure 5 Subscriber Network (Node Size Based onIn-Degree Centrality)

node directs towards others. The network centraliza-tion for the out-degree network is greater than thatof the in-degree network. While both in-degree andout-degree indicate that a node is more connected, achannel with a high number of incoming subscrip-tions could act as a gatekeeper or arbiter, while nodeswith a high degree of outgoing subscriptions havegreater awareness of what is going on in the adja-cent nodes. logNumOutSubs_outsidejt is the log trans-formed number of subscriptions (outward directional

Figure 6 Subscriber Network (Node Size Based onOut-Degree Centrality)

Table 4 Subscriber Network Description

Out-degree In-degreecentrality centrality Nrm. out-deg. cent. Nrm. in-deg. cent.

Mean 1�352 1�352 0�092 0�092St. dev. 12�415 3�960 0�842 0�269Sum 1�994�000 1�994�000 135�278 135�278Min 0�000 0�000 0�000 0�000Max 429�000 71�000 29�104 4�817

Notes. Network centralization (out-degree) = 29.032%; Network diameter(out-degree) = 4; Network density (out-degree) = 0.09%; Network central-ization (in-degree) = 4.728%; Network diameter (in-degree) = 5; Networkdensity (in-degree)= 0.09%.

ties) of user j from outside of the group at time t.logNumInSubs_outsidejt is the log transformed num-ber of incoming subscribers (incoming ties) of user j attime t from outside the group. The descriptive statis-tics for the subscriber networks for the last phase ofdata collection are shown in Table 4.

4.3. Control VariablesBecause the characteristics of a video affect its pop-ularity, we control for the number of external links,NumOfLinks, which enable accesses to the videos fromblogs, MySpace, Facebook, or other online communi-ties and forums outside YouTube. This variable pro-vides information of the traffic coming from externaloutlets other than accesses from within YouTube. Wealso control for another factor that may influence avideo’s popularity, the average rating (e.g., Chevalierand Mayzlin 2006) of each video, Rating, which areposted by registered YouTube users. Table 5 summa-rizes the variables. The correlations are presented inthe online appendix.

5. Empirical Approach5.1. Baseline Model and Variable DescriptionsThe baseline model is the standard Bass modelthat takes into account the social network structure.The Bass model estimates the growth of aggregatedemand or diffusion rate as a function of the aggre-gate demand of the prior time period as well as thetime elapsed form the initial launch of a new prod-uct. A viewer watching a video on YouTube is equiv-alent to a consumer adopting a new product and thechoice facing a viewer is whether to view a video ornot. The dependent variable is the diffusion rate ofeach video, measured by the growth in views (e.g.,Bass 1969), which is the difference of the number ofclicks to the video between time t and time �t − 1�,�vijt ≡ vijt − vijt−1, or the popularity growth of video iposted by user j from time �t − 1� to time t. Themeasure of aggregate demand is the popularity ofeach video, vijt, the total number of times a video hasbeen watched (the total number of times a video has

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 11

Table 5 Description of Variables

Variable name Description of variables

Video characteristicslogNumOfViews Log number of times a video is

watched �vijt �

�logNumOfViews�2 The square term of the log number ofviews �v 2

ijt �.logVAge (log-transformed) Time elapsed since

a video has been posted�log�VAge ijt ��

Rating Video rating (0 to 5) �Ratingijt �

logNumOfLinks Number of links, which lead to thevideo, placed outside YouTube

Within group social networkmeasures

Frn_NrmDegree Degree centrality of a user in thefriend network within the group

Frn_NrmBP Normalized Bonacich’ power in thefriend network within the group

Subs_NrmOut-Degree Out-degree centrality of a user in thesubscriber network within thegroup. Connection initiated by theuser.

Subs_NrmIn-Degree In-degree centrality of a user in thesubscriber network within thegroup. Connection initiated byothers.

Outside group social networkmeasures

logNumFrn_Outside Number of friends outside the group

logNumOutSubs_OutsideNumber of outgoing subscriptions

outside the grouplogNumInSubs_Outside Number of incoming subscribers

from the outside of the group

been clicked), consistent with the Bass model (Bass1969) and the prior literature on new product diffu-sion (e.g., Talukdar et al. 2002). The subscripts i� j� tdenote video i, user j , and observation time t. Fig-ure 7 presents the distribution of the log-transformedviews by the percentage of viewers in the group.The baseline model is presented below. We assume

that the average number of views by a single vieweris independent of the exogenous factors in the esti-mation model. Y is a set of covariates that represent

Figure 7 Distribution of Log-Number of Views

0

5

10

15

20

25

30

0 8 10 12 14 16

Pro

babi

lity

(%)

Log (number of views)62 4

the video characteristics; such as the age of video iat time t uploaded by channel j VAge, the number ofouter links NumOfLinks, and the rating of the videoRatingijt. The ratings are posted by registered YouTubeusers and could change over time. The number ofexternal links is the log-transformed number of linksoutside YouTube, which enable accesses to the videosfrom blogs, MySpace, Facebook, or other kinds ofonline forums and communities. Social ties and userpreferences outside of the network are also userspecific at time t. To avoid identification challengeswe consider the impact of social network measuresup to the period �t − 1�. The normalized degreecentrality of channel j within the friend network,and the In- and Out-degree centrality of channel jwithin the subscriber network are all user specific.Our independent variables are the network measuresof friend networks (Frn_NrmDegjt−1, Frn_NrmBPjt−1,logNumFrn_outsidejt−1) and subscriber network (Subs_NrmOutDegjt−1, Subs_NrmInDegjt−1). For both friendnetworks and subscriber networks, we distinguishbetween the local network effects (Frn_NrmDegjt−1,Subs_NrmOutDegjt−1, Subs_NrmInDegjt−1) from thatof nonredundant ties outside of the group (Subs_NrmOutDegjt−1�Subs_NrmInDegjt−1 logNumFrn_outsidejt−1).The diffusion equation incorporating temporal het-erogeneity (e.g., Strang and Tuma 1993) is

�vijt = �0+�1vijt−1+�2v2ijt−1+�3 log�VAgeijt�

+Yijt−1 ·�+Xjt−1 ·�+ijt� (1)

Xjt−1 = Frn_NrmDegjt−1Frn_NrmBPjt−1

·logNumFrn_outsidejt−1Subs_NrmOutDegjt−1

·Subs_NrmInDegjt−1��

Yijt−1 = Ratingijt−1 logNumOfLinksijt−1��

There are several assumptions underlying the stan-dard Bass model. The population is assumed to behomogeneous, the size of the total population is fixedand known, and the parameters of external and inter-nal influence are assumed to be unchanged over time.Our approach is similar to Strang and Tuma (1993)in that we decompose the diffusion process into twocomponents: (i) the number of individuals who arealready infected (and can infect others) and (ii) thehazard rate of contagiousness (as a function of socialnetwork position), thereby incorporating heterogene-ity in the infectiousness of a channel, depending on itsnetwork position, and temporal heterogeneity, whichrefers to the time-varying influence of factors thatdrive diffusion. That is, rather than assuming thateach node is equally effective in activating conta-gion, we approximate the individual level of haz-ard rate by considering the diffusion probability to

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube12 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

depend on the degree centrality of the channel post-ing each individual video. Being a friend or a sub-scriber to a central node increases the susceptibilityof other nodes due to proximity to the central node,either because the viewer is aware about the videoor because the viewer would like to experience thevideo due to social influence. Because the age vari-able is skewed, we conducted an alternate specifica-tion using a dummy variable for Age instead of thelog-transformed measure of age, and found that thecoefficients in the baseline model were consistent.

5.2. Identification ChallengesIdentifying social contagion effects poses a numberof statistical and econometric challenges that are nottaken into account in the baseline Bass model. First,the baseline model assumes that the difference in thenumber of views between time periods is a func-tion of the total popularity of a video. When exoge-nous and unobservable events could trigger a surgeof popularity that drives diffusion, �vijt is influencedby �vijt−1. In other words, there could be a poten-tial bias due to contemporaneous shocks that affecttastes, which creates a serial correlation in estimatingthe popularity of videos due to unobservable socialfactors. Such serial correlation needs to be explicitlytaken into account in the econometric model.Second, identifying social influence is complicated

by the fact that an individual’s choices reflect thechoices of the group to which an individual belongs.The difficulty in estimating social contagion effects isthat individual behavior is not fixed but varies withthe prevailing norms or tastes of the social group.Econometrically, this is referred to as the reflectionproblem in the literature on peer effects in economics.We address the reflection problem by assuming thatthe effect of social contagion is based on networkcomposition up to the previous period. This approachcan address potential simultaneity between the meancharacteristics of a group and the mean outcome.We also observe that users (channels) communicatenot only within the group but also with individualsoutside the group. When a user (channel) interactswith others outside her subscriber network as well asindividuals within the subscriber network, individualdecisions are indicative not only of the social influ-ence of the group that the user belongs to (i.e., thesubscriber network) but also of the social influence ofothers from outside the subscriber network.However, an additional challenge is that we still

need to consider whether the measured impact ofa user’s connectedness in the subscriber networksreflects a systematic pattern of self-selection drivenby user preferences rather than the result of informa-tional role of channels that occupy a central role insubscriber networks in directing search and product

discovery. Such self-selection could likely affect view-ing volume. In that case, users with a high degree ofincoming subscriptions might be the ones with a fingeron the pulse, i.e., arbiters of what is likely to becomepopular, while those users with a greater degree ofoutgoing subscriptions have better awareness of oth-ers’ choices. Given the multiplicity of factors thatcould confound social influence, we distinguish mem-bership in a social network from other types of socialinfluence by conducting exclusion restrictions on thegroup composition. Thus, our method of identifica-tion is similar to recent approaches that measuresocial influence by identifying the nature of interac-tions structured through a network (e.g., Bramoulleet al. 2009).A third problem is that of unobserved heterogene-

ity in user tastes, which might result from an unob-served demographic characteristic such as age thatreflects the unobserved preferences of viewers. Forinstance, some users (channels) are likely to be exper-imenters while others wait to sample content onlyafter it becomes popular. The patterns of diffusion ondiffusion and the popularity of content could thenresult from heterogeneity rather than contagion dueto social influence (e.g., Bemmaor and Lee 2002). Thestructure of network formation in YouTube allows usto address this problem because we can distinguishbetween a user’s membership in a group of socialinfluence (the friend network) from membership ingroups dictated by user tastes.

5.3. Structural Model SpecificationWe now consider the interaction between variablesthat may impact the diffusion process (e.g., Wejnert2002) and the nature of group formation due touser preferences. The structural estimation proceedsin three stages. First, following the techniques usedby Boulding and Christen (2003), we use -dif-ferencing to remove serial correlation. Second, weconduct exclusion restrictions on the group compo-sition characterizing subscriber networks to addresspotential self-selection. We conduct two different esti-mations: (i) Hausman-Taylor estimation (Wooldridge2002, pp. 225–228) with instrumental variables to takeinto account unobserved self-selection relationshipsbetween users and videos and (ii) a multilevel (hierar-chical) estimation to address unobserved channel het-erogeneity.

5.3.1. Rho-Differencing to Remove Serial Corre-lation. The Bass model with a serial correlation inthe error term, where, ijt = �ijt + ijt−1 is

�vijt = �0 +�1vijt−1 +�2v2ijt−1 +�3 log�VAgeijt�+ijt� (2)

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 13

The autocorrelation effect ijt−1 is removed through aserial correlation adjustment, leaving us with a con-temporaneous shock term.

�vijt = �vijt−1 + �0�1− � + �1vijt−1 − �1 vijt−2

+ �2v2ijt−1 − �2 v2

ijt−2 + �3 log�VAgeijt�− �3 log�VAgeijt−1� + �ijt�

(3)

After estimating from Equation (3) and taking thefirst-order difference, we have the variables correctedfor serial correlation ��′

ijt = ��ijt − ̂��ijt−1, �′ijt−1 =

�ijt−1 − ̂�ijt−2, �′2ijt−1 = �2

ijt−1 − ̂�2ijt−2, and log�VAge′

ijt� =log�VAgeijt�− ̂ log�VAgeijt−1� and the random shock �ijt

as the error term. Through p values, we establish thatthe parameter estimate of the auto-regression coeffi-cient is significant.

5.3.2. Hausman-Taylor Estimation Correcting forSelf-Selection. To identify social effects, Manski(1993) recommends that we understand the fac-tors that dictate the composition of social groups.To consider whether users self-select into differ-ent subscriber networks based on their taste pref-erences, we conduct exclusion restrictions usingNumOutSubs_outsidejt and NumInSubs_outsidejt as in-struments for out-degree and in-degree centralities ofthe subscriber network, which can remove the endo-geneity in systematic matching across users and socialnetworks depending on unobserved taste preferences.As a robustness check we performed a Wald F -test(Angrist and Krueger 1991) for the joint significanceof the parameters by including instrumental variablesalong with the other independent variables in the dif-fusion estimation, and verified that the test rejectedthe joint significance of the variables. We estimate thefollowing second-stage models:

Subs_NrmOutDegjt =f �logNumOutSubs_outsidejt�� (4)

Subs_NrmInDegjt =f �logNumInSubs_outsidejt�� (5)

The Hausman-Taylor estimation (Hausman andTaylor 1981) addresses unobserved relationshipsbetween popularity growth and video characteris-tics. In contrast with either fixed effects or randomeffects estimation, the Hausman-Taylor (hereafter,HT) approach assumes that some and not all ofthe regressors are correlated with individual effects.Given that a video in our data set can be uploaded byonly one channel, we can address a user’s time invari-ant propensity to enjoy a particular video by con-ducting an instrumental variable (IV) HT estimation(procedure detailed in Wooldridge 2002, pp. 325–328).From the exclusion restrictions described in (4)and (5), the exogenous time-varying variables, thelog-transformed number of subscribers from outsidethe network, are employed as instruments for the

endogenous time-varying variables, the out-degreeand in-degree centrality of the channel in the sub-scriber network. That is, because there could be self-selection of users depending on tastes, a channel’sposition in the subscription network as endogenousand time varying. We then estimate a random effectsmodel using the means of two exogenous time vary-ing variables, video rank, and number of externallinks as instruments for the endogenous time-fixedvariable, the length of the video.12 This approachallows us to capture the time-invariant characteris-tics of a video that could impact the diffusion pro-cess. Fixed effects estimation, by contrast, removes allsources of time-invariant variation in the explanatoryvariables. The endogeneity assumptions were vali-dated through a Hausman-Taylor specification test.We also conducted sensitivity tests to establish theappropriateness of the instruments used, consistentwith Boulding and Christen (2003) and Wooldridge(2002). We use X, Y , and Z to represent

Xjt−1 = frn_NrmDegjt−1 frn_NrmBPjt−1

· logNumFrn_outsidejt−1��

Yijt−1 = Ratingijt−1 logNumOfLinksijt−1��

�Z = Subs_ �N rmOutDegjt� Subs_ �N rmInDegjt��

��′ijt = �0 + �1�

′ijt−1 + �2�

′2ijt−1 + �3 log�VAge′

ijt�

+ �Yijt−1 + �Xjt−1 + ��Zjt−1 + �ijt� (6)

��′ijt = �0 + �1�

′ijt−1 + �2�

′2ijt−1 + �3 log�VAge′

ijt�

+ �Yijt−1 + �1Xjt−1 + �2�log�VAgeijt� · Xjt−1�

+ �1�Zjt−1 + �2�log�VAgeijt� · �Zjt−1� + �ijt� (7)

5.3.3. Hierarchical Estimation to Address Chan-nel Heterogeneity. An alternate causal explanationwe need to consider is that the result of growthin views could be the impact of external linkagesand publicity efforts by each channel that results ina video receiving more attention, leading to greaterawareness diffusion. That is, it could be the effortmade by channels to attract subscribers or greaterattention-seeking behavior that is responsible forgrowth in views rather than the social influence. Inthe econometric estimation, this leads to unobservedheterogeneity resulting from the efforts of channels attargeting potential subscribers.We categorize channels into three groups, depend-

ing on their connectedness in the overall networkstructure. The first group consists of channels iso-lated from the overall network structure within the

12 The Hausman-Taylor estimation was conducted using Stata. Thespecification examines the rank of the variance-covariance matrixand detects whether the constraints have been met.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube14 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

group. The second group of channels are connectedto subscribers and friends outside the group (networkboundary) but isolated from interactions within thenetwork boundary. The third group comprises chan-nels that are well connected both within and out-side the network boundary. We follow Talukdar et al.(2002) in estimating a hierarchical model with anidiosyncratic channel specific component of the errorterm to capture the heterogeneity.The estimation proceeds as follows. We re-estimate

the models in (6) and (7) but add a channel-specific idiosyncratic term �k ∼ N�0��k� where wedefine three levels of hierarchy k ∈ �1�2�3�. Theintercept now becomes �0 + �k. There are two vari-ance components—one represents the variance amongvideos within the group � , and the other representsthe variation between group means �k (dependingon the connectivity of the different groups of chan-nels).13 Ignoring this systematic variation in globalconnectedness of a channel, which impacts the abilityof a channel to attract attention, could lead to mis-leading causal inference (e.g., Gelman and Hill 2007).We assume that observations within each level ofhierarchy share certain common characteristics result-ing from the connectedness of the channel. The errorstructure then becomes⎛

⎜⎝� 2 + �2

1 · · ·���

� � ����

· · · �2 + �2k

⎞⎟⎠ �

6. Results6.1. Discussion of EstimatesThe estimates for the baseline model are presented inTable 6. The video rating and the external links arehighly significant, indicating that the perceived videoquality and strategic linkages with other online chan-nels of influence positively influence diffusion.The results from the structural model are shown

in Table 7. The R-squared values are greater in thebaseline model due to serial correlation that has beencorrected in the structural models.The in-degree centrality of users in subscriber net-

works is highly significant, indicating support forHypothesis 1A. A channel that is central in a networkof subscribers could be more influential in directingsearch and discovery toward itself, with the resultthat content posted by the channel is more likelyto be disseminated into the overall network. Friendnetworks have a significant impact on diffusion, vali-dating Hypothesis 2A. Homophily and localized con-formity could provide an influential role to channels

13 For robustness we estimated a model with channel-specific het-erogeneity specified at the level of each individual channel andverified that the grouped approach fits the data better.

Table 6 Baseline Model Estimates

Parameter estimates

Variable Baseline

Intercept 0�287600 (0.003544)∗∗∗

LogNumOfViews 0�001317 (0.000378)∗∗∗

(LogNumOfViews�2 0�000625 (0.000094)∗∗∗

Log (vAge) −0�05003 (0.000596)∗∗∗

Rating 0�001981 (0.000295)∗∗∗

logNumOfLinks 0�002463 (0.000861)∗∗∗

Frn_NrmDegree −0�00080 (0.000203)∗∗∗

Frn_NrmBP −0�00003 (9�959E − 6)∗∗

Subs_NrmOut −Degree −0�00308 (0.000788)∗∗

Subs_NrmIn −Degree 0�000182 (0.001749)logNumFrn_Outside −0�00063 (0.000261)∗∗

Adjusted R Squared 0.17

Note. Standard errors in parentheses; ∗∗p = 0�05; ∗∗∗p = 0�01.

that are central in a local network of friends, and suchsocial influence eventually diffuses to the aggregatenetwork. Channels that are more connected to friendsfrom outside the group play a significant role in thediffusion of content, validating Hypothesis 3A. Cen-tral actors that can enrich the set of experiences andideas available to the aggregate network could pro-vide the pathways to transmit access to new and non-redundant content in the overall network.The interaction term between in-degree centrality

and video age is negative, indicating support forHypothesis 1B. However, the coefficient of the inter-action term between out-degree centrality and videoage is positive. The contrasting impacts of outgoingand incoming subscriptions in the earlier stages ofdiffusion suggest that a channel’s ability to directsearch and discovery through its incoming ties mightbe more important than a channel broadcasting itschoices and seeking attention through outgoing ties.The interaction term between degree centrality of thefriend network with time is positive, indicating sup-port for Hypothesis 2B, i.e., friend networks play avery important role in the later stages of diffusionwhen the experience attributes of a product are moreimportant. The Bonacich power, which measures thestatus of the channel, is more important in the earlystages than the later stages, supporting H2C. Therelative prestige of a channel within the local net-work could be very important in the earlier stagewhen the video is fairly unknown. However, the cen-trality of the channel could be more important inthe later stages due to the ability to quickly trans-mit contagion to a host of other actors. The inter-action term of the number of nonlocal friends withvideo age has a significant impact on diffusion, val-idating Hypothesis 3B. Nonlocal friends provide animportant pathway for a channel to activate conta-gion beyond the local network neighborhood. Themagnitude of the impact of nonlocal ties is stronger

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 15

Table 7 Structural Model Estimates

Parameter estimates

Hausman-Taylor with Hierarchical model withVariables Hausman-Taylor interaction terms Hierarchical model interaction terms

Intercept 0�2317 (0.00354)∗∗∗ 0�2525 (0.00628)∗∗∗ 0�22800 (0.004360)∗∗∗ 0�24650 (0.00684)∗∗∗

LogNumOfViews 0�00196 (0.00034)∗∗∗ 0�00207 (0.00035)∗∗∗ 0�00232 (0.000343)∗∗∗ 0�00207 (0.00035)∗∗∗

(LogNumOfViews)2 −0�00014 (0.00009) −0�00015 (0.00009)∗ −0�00017 (0.000085)∗∗ −0�00019 (0.00008)∗∗

Log (I) −0�03810 (0.00059)∗∗∗ −0�04196 (0.00117)∗∗∗ −0�03726 (0.000603)∗∗∗ −0�04071 (0.00120)∗∗∗

Rating 0�00077 (0.00027)∗∗∗ 0�00077 (0.00027)∗∗∗ 0�00094 (0.000272)∗∗∗ 0�00074 (0.00027)∗∗

logNumOfLinks 0�00464 (0.00079)∗∗∗ 0�00476 (0.00079)∗∗∗ 0�00404 (0.000792)∗∗∗ 0�00457 (0.00079)∗∗∗

Frn_NrmDegree −0�00069 (0.00017)∗∗∗ −0�00440 (0.00165)∗∗∗ −0�00079 (0.000187)∗∗ −0�00449 (0.00165)∗∗∗

Frn_NrmBP −0�00001 (9�107E − 6) 0�00017 (0.00007)∗∗ −0�00002 (9�123E − 6)∗∗∗ 0�00017 (0.00007)∗∗

Subs_NrmOut-Degree −0�00209 (0.00180) 0�09068 (0.01258)∗∗∗ −0�00339 (0.000728)∗∗∗ 0�09135 (0.01258)∗∗∗

Subs_NrmIn-Degree 0�01514 (0.00267)∗∗∗ 0�04580 (0.01971)∗∗ 0�00532 (0.001700)∗∗∗ 0�05403 (0.01979)∗∗∗

logNumFrn_Outside −0�00158 (0.00029)∗∗∗ −0�01317 (0.00217)∗∗∗ −0�00067 (0.000253)∗∗∗ −0�01218 (0.00219)∗∗∗

logVAge ∗ Frn_NrmDeg 0�00070 (0.00031)∗∗ 0�00071 (0.00031)∗∗

logVAge ∗ Frn_NrmBP −0�00004 (0.00001)∗∗ −0�00004 (0.00001)∗∗∗

logVAge ∗ logSubs_NrmOut −Degree 0�00213 (0.00040)∗∗∗ 0�00199 (0.00040)∗∗∗

logVAge ∗ logSubs_NrmIn −Degree −0�01751 (0.00238)∗∗∗ −0�01763 (0.00238)∗∗∗

logVAge ∗ logNumFrn_Out −0�00532 (0.00372)∗∗ −0�00653 (0.00373)∗∗

Video length −0�0010 (0.00010)∗∗∗ −0�0010 (0.00010)∗∗∗

Adjusted R-squared 0.1323 0.1348 0.1287 0.1290

Note. Standard errors in parentheses; ∗p = 0�10; ∗∗p = 0�05; ∗∗∗p = 0�01.

than that of the degree centrality of the actor in afriend network, suggesting that the ability of a nodeto facilitate contagion to the aggregate network mightbe more important than the relationships marked byhomophily and conformity within a local network.Based on the estimates from the structural model,we constructed several diffusion curves incorporatingsocial network effects. Figure 8 illustrates the processof diffusion with and without the impact of varioussocial network structures on YouTube.14

In the absence of social influence effects, diffusionwould have been more rapid in the initial stages butwould also have peaked with a lower number ofaggregate views. Such diffusion dynamics could occurfor two reasons. First, the dynamics whereby thenumber of adopters reaches a critical mass is likelyto be different when we factor in the role of centralchannels in persuading early adopters. Channels thatare highly connected in a subscriber network have thepotential to act as mavens (Gladwell 2000). A videoposted by the central channel first needs to reach apool of early adopters, which subsequently influencesthe rate at which the video diffuses through the pop-ulation. Because the subscriber network is a directedgraph, the role of central nodes is to seed contentin the initial stages of contagion, influencing aware-ness diffusion. The initial diffusion rate is then verysensitive to the network structure of incoming andoutgoing subscriptions. Second, because we find thatthe role of conformity preferences and nonlocal ties

14 This figure has been modified to scale and is not an exact repre-sentation of the estimated coefficients.

is greater in the later stages of the diffusion process,diffusion through friend networks occurs only aftersome proportion of the initial population of users isinfected, when a certain proportion has already vieweda video, or when a social threshold level of adoptioncreates a tipping point. One important implication isthat the combined effect of friendship networks andsubscriber networks reduces the time for the diffusioncurve to reach the familiar S-shape. Figure 9 depictsthe number of views of a video over time for a repre-sentative set of videos.The results offer a contrast between the different

types of information transmission in various types ofsocial interactions. Channels occupying positions ofinfluence in the subscribers’ network are crucial indirecting search and ensuring that a video acquiresa critical mass of viewers. While the effects of socialinfluence from friendship networks are not strong in

Figure 8 Diffusion With and Without Social Network Effects

Log

(num

ber

of v

iew

s)

Video age

No networkFriend networkSubscriber networkFriend and subscriber networks

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube16 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

Figure 9 Video Popularity with Video Age

5

7

9

11

13

0 50 100 150 200 250 300

Log

(num

ber

of v

iew

s)

Video age

Incoming subscribers’influence is greater

Nonlocal friends havegreater influence

Within-group friends’influence is greater

the early stages of the life of a video, it is likelythat a small number of highly influential nodes inthe subscriber networks enhance the transmissibilityof the video by setting off a trajectory of adoptionwithin the subscription networks that cascades to thefriend networks, creating a social multiplier effect thatmagnifies the impact of user tastes through socialpreferences such as conformity. If the process of infor-mation transmission does not depend on user tastesor homophily, we should observe that the impactof the originating channel’s position in the friendnetwork and subscriber networks should be similarthroughout the diffusion process. We also find thatthe impact of nonlocal ties is stronger in the pro-cess of diffusion than that of within-group ties. The-ory suggests that users place different weights oninformation acquired from other users (e.g., Bala andGoyal 1998). A preference for conformity leading tofrequent and sustained connections with others in alocal network creates a strong sense of identification;however, such conformity limits the amount of non-redundant information that can be transmitted acrossthe local network. Long ties that are weaker in tiestrength can be very influential in extending diffusionbeyond a cohesive group. Our results suggest that theinformational role of long ties could be more impor-tant than the relational role of stronger ties. The con-trasting effects of degree centrality and the Bonacichpower measures in a friend network also suggest thathomophily and social status might have subtle differ-ences in the process of diffusion.

6.2. Contributions to Literature and PracticePrior research has extensively analyzed the roleof online dissemination of consumer opinionssuch as online word-of-mouth (e.g., Chevalier andMayzlin 2006), consumer-generated media (Dewanand Ramaprasad 2008), and social identity disclosurein user-generated content (Forman et al. 2008). Thefocus on this paper is on the impact of the networkposition of the content creator, which not only impacts

whether a video achieves success or failure but alsoaffects the magnitude of the impact. This inquiry intothe role of network position distinguishes this studyfrom prior studies on user generated content in ISliterature. Prior literature on diffusion highlights theimportance of promotional efforts such as targetedcommunication (Manchanda et al. 2008), advertising(Van den Bulte and Lilien 2001) and the role of influ-entials (Dodds and Watts 2007). In addition to pro-motional efforts, content creators control the timing ofthe release of an album as well as promotional effortsuch as radio play (e.g., Bhattacharjee et al. 2007).Such promotional efforts play a much more limitedrole on YouTube where the model of content creationis more democratic as evidenced by the blockbusterpopularity of amateur content.The results in this paper demonstrate that social

networks impact economic outcomes by structuringon the information available to other actors, whichinfluences others’ decisions, perceptions, and behav-ior. The social capital fostered through networkedinteractions might also mitigate the potential forinformation asymmetry, suggesting that research onreputation systems on the Internet (e.g., Resnick et al.2000) could incorporate social networks based expla-nations. This paper also highlights that various formsof networked social interactions exert different typesof interpersonal influence and the nature of infor-mation transmission can be different depending onwhether ties are homophilous or instrumental ties.This perspective could enrich the stream of researchon the adoption and diffusion of technological inno-vations by highlighting the process of interpersonalinfluence. Multiplier effects arising from social conta-gion within a social network can be instrumental inshaping perceptions of the usefulness of innovations(e.g., Bandiera and Rasul 2006), explaining the trajec-tory of diffusion of technological innovations.While technology-mediated communications ex-

isted earlier, the new models of social computing arecharacterized by a spontaneous emergence of commu-nities, with a wealth of opportunity for participatoryinteraction, self-expression, and collective action. Theease with which users can participate in social com-puting platforms and seek out digital content suchas music and entertainment is of tremendous inter-est to marketers and content creators. Businesses areattempting to enhance customer engagement througha community-centered approach to product devel-opment and brand building (Wells 2009), shiftingpower to the edge of the network. With the strongrate of growth of YouTube (Peck et al. 2008), andas online video supplants traditional media channelssuch as television, the question facing companies iswhether to sponsor a set of content creators to posttheir content on YouTube or to fund a set of videos

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 17

that have already acquired momentum. We find thatthe number of outgoing subscriptions has a differentimpact from that of the number of incoming subscrip-tions of a channel, indicating that who initiates theconnections in a network matters to how the con-tent is diffused. A push model of content creation,where a channel tries to initiate connections to otherusers, might be less successful than a pull model,where a channel with high in-degree centrality actsas a fashion leader by initiating contagion. Therefore,one implication for practitioners is to consider the roleof the prestige of a channel in strategically seedingcontent in order to maximize impact.One perspective in marketing contends that a small

number of mavens or influentials play a disproportion-ate role in spreading new ideas or triggering adoptionof a new product (Gladwell 2000). Another perspec-tive questions whether influentials can act as gate-keepers and suggests that a critical mass of easilypersuadable individuals drives collective contagion(Dodds and Watts 2007). The results in this paper sug-gest that central channels in the subscriber networksact as influentials in directing search and product dis-covery. Given the critical importance of incomingsubscriptions, channels could try to build an initialuser base through purchasing key words to improvetheir listing in sponsored search, through promotionalvideos, or by listing in YouTube’s featured videoslisting.While subscription networks are important in

directing viewer attention and acquiring a criticalmass of viewers, the magnitude of the popularityalso depends on a channel’s position in the friendnetworks. Content creators could monetize contentthrough subscription fees or micro payments; how-ever, an advantageous position in the subscriber net-work alone might not be enough for a channel toobtain revenues. Rather, it could be greater userengagement driven by homophily or affiliation thatcould be important in monetization of content. Thusdifferent types of networks play different roles inensuring channel loyalty and user engagement, andthey consequently impact the monetization of content.It has been suggested that online video communi-

ties are characterized by user stickiness (Peck et al.2008). Content creators need to comprehend hownetworks of interpersonal influence promote users’identification with their channel to design futurepromotional efforts and advertising campaigns. Busi-nesses seeking to monetize social networks, in par-ticular, could obtain insights to maximize the impactof targeted marketing, advertising, and the spreadof digital content. Understanding the social struc-ture of interactions is also important in identifyingauthoritative nodes that influence product discovery

and viewing behavior. For instance, the local popu-larity of a video is considerably different from that ofglobal popularity (Baluja et al. 2008). An understand-ing of local network effects, and particularly userengagement within the local network, is importantto develop greater knowledge of what users searchfor and what impacts user experience in online socialnetworks. Analyzing such patterns of social interac-tions could prove valuable in developing recommen-dation tools and collaborative filtering systems (e.g.,Oestricher-Singer and Sundararajan 2008).

6.3. LimitationsThis study has a few limitations. We also cannot ruleout whether attention-seeking efforts by content cre-ators impact diffusion by increasing the saliency ofeach video, rather than the social interactions positedhere. While we distinguish between the impacts ofincoming and outgoing subscriptions on the aggre-gate diffusion process, we do not examine the deepertheoretical differences in the different types of sub-scriptions. Future work can examine these differencesin terms of network structure and conditions thatcan trigger a large-scale epidemic propagation. Wealso do not consider whether social influence throughoffline interactions bolsters the impact of conformitypressures and status in networked interactions online.Another issue is that the impact of channel centralityin directing search and influence potential experiencecould be a result of the involvement of the channel inthe overall YouTube network, increasing the salienceof videos posted by a channel among potential view-ers rather than the impact of social contagion.While we control for the differences in individual

connectedness of a channel, we do not investigatewhether individual heterogeneity results in differentagents updating their beliefs differently. It is possi-ble that contagion might be triggered through a com-plex interaction between network structure and deci-sion making in groups, whereby agents’ decisionscould depend not only on the incident ties of anagent but also on the characteristics and actions of theagents’ neighbors. Bala and Goyal (1998) characterizelearning from neighbors as a Bayesian updating pro-cess whereby an agent updates her beliefs based onthe actions of other agents. Dodds and Watts (2007)model interpersonal influence as a function of notonly network characteristics such as proximity butalso the influencer’s expertise and characteristics ofother individuals adjacent to the influencer. It mightbe necessary to conduct experimental studies to teaseout such impacts.

7. Conclusions and Future ResearchWhile the proliferation of social computing modelsis of interest to academics and practitioners alike, an

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube18 Information Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS

analysis of social computing platforms cannot takeplace without an understanding of the interpersonalinfluence underlying individual behavior. For busi-nesses seeking to monetize social search and digitalcontent, in particular, it is increasingly importantto understand how users interact and participate inthese settings. This study investigates how conta-gion impacts the trajectory of diffusion in a socialcomputing platform. We integrate multiple theoreticalperspectives to highlight that social interactions struc-tured through a network significantly influence notonly the spread of contagion but also the magnitudeof the contagion. We find that diffusion proceeds in atwo-stage process in which a product’s search charac-teristics matter in the early stage in triggering aware-ness diffusion creating a role for subscriber networks,while the experience characteristics matter more inthe later stage, and thus the role of a central channelin the friend networks in disseminating experience-related information.Identifying social influence is complicated due to

the difficulty in distinguishing between social influ-ence and self-selection of users. We address the reflec-tion problem by systematically separating betweenfactors that affect membership in a social networkfrom other types of social influence. We can thus ruleout the possibility that an individual’s viewing pat-tern is either solely influenced by the social group thatthey belong to. The Hausman-Taylor estimation con-siders the impact of a viewer’s time invariant loyaltyto a video. To address the issue of the unobservableattention-seeking efforts, we consider channel specificheterogeneity resulting from the aggregate connect-edness of a channel. Given the multiplicity of factorsconstituting social influence, this study makes a firststep in demonstrating not only that social influencematters but also that we can disentangle the differentmechanisms by which social influence is structured.Our estimation is robust to unobservable self-selectionof users, unobserved heterogeneity in user prefer-ences and robust to contemporaneous shocks thatmight result in serial correlation.In an increasingly hypernetworked age, individuals

have access to informational content from a wide vari-ety of online sources such as blogs, consumer forums,podcasts, and social media that influence their tastesand preferences. Individuals whose networks bridgea variety of sources have access to a diversity of infor-mation and can translate information across groups.Agents who broker across structural holes in a net-work might have an informational advantage result-ing from access to multiple sources of information.Future research can explore how information is trans-mitted across different networks and whether promi-nent users function as conduits of information linkingnetworks across different forms of social media.

AcknowledgmentsThe authors thank Sanjeev Dewan and the ISR reviewteam for constructive feedback and suggestions; V. Sam-bamurthy, Ritu Agarwal, Alok Gupta, Mani Subramani,and seminar participants at the University of Minnesota,Hong Kong University of Science and Technology, Uni-versity of Washington, Singapore Management University,Towson University, University of Washington InformationSchool, City University of Hong Kong, Workshop in E-Business 2007, International Conference in Information Sys-tems 2008, Informs Conference 2009, and the Symposiumon Statistical Challenges in eCommerce Research 2008 forcomments on earlier versions of this paper.

ReferencesAngrist, J., A. Krueger. 1991. Does compulsory school atten-

dance affect schooling and earnings? Quart. J. Econom. 106(4)979–1014.

Armstrong, C. P., V. Sambamurthy. 1999. Information technologyassimilation in firms: The influence of senior leadership and ITinfrastructures. Inform. Systems Res. 10(4) 304–327.

Bala, V., S. Goyal. 1998. Learning from neighbors. Rev. Econom. Stud.65(3) 595–621.

Baluja, S., R. Seth, D. Sivakumar, Y. Jing, J. Yagnik, S. Kumar,D. Ravichandran, M. Aly. 2008. Video suggestion and dis-covery for YouTube: Taking random walks through the viewgraph. Proc. 17th Internat. Conf. World Wide Web, Beijing,895–904. http://www.conference.org/www2008/papers/pdf/p895-svakumarA.pdf.

Bandiera, O., I. Rasul. 2006. Social networks and technology adop-tion in Northern Mozambique. Econom. J. 116 862–902.

Bass, F. 1969. A new product growth for model consumer durables.Management Sci. 15(5) 215–227.

Bemmaor, A. C., J. C. Lee. 2002. The impact of heterogeneity andill-conditioning on diffusion model parameter estimates. Mar-keting Sci. 21(2) 209–220.

Bernheim, D. A. 1994. A theory of conformity. J. Political Econom.102(5) 841–877.

Bhattacharjee, S., R. D. Gopal, K. Lertwachara, J. R. Marsden,R. Telang. 2007. The effect of digital sharing technologies onmusic markets. Management Sci. 53(9) 1359–1374.

Bikhchandani, S., D. Hirshleifer, I. Welch. 1992. A theory of fads,fashion, custom, and cultural change as informational cas-cades. J. Political Econom. 100(5) 992–1026

Bonacich. 1987. Power and centrality: A family of measures. Amer.J. Sociol. 92(5) 1170–1182.

Borgatti, S. P. 2005. Centrality and network flow. Soc. Networks 27(1)55–71.

Borgatti, S. P., M. G. Everett. 2006. A graph-theoretic framework forclassifying centrality measures. Soc. Networks 28(4) 466–484.

Borgatti, S. P., M. G. Everett, L. C. Freeman. 2002. Ucinet for Win-dows: Software for Social Network Analysis. Analytic Technolo-gies, Harvard, MA.

Boulding, W., M. Christen. 2003. Sustainable pioneering advantage?Profit implications of market entry order. Marketing Sci. 22(3)371–392.

Bramoulle, Y., H. Djebbari, B. Fortin. 2009. Identification of peereffects through social networks. J. Econometrics 150(1) 41–55.

Burt, R. 1987. Social contagion and innovation: Cohesion versusstructural equivalence. Amer. J. Sociol. 92(6) 1287–1335.

Burt, R. 1992. Structural Holes: The Social Structure of Competition.Harvard University Press, Cambridge, MA.

Centola, D., M. Macy. 2007. Complex contagions and the weaknessof long ties. Amer. J. Sociol. 113(3) 702–734.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.

Susarla et al.: Social Networks and the Diffusion of User-Generated Content: Evidence from YouTubeInformation Systems Research, Articles in Advance, pp. 1–19, © 2011 INFORMS 19

Cha, M. H., P. Kwak, Y. A. Rodriguez, S. Moon. 2007. I Tube,YouTube, everybody Tubes: Analyzing the world’s largest usergenerated content video system. Proc. of the ACM Special Inter-est Group on Data Comm. (SIGCOMM) Internet MeasurementConf., San Diego.

Chevalier, J., D. Mayzlin. 2006. The effect of word of mouth online:Online book reviews. J. Marketing Res. 43(3) 345–354.

Coleman, J. S., E. Katz, H. Menzel. 1966. Medical Innovation: Diffu-sion of a Medical Drug Among Doctors. Bobbs-Merrill, Indianapo-lis.

Crane, R., D. Sornette. 2008. Robust dynamic classes revealedby measuring the response function of a social system. Proc.National Acad. Sci. 105(41) 15649–15653.

Dewan, S., J. Ramaprasad. 2008. Consumer blogging and musicsampling. Working paper, University of California at Irvine,Irvine.

Dodds, P. S., D. J. Watts. 2004. Universal behavior in a general-ized model of contagion. Physical Rev. Lett. 92(21). http://prl.aps.org/abstract/PRL/v92/i21/e218701.

Dodds, P. S., D. J. Watts. 2005. A generalized model of social andbiological contagion. J. Theoret. Biol. 232(4) 587–604.

Dodds, P. S., D. J. Watts. 2007. Influentials, networks, and publicopinion formation. J. Consumer Res. 34 441–458.

Easley, D., J. Kleinberg. 2010. Networks, Crowds, and Markets: Rea-soning About a Highly Connected World. Cambridge UniversityPress, Cambridge, UK.

Fichman, R. G. 2000. The diffusion and assimilation of informationtechnology innovations. R. W. Zmud, ed. Framing the Domainsof IT Management: Projecting the Future Through the Past. Pin-naflex Educational Resources, Inc. Cincinnati, OH.

Forman, C., A. Ghose, B. Wiesenfeld. 2008. Examining the relation-ship between reviews and sales: The role of reviewer identityinformation. Inform. Systems Res. 19(3) 291–313.

Freeman, L. C. 1979. Centrality in social networks: A conceptualclarification. Soc. Networks 1 215–239.

Gelman, A., J. Hill. 2007. Data Analysis Using Regression andMulitlevel/ Hierarchical Models. Cambridge University Press,Cambridge, UK.

Gladwell, M. 2000. The Tipping Point: How Little Things Can Make aBig Difference. Little Brown, Boston.

Granovetter, M. 1973. The strength of weak ties. Amer. J. Sociol. 61360–1380.

Hausman, J. A., W. E. Taylor. 1981. Panel data and unobservableindividual effects. Econometrica 49(6) 1377–1398.

Hendricks, K., A. Sorenson. 2009. Information and the skewness ofmusic sales. J. Political Econom. 117(2) 324–369.

Ibarra, H., S. B. Andrews. 1993. Power, social Influence, andsense making: Effects of network centrality and proximity onemployee perceptions. Admin. Sci. Quart. 38(2) 277–303.

Kalish, S. 1985. A new product adoption model with price, adver-tising and uncertainty. Management Sci. 31(12) 1569–1585.

Kraut, R. E., R. E. Rice, C. Cool, R. S. Fish. 1998. Varieties of socialinfluence: The role of utility and norms in the success of a newcommunication medium. Organ. Sci. 9(4) 437–453.

Liu, H., P. Maes, G. Davenport. 2006. Unraveling the taste fabric ofsocial networks. Internat. J. Semantic Web Inform. Systems 2(1)42–71.

Mahajan, V., E. Muller, F. M. Bass. 1990. New product diffusionmodels in marketing: A review and directions for research.J. Marketing 54(1) 1–26.

Manchanda, P., Y. Xie, N. Youn. 2008. The role of targeted com-munication and contagion in new product adoption. MarketingSci. 27(6) 950–961.

Manski, C. F. 1993. Identification of endogenous social effects: Thereflection problem. Rev. Econom. Stud. 60(3) 531–542.

Nelson, P. 1970. Information and consumer behavior. J. PoliticalEconom. 78(2) 311–329.

Newman, M. E. J. 2003. The structure and function of complex net-works. SIAM Rev. 45 167–256.

Oestreicher-Singer, G., A. Sundararajan. 2008. The visible hand ofsocial networks in electronic markets. Working paper, NewYork University, New York.

Parameswaran, M., A. B. Whinston. 2007. Social computing: Anoverview. Comm. Assoc. Inform. Systems 19 762–780.

Peck, R. S., L. Y. Zhou, V. B. Anthony, K. Madhukar. 2008. Con-sumer Internet, Bear Stearns equity research report. BearStearns, New York.

Putnam, R. 2000. Bowling Alone: The Collapse and Revival of AmericanCommunity. Simon and Schuster, New York.

Raymond, E. 2001. The Cathedral and the Bazaar. O’Reilly, Sebastopol,CA.

Reagans, R. E., E. W. Zuckerman. 2008. Why knowledge does notequal power: The network redundancy trade-off. Indust. Cor-porate Change 17(5) 903–944.

Resnick, P., R. Zeckhauser, E. Friedman, K. Kuwabara. 2000. Repu-tation systems. Comm. ACM 43(12) 45–48.

Rogers, E. 1995. Diffusion of Innovations, 4th ed. The Free Press, NewYork.

Rogers, E. M., L. D. Kincaid. 1981. Communication Networks. FreePress, New York.

Sacerdote, B. 2001. Peer effects with random assignment: Resultsfor Dartmouth roommates. Quart. J. Econom. 116(2) 681–704.

Salganik, M., P. S. Dodds, D. Watts. 2006. Experimental study ofinequality and unpredictability in an artificial cultural market.Science 311 854–856.

Strang, D., N. B. Tuma. 1993. Spatial and temporal heterogeneity indiffusion. Amer. J. Sociol. 99(3) 614–639.

Sundararajan, A. 2007. Local network effects and complex networkstructure. Berkeley Electronic J. Theoret. Econom. 7(1, article 46).http://www .bepress.com/bejte/vol7/iss1/art46

Szabo, G., B. A. Huberman. 2009. Predicting the popularity ofonline content. Comm. ACM. Forthcoming.

Talukdar, D., K. Sudhir, A. Ainslie. 2002. Investigating new productdiffusion across products and countries. Marketing Sci. 21(1)97–114.

Van den Bulte, C., G. Lilien. 2001. Medical innovation revisited:Social contagion versus marketing effort. Amer. J. Sociol. 106(5)1409–1435.

Van den Bulte, C., S. Stremersch. 2004. Social contagion and incomeheterogeneity in new product diffusion: A meta-analytic test.Marketing Sci. 23(4) 530–544.

Wang, F.-Y., K. M. Carley., D. Zeng, W. Mao. 2007. Social computing:From social informatics to social intelligence. IEEE IntelligentSystems 22(2) 79–83.

Wasserman, S., K. Faust. 1994. Social Network Analysis: Methods andApplications. Cambridge University Press, Cambridge, UK.

Watts, D. J. 2002. A simple model of global cascades on randomnetworks. Proc. National Acad. Sci. 99(9) 5766–5771.

Watts, D. J., P. S. Dodds, M. E. J. Newman. 2002. Science 296(5571)1302–1305.

Wejnert, B. 2002. Integrating models of diffusion of innovations:A conceptual framework. Annual Rev. Sociol. (28) 297–326.

Wells, J. 2009. It’s crunch time. Globe Mail (February 13). http://www.theglobeandmail.com/report-on-business/article971011.ece

Wooldridge, J. M. 2002. Econometric Analysis of Cross Section andPanel Data. MIT Press, Cambridge, MA.

Copyright:

INFORMS

holdsco

pyrig

htto

this

Articlesin

Adv

ance

version,

which

ismad

eav

ailableto

subs

cribers.

The

filemay

notbe

posted

onan

yothe

rweb

site,includ

ing

the

author’s

site.Pleas

ese

ndan

yqu

estio

nsrega

rding

this

policyto

perm

ission

s@inform

s.org.