Two-stage models of innovation adoption with partial observability
Christophe Van den Bulte, University of Pennsylvania
Gary L. Lilien, Pennsylvania State University
Georgetown University, December 4, 2009
Structure
1. Awareness/consideration vs. evaluation
2. Two-stage models with partial observability
3. Three applications
Monica
• A fellow sociologist asked Stanley Lieberson:
Would the name Monica become more or less popular because of the scandal?
Stanley Lieberson. 2000. A Matter of Taste: How Names, Fashions, and Culture Change. New Haven, CT: Yale University Press.
Medical Innovation
• In an earlier paper, we showed that evidence of social contagion disappears after one controls for marketing effort. However, that does not mean that social contagion was not at work:
“[O]ur hazard models did not distinguish between two important stages in the adoption process: awareness followed by evaluation conditional upon awareness (Rogers 1995). … Modeling the effect of marketing efforts and social contagion without distinguishing between awareness and evaluation might produce misleading results when marketing efforts are quite important in creating awareness, and social contagion is moderately—though still sizably—important in persuading actors to adopt the innovation. When both explanatory variables are forced into a single-stage model, the weaker social contagion effect may be washed out by the marketing effort, erroneously suggesting that social contagion was not at work.”
Christophe Van den Bulte and Gary L. Lilien. 2001. “Medical Innovation Revisited: Social Contagion versus Marketing Effort.” American Journal of Sociology, 106 (March), 1409-35.
Stages in the adoption process (Rogers 1962)
1. Awareness
2. Interest
3. Evaluation
4. Trial
5. (Sustained) adoption
Awareness / consideration
Evaluation / adoption
Awareness/consideration is rarely measured
• Not overt behavior → no paper trail
• Not a memorable event → poor recall
• Multi-wave surveys:
– Expensive
– Asking the question may actually make people aware
Can we bridge the gap between theory and data using better models?
Structure
1. Awareness/consideration vs. evaluation
2. Two-stage models with partial observability
3. Three applications
Model typology
[Figure: state diagrams for three model types]
Traditional single-hurdle: Initial state, Evaluation, Adoption
Two-stage w/o memory: Initial state, Consideration, Evaluation, Adoption
Two-stage w/ memory: Initial state, Consideration, Evaluation, Adoption
Standard, single-stage hazard model
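Hazard (in discrete time)
= probability that you adopt, given that you have not adopted before

In terms of adoption time $T_i$:
$$P_{it} = \Pr[T_i = t_i \mid T_i \ge t_i]$$
$$LL = \sum_i \left[ c_i \ln \{ \Pr[T_i = t_i] \} + (1 - c_i) \ln \{ \Pr[T_i > t_i] \} \right]$$
where $c_i$ is the non-censoring indicator (it multiplies the adoption term).

In terms of the adoption indicator $y_{it}$:
$$P_{it} = \Pr[y_{it} = 1 \mid y_{i,t-1} = 0]$$
$$LL = \sum_i \sum_t [1 - d_{it}] \left[ y_{it} \ln \{ P_{it} \} + (1 - y_{it}) \ln \{ 1 - P_{it} \} \right]$$
where $d_{it}$ is the censoring indicator (censored person-periods drop out of the sum).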
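The person-period form of the log-likelihood can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function name `hazard_loglik` and its arguments are hypothetical, and the hazards are passed in directly rather than computed from covariates.

```python
import math

def hazard_loglik(hazards, adopted_at):
    """Log-likelihood contribution of one actor in a discrete-time hazard model.

    hazards    : list of P_it for the periods t = 1, 2, ... the actor is observed
    adopted_at : 1-based period of adoption, or None if right-censored
    """
    ll = 0.0
    for t, p in enumerate(hazards, start=1):
        if adopted_at is not None and t == adopted_at:
            ll += math.log(p)        # y_it = 1: adoption in period t
            break                    # the actor leaves the risk set
        ll += math.log(1.0 - p)      # y_it = 0: still at risk after period t
    return ll

# An adopter at t = 2 contributes ln{(1 - P_i1) * P_i2} = ln Pr[T_i = 2];
# a censored actor contributes ln Pr[T_i > 3].
adopter = hazard_loglik([0.1, 0.2, 0.3], adopted_at=2)
censored = hazard_loglik([0.1, 0.2, 0.3], adopted_at=None)
print(adopter, censored)
```

Summing these contributions over actors gives the same value as the adoption-time form, because Pr[T = t] is the hazard at t times the product of (1 - hazard) over earlier periods.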
Notation
Adoption
$y_{it} = 1$ if i has adopted at time t, $y_{it} = 0$ otherwise
Awareness/consideration
$a_{it} = 1$ if i is aware at time t, $a_{it} = 0$ otherwise
Positive evaluation = Adoption conditional on awareness
$e_{it} = 1$ if i evaluates product positively at time t, $e_{it} = 0$ otherwise
Two-stage model: set-up
Adoption
$$\Pr[y_{it} = 1 \mid y_{i,t-1} = 0] = \Pr[e_{it} = 1, a_{it} = 1 \mid y_{i,t-1} = 0]$$
$$= \Pr[e_{it} = 1 \mid a_{it} = 1, y_{i,t-1} = 0] \; \Pr[a_{it} = 1 \mid y_{i,t-1} = 0]$$
Awareness
$$\Pr[a_{it} = 1 \mid y_{i,t-1} = 0] = F_1(\alpha_1 x_{1it})$$
Adoption conditional on awareness
$$\Pr[e_{it} = 1 \mid a_{it} = 1, y_{i,t-1} = 0] = F_2(\alpha_2 x_{2it})$$
Case 1: Full observability
One must be in one of three states:
0: $a_{it} = 0$, $e_{it}$ not relevant (no awareness and hence no adoption)
1: $a_{it} = 1$, $e_{it} = 0$ (awareness, but no positive evaluation; hence no adoption)
2: $a_{it} = 1$, $e_{it} = 1$ (both awareness and positive evaluation, hence adoption)
Let $P_{0it}$, $P_{1it}$ and $P_{2it}$ denote the probability of being in each state, given $y_{i,t-1} = 0$:
$$P_{2it} = F_1 F_2, \qquad P_{1it} = F_1 (1 - F_2), \qquad P_{0it} = 1 - F_1$$
One can then write:
$$LL = \sum_i \sum_t [1 - d_{it}] \left[ a_{it} e_{it} \ln P_{2it} + a_{it} (1 - e_{it}) \ln P_{1it} + (1 - a_{it}) \ln P_{0it} \right]$$
Practically:
One can also estimate two hazard models separately, one for awareness and one for adoption given awareness.
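A short Python sketch shows why the two stages can be estimated separately under full observability: with state probabilities (1 - F1), F1(1 - F2) and F1·F2, the three-state log-likelihood splits into an awareness part and an evaluation part. The function names and the toy records are hypothetical, and F1 and F2 are supplied as numbers rather than derived from covariates.

```python
import math

def state_probs(f1, f2):
    """Return (P0, P1, P2): unaware, aware without adoption, aware with adoption."""
    return (1.0 - f1, f1 * (1.0 - f2), f1 * f2)

def full_obs_loglik(records):
    """records: (a_it, e_it, F1, F2) for each uncensored at-risk person-period.
    Use e_it = 0 when a_it = 0 (evaluation is not relevant without awareness)."""
    ll = 0.0
    for a, e, f1, f2 in records:
        p0, p1, p2 = state_probs(f1, f2)
        ll += a * e * math.log(p2) + a * (1 - e) * math.log(p1) + (1 - a) * math.log(p0)
    return ll

def separate_loglik(records):
    """Same quantity written as an awareness model plus an evaluation model
    fitted only on the person-periods with a_it = 1."""
    ll_aware = sum(a * math.log(f1) + (1 - a) * math.log(1.0 - f1)
                   for a, _, f1, _ in records)
    ll_eval = sum(e * math.log(f2) + (1 - e) * math.log(1.0 - f2)
                  for a, e, _, f2 in records if a == 1)
    return ll_aware + ll_eval

toy = [(0, 0, 0.3, 0.5), (1, 0, 0.4, 0.2), (1, 1, 0.6, 0.7)]
print(full_obs_loglik(toy), separate_loglik(toy))
```

The two functions return the same number for any records, which is the algebra behind fitting one hazard model for awareness and one for adoption given awareness.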
Case 2: Partial observability, no memory
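The researcher does not observe $a_{it}$ and $e_{it}$ separately, but observes only their product:
$$a_{it} e_{it} = y_{it}$$
The researcher cannot estimate the same LL as before, but must collapse states 0 and 1 into a single "no adoption" outcome with probability $1 - P_{2it}$:
$$LL = \sum_i \sum_t [1 - d_{it}] \left[ a_{it} e_{it} \ln P_{2it} + a_{it} (1 - e_{it}) \ln P_{1it} + (1 - a_{it}) \ln P_{0it} \right]$$
becomes
$$LL = \sum_i \sum_t [1 - d_{it}] \left[ y_{it} \ln P_{2it} + (1 - y_{it}) \ln \{ 1 - P_{2it} \} \right]$$
$$LL = \sum_i \sum_t [1 - d_{it}] \left[ y_{it} \ln \{ F_1 F_2 \} + (1 - y_{it}) \ln \{ 1 - F_1 F_2 \} \right]$$
Limitations:
1. What prevents someone who is aware from becoming unaware later on? Nothing in this model: awareness is drawn anew in every period.
2. Full symmetry, so only the covariates provide interpretation to the stages.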
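The contracted likelihood is short enough to write out directly. The sketch below is illustrative only: the names `cloglog_inv` and `no_memory_loglik`, the coefficients and the toy data are assumptions, and the complementary log-log link is just one possible choice for F1 and F2.

```python
import math

def cloglog_inv(z):
    """Inverse complementary log-log link: F(z) = 1 - exp(-exp(z))."""
    return 1.0 - math.exp(-math.exp(z))

def no_memory_loglik(person_periods):
    """person_periods: (y_it, F1, F2) for each uncensored at-risk person-period."""
    ll = 0.0
    for y, f1, f2 in person_periods:
        p_adopt = f1 * f2                      # only the product is observed
        ll += y * math.log(p_adopt) + (1 - y) * math.log(1.0 - p_adopt)
    return ll

# Hypothetical example: one awareness covariate x1 and one evaluation covariate x2.
data = [(0, 0.2, 1.0), (0, 0.5, 0.0), (1, 0.9, 1.0)]   # (y, x1, x2)
rows = [(y, cloglog_inv(-1.0 + 0.8 * x1), cloglog_inv(-0.5 + 0.4 * x2))
        for y, x1, x2 in data]
print(no_memory_loglik(rows))
```

Swapping F1 and F2 in every record leaves the value unchanged, which is the symmetry noted in limitation 2: the likelihood itself cannot tell which factor is "awareness" and which is "evaluation".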
Case 3: Partial observability, perfect memory
The trick is to keep track of the many ways in which someone can end up adopting at time t.
Someone who adopts at time 1 must have become aware and evaluated positively at time 1:
$$\Pr(T = 1) = F_1(1) F_2(1)$$
Someone who adopts at time 2 may:
• Have become aware at time 1, but evaluated positively only at time 2
• Have become aware only at time 2, and immediately evaluated positively
$$\Pr(T = 2) = F_1(1) [1 - F_2(1)] F_2(2) + [1 - F_1(1)] F_1(2) F_2(2)$$
$$= F_2(2) \{ F_1(1) [1 - F_2(1)] + [1 - F_1(1)] F_1(2) \}$$
In general, we can write:
$$
\begin{aligned}
\Pr(T = t) = F_2(t) \Big\{ & F_1(1) [1 - F_2(1)] [1 - F_2(2)] [1 - F_2(3)] \cdots [1 - F_2(t-1)] \\
& + [1 - F_1(1)] F_1(2) [1 - F_2(2)] [1 - F_2(3)] \cdots [1 - F_2(t-1)] \\
& + [1 - F_1(1)] [1 - F_1(2)] F_1(3) [1 - F_2(3)] \cdots [1 - F_2(t-1)] \\
& + \cdots \\
& + [1 - F_1(1)] [1 - F_1(2)] [1 - F_1(3)] \cdots [1 - F_1(t-1)] F_1(t) \Big\} \\
= F_2(t) \sum_{s \le t} & \Big\{ \prod_{k < s} [1 - F_1(k)] \Big\} F_1(s) \Big\{ \prod_{s \le q < t} [1 - F_2(q)] \Big\}
\end{aligned}
$$
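The general expression for Pr(T = t) translates almost literally into code: sum over the period s in which awareness arrives, multiply the probability of staying unaware before s, becoming aware at s, and not evaluating positively from s up to t - 1, then multiply by F2(t). The function name `prob_adopt_at` and the list-based inputs are assumptions made for this sketch.

```python
def prob_adopt_at(t, F1, F2):
    """Pr(T = t) under partial observability with perfect memory.

    F1, F2 : lists of per-period probabilities, where F1[k - 1] is F1(k)
             and F2[q - 1] is F2(q); t is a 1-based period.
    """
    total = 0.0
    for s in range(1, t + 1):              # s = period in which awareness arrives
        unaware_before_s = 1.0
        for k in range(1, s):
            unaware_before_s *= 1.0 - F1[k - 1]
        no_positive_eval = 1.0
        for q in range(s, t):              # aware since s, but no adoption through t - 1
            no_positive_eval *= 1.0 - F2[q - 1]
        total += unaware_before_s * F1[s - 1] * no_positive_eval
    return F2[t - 1] * total

F1 = [0.3, 0.5, 0.4]
F2 = [0.6, 0.2, 0.7]
print([prob_adopt_at(t, F1, F2) for t in (1, 2, 3)])
```

For t = 1 and t = 2 the function reduces to the two special cases written out above, which is a quick way to check an implementation.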
Case 3: Partial observability, perfect memory (ct’d)
Also, we can write:
$$\Pr(T > t) = \prod_{p \le t} [1 - F_1(p)] + [1 - F_2(t)] \sum_{s \le t} \Big\{ \prod_{k < s} [1 - F_1(k)] \Big\} F_1(s) \Big\{ \prod_{s \le q < t} [1 - F_2(q)] \Big\}$$
Having expressions for both Pr(T = t) and Pr(T > t), we can plug them into the general formula for hazard models:
$$LL = \sum_i \left[ c_i \ln \{ \Pr[T_i = t_i] \} + (1 - c_i) \ln \{ \Pr[T_i > t_i] \} \right]$$
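One way to compute both Pr(T = t) and Pr(T > t) without re-evaluating the nested sums is a forward recursion that carries two quantities from period to period: the probability of still being unaware, and the probability of being aware but not yet having adopted. This is an equivalent bookkeeping device chosen for the sketch, not a procedure described on the slides; all names are hypothetical, and c_i = 1 marks an observed adoption.

```python
import math

def pmf_and_survival(F1, F2):
    """Return lists (Pr(T = t), Pr(T > t)) for t = 1..len(F1), perfect memory."""
    unaware, aware = 1.0, 0.0          # state probabilities before period 1
    pmf, surv = [], []
    for f1, f2 in zip(F1, F2):
        pool = aware + unaware * f1    # aware during this period (old + newly aware)
        pmf.append(pool * f2)          # positive evaluation now, hence adoption
        aware = pool * (1.0 - f2)      # aware, still not adopted
        unaware *= 1.0 - f1            # never became aware so far
        surv.append(unaware + aware)
    return pmf, surv

def perfect_memory_loglik(actors):
    """actors: (t_i, c_i, F1_i, F2_i), with c_i = 1 if adoption at t_i is observed
    and c_i = 0 if the actor is censored at t_i."""
    ll = 0.0
    for t, c, F1, F2 in actors:
        pmf, surv = pmf_and_survival(F1[:t], F2[:t])
        ll += math.log(pmf[t - 1]) if c == 1 else math.log(surv[t - 1])
    return ll

F1 = [0.3, 0.5, 0.4]
F2 = [0.6, 0.2, 0.7]
print(perfect_memory_loglik([(2, 1, F1, F2), (3, 0, F1, F2)]))
```

A useful sanity check is that the adoption probabilities up to t plus Pr(T > t) sum to one for every t.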
Estimating the models with standard software
Single-stage: any binary dependent variable (BDV) software
2-stage w/o memory: Limdep ("Abowd-Farber probit"), or a few lines of code in SAS or Stata
2-stage w/ memory: not as handy to code in "canned" statistical software, but can be coded rather easily in Excel
Structure
1. Awareness/consideration vs. evaluation
2. Two-stage models with partial observability
3. Three applications
Application I: Medical Innovation
Important study
Good test case
Strong marketing effects
Weak contagion effects, but probably still effects
Data on tetracycline adoption
Monthly, November 1953-February 1955 (first 17 mos.)
121 physicians in 4 small Midwestern cities
87% (105) had adopted by end of observation period
Data collected by Coleman, Katz and Menzel; covariates focus on personal characteristics and social networks
Additional archival data on marketing effort (advertising in 4 journals)
Coleman, Katz and Menzel 1966; Burt 1987; Marsden and Podolny 1990; Strang and Tuma 1993; Valente 1996; Van den Bulte and Lilien 2001
Covariates
Awareness
• Number of journals (log)
• Science orientation
• Advertising: $M_t = m_t + (1 - d) M_{t-1}$
• Advisor status
• Advisor status x Advertising
Evaluation / Adoption
• Summer (dummy)
• Age and Age²
• Chief / admin / honorary
• Science orientation
• Social network exposure: $SNE_{it} = [ \sum_j w_{ij} y_{j,t-1} ]^{g}$
• Advisor status
• Advisor status x SNE
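The two constructed covariates, the advertising stock and social network exposure, are simple to compute. The sketch below is illustrative: the function names are hypothetical, the stock is assumed to start at zero before the first period (the slides do not state an initial value), and the weights and adoption indicators are toy numbers.

```python
def ad_stock(m, d):
    """Advertising stock M_t = m_t + (1 - d) * M_{t-1}, with decay rate d.
    Assumption: M_0 = 0 before the first observed period."""
    stock, prev = [], 0.0
    for m_t in m:
        prev = m_t + (1.0 - d) * prev
        stock.append(prev)
    return stock

def social_network_exposure(w_i, y_prev, g=1.0):
    """SNE_it = [sum_j w_ij * y_{j,t-1}] ** g.

    w_i    : weights w_ij of actor i's ties to each other actor j
    y_prev : adoption indicators y_{j,t-1} of those actors in the previous period
    g      : exponent allowing a non-linear exposure effect (g = 1 is linear)
    """
    return sum(w * y for w, y in zip(w_i, y_prev)) ** g

print(ad_stock([1.0, 0.0, 0.0], d=0.5))
print(social_network_exposure([0.5, 0.5, 0.0], [1, 0, 1], g=2.0))
```

With d = 0.5, a single burst of advertising in the first period halves in each later period; with d = 1 there is no carry-over at all.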
Medical Innovation application: Results for models with social contagion from direct ties

Variable                          | Single-Stage Model | Two-Stage: Zero Memory | Two-Stage: Perfect Memory
Awareness covariates
Intercept                         | -4.14 ****         | -4.12 ***              | -3.47 ****
Number of journals (log)          | 0.86 ***           | 0.73 **                | 0.68 **
Science orientation               | 1.05 ****          | 1.15 ****              | 0.89 ***
Marketing effort                  | 3.76 ***           | 4.77 ****              | 3.22 ***
Decay rate (d)                    | 0.26               | 0.22 **                | 0.40 ***
Advisor status (indegree)         | -0.06              | -0.04                  | -0.10
Advisor status x Marketing effort | 0.42               | 0.07                   | 0.58
Evaluation / adoption covariates
Intercept                         | …                  | 2.52                   | -0.57
Summer                            | -0.77 *            | -2.95 **               | -1.62 **
Age                               | -0.13 *            | -0.64 **               | -0.47 **
Age²                              | -0.11 **           | -0.58 *                | -0.38
Chief                             | -0.90 **           | -10.90 **              | -8.95 ***
Science orientation               | …                  | -1.02                  | 0.60
SNE (Direct ties) a               | 1.19               | 4.95 *                 | 2.98 **
g                                 | 6.26 **            | 12.32 b                | 9.05 **
Advisor status (indegree)         | …                  | 3.71 ***               | 3.25 ****
Advisor status x SNE              | -0.13 **           | -0.05                  | 0.72
Fit
-2LL                              | 600.34             | 591.09                 | 583.33
df                                | 14                 | 17                     | 17
AIC                               | 628.34             | 625.09                 | 617.33
BIC                               | 665.50             | 670.21                 | 662.45

Note: Results are from complementary log-log models. The significance levels reported are for likelihood ratio tests that the parameter of interest is zero, except for tests of g, where the test is g = 1.
a SNE stands for social network exposure.
b Nested model with g = 1 does not converge.
* P < .10; ** P < .05; *** P < .01; **** P < .001
Medical Innovation application: Results for models with social contagion from structural equivalents

Variable                          | Single-Stage Model | Two-Stage: Zero Memory | Two-Stage: Perfect Memory
Awareness covariates
Intercept                         | -4.13 ****         | -3.49 ****             | -3.49 ****
Number of journals (log)          | 0.87 ***           | 0.59 **                | 0.68 **
Science orientation               | 1.08 ****          | 0.88 ***               | 0.97 ****
Marketing effort                  | 3.30 ***           | 4.56 ****              | 2.86 ***
Decay rate (d)                    | 0.28 ***           | 0.31 **                | 0.40 **
Advisor status (indegree)         | -0.07              | -0.07 **               | -0.10
Advisor status x Marketing effort | 0.62 *             | 0.26                   | 0.61
Evaluation / adoption covariates
Intercept                         | …                  | 0.07                   | -0.37
Summer                            | -0.83 *            | -2.67 **               | -2.00 ***
Age                               | -0.11              | -0.37 **               | -0.41 **
Age²                              | -0.10 **           | -0.35 ***              | -0.49 ***
Chief                             | -0.90 **           | -5.44 ***              | -6.56 ***
Science orientation               | …                  | -0.54                  | 0.40
SNE (Structural equivalence) a    | 0.51               | 1.83 **                | 1.58 **
g                                 | 1.79               | 3.33 b                 | 3.05
Advisor status (indegree)         | …                  | 2.34 ***               | 2.77 ****
Advisor status x SNE              | -0.10 **           | 0.07                   | 0.30
Fit
-2LL                              | 603.56             | 591.82                 | 585.01
df                                | 14                 | 17                     | 17
AIC                               | 631.56             | 625.82                 | 619.01
BIC                               | 668.72             | 670.94                 | 664.13

Note: Results are from complementary log-log models. The significance levels reported are for likelihood ratio tests that the parameter of interest is zero, except for tests of g, where the test is g = 1.
a SNE stands for social network exposure.
b Nested model with g = 1 does not converge.
* P < .10; ** P < .05; *** P < .01; **** P < .001
Results
Fit across three models
Evidence of contagion
• Effect's significance across models
• Non-linearity effect differs between cohesion and equivalence
Other
• Advertising decay rate across models
• Very large effect of chief / admin / honorary position
• Effects may vary across stages
– Science orientation
– Advisor status
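The fit comparison across the three models rests on AIC = -2LL + 2·df and BIC = -2LL + df·ln(n). The sketch below recomputes both from the -2LL and df values reported for the direct-ties models. The sample size n entering BIC is not stated on the slides; n = 105 (the number of adopters) is an assumption, chosen because it reproduces the reported BIC values to two decimals.

```python
import math

def aic(neg2ll, df):
    """Akaike information criterion from -2 log-likelihood and parameter count."""
    return neg2ll + 2 * df

def bic(neg2ll, df, n):
    """Bayesian information criterion; n is the sample size used in the penalty."""
    return neg2ll + df * math.log(n)

# (-2LL, df) as reported for the models with contagion from direct ties.
models = {
    "single-stage": (600.34, 14),
    "two-stage, zero memory": (591.09, 17),
    "two-stage, perfect memory": (583.33, 17),
}
N_ASSUMED = 105  # assumption: number of adopters; not stated in the source

for name, (neg2ll, df) in models.items():
    print(f"{name}: AIC = {aic(neg2ll, df):.2f}, BIC = {bic(neg2ll, df, N_ASSUMED):.2f}")
```

Lower values indicate better fit after penalizing parameters. BIC penalizes the three extra parameters of the two-stage models more heavily than AIC does, which is why the two criteria can rank the single-stage and zero-memory models differently.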
Application II: a new drug
Some key differences w/ previous study
Higher risk and ambiguity
• Life-threatening condition if left untreated
• More complex treatment plans
More detailed data
• Data on self-reported vs. sociometric leadership
• Data on prescription volume after adoption
• Data on sales calls, by month-physician
Data on new drug adoption
Monthly, 2005-2007 (first 17 mos.)
193 physicians in 3 large cities
• Only prescribers of existing drugs for same medical condition
35% (68) had adopted by end of observation period
Network data
• Sociometric survey
• Discussion and patient referral ties
• Response rate 45%, 32%, and 25% → we cannot properly identify positional equivalence
• Prescription data for both respondents and non-respondents → we can properly identify contagion through direct contacts
Sales call data
• Number of details for focal drug, by physician and by month
Data from Iyengar, Van den Bulte and Valente, MSI Report 08-120.
Covariates
Awareness
• Sociometric in-degree
• Self-reported opinion leadership
• Primary practice with university/teaching hospital
• Not a specialist (but primary care)
• Patient volume (# patients seen with medical condition)
• Tendency to refer patients before initiating treatment
• City dummies
• Detailing: $M_{it} = m_{it} + (1 - d) M_{i,t-1}$
Evaluation / Adoption
• Sociometric in-degree
• Self-reported opinion leadership
• Primary practice with university/teaching hospital
• Patient volume (# patients seen with medical condition)
• Tendency to refer patients before initiating treatment
• Detailing: $M_{it} = m_{it} + (1 - d) M_{i,t-1}$
• Social network exposure: $SNE_{it} = [ \sum_j w_{ij} q_{j,t-1} ]$
Application to new drug: Results for models with social contagion from direct ties

Variable                     | Single-Stage Model | Two-Stage: Zero Memory | Two-Stage: Perfect Memory
Awareness covariates
Intercept                    | -2.57 ***          | -0.56                  | -1.30 ***
City 2                       | -0.03              | 0.14                   | 0.09
City 3                       | -0.14              | -0.18                  | -0.23
Sociometric status           | 0.18 ***           | 0.17 ***               | 0.20 ***
Self-reported status         | 0.06               | -0.18                  | -0.13
University/teaching hospital | 0.28 *             | 0.91 ***               | 0.79 ***
Non-specialist               | -0.31              | -0.44                  | -0.50
Patient volume (/100)        | 0.05               | -0.06                  | 0.00
Early referral               | -0.24              | -0.46                  | -0.35
Marketing effort             | 0.20 ***           | 0.06                   | 0.12 ***
Decay rate (d)               | 0.49 ***           | 0.44 **                | 0.37 ***
Evaluation / adoption covariates
Intercept                    | …                  | -4.30 ***              | -4.85 ***
Sociometric status           | …                  | 0.21                   | 0.13
Self-reported status         | …                  | 0.50 **                | 0.50 **
University/teaching hospital | …                  | -0.75                  | -0.79 **
Patient volume (/100)        | …                  | 0.98 *                 | 3.32 ***
Early referral               | …                  | 0.76                   | 0.68
Marketing effort             | …                  | 1.13 ***               | 1.10 **
SNE (Direct ties) a          | 0.01 **            | 0.06 ***               | 0.05 ***
Fit
-2LL                         | 500.7              | 460.0                  | 463.0
df                           | 12                 | 19                     | 19
AIC                          | 524.7              | 498.0                  | 501.0
BIC                          | 551.3              | 540.2                  | 543.2

Note: Results are from probit models. The significance levels reported for single-stage and zero-memory models are from Wald tests that the parameter of interest is zero.
a SNE stands for social network exposure.
* P < .10; ** P < .05; *** P < .01
Results
Fit across three models
Evidence of marketing effort
• Important in both stages
• In this application, decay rate does not increase across models
Measures of opinion leadership
• Sociometric leadership: only in awareness/consideration
• Self-reported leadership: only in evaluation
Other effects may vary across stages as well
• University/teaching hospital
Application III: ATM adoption
Data set analyzed several times in economics
Good test case
Efficiency effects have been documented
Legitimation effects unknown (though expected)
Measures of both local and global density
Hannan and McDowell 1984a, 1984b, 1987; Saloner and Shepard 1995; Sinha and Chandrashekaran 1992
Data on ATM adoption
Annual, 1971-79 (first nine years of ATM use in U.S.A.)
3683 banks in operation for the entire nine-year period
392 different local banking markets
20% (739) of banks had adopted by end of 1979
Data collected by Federal Reserve; covariates focus on market structure, bank size, profitability and type.
Covariates
Consideration
• Global density (prior adoptions across U.S.)
• Demand deposits as % total assets
• Market share
Efficiency (incl. rivalry)
• Local density (prior adoptions within market)
• Demand deposits as % total assets
• Market share
• Off-premise ATMs legally allowed
• Urban bank
• Average market wage rate
• Price / year dummies
• Number of banks in market
• CR3 concentration ratio
• 1-year growth in assets
• Return on assets
• Total assets
• Ownership by bank holding company
[Slide callouts grouping the efficiency covariates: Comp. intensity, Ability to pay, Economic value, Rival precedence]
ATM application: Results

Variable            | Single-Stage: Fixed base rate | Single-Stage: Flexible base rate | Two-Stage: Zero Memory | Two-Stage: Perfect Memory
Consideration covariates
Intercept           | -2.551 ***                    | -2.261 ***                       | -1.953 ***             | -1.928 ***
Demand deposits     | …                             | …                                | -0.146                 | -0.120
Market share (log)  | …                             | …                                | 0.182 ***              | 0.164 ***
Global density      | 0.979                         | …                                | 5.015 ***              | 4.198 ***
Efficiency covariates
Intercept           | …                             | …                                | -1.322 ***             | -1.855 ***
1972                | …                             | -0.749 ***                       | 0.041                  | 0.492
1973                | …                             | -0.558 ***                       | 0.461                  | 0.534 *
1974                | …                             | -0.096                           | 1.470 ***              | 1.090 ***
1975                | …                             | -0.299 ***                       | 0.726 *                | 0.491 *
1976                | …                             | -0.144                           | 0.875 **               | 0.578 **
1977                | …                             | -0.078                           | 0.839 ***              | 0.517 ***
1978                | …                             | -0.246 ***                       | 0.210                  | 0.168
1979                | …                             | -0.350 ***                       | -0.185                 | -0.162
Number of banks     | 0.128 ***                     | 0.127 ***                        | 1.121 **               | 0.829 **
CR3                 | 0.244                         | 0.252                            | 0.827 *                | 0.645 *
Market growth       | -0.076                        | 0.901 **                         | 0.057                  | 0.002
ROA                 | 0.655 ***                     | 0.496 ***                        | 3.038 ***              | 2.341 ***
Total assets (log)  | 0.185 ***                     | 0.191 ***                        | 0.643 ***              | 0.535 ***
BHC                 | 0.158 ***                     | 0.155 ***                        | 0.132                  | 0.133
Off premise         | 0.278 ***                     | 0.285 ***                        | 0.778 ***              | 0.503 ***
Urban               | 0.116                         | 0.126 *                          | 0.084                  | 0.080
Wage rate           | 0.049 **                      | 0.040 *                          | 0.043                  | 0.028
Demand deposits     | 0.342 *                       | 0.378 *                          | 1.261                  | 0.888
Market share (log)  | 0.114 ***                     | 0.110 ***                        | -0.014                 | 0.006
Local density       | 0.966 ***                     | 0.951 ***                        | 1.508 **               | 1.097 ***
Fit
-2LL                | 6013.54                       | 5947.70                          | 5888.56                | 5843.58
df                  | 14                            | 21                               | 25                     | 25
AIC                 | 6041.54                       | 5989.70                          | 5938.56                | 5893.58
BIC                 | 6106.01                       | 6086.41                          | 6053.69                | 6008.71

Note: Results are from probit models. The significance levels reported are for likelihood ratio tests that the parameter of interest is zero.
* P < .05; ** P < .01; *** P < .001
Results
Fit across three models
Evidence of legitimacy effect
• Effect of global density across models
Other
• Some efficiency effects that are marginally significant in single-stage model disappear in two-stage models
• Effects may vary across stages
– Market share
Conclusion
Using models to bridge gap in richness between theory and data
Areas of use
• Mass media effects vs. network effects
• Legitimation processes in neo-institutional theory
• Deviance (deviant behavior vs. detection)
• Discrimination (selective application vs. discrimination)
• Life course research (e.g., sexual intercourse vs. pregnancy)
No free lunch
• Need for data on time of separate transitions is substituted by need for good covariates and theory