Based in part on joint work with: Denis Nekipelov (UVa, MSR) Guido Imbens (Stanford GSB) Stefan Wager (Stanford) Dean Eckles (MIT) Data-Driven Market Design Susan Athey, The Economics of Technology Professor, Stanford GSB Consulting researcher, Microsoft Research


Post on 17-Jan-2016


TRANSCRIPT


Based in part on joint work with: Denis Nekipelov (UVa, MSR), Guido Imbens (Stanford GSB), Stefan Wager (Stanford), Dean Eckles (MIT)

Data-Driven Market Design

Susan Athey, The Economics of Technology Professor, Stanford GSB

Consulting researcher, Microsoft Research


Introduction

Marketplaces

• Uber, Lyft, Airbnb, TaskRabbit, Rover, Zeel, Urbansitter

• Two groups of customers

– Cross-side network effects

Auction-based platforms

• Online advertising

• Used cars

• eBay

Market design matters


Market Design Examples: Short Run v. Long Run

eBay examples

• Eliminate listing fees

• Make pictures free

• Change search algorithm

– Emphasize price

– Force sellers into more uniform categories

• Shipping costs

Search advertising examples

• Change broad match criteria for looser matching

• Change pricing

See, e.g.:

“Asymmetric Information, Adverse Selection and Online Disclosure: The Case of eBay Motors,” Lewis, 2014

“Consumer Price Search and Platform Design in Internet Commerce,” Dinerstein, Einav, Levin, Sundaresan, 2014

“A Structural Model of Sponsored Search Auctions,” Athey & Nekipelov, 2012

STANFORD GRADUATE SCHOOL OF BUSINESS


Influencing Market Design

Theoretical framework is key

• Makes arguments coherent and precise

• Identifies equilibrium/long-term effects

• Advertiser/sellers and consumer choices incorporated

Complement with data


Data, Experimentation, and Counterfactuals

Data can inform what kinds of designs will work better or worse in a range of environments similar to the existing one

Advocacy for design issues is much more effective with theory and data combined

Experimentation is crucial but also has limitations

• Short-term experiments can’t show long-term outcomes, feedback effects

• Can help gain insight by analyzing heterogeneity of effects

One part of empirical economics imposes structure on empirical analysis in order to learn model “primitives” and perform “counterfactuals”

• Learn bidder values, predict equilibrium responses

• Map between short-run user experience and long-term willingness to click


A/B Testing

[Diagram: 100% of users are split 50% / 50% between Control (existing system) and Treatment (modified system). User interactions are instrumented, analyzed, and compared: control average outcome versus treatment average outcome. Results analysis feeds future design decisions.]
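The comparison of control and treatment average outcomes can be sketched as a difference in means with a two-sample standard error. This is a minimal illustration, not the analysis used in the talk; the function name `ab_test` and the outcome values are made up for the example.

```python
import math
from statistics import mean, variance


def ab_test(control, treatment):
    """Return (difference in average outcomes, its standard error)."""
    diff = mean(treatment) - mean(control)
    # Unpooled two-sample standard error (sample variances, n - 1 denominator).
    se = math.sqrt(variance(control) / len(control)
                   + variance(treatment) / len(treatment))
    return diff, se


# Hypothetical per-user outcomes (e.g. clicks), purely illustrative.
control = [1.0, 2.0, 3.0, 2.0]
treatment = [2.0, 3.0, 4.0, 3.0]

diff, se = ab_test(control, treatment)
print(f"treatment - control = {diff:.3f} (se {se:.3f})")
```

The standard error here treats users as independent units, which is exactly the assumption the following slides question for marketplaces.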


Short Term A/B Tests Have Limitations: Search Advertising Case Study

Standard industry practice

– Run A/B tests on users or page views to test new algorithms

– Make ship decision based on revenue & user impact

Multiple errors

– Unit of experimentation is not unit of analysis

• Each advertiser only sees change on 1% of traffic

– Interaction effects ignored, fixing bids & budgets

• When released to market, other advertisers might hit budget constraints, causing a given advertiser to hit their own

– Reactions of bids and budgets ignored

– Long term participation ignored

Short Term A/B Tests Have Limitations: Search Advertising Case Study

Cheaper fixes (motivated by theory)

- Modify evaluation criteria (short term metrics)

- Instead of measuring actual short-term performance, focus on part that is correlated with long-term

- E.g. only count good clicks

- Theory: advertisers won’t pay for bad clicks in equilibrium

- Do a small number of long term studies to relate short term metrics to long term

- Up-front study only captures responsiveness to types of changes observed in study; can’t answer all questions

- Add in constraints that “protect” advertisers

- E.g. constrain price increases at the advertiser level

- But advertisers very heterogeneous in preferences and responsiveness

Short Term A/B Tests Have Limitations: Search Advertising Case Study

Expensive solution I: Long-term advertiser-based A/B test

- Apply treatments to randomly selected advertisers (stratify)

- Watch for a long time

- Problems:

• Ignores advertiser interactions!!

• Unclear whether conclusions will generalize once other advertisers respond

• Expensive and disruptive to advertisers under experimentation

• Takes a long time (advertisers respond slowly), interferes with ongoing innovation


eCommerce Markets as Bipartite Graphs

[Figure: bipartite graph linking queries (Query A, Query B, Query C) to sellers.]


Testing Interactions: Athey, Eckles, Imbens (2015)

• How to do inference properly for various hypotheses?

– Method for exact p-values for a class of non-sharp null hypotheses

• Exact: no large sample approximations

• Sharp: outcome for each unit known under the null for different treatment assignments

– Test hypotheses of the form

• “Treating units j related to unit i in way Z has no effect on i”

• “Only fraction of neighbors treated matters, not identity or their network position”

– Novel insight

• Can turn non-sharp null into a sharp one by defining an artificial experiment and analyzing only a subset of the units as “focal” units

• Best ways to analyze existing experiments

– Most powerful test statistics

• Develop model-based test statistics that perform better than commonly used heuristics

• Insight: write down a structural model and use score-based test

– How to use exogenous variation that is in the data most effectively


Testing Hypotheses About Friends of Friends

[Figure: network with nodes A through I. A and B are focal units. C is auxiliary to focal unit A, G is auxiliary to focal unit B, and D and F are auxiliary to focal units A and B. The remaining nodes (E, H, I) are buffers: one for focal unit A, one for focal unit B, and one for both.]

Aux FOF treat v. control: A has C, F v. D; B has F v. D, G

Focal units:

Unit   Yi(0 FOF*)   Yi(>1 FOF*)
A      3            3
B      2            2

*Holding fixed own treatment and friends’ treatment

Auxiliary units, observed assignment (Aux Wi), and alternative assignments of FOF Wi:

Aux Unit      Aux Wi   1     2     3     4     5     6
C             1        1     0     1     0     0     1
D             0        1     0     0     1     1     0
F             1        0     1     0     1     0     1
G             0        0     1     1     0     1     0
Probabilities          1/6   1/6   1/6   1/6   1/6   1/6

Test statistic: edge-level contrast for FOF links between focal and auxiliary units

Observed: 1/3

Alternative 1: 8/3 - 7/3 = 1/3

Alternative 2: 7/3 - 8/3 = -1/3

Alternative 3: 5/2 - 5/2 = 0

Alternative 4: 5/2 - 5/2 = 0

Alternative 5: 7/3 - 8/3 = -1/3

Alternative 6: 8/3 - 7/3 = 1/3
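The edge-level contrasts on this slide can be reproduced by enumerating the six alternative assignments of the auxiliary units. The sketch below uses the slide's numbers (focal outcomes 3 and 2, FOF links A to C, D, F and B to D, F, G). The two-sided p-value at the end is one common way to turn the randomization distribution into a test and is an assumption here, not something stated on the slide.

```python
from fractions import Fraction

# Focal-unit outcomes; identical under the null of no friends-of-friends effect.
outcomes = {"A": 3, "B": 2}

# FOF links between focal and auxiliary units, as listed on the slide.
fof_links = [("A", "C"), ("A", "D"), ("A", "F"),
             ("B", "D"), ("B", "F"), ("B", "G")]

# Observed treatment of the auxiliary units.
observed = {"C": 1, "D": 0, "F": 1, "G": 0}

# The six equally likely alternative assignments from the slide's table.
alternatives = [
    {"C": 1, "D": 1, "F": 0, "G": 0},
    {"C": 0, "D": 0, "F": 1, "G": 1},
    {"C": 1, "D": 0, "F": 0, "G": 1},
    {"C": 0, "D": 1, "F": 1, "G": 0},
    {"C": 0, "D": 1, "F": 0, "G": 1},
    {"C": 1, "D": 0, "F": 1, "G": 0},
]


def edge_level_contrast(assignment):
    """Mean focal outcome over treated FOF links minus mean over control links."""
    treated = [outcomes[f] for f, aux in fof_links if assignment[aux] == 1]
    control = [outcomes[f] for f, aux in fof_links if assignment[aux] == 0]
    return (Fraction(sum(treated), len(treated))
            - Fraction(sum(control), len(control)))


t_obs = edge_level_contrast(observed)
draws = [edge_level_contrast(w) for w in alternatives]

# Assumed two-sided exact p-value: share of draws at least as extreme as observed.
p_value = Fraction(sum(abs(t) >= abs(t_obs) for t in draws), len(draws))
print(t_obs, [str(t) for t in draws], p_value)
```

Because the focal outcomes are held fixed and only the auxiliary assignments vary, the null becomes sharp for the focal units, which is the "artificial experiment" idea from the previous slide.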

Short Term A/B Tests Have Limitations: Search Advertising Case Study

Expensive solution II: Long-term market-based A/B test


Community Detection

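Modularity-based algorithm identifies clusters. Modularity is built from two quantities per community (the formula and its symbols are missing from this transcript):

– fraction of links with both nodes inside the community

– number of links with at least one node in community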
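Since the slide's formula is not recoverable, the sketch below uses the standard Newman-Girvan modularity, Q = sum over communities of (fraction of links inside the community) minus (fraction of link endpoints in the community) squared. That choice is an assumption and may differ in detail from the definition on the original slide; the toy graph is made up.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community label.
    """
    m = len(edges)
    internal = defaultdict(int)   # links with both nodes inside community c
    endpoints = defaultdict(int)  # link endpoints that fall in community c
    for u, v in edges:
        endpoints[community[u]] += 1
        endpoints[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (endpoints[c] / (2 * m)) ** 2
               for c in endpoints)


# Toy graph: two triangles joined by a single bridge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
community = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}

q = modularity(edges, community)
print(round(q, 4))
```

A modularity-based clustering algorithm searches over the `community` mapping for a partition with high Q; putting every node in one community gives Q = 0.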


Clustered Randomization and Community-Level Analysis

[Figure: clusters of the network, labeled Treated, Treated, and Control.]

Ugander et al (2013) proposal:

• Define “treatment” as having high share of friends treated

• Use propensity score weighting to adjust for non-random assignment to this condition

This is an open research area
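To see why weighting is needed, the sketch below computes, for a toy network, each unit's probability of having a high share of friends treated when whole clusters are randomized. The graph, the clusters, the 0.75 threshold, and the 1/2 treatment probability are all illustrative assumptions; the function names are hypothetical and this is not the procedure from Ugander et al.

```python
from itertools import product


def share_of_friends_treated(friends, treated_nodes):
    """Fraction of each unit's friends that are treated."""
    return {i: sum(j in treated_nodes for j in nbrs) / len(nbrs)
            for i, nbrs in friends.items()}


def exposure_probability(friends, clusters, threshold):
    """P(share of friends treated >= threshold) per unit, by exact enumeration,
    assuming each cluster is treated independently with probability 1/2."""
    names = sorted(clusters)
    counts = {i: 0 for i in friends}
    for flips in product([0, 1], repeat=len(names)):
        treated = {node for name, flag in zip(names, flips) if flag
                   for node in clusters[name]}
        shares = share_of_friends_treated(friends, treated)
        for i in friends:
            counts[i] += shares[i] >= threshold
    return {i: counts[i] / 2 ** len(names) for i in friends}


# Toy network: units 1-3 form cluster "a", units 4-5 form cluster "b".
friends = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
clusters = {"a": [1, 2, 3], "b": [4, 5]}

probs = exposure_probability(friends, clusters, threshold=0.75)
print(probs)
```

Units on a cluster boundary (3 and 4 here) reach the exposure condition less often than interior units, so the condition is not randomly assigned across units; inverse weighting by these probabilities is one way to adjust.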

Short Term A/B Tests Have Limitations: Search Advertising Case Study

Expensive solution II: Long-term market-based A/B test

- Cluster advertisers

- Bipartite graph weighting links between advertisers and queries

- Weights are clicks or revenue

- Issue: spillovers large

- Experiment at cluster level

- Watch for a long time

- Problems:

• Expensive and disruptive to advertisers under experimentation

• Takes a long time (advertisers respond slowly), interferes with ongoing innovation


Further Approaches

1. Gain insight from A/B tests by studying heterogeneity of effects

– Understand how innovation affects different advertisers and queries

– Relate to other work showing which types of advertisers are most responsive

– Can also explore variety of scenarios interacted with advertiser characteristics


2. Build a structural econometric model and do counterfactual predictions

– Requires investment in model development and validation

– Relies on assumptions, may not be accepted even if validated


Experiments and Data-Mining

• Concerns about ex-post “data-mining”

– In medicine, scholars required to pre-specify analysis plan

– In economics, calls for similar protocols

• But how is researcher to predict all forms of heterogeneity in an environment with many covariates?

• Goal of Athey & Imbens 2015, Wager & Athey 2015:

– Allow researcher to specify set of potential covariates

– Data-driven search for heterogeneity in causal effects with valid standard errors

– See also Langford et al, Multi-World Testing


Segments versus Personalized Predictions

Segments with similar treatment effects

• Data-driven search for subgroups

• Pro: Interpretability, communicability, policy, inference with moderate sample sizes

• Con: Not the best predictor for any individual; segments unstable

Fully personalized predictions

• Non-parametric estimator of treatment effect as function of covariates

• Pro: Best possible prediction for each individual, and can do inference as per Wager-Athey 2015

• Con: Confidence intervals have poor coverage with too many covariates; hard to interpret and communicate

Our contributions:

• Optimize existing ML methods for each of these goals

• Deal with issue: no observed ground truth

• Methods provide valid confidence intervals without sparsity


Regression Trees for Prediction


Using Trees to Estimate Causal Effects

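Model: potential outcomes $Y_i(0)$ and $Y_i(1)$; observed outcome $Y_i = Y_i(W_i)$

• Random assignment of $W_i$

• Want to predict individual i’s treatment effect $\tau_i = Y_i(1) - Y_i(0)$

– This is not observed for any individual

• Let $\mu(w, x) = E[Y_i \mid W_i = w, X_i = x]$ and $\tau(x) = \mu(1, x) - \mu(0, x)$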


Using Trees to Estimate Causal Effects

• Approach 1: Analyze two groups separately

– Estimate $\hat\mu(1, x)$ using dataset where $W_i = 1$

– Estimate $\hat\mu(0, x)$ using dataset where $W_i = 0$

– Do within-group cross-validation to choose tuning parameters

– Construct prediction using $\hat\mu(1, x) - \hat\mu(0, x)$

• Approach 2: Estimate $\mu(w, x)$ using tree including both covariates

– Choose tuning parameters as usual

– Construct prediction using $\hat\mu(1, x) - \hat\mu(0, x)$

– Estimate is zero for x where tree does not split on w

Observations

• Estimation and cross-validation not optimized for goal

• Lots of segments in Approach 1: combining two distinct ways to partition the data

Problem 1: What is a candidate estimator for the individual treatment effect $\tau_i = Y_i(1) - Y_i(0)$?

Problem 2: How do you evaluate goodness of fit for tree splitting and cross-validation?

• $\tau_i$ is not observed and thus you don’t have ground truth for any unit
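A minimal sketch of Approach 1, assuming a single covariate and one-split regression trees (stumps) in place of full trees with cross-validated tuning. The helper names `fit_stump` and `two_model_effect` and the data are hypothetical; the point is only to show how two separately fitted partitions combine into more segments than either tree has.

```python
from statistics import mean


def fit_stump(xs, ys):
    """One-split least-squares regression tree on a single covariate.
    Assumes xs contains at least two distinct values."""
    best = None
    for cut in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        sse = (sum((y - mean(left)) ** 2 for y in left)
               + sum((y - mean(right)) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, cut, mean(left), mean(right))
    _, cut, low, high = best
    return lambda x: low if x < cut else high


def two_model_effect(xs, ws, ys):
    """Approach 1: fit treated and control groups separately, then difference."""
    treated = [(x, y) for x, w, y in zip(xs, ws, ys) if w == 1]
    control = [(x, y) for x, w, y in zip(xs, ws, ys) if w == 0]
    mu1 = fit_stump(*zip(*treated))
    mu0 = fit_stump(*zip(*control))
    return lambda x: mu1(x) - mu0(x)


# Made-up data: the control outcome jumps at x = 2, the treated outcome at x = 3.
xs = [1, 2, 3, 4, 1, 2, 3, 4]
ws = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 1, 1, 1, 1, 1, 3, 3]

tau_hat = two_model_effect(xs, ws, ys)
print([tau_hat(x) for x in [1, 2, 3, 4]])
```

Each stump has two leaves, yet the difference takes three distinct values across x, which illustrates the "lots of segments" observation: the splits were chosen to fit outcome levels in each group, not the treatment effect itself.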


Causal Trees

• Directly estimate treatment effects within each leaf

• Modify splitting criterion to focus on treatment effect heterogeneity

• Cross-validation criterion must estimate ground truth

– Build on statistical theory

• Honest trees: one sample to split, another to estimate effects, yields valid confidence intervals

– Anticipating honesty changes algorithms

• Result: for any ratio of covariates to observations and without sparsity assumptions, can discover meaningful heterogeneity and produce valid confidence intervals
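The honest-estimation step can be sketched as follows: take a partition that was chosen on one sample, then estimate the treatment effect and its standard error in each leaf using only a separate estimation sample. The partition `leaf_of`, the function name, and the data are hypothetical, and the sketch omits the modified splitting and cross-validation criteria that causal trees use to choose the partition.

```python
import math
from statistics import mean, variance


def honest_leaf_effects(leaf_of, estimation_sample):
    """Per-leaf treatment effect and standard error, computed only from units
    that were not used to choose the splits."""
    by_leaf = {}
    for x, w, y in estimation_sample:
        by_leaf.setdefault(leaf_of(x), {0: [], 1: []})[w].append(y)
    results = {}
    for leaf, groups in by_leaf.items():
        effect = mean(groups[1]) - mean(groups[0])
        se = math.sqrt(variance(groups[1]) / len(groups[1])
                       + variance(groups[0]) / len(groups[0]))
        results[leaf] = (effect, se)
    return results


# Hypothetical partition, assumed to have been chosen on a separate splitting sample.
def leaf_of(x):
    return "x < 3" if x < 3 else "x >= 3"


# Made-up estimation sample of (covariate, treatment, outcome) triples.
estimation_sample = [
    (1, 0, 1.0), (2, 0, 2.0), (1, 1, 1.0), (2, 1, 3.0),
    (3, 0, 1.0), (4, 0, 3.0), (3, 1, 4.0), (4, 1, 6.0),
]

effects = honest_leaf_effects(leaf_of, estimation_sample)
for leaf, (effect, se) in effects.items():
    print(f"{leaf}: effect {effect:.2f} (se {se:.2f})")
```

Because the leaves were fixed before these outcomes were looked at, the within-leaf difference in means is an ordinary randomized-experiment estimate, which is why the usual standard errors apply.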


Swapping Positions of Algo Links: Basic Results

[Bar chart: click-through rate (CTR) of the top link when moved to a lower position (US, all non-navigational queries). Position 1 (natural): 25.4%; Position 3: 11.8%; Position 5: 7.5%; Position 10: 3.8%.]

Moving a link from position 1 to position 3 decreases CTR by 13.6 percentage points
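The arithmetic behind the 13.6 figure, and the same drop expressed relative to the position 1 CTR, can be checked directly from the chart values. The relative drop is an added calculation, not a number from the slide.

```python
# CTR (%) by position for the top link, read from the chart.
ctr = {1: 25.4, 3: 11.8, 5: 7.5, 10: 3.8}

# Drop in percentage points and as a share of the position 1 CTR.
drops = {
    pos: (round(ctr[1] - value, 1), round((ctr[1] - value) / ctr[1], 3))
    for pos, value in ctr.items() if pos != 1
}

for pos, (points, relative) in drops.items():
    print(f"position {pos}: -{points} points ({relative:.1%} of position 1 CTR)")
```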


Search Experiment Tree: Effect of Demoting Top Link (Test Sample Effects)

Some data excluded with prob p(x): proportions do not match population

Highly navigational queries excluded


Use Test Sample for Segment Means & Std Errors to Avoid Bias

Variance of estimated treatment effects in training sample 2.5 times that in test sample

Test Sample                                   |  Training Sample
Treatment Effect  Standard Error  Proportion  |  Treatment Effect  Standard Error  Proportion
-0.124            0.004           0.202       |  -0.124            0.004           0.202
-0.134            0.010           0.025       |  -0.135            0.010           0.024
-0.010            0.004           0.013       |  -0.007            0.004           0.013
-0.215            0.013           0.021       |  -0.247            0.013           0.022
-0.145            0.003           0.305       |  -0.148            0.003           0.304
-0.111            0.006           0.063       |  -0.110            0.006           0.064
-0.230            0.028           0.004       |  -0.268            0.028           0.004
-0.058            0.010           0.017       |  -0.032            0.010           0.017
-0.087            0.031           0.003       |  -0.056            0.029           0.003
-0.151            0.005           0.119       |  -0.169            0.005           0.119
-0.174            0.024           0.005       |  -0.168            0.024           0.005
 0.026            0.127           0.000       |   0.286            0.124           0.000
-0.030            0.026           0.002       |  -0.009            0.025           0.002
-0.135            0.014           0.011       |  -0.114            0.015           0.010
-0.159            0.055           0.001       |  -0.143            0.053           0.001
-0.014            0.026           0.001       |   0.008            0.050           0.000
-0.081            0.012           0.013       |  -0.050            0.012           0.013
-0.045            0.023           0.001       |  -0.045            0.021           0.001
-0.169            0.016           0.011       |  -0.200            0.016           0.011
-0.207            0.030           0.003       |  -0.279            0.031           0.003
-0.096            0.011           0.023       |  -0.083            0.011           0.022
-0.096            0.005           0.069       |  -0.096            0.005           0.070
-0.139            0.013           0.013       |  -0.159            0.013           0.013
-0.131            0.006           0.078       |  -0.128            0.006           0.078


What if we want personalized predictions?


From Trees to Random Forests (Breiman, 2001)

“Adaptive” nearest neighbors algorithm


Causal Forests

Random forest

• Subsampling to create alternative trees

– Plus a lower bound on the probability that each feature is sampled

• Causal tree: splitting based on treatment effects, estimate treatment effects in leaves

• Honest: two subsamples, one for tree construction, one for estimating treatment effects at leaves

– Alternative for observational data: construct tree based on propensity for assignment to treatment (outcome is W)

• Output: predictions for the treatment effect as a function of covariates

Main results (Wager & Athey, 2015)

• First asymptotic normality result for random forests (prediction), extends to causal inference & observational setting

• Confidence intervals for causal effects


Applying Method to Long-Term Prediction

• Estimate a model relating short-term metrics to long-term behavior, incorporating advertiser characteristics

• Estimate heterogeneous effects of treatment in short-term test

• Map effects to long-term impact on actors (advertisers)

• Predict long-run responses based on responsiveness of the affected advertisers

• This method has difficulty with full equilibrium response


Approach 2: Build A Structural Model

• Athey & Nekipelov (2012):

– Assume profit maximization, estimate values

• Athey & Nekipelov (2014, in progress):

– Specify a set of objectives

– Estimate Bayesian model of objective type and parameters

• Use experiments and algorithmic releases to identify objectives: different objectives predict different reactions to change

– Model the decision to change bid

• Both: use model to predict how advertisers respond to short term metrics

– Do short-term experiment

– Calculate new equilibrium based on changes

– Assume small interactions among advertisers are zero to approximate numerically


Summary: A Structural Model

• Findings

– Substantial set of bidders predictably non-responsive in medium term (high implied cost of bid changing)

– Exact match advertisers optimize for position, while broad match advertisers optimize for ROI from clicks

– Short-term experiments and long-term counterfactuals can go in opposite directions

– Examples: switch to Vickrey auction, improving accuracy of click predictor


Conclusions

• Power of A/B Testing leads to a culture of relying heavily on experiments

• Standard experiments are not appropriate for many problems

• Expensive to use correct experimentation approach

• Layering analytics and structural models on top of experiments is cheaper

• But a culture of short-term experiments leads to resistance to non-experimental analytics, leading to short term focus for innovation

• Advice:

– Use rules of thumb to trigger when more costly long-term approaches are brought to bear

– Take every opportunity to study long-term effects, e.g. use staggered rollouts

– Study heterogeneity and build insight
