[#500distro] measuring for impact: knowing when, what & how to a/b test

@mike_greenfield

Measuring For Impact: Knowing When, What, and How to A/B Test

@mike_greenfield, CEO/Co-Founder, Laserlike

2014-08-07

You know you should A/B test.

You also know you should exercise more, eat less sugar, spend less on coffee, wear sunscreen, etc., etc.

(Don’t worry, I’m not going to say anything else about sugar or sunscreen.)

So, how do you create a culture in which people will constructively A/B test?

Do six things.

1. Embrace “I don’t know”

We have 2+ ideas.

I don’t know which one will be more effective.

2. Have Data, Choose Metrics

To test, you need:

• People using your product

• (Approximate) agreement on the metrics that matter

Not Many Users? Don’t A/B test!

• Laserlike has ~60 users and has never run an A/B test

• We will run many, many tests when we have enough users

• A test should have at least a few hundred instances (and a lot more if effect sizes are likely to be small)

• Test iff you can have “business significance”
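A rough way to see why "a few hundred instances" is a floor rather than a target: the sketch below uses the standard normal-approximation sample-size formula for comparing two conversion rates. The function name, the 5% significance level, and the 80% power are assumptions chosen for illustration, not numbers from the talk.

```python
import math
from statistics import NormalDist


def users_per_bucket(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed in EACH bucket to detect a relative lift.

    Hypothetical helper: normal approximation for a two-sided test
    comparing two proportions.
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# A 10% baseline conversion rate is an assumed example value.
big_effect = users_per_bucket(0.10, 0.50)    # hoping for a +50% lift
small_effect = users_per_bucket(0.10, 0.05)  # hoping for a +5% lift
print(big_effect, small_effect)
```

Under these assumptions a large effect needs on the order of several hundred users per bucket, while a small one needs tens of thousands, which is the "a lot more if effect sizes are likely to be small" case.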

Know What You Want to Optimize

• If it’s important, you should be running tests to improve it

• If it’s not important, spend time on other things

• Most tests should be aimed at improving 1-2 specific variables

3. Have Clear Process, Tech for Testing

A/B Testing Process

• New feature: if possible, roll out to a small test subset first (10s or 100s of thousands)

• Version change: always test things that could (cumulatively) have business impact

• Everyone on the product team should be running and resolving tests

A/B Testing Tech

• Using a third party testing service is akin to building your site on Wordpress: great at some scales/competency levels

• No matter how you’re testing, a new test should be at most a few lines of code

• It should be easy to see how each side of a test compares across many variables
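One way "a few lines of code" can look in practice is deterministic bucketing by hashing the user ID together with the test name. This is a minimal sketch, not the tooling described in the talk; the `bucket` function, the test name, and the variant labels are all hypothetical.

```python
import hashlib


def bucket(user_id, test_name, variants=("control", "treatment")):
    """Assign a user to a variant, stably, without storing any state.

    Hashing test_name together with user_id keeps assignments
    independent across different tests.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Call site for a new test: one branch on the bucket.
if bucket(user_id=42, test_name="signup_button_copy") == "treatment":
    button_text = "Get started"  # hypothetical new copy
else:
    button_text = "Sign up"      # hypothetical existing copy
```

Because the assignment is a pure function of the user and the test, the same bucket can be recomputed later when comparing each side of the test across many variables.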

4. Understand the Math of What to Test

Process: Same vs. New Tweak

• What’s the probability your tweak will have a positive effect?

• What kind of effect might that have, and how might that effect change the company’s prospects?

• Will you be able to measure the change?

• Optimize on one variable, but look at others
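The first three questions can be turned into a quick back-of-the-envelope check before a tweak is built. The sketch below is one possible framing, assuming a conversion-rate metric, a 5% significance level, and 80% power; `min_detectable_lift` and `triage` are hypothetical names, and every input is a guess the team supplies.

```python
import math
from statistics import NormalDist


def min_detectable_lift(base_rate, users_per_bucket, alpha=0.05, power=0.8):
    """Smallest relative lift a test of this size can reliably detect
    (normal approximation, equal-sized buckets)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    absolute = z * math.sqrt(2 * base_rate * (1 - base_rate) / users_per_bucket)
    return absolute / base_rate


def triage(p_positive, lift_if_positive, base_rate, users_per_bucket):
    """Combine 'how likely', 'how big', and 'can we measure it'."""
    return {
        # Probability-weighted lift: how much the tweak is worth on average.
        "expected_lift": p_positive * lift_if_positive,
        # Can a test of this size actually see the hoped-for lift?
        "measurable": lift_if_positive
        >= min_detectable_lift(base_rate, users_per_bucket),
    }


# Assumed inputs: 30% chance of a +10% lift on a 10% baseline rate.
print(triage(0.3, 0.10, 0.10, users_per_bucket=1_000))
print(triage(0.3, 0.10, 0.10, users_per_bucket=100_000))
```

With these assumed inputs the same tweak is not measurable at 1,000 users per bucket but is at 100,000, so whether it is worth testing depends on traffic as much as on the idea.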

Process: Same vs. Big Change

• What’s the probability that your change will have a negative impact?

• How big an impact might there be?

• Will you be able to measure the change?

• Holistic approach

A/B Test for Quality

• Circle of Moms: tested “warning” users when their questions seemed short or low quality

• Resulting questions were graded for quality, without grader knowing test bucket

• End result: warning yielded ~5% fewer questions, but much higher quality
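The blind-grading step can be sketched as follows: strip the bucket label and shuffle before anything reaches a grader, then rejoin grades to buckets afterwards. This is an illustrative reconstruction, not Circle of Moms' actual pipeline; the function names, the data shape, and the sample grades are assumptions.

```python
import random


def blind(items, seed=0):
    """items: list of (item_id, bucket, text).

    Returns what graders see (no bucket, shuffled order) plus the
    private key needed to unblind the grades later.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    grader_view = [(item_id, text) for item_id, _bucket, text in shuffled]
    key = {item_id: bucket for item_id, bucket, _text in items}
    return grader_view, key


def mean_grade_by_bucket(grades, key):
    """grades: {item_id: numeric grade}. Averages grades per test bucket."""
    by_bucket = {}
    for item_id, grade in grades.items():
        by_bucket.setdefault(key[item_id], []).append(grade)
    return {b: sum(g) / len(g) for b, g in by_bucket.items()}


# Made-up example data, for illustration only.
questions = [
    (1, "control", "help??"),
    (2, "warning", "How do I get my toddler to nap?"),
    (3, "control", "Any tips for picky eaters?"),
    (4, "warning", "What age is right for starting preschool?"),
]
grader_view, key = blind(questions)
grades = {1: 1, 2: 4, 3: 3, 4: 5}  # as entered by a grader
print(mean_grade_by_bucket(grades, key))
```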

5. Understand the Math of Picking Winners

Resolving Too Soon vs. Resolving Too Late

• How big is the potential audience for this test?

• Example 1: end of year “most popular baby names” email that will never be sent again

• Example 2: Facebook signup flow

Longitudinal Tests vs. Immediate Tests

• Longitudinal: change home page, email frequency, product framing

• Need to examine effect over a long period

• Immediate: change button color, email subject

• Likely that long-term effects will be minimal

Automatically Resolve Tests?

• Longitudinal tests should not be automatically resolved

• Example: new home page design

• Immediate tests can be automatically resolved when speed is important and there is one clear objective function

• Example: Circle of Moms email subject optimization
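For an immediate test with a single objective (say, email open rate), an automatic resolution rule might look like the sketch below. It uses a pooled two-proportion z-test; the `auto_resolve` name, the 0.05 threshold, and the 300-per-bucket minimum are assumptions, not the rule Circle of Moms used. Checking a test repeatedly as data arrives inflates the false-positive rate, so a real system would likely want a stricter threshold or a fixed check schedule.

```python
import math
from statistics import NormalDist


def auto_resolve(conv_a, n_a, conv_b, n_b, alpha=0.05, min_per_bucket=300):
    """Return "A" or "B" if one side clearly wins, else None (keep running)."""
    if min(n_a, n_b) < min_per_bucket:
        return None  # too little data to resolve anything
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return None  # no conversions (or all conversions) on both sides
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    if p_value >= alpha:
        return None
    return "B" if z > 0 else "A"


# Made-up counts: opens out of sends for two subject lines.
print(auto_resolve(100, 1000, 150, 1000))  # clear difference
print(auto_resolve(100, 1000, 105, 1000))  # too close to call
```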

Choose robust statistics

• Bad: # of page views

• Good: % of users viewing at least [5, 25, 100] pages

• Potentially bad: # of sales (when small)

• Potentially good: # of people getting through the second step of a sales funnel
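A toy illustration of why the thresholded statistic is more robust: one extreme user (a crawler, say) can swing total page views, but can move the "% of users viewing at least N pages" figure by at most one user's worth. The data below is made up and the function names are hypothetical.

```python
def total_page_views(views_per_user):
    return sum(views_per_user)


def share_viewing_at_least(views_per_user, threshold):
    """Fraction of users with at least `threshold` page views."""
    return sum(v >= threshold for v in views_per_user) / len(views_per_user)


# Identical buckets except for one outlier in bucket B.
bucket_a = [3, 6, 7, 2, 8, 5, 1, 9]
bucket_b = [3, 6, 7, 2, 8, 5, 1, 5000]

print(total_page_views(bucket_a), total_page_views(bucket_b))
print(share_viewing_at_least(bucket_a, 5), share_viewing_at_least(bucket_b, 5))
```

Total page views makes bucket B look like a runaway winner, while the share of users viewing at least 5 pages is the same on both sides.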

6. Celebrate A/B Testing Successes
