The State of Split Testing
Survey Analysis
Analysed by Parry Malm – CEO, Phrasee [email protected] www.phrasee.co.uk
Email subject line split testing is nothing new. It’s a programmatic
experiment that has been widely used by marketers for some time
now. And yet, as you’ll read in the following analysis, it’s generally
done quite badly.
However, it’s not your fault! Well, not completely, that is. There are
many reasons why people aren’t split testing enough.
In the following pages, you’ll learn how the marketing industry
views subject line split tests... what they are doing today, and most
importantly, what they aren’t doing.
Welcome to the State of Split Testing
A note on methodology: The State of Split Testing Survey was conducted in October 2014 and had 304 respondents. All responses were anonymous. 55% of the respondents were brand-side and 45% were agency-side, from across the world (approximately 60% US, 25% UK, 15% other). We're confident that it's a broadly representative sample of the email marketing industry.
Contents
1. How to interpret the results – page 4
2. What elements of an email are the most important? – page 5
3. How email marketers split test – page 7
4. What do people test? – page 10
5. How do you measure success? – page 11
6. How confident are email marketers in their skills? – page 13
7. But don't take it from us – page 15
How to interpret the results
Most people will look at the “sample mean,” or the average, of the values in the survey responses. However, averages from a survey sample always have some variance in them and shouldn’t be trusted by themselves. The sample mean may not be the true mean of the population. It’s an estimate – but not the exact number.
Therefore, we encourage you to look at the 95% confidence interval, represented by the high and low markers on each bar. This is the band of values that represents the likely range of the population mean.
OR, SIMPLY PUT: The typical email marketer’s responses exist within the confidence interval, but not necessarily at the average point.
The average can be a misleading statistic, so don’t focus too much on it. We’ve included it because it’s what people expect. We’ve included the confidence intervals because it’s what people learn from.
At Phrasee, we hate it when people use statistics badly. So, we’ve done our best to analyse the results of the survey in the most robust way possible.
The charts all look something like this:
[Example chart: "How much do you love statistics?" – the options (Lots! / Meh. / Not a big fan. / FML!) shown as bars, with each bar's sample mean (e.g. 44%) and its 95% confidence interval (e.g. 39% – 49%) marked.]
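To make the interval concrete, here is a minimal sketch of how such a band can be computed for a survey percentage. This uses the standard normal (Wald) approximation; the survey's exact method isn't stated, so treat it as an illustration only.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a sample proportion,
    using the normal (Wald) approximation."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se

# e.g. 134 of 304 respondents, a 44% sample mean:
low, high = proportion_ci(134, 304)
print(f"{low:.1%} - {high:.1%}")  # a band of roughly +/- 6 points around 44%
```

With 304 respondents, the uncertainty band around any single percentage is several points wide, which is exactly why the averages alone shouldn't be trusted.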
What elements of an email are the most important?
Different people, often with different agendas, will say various things are more or less important when it comes to email marketing. For example, data agencies will tell you that CRM matters most. Or, an ESP will tell you their technology is the most important. UX people will tell you that you need to go responsive or die. And, designers will tell you human creativity trumps all.
So, we asked people what, in their experience, affects the response rate of an email campaign, with one being low, and four being high.
In your experience/opinion, how much do the following elements affect the response rate of an email campaign?
[Chart: sample means with 95% confidence intervals, on a scale of 1 (less) to 4 (more), for: quality of data, segment selection, subject line, rendering on multiple devices, deliverability infrastructure, human creativity, email aesthetics / design, time of day, ESP technology, and list size. Quality of data, segment selection and the subject line score highest (means around 3.4–3.5); ESP technology and list size score lowest (around 2.3).]
Quality of data, segment selection and the subject line are all viewed as the most important elements of an email campaign.
Quality of data makes intuitive sense, of course – bad data equals spam boxes. Segment selection – yep, sending the right email to the right people is generally a pretty good idea.
The subject line – it’s the only part of your email campaign everyone is guaranteed to see!
Note that the confidence intervals of all three “winners” overlap. Therefore you can interpret the result as follows: the three elements are all viewed as being roughly equally important, as it’s not clear which one is the “most important.”
From these statistics we can, however, robustly say that quality of data, segment selection and your subject line form the holy trinity of successful email marketing.
And the bad news – sorry, human creativity and deliverability infrastructure… and especially sorry to ESPs! Your input is not viewed as being highly important to the outcome of a campaign.
What about results for Brands vs. Agencies? A few telling things come to light. Perhaps not surprisingly, agencies appear to view services that they generally offer – things such as data quality services, deliverability advice/infrastructure, and segmentation abilities – as more important than factors they don’t control.
One noticeable difference is subject lines! Brands view them as being more important than agencies do.
Brand vs. Agency
[Chart: brand and agency sample means, with 95% confidence intervals, on the same 1 (less) to 4 (more) scale, for the same ten elements: subject line, quality of data, segment selection, human creativity, email aesthetics / design, time of day, rendering on multiple devices, deliverability infrastructure, list size and ESP technology. Brands rate subject lines higher than agencies do; agencies rate data quality, deliverability and segmentation services higher.]
How email marketers split test
Based upon the preceding results, we can all agree that subject lines are hugely important to both brands and agencies. This is not news.
So, surely if the subject line is of huge importance to the response of an email campaign, email marketers should be testing all the time, right?
About a quarter of email marketers never split test their subject lines. And, about half of people only test subject lines on just a few of their emails.
This result is counter-intuitive! While people agree that subject lines are of huge importance to a campaign’s response, most people aren’t trying to test and learn how to do them better.
Hopefully when marketers do split test their subject lines, they at least make the most of the opportunity for maximum learning by testing out numerous subject lines at once. Right?
Of all the email campaigns you've sent out in the last month, in how many of them did you split test your subject lines?
[Chart: overall, brand and agency percentages, with 95% confidence intervals, for the options none, a few, most and all. Roughly a quarter of respondents answered "none" and about half answered "a few".]
Huh. Not so much.
The vast majority of email marketers only test A/B splits. That's better than nothing, don't get us wrong, but it still limits what you can learn from any given experiment.
Why are marketers restricting their learnings to A/B? There are a couple of reasons. The first reason is clear from this next question:
Wow. This is clearly a problem.
Even in this day and age of programmatic marketing, many major ESPs (about a third of the sample) only offer A/B split testing… which is less than ideal for marketers who want to supercharge their email results.
When you split test, how many subject lines do you normally test?
[Chart: overall, brand and agency percentages, with 95% confidence intervals, for the options none, A/B, 3-4 and 5+. The large majority of respondents answered A/B.]
What is the maximum number of subject lines you can split test in your ESP?
[Chart: percentages for the options none, A/B, 3-6, 7-16, 17+, unlimited and not sure. About a third of respondents' ESPs only offer A/B testing.]
The statistics show that people view subject lines as being hugely important to an email campaign. And yet, barely anyone spends more than an hour thinking about them.
This is a very odd result…
Consider this: how much time do you spend making an email look great in your ESP’s HTML editor? And how long do you spend picking the data? And making sure it’s responsive across all devices? And so on, and so on…
It’s probably more than an hour, right?
This is perplexing. Subject lines are viewed as important, that’s clear, and yet people don’t spend much time on them. WTF?
How much time do you and your team spend thinking of subject lines to test?
[Chart: overall, brand and agency percentages, with 95% confidence intervals, for the options a few minutes, about an hour, about half a day and more than a day. The large majority answered "a few minutes" or "about an hour".]
Who on your team thinks of the subject lines to test?
[Chart: percentages for the options you & team, you alone, another and your boss. Around three-quarters or more answered "you & team".]
Fortunately, most people include their team in the brainstorming process, which is a good sign.
The human brain, in short amounts of time at least, can only think of so many ways to say the same thing. When the same people think of subject lines over and over, it's incredibly difficult to find new ways to say it.
What do people test?
We at Phrasee have looked at hundreds of thousands of subject lines (we're not exaggerating – we love subject lines! We're also great at parties.)
And we've seen an enormous amount of split tests, some of which work, and some of which don't.
Subject lines generally don't work when people test out the wrong things. So when you test out subject lines, what do you test? This is what email marketers around the world are testing:
[Chart: "When testing out subject lines, which element(s) do you test the most commonly?" – different call to action phrases, length of subject line, different adjectives, including the price / discount, price differentials (i.e. $50 vs 50%), different product features, and punctuation, shown for all respondents, brands and agencies with 95% confidence intervals.]
Other things that people test, although in very small amounts, include personalisation, front- vs back-loading features, and including your brand name in the subject line.
What's surprising is that only about a quarter of email marketers test out different product features… despite it being your products that people buy from your emails. This is likely an area of opportunity for the astute email marketer.
How do you measure success?
Doing subject line tests just for the sake of doing them is nonsensical. The purpose is, of course, to improve your results. So how do email marketers measure success?
What metric(s) do you use to measure success of a subject line?
[Chart: open rate, click to open rate, click rate, conversion rate and unsubscribe rate, shown for all respondents, brands and agencies with 95% confidence intervals. Open rate was the most commonly used metric by a wide margin.]
Each business will have its own reasons for using different metrics. For example, if you're marketing a premium-priced product that only gets a couple of conversions per email, using conversion rate as a success metric is statistically unreliable. (We'll save you a long rant about statistical insignificance here – drop us a line if you are curious why this is the case.)
However, one thing that's common is that marketers focus on short-term results: getting a little lift in open rates or click rates, and considering that success.
This is a problem. The real power of split testing comes in when you follow an experimental plan and apply longitudinal learnings. That is to say, learn about your audience over time, over a series of planned split tests, so your response uplift isn't just fleeting but delivers you long-run value.
To put it in the words of one of the survey respondents:
“[We] don't know what [we’re] testing until last possible moment. Short-term focus on testing (no methodology.)”
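To illustrate why a couple of conversions per email makes conversion rate unreliable, here is a rough sketch using a simple two-proportion z-test (all numbers hypothetical; this is not the survey's analysis):

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates,
    using a pooled standard error. |z| < 1.96 means the difference
    is not significant at the 95% level."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# 3 vs. 1 conversions on two 5,000-recipient splits: a "3x lift" on paper
z = two_proportion_z(3, 5000, 1, 5000)
print(abs(z) < 1.96)  # True: the difference is statistically insignificant
```

A tripled conversion count sounds dramatic, but at these volumes it is indistinguishable from noise, which is exactly the "long rant" the text alludes to.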
So the important question isn't "How do you measure success?" The important question is, "How do you analyse your previous successes to predict better subject lines in the future?"
Do you analyse your past history of split tests to help design future splits? If so, how do you conduct the analysis?
[Chart: overall results, with brand and agency splits and 95% confidence intervals – don't really do much (39%), gut feel (34%), determine causal variables (32%), print out all subject lines and look for trends (32%), sentiment / natural language processing analysis (16%), build a model to predict subject line performance (4%).]
Now, this is worrying. Most people don't do very much, rely on intuition, or look at all the subject lines in a list and try to eyeball trends.
As one respondent noted: “[We] make poor assumptions of our audiences based on the 'success' of one campaign.”
And another: “[We] haven't planned a robust series of tests and don't know what we're looking for to demonstrate success/improvement.”
Only a paltry 4% of people have tried to build a model to predict subject line performance. This is, of course, a challenge in terms of technology, statistical know-how, and data volume. Thus, predictive model building remains a pipe dream for the majority of email marketers.
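For readers who want a first step beyond gut feel, here is a minimal sketch of the "determine causal variables" approach: measure how past subject lines with a given feature performed versus those without it. All subject lines, open rates and the `mentions_discount` feature below are hypothetical examples, not survey data.

```python
def feature_lift(history, has_feature):
    """Average open-rate difference between past subject lines
    with and without a given feature."""
    with_f = [rate for text, rate in history if has_feature(text)]
    without = [rate for text, rate in history if not has_feature(text)]
    return sum(with_f) / len(with_f) - sum(without) / len(without)

# Hypothetical past split-test results: (subject line, open rate)
history = [
    ("50% off everything this weekend", 0.22),
    ("Your weekend wishlist awaits", 0.15),
    ("Last chance: 30% off sitewide", 0.20),
    ("New season, new arrivals", 0.14),
]

def mentions_discount(text):
    # Crude illustrative feature: does the line mention a percentage?
    return "%" in text

print(f"{feature_lift(history, mentions_discount):+.3f}")
```

With enough planned tests, per-feature lifts like this become the inputs to exactly the kind of predictive model the 4% are building.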
How confident are email marketers in their skills?
Being confident in a skill makes you want to use it more often. And if you're not overly confident yet, knowing where to get expert advice is equally important.
Our question is derived from the "Net Promoter Score," a common way of measuring confidence in business services.
A Net Promoter Score runs on a scale of -100% to +100%. The lower the score, the less confident people are in recommending something; the higher, the more confident they are.
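For reference, the standard NPS computation looks like this (this is the general formula: the percentage of promoters minus the percentage of detractors; the survey's exact implementation isn't stated):

```python
def net_promoter_score(ratings):
    """NPS from 0-10 'how likely are you to recommend?' ratings:
    % promoters (9-10) minus % detractors (0-6), in [-100, 100]."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Mostly lukewarm-to-negative ratings produce a negative score:
print(net_promoter_score([2, 5, 6, 7, 8, 9]))
```

Scores like the -47% and -64% below therefore mean detractors heavily outnumber promoters.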
How confident are people in their own subject line expertise, not to mention that of those around them?
Net Promoter Scores: How likely are you to recommend the following to others?
Your ESP's (or email agency's) advice on subject lines: -64%
Your company's subject line strategy: -56%
Your team's subject line expertise: -53%
Your company's subject line testing methodology: -52%
Your own subject line expertise: -47%
Your ESP's split testing functionality: -46%
OK, here’s a big problem.
Email marketers aren’t extremely confident in their own subject line expertise (-47%.) Usually in this situation people would seek out external help for this problem.
However, they have even less confidence (-64%) in their ESP and/or email agency’s subject line advice!
So where are email marketers supposed to get help,
if not from ESPs and email agencies?!?
Email marketers are caught in a vicious cycle of wanting to do better, but not having enough time, lacking enough knowledge to improve things, and having no one to call on to help them do better.
This is the State of Split Testing in the email industry.
As one of the respondents noted:
“I would [like to] allow more time for build when there is a split test to be done as [ESP name] is such a retarded programme and makes split testing a pain.”
But don’t take it from us
The quantitative results paint an interesting picture. But what's equally important is to determine what people are thinking. What their challenges are. What keeps them awake at night.
Two qualitative questions were asked:
“Just out of curiosity, what are a few things that most email marketers do wrong when it comes to split testing?”
“And, maybe this is a stupid question, but: if you could wave a magic wand, what split testing practices/methodology/whatever would you change in your organization? (Note: you don't have a magic wand, sorry, it's a hypothetical question.)”
Here are the most common words and phrases that the respondents used when answering these questions:
[Word cloud of respondents' answers: the most frequent terms included time, last minute, plan, planning, structured, methodology, analysis, significance, correlation, gut, data, content, variables, measure, a/b, learning, understanding and results.]
Not enough time. Or understanding. Or methodology. Or data.
You've read lots of analysis in this document of cold, hard statistics. The words of email marketers are more powerful than any statistics could ever be. So, we'll leave you with their words, anonymous of course.
“I would change the ad hoc split testing we do and create a strategic plan for what we are trying to achieve or measure. Also, I would wave a wand for everyone involved with email testing to have experience with actually sending one so we can all understand what we are testing and learning.”
“Actual analysis and less of a reliance on gut feeling. Freedom from Commercial team's orders, not having to create sends and email out to people who aren't going to convert.”
“I would make my team stop relying on using the same phrases and be more creative.”
“We use only A/B split testing, they send the winner subject line out too soon.”
“Testing two arbitrary subject lines each time based on gut feel rather than testing a specific causal variant and testing it over and over until you have proven that it is in fact, a successful variant.”
And we've saved our favourite for last:
“Not planning split testing - so it's Joe's favourite subject line v Fred's - no learning can be gained.”
Forget what you thought you knew about subject lines. When you combine artificial intelligence and human language, you get Phrasee. It's awesome.
Contact us now to find out how you can become a subject line superhero.
Design by www.thepinkgroup.co.uk
www.twitter.com/phrasee
www.linkedin.com/company/phrasee
www.phrasee.co
Phrasee Ltd
5 & 6 Crescent Stables
139 Upper Richmond Road
London SW15 2TN