How to screen 100+ concepts with MaxDiff | Hans Willems, SKIM | April 6, 2017
Agenda
1. MaxDiff intro and challenges
2. Number of MaxDiff items
3. MaxDiff's relativity issue
Methodologies: Maximum Difference Scaling (MaxDiff)
• Originally invented as a superior alternative to rating, ranking and chip-allocation questions
• Proved to be a good alternative to simple conjoint applications
• Can be used to answer a variety of business questions
Maximum Difference Scaling, in short: MaxDiff
MaxDiff: How does it work?
MaxDiff forces consumers to make trade-offs between certain features/benefits
Main advantages
• More discriminating and refined ratings
• Scale-free, hence not biased by cultural differences
• Engaging and intuitive exercise for respondents
• Sub-segments can be identified
• Generally lower costs and shorter timelines
How does a MaxDiff exercise look?
[Screen examples: conventional MaxDiff vs. the new SwipeDiff]
MaxDiff: Example output (fictional data)
Rank | Item | Average score
1 | Monthly costs | 12.72
2 | Data allowance | 12.32
3 | Network coverage | 7.93
4 | Digital security | 7.13
5 | 4G network | 6.97
6 | Free calls/texts within provider network | 6.76
7 | Handset price | 6.62
8 | Customer service | 5.29
9 | Voice allowance/call rates | 4.93
10 | Mobile phone model/handset | 4.54
11 | Ease of understanding mobile phone plan/rates | 3.81
12 | Roaming rates | 3.80
13 | Contract length | 3.78
14 | Out of bundle call/text/data rates | 3.47
15 | Reputation of brand | 2.80
16 | Text allowance/text rates | 2.56
17 | Availability of regular phone upgrades | 2.44
18 | International call/text rates | 2.14
(Scores sum to 100, within rounding.)
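Output tables like this one typically start from raw logit utilities that are rescaled so the full set sums to 100. A minimal sketch of one common convention, the share-of-preference (exponentiate-and-normalize) transform — the function name is ours, and this is not necessarily the exact rescaling behind the numbers above:

```python
import math

def probability_scaled_scores(utilities):
    """Rescale raw MaxDiff logit utilities into scores summing to 100
    using a share-of-preference (exponentiate-and-normalize) transform.
    Illustrative only; vendors differ in the exact rescaling used."""
    expu = {item: math.exp(u) for item, u in utilities.items()}
    total = sum(expu.values())
    return {item: 100.0 * e / total for item, e in expu.items()}
```

Higher utilities always map to higher scores, so the ranking is preserved while the scores become comparable across studies.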
MaxDiff: Challenges

1. How many items can be included in a MaxDiff exercise?
There is a trade-off between the number of screens per respondent, the number of items per screen, and the number of observations per item. Sometimes more items need to be tested than the standard MaxDiff method can handle.

2. How good are the winning (or losing) items?
Ranking provides insight into relative preferences between items, but not into the overall acceptability/likeability of the full set of items.
MaxDiff: Number of items
How many items can be included in a MaxDiff exercise?
There is a trade-off between the number of screens per respondent, the number of items per screen, and the number of observations per item.
• 4 items per screen is standard; 6 is considered the maximum
• Rule of thumb: show each item at least 3 times to each respondent, for example:
  - 12 items: 9 screens with 4 items, or 12 screens with 3 items
  - 20 items: 12 screens with 5 items, or 15 screens with 4 items
  - 30 items: 15 screens with 6 items, or 18 screens with 5 items
• Generally, the more items per screen, the more robust the read on the best and worst items, although at the expense of robustness on the middle range

What solution to use with more than 30 items? E.g. 50, or 100+?
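The rule of thumb above reduces to a quick sizing calculation (a sketch; the function name is ours): screens per respondent = ceiling of (items x minimum exposures / items per screen).

```python
import math

def screens_per_respondent(n_items, items_per_screen, min_exposures=3):
    """Minimum number of MaxDiff screens per respondent so that each item
    is seen at least `min_exposures` times (rule of thumb: 3), given the
    chosen number of items per screen (4 is standard, 6 the maximum)."""
    if not 2 <= items_per_screen <= 6:
        raise ValueError("items_per_screen should be between 2 and 6")
    return math.ceil(n_items * min_exposures / items_per_screen)

# The slide's examples:
# 12 items, 4 per screen -> 9 screens; 20 items, 5 per screen -> 12 screens;
# 30 items, 6 per screen -> 15 screens
```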
Methodologies: MaxDiff with more than 30 items
Including 30-50 items: Sparse and Express MaxDiff

Including over 30 items often requires an unacceptably high number of screens per respondent. There are two alternative MaxDiff approaches to handle this:

Sparse MaxDiff
• Every item is shown only once to each respondent

Express MaxDiff
• A (random) subset of the total item set is tested per respondent, with (at least) 3 observations for each item within the subset

Both methods require information to be borrowed from other respondents. Although Express MaxDiff seems more respondent-friendly, some research has indicated that Sparse MaxDiff leads to slightly better results.
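The two designs can be sketched as follows (simplified, with our own function names; production designs also balance which items appear together on a screen):

```python
import random

def sparse_design(items, items_per_screen=4, seed=None):
    """Sparse MaxDiff: shuffle the full list and cut it into screens, so a
    respondent sees every item exactly once (last screen may be short)."""
    rng = random.Random(seed)
    order = list(items)
    rng.shuffle(order)
    return [order[i:i + items_per_screen]
            for i in range(0, len(order), items_per_screen)]

def express_design(items, subset_size, items_per_screen=4, exposures=3, seed=None):
    """Express MaxDiff: draw a random subset per respondent and show each
    subset item `exposures` times, with no repeats within a screen."""
    rng = random.Random(seed)
    subset = rng.sample(list(items), subset_size)
    screens = []
    for _ in range(exposures):  # one sparse pass per required exposure
        screens += sparse_design(subset, items_per_screen, seed=rng.random())
    return screens
```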
Including >50 items: SKIM's Thompson MaxDiff (TMD)

The algorithm uses a step-wise process based on successive model estimations to increase the frequency with which high-potential items are shown to respondents:
1. After each respondent, aggregate-level utilities are calculated on the fly
2. Based on these, the top 10-20 items are selected
3. On top of that, 10 additional items are selected (semi-)randomly
4. These 20-30 items are shown to the next respondent

Items with the most potential are shown more frequently, whereas items with the least potential are included in the item sets less often. Still, all items are shown to at least 30 respondents for a robust read.
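The selection step can be sketched as Thompson-style sampling (illustrative only, with our own names and a normal-posterior assumption; this is not SKIM's actual implementation): draw one value per item from its current utility distribution, keep the top draws, and mix in some random items so weaker items still collect observations.

```python
import random

def items_for_next_respondent(utilities, std_errors, n_top=20, n_random=10, seed=None):
    """Pick the item set for the next respondent: one posterior draw per
    item (normal around the current aggregate utility), the best `n_top`
    draws win, and `n_random` of the remaining items are added at random."""
    rng = random.Random(seed)
    draws = {item: rng.gauss(mu, std_errors[item])
             for item, mu in utilities.items()}
    ranked = sorted(draws, key=draws.get, reverse=True)
    return ranked[:n_top] + rng.sample(ranked[n_top:], n_random)
```

Because the draws are noisy, items with uncertain utilities occasionally beat the current leaders, which is what lets the design keep learning about them.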
Thompson MaxDiff: Innovations and advantages

What is new/different compared to a standard MaxDiff?
• Estimates popularity and uncertainty in real time
• Learns from each new respondent
• Focuses on the top-performing items

Main advantages
• Handles a large number of MaxDiff items without showing an excessive number of screens
• Stronger reads on the top-ranked items
• Lower sample size needed to handle a large number of items
Thompson Sampling vs Sparse Design

[Chart: top-3 hit rate (%) against number of respondents (20 to 1,020). The fixed sparse design requires roughly 4x as many respondents as Thompson sampling (30 no split / 20/10 split) to achieve the same hit rates.]

Source: Fairchild, K., Orme, B. and Swartz, E. (2015), "Bandit Adaptive MaxDiff Designs for Huge Number of Items", 2015 Sawtooth Software Conference Proceedings
Thompson Sampling - Misinformed start

[Chart: top-3 hit rate (%) against number of respondents (20 to 1,020), comparing Thompson 20/10 split with a misinformed start, Thompson 30 no split with a misinformed start, and Thompson 30 no split / 20/10 split.]

Source: Fairchild, K., Orme, B. and Swartz, E. (2015), "Bandit Adaptive MaxDiff Designs for Huge Number of Items", 2015 Sawtooth Software Conference Proceedings
MaxDiff: The relativity issue

One issue MaxDiff suffers from: relativity.
• We don't know whether all items are good, all are bad, or some are good and some are bad

Example screen (most to least preferred): headache, having a cold, broken toe, pulled muscle. Every item here is undesirable, yet MaxDiff still produces a "winner".
The solution: MaxDiff anchoring, two methods

Direct approach (SKIM's own Kevin Lattery)
• Ask respondents to identify the acceptable items from the entire list
• Can also be asked as unacceptable, least preferred, would not consider buying, etc.

Indirect approach (Louviere)
• Ask respondents to indicate whether all items in a set are all good, all bad, or some good and some bad
MaxDiff Anchoring: Direct approach

Respondents first mark which items are acceptable:
Item 1: Yes
Item 2: No
Item 3: No
Item 4: No

They then complete the usual most/least preferred exercise. In estimation, an anchor is added to the preference scale: acceptable items sit above the anchor, unacceptable items below it.
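Once estimated, the anchor gives the scores an absolute read. A minimal sketch (function and key names are ours): items scoring above the anchor are acceptable in absolute terms, the rest are not.

```python
def split_by_anchor(scores, anchor_key="Anchor"):
    """Split items into those above the anchor score (acceptable in
    absolute terms) and those at or below it. `scores` maps item name
    to estimated MaxDiff score, including the anchor itself."""
    a = scores[anchor_key]
    above = [k for k, v in scores.items() if k != anchor_key and v > a]
    below = [k for k, v in scores.items() if k != anchor_key and v <= a]
    return above, below
```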
MaxDiff Anchoring: Indirect approach (dual none)

Each MaxDiff screen gets a follow-up question:

"Considering only the items above..."
• None of these are preferred
• Some of these are preferred
• All of these are preferred

[Diagram: depending on the answer, the anchor is placed above all four items on the screen ("none preferred"), among them ("some preferred"), or below all of them ("all preferred") on the preference scale]
MaxDiff: Example output with Anchor (fictional data)
Rank | Item | Average score
1 | Monthly costs | 11.72
2 | Data allowance | 11.43
3 | Network coverage | 7.52
4 | Digital security | 6.88
5 | 4G network | 6.67
6 | Free calls/texts within provider network | 6.57
7 | Handset price | 6.41
8 | Customer service | 5.24
9 | Voice allowance/call rates | 4.91
10 | Mobile phone model/handset | 4.55
11 | Ease of understanding mobile phone plan/rates | 3.83
12 | Roaming rates | 3.81
13 | Contract length | 3.79
14 | Out of bundle call/text/data rates | 3.54
15 | Anchor | 3.12
16 | Reputation of brand | 2.85
17 | Text allowance/text rates | 2.55
18 | Availability of regular phone upgrades | 2.48
19 | International call/text rates | 2.13

Items ranked above the anchor (scores above 3.12) can be read as preferred/acceptable in absolute terms; items below it cannot.
MaxDiff Anchoring: Recommended method?

• None of the methods has proven to be superior
• The choice is largely based on context and personal preference
• More research and experience are needed

Be aware of potential pitfalls of both methods:

Direct
• When a scale question is used, the cut-off logic can be arbitrary
• The additional question reintroduces a potential scale bias
• More questions/clicks for the respondent

Indirect
• With 5 or more items on a screen, many responses will likely be "some of these are preferred", which provides little information
Advanced MaxDiff: Key take-aways
• MaxDiff is a great technique, but it comes with challenges: a large number of items can be an issue
• For 30-50 items, Sparse and Express MaxDiff are the solution
• For more than 50 items, use Thompson Sampling MaxDiff
• Anchoring can be used to tackle MaxDiff's relativity issue, via two methods: the direct approach and the indirect approach
Hans Willems, Research Manager
[email protected]

Contact us: skimgroup.com | @SKIMgroup | SKIMgroup