Advanced Statistical Models for Pricing, Mass Customization and Forecasting
- A Bayesian Approach -
DISSERTATION
of the University of St. Gallen,
School of Management,
Economics, Law, Social Sciences
and International Affairs
to obtain the title of
Doctor of Philosophy in Management
submitted by
Daniel Philipp Stadel
from
Germany
Approved on the application of
Prof. Dr. Andreas Herrmann
and
Prof. Dr. Torsten Tomczak
Dissertation no. 3937
Difo-Druck GmbH, Bamberg 2011
The University of St. Gallen, School of Management, Economics, Law,
Social Sciences and International Affairs hereby consents to the printing of the
present dissertation, without hereby expressing any opinion on the views herein
expressed.
St. Gallen, May 13, 2011
The President:
Prof. Dr. Thomas Bieger
To my family
Acknowledgment
This cumulative dissertation has been a very challenging and interesting en-
deavor, which would not have been possible without the contributions of a num-
ber of persons. Therefore, I would like to thank all the people who supported
me throughout this ambitious project.
First of all, I would like to thank my primary advisor, Professor Andreas
Herrmann, and my secondary advisor, Professor Torsten Tomczak, for their
guidance, encouragement, and supervision. They have provided vital support
for my study and research. I am also indebted to numerous individuals at the
University of St. Gallen. Particularly, I would like to thank Dr. Julia Stefanides,
Antonia Erz, Christian Hildebrand and Christian Purucker for their ongoing
willingness to discuss issues related to my dissertation project, their valuable
comments and the great working atmosphere. I also would like to thank my co-
authors Professor Florian Stahl, Professor Raghuram Iyengar, Professor Bene-
dict Dellaert and Dr. Jan Landwehr for their support and effort in completing
the papers.
Sustaining me throughout has been the ever present support and understand-
ing of my family and friends. Therefore, I would like to thank my friends for
their comprehension and patience and my family for their continuous encour-
agement and support. Without them, the completion of this dissertation would
not have been possible.
St. Gallen, May 2011
Daniel P. Stadel
Table of Contents
A. Summary - Zusammenfassung
B. Article I
Stahl, F., Stadel, D. P., Iyengar, R., and Herrmann, A. (in preparation for submission).
Subscriptions Pricing and Intertemporal Tradeoffs. Management Science.
C. Article II
Stadel, D. P., Dellaert, B. G. C., Herrmann, A., and Landwehr, J. R. (submit-
ted). Locked-In To Luxury: First- and Second-Order Default Effects in Mass
Customization. Marketing Science.
D. Article III
Stadel, D. P. (submitted). Online Data: Predictive Power or Obscure Delusion?
International Journal of Research in Marketing.
E. Curriculum Vitae
Summary
In many fields of business studies such as finance, econometrics and quantita-
tive marketing the importance of advanced statistical methods steadily increases
as the problems gain complexity. With faster computers and new sources for
vast amounts of data, statistical approaches are challenged to cope with these
aspects, and must therefore be improved. For quantitative marketing, the in-
formation gained by complex models, and the insights given by new advanced
methods strengthen the ability of companies to react faster to their customers’
needs. This dissertation discusses the application of advanced statistical meth-
ods to a variety of research objectives, such as tariff design, mass customization
profitability, and forecasting.
The first essay broaches the issue of consumers’ intertemporal tradeoffs
among subscription plans and the respective consequences for its optimal pric-
ing. Subject of further investigation is the individual’s discounting behavior
which has a significant influence on the perceived value for the customer. We
augment a general discount function by an additional parameter to account for
flexibility preferences in the individual’s discounting behavior. Such behavior
has a tremendous influence on optimal pricing strategies for service providers.
The second essay investigates default-based upselling potentials in mass cus-
tomization systems such as online car configurators. We study whether compa-
nies can start their customers off on a high-margin decision path based on a few
high-end default selections early on in the configuration process. We analyze
whether default attribute levels within the customization process can increase
consumers’ choices for high-margin attribute levels (first-order default effects)
and whether these effects help or hurt margins on subsequent attribute
level choices (second-order default effects). We offer a conceptual framework
to managerially guide default selection to accommodate these two effects. The
third essay provides an analysis of online data from a car configurator and web
search queries to assess its usefulness as input for forecasting models. Due to
large amounts of data available online, companies can benefit from proper
analyses of such data pools. Therefore, time series methods are applied to the data.
The forecasting performance is compared to models without incorporating the
online data. It is shown that such data can significantly improve the forecasting
performance and, hence, companies should face the challenge to cope with the
task of utilizing available online data.
The research projects, and therefore the resulting essays, show applications
of advanced statistical tools to complex but important issues
in the economic interaction of companies and their customers. Based on
these methods, I conduct several analyses and draw conclusions of
high relevance for managerial practice.
Zusammenfassung (German Summary)
Across the most diverse subfields of economics and business studies, statistical
methods are becoming ever more popular as research questions grow steadily
more complex. In quantitative marketing, advanced methods can provide the
knowledge advantage needed to respond as well as possible to changing customer
requirements. This dissertation discusses statistical models for applications
in tariff design, configurator optimization, and forecasting.
The first essay discusses the design of subscription tariffs with respect to
duration and price, taking customers' individual planning horizons into account.
The focus is on consumers' individual discounting behavior. The second essay
examines the upselling potential of product configurators, using a car
configurator as an example. It analyzes how consumers respond to preselected
options within the configuration process (first-order effect) and whether these
affect later decisions within the same configuration process (second-order
effect). We develop a conceptual framework for the optimal selection of
defaults that takes both effects into account. The last article deals with
forecasting sales volumes. For two passenger car models, future orders are
predicted using online data and a statistical time series model. It is shown
that taking the online data into account significantly improves forecast
quality, giving manufacturers a further reliable data source for forecasting
models.
The essays show how statistical methods can be applied to address and solve
complex questions concerning the economic interaction between companies and
their customers. Based on these methods and their results, implications and
recommendations of great importance for management-relevant decisions can be
derived.
Article I
Stahl, F., Stadel, D. P., Iyengar, R., and Herrmann, A. (in preparation for submission).
Subscriptions Pricing and Intertemporal Tradeoffs. Management Science.
Subscriptions Pricing and
Intertemporal Tradeoffs
Florian Stahl ∗
Daniel P. Stadel †
Raghuram Iyengar ‡
Andreas Herrmann §
∗ Florian Stahl is Assistant Professor of Marketing at the University of Zurich, 8032 Zurich, Switzerland.
† Daniel P. Stadel is Ph.D. candidate at the University of St. Gallen, 9000 St. Gallen, Switzerland.
‡ Raghuram Iyengar is Assistant Professor of Marketing at The Wharton School, Philadelphia, PA - 19104.
§ Andreas Herrmann is Professor of Marketing at the University of St. Gallen, 9000 St. Gallen, Switzerland.
2 ARTICLE I
Abstract
A common form of subscriptions to many services (e.g., health clubs, Internet access) is characterized
by the duration for which a consumer has access to a service and a one-time flat fee for unlimited
use. A key aspect of such subscriptions is that the price per-time unit declines with longer du-
rations. Such a pricing mechanism forces consumers to face a tradeoff in their choice among
plans - a short membership plan gives the flexibility to switch plans or providers while a long
one provides a price discount. For consumers, such decisions involve the flat fee, discounting of
future service benefits and their valuation of flexibility. For a firm offering a subscription-based
service, it is important to understand how consumers discount future benefits as it impacts their
willingness-to-pay. Using experimental data, we find that consumers’ discounting pattern is
inverse N-shaped (decrease-increase-decrease) with respect to membership duration. We also
show that a key driver of this pattern is the maximum contract duration that consumers typically
subscribe to a service. To determine the implications of our findings for managerial decisions,
we parameterize the observed discounting pattern and incorporate it within a model of con-
sumer choice among plans. The model is estimated using experimental data on consumers’
willingness-to-pay for membership plans for a health club. We compare the optimal menu of
plans predicted from our model with those based on an alternative model, which assumes only
hyperbolic discounting. Our results show that firms would give much smaller price discounts
to customers for longer membership durations if they ignore the inverse N-shaped discounting
pattern. Translated in terms of profitability, the failure to account for the observed discounting
leads to a reduction of 19% in firm profit.
Key words: Subscriptions, Membership Plans, Pricing, Intertemporal Choice, Preference for
Flexibility
1. Introduction
Subscriptions are a popular pricing practice used by many business-to-consumer
companies. A common form of such subscriptions is a membership plan, which
is characterized by the length of time a customer can access a service (member-
ship duration) and a one-time flat fee (membership fee) for its unlimited use.
For example, an online newspaper, Radiance Weekly, charges a one-time fee
of $85, $125, $175 and $225 for unlimited access to articles for 1, 2, 3 and 5
years, respectively (see radianceweekly.com). Similarly, Greyhound bus service
charges $239, $439 and $539 for unlimited rides for 7, 30 and 60 days, respec-
tively as part of their Discovery package (see discoverypass.com). Such flat rate
plans are becoming increasingly popular as compared to usage-based pricing for
a variety of services such as Internet access, fixed-line telephone and access to
many online services (OECD 2009). A key aspect of such subscriptions is that
the price per-time unit declines with a longer subscription period. For example,
customers pay only $45 per year to Radiance Weekly if they subscribe for 5
years as opposed to $85 for a one year subscription.1
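The declining price per time unit can be checked directly from the two menus quoted above. The following is a minimal sketch; the helper names `price_per_unit` and `is_declining` are illustrative and not part of the original text.

```python
# Plan menus quoted in the text: duration -> one-time flat fee (USD).
radiance_weekly = {1: 85, 2: 125, 3: 175, 5: 225}   # duration in years
greyhound_pass = {7: 239, 30: 439, 60: 539}         # duration in days


def price_per_unit(menu):
    """Return {duration: flat fee divided by duration}."""
    return {duration: fee / duration for duration, fee in menu.items()}


def is_declining(values):
    """True if each value is strictly smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


for name, menu in [("Radiance Weekly (USD per year)", radiance_weekly),
                   ("Greyhound (USD per day)", greyhound_pass)]:
    unit_prices = price_per_unit(menu)
    print(name, {d: round(p, 2) for d, p in unit_prices.items()})
    print("  declining with duration:", is_declining(list(unit_prices.values())))
```

For both menus the per-unit price falls with every step up in duration, which is the property the tradeoff discussion relies on.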
With such plans, consumers face a tradeoff in choosing among
them: a membership plan of short duration has a high price per-time unit but
gives consumers the flexibility to switch plans or providers. With a long mem-
bership plan, customers lose their flexibility but benefit from the lower price
per-time unit. For consumers, a choice of a plan involves consideration of im-
mediate costs (flat fee), future benefits from use of service and their valuation
of flexibility (DellaVigna and Malmendier 2006). How consumers discount fu-
ture benefits in this context has an impact on their willingness-to-pay (WTP)
for subscriptions of differing lengths and, consequently, for the optimal design
of subscriptions.
A rich stream of past literature on consumers’ intertemporal preferences has
shown that individuals discount future utility according to a hyperbolic func-
tion (Ariely and Loewenstein 2000, Ariely and Zauberman 2000, Laibson 1997,
Loewenstein and Prelec 1992, Thaler 1981). A majority of this work has focused
on consumers’ discounting of utility at discrete future time points. More
recent literature has considered the impact of “duration” and “intervals” on
individuals’ discount patterns (Ariely and Loewenstein 2000, LeBoeuf 2006,
Overton and MacFadyen 1998, Read et al. 2005, Scholten and Read 2006, 2009).
However, such investigations have been in the context of how far future
outcomes are removed from the present, how far these outcomes are removed from
one another and their effect on the evaluation of a sequence of outcomes. Please
see Berns et al. (2007) and Frederick et al. (2002) for a more detailed discussion
of past work about intertemporal discounting, and DellaVigna and Malmendier
(2006) for a discussion of consumers’ discounting of future benefits from
product usage. As described above, subscriptions are also characterized by a time
duration for which customers have access to a service. However, in the case of
subscriptions, consumers have to discount future benefits from a service over a
continuous duration (length of membership) rather than at discrete future time
points. While it is plausible that consumers may discount such future benefits
still using a hyperbolic function, it is not obvious how their valuation for
flexibility would impact their discounting pattern.
1 There are other types of subscriptions in which consumers are charged for both access and usage using either
a two-part tariff (Danaher 2002, Essegaier et al. 2002) or multi-part tariff (Iyengar et al. 2007, 2008). In this paper,
we focus on a popular pricing plan used by many types of services, where there is a one-time access fee charged
for giving consumers unlimited usage for a given time duration.
Several researchers have also focused on tariff design but typically with-
out considering consumers’ discounting behavior (Dolan 1987, Miravete 2009,
Räsänen et al. 1997). For example, Dolan (1987) provides guidelines to design
quantity discount schedules. Miravete (1999) considers the design of an optimal
menu of nonlinear tariffs when consumers are uncertain about their future con-
sumption. Within an analytical framework, Essegaier et al. (2002) investigate
the effect of capacity constraints and heterogeneity in consumers’ usage for
pricing of access services. Other research has also explored how consumers’
usage of a service differs under tariffs of varying durations and its implications
for service renewal but has not considered how consumers discount future util-
ity from the service (Gourville and Soman 2002, Soman and Gourville 2001).
Please see Wilson (1993) for a detailed discussion of past research on the design
of pricing plans. As noted earlier, it is important to determine how consumers’
discounting pattern of future benefits may impact the optimal design of sub-
scriptions.
In summary, there are two related issues we address herein: (1) how con-
sumers discount future benefits from a subscription-based service and (2) the
managerial implications for optimal design and pricing of tariffs. While previ-
ous research has investigated some of these issues separately, subscriptions are
an appropriate context to investigate their interplay. To this end, we use three
experimental studies to explore consumers’ preferences for future benefits from
a subscription-based service. In all three studies, we find that the monthly dis-
count rate has an inverse N-shaped pattern with respect to membership duration
- the discount rate initially decreases (consistent with hyperbolic discounting),
after a certain length of membership, it actually shows an increase and then re-
verts to a decreasing pattern. We also propose and find confirming evidence that
a key driver of this pattern, and a significant predictor of the time of increase
in the discount rate, is the maximum contract duration that consumers typically
subscribe to a service. The latter duration is a measure of how much consumers
value flexibility. Thus, consumers require large price discounts for choosing
plans with membership duration exceeding their maximum considered service
length. To quantify the impact of our findings on optimal pricing, we param-
eterize the observed discounting scheme and incorporate it within a model of
consumer choice among plans. The model is estimated using our experimental
data. We contrast the optimal menu of plans based on our model and an al-
ternative model that considers only hyperbolic discounting. Our results show
that firms would give much smaller price discounts to customers for longer du-
rations if they ignore the inverse N-shaped discounting pattern. Translated in
terms of profitability, the failure to account for the observed discounting leads
to a reduction of 19% in firm profit.
The remainder of the paper is organized as follows. We begin by investi-
gating how consumers discount their future benefits for subscription-based ser-
vices. Thereafter, we assess the managerial relevance of our findings for opti-
mal pricing of subscriptions. To do so, we describe a model of consumer choice
among subscriptions, incorporate the observed discounting pattern within this
choice model and discuss pricing policy optimization. The paper concludes
with a summary of our findings, limitations, and directions for future research.
2. Discounting pattern for subscription-based services
Our key objective is to explore consumers’ discounting patterns of future benefits
from ongoing access to a subscription-based service. To this end, we conducted
three studies. In all three, we collect data using surveys in which
we ask participants to state the maximum price they are willing to pay to switch
to various membership durations of a service conditional on their willingness-to-pay
(WTP) for a baseline membership duration of that service. We ask such
conditional questions as we are interested in understanding the pattern of consumers’
discounting and not in determining their absolute WTP for a specific
service. In what follows, we describe the three studies and their overall findings.
2.1 Study 1
Study 1 explores the discounting behavior of consumers while they decide
whether to remain with a given baseline subscription plan or switch to an al-
ternative plan. We ask consumers to state their WTP to switch to a proposed
alternative membership plan from the given baseline plan. Given the vast lit-
erature on hyperbolic discounting, a similar pattern is plausible, which would
indicate that consumers are discounting their future benefits from a service at
higher rates for shorter membership durations than for longer durations. How-
ever, it is not obvious how consumers’ valuation for flexibility will affect their
discounting pattern.
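To see why hyperbolic discounting implies higher rates for shorter durations, consider a minimal sketch that assumes the common one-parameter hyperbolic form D(T) = 1/(1 + kT). This functional form and the value k = 0.05 are assumptions chosen for illustration; the text does not specify them. Equating D(T) with an exponential factor exp(-rT) gives the implied monthly rate r(T) = ln(1 + kT)/T, which falls as T grows.

```python
import math


def implied_monthly_rate(k, months):
    """Rate r that solves exp(-r * months) == 1 / (1 + k * months)."""
    return math.log(1.0 + k * months) / months


K = 0.05  # hypothetical hyperbolic parameter, for illustration only
durations = [6, 9, 12, 18, 24, 36, 48, 60, 72]  # months, as offered in Study 1

rates = [implied_monthly_rate(K, t) for t in durations]
for t, r in zip(durations, rates):
    print(f"T = {t:2d} months   implied r = {100 * r:.2f} % per month")
```

Under this assumed form the implied rate declines monotonically, so any increase in the observed rate at longer durations is a departure from pure hyperbolic discounting.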
Method
One hundred and five undergraduate and masters level students participated in
the study. As the context, we considered membership plans to a health club. For
all participants, we specified that they were willing to pay an initial one-time fee
of $300 for a membership plan that gave 3 months of unlimited use of service
(i.e., a price of $100 per month for 3 months). In addition, respondents were
told that the payment of the one-time flat fee corresponding to any subscrip-
tion would be at the start of membership. This is typically how firms charge
subscribing consumers. Participants then had to state the maximum price they
were willing to pay to switch to an alternative of longer duration, i.e. "For which monthly price p, would you choose a subscription of duration T months rather than T∗ months? $X.−" with p being the monthly payment, T > T∗ the subscription
period (e.g., 12 months) and T∗ the baseline subscription period (i.e.,
3 months). By stating their WTP for switching to longer subscriptions, participants
provide information on how they trade off between being flexible and
benefiting from a price discount. Consequently, with this information we can
determine the discount pattern for each participant. Each respondent answered
nine such questions and we obtained the maximum price that they are willing
to pay to switch to contracts with durations of 6, 9, 12, 18, 24, 36, 48, 60 and
72 months (i.e., ΔDuration ∈ {3, 6, 9, 15, 21, 33, 45, 57, 69}). The order of the
questions was counterbalanced across respondents.2
As described, we asked participants to state their WTP for a subscription
duration T months of a service given that they are willing to pay $100 per
month for 3 months of the same service. Thus, for a membership duration of T
months, we can estimate participants’ monthly discount rate using the following
equation:
$$\delta_T \,(100 \cdot T) - p \cdot T = 0, \qquad (1)$$
where $\delta_T = \exp(-rT)$ is the discount factor, which indicates the level of discount
on the monthly price that a consumer requires to switch to a plan longer
than the baseline duration, and r is the monthly discount rate, which may vary
with membership duration. In this way, we can determine the variation of con-
sumers’ monthly discount rate with membership duration.
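Equation (1) can be solved for r in closed form: dividing by T gives δT = p/100, and taking logarithms yields r = −ln(p/100)/T. The sketch below applies this to a single answer. The function name and the example response of $80 for 12 months are hypothetical and serve only to illustrate the calculation.

```python
import math

BASELINE_MONTHLY_PRICE = 100.0  # USD per month, as specified in the survey


def monthly_discount_rate(wtp_monthly, duration_months,
                          baseline=BASELINE_MONTHLY_PRICE):
    """Solve delta_T * (baseline * T) - p * T = 0 with delta_T = exp(-r * T)."""
    if wtp_monthly <= 0 or duration_months <= 0:
        raise ValueError("price and duration must be positive")
    discount_factor = wtp_monthly / baseline        # delta_T = p / 100
    return -math.log(discount_factor) / duration_months


# Hypothetical answer: a respondent would pay $80 per month for 12 months.
r = monthly_discount_rate(80.0, 12)
print(f"delta_T = {80.0 / 100:.2f}, r = {100 * r:.2f} % per month")
```

A stated price equal to the baseline of $100 gives r = 0, and lower stated prices give larger rates for the same duration.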
Results
Table 1 contains the average (across respondents) discount rate for all offered
durations. The table also shows the difference in the discount rates between suc-
cessive membership durations. As the data may contain correlated observations
within individuals, we determine the statistical significance of such differences
using paired t-tests. For the health club membership, we obtain a decreasing
pattern in the monthly discount rates until the contract duration exceeds thirty-six
months. This negative pattern is indicative of hyperbolic discounting (Zauberman
et al. 2009). After thirty-six months, we observe a significant increase
of around 8 percentage points in the monthly discount rate. For membership
plans that exceed 36 months, we again get a decreasing pattern. To summarize,
the study provides initial evidence that consumers show hyperbolic discounting
even for future benefits over an entire duration of service membership. However,
as the inverse N-shaped pattern for monthly discount rates indicates,
hyperbolic discounting alone cannot explain the observed behavior.
Table 1 Study 1 - monthly discount rates
Gym Subscriptions
Δ Duration     Mean r (Sd)      Decrease in r
(in months)    (in %)
  3             4.49  (4.44)
  6             3.60  (2.72)      0.89 *
  9             3.84  (3.94)     -0.24
 15             2.53  (1.92)      1.31 **
 21             2.36  (1.42)      0.17
 33             5.08 (16.84)     -2.72
 45            13.44 (32.31)     -8.36 **
 57            13.19 (32.19)      0.25
 69            13.11 (32.22)      0.08
*p-value < .1 **p-value < .05
2 The detailed survey can be provided by the authors upon request.
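The "Decrease in r" column of Table 1 is the difference between successive mean rates. As a small consistency check, the sketch below recomputes that column from the reported means and locates the largest increase. It uses only the published averages, not respondent-level data, so it does not reproduce the paired t-tests.

```python
# Mean monthly discount rates (in %) from Table 1, keyed by Delta Duration.
table1_mean_r = {3: 4.49, 6: 3.60, 9: 3.84, 15: 2.53, 21: 2.36,
                 33: 5.08, 45: 13.44, 57: 13.19, 69: 13.11}


def successive_decreases(mean_r):
    """Return {delta: previous mean minus this mean} for every row after the first."""
    deltas = list(mean_r)
    return {cur: round(mean_r[prev] - mean_r[cur], 2)
            for prev, cur in zip(deltas, deltas[1:])}


decreases = successive_decreases(table1_mean_r)
largest_increase_at = min(decreases, key=decreases.get)  # most negative decrease

print(decreases)
print("largest increase in r at Delta Duration =", largest_increase_at, "months")
```

The most negative entry corresponds to the step to a Δ Duration of 45 months, that is, the first offered contract longer than 36 months.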
Our first study has some limitations. First, it can be argued that our use of a
sample of undergraduate and masters level students may have biased the results
as, being college students, they inherently have limited need for membership
services for a long duration.3 Second, it is conceivable that our use of 3 months
as the duration of the baseline membership plan may have affected the results.
Third, it is unclear what are the underlying drivers of the observed discounting
pattern. We address these concerns in the following study.
3 For the students who participated in the study, the average time remaining to graduate was 2 years.
2.2 Study 2
Our second study is broadly designed to be similar to the first one and addresses
its limitations. We expect to find evidence for the inverse N-shaped discounting
pattern similar to that from the first study.
To hypothesize on the underlying drivers of the observed discounting pattern,
we consider past research on intertemporal preferences and reference points.
Such research suggests that differing reference points used to evaluate alternatives
can significantly alter the choice among them (Kahneman 1992, Loewenstein
1988, Ordonez et al. 2000). In the current context, respondents may use
two reference points for making their WTP decisions for each offered subscription:
(1) the baseline duration provided in the survey and (2) the maximum contract
duration that they typically subscribe to a service (termed the “critical”
duration). The latter reflects how much consumers value flexibility - the shorter
(longer) is this duration, the more (less) they value flexibility. We propose that
respondents’ critical duration should be a significant predictor of the timing of
increase in their monthly discount rate.
Method
Forty-nine professionals (Executive MBAs) participated in the study. They were
randomly assigned to one of two versions of the survey. The two surveys had
different services – health club and online-video-rental service. We chose two
different services to explore whether there were any differences in discounting
behavior between a more common subscription service, such as a health club
membership, and a more innovative one such as online-video-rental service. As
noted earlier, the design for this study is similar to that in study 1, with the dif-
ference being that we considered several durations for the baseline subscription
tariff, e.g. 3, 6 or 12 months. The monthly fee for the baseline tariff was kept
constant at $100. For each baseline tariff, the respondents were asked for their
WTP to switch to alternative tariffs. Each participant had to answer twenty such
questions and the order of questions was counterbalanced across respondents.
At the end of the survey, we asked respondents to state their critical duration for
the offered service.
Table 2 Study 2 - monthly discount rates
               Gym Subscriptions                 Online-Video-Rental Service
Δ Duration     Mean r (Sd)     Decrease in r     Mean r (Sd)     Decrease in r
(in months)    (in %)                            (in %)
  3            4.04 (0.98)                       9.43 (6.26)
  6            2.87 (0.92)      1.16 ***         7.30 (4.29)      2.13 ***
  9            1.98 (1.09)      0.89 ***         5.47 (2.63)      1.83 **
 12            2.58 (0.90)     -0.59 **          6.09 (3.05)     -0.62 +
 15            2.07 (0.88)      0.51 ***         5.22 (2.81)      0.86 ***
 18            2.10 (0.59)     -0.03             4.71 (2.51)      0.51 **
 24            1.95 (0.78)      0.16 *           3.85 (1.78)      0.86 ***
 30            1.77 (0.74)      0.17 ***         3.63 (1.80)      0.22 **
 36            1.80 (0.69)     -0.02             3.29 (1.18)      0.34 **
 48            1.51 (0.59)      0.29 ***         2.70 (0.80)      0.59 ***
*p-value < .1 **p-value < .05 ***p-value < .01 +p-value = .13
Results
For each respondent, we calculate the monthly discount rate for every offered
membership duration. Table 2 shows the discount rates averaged across all
baseline membership durations and respondents.4 As before, we determine the
underlying discount pattern for the two services using paired t-tests. The re-
sults corroborate those from study 1 – we observe an inverse N-shaped pattern
in the monthly discount rates for the health club membership as well as for the
online-video-rental service. For the latter, the monthly discount rate declines
until the membership duration exceeds 15 months (ΔDuration = 12 months).
We then find a small increase in the monthly discount rate (p < 0.13). There-
after, we obtain the declining pattern. Note that consistent with past research
(Kalish 1985, Mukherjee and Hoyer 2001), the discount rates for the online-
video-rental service (a more innovative product) are higher than those for a more
common service such as health club memberships (p < 0.001). A probable rea-
son is that, for an innovative service, consumers are more uncertain about their
future use of service and require much larger discounts to make it attractive for
them to switch to longer membership durations. Later, we discuss the issue of
uncertainty in future usage in greater detail. Figure 1 summarizes these patterns
graphically. Next, we use the critical duration from each respondent and investigate
its relationship with the timing of increase in their monthly discount rates.
For the health club membership, the average critical duration was 23.28 months
while for the online-video-rental service, it was 15.88 months. Consistent with
intuition that consumers will be less likely to subscribe to longer durations for
a more innovative service, the latter is significantly smaller than the former
(p < 0.001). We find that for 64.60% (64.71%) of respondents, the increase in
their monthly discount rates for health club memberships (online-video-rental
service) either coincides with their self-stated critical duration or is the previous
shorter or next longer membership duration (health club: $\chi^2_{df=1} = 9.64$,
p-value < .01; online-video-rental service: $\chi^2_{df=1} = 10.29$, p-value < .01). This
provides confirming evidence that consumers’ critical duration is a significant
driver for determining when their discount rate will increase.
Figure 1 Study 2 - observed pattern of discounting
[Line plot of the monthly discount rate r (axis from 2 % to 10 %) against Δ Duration in months (axis from 10 to 40) for Gym Subscriptions and Online-Video-Rental Service.]
4 We average the data across all baseline durations as the pattern of discounting was very similar. The discount
rates for each baseline duration are available from the authors upon request.
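The coincidence rule used above (the increase occurs at the critical duration or at the adjacent shorter or longer offered duration) can be sketched as follows. The grid of offered durations, the respondent data, and the rule for mapping an off-grid critical duration to its nearest offered duration are all assumptions made for illustration; the text does not describe these details, and the chi-square test is not reproduced here.

```python
# Hypothetical grid of offered membership durations (months).
OFFERED = [6, 9, 12, 15, 18, 24, 30, 36, 48]


def nearest_index(grid, value):
    """Index of the grid point closest to value (ties go to the shorter duration)."""
    return min(range(len(grid)), key=lambda i: (abs(grid[i] - value), grid[i]))


def coincides(grid, increase_at, critical):
    """True if the duration where r rises is the critical duration's grid
    point or one of its immediate neighbours on the grid."""
    return abs(grid.index(increase_at) - nearest_index(grid, critical)) <= 1


# Hypothetical respondents: (duration where r increases, stated critical duration).
respondents = [(24, 24), (18, 24), (36, 12), (12, 14), (48, 24)]

hits = sum(coincides(OFFERED, inc, crit) for inc, crit in respondents)
share = 100 * hits / len(respondents)
print(f"{hits} of {len(respondents)} respondents match ({share:.1f} %)")
```

Treating adjacent durations as matches allows for the coarseness of the offered grid, since a respondent's critical duration need not be one of the offered contract lengths.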
2.3 Study 3
We performed a third study to emphasize the robustness of the inverse N-shaped
discounting pattern and test its relationship with respondents’ critical duration.
In this study, we replicate Study 2 with membership plans to a health club as a
context.
Method
This study was carried out as a web survey with fifty-five respondents. The
sample consisted of senior research associates. As in study 2, the last ques-
tion asked respondents to state their critical duration. In this study, we used six
months as baseline membership duration. The monthly fee was set at $100 per
month. Similar to the calculation in the first two studies, we determined the
monthly discount rates for all individuals and every offered membership dura-
tion (9, 12, 18, 24, 36 and 48 months).
Results
Table 3 contains the discount rates. Our finding of the inverse N-shaped pattern
in monthly discount rates is robust (see also Figure 2). This reaffirms the im-
portance of both hyperbolic discounting and customers’ valuation of flexibility
to understand how they discount future benefits from a subscription-based ser-
vice. The critical duration from respondents has an average of 25.42 months
with a standard deviation of 12.38 months. From Table 3, we note that the significant
increase in the monthly discount rates occurs around a duration of 18
months (ΔDuration = 12 months) and falls within one standard deviation
of the average. We also find that for 83.64% of respondents, the increase
in their monthly discount rates for health club memberships either coincides
with their critical duration or is the previous shorter or next longer membership
duration ($\chi^2_{df=1} = 24.89$, p-value < .01). This strengthens our earlier finding
that consumers’ critical duration is an important factor for determining when
their discount rate will increase.
Table 3 Study 3 - monthly discount rates for health club subscriptions
Gym Subscriptions
Δ Duration     Mean r (Sd)       Decrease in r
(in months)    (in %)
  3            2.036 (2.055)
  6            1.876 (1.668)      0.160 +
 12            2.200 (1.592)     -0.324 **
 18            1.979 (1.219)      0.221 **
 30            1.888 (0.999)      0.091
 42            1.691 (0.801)      0.197 **
**p-value < .05, +p-value = .13
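The statement that the increase falls within one standard deviation of the average critical duration can be verified from the reported figures. The sketch below uses the Table 3 means, the six-month baseline, and the reported mean and standard deviation of the critical duration; the variable names are illustrative.

```python
BASELINE_MONTHS = 6       # baseline membership duration in Study 3
MEAN_CRITICAL = 25.42     # months, reported average critical duration
SD_CRITICAL = 12.38       # months, reported standard deviation

# Mean monthly discount rates (in %) from Table 3, keyed by Delta Duration.
table3_mean_r = {3: 2.036, 6: 1.876, 12: 2.200, 18: 1.979, 30: 1.888, 42: 1.691}

deltas = list(table3_mean_r)
# First Delta Duration whose mean rate is higher than the one before it.
rise_delta = next(cur for prev, cur in zip(deltas, deltas[1:])
                  if table3_mean_r[cur] > table3_mean_r[prev])
rise_duration = BASELINE_MONTHS + rise_delta   # contract length in months

lower = MEAN_CRITICAL - SD_CRITICAL
upper = MEAN_CRITICAL + SD_CRITICAL
inside = lower <= rise_duration <= upper

print(f"rate rises at a contract duration of {rise_duration} months")
print(f"one-sd band: [{lower:.2f}, {upper:.2f}] months, contains it: {inside}")
```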
2.4 Discussion of experimental results
The results from all three studies suggest that customers experience a loss of
flexibility with long contract durations and it is reflected in how they discount
future benefits from a service. Our finding of how consumers’ critical duration
impacts their discounting pattern is consistent with past research on how con-
sumers may use multiple reference points while making intertemporal choices
(Kahneman 1992, Loewenstein 1988, Ordonez et al. 2000). In our context, con-
sumers may use (a) the baseline duration that is provided in the survey and (b)
the critical duration as two reference durations for making their decisions. The
observed inverse N-shaped monthly discounting pattern can arise from the in-
terplay between these two reference durations with the first (second) region of
decreasing monthly discount rate arising from the comparison of each offered
duration with the baseline (critical) duration. The effect of consumers’ critical
duration on their discounting pattern may also be due to consumers’ uncertainty
in future use of service. It is likely that as consumers evaluate subscriptions with
durations greater than their critical duration, they are more uncertain about their
future usage and hence require much larger discounts for these subscriptions
to be attractive. This is consistent with past work of Jones and Ostroy (1984)
which notes that the more uncertain consumers are about their future beliefs
(e.g., future use of service), the more flexible they would like to be (e.g., use a
shorter contract length).
Figure 2 Study 3 - discounting behavior
[Figure: monthly discount rate r (in %) plotted against contract duration (in months).]
Thus far, our investigation has documented an inverse N-shaped discount
pattern for subscription-based services and showed that this pattern emerges
largely due to consumers’ critical duration. The remainder of the paper is fo-
cused towards showing the implications of this finding for managerial decision
making, in particular for optimal pricing for subscriptions. To do so, we begin
by describing a model of how consumers choose among subscriptions. There-
after, we use the model for determining optimal prices for subscriptions.
3. Model for consumer choice among subscriptions
In this section we present a managerial application of our empirical findings.
We begin with a model for consumer choice among subscriptions. The model
formalizes how consumers’ discounting pattern affects their willingness-to-pay
for subscriptions. Thereafter, we parameterize the observed discounting pattern
and incorporate it in the model.
3.1 Utility model
Consider a single firm offering a product or service based on J subscription
plans. Each alternative j (j = 1, . . . , J) is described in terms of length of time
a customer can access the service and an initial, one-time flat fee (e.g., a health
club membership for 6 months for a one-time fee of $600). For an alternative
j, let the membership duration be Tj . We assume that the utility consumer i
associates with product j, vij(Tj), increases with duration. In addition, we set
the utility of zero duration to zero. This means that the consumer derives no
utility if s/he does not subscribe to the service (i.e., vij(0) = 0). We formulate a
discounted utility type specification (Samuelson 1937, Koopmans 1960) where
the utility from the alternative j depends on the duration that a consumer may
access the service. That is:
$$ v_{ij}(T_j) = \int_{0}^{T_j} \nu_i \, \delta_i(t) \, dt, \tag{2} $$
where νi ≥ 0 is the utility consumer i derives from consuming service j for
a unit time interval and δi(t) is the consumer-level discount factor for future
utility from service, which may be a function of time. The utility function in
equation (2) is the discounted utility from accessing service j for duration Tj .
Note that the utility function in equation (2) reduces to vij(Tj) = νi · Tj when
consumers do not discount future utility (i.e., δi(t) = 1.0).
We assume that consumer i (i = 1, . . . , I) cannot choose more than one
alternative. Let p(Tj) be the price associated with Tj duration of service j.
Consistent with economic theory, we assume that there is an individual-specific
composite (outside) good with unit price pyi and that consumer i has a budget yi.
A consumer can spend the entire budget on the composite good, or spend some
of it on the composite good and the rest to buy one of the J choice options (e.g.,
service j with Tj duration). Let zij denote the number of units of the composite
good.
Let uij(Tj, zij) represent the utility consumer i obtains from Tj duration of
service j and zij units of the composite good. We assume that the consumer
maximizes his or her utility, subject to a budget constraint p(Tj) + zijpyi = yi.
Without loss of generality, we normalize the price of the composite good to
unity, i.e., pyi = 1. Hence the number of units of the composite good is given
by zij = yi − p(Tj). We specify the following quasilinear utility function for
consumer i:
uij(Tj, zij) = vij(Tj) + βi(yi − p(Tj)), (3)
where vij(Tj) is specified in equation (2) and βi > 0 is the income effect or
price sensitivity.
Let j = 0 denote the no-choice option. Then the utility of allocating the
whole budget to the composite good (i.e., no-choice) for consumer i reduces to
ui0(0, yi) = βiyi since vi0(0) = 0. Thus a utility maximizing consumer would
choose alternative j if it has the maximum utility {uij > uik, k = 0, . . . , J, k ≠ j}
and would choose none of the alternatives if the no-choice option (j = 0) has
the maximum utility {ui0 > uij, j = 1, . . . , J}. Note that in a choice context,
the term βiyi is irrelevant to the choice decision since it is a consumer-specific
constant across alternatives. Consequently, the utility of the no-choice option is
set to zero.
Our interest is in determining consumers’ willingness-to-pay for a given du-
ration of service. When the utility function is quasilinear, utility maximization
is equivalent to surplus maximization (Jedidi and Zhang 2002). Thus dividing
uij(Tj, zij) in equation (3) by the price coefficient βi gives the consumer surplus
function:
$$ s_{ij}(T_j) = \frac{v_{ij}(T_j)}{\beta_i} - p(T_j) = \theta_i \int_{0}^{T_j} \delta_i(t) \, dt - p(T_j), \tag{4} $$
where sij(Tj) is the surplus (WTP − price) that consumer i derives from choosing
alternative j and θi = νi/βi is consumer i’s WTP for a unit time interval of the
service with no discounting.
The first term on the right-hand side of equation (4) represents the WTP function,
which describes the maximum price a consumer is willing to pay for a given
duration of service j (see footnote 5). This function is given by:
$$ WTP_{ij}(T_j) = \theta_i \int_{0}^{T_j} \delta_i(t) \, dt. \tag{5} $$
5 WTP or reservation price is the price point that equates the utility of consuming Tj units of service j to the
no-choice utility, which we set to zero (see Jedidi and Zhang (2002)).
Figure 3 WTP function with different types of discounting
[Figure: WTP ($, 400 to 4800) plotted against duration t (6 to 48 months) for six discount functions:]
(1) δ(t) = δ^t with δ = 1.00
(2) δ(t) = δ^t with δ = 0.80
(3) δ(t) = β · δ^t with β = 0.90 and δ = 0.90
(4) δ(t) = β · δ^t with β = 0.85 and δ = 0.70
(5) δ(t) = (1 + α · t)^(−β/α) with α = 0.50 and β = 0.060
(6) δ(t) = (1 + α · t)^(−β/α) with α = 0.03 and β = 0.025
Let θi = 100 per-month for alternative j. Figure 3 depicts the shape of the
WTP function for different types of discount functions (i.e., different discount
factors). The WTP function is linear when δi = 1. As the figure suggests,
when the discount factor has other types of variation with duration (i.e., δi(t)),
the WTP function takes various shapes. To accurately determine the WTP for
a subscription plan, and hence its optimal pricing, it is then important to cor-
rectly capture the underlying discount pattern. To this end, we describe how we
parameterize the discount function observed in the experiments.
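To make the link between the discount function and the shape of the WTP curve concrete, the following is a minimal sketch that evaluates equation (5) numerically for the six discount functions listed in the legend of Figure 3, with θi = 100. The function names (`wtp`, `exponential`, `quasi_hyperbolic`, `generalized_hyperbolic`) and the midpoint-rule integration are illustrative assumptions, not part of the original analysis.

```python
import math

def wtp(theta, discount, duration, steps=10_000):
    """Approximate equation (5), theta * integral_0^duration discount(t) dt,
    with the midpoint rule (an assumed numerical scheme)."""
    h = duration / steps
    return theta * h * sum(discount((k + 0.5) * h) for k in range(steps))

def exponential(delta):
    # delta(t) = delta**t
    return lambda t: delta ** t

def quasi_hyperbolic(beta, delta):
    # delta(t) = beta * delta**t, as written in the Figure 3 legend
    return lambda t: beta * delta ** t

def generalized_hyperbolic(alpha, beta):
    # delta(t) = (1 + alpha*t)**(-beta/alpha), equation (6)
    return lambda t: (1.0 + alpha * t) ** (-beta / alpha)

THETA = 100.0  # WTP per month without discounting, as in the text

CURVES = {
    "(1)": exponential(1.00),
    "(2)": exponential(0.80),
    "(3)": quasi_hyperbolic(0.90, 0.90),
    "(4)": quasi_hyperbolic(0.85, 0.70),
    "(5)": generalized_hyperbolic(0.50, 0.060),
    "(6)": generalized_hyperbolic(0.03, 0.025),
}

# Print WTP at a few durations for each discount function.
for label, discount in CURVES.items():
    values = [round(wtp(THETA, discount, T), 1) for T in (6, 12, 24, 48)]
    print(label, values)
```

With no discounting, curve (1), the sketch reproduces the linear case WTP = θi · Tj; every other curve lies below that line and flattens at a rate set by its parameters.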
3.2 Augmented discount function
We begin with a description of the generalized hyperbolic function,
$$ \delta_i(t) = (1 + \alpha_i t)^{-\beta_i/\alpha_i}, \quad \alpha_i, \beta_i > 0, \tag{6} $$
which was proposed by Loewenstein and Prelec (1992). In this function, the
coefficient αi captures the divergence from exponential discounting. As the
coefficient αi goes to 0, the discount function becomes an exponential function
with a parameter βi, i.e., δi(t) = exp(−βit). When the coefficient αi becomes
large, the discount function becomes a step function. Note that a hyperbolic
function imposes that the discount pattern is decreasing with duration.
Figure 4 Schematic diagram for discounting behavior
[Figure: discount rate r (2% to 8%) plotted against contract duration, with regions labeled Considered Durations, Critical Durations and Unconsidered Durations.]
Our studies provide evidence for an inverse N-shaped discounting behavior,
which is schematically illustrated in Figure 4. Such a pattern cannot be cap-
tured by just using a hyperbolic function. To describe this pattern, consider the
exponential discount function, namely,
$$ \delta_i(t) = \exp(-r_i t), \tag{7} $$
where ri is the constant discount rate. As discussed in past research (Zauberman
et al. 2009), a hyperbolic discount function can be expressed in terms of the
exponential discount function in the following manner.
$$ \delta_i(t+\Delta t) = \exp(-(r_i - \Delta r_i \Delta t)(t+\Delta t)), \quad \text{where } \Delta r_i \Delta t > 0. \tag{8} $$
For consumer i, let $t_i^{crit.}$ be the critical duration. For parsimony, we assume
that the loss of flexibility affects the discount rates only through a shift, i.e.,
a positive shock during the time period that exceeds the critical value. After
the shock, the discount scheme is again consistent with hyperbolic discounting.
Figure 5 Characterization of shift in discount rates
[Figure: discount rate r (2% to 8%) plotted against contract duration, with regions labeled Considered Durations, Mean Critical Duration and Unconsidered Durations.]
Thus, when $t > t_i^{crit.}$, there is a shift in the discount rates (see footnote 6). Figure 5 shows our
characterization of this shift in the discount rates. Given this assumption, for
$t + \Delta t > t_i^{crit.}$, we obtain the following discount function:
$$ \delta_i(t+\Delta t) = \exp\bigl(-(r_i - \Delta r_i \Delta t)\, t_i^{crit.} - (r_i - \Delta r_i \Delta t + r_i^{crit.}) \cdot 1 - (r_i - \Delta r_i \Delta t)(t + \Delta t - t_i^{crit.} - 1)\bigr), \tag{9} $$
with $t + \Delta t > t_i^{crit.}$ and $r_i^{crit.} > \Delta r_i \Delta t$. A simple algebraic manipulation leads
to the following:
$$ \delta_i(t+\Delta t) = \exp(-(r_i - \Delta r_i \Delta t)(t+\Delta t)) \cdot \exp(-r_i^{crit.}), \tag{10} $$
with $t + \Delta t > t_i^{crit.}$ and $r_i^{crit.} > \Delta r_i \Delta t$. Note that the first term,
$\exp(-(r_i - \Delta r_i \Delta t)(t+\Delta t))$, implies hyperbolic discounting (Zauberman et al. 2009) and we
can replace the exponential discount function with the generalized hyperbolic
discount function. Thus, we get
$$ \delta_i(t) = f_i^{I(t > t_i^{crit.})} \cdot (1 + \alpha_i t)^{-\beta_i/\alpha_i}, \tag{11} $$
6 It is possible that after the critical duration, the hyperbolic discounting pattern has a different slope as well.
For parsimony, we capture the loss of flexibility only by a shift in the discount pattern. Such a parsimonious
specification requires the estimation of only one additional parameter. As the results of model estimation later
show, this specification fits the empirical discounting pattern well.
with I the indicator function, such that
$$ I(t > t_i^{crit.}) = \begin{cases} 1, & \text{if } t > t_i^{crit.} \\ 0, & \text{else} \end{cases} $$
and $f_i = \exp(-r_i^{crit.})$. We denote $f_i$ as the flexibility parameter.
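The algebraic step from equation (9) to equation (10) can be spelled out as follows. This is a sketch in which ρ is shorthand, introduced here only for illustration, for the rate term $(r_i - \Delta r_i \Delta t)$ that appears in equation (8). Collecting the terms of the exponent in equation (9) that multiply ρ gives:

```latex
\begin{align*}
-\rho\, t_i^{crit.} - (\rho + r_i^{crit.}) \cdot 1 - \rho\,(t + \Delta t - t_i^{crit.} - 1)
  &= -\rho\,\bigl[\, t_i^{crit.} + 1 + (t + \Delta t - t_i^{crit.} - 1) \,\bigr] - r_i^{crit.} \\
  &= -\rho\,(t + \Delta t) - r_i^{crit.}.
\end{align*}
```

Exponentiating both sides yields $\exp(-\rho\,(t+\Delta t)) \cdot \exp(-r_i^{crit.})$, which is equation (10); the one-period shock $r_i^{crit.}$ therefore survives only as the constant multiplicative factor $f_i = \exp(-r_i^{crit.})$.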
To summarize, we have parsimoniously incorporated the impact of critical
duration on the discount function as a multiplicative factor. For a consumer,
this multiplicative factor becomes applicable when the offered membership du-
ration exceeds their critical duration. We refer to our parameterized form as the
augmented discount function. Next, we include this discount function in the
WTP expression.
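The augmented discount function of equation (11) can be sketched in a few lines. The function names and the parameter values in the usage loop are illustrative assumptions; the values are chosen near the posterior means reported later in Table 4, with a hypothetical critical duration of 24 months.

```python
def generalized_hyperbolic(t, alpha, beta):
    """Equation (6): (1 + alpha*t) ** (-beta/alpha), with alpha, beta > 0."""
    return (1.0 + alpha * t) ** (-beta / alpha)

def augmented_discount(t, alpha, beta, f, t_crit):
    """Equation (11): hyperbolic discounting, multiplied by the flexibility
    parameter f only when t exceeds the critical duration t_crit."""
    shift = f if t > t_crit else 1.0
    return shift * generalized_hyperbolic(t, alpha, beta)

# Illustrative values only (hypothetical consumer).
ALPHA, BETA, F, T_CRIT = 0.0246, 0.0236, 0.9391, 24

for months in (12, 24, 25, 36):
    print(months, round(augmented_discount(months, ALPHA, BETA, F, T_CRIT), 4))
```

Up to and including the critical duration, the function coincides with the generalized hyperbolic function; beyond it, every value is scaled down by the same factor f, which is the "shift" described above.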
3.3 Willingness-to-pay model with flexibility
We can incorporate the augmented discount function in our WTP specification.
Thus, for consumer i and service j with duration Tj , we obtain:
$$ WTP_{ij}(T_j) = \theta_i \int_{0}^{T_j} f_i^{I(t > t_i^{crit.})} \cdot (1 + \alpha_i t)^{-\beta_i/\alpha_i} \, dt. \tag{12} $$
This completes the description of our model, which we denote as the "WTP model
with flexibility". When consumers’ preference for flexibility is not included
in the discount function, their discount pattern of future benefits may be
captured by a generalized hyperbolic function. We denote this alternative model
as "WTP model without flexibility". Next, we use the models to determine opti-
mal prices / durations for a menu of subscription plans.
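Equation (12) has a closed form because the generalized hyperbolic function has an elementary antiderivative. The sketch below is an illustration under that observation, not the authors' implementation; the helper names are hypothetical. It splits the integral at the critical duration and applies the flexibility parameter only to the part beyond it. The model without flexibility is obtained by setting f = 1.

```python
import math

def antiderivative(t, alpha, beta):
    """Antiderivative of (1 + alpha*t) ** (-beta/alpha) with respect to t."""
    if math.isclose(alpha, beta):
        # Special case: the integrand reduces to 1 / (1 + alpha*t).
        return math.log1p(alpha * t) / alpha
    return (1.0 + alpha * t) ** (1.0 - beta / alpha) / (alpha - beta)

def wtp_with_flexibility(T, theta, alpha, beta, f, t_crit):
    """Equation (12): integrate up to t_crit without the shift and
    beyond t_crit with the multiplicative factor f."""
    cut = min(T, t_crit)
    value = antiderivative(cut, alpha, beta) - antiderivative(0.0, alpha, beta)
    if T > t_crit:
        value += f * (antiderivative(T, alpha, beta) - antiderivative(t_crit, alpha, beta))
    return theta * value

def wtp_without_flexibility(T, theta, alpha, beta):
    """Same integral with no shift (f = 1, no critical duration)."""
    return wtp_with_flexibility(T, theta, alpha, beta, f=1.0, t_crit=float("inf"))

# Illustrative values only (hypothetical consumer).
print(round(wtp_with_flexibility(36, 100.0, 0.0246, 0.0236, 0.9391, 24), 2))
print(round(wtp_without_flexibility(36, 100.0, 0.0246, 0.0236), 2))
```

For durations at or below the critical duration the two models give the same WTP; they diverge only for longer contracts.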
4. Optimal design of membership plans
In this section, we discuss the optimal design of membership plans. First, we
describe how the parameters of the two models, WTP model with (without)
flexibility, can be estimated. As an illustration, we use the data from Study 3
for model estimation. Next, using the estimated parameters, we determine the
optimal menu of plans from each of the two models. A comparison of the optimal
menus shows how the pattern of discounting affects both the characteristics
of the offering and firm profitability.
Table 4 Parameter estimates - posterior means and 95% posterior intervals
Parameter   Model without Flexibility    Model with Flexibility
μα          0.0233 (0.0146, 0.0332)      0.0246 (0.0160, 0.0414)
μβ          0.0280 (0.0206, 0.0291)      0.0236 (0.0193, 0.0281)
μf          -                            0.9391 (0.9105, 0.9714)
4.1 Parameter estimates
We use Markov Chain Monte Carlo (MCMC) methods to estimate the two mod-
els using the data from Study 3. Our approach follows the standard Bayesian
estimation for hierarchical models (Rossi and Allenby 2003). Please see the
appendix for details on model estimation. Table 4 provides the posterior means
of the parameters, with 95% posterior intervals in parentheses. The estimate
for μα, which corresponds to the divergence from exponential discounting, is
marginally higher in the model with flexibility as compared to that in the model
without flexibility. The estimate of μβ in the latter model is higher than in the
former. This can be explained by the higher discount rates for longer durations,
which are captured in the model with flexibility by inclusion of a parameter
for flexibility. The average flexibility preference (μf ) is 0.9391, which sug-
gests that, on average, the positive shock in the discount rate for any member-
ship duration that exceeds a consumer’s critical duration is around 6% (rcrit. =
−Ln(0.9391) = 0.0628). In other words, on average, consumers require an
extra 6% discount on the monthly fee for any subscriptions that exceed their
critical duration. Finally, from the consumer-level estimates, we find that across
consumers the flexibility factor (f) ranges from 0.92 to 0.96. This suggests that
consumers vary in their requirements of additional monthly discounts ranging
from around 4% (= −Ln(0.96)) to 8% (= −Ln(0.92)) (see footnote 7).
Figure 6 The predicted discount functions (for six randomly chosen individuals)
[Figure, panel (a) Model without Flexibility: discount function δ(t) (0.4 to 1.0) against duration t (6 to 48 months) for individuals (1) to (6).]
[Figure, panel (b) Model with Flexibility: augmented discount function δ(t) (0.4 to 1.0) against duration t (6 to 48 months) for individuals (1) to (6).]
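The conversion between the flexibility parameter and the shock in the discount rate follows from inverting f = exp(−rcrit.). A minimal sketch, with a hypothetical function name, that recomputes the three figures quoted in the text:

```python
import math

def shock_from_flexibility(f):
    """Invert f = exp(-r_crit) to recover the rate shock r_crit = -ln(f)."""
    return -math.log(f)

# Lower bound, posterior mean and upper bound of f reported in the text.
for f in (0.92, 0.9391, 0.96):
    print(f, round(shock_from_flexibility(f), 4))
```

The values round to roughly 0.08, 0.06 and 0.04, matching the 8%, 6% and 4% additional monthly discounts stated above.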
To assess goodness of fit of the two models, we calculate the mean squared
error (MSE) for both models based on a comparison of the model-predicted
discount pattern with the observed discount rate for our sample. For the model
without flexibility, the MSE is 5.8 · 10^−4 while for the model with flexibility, it
is 3.0 · 10^−4. Consistent with expectations, the MSE comparison suggests that
the latter model better explains the pattern in the monthly discount rates.
We also contrast the individual-level discount functions (δi(t)) predicted by
the two models. For this comparison, we randomly choose six individuals (from
fifty-five respondents) and predict their discount function using both the model
with and without flexibility. Figure 6 shows the predicted discount functions
for the six individuals, labeled from 1...6, with the left (right) panel based on
the model without (with) flexibility. Recall that as the model without flexibility
assumes hyperbolic discounting, the discount function decreases with duration.
In the right panel, there are three points to note. First, for each individual,
there is a drop in the discount function (i.e., an increase in the monthly discount
rate) when the membership duration is greater than their critical duration. For
instance, person 3 has a critical duration of 24 months and we see a drop in
7 We also explored whether there was any relationship between individual demographics (age, gender) and the
required monthly discounts. We found that age had no effect and women require marginally higher discounts than
men (p < 0.1).
their discount function thereafter. Second, as individuals vary in their critical
duration, the drop in their discount function occurs at different time durations.
As an example, person 3(5) has a critical duration of 24(36) months. Finally, the
magnitude of the drop in the discount function varies across individuals, e.g.,
person 2(5) has a much bigger (smaller) drop in their discount function. In sum,
our comparison indicates that the individual-level discount functions based on
the two models are different. Clearly such differences across the two models
will impact the predicted optimal menu of plans, which is discussed next.
4.2 Optimal tariff structure
For this illustration, we assume that the menu consists of two membership
plans, each with a specific contract period and an initial, one-time, flat fee (see footnote 8).
Let t represent the period of time of a particular subscription plan and p be
the fee associated with the plan. We define t1 (t2) as the “shorter” (“longer”)
contract period and p1 (p2) as its price. The key decision variables for the
firm are the contract periods t1 and t2 and the related prices p1 and p2. For
profit calculations, we assume that for a given membership duration (t), the
cost per customer, c(t), is an increasing function of the membership duration,
i.e., c(t) = c · t. Thus, the variable cost (c) to produce the product or service
per time unit is constant for all durations. Such a simple assumption for
the cost function is useful as it allows us to focus our analysis on the impact of
consumers’ discounting patterns on the optimal design of subscription plans. It
is straightforward to incorporate other types of cost functions within our model
framework.
Note that our model with flexibility has four individual-level parameters,
namely, αi, βi, fi and θi. The first three parameters are estimated using the
pattern of discounting from the survey. The remaining individual-level param-
eter, θi, is estimated in the following manner. As described earlier, we told the
respondents that they were willing to pay a maximum of $100 per-month for a
six months baseline plan, i.e., a total price of $600 for the membership.
8 The number of offered plans is clearly an important managerial decision in the context of designing product
lines (Iyengar and Lepper 2000, Lim and Ho 2007). Our focus on how consumers may choose only among two
offered plans helps sharpen our investigation of the impact of intertemporal discounting on plan choice and optimal
pricing.
Table 5 Optimal tariff structure
                               Model without Flexibility   Model with Flexibility
Optimal Duration t1            8                           4
Optimal Duration t2            12                          16
Price p1 per month             93.97                       107.14
Price p2 per month             85.77                       81.24
Expected Profit per customer   1012.24                     1268.20
Discount (t1 → t2)             8.71 %                      24.17 %
Consequently, using the WTP model and imposing Tj = 6, this assumption has to be
consistent with the following:
$$ \theta_i \int_{0}^{6} f_i^{I(t > t_i^{crit.})} (1 + \alpha_i t)^{-\beta_i/\alpha_i} \, dt - 600 = 0 $$
For each consumer i, the parameter θi can be estimated by solving this equation.
The estimation procedure for parameter θi within the model without flexibility is
very similar with the sole difference being that there is no flexibility parameter
(fi) to be estimated from the survey. If the health club offers two membership
plans then, to maximize profits, the optimal levels of the fee and membership
duration for each plan have to be determined. To do so, we perform the fol-
lowing grid search. For each of the two offered plans, we vary the duration in
increments of 1 month from 1 to 48 months. We also vary the prices of the
plans from $45 to $125 (see footnote 9). From each menu of two plans, individuals choose
the tariff that provides the higher positive surplus. If neither of the two plans gives
a surplus greater than zero, then a customer will not subscribe to the service.
We assume a $1 monthly variable cost per-customer (i.e., c=$1) and fixed cost
as zero. The optimal prices are determined by maximizing the sum of profits
over all respondents in our sample using a simulated annealing optimization
algorithm. Table 5 displays the results for the optimal menu of plans with the
highest expected contribution per-customer from the WTP model with/without
flexibility. The model without flexibility gives an optimal menu of plans that
have durations of 8 and 12 months with a monthly fee of $93.97 and $85.77,
respectively. The model with flexibility gives plans with durations of 4 and 16
9 The self-stated monthly WTP from respondents falls within our chosen range of monthly price.
months with a monthly fee of $107.14 and $81.24, respectively. A comparison
of the optimal menu derived from the two models indicates that the model with
(without) flexibility predicts a shorter (longer) subscription duration for Tariff
1. This is reasonable as when consumers value flexibility, they are less willing
to subscribe to a long membership duration. Also consistent with intuition, the
model with flexibility predicts that Tariff 2 should be offered with a much larger
price discount. This quantifies the impact that the augmented discount function
has on the characteristics of the offered menu of plans.
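The menu evaluation described above (calibrate θi from the $600 baseline, let each consumer pick the plan with the highest positive surplus, sum profits) can be sketched as follows. Several things here are assumptions made for illustration: the three consumers are hypothetical rather than the survey respondents, the search is an exhaustive scan over a coarse grid rather than the simulated annealing used in the text, and all function names are invented for this example.

```python
from itertools import combinations, product

def integral_discount(T, alpha, beta, f, t_crit):
    """Closed-form integral from 0 to T of the augmented discount function
    f**I(t > t_crit) * (1 + alpha*t)**(-beta/alpha); assumes alpha != beta."""
    def G(t):  # antiderivative of the generalized hyperbolic function
        return (1.0 + alpha * t) ** (1.0 - beta / alpha) / (alpha - beta)
    head = G(min(T, t_crit)) - G(0.0)
    tail = f * (G(T) - G(t_crit)) if T > t_crit else 0.0
    return head + tail

def make_consumer(alpha, beta, f, t_crit):
    # Calibrate theta so that WTP for the 6-month baseline equals $600.
    theta = 600.0 / integral_discount(6, alpha, beta, f, t_crit)
    return {"theta": theta, "alpha": alpha, "beta": beta, "f": f, "t_crit": t_crit}

def wtp(c, T):
    return c["theta"] * integral_discount(T, c["alpha"], c["beta"], c["f"], c["t_crit"])

def menu_profit(menu, consumers, unit_cost=1.0):
    """menu is a sequence of (duration_in_months, monthly_price) plans.
    Each consumer buys the plan with the highest strictly positive surplus."""
    total = 0.0
    for c in consumers:
        best_surplus, profit = 0.0, 0.0
        for T, p in menu:
            surplus = wtp(c, T) - p * T
            if surplus > best_surplus:
                best_surplus, profit = surplus, (p - unit_cost) * T
        total += profit
    return total

# Hypothetical consumers (NOT the survey respondents).
CONSUMERS = [
    make_consumer(0.020, 0.022, 0.92, 12),
    make_consumer(0.025, 0.024, 0.94, 24),
    make_consumer(0.030, 0.026, 0.96, 36),
]

DURATIONS = range(4, 49, 4)   # coarse grid, in months
PRICES = range(45, 126, 5)    # coarse grid, in dollars per month

best_menu = max(
    (((t1, p1), (t2, p2))
     for t1, t2 in combinations(DURATIONS, 2)
     for p1, p2 in product(PRICES, repeat=2)),
    key=lambda menu: menu_profit(menu, CONSUMERS),
)
print(best_menu, round(menu_profit(best_menu, CONSUMERS) / len(CONSUMERS), 2))
```

Because the consumers and grid are invented, the resulting menu is not comparable to Table 5; the sketch only shows the mechanics of surplus-based plan choice and profit aggregation.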
To show the profit implications from ignoring the flexibility effect in the dis-
count function, we use the model with flexibility to assess the profitability of the
optimal pricing plans identified by the model without flexibility. This mimics
a scenario in which a firm may erroneously set the optimal prices without con-
sidering the flexibility effect, when, in reality, customers behave as observed in
our experiments. Thus, this analysis indicates the magnitude of profit reduction
that will ensue from using a misspecified model. We find that the firm would
make an expected profit of $1017.18 per customer. The firm would therefore
be forgoing a $251.02 (=1268.20-1017.18) profit per customer. Put differently,
the failure to account for the flexibility effect leads to about a 19% (=[1268.20-
1017.18]/1268.20) reduction in the firm’s profit.
5. Conclusion
Subscriptions are a pricing strategy often used by business-to-consumer companies.
A common type of subscription is characterized by a length of time
(e.g., a month or a year) a customer has access to a service and a corresponding
one-time flat fee for its unlimited use. A key aspect of such pricing is that the
price per-time unit declines with a longer subscription period. Given such plans,
customers face a tradeoff in their choice among them. With a choice of short
membership duration, customers incur a high average price per unit of time
but retain their flexibility to either drop the service altogether or switch service
providers. Upon choosing a subscription with a long duration, customers lose
their flexibility but benefit from price discounts. The critical information for
designing such plans is how consumers trade off their cost, future benefits from
a service and their valuation of flexibility.
In this paper, we explore the relationship between consumers’ discounting
of future benefits and the pricing of subscriptions. To this end, we conduct
several experiments. Across all studies, we find that consumers’ discount rate
has an inverse N-shaped pattern with respect to membership duration:
the discount rate initially decreases, then increases after a certain length of
membership, and finally reverts to a decreasing pattern. We also propose
and find confirming evidence that a significant predictor of when the discount
rate increases is the maximum contract duration for which consumers
typically subscribe to a service. The latter is a measure of how much consumers
value flexibility and our finding indicates that consumers require large price dis-
counts for choosing plans with membership duration exceeding their maximum
considered service length.
To draw managerially relevant implications, we parsimoniously specify the
inverse N-shaped discount function and incorporate it in a consumer-level model
for subscription choice. Using data from an experiment, we estimate individual-
level parameters of the model. We compare the optimal menu of two subscrip-
tion plans determined by our model to those from a model that assumes only
hyperbolic discounting. There are two key findings. First, we find that the menu
based on the model with the observed discounting pattern contains plans with
larger price discounts than those in the menu from a model with the hyperbolic
discounting scheme. Second, in terms of profitability, the failure to account for
the inverse N-shaped discounting pattern leads to a reduction of 19% in firm
profit.
In this paper, we empirically investigated the discounting behavior of con-
sumers in the context of subscriptions. Further research could examine the im-
pact of usage uncertainty and contract obligations for any product and/or service
on the observed discounting behavior. Another interesting area of future work
would be to explore the relationship between multiple reference points and the
discounting pattern. From a modeling perspective, we examined the design of
plans assuming that the service provider is a monopoly that offers a menu of
two plans. Future research could generalize our investigation by considering
the impact of competition on tariff design as well as the design of an offered
menu with more than two tariffs. Finally, we restricted our focus to the initial
purchase of a subscription. Future research may consider issues related to re-
purchase of plans, customer retention and actual usage in consumers’ choice of
tariffs. We hope this paper encourages work in these and related directions.
References
Ariely, D., G. Loewenstein. 2000. When does duration matter in judgment and decision making? Journal of Experimental Psychology: General 129(4) 508–523.
Ariely, D., G. Zauberman. 2000. On the making of an experience: The effects of breaking and combining experiences on their overall evaluation. Journal of Behavioral Decision Making 13(2) 219–232.
Berns, G. S., D. Laibson, G. Loewenstein. 2007. Intertemporal choice - toward an integrative framework. Trends in Cognitive Sciences 11(11) 482–488.
Danaher, P. J. 2002. Optimal pricing of new subscription services: Analysis of a market experiment. Marketing Science 21(2) 119–138.
DellaVigna, S., U. Malmendier. 2006. Paying not to go to the gym. American Economic Review 96(3) 694–719.
Dolan, R. J. 1987. Quantity discounts: Managerial issues and research opportunities. Marketing Science 6(1) 1–22.
Essegaier, S., S. Gupta, Z. J. Zhang. 2002. Pricing access services. Marketing Science 21(2) 139–159.
Frederick, S., G. Loewenstein, T. O’Donoghue. 2002. Time discounting and time preference: A critical review. Journal of Economic Literature 40(2) 351–401.
Gourville, J. T., D. Soman. 2002. Pricing and the psychology of consumption. Harvard Business Review 80(9) 90–96.
Iyengar, R., A. Ansari, S. Gupta. 2007. A model of consumer learning for service quality and usage. Journal of Marketing Research 44(4) 529–544.
Iyengar, R., K. Jedidi, R. Kohli. 2008. A conjoint approach to multi-part pricing. Journal of Marketing Research 45(2) 195–210.
Iyengar, S. S., M. R. Lepper. 2000. When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology 96(6) 995–1006.
Jedidi, K., Z. J. Zhang. 2002. Augmenting conjoint analysis to estimate consumer reservation price. Management Science 48(10) 1350–1368.
Jones, R. A., J. M. Ostroy. 1984. Flexibility and uncertainty. The Review of Economic Studies 51(1) 13–32.
Kahneman, D. 1992. Reference points, anchors, norms, and mixed feelings. Organizational Behavior And Human Decision Processes 51(2) 296–312.
Kalish, S. 1985. A new product adoption model with price, advertising, and uncertainty. Management Science 31(12) 1569–1585.
Koopmans, T. C. 1960. Stationary ordinal utility and impatience. Econometrica 28(2) 287–309.
Laibson, D. 1997. Golden eggs and hyperbolic discounting. Quarterly Journal of Economics 112(2) 443–477.
LeBoeuf, R. A. 2006. Discount rates for time versus dates: The sensitivity of discounting to time-interval description. Journal of Marketing Research 43(1) 59–72.
Lim, N., T.-H. Ho. 2007. Designing price contracts for boundedly rational customers: Does the number of blocks matter? Marketing Science 26(3) 312–326.
Loewenstein, G. 1988. Frames of mind in intertemporal choice. Management Science 34(2) 200–214.
Loewenstein, G., D. Prelec. 1992. Anomalies in intertemporal choice: Evidence and an interpretation. Quarterly Journal of Economics 107(2) 573–597.
Miravete, E. J. 1999. Quantity discounts for taste-varying consumers. CARESS Working Papers from University of Pennsylvania Center for Analytic Research and Economics in the Social Sciences. Downloaded from http://www.econ.upenn.edu/caresspapers.
Miravete, E. J. 2009. Competing with menus of tariff options. Journal of the European Economic Association 7 188–205.
Mukherjee, A., W. D. Hoyer. 2001. The effect of novel attributes on product evaluation. Journal of Consumer Research 28(3) 462–472.
OECD. 2009. OECD communications outlook 2009. Organisation for Economic Co-Operation and Development, Report.
Ordonez, L. D., T. Connolly, R. Coughlan. 2000. Multiple reference points in satisfaction and fairness assessment. Journal of Behavioral Decision Making 13(3) 329–344.
Overton, A. A., A. J. MacFadyen. 1998. Time discounting and the estimation of loan duration. Journal of Economic Psychology 19(5) 607–618.
Räsänen, M., J. Ruusunen, R. P. Hämäläinen. 1997. Optimal tariff design under consumer self-selection. Energy Economics 19(2) 151–167.
Read, D., S. Frederick, B. Orsel, J. Rahman. 2005. Four score and seven years from now: The date/delay effect in temporal discounting. Management Science 51(9) 1326–1335.
Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing. Marketing Science 22(3) 304–328.
Samuelson, P. A. 1937. A note on measurement of utility. Review of Economic Studies 4(2) 155–161.
Scholten, M., D. Read. 2006. Discounting by intervals: A generalized model of intertemporal choice. Management Science 52(9) 1424–1436.
Scholten, M., D. Read. 2009. The Psychology of Intertemporal Tradeoffs. SSRN eLibrary, available at http://ssrn.com/paper=1444094 .
Soman, D., J. T. Gourville. 2001. Transaction decoupling: How price bundling affects the decision to consume. Journal of Marketing Research 38(1) 30–44.
Thaler, R. H. 1981. Some empirical evidence on dynamic inconsistency. Economics Letters 8(3) 201–207.
Wilson, R. B. 1993. Nonlinear pricing. Oxford University Press, New York, NY.
Zauberman, G., B. K. Kim, S. A. Malkoc, J. R. Bettman. 2009. Discounting time and time discounting: Subjective time perception and intertemporal preferences. Journal of Marketing Research 46(4) 543–556.
Appendix. Model estimation
In this appendix, we describe the estimation of parameters for the “WTP model
with flexibility”. For the “WTP model without flexibility”, the estimation method
is very similar.
Consider a sample of N customers with each customer giving K observa-
tions of their willingness-to-pay to switch to an alternative membership plan
from a baseline membership duration. Let dit denote the discount factor for the
monthly price p of individual i and subscription duration t (i = 1, ..., N ). The
assumed relationship between the discount factor dit and the membership dura-
tion t is:
dit = fI(t>tcrit.
i)
i (1 + αit)−βiαi + εit. (A-1)
Here, for individual i, the parameter fi captures the need for flexibility and tcrit.i
is the critical duration. We assume that εit is normally distributed with zero
mean and variance σ2. The conditional likelihood Li|(αi, βi, fi, σ2) of observ-
ing the discounting behavior of consumer i across the K membership durations
is as follows:
Li|(αi, βi, fi, σ2) = (2π)−
K2 σ−K exp(− 1
2σ2(∑t∈T
(dit− fI(t>tcrit.)
i (1+αit)−βiαi )2)),
(A-2)
with T = {9, 12, 18, 24, 36, 48}. To allow for correlation among parameters we
set γi = (αi, βi, fi)T , and to account for customer heterogeneity, we assume
that the individual-level parameter vector γi = (αi, βi, fi)T follows a multi-
variate normal distribution with mean vector μγ = (μα, μβ, μf)T and variance-
covariance matrix Σ, and is restricted to the space (0,∞) × (0,∞) × [0, 1].10
Such a restriction ensures positive discounting.
10 These distributional specifications were made to ensure parameter constraints such as $\alpha, \beta > 0$ and $f \in [0, 1]$. The censoring can be done without loss of generality since we do not get any modes in the posterior distributions at the specified boundaries. If values outside these bounds are likely to occur, we would obtain modes at those bounds. The value of the draw would be set to the bound since the contribution to the likelihoods remains the same.
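Equations (A-1) and (A-2) can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the estimation code used for the paper: the function names and all parameter values in the usage note are hypothetical, and the likelihood is evaluated on the log scale for numerical convenience.

```python
import math

DURATIONS = (9, 12, 18, 24, 36, 48)  # the set T given with equation (A-2)


def discount_factor(t, alpha, beta, f, t_crit):
    """Mean of d_it in (A-1): f^I(t > t_crit) * (1 + alpha * t)^(-beta / alpha)."""
    flexibility = f if t > t_crit else 1.0  # f^0 = 1 when t <= t_crit
    return flexibility * (1.0 + alpha * t) ** (-beta / alpha)


def log_likelihood(d_obs, alpha, beta, f, t_crit, sigma2, durations=DURATIONS):
    """Logarithm of the conditional likelihood (A-2) for one consumer."""
    k = len(durations)
    # Sum of squared deviations between observed and predicted discount factors.
    sse = sum(
        (d - discount_factor(t, alpha, beta, f, t_crit)) ** 2
        for d, t in zip(d_obs, durations)
    )
    return (
        -0.5 * k * math.log(2.0 * math.pi)
        - 0.5 * k * math.log(sigma2)
        - sse / (2.0 * sigma2)
    )
```

For example, `discount_factor(36, 0.1, 0.05, 0.9, 24)` applies the flexibility factor because 36 months exceeds the (hypothetical) critical duration of 24 months, whereas `discount_factor(12, 0.1, 0.05, 0.9, 24)` does not.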
The unconditional likelihood $L$ for a random sample of $N$ consumers is:
$$L = \prod_{i=1}^{N} \int L_i|(\gamma_i, \sigma^2)\, f(\gamma_i|\mu_\gamma, \Sigma)\, d\gamma_i, \qquad \text{(A-3)}$$
where the density function $f(\gamma_i|\mu_\gamma, \Sigma)$ is $MVN(\mu_\gamma, \Sigma)$. We use Markov Chain Monte Carlo (MCMC) methods to estimate the model. This approach follows the standard Bayesian estimation for hierarchical models (Rossi and Allenby 2003). We use the following set of proper but noninformative priors for all population-level parameters. As a prior for the hyperparameter vector $\mu_\gamma$, we use a multivariate normal distribution censored to the space $(0,\infty) \times (0,\infty) \times [0, 1]$ with mean vector equal to $(0, 0, 0.5)^T$, and variance-covariance matrix with diagonal elements equal to 1000, and off-diagonal elements equal to zero. As a prior for $\sigma^2$, we use an inverse Gamma distribution $IG(a, b)$ with $a = 0.01$ and $b = 0.01$.11 For the variance-covariance hyperparameter $\Sigma$ we use an inverse Wishart prior with degrees of freedom equal to 4 and scale matrix with diagonal elements equal to 1000, and off-diagonal elements equal to zero. We ran sampling chains for 30,000 iterations and assessed the convergence by monitoring the time-series of the draws. We report results based on 15,000 draws retained after discarding the first half of the draws as burn-in iterations.
11 Parameterization of the inverse Gamma distribution with density $f(x) = \frac{b^a}{\Gamma(a)} x^{-a-1} \exp\left(-\frac{b}{x}\right)$.
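The inverse Gamma parameterization of footnote 11 and the burn-in rule can be written out as a small Python sketch. This is only an illustration: the function names are hypothetical, and the sampler that produces the draws is not shown.

```python
import math


def inverse_gamma_pdf(x, a=0.01, b=0.01):
    """Density b^a / Gamma(a) * x^(-a-1) * exp(-b / x), as in footnote 11."""
    if x <= 0:
        return 0.0
    # Work on the log scale; math.lgamma(a) is log(Gamma(a)).
    log_pdf = a * math.log(b) - math.lgamma(a) - (a + 1.0) * math.log(x) - b / x
    return math.exp(log_pdf)


def discard_burn_in(draws, fraction=0.5):
    """Drop the first `fraction` of a chain and keep the remaining draws."""
    start = int(len(draws) * fraction)
    return draws[start:]
```

Applied to a chain of 30,000 draws with the default `fraction=0.5`, `discard_burn_in` keeps the last 15,000 draws, which matches the retention rule described above.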
Article II
Stadel, D. P., Dellaert, B. G. C., Herrmann, A., and Landwehr, J. R. (submit-
ted). Locked-In To Luxury: First- and Second-Order Default Effects in Mass
Customization. Marketing Science.
Locked-In To Luxury:
First- and Second-Order Default Effects
in Mass Customization
Daniel P. Stadel ∗
Benedict G. C. Dellaert †
Andreas Herrmann ‡
Jan R. Landwehr §
∗ Daniel P. Stadel ([email protected]) is Ph.D. candidate at the University of St. Gallen, 9000 St. Gallen, Switzerland.
† Benedict G.C. Dellaert ([email protected]) is Professor of Marketing at the Erasmus School of Economics, Erasmus University Rotterdam, 3000 DR Rotterdam, The Netherlands.
‡ Andreas Herrmann ([email protected]) is Professor of Marketing at the University of St. Gallen, 9000 St. Gallen, Switzerland.
§ Jan R. Landwehr ([email protected]) is Assistant Professor of Marketing at the University of St. Gallen, 9000 St. Gallen, Switzerland.
2 ARTICLE II
Abstract
Mass customization is a growing business practice and a strategy for firms to simultaneously
support consumer choice and increase profits. An important business objective of mass cus-
tomization is to increase sales of high-margin products by selling products that are more closely
in line with consumer preferences. In this study, we use a field experiment approach to inves-
tigate how consumers can be directed toward high margin decision paths in mass-customized
product choices through the use of defaults. We propose that default attribute levels in mass
customization affect consumers’ choices of high-margin attribute levels not only for the at-
tributes for which the defaults are set (first-order default effects) but also for subsequent at-
tributes (second-order default effects). The impact of second-order default effects is tested in
an online mass customization configurator in the car industry. Based on existing manufacturer
data, we first build a multivariate multinomial probit model that takes into account both first-
and second-order default effects on consumer product choices. We use this model to define
defaults that promote high-margin mass-customization choices by consumers. Next, we im-
plement these defaults in an online car configurator. An analysis of the resulting real-world
consumer choices demonstrates significant increases in product margins for the attributes for
which the defaults are set and, more critically, also significant effects on later attribute choices
for which no defaults were set. Finally, we conduct a satisfaction survey among users of the
online car configurator in which the defaults are implemented. The results show no negative
effects of the proposed high-margin defaults on user satisfaction. This alleviates concerns of
customer defection due to high-margin product sales. Therefore, this study provides empirical
evidence for the theoretical relevance of second-order default effects and their managerial im-
pact in designing mass-customization system configurators for greater profitability.
Key words: Mass Customization, Consumer Choice, Defaults, Online Configurators, Multivari-
ate Multinomial Probit Model
1. Introduction
In the age of mass customization, there are many opportunities for consumers
to customize products according to their own preferences. Consumers can design
their own personal items, such as watches, shirts, bags, jackets, and shoes.
It is estimated that worldwide, over 40,000 configurators are in place
across virtually all industries that allow customers to design their own products
(www.configurator-database.com). For example, in the car industry in certain
segments of the European car market, already over 70% of car buyers configure
their vehicles online.
When designing a product with a configurator, customers are typically asked
to make decisions to select their desired level for many attributes of the prod-
uct (e.g., in the case of cars, consumers select the level of horsepower for
the engine, the color for the exterior, the type of rims, etc.). Jointly, these
choices result in the consumers’ desired customized product, which is defined
by the set of attribute levels selected for each of the attributes. An important
way in which companies can support customers in this decision process is by
providing defaults or starting points for the attributes (Dellaert and Stremer-
sch 2005, Randall et al. 2005). Defaults are the pre-set attribute levels that
customers obtain in their product unless they make an active choice to se-
lect another level (Brown and Krishna 2004, Goldstein et al. 2008, Park et al.
2000). They are commonly observed in online configurators, which typically
present consumers with firm-specified default levels for each attribute. Aside
from the technical necessity of implementing defaults in mass-customization
configurators to be able to automatically complete the product specification,
defaults also play a decisive role in the customer decision-making process. Em-
pirical studies show that defaults are commonly selected by decision makers
and that they serve as a reference to which other available options are compared
(Brown and Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park
et al. 2000). Therefore, firms can have a strong influence on consumers’ mass-
customization choices by the defaults that they offer.
In this study, we specifically focus on the question of how defaults can be
used by firms to direct consumers towards more profitable high-margin decision
paths.
Configuring a product such as a car is a multi-attribute decision process
(Seetharaman et al. 2005). Therefore, we investigate in particular whether the
influence of a default in mass customization is not only on the consumer’s level
choice for the attribute for which the default was set but also on the level choices
they make for other attributes that come up later on in the mass-customization
process. For example, in an online car configurator, consumers need to make
up to 60 consecutive attribute-level choices and the consumer’s choice in a later
attribute may be affected by the selection of a level in an earlier attribute (e.g.,
leather seats, leather steering wheel). Due to such cross-category dependencies
among attributes (Gentzkow 2007, Manchanda et al. 1999, Song and Chinta-
gunta 2006) we expect that the effect of a change in the default for one at-
tribute will carry over to the consumer’s choices for other attributes (Sriram
et al. 2010). Indeed, anecdotal evidence from the car market shows that the en-
gine selection has a considerable impact on many other decisions regarding car
components (Wedel and Zhang 2004).
We expect this effect across attribute-level choices to have an upward profit
potential. First, because of positive cross-category dependencies among the at-
tributes (Gentzkow 2007, Manchanda et al. 1999, Song and Chintagunta 2006),
a high-margin change in the default for one attribute is likely to increase high-
margin choices for other attributes later on (Sriram et al. 2010). Second, we
expect consumers to form mental images (selective accessibility mechanism)
that drive the consistent selection of their current attribute levels (Mussweiler
2003). Therefore, once high-margin attribute levels are selected early on in the
mass-customization process, this is likely to increase consumers’ later choices
for high-margin attribute levels. Third, we also expect consumers to be sub-
ject to focalism when going through the mass-customization decision process
(Häubl et al. 2010). This implies that after selecting high-margin levels due to
the defaults set in the early stages of the mass-customization process, consumers
are likely to not fully take into account their past spending commitments when
selecting attribute levels later on in the mass-customization process (Levav et al.
2010). The reason is that consumers initially do not take into account all costs
of buying a high-level product but later are less sensitive to their earlier choices
and more focused on their current choices.
We empirically investigate the default-based up-selling potential in mass-
customization configurators. Our aim is to investigate how profitability in mass
customization can be increased by the extent to which default effects extend
across attributes. To better understand the effects of default selection in mass
customization, we address the first- and second-order effects of defaults early
in the process on the profitability of mass customization. The first-order effect
causes consumers to select higher-margin attribute levels within the attributes
for which the defaults were set. The second-order effect operates through
changes in consumers’ selections for attributes in the later choices of the mass-
customization process. We offer a conceptual framework to managerially guide
the selection of defaults to accommodate these two effects. Therefore, in con-
trast to past research on mass customization that addressed profitability mainly
by focusing on benefits such as organizational learning and lean management
(Alford et al. 2000, Silveira et al. 2001, Kotha 1995, 1996), we analyze the po-
tential of the profitability of a firm’s mass customization based on consumers’
choices within the mass-customization process. To do so, we model the in-
terplay of pre-set defaults and subsequent decisions and their effects on the
revenue-generating process.
The remainder of the paper is organized as follows. Section 2 discusses the
effects of defaults in mass customization configurators, a utility model
specification of consumer choice in mass-customization, and the multivariate
multinomial Probit model for the analysis of how consumers choose among
specific target attributes conditional on choices they have already made. We
also propose a procedure to maximize profitability based on the consumer pref-
erence model. In Section 3, we provide a context for the methodology and use
an example of online configuration from the automotive industry as an explicit
application to real data. We analyze product selections in the data to estimate a
preference model as a basis for optimal default selection. Then, we test the
specified optimal defaults in a field experiment with the same configurator.
With a follow-up survey, we also address the question of whether customer
satisfaction is affected by the high-margin defaults in the product configurator.
We conclude with a discussion and managerial implications of our findings in
Section 4.
2. High-Margin Default Selection in Mass-Customization Configurators
Mass-customization strategies offer a high variety in product design with the
aim of matching customers’ product preferences more closely than when only
a few product options are available (e.g., Davis 1987, Duray et al. 2000, Kotha
1995). A commonly used approach to achieve this aim is to provide consumers
with access to online product configurators (Randall et al. 2007, Salvador et al.
2009). These are in essence elaborate choice boards with which customers can
compose products according to their own needs. Product designs can then either
be saved to the customer’s own account or be directly placed as an order to the
firm (e.g., www.nikeid.com).
Recent research has investigated the (economic) value of mass-customized
products from a customer’s perspective (Franke et al. 2009, Franke and Schreier
2010, Franke et al. 2010, Schreier 2006). Mass customization can provide a
higher preference fit (Franke et al. 2009, Franke and Schreier 2010, Franke
et al. 2010, Schreier 2006), process benefits (Franke and Schreier 2010, Schreier
2006), a self-design effect (Franke et al. 2010, Schreier 2006), design effort
(Franke et al. 2010), and product uniqueness (Schreier 2006). All these aspects
directly affect the perceived value of a customized product. Thus, companies
have to face this challenge of striking the balance between complexity and util-
ity in mass customization (Dellaert and Stremersch 2005). One opportunity to
address both the preference fit and the design effort is the use of intentionally
set defaults for several options within the customization process. Such defaults
lower the complexity of the decision-making process, as they serve as a refer-
ence to which other options are compared (Brown and Krishna 2004, Johnson
et al. 2002, McKenzie et al. 2006, Park et al. 2000) but do not limit the products’
individuality, as the defaults can be changed when customers make an active
choice (Brown and Krishna 2004, Goldstein et al. 2008, Park et al. 2000).
2.1 First- and Second-Order Default Effects
In an online configurator, the customer has to go through several choices in
order to select the attribute levels they prefer for the product that is being
customized. Theoretically, we expect a strong direct effect of defaults among the
attributes for which they are set, as this is shown in past research (Brown and
Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park et al. 2000).
Therefore, we expect that consumers frequently follow the manufacturer’s pre-
set levels when making their mass-customization choices (in our field study, the
default acceptance rates range from 9% to 87%).
In this study, we are specifically interested in the effects that defaults may
have on subsequent option choices because such sequential decisions often are
not made independently (Gabaix et al. 2006, Iyengar and Lepper 2000, Muraven
and Baumeister 2000). Although one might expect that budget considerations
should drive consumers to select less expensive attribute levels later on in the
customization process once they have made more expensive choices in the be-
ginning, we propose that there is a likely positive effect of initial high-margin
defaults on later high-margin attribute choices. This is due in part to
cross-category complementarities among the attributes (Gentzkow 2007, Manchanda
et al. 1999, Song and Chintagunta 2006): a change in the default for one
attribute is likely to also affect choices for other attributes later on (Sriram et al.
2010).
Furthermore, we also expect behavioral influences to increase the selections
of high-margin attribute levels later on. From a psychological perspective, de-
faults constitute the point of reference that people have in their mind when they
evaluate the other attribute levels of the considered attribute. They thus en-
gage in a comparison process, where they evaluate the expected utility of each
alternative attribute level in comparison to the default level to come up with
their decision. Such comparison mechanisms are very well understood in the
psychological literature, and it has been shown that people usually engage in a
form of hypothesis-consistent testing (Mussweiler and Strack 1999, Mussweiler
2003). That is, people start to mentally activate images and associations that are
consistent with choosing the default. For instance, a high level for horse-power
might activate mental images of luxury, freedom, and youthfulness. The alterna-
tives are then judged in comparison to these mental images and only regarding
the dimensions that have been activated by the default (Houston and Roskos-
Ewoldsen 1998). The default thus has an advantage over the alternatives that
are only chosen if they are considerably superior. For the present context, the
most important prediction of this comparison process is that mental images that
have been activated for the comparison process stay activated in the mind and
are therefore likely to influence later choices (Mussweiler and Strack 1999).
That is, the activated mental images and associations stay highly activated in
the person’s working memory and work as priming for later decisions that thus
have an increased likelihood of being consistent with the activated mental im-
ages. Therefore, if an image of luxury, freedom, and youthfulness has been
activated, later decisions are made to be consistent with that image and thus
correspond to the initial default.
Finally, due to focalism in the consumer decision process, we expect con-
sumers to cognitively focus more strongly on their current attribute-level choices
than on previous or later attribute-level choices (Häubl et al. 2010). This effect
is bidirectional in that consumers (i) do not fully take into account their fu-
ture attribute-level choices when selecting their current attribute levels (Levav
et al. 2010), and (ii) are not likely to fully take into account all past attribute-
level choices when making their current decisions (cf. Miller 1956). We expect
that this focalism effect will lead consumers to select relatively more expen-
sive attribute levels in their mass-customization choices than when they choose
between complete product alternatives simultaneously. The reason is that con-
sumers initially do not take into account all costs of choosing a high-level at-
tribute option but later on forget about the costs of the earlier high-level attribute
choices they have already made.
Conceptual framework
Using the data from their online configurator, firms can specify two sets of
attributes for different purposes in the default-setting process: (i) the set of at-
tributes among which they offer the default levels and which we refer to as key
attributes, and (ii) the set of attributes among which they expect second-order
effects and which we refer to as target attributes. Most desirably, the category
of key attributes is characterized by its importance for the product; that is, one
level of each attribute is essential for the product functionality (e.g., engine for
a car), and the category of target attributes is characterized by high margins.
Figure 1 Default Effects Framework
A general framework helps to understand the relationships that are being an-
alyzed. The framework demonstrates the expected association between key and
target attributes. We expect the default selection among key attributes to have
significant carry-over effects on the subsequent level choices among the target
attributes. Figure 1 summarizes the position of key and target attributes within
the analysis and displays the relationship we explore. There are two optimiza-
tion considerations. First, one can analyze what would be the best default set
among the key attributes to entail optimal consumers’ choices among the key
attributes themselves from the firms’ perspective (which here is profitability),
termed first-order optimization. Second, one can analyze what would be the
best pre-set options among key attributes to entail optimal decisions within the
sets of key and target attributes, termed second-order optimization (the second-
order effect included). In other words, the second-order optimization incor-
porates the interplay of choices among pre-specified and subsequent attribute
choices within the product design process into the calculation. We address the
joint first- and second-order optimization problem in this study (see Figure 1),
and determine the best default set among key attributes in terms of resulting
level choices, taking into account the second-order effects within
the category of target attributes. That is, we analyze which defaults are best
in terms of profitability with respect to the joint revenue from the attributes to
which the pre-set options are directly applied (key attributes) and from the attributes whose
choice probability is most likely to depend on choices among the key attributes
(target attributes). In Section 2.4, we approach the question of how to find an
optimal default set that maximizes profitability across the two effects. Sections 2.2
and 2.3 develop the underlying utility model to drive this optimization.
2.2 Utility specification
For the utility specification, we consider a utility surplus model in which we
calculate the difference between two aspects of utility for each target attribute
level: (i) the joint utility from the target attribute level in combination with all
key attribute level choices ($u_{K+T}$), and (ii) the utility from the key attribute level
choices only ($u_K$). The resulting utility component $u_T = u_{K+T} - u_K$ is then
the relevant combination of the target attribute utility and its interactions with
previous key attribute choices. With such a model formulation, we separate out
the key attributes’ effects from the joint product utility. Hence, with the resulting
utility we can capture the key attributes’ influence on the target choice. As a
consequence we can also determine the different effects for different choices
among key attributes on target attribute utility, which allows us to select the
optimal default setting mix to affect both the key attributes for which they are
set and the subsequent target attribute choices. Before we formulate the surplus
model in all technical details, we start with the notation and handling of the
different variables in our model.
To develop our utility model, we consider each attribute level for the key
attributes for which the defaults are set as a single dichotomous variable in the
decision process. Therefore, let $x_{lv}$ be the indicator for choosing the $v$th level
of the $l$th attribute, namely,
$$x_{lv} = \begin{cases} 1, & \text{if the } v\text{th level of attribute } l \text{ is chosen} \\ 0, & \text{if the } v\text{th level of attribute } l \text{ is not chosen} \end{cases}, \qquad (1)$$
where $l = 1, \ldots, m$, with $m$ representing the number of attributes in the data
set (or considered in the analysis, respectively), and $v = 1, \ldots, k_l$ with $k_l$ the
number of levels for attribute $l$.
For the target attributes, we take a different approach and consider each
attribute as a polychotomous (multinomial) variable. Let $d_j$ be the chosen level
for attribute $j$ such that
$$d_j = \begin{cases} 1, & \text{if the 1st level of attribute } j \text{ is chosen} \\ 2, & \text{if the 2nd level of attribute } j \text{ is chosen} \\ \ \vdots & \\ k, & \text{if the } k\text{th level of attribute } j \text{ is chosen} \\ 0, & \text{if no level of attribute } j \text{ is chosen} \end{cases}, \qquad (2)$$
where $j = 1, \ldots, J$, with $J$ representing the number of attributes in the data
set (or considered in the analysis, respectively), and $k_j = |\{0, 1, \ldots, k\}|$ with
$k_j$ representing the number of levels available for attribute $j$.12 The data can
be defined and formatted in any way that is convenient for a specific analysis.
Generally, as output from a configurator, the data consist of discrete, dichoto-
mous, and/or polychotomous entries in one of the given structures above and
can be easily transformed into one another.
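The transformation between the dichotomous coding of equation (1) and the polychotomous coding of equation (2) can be sketched in a few lines of Python. The function names are hypothetical and the sketch assumes that at most one level per attribute is chosen, with 0 denoting the no-choice option.

```python
def to_indicators(level, n_levels):
    """Polychotomous code (0 = no choice, 1..n_levels) -> list of 0/1 indicators."""
    if not 0 <= level <= n_levels:
        raise ValueError("level out of range")
    return [1 if v == level else 0 for v in range(1, n_levels + 1)]


def to_level(indicators):
    """List of 0/1 indicators -> polychotomous code (0 if no indicator is set)."""
    if sum(indicators) > 1:
        raise ValueError("at most one level per attribute may be chosen")
    return indicators.index(1) + 1 if 1 in indicators else 0
```

With three selectable levels, `to_indicators(2, 3)` gives `[0, 1, 0]`, and `to_level` maps that list back to `2`.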
To analyze the choice behavior through a probability model, we translate the
choice behavior into a utility context. First, as each level of a target attribute, in
combination with levels of key attributes, has a certain utility (either positive or
negative) for each customer i, we formulate a constant utility specification for
each level of the target attributes. That is:
$$v_{ij}(t_j) = \beta^{Config}_{jt_j}, \qquad (3)$$
where $v_{ij}(t_j)$ is the joint utility consumer $i$ perceives from choosing level $t_j$
of the $j$th target attribute in combination with the previous choices among the
key attributes. Second, we define a linear utility specification for the set of key
attributes. Therefore, according to equation (1), let $x$ be the indicator vector
whose entries are set to one if the corresponding attribute level is chosen and
zero otherwise. Furthermore, let $\beta^{Key} = (\beta^{Key}_{11}, \ldots, \beta^{Key}_{1k_1}, \ldots, \beta^{Key}_{mk_m})^T$ be the
vector of utilities associated with each of the $k_l$ levels ($l = 1, \ldots, m$) of the $m$ key attributes
represented in the choice vector $x = (x_{11}, \ldots, x_{1k_1}, \ldots, x_{mk_m})^T$. Therefore,
assuming a linear-additive utility function for each customer $i$ we get
$$u_i(x) = x^T \beta^{Key}, \qquad (4)$$
12 Using this structure of the data, the no-choice option is included in the number of $k_j$ attribute levels (e.g., three choice options: no choice, standard or advanced → $k_j = 3$).
where $u_i(x)$ is the utility consumer $i$ perceives from choosing a specific combination
$x$ from the key attribute levels. We assume that consumer $i$ ($i = 1, \ldots, n$)
cannot choose more than one level of each attribute. A consumer $i$ will then
only choose a certain attribute level if it maximizes his/her utility surplus. In
other words, the choice for a specific level $t_j$ of a target attribute $j$ has to provide
the maximal possible increase of his/her perceived joint utility from the
set of chosen key attribute levels $x$ and the chosen level of the specific target
attribute. Consequently, the resulting utility surplus from equations (3) and (4)
for consumer $i$ and level $t_j$ of the $j$th target attribute is given by
$$z_{ij}(t_j) = v_{ij}(t_j) - u_i(x) = \beta^{Config}_{jt_j} - x^T \beta^{Key}. \qquad (5)$$
Therefore, a utility-maximizing consumer $i$ would choose level $t_j^*$ of the $j$th
target attribute, if it has the maximum utility surplus $\{z_{ij}(t_j^*) > z_{ij}(t_j),\ \forall t_j \neq t_j^*\}$, and would choose none of the alternatives if the no-choice option ($t_j = 0$)
has the maximum utility $\{z_{ij}(0) = 0 > z_{ij}(t_j),\ \forall t_j \neq 0\}$.
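The decision rule built on equation (5) can be made concrete with a small Python sketch. It is deterministic, so it ignores the error term introduced in the probit model later on; the function names and the coefficient values in the usage note are hypothetical and serve only as illustration.

```python
def utility_surplus(beta_config_j, x, beta_key):
    """z_ij(t_j) = beta_config_j[t_j] - x' beta_key for the levels t_j = 1..k."""
    key_utility = sum(xi * bi for xi, bi in zip(x, beta_key))
    return [b - key_utility for b in beta_config_j]


def choose_level(surplus):
    """Return the 1-based level with the largest positive surplus, or 0 (no choice)."""
    best = max(range(len(surplus)), key=lambda l: surplus[l])
    # The no-choice option has surplus 0, so a level must beat 0 to be chosen.
    return best + 1 if surplus[best] > 0 else 0
```

For instance, with `x = [1, 0, 0, 1]` and `beta_key = [0.3, 0.5, 0.2, 0.4]` the key utility is 0.7, so configuration utilities `[0.4, 1.1]` give surpluses of about -0.3 and 0.4 and the second level is chosen; with `[0.4, 0.6]` both surpluses are negative and the no-choice option wins.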
Our utility surplus specification in equation (5) is considered separately for each of the $J$ choices among the target attributes within the multivariate multinomial probit model, which is a combination of $J$ multinomial probit models. This implies an assumption of no interaction effects between target attributes, as the choices among them are made sequentially. That is, the model has no extra parameters that account for interaction effects between the target attribute currently being chosen and previous choices among the target attributes. Possible interaction effects are instead absorbed by the attribute level intercepts ($\beta^{Config}_{jt_j}$), as these also account for the average effect of earlier target attribute selections. The reason for this assumption is purely practical: additional parameters would significantly increase the model complexity and the resulting computational effort for parameter estimation. To accommodate such interactions between target attributes, we further model a full correlation structure; that is, we allow for correlations among alternatives within decisions and among alternatives between all other choices, as we have already pointed out that choices are not made independently. Consequently, if a target attribute level in choice step 1 has a positive interaction with a target attribute level in choice step 3, the corresponding correlation with respect to the underlying utility is significantly greater than zero. In this way, we still incorporate associations between target attribute levels although we have not specifically addressed them in the formulation of the surplus model.
As already mentioned, we use a probit model approach, as such models
are common practice in the analysis of polychotomous response data (Albert
and Chib 1993). Due to our definition and the nature of the data, the actual
choices being made among the target attributes, according to (2), can be con-
sidered multinomial outcomes. Because we have more than one target attribute
(at least in the general case), we end up having a multivariate multinomial out-
come vector. To address this issue, in the next section, we specify a multivariate
multinomial Probit model. To account for heterogeneity among individuals,
we follow a standard Bayesian approach for parameter estimation (Rossi and
Allenby 2003).
2.3 Multivariate multinomial Probit model
Starting from amass of n customers, suppose that each individual i (i = 1, ..., n)
has to make J choices within the set of target attributes. For the first choice, k1
options are offered, for the next, k2 options, and so on up to the last choice with
kJ options (Recall: The no-choice option is included in the number of options
kj). Let, then, according to equation (2), di = (di,1, ..., di,J)T denote the index
vector of the alternatives the ith individual chooses for the J decisions. Assume
that each of these J choices follows a multinomial Probit model (Zhang et al.
2008)13. As explanatory variables for all J decisions, we introduce the previous
choices among a specified set of key attributes into the model, as defined in
equation (1), x = (x11, . . . , x1k1, . . . , xmkm)T . Therefore, following our utility
specification in equation (5), for the jth choice, j = 1, ..., J , there exists a
13For detailed discussions of multinomial Probit models, and the treatment of additive and multiplicative redun-
dance, see for example McCulloch and Rossi (1994), McCulloch et al. (2000) and Imai and van Dyk (2005a,b).
14 ARTICLE II
(kj − 1)-dimensional underlying utility vector zi,j, with
zi,j =
⎛⎜⎜⎝
1 0 · · · 0 −xT0 1 · · · 0 −xT...
.... . .
......
0 0 · · · 1 −xT
⎞⎟⎟⎠×
⎛⎜⎜⎜⎜⎜⎝
βConfigj1
βConfigj2...
βConfigj(kj−1)βKey
⎞⎟⎟⎟⎟⎟⎠+ εi,j = Xi,jβj + εi,j,
where βj = (βConfigj1 , βConfig
j2 , . . . , βConfigj(kj−1), (β
Key)T )T , with βKey = (βKey11 , . . .,
βKey1k1
, . . ., βKeymkm
)T , satisfying
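\[
d_{i,j} =
\begin{cases}
0 & \text{if } \max_{1 \le l \le k_j - 1}(z_{i,j,l}) < 0 \\
r & \text{if } \max_{1 \le l \le k_j - 1}(z_{i,j,l}) = z_{i,j,r} > 0
\end{cases}
\tag{6}
\]
where \(z_{i,j,l}\) is the lth component of the utility vector \(z_{i,j}\), \(z_{i,j} \sim N(X_{i,j}\beta_j, \Sigma_j)\)
and \(\{\Sigma_j\}_{(1,1)} = 1\) for i = 1, ..., n and j = 1, ..., J; \(\varepsilon_{i,j} \sim N(0, \Sigma_j)\). Equation (6)
describes a fully identifiable multivariate multinomial Probit model (MVMNP;
see Zhang et al. 2008).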
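As a minimal sketch of the decision rule in equation (6), the following Python function maps one vector of latent utility surpluses to the chosen index. The function name `choice_index` and the list-based representation are illustrative assumptions and are not taken from the original study.

```python
from typing import Sequence


def choice_index(z: Sequence[float]) -> int:
    """Map the (k_j - 1) utility surpluses of one decision to d_{i,j}.

    Returns 0 (the no-choice option) if every surplus is negative,
    otherwise the 1-based index r of the largest surplus.
    A surplus of exactly zero has probability zero under the normal
    model; this sketch treats it as a choice.
    """
    best = max(range(len(z)), key=lambda l: z[l])  # position of the maximum
    return 0 if z[best] < 0 else best + 1


# Hypothetical surpluses for a target attribute with k_j = 4 options
# (three real options plus the no-choice option).
d = choice_index([-0.5, 0.3, 0.2])
```

The rule is applied separately to each of the J blocks of the stacked utility vector, so a full outcome vector di is obtained by calling the function once per target attribute.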
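Taking all of this information together, we can now set up our MVMNP for
the J choices within the target category as follows:
\[
z_i = X_i\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \Sigma),
\]
where \(z_i = (z_{i,1}, \ldots, z_{i,J})^T\) are the utility surpluses for each alternative in
the several decisions with \(z_{i,j} = (z_{i,j,1}, \ldots, z_{i,j,(k_j-1)})\), j = 1, ..., J,
\(X_i = (X_{i,1}^T, \ldots, X_{i,J}^T)^T\),
\(\beta = (\beta^{\mathrm{Config}}_{11}, \beta^{\mathrm{Config}}_{12}, \ldots, \beta^{\mathrm{Config}}_{1(k_1-1)}, \ldots, \beta^{\mathrm{Config}}_{J1}, \beta^{\mathrm{Config}}_{J2}, \ldots, \beta^{\mathrm{Config}}_{J(k_J-1)}, (\beta^{\mathrm{Key}})^T)^T\)
and \(\varepsilon_i \sim N(0, \Sigma)\) with diagonal elements of \(\Sigma\) equal to one,
i.e. \(\sigma_{q,q} = 1\), where \(q = 1\), \((1 + k_1 - 1) = k_1\), \((1 + k_1 - 1 + k_2 - 1) = (k_1 + k_2 - 1)\), ...,
\((1 + k_1 - 1 + k_2 - 1 + \ldots + k_{J-1} - 1) = (k_1 + k_2 + \ldots + k_{J-1} - (J - 2))\). More
generally, \(q = 1 + \sum_{s=1}^{j-1}(k_s - 1)\), with j = 1, ..., J. Our explanatory variables
can be represented by a
\(\bigl(\sum_{j=1}^{J}(k_j - 1)\bigr) \times \bigl(\sum_{j=1}^{J}(k_j - 1) + \sum_{l=1}^{m} k_l\bigr)\)-matrix;
\[
X_i = \left(
\underbrace{\begin{matrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{matrix}}_{A}
\;\;
\underbrace{\begin{matrix}
-x_{i,11} & \cdots & -x_{i,1k_1} & \cdots & -x_{i,mk_m} \\
-x_{i,11} & \cdots & -x_{i,1k_1} & \cdots & -x_{i,mk_m} \\
\vdots & & \vdots & & \vdots \\
-x_{i,11} & \cdots & -x_{i,1k_1} & \cdots & -x_{i,mk_m}
\end{matrix}}_{B}
\right)
\]
with m the number of key attributes. The matrix A is a
\(\sum_{j=1}^{J}(k_j - 1) \times \sum_{j=1}^{J}(k_j - 1)\) identity matrix and includes the constants equal to 1 for the
configuration utilities, \(v_{ij}(t_j) = \beta^{\mathrm{Config}}_{jt_j}\), of the choice options for all J possible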
choices. The matrix B includes the actual choices individual i made within the
set of key attributes. Obviously, the choices among the key attributes are the
same for all subsequent decisions within the target category. Consequently, the
rows of matrix B are identical. The parameter estimation process, a Bayesian
sampling algorithm for MVMNP models, is presented in appendix B.
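The block structure of Xi can be made concrete with a short sketch. It assumes that the key attribute choices are given as a 0/1 indicator list and that the numbers of options kj are known; the function name `design_matrix` and the toy inputs are hypothetical and only serve as illustration.

```python
from typing import List


def design_matrix(k: List[int], x: List[int]) -> List[List[float]]:
    """Build X_i = [A | B] for one individual.

    k: number of options k_j per target attribute (no-choice option included).
    x: 0/1 indicators of the chosen key attribute levels, as in equation (1).
    """
    n_rows = sum(k_j - 1 for k_j in k)            # one row per latent utility surplus
    key_part = [-1.0 if v else 0.0 for v in x]    # one row of B: the negated key choices
    rows = []
    for r in range(n_rows):
        identity_part = [1.0 if c == r else 0.0 for c in range(n_rows)]  # row r of A
        rows.append(identity_part + key_part)     # every row shares the same B part
    return rows


# Toy setting: two target attributes with k = (3, 2) options and
# three key attribute levels, of which the first and third were chosen.
X = design_matrix(k=[3, 2], x=[1, 0, 1])
for row in X:
    print(row)
```

In this toy setting the matrix has 3 rows and 3 + 3 = 6 columns, which matches the dimension formula given above.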
The corresponding choice probabilities P(di) for the outcome vectors di =
(di,1, ..., di,J)T are then simply calculated as the probabilities that the latent utility
surpluses zi = (zi,1, ..., zi,J)T fall within the ranges that match the resulting
choice vector di. This is straightforward because, in our multivariate multinomial
Probit model, we assume that the latent utility surpluses arise from a multivariate
normal distribution, zi ∼ N(Xiβ,Σ).
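One simple way to approximate such a choice probability is plain Monte Carlo simulation: draw the latent surpluses from the multivariate normal distribution, apply the decision rule (6) to each block, and count how often the target choice vector occurs. The sketch below uses this swapped-in technique for illustration only; it is not necessarily how the probabilities were computed in the study, and all names and numbers are hypothetical.

```python
import math
import random
from typing import List, Sequence


def cholesky(sigma: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular factor L with L L^T = sigma (sigma positive definite)."""
    n = len(sigma)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][c] * low[j][c] for c in range(j))
            if i == j:
                low[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return low


def choice_index(z: Sequence[float]) -> int:
    """Decision rule (6): 0 if all surpluses are negative, else the 1-based argmax."""
    best = max(range(len(z)), key=lambda l: z[l])
    return 0 if z[best] < 0 else best + 1


def simulate_choice_probability(mean, sigma, blocks, d, draws=20000, seed=0):
    """Monte Carlo estimate of P(d_i = d) for z_i ~ N(mean, sigma).

    mean:   stacked vector X_i beta.
    blocks: block sizes (k_1 - 1, ..., k_J - 1) of the stacked vector.
    d:      choice vector whose probability is wanted.
    """
    rng = random.Random(seed)
    low = cholesky(sigma)
    n = len(mean)
    hits = 0
    for _ in range(draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = [mean[r] + sum(low[r][c] * e[c] for c in range(r + 1)) for r in range(n)]
        choices, pos = [], 0
        for size in blocks:                      # apply rule (6) block by block
            choices.append(choice_index(z[pos:pos + size]))
            pos += size
        hits += choices == list(d)
    return hits / draws


# Hypothetical example: J = 2 target attributes with 2 and 1 real options.
mean = [0.3, -0.2, 0.1]
sigma = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.1], [0.2, 0.1, 1.0]]
p = simulate_choice_probability(mean, sigma, blocks=[2, 1], d=[1, 1])
```

The estimate becomes more precise as the number of draws grows; for higher-dimensional problems, more efficient simulators for multivariate normal rectangle probabilities may be preferable.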
2.4 Profitability maximization
We propose a two-tiered attribute selection procedure as a profitability maxi-
mization strategy. First, the knowledge, experience, and expertise of company
authorities and/or area experts are used to limit the pool of possible attributes
to be considered in the analysis. This qualitative survey for attribute selection
also ensures that it is managerially feasible to use the levels of possible key
attributes as pre-set options in the online configurator, and it identifies the most
promising target attributes to influence. In other words, company experts and
authorities can give helpful information about expected associations between at-
tributes, the willingness to pre-set levels for the specified attributes and whether
these pre-set options are practicable from a technical perspective. Second, a
contingency analysis can be conducted to investigate the relationships among the
qualitatively appointed attributes. This analysis reveals which relations among
the appointed attributes are significant. This attribute selection procedure is an
iterative process and results in a set of attributes best meeting the managerial
expectations and requirements. Given this set, we use a simple maximization
procedure to determine the defaults.
Once we have estimated the probability model, we can go one step further
and determine the choices within the set of key attributes that are most likely
to increase the joint profitability of mass-customization systems with respect to
the appointed key and target attributes. The result will then be referred to as an
optimal default set. As discussed in the introduction, we build on the results of
previous research that has shown that pre-set options are a very powerful tool
to support consumers’ choices (e.g., Goldstein et al. 2008, Brown and Krishna
2004). Therefore, we attempt to provide a default combination among the key
attributes that has the highest probability of putting consumers on a high-margin
decision path jointly considering key and target attributes, that is, not neglecting
possible second-order effects depending on previous decisions. We use a simple
iterative maximization procedure. The object to be maximized is the joint
expected profit E(profit|x) coming from the key and target attributes. In other
words, it is the profit from the target attributes, weighted with the corresponding
probabilities determined in the previous section conditional on the choices among
the key attributes, plus the profit coming from the chosen key attribute levels.
Thus, we have to solve the following maximization problem:
\[
\max_x \left[ E(\text{profit} \mid x) \right] =
\max_x \left[ \sum_{l_1=1}^{k_1} \cdots \sum_{l_J=1}^{k_J}
P(d_1 = l_1, \ldots, d_J = l_J) \cdot v(d_1 = l_1, \ldots, d_J = l_J)
+ x^T (p(x) - c(x)) \right],
\tag{7}
\]
with v(d1 = l1, . . . , dJ = lJ) = p(d1 = l1, . . . , dJ = lJ) − c(d1 = l1, . . . , dJ =
lJ), where p(d1 = l1, . . . , dJ = lJ) simply denotes the sum of the prices for
the corresponding choices among the target attributes to be paid by the customers
and c(d1 = l1, . . . , dJ = lJ) the sum of costs for the car manufacturer for
the chosen target attributes. The vector x, according to (1), stands for the
specific key attribute levels, p(x) is the corresponding price vector and c(x)
is the cost vector associated with the key attribute levels in x. The probabilities
P(d1 = l1, . . . , dJ = lJ) are calculated from the multivariate multinomial Probit
model (6) presented earlier in Section 2.3. The corresponding optimal default
set resulting from our model is then given by
\[
x_{opt} = \arg\max_x \left[ \sum_{l_1=1}^{k_1} \cdots \sum_{l_J=1}^{k_J}
P(d_1 = l_1, \ldots, d_J = l_J) \cdot v(d_1 = l_1, \ldots, d_J = l_J)
+ x^T (p(x) - c(x)) \right].
\tag{8}
\]
Consequently, the optimal default combination is determined via an iterative
grid search algorithm that evaluates the expectation function for each possible
combination x of default setups within the set of key attributes (or any other
defined support for x that one wishes to optimize the profit for). To clarify
the approach and to provide evidence for the relevance of such an analysis, we
apply the multivariate multinomial Probit model in the context of an online car
configurator. Therefore, in the next sections, we refer to real data from the
automotive industry.
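The grid search behind equation (8) can be sketched in a few lines. The sketch assumes that at most one level per key attribute can be pre-set and uses a hypothetical stand-in for E(profit|x); in the actual procedure the objective would combine the model-based choice probabilities with prices and costs as in equation (7). All names and numbers below are illustrative.

```python
from itertools import product
from typing import Callable, Iterator, List, Sequence, Tuple

Default = Tuple[int, ...]  # 0/1 indicator vector x over all key attribute levels


def enumerate_defaults(levels_per_attribute: Sequence[int]) -> Iterator[Default]:
    """Yield every candidate x: per key attribute either no level or exactly one level."""
    per_attribute: List[List[Default]] = []
    for n_levels in levels_per_attribute:
        options: List[Default] = [tuple(0 for _ in range(n_levels))]  # nothing pre-set
        for chosen in range(n_levels):
            options.append(tuple(1 if c == chosen else 0 for c in range(n_levels)))
        per_attribute.append(options)
    for combo in product(*per_attribute):
        yield tuple(v for part in combo for v in part)  # flatten to one vector x


def grid_search(levels_per_attribute: Sequence[int],
                expected_profit: Callable[[Default], float]) -> Default:
    """Return the candidate x with the largest value of the objective."""
    return max(enumerate_defaults(levels_per_attribute), key=expected_profit)


# Hypothetical objective: only the key attribute part x^T p(x) with toy prices.
prices = [10.0, 5.0, 7.0]


def toy_expected_profit(x: Default) -> float:
    return sum(x_l * p_l for x_l, p_l in zip(x, prices))


best = grid_search([1, 2], toy_expected_profit)
```

Exhaustive enumeration is feasible here because the number of candidate default sets grows only with the product of the attribute level counts; for much larger attribute sets a heuristic search would be needed.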
3. Online car configurator: Existing real-world data and a field experiment
We now apply the proposed probability and optimization model to real data in
an application in the automotive industry. In particular, we analyze existing data
and conduct a field experiment on the basis of the online car configurator of a
renowned premium car manufacturer. First, we describe the existing configurator
consumer choice data for 2,500 real-world customers, the parameter estimation
of the multivariate multinomial Probit model based on this data and the
resulting default level selection. Next, we discuss a field experiment with 8,608
customers that was designed on the basis of the initial analysis and that allows
for a more controlled test for the first- and second-order effects. Finally, we
introduce the results of a customer satisfaction survey to investigate whether
the selection of a high-margin default level might harm customers’ satisfaction
with the product and the decision-making process.
3.1 Existing online configurator customer choice data
A large data set containing 2,500 real online car configurations was available
for analysis. These online car configurations contain the choices of real poten-
tial customers who are very likely to be interested in purchasing a car, as the
data in our analysis is the actual output of a functioning online car configurator
provided on the website of a renowned premium car manufacturer. This is a
unique source of data regarding real decision behavior in a mass customization
environment for durable goods, particularly cars. The data contain individual
car designs customers have stored to their account. The company can obtain
these configurations from its own website on which the co-design platform is
implemented.
The data is structured as follows. For each stored configuration, all attribute
levels to be individually chosen within the configuration process are stored with
either a "chosen" label or a "not-chosen" label. Consequently, we can set up a
data structure as described in Section 2.2. We define the variables representing
the key attribute levels according to Equation (1) as dichotomous variables, and
the variables representing the target attribute levels according to Equation (2) as
polychotomous (multinomial) responses, respectively. This allows us to set up
the multivariate multinomial Probit model as covered in detail in Section 2.3.
In our field study, several default combinations determined by using the pro-
posed model are embedded in the manufacturer’s real online car configurator.
When starting the configuration process, a customer was randomly assigned to
one out of five configuration conditions. The five conditions include one con-
dition without any defaults being used and four sets of a varying number of
defaults of different levels. If a customer was assigned to one of the four default
conditions, she was asked whether the car configuration should start with pre-set
options or not. A reasonably large number (15%) decided to start with defaults.
The data set for our field study contains 8,608 observations. In Section 3.2,
we describe the experiment in detail and conduct several tests. We test whether
there are significant differences among the five configuration conditions with
respect to profit from key and target attributes. The results provide evidence
for first- and second-order default effects. In the next section, we start with the
determination of the key and target attributes relevant for our application.
3.1.1 Key and target attributes
A first step in our analysis is the specification of all variables of
interest, that is, the dependent variables (the choices among target attributes)
and the independent variables (the choices among key attributes). The criteria
for selecting the key and target attributes are set by the firm and can be
selected to match the specific application. Therefore, we first conducted several
interviews with company authorities of the premium car manufacturer. The basic
principles, at this step, for the selection of key and target attributes were the
following: (i) the experts’ opinions and appraisals concerning relations between
consecutive attributes in the configuration process; (ii) the company’s willing-
ness to test several pre-set options for the key attributes in the real online car
customization platform (i.e., to test the pre-set options with real potential cus-
tomers), and (iii) the margins with respect to the target attributes. In more detail,
for the target attributes, the company authorities focused on extra equipment
and accessories such as multimedia attributes. One characteristic of such
attributes is that the car manufacturer orders them in huge quantities at a
considerably low price. Therefore, they potentially generate high profit levels
when sold with cars. The specified set of five key attributes includes the business
package (1 level), rims (2 levels), upholstery (4 levels), steering wheels (3 levels)
and seats (3 levels) with a total of 13 different attribute levels. Therefore, there
are 13 variables for the set of key attributes whose levels can either be chosen
or not (dichotomous variables). The infotainment attributes are the appointed
target attributes in our analysis. We have four target attributes, namely, naviga-
tion systems, radio, phone equipment and sound systems, each with more than
one level (polychotomous variables). Figure 2 summarizes the model setup for
Figure 2 Default Effects Framework - Online Car Configurator
our analysis according to Figure 1. In a second, successive step, we conducted
4×5 = 20 contingency analyses to statistically investigate the relations between
the key and target attributes. As a result, we obtain significant relations between
the appointed sets of key and target attributes. Table A-1 provides the results of
the respective contingency analysis. Accounting for α-error inflation, we still
get an overall significance level of α∗ = 1 − (1 − 0.0001)^19 (1 − 0.0283) = 0.03
for all χ2-tests. The χ2-based measure of coherence, Cramer’s V, shows the
degree of association (see Table A-1). We have thus statistically justified the
reasonableness of the attribute selection to be used in our model, so significant
parameter estimates are to be expected. A further aspect considered in the
selection of the key attributes was the actual position of the attributes within
the product (car) configuration process. We aimed to avoid default crowding in
one single configuration screen, as using only a few pre-set options can prevent
customers from forming negative metacognitive impressions (Wright 2002). The
appointed key attributes are equally distributed over the entire customization
sequence, but are set prior to the target attributes. In this way, we make sure
that consumers do not feel patronized in their choices by a large number of
pre-set options. With this, the probability model is adequately set up. In the
following sections, we estimate the corresponding model parameters, describe a
field study, and assess further consequences of defaults in the car configurator.
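The overall significance level for the 20 χ2-tests can be checked by hand. The worked step below assumes, as the formula suggests, that 19 of the tests are evaluated at a level of 0.0001 and one at 0.0283, and that the tests are treated as independent; intermediate values are rounded.

\[
\alpha^* = 1 - (1 - 0.0001)^{19}(1 - 0.0283)
\approx 1 - 0.99810 \cdot 0.9717
\approx 1 - 0.96986
\approx 0.030
\]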
3.1.2 Parameter estimation
As noted above, we have access to a large data set from a renowned
premium car manufacturer. The data consist of 2,500 car configurations stored
from the company’s real online car configurator on its corresponding website.
We follow a standard Bayesian approach for parameter estimation (Rossi and
Allenby (2003); see also appendix B). To estimate our model parameters, we
ran sampling chains for 100,000 iterations and assessed the convergence by
monitoring the time series of the draws. We report results based on 50,000
draws retained after discarding the first 50,000 of the draws as burn-in itera-
tions. Table A-2 from appendix A provides the posterior means and 95% pos-
terior intervals for the parameters \(\beta^{\mathrm{Config}}_{11}\), \(\beta^{\mathrm{Config}}_{12}\), \(\beta^{\mathrm{Config}}_{21}\), \(\beta^{\mathrm{Config}}_{31}\),
\(\beta^{\mathrm{Config}}_{32}\), \(\beta^{\mathrm{Config}}_{33}\), \(\beta^{\mathrm{Config}}_{41}\), \(\beta^{\mathrm{Config}}_{42}\), \((\beta^{\mathrm{Key}})^T\). We can see that almost all
parameters, except \(\beta^{\mathrm{Config}}_{42}\) (Configuration with - Bose Surround Sound), are
significantly different from zero (i.e., zero is not included in the 95% posterior
interval). The parameter estimates are only to be interpreted in relative terms,
as how the surplus function (5) changes. Considering the table more closely, we
can see that all estimated configuration utilities (\(\beta^{\mathrm{Config}}_{jk}\), k = 1, . . . , kj − 1,
j = 1, . . . , 4) are smaller than zero (except \(\beta^{\mathrm{Config}}_{41}\)), that is, the utility surplus
is smaller than zero if no key attribute is chosen at all. This implies that there
is a strong second-order effect, as the specification is only chosen if the surplus
becomes positive. In other words, the choices of the target attributes certainly
depend on the chosen levels of key attributes. Negative estimates for the key
attribute level coefficients imply that a chosen key attribute level increases the
utility surplus. This confirms the reasonableness of the selected sets of attributes
and the effect of the appointed key attributes on subsequent level choices
according to our MVMNP model. Only for the DSP sound system level do we get a
positive surplus even if no key attribute is chosen14. Looking at the margins of
the parameter estimates for the attribute sound systems (\(\beta^{\mathrm{Config}}_{41}\) and \(\beta^{\mathrm{Config}}_{42}\))
and all key attributes (\(\beta^{\mathrm{Key}}_{\cdot\cdot}\)), we can see that no combination x of key attribute
levels affects the order of the utility surpluses for the DSP sound system and the
Bose sound system, that is,
\(\max_x z_{i4(2)} = \max_x(\beta^{\mathrm{Config}}_{42} - x^T\beta^{\mathrm{Key}}) = 0.0988 < \min_x z_{i4(1)} = \min_x(\beta^{\mathrm{Config}}_{41} - x^T\beta^{\mathrm{Key}}) = 0.283\).
This is an indicator that for the sound system attributes the predictive power of
the key attribute levels might be weak according to our model. The corresponding
posterior distributions (densities) can be obtained from Figure A-1 in appendix A.
14 The DSP sound system was the only attribute level within the target attributes that could be chosen by the
customers at no additional cost.
To assess the
goodness of fit for the model, we could use the pseudo-R^2 of Nagelkerke (1991)
and Cragg and Uhler (1970), a measure of determination based on the idea of
the R^2 from OLS regression. However, Hagle and Mitchell II (1992) state that
if the underlying latent variable is available, the usual OLS-R^2 itself is the mea-
sure of determination to be used. In the MCMC sampling algorithm described
in appendix B, we sample the underlying latent variable zi, with zi = Xiβ + εi.
Therefore, we can simply calculate the OLS R^2 to determine the variability ex-
plained through our model. Because we have MCMC samples, we can actually
determine an empirical distribution of the R^2 of the underlying regression prob-
lem and determine a confidence region for this measure of determination. The
resulting R^2 and R^2_adj., respectively, have a 95% highest probability density re-
gion ranging from 0.59 to 0.61. Accordingly, on average, 60% of the variation
in our data can be explained by our proposed model. This is reasonably good
for real-life data. The corresponding densities of the empirical distributions of
R^2 and R^2_adj. are provided in Figure A-2 in appendix A. Next, we determine
default sets that are expected to influence the profitability of the car
configurator. These defaults are tested in the affiliated field
study.
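The idea of an empirical R^2 distribution can be sketched as follows: compute the OLS R^2 once per retained draw of the latent variable, then take the shortest interval that contains 95% of the resulting values. The sketch uses simulated toy draws instead of real sampler output, and the function names `r_squared` and `hpd_interval` are hypothetical; in an actual run, each draw would supply its own sampled zi and fitted values Xiβ.

```python
import math
import random
from typing import List, Sequence, Tuple


def r_squared(z: Sequence[float], fitted: Sequence[float]) -> float:
    """OLS-style R^2 of latent utilities z against fitted values X beta."""
    mean_z = sum(z) / len(z)
    ss_tot = sum((v - mean_z) ** 2 for v in z)
    ss_res = sum((v - f) ** 2 for v, f in zip(z, fitted))
    return 1.0 - ss_res / ss_tot


def hpd_interval(samples: Sequence[float], mass: float = 0.95) -> Tuple[float, float]:
    """Shortest interval containing the given share of the sorted samples."""
    ordered = sorted(samples)
    n = len(ordered)
    width = max(1, math.ceil(mass * n - 1e-9))   # number of points inside the interval
    start = min(range(n - width + 1),
                key=lambda i: ordered[i + width - 1] - ordered[i])
    return ordered[start], ordered[start + width - 1]


# Toy stand-in for retained MCMC draws (assumption: fitted values held fixed).
rng = random.Random(1)
fitted = [0.5 * t for t in range(50)]
r2_draws: List[float] = []
for _ in range(200):
    z_draw = [f + rng.gauss(0.0, 3.0) for f in fitted]  # latent draw = fit + noise
    r2_draws.append(r_squared(z_draw, fitted))

low, high = hpd_interval(r2_draws)
print(low, high)
```

Burn-in draws would be dropped before this step, so that only the retained part of the chain enters the interval.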
3.1.3 Optimal default selection
In this section, we determine the optimal default set (8) as a solution of our max-
imization problem (7) and four other sets, which we will later use in the field
study. Due to the lack of cost data, we will neglect the costs (i.e., c(·) = 0) in
our calculations15 and simply focus on the joint turnover (sum of prices) coming
from the key and target attributes.16 The grid search evaluates the expectation
function E(profit|x) for the 480 possibilities of different default combinations
15 This can be done without loss of generality since (i) we want to investigate the interplay between previous
and later choices conditional on defaults, and (ii) its effect on managerially relevant variables such as profit. If cost
data are available, the results might be different in terms of different defaults and different predicted values but the
implications remain the same.
16 In the following, we still refer to profit as the variable of interest, as in the general case the costs are not
equal to zero.
Figure 3 Expected Profit Levels for Default-Combinations. Panel a) Key-Attributes - Profit Directions (baseline: no target attributes profits); panel b) Target-Attributes - Profit Directions (baseline: sample average profit for target attributes). Both panels show the predicted directions for the default sets (Set I, Set II, Set III, Set IV).
x. Table A-3 in appendix A provides the results. The optimal default set most
likely to maximize the expected profit E(profit|x) includes the business pack-
age, 19-inch aluminum rims, leather upholstery, a multi-function steering wheel
with a wheel mounted shifter and integrated heating, and, finally, a memory
function for the front seats. This basically tells us that in our obtained data
set, configurations including exactly these levels among the key attributes, on
average, result in the maximal joint profit coming from both the key and target
attributes. It can also be seen that higher (lower) valued attribute levels are
included in the optimal set (set I) for pre-set options. This is an indicator that
these customers configured their cars on a luxury (economy) path. The other
four combinations in Table A-3 were determined to provide a variety of ex-
pected outcomes for the field study: one that is likely to lead to a small profit,
one that is likely to generate a high profit, and two further sets that were specified
to give insights about the sensitivity to changes among the pre-set options, that
is, to lie between the expectations of the low-level (set I) and high-level (set
IV) defaults. The performance of these four sets is investigated
in the field study presented in the following section. We will later compare
the realized profit levels from the field study. Figure 3a) displays the expected
directions for the profit levels of the key attributes according to the default sets
I to IV. Figure 3b) provides the corresponding expected directions of the profit
levels as well as the no-default baseline for the target attributes according to
our model. As we can see (from Figure 3a)), the expectations for the profit in-
creases with respect to the key attributes are all positively directed which is due
to the fact that a consumer’s choice of one attribute level generates more profit
than no choice. We also obtain an increasing pattern within the key attributes
profit levels. This is simply explained by the increasing number of defaults and
the higher default levels, respectively; therefore, we see that the default levels
increase, say, from economy to luxury. The expected profit levels for the target
attributes (see Figure 3b)) draw a slightly different picture. We obtain nega-
tively directed profit increases, i.e. profit decreases, for the default sets I and II,
and we expect positive influences on the target attributes’ profits for the default
sets III and IV. In other words, we simply expect the average return from the
target attributes to be lower for the default sets I and II, and to be higher for the
default sets III and IV as compared to the average return for the target attributes
from configurations in the "no default"-condition.
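One reading that reproduces the 480 default combinations evaluated by the grid search is that each of the five key attributes is either left without a default or pre-set to exactly one of its levels. This is an assumption, since the text does not spell out how the count arises. With the business package (1 level), rims (2 levels), upholstery (4 levels), steering wheels (3 levels) and seats (3 levels), the count is

\[
(1 + 1)(2 + 1)(4 + 1)(3 + 1)(3 + 1) = 2 \cdot 3 \cdot 5 \cdot 4 \cdot 4 = 480 .
\]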
Summarizing, we expect the first-order default effect to be only positively
directed and to increase for higher level defaults, and we expect the second-order
default effect to be negatively directed for the sets I and II, and positively
directed for the sets III and IV. In general, we expect different results for
different attribute levels being pre-set in the online car configurator. Consequently,
the presence of such expected differences in the performance of defaults should
motivate companies to conduct such types of analysis, as the results can lead
to improvements of product configurators and reveal sensitivities to different
default levels being used. We subsequently use the results of our model to test
the proposed default sets in the real online car configurator of the premium car
manufacturer. In the following, we provide the field study and the respective
results.
3.2 Field experiment: The impact of profit margin increasing defaults in a real online car configurator
In the previous section, we estimated our model and determined the default sets
to be tested in a real-life mass customization configurator. We now describe
how we proceeded with our field study, and discuss the corresponding results.
Furthermore, we discuss an additional survey, which was designed to investigate
how defaults affect customer satisfaction. The goal of this additional survey was
simply to see whether customers form a negative impression when confronted
with defaults within the customization process. The results show that the rising
pattern in the expected profit levels is replicated in the actually realized profit
levels for the default sets I to IV with respect to the key
attributes. For the target attributes, we can also replicate the decrease-increase
pattern and show that we get strong (statistically significant) second-order ef-
fects for the default sets I to IV. In addition, we also show that customers do not
form negative impressions due to defaults in the mass-customization process.
Therefore, we offer a reliable foundation for managerial usage in companies
offering product configurators as a strategy of sales and distribution.
Modus operandi
For the field study, the authors were actually allowed to embed the several sets
of pre-set options (i.e., four default sets - I, II, III and IV) into the real online
car configurator. We randomly assigned one of the specified sets (see Table A-
3) to consumers using the car configurator of our premium manufacturer. To
recall the research objective: we investigate whether we can set consumers off
on a high-margin path using accordingly determined pre-set options. We
expect to replicate the rising pattern for the key attributes (first-order effect)
in the generated profit levels conditional on the different sets of defaults dis-
played in Figure 3a), namely, for the default sets I, II, III and IV. For the target
attributes, we expect a decrease-increase pattern as in Figure 3b). In the end,
we compare the average profit levels from key and target attributes of the five
conditions (I, II, III, IV and no defaults) with each other. The study lasted 49
weeks.
Results
After the 49 weeks (and 8,608 car configurations), we have 292 configurations
with Default Set I, 267 configurations with Default Set II, 248 configurations
with Default Set III, 238 configurations with Default Set IV and 7,149
configurations with no pre-set options at all. The remaining 414 configurations
were removed from the data set because these customers skipped
the infotainment attributes within their configuration process, so that an
analysis of the second-order effect was not possible. This means
that 12.75% of the retained configurations (1,045 of 8,194) started with defaults.
This implies that with approximately 8,700 stored car configurations per year (based on
8,200 configurations in 49 weeks), the yearly profit multiplier for different
Figure 4 Generated Profit Levels for Default-Combinations (incl. predicted profit directions).
a) Realized Key-Attributes Profits, Profits|x by default set: Set I 3672.39, Set II 3933.62, Set III 4136.11, Set IV 4497.71. Average profit without defaults: Profits|0 = 3077.27.
b) Realized Target-Attributes Profits, Profits|x by default set: Set I 3748.25, Set II 3658.91, Set III 3854.46, Set IV 3575.39. No defaults: Profits|0 = 3673.65.
default sets adds up to 1,109 (12.75%×8,700; about 1,109 designing customers
accept to start their configuration with a set of pre-set options). The results of
the realized profit levels generated under the named conditions are displayed in
Figure 4a). From the figure, we can see that we fully replicated the increas-
ing pattern of the four default sets considering the generated profit levels for the
key attributes. In other words, the different default sets could influence the aver-
age profit in the predicted direction, i.e. increase the generated revenue. We also
see that the effects differ in strength, which is consistent with our prediction that
higher level defaults increase the profit more strongly. We assessed the sta-
tistical significance of the key attributes profits through pairwise comparisons.
The results are displayed in Table 1. From the table we see that there exist
significant differences with respect to different default levels, and most impor-
tantly all default sets generate significantly higher profits for the key attributes
than the configurations without any pre-set options. This provides evidence
for the first-order default effect already discussed in various past research (e.g.,
Brown and Krishna 2004). This research has shown that consumers frequently
follow defaults when making their actual attribute level choice. Looking closer
at our results, we can confirm the findings of these previous studies. Table 2
provides the frequencies in percentages of how often customers have kept our
default suggestions within the different default sets. The first column of this ta-
ble contains the choice frequencies in the "no-default"-condition. First, we see
that the default acceptance rates for the different attribute levels range from 9%
to 87%, and are always higher than the choice frequencies in the "no default"-
condition. Depending on the attribute level, we consequently get rather high
Table 1 Pairwise Profit Comparisons

Pairwise T-Tests for Key attribute Profit Levels of ... (p-values)
                  Default Set I   Default Set II   Default Set III   Default Set IV
No Defaults       <0.0001         <0.0001          <0.0001           <0.0001
Default Set I                     0.0566           0.0081            <0.0001
Default Set II                                     0.4469            0.0123
Default Set III                                                      0.0826

Pairwise T-Tests for Target attribute Profit Levels of ... (p-values)
                  Default Set I   Default Set II   Default Set III   Default Set IV
No Defaults       0.2256          0.8196           0.0050            0.1851
Default Set I                     0.3062           0.2194            0.0676
Default Set II                                     0.0278            0.3869
Default Set III                                                      0.0037
acceptance rates. Second, we conducted several χ2-tests to determine the sta-
tistical significance of the higher choice frequencies for the four default sets.
All tests revealed significant differences for all attribute levels compared to the
baseline at significance levels less than α = 0.01. Finally, these results for the
key attributes provide strong additional evidence for the first-order effect of de-
faults in mass-customization. The generated profit levels for the target attributes
are provided in Figure 4b). The profit levels only match the predictions of our
model for default sets II and III. For the remaining default sets I and IV, we get
the opposite directions in the results. The corresponding significance levels for
the profit differences for the four default sets and the "no default"-baseline can,
as for the key attributes, be obtained from Table 1. Here, the only significant
differences occur between default set III and the "no default"-condition, set II
and III, and set III and IV, respectively. From these results, we can conclude
that there exists a "window" for defaults in mass customization for which our
model is able to correctly predict the profit directions for the target
attributes. A company’s challenge then is to determine this window in order to
adequately apply pre-set options in their product configurator. As we can see, an
overly extreme high-end default set can have the opposite of the intended effect. This
might be explained by boundary effects, that is, for the higher default level,
people are still aware of high spending early on in the mass-customization pro-
cess, which is then taken into account for the current choice. Unfortunately, we
Table 2 Comparison of Default Choice Frequency with No Default Baseline

Attribute Level                   No Defaults   Default Sets I-IV (left to right, empty cells omitted)
Business Package                  71.86%        87.33%***   83.90%***   81.93%***
18-Inches Aluminum Rims           15.29%        34.08%***
19-Inches Aluminum Rims           11.81%        26.61%***   20.17%***
Upholstery Fabric Mistral         16.97%        25.09%***
Upholstery Fabric Arkana          1.19%         9.59%***
Upholstery Leather Valcona        16.39%        27.42%***
Upholstery Leather Milano         9.37%         30.67%***
Multi-Function Steering Wheel     37.68%        54.68%***   51.21%***   65.13%***
Memory-Function for Frontseats    5.86%         32.60%***   44.96%***
*** p-value<0.001
did not have access to individual-level data that would have allowed us to incor-
porate budget constraints into our analysis. Other reasons for the mismatch in
prediction for default set I and default set IV might simply be that not
all customers kept all default levels as choices within their car configuration, or
made additional choices within the key attributes, or even among attributes not
considered in the analysis, and therefore not captured by the model. At any rate,
the aggregate results for the target attributes are worth a closer look.
As previously mentioned, the default acceptance rates under single attribute
consideration for the several pre-set options range from 9% to 87%. But consid-
ering the default sets as a whole, and separating the customers into "accepters"
and "rejecters" can then reveal the true model performance. Therefore, we sep-
arated each data set for each default condition (set I, set II, set III and set IV)
into those configurations that have accepted more than half of the defaults pro-
posed in the condition and into those configurations that have accepted less than
half the pre-set options. Figure 5 shows the split profits for all
four default sets with respect to the two different groups. The figure shows
that the predicted directions of profit increases for all default sets match
the realized profit levels for the group of default-accepters. For the default sets
I, II and IV, we see that the mean profits from the target attributes for the "re-
jecters" are significantly17 greater (set I and II), and significantly less (set IV)
than the mean profit from the "no default"-baseline, as opposed to the predic-
tions by our model. For default set III the effect for the "rejecters" is in the
same direction as for the "accepters". This explains the correct prediction for
the aggregate data for default set III. The correct prediction for default set II is
due to the interplay of sample size and difference margins such that the opposite
direction of the effect for the "rejecters" is of no consequence. Also considering
the sample sizes in the conditions and difference margins for default sets I and
IV, the contrary second-order effects obtained for the "rejecters" overwhelm the
predicted second-order effects. What we have shown here is the existence of
second-order default effects in mass-customization. For customers that follow
the defaults, our model accurately predicts the direction of the second-order
effect. What might happen for the "rejecters" is that they spend a lot of cogni-
tive effort to reject the defaults and consequently do not follow the underlying
mindset for a positive second-order effect. Altogether, this is evidence that
with intentionally chosen pre-set options the company can achieve a customer
lock-in to high margin decision paths. In other words, if companies are able to
steer their customers into such a high margin path early on in the configuration
process, they can benefit later on from easily generated profit at no additional
cost. The results also show that one should be careful in the use of defaults and
that companies cannot simply follow the belief that "a higher default level
generally translates into higher profit levels". Our analysis provides evidence
for two types of responses to defaults, (i) the acceptance response which can be
17 The significance of the differences in the target attribute profits for the separated groups has been assessed
by two-sample statistical tests. For the "accepters" in default sets I and III, we conducted Wilcoxon rank-sum tests
because the sample sizes are small (n=24 and n=31), and for the "rejecters" we conducted t-tests. The significance
levels in Figure 5 are indicated as follows: * p-value<.10, ** p-value<.05, *** p-value<.01
Figure 5 Generated Profit Levels of Target Attributes for Default-Combinations - Profit Split -
[Figure: four bar-chart panels of mean target-attribute profits (Profits|x) for "accepters" (more than half the defaults accepted) and "rejecters" (less than half the defaults accepted); each panel marks the predicted direction.]
a) Default set I (N=292): accepters n1=24, Profits|x=2968.75 ***; rejecters n2=268, Profits|x=3818.06 ***
b) Default set II (N=267): accepters n1=75, Profits|x=3271.47 ***; rejecters n2=192, Profits|x=3810.26 **
c) Default set III (N=248): accepters n1=31, Profits|x=3763.71; rejecters n2=217, Profits|x=3867.42 ***
d) Default set IV (N=238): accepters n1=103, Profits|x=3698.16; rejecters n2=135, Profits|x=3481.74 *
utilized according to our model, and (ii) the rejection response which leads to a
second-order effect of different direction. The second type of response to the
defaults is critical and can have several different origins. One explanation
might be the boundary effect that is responsible for the contrast in the shift of
profit levels among the target attributes when the defaults become too excessive.
There could be two arguments for this boundary effect: (i) the first-order default
effect is too strong, and consequently focalism and/or the selective accessibility
mechanism cannot explain the decision behavior, as customers are still aware
of high-level attribute choices made previously (due to high-level defaults), and
(ii) because customers frequently follow the manufacturer’s pre-specified lev-
els (Brown and Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park
et al. 2000), the design effort is not high enough to positively affect the willing-
ness to pay (Franke and Schreier 2010, Franke et al. 2010). The investigation of
the origins for such boundary effects and/or the assessment of the explanations
for the different response types would be beyond the scope of this paper, but is
certainly subject to further research. Here, we simply focus on the existence of
such first- and second-order default effects and provide a conceptual framework
to utilize these effects from a firm’s perspective. The effects are not limited to
a single direction, as our model can handle both positive and negative
first- and second-order default effects.
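The accepter/rejecter split described above can be sketched in a few lines. This is a minimal illustration, not the procedure used for Figure 5: the data layout, the function name `split_by_acceptance`, the attribute labels and the profit values are all hypothetical, and the handling of configurations that keep exactly half of the defaults is an assumption, since the text does not specify it.

```python
from statistics import mean


def split_by_acceptance(configurations, defaults):
    """Split configurations into accepters and rejecters of one default set.

    configurations: list of (chosen_levels, target_profit) pairs, where
        chosen_levels is a set of attribute levels (hypothetical layout).
    defaults: set of attribute levels pre-set in the default condition.
    Returns two lists of target-attribute profits.
    """
    accepters, rejecters = [], []
    for chosen_levels, target_profit in configurations:
        share_kept = len(chosen_levels & defaults) / len(defaults)
        if share_kept > 0.5:      # more than half the defaults accepted
            accepters.append(target_profit)
        elif share_kept < 0.5:    # less than half the defaults accepted
            rejecters.append(target_profit)
        # Exactly half: not covered by the text, left unassigned (assumption).
    return accepters, rejecters


# Hypothetical default set and configurations, for illustration only.
defaults = {"Business Package", "19-Inch Rims", "Leather Valcona"}
configurations = [
    ({"Business Package", "19-Inch Rims", "Leather Valcona"}, 3900.0),
    ({"Business Package", "19-Inch Rims"}, 3700.0),
    ({"Business Package"}, 3500.0),
    (set(), 3300.0),
]

accepter_profits, rejecter_profits = split_by_acceptance(configurations, defaults)
mean_accepters = mean(accepter_profits)
mean_rejecters = mean(rejecter_profits)
```

Each group mean would then be compared with the mean profit of the "no default" baseline, for instance with the two-sample tests named in footnote 17.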
To better quantify these results in a managerially relevant measure, the overall
profit levels per year, recall that the yearly multiplier for the additional profit
generated by different default sets is 1,109. Considering Figure 4a), we see
that between approximately 659,988.08 (= [3,672.39 − 3,077.27] × 1,109) and
1,530,907.96 (= [4,457.71 − 3,077.27] × 1,109) of additional yearly profit can
be realized among the key attributes with the different default sets compared
to no fixed pre-set options. For the target attributes the profit levels represent a
mixed blessing (see Figure 4b)). Here, the additional yearly profit ranges from
a loss of 108,970.34 (= [3,575.39 − 3,673.65] × 1,109) to a gain of 200,518.29
(= [3,854.46 − 3,673.65] × 1,109). This indicates that it is necessary to conduct
such an analysis when using default options in mass customization.
Consequently, using proper defaults, the additionally generated turnover can be
fairly large (e.g., 1,374,771.85, joint profit for default set III compared to the
no-default results), keeping in mind that the profit driver, setting the defaults,
does not cost the company a single cent.
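The yearly figures above follow from one multiplication each. As a small check, the sketch below reproduces them with exact decimal arithmetic; the function and variable names are illustrative and not part of the original analysis.

```python
from decimal import Decimal

# Yearly multiplier for the additional profit, as stated in the text.
YEARLY_MULTIPLIER = Decimal("1109")


def additional_yearly_profit(mean_profit_with_defaults, mean_profit_no_defaults):
    """Return (mean profit with defaults - baseline mean profit) x multiplier."""
    difference = Decimal(mean_profit_with_defaults) - Decimal(mean_profit_no_defaults)
    return difference * YEARLY_MULTIPLIER


# Key attributes (Figure 4a): baseline mean profit 3,077.27.
key_low = additional_yearly_profit("3672.39", "3077.27")
key_high = additional_yearly_profit("4457.71", "3077.27")

# Target attributes (Figure 4b): baseline mean profit 3,673.65.
target_low = additional_yearly_profit("3575.39", "3673.65")
target_high = additional_yearly_profit("3854.46", "3673.65")

print(key_low, key_high, target_low, target_high)
```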
3.3 Customer Satisfaction
Customer satisfaction should also be taken into account when considering
the use of defaults to increase profit margins. The objective is to investigate
the effect of encroaching on the customer’s free decision-making process
via pre-set options. One could assume that, besides feeling supported in their
decisions by defaults, individuals might also feel patronized in their
decision-making, which could, as a consequence, lower customer satisfaction and hurt
the company’s image. This would then be a negative side effect of default
strategies in mass customization. To address this issue, we designed a follow-up
study in which we asked people to go through the exact same decision-making
process as in the real online car configurator. For this study, 175 respondents were
drawn from an automotive panel that consisted of people who were planning
on buying a new car within the next 12 months (average age: 45.64, 48.81%
females). They were randomly assigned to one of five conditions consisting
of "no defaults", "set I", "set II", "set III" and "set IV". At the end, we asked
people to answer several questions on seven-point Likert scales. These ques-
tions are indicators for three latent constructs, namely customer satisfaction,
process complexity, and preference certainty. Together, these three dimensions
are used as an overall measure of customer satisfaction in our specific context.
The initial measurement scales can be obtained from Table A-4. The internal
consistency was reasonably good for all three scales (customer satisfaction:
Cronbach’s α=0.88, AVE=0.62; process complexity: Cronbach’s α=0.74,
AVE=0.44; preference certainty: Cronbach’s α=0.91, AVE=0.78). Therefore,
for further analysis, we calculated the mean scores and conducted a
multivariate analysis of variance (MANOVA). The MANOVA was applied to see
whether there are any differences in the overall customer satisfaction scores
between customers that configured their car in the different conditions. We
could not determine any statistically significant differences (Pillai’s trace=0.04,
F(12,492)=0.61, p-value=0.84).18 This implies that intentionally set defaults
do not negatively affect customer satisfaction. In addition to determining the
effect of defaults on customer satisfaction, we also assessed whether the
customers felt patronized when they started their configurations with pre-set
options. These customers answered additional questions, beyond those on
customer satisfaction, which are indicators for the
latent construct of perceived patronization when configuring the vehicle. The
measurement scale can be obtained from Table A-5. The reliability of the scale
was reasonably good (Cronbach’s α=0.85, AVE=0.87). An analysis of vari-
ance (ANOVA) did not detect any significant differences among the different
default conditions with respect to perceived patronization (F(3,130)=1.07, p-
value=0.37). With a mean score of 2.81, this indicates that people do not feel
patronized in their decision-making when they are confronted with intentionally
set defaults. As a last question, all of the respondents (all conditions) were asked
to also state their buying probability on a seven-point scale anchored at (1) very
unlikely and (7) very likely. An analysis of variance (ANOVA) for the buying
18 The other three test statistics for MANOVA (Wilks’ λ, Hotelling-Lawley trace and Roy’s maximum root)
also indicate that we have no significant differences (p-values>0.27).
probability also failed to detect any significant differences (F(1,167)=0.61,
p-value=0.44). Altogether, we provide strong evidence that customer satisfaction
and perceived patronization are not an issue for default strategies in mass
customization when they are applied properly.
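The scale reliabilities reported in this section are Cronbach's α values. As a reminder of what this statistic measures, the following sketch computes it from raw item ratings using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of the sum score). The ratings are invented for illustration and are not the study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a multi-item scale.

    item_scores: list of respondents, each a list of k item ratings
        (e.g., seven-point Likert responses).
    """
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_variances = [
        variance([respondent[i] for respondent in item_scores]) for i in range(k)
    ]
    # Sample variance of the respondents' sum scores.
    total_variance = variance([sum(respondent) for respondent in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of five respondents on a three-item scale.
ratings = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(ratings)
```

Values closer to 1 indicate that the items of a scale vary together across respondents.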
So far, we have shown that the methodology developed in Section 2
can be adequately used for analyzing mass-customization systems with respect
to first- and second-order effects. The main goal was to show how such ef-
fects can influence the generated profit levels of attribute combinations. We
show that the results can be used to determine optimal strategies for pre-set op-
tions, assuming that consumers frequently follow the manufacturer’s proposed
attribute level as shown in various past research (e.g., Johnson et al. 2002). We
can confirm this effect and extend its influence on subsequent decisions and
provide evidence for a second-order default effect. In addition, we have shown
that defaults do not have a negative effect on customer satisfaction, which would
be critical when implementing defaults in online configurators. In the following, we
conclude with a discussion, implications and directions for further research.
4. Conclusion
Mass customization is a growing business practice and a common strategy in the
automotive industry to simultaneously support consumer choice and increase
firm profits. Default strategies are commonly used to support the customer’s
decision-making process. Hereafter, we discuss our findings, draw implications,
reflect on limitations and motivate further research.
4.1 Discussion
In a mass-customization environment, customers have to make several decisions
regarding the desired levels of all required attributes. It is common business
practice for companies offering product configurators to support customers in
such decisions. Key drivers for customers’ willingness to pay are the prefer-
ence fit and design effort. Companies are challenged to strike a balance be-
tween these two dimensions (Dellaert and Stremersch 2005). To do so, defaults
can be used that play a decisive role in the customer decision-making process,
and companies provide such defaults for selected attribute levels to support cus-
tomers with their choices. Product configurations can be seen as multicategory
decision processes where due to the complementary nature of attribute levels,
a choice in one category affects the selection of attribute levels in other cate-
gories. When consumers design their desired product, it comprises individual
selections of attributes and their corresponding levels from each category. We
propose a statistical model incorporating the interplay between previous and
subsequent attribute selections. For our statistical analyses and empirical in-
vestigations, we do not use hypothetical data and convenience samples as in
various past studies. We analyze real life data from customers of a premium car
manufacturer. We show that previously chosen attribute levels have a significant
effect on subsequent option choices. The customers’ willingness to pay
is positively affected by a higher preference fit and adequate design effort
(Franke et al. 2010)19. Because in a mass-customization system the customer
participation happens during the configuration process (Piller et al. 2004), this
effect can be utilized from a firm’s perspective to increase the profitability of
mass customization. We provide additional evidence for strong first-order de-
fault effects as already shown in various past research as well as second-order
default effects of opposite directions (for two different response types) for these
pre-set options on later choices in the decision sequence.
4.2 Managerial and Research Implications
For the implementation of mass customization, companies can use our model to
specify combinations of pre-set options that are most likely to generate high net
profit levels according to their individual cost structure. Such profit levels can
easily be generated by simply using intentionally set defaults, that is, defaults
determined by our proposed methodology. Put differently, and using the ter-
minology of path dependence, we show how consumers can be "locked-in" to
high-margin decision paths, leading to optimal outcomes from firms’ perspec-
tives, utilizing the strong first- and second-order default effects. Further, the
model discussed in this paper is extremely flexible in its application, as it can
19Since a car is a high involvement product, we can assume the outcome for the customer is satisfactory (Franke
et al. 2009) and therefore design effort also positively affects the willingness to pay.
be applied to only a limited set of attributes of interest as well as to the entire
set of attributes within the configuration tool. We also provide an easily
implemented grid search optimization to determine optimal pre-set options for
the configurator. Furthermore, constraints such as a limited number of defaults
and/or restricted positioning of defaults within the configurator can easily be
accommodated by adjusting the support set of possible pre-set combinations within
the optimization procedure. The results also underline the necessity of such
an analysis for default application in mass-customization systems. Companies
cannot simply apply as many defaults as possible to skim additional profits from
the discussed default effects. We provide evidence for two different response
types to defaults that lead to different directions of the second-order effect and
for the possibility of boundary effects; consequently, defaults must be used
carefully. To utilize the discussed effects with respect to the
profitability of mass customization, companies are advised to conduct
such an analysis with respect to the direction of first- and especially second-order
default effects.
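The grid search over pre-set combinations, including the constraint on the number of defaults, can be illustrated as follows. This is only a sketch under stated assumptions: `predicted_profit` stands in for the profit prediction of the fitted model, and `toy_profit`, its margins and its penalty for an overly high-end combination are invented for illustration.

```python
from itertools import product


def grid_search_defaults(levels_by_attribute, predicted_profit, max_defaults=None):
    """Exhaustively search combinations of pre-set options.

    levels_by_attribute: dict mapping each attribute to the levels that may
        be pre-set; "no default" is always an additional option.
    predicted_profit: callable mapping {attribute: level} to expected profit
        (a stand-in for the model-based prediction).
    max_defaults: optional cap on the number of pre-set options.
    """
    attributes = list(levels_by_attribute)
    options = [[None] + list(levels_by_attribute[a]) for a in attributes]
    best_combo, best_profit = None, float("-inf")
    for choice in product(*options):
        combo = {a: lvl for a, lvl in zip(attributes, choice) if lvl is not None}
        # Restrict the support set of admissible combinations.
        if max_defaults is not None and len(combo) > max_defaults:
            continue
        profit = predicted_profit(combo)
        if profit > best_profit:
            best_combo, best_profit = combo, profit
    return best_combo, best_profit


# Hypothetical margins and a penalty for a too extreme high-end combination.
MARGINS = {
    "Business Package": 150.0,
    "18-Inch Rims": 200.0,
    "19-Inch Rims": 300.0,
    "Leather Valcona": 400.0,
    "Leather Milano": 450.0,
}


def toy_profit(combo):
    chosen = set(combo.values())
    total = sum(MARGINS[level] for level in chosen)
    if {"19-Inch Rims", "Leather Milano"} <= chosen:
        total -= 500.0
    return total


levels = {
    "Package": ["Business Package"],
    "Rims": ["18-Inch Rims", "19-Inch Rims"],
    "Upholstery": ["Leather Valcona", "Leather Milano"],
}
best_all, profit_all = grid_search_defaults(levels, toy_profit)
best_one, profit_one = grid_search_defaults(levels, toy_profit, max_defaults=1)
```

In this toy setting the highest-level combination is not the most profitable one, which mirrors the point that more extreme defaults do not automatically yield higher profits.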
From a scientific perspective, we show that there exists a window in which
the first- and second-order effects have the same direction for both response
types, and lead to higher-level attribute choices. The rationale for the
occurrence of such a high-margin window might be a combination of three different
mechanisms: focalism, the selective accessibility mechanism, and customization
effects. Focalism operates against budget constraints, as customers focus
extensively on the most recent decision and take less into account that
they have already made expensive high-level choices. Selective accessibility
mechanisms also weaken budget constraints and support complementary effects
such that attribute levels are chosen according to their combined fit, and
usually equal attribute levels tend to fit better together (e.g., high level with
high level). Customization effects, such as preference fit and design effort, in-
crease the customers’ willingness to pay and therefore also lower the sensitivity
to budget constraints. The equidirectionality of the first- and second-order ef-
fects cannot be exploited infinitely. Consequently, there has to be a boundary
mechanism whose rationale cannot be explained by our approach.
4.3 Limitations and Future Research
One limitation of our findings is that there could also be significant effects
among attribute combinations not captured in our model because, in sum, only
nine attributes have been considered. Another aspect to mention is that
we did not explicitly model interaction effects between subsequent option
choices in the group of target attributes. We only accommodated such possible
effects through a full correlation structure. In future work, it could be
investigated how such interaction effects influence subsequent option choices.
To do so, one would have to model the interaction effects explicitly, which
would allow the attribute level associations to be disentangled more precisely. Our
approach was targeted more toward managerially relevant output variables such
as profit. An analysis of the entire attribute set (approximately
250 with accessories) and the incorporation of explicit interaction parameters
(in our case 23 more parameters only considering two-way interactions) is
possible but remains a challenging task, as it requires intense computational
effort. As discussed throughout the paper, we analyze field data from an online
car configurator. This means that the configured cars have not necessarily been
purchased in the exact same setup, although these configurations are closer to
reality than hypothetically elicited data. A question for further investigation
would be the match between actual orders and the corresponding previously made car
configurations. Another direction for further research is to determine and
disentangle the mechanisms that are responsible for equidirectional effects and
how they are related. Here, an interesting question concerns the determinants
that define the different response types to defaults. Additionally, as discussed to
some extent in this paper, we observe boundary effects or directional changes for
the second-order default effects. The origins of such boundary effects are a
subject for further research. One approach to analyzing
the occurring interplay is to incorporate defaults in the analyses conducted by
Franke and Schreier (2010), and Franke et al. (2010) regarding the economic
value of mass-customized products from a customer’s perspective. Also, due
to a lack of data, we did not incorporate budget constraints into our analysis.
Further research should attempt to account for such mechanisms as well. An-
other body of research approaches mass customization within social networks
(Franke et al. 2008, Moreau and Herd 2010). Here, one could ask how feedback
systems influence customer choices at several stages in the customization
process. We hope to motivate further research in this and related directions.
To summarize, we give evidence for strong effects of pre-set options on the
attributes themselves as well as significant effects of those pre-set options on
subsequent choices (first- and second-order default effects). Further, we provide
the methodology for practitioners to conduct an adequate analysis of such
effects in product configurators. We describe the procedure from the attribute
selection to default optimization. Therefore, the results of such an analysis can
support companies in developing an elaborate strategy for pre-setting options in
mass-customization systems. Supported by our field study, we show that such
a strategy is required to skim additional profit levels at almost no cost. Finally,
we confirm that customer satisfaction does not suffer from intentionally set de-
faults in product configurators. Additionally, we could also show that defaults
in a product configurator do not have a negative impact on perceived patroniza-
tion and buying probability. This means that, for our application, individuals
configuring their car online did not negatively perceive the pre-set options in
the decision-making process.
References
Albert, J. H., S. Chib. 1993. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association 88(422) 669 – 679.
Alford, D., P. Sackett, G. Nelder. 2000. Mass customisation – an automotive perspective. International Journal of Production Economics 65(1) 99 – 110.
Argo, J. J., D. W. Dahl, R. V. Manchanda. 2005. The influence of a mere social presence in a retail context. Journal of Consumer Research 32(2) 207 – 212.
Brown, C. L., A. Krishna. 2004. The skeptical shopper: A metacognitive account for the effects of default options on choice. Journal of Consumer Research 31(3) 529 – 539.
Chib, S., E. Greenberg. 1995. Understanding the Metropolis-Hastings algorithm. The American Statistician 49(4) 327 – 335.
Cragg, J. G., R. S. Uhler. 1970. The demand for automobiles. The Canadian Journal of Economics 3(3) 386 – 406.
Davis, S. M. 1987. Future Perfect. Addison-Wesley Publishing, Reading, MA.
Dellaert, B. G. C., S. Stremersch. 2005. Marketing mass-customized products: Striking a balance between utility and complexity. Journal of Marketing Research 42(2) 219 – 227.
Duray, R., P. T. Ward, G. W. Milligan, W. L. Berry. 2000. Approaches to mass customization: configurations and empirical validation. Journal of Operations Management 18(6) 605 – 625.
Franke, N., P. Keinz, M. Schreier. 2008. Complementing mass customization toolkits with user communities: How peer input improves customer self-design. Journal of Product Innovation Management 25(6) 546 – 559.
Franke, N., P. Keinz, C. J. Steger. 2009. Testing the value of customization: When do customers really prefer products tailored to their preferences? Journal of Marketing 73(5) 103 – 121.
Franke, N., M. Schreier. 2010. Why customers value self-designed products: The importance of process effort and enjoyment. Journal of Product Innovation Management 27(7) 1020 – 1031.
Franke, N., M. Schreier, U. Kaiser. 2010. The "I designed it myself" effect in mass customization. Management Science 56(1) 125 – 140.
Gabaix, X., D. Laibson, G. Moloche, S. Weinberg. 2006. Costly information acquisition: Experimental analysis of a boundedly rational model. The American Economic Review 96(4) 1043 – 1068.
Geman, S., D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Analysis and Machine Intelligence 6 721 – 741.
Gentzkow, M. 2007. Valuing new goods in a model with complementarity: Online newspapers. The American Economic Review 97(3) 713 – 744.
Goldstein, D. G., E. J. Johnson, A. Herrmann, M. Heitmann. 2008. Nudge your customers toward better choices. Harvard Business Review 86(12) 99 – 105.
Hagle, T. M., G. E. Mitchell II. 1992. Goodness-of-fit measures for probit and logit. American Journal of Political Science 36(3) 762 – 784.
Häubl, G., B. G. C. Dellaert, B. Donkers. 2010. Tunnel vision: Local behavioral influences on consumer decisions in product search. Marketing Science 29(3) 438 – 455.
Homburg, C., N. Koschate, W. D. Hoyer. 2005. Do satisfied customers really pay more? a study of the relationship between customer satisfaction and willingness to pay. Journal of Marketing 69(2) 84 – 96.
Houston, D. A., D. R. Roskos-Ewoldsen. 1998. Cancellation and focus model of choice and preferences for political candidates. Basic & Applied Social Psychology 20(4) 305 – 312.
Imai, K., D. A. van Dyk. 2005a. A Bayesian analysis of the multinomial probit model using marginal data augmentation. Journal of Econometrics 124(2) 311 – 334.
Imai, K., D. A. van Dyk. 2005b. MNP: R package for fitting the multinomial probit model. Journal of Statistical Software 14(3) 1 – 32.
Iyengar, S., M. R. Lepper. 2000. When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology 96(6) 995 – 1006.
Johnson, E. J., S. Bellman, G. L. Lohse. 2002. Defaults, framing and privacy: Why opting in-opting out. Marketing Letters 13(1) 5 – 15.
Kotha, S. 1995. Mass customization: Implementing the emerging paradigm for competitive advantage. Strategic Management Journal 16 21 – 42.
Kotha, S. 1996. Mass-customization: a strategy for knowledge creation and organizational learning. Int. J. Technology Management 11(7/8) 846 – 858.
Levav, J., M. Heitmann, A. Herrmann, S. S. Iyengar. 2010. Order in product customization. The Journal of Political Economy 118(2) 274 – 299.
Liu, X., M. J. Daniels. 2006. A new algorithm for simulating a correlation matrix based on parameter expansion and re-parameterization. Journal of Computational and Graphical Statistics 15(4) 897 – 914.
Manchanda, P., A. Ansari, S. Gupta. 1999. The "shopping basket": A model for multicategory purchase incidence decisions. Marketing Science 18(2) 95 – 114.
McCulloch, R. E., N. G. Polson, P. E. Rossi. 2000. A Bayesian analysis of the multinomial probit model with fully identified parameters. Journal of Econometrics 99(1) 173 – 193.
McCulloch, R. E., P. E. Rossi. 1994. An exact likelihood analysis of the multinomial probit model. Journal of Econometrics 64(1 – 2) 207 – 240.
McKenzie, C. R. M., M. J. Liersch, S. R. Finkelstein. 2006. Recommendations implicit in policy defaults. Psychological Science 17(5) 414 – 420.
Miller, G. A. 1956. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review 63(2) 81 – 97.
Moreau, C. P., K. B. Herd. 2010. To each his own? how comparisons with others influence consumers’ evaluations of their self-designed products. Journal of Consumer Research 36(5) 806 – 819.
Muraven, M., R. F. Baumeister. 2000. Self-regulation and depletion of limited resources: Does self-control resemble a muscle? Psychological Bulletin 126(2) 247 – 259.
Mussweiler, T. 2003. Comparison processes in social judgment: Mechanisms and consequences. Psychological Review 110(3) 472 – 489.
Mussweiler, T., F. Strack. 1999. Hypothesis-consistent testing and semantic priming in the anchoring paradigm: A selective accessibility model. Journal of Experimental Social Psychology 35(2) 136 – 164.
Nagelkerke, N. J. D. 1991. A note on a general definition of the coefficient of determination. Biometrika 78(3) 691 – 692.
Park, C. W., S. Y. Jun, D. J. MacInnis. 2000. Choosing what I want versus rejecting what I do not want: An application of decision framing to product option choice decisions. Journal of Marketing Research 37(2) 187 – 202.
Piller, F. T., K. Moeslein, C. M. Stotko. 2004. Does mass customization pay? an economic approach to evaluate customer integration. Production Planning & Control 15(4) 435 – 444.
Randall, T., C. Terwiesch, K. T. Ulrich. 2005. Principles for user design of customized products. California Management Review 47(4) 68 – 85.
Randall, T., C. Terwiesch, K. T. Ulrich. 2007. User design of customized products. Marketing Science 26(2) 268 – 280.
Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing. Marketing Science 22(3) 304 – 328.
Salvador, F., P. M. de Holan, F. T. Piller. 2009. Cracking the code of mass customization. MIT Sloan Management Review 50(3) 71 – 78.
Schreier, M. 2006. The value increment of mass-customized products: an empirical assessment. Journal of Consumer Behaviour 5(4) 317 – 327.
Seetharaman, P. B., S. Chib, A. Ainslie, P. Boatwright, T. Chan, S. Gupta, N. Mehta, V. Rao, A. Strijnev. 2005. Models of multi-category choice behavior. Marketing Letters 16(3-4) 239 – 254.
Silveira, G. Da, D. Borenstein, F. S. Fogliatto. 2001. Mass customization: Literature review and research directions. International Journal of Production Economics 72(1) 1 – 13.
Song, I., P. K. Chintagunta. 2006. Measuring cross-category price effects with aggregate store data. Management Science 52(10) 1594 – 1609.
Sriram, S., P. K. Chintagunta, M. K. Agarwal. 2010. Investigating consumer purchase behavior in related technology product categories. Marketing Science 29(2) 291 – 314.
Wedel, M., J. Zhang. 2004. Analyzing brand competition across subcategories. Journal of Marketing Research 41(4) 448 – 456.
Wright, P. 2002. Marketplace metacognition and social intelligence. The Journal of Consumer Research 28(4) 677 – 682.
Zhang, X., W. J. Boscardin, T. R. Belin. 2006. Sampling correlation matrices in Bayesian models with correlated latent variables. Journal of Computational and Graphical Statistics 15 880 – 896.
Zhang, X., W. J. Boscardin, T. R. Belin. 2008. Bayesian analysis of multivariate nominal measures using multivariate multinomial probit models. Computational Statistics & Data Analysis 52(7) 3697 – 3708.
A. Tables and Figures
Table A-1 Attribute Specification: Confirmatory Contingency Analysis
Relation p-value χ2-Test Cramer’s V
Business Package × Navigation System <0.0001 0.212
Business Package × Radio <0.0001 0.221
Business Package × Phone Equipment <0.0001 0.126
Business Package × Sound System 0.0283 0.053
Rims × Navigation System <0.0001 0.147
Rims × Radio <0.0001 0.142
Rims × Phone Equipment <0.0001 0.104
Rims × Sound System <0.0001 0.230
Upholstery × Navigation System <0.0001 0.159
Upholstery × Radio <0.0001 0.202
Upholstery × Phone Equipment <0.0001 0.217
Upholstery × Sound System <0.0001 0.153
Steering Wheel × Navigation System <0.0001 0.100
Steering Wheel × Radio <0.0001 0.129
Steering Wheel × Phone Equipment <0.0001 0.094
Steering Wheel × Sound System <0.0001 0.195
Frontseats × Navigation System <0.0001 0.204
Frontseats × Radio <0.0001 0.180
Frontseats × Phone Equipment <0.0001 0.136
Frontseats × Sound System <0.0001 0.250
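For readers who want to reproduce association measures such as those in Table A-1, the sketch below computes Cramér's V from a contingency table of observed counts using the standard definition V = sqrt(χ² / (n · (min(rows, columns) − 1))). The counts are invented for illustration and are not the configurator data.

```python
from math import sqrt


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Pearson chi-square statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi2 / (n * k))


# Hypothetical counts: rows = package chosen (no/yes), columns = navigation level.
counts = [
    [120, 60, 20],
    [80, 90, 70],
]
v = cramers_v(counts)
```

V ranges from 0 (no association) to 1 (perfect association), so values such as those in Table A-1 can be compared across attribute pairs with different numbers of levels.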
Table A-2 Parameter Estimates: Posterior Means and 95% Posterior Intervals
Variable Label                                                                  Parameter   Posterior Mean   95% Posterior Interval
Configuration with - MMI Navigation                                             βConfig11   -2.5952          (-2.7931,-2.4132)
Configuration with - MMI Navigation Plus                                        βConfig12   -0.0040          (-0.0056,-0.0023)
Configuration with - MMI Radio Plus                                             βConfig21   -0.4063          (-0.4534,-0.3521)
Configuration with - Cell Phone Setup                                           βConfig31   -0.0791          (-0.1284,-0.0299)
Configuration with - Bluetooth Phone                                            βConfig32   -0.0293          (-0.0315,-0.0271)
Configuration with - Bluetooth Phone (Wireless Remote)                          βConfig33   -0.0426          (-0.0455,-0.0398)
Configuration with - DSP-Soundsystem                                            βConfig41    0.3081          (0.2563,0.3570)
Configuration with - BOSE Surround Sound                                        βConfig42   -0.0003          (-0.0023,0.0019)
Business-Package                                                                βKey11       0.0105          (0.0090,0.0120)
18-Inch Aluminum Rims                                                           βKey21      -0.0079          (-0.0098,-0.0057)
19-Inch Aluminum Rims                                                           βKey22      -0.0096          (-0.0118,-0.0076)
Upholstery Fabric Mistral                                                       βKey31       0.0046          (0.0025,0.0067)
Upholstery Fabric Arkana                                                        βKey32       0.0100          (0.0047,0.0152)
Upholstery Leather Valcona                                                      βKey33      -0.0068          (-0.0086,-0.0047)
Upholstery Leather Milano                                                       βKey34      -0.0051          (-0.0077,-0.0026)
Multi-Function Steering Wheel (4 Crossing Design)                               βKey41      -0.0046          (-0.0064,-0.0031)
Multi-Function Steering Wheel + Shift Compensator (4 Crossing Design)           βKey42      -0.0081          (-0.0109,-0.0055)
Multi-Function Steering Wheel (heated) + Shift Compensator (4 Crossing Design)  βKey43      -0.0095          (-0.0130,-0.0053)
Electric Frontseats                                                             βKey51      -0.0151          (-0.0178,-0.0124)
Memory-Function for Driver’s Seat (with electric adjustable Frontseats)         βKey52      -0.0155          (-0.0183,-0.0126)
Memory-Function for Frontseats                                                  βKey53      -0.0169          (-0.0201,-0.0139)
Table A-3 Default Sets
Key-Item | Default Set I | Default Set II | Default Set III | Default Set IV | Default Set Optimum
Business-Package | YES | YES | NO | YES | YES
18-Inch Aluminum Rims | NO | YES | NO | NO | NO
19-Inch Aluminum Rims | NO | NO | YES | YES | YES
Upholstery Fabric Mistral | NO | YES | NO | NO | NO
Upholstery Fabric Arkana | YES | NO | NO | NO | NO
Upholstery Leather Valcona | NO | NO | YES | NO | YES
Upholstery Leather Milano | NO | NO | NO | YES | NO
Multi-Function Steering Wheel (4 Crossing Design) | NO | YES | YES | YES | NO
Multi-Function Steering Wheel + Shift Compensator (4 Crossing Design) | NO | NO | NO | NO | NO
Multi-Function Steering Wheel (heated) + Shift Compensator (4 Crossing Design) | NO | NO | NO | NO | YES
Electric Frontseats | NO | NO | NO | NO | NO
Memory-Function for Driver’s Seat (with electric adjustable Frontseats) | NO | NO | NO | NO | NO
Memory-Function for Frontseats | NO | NO | YES | YES | YES
Table A-4 Initial measurement scales
Latent Variables with Indicators | Scale Based on
Satisfaction | Homburg et al. (2005)
SAT1 | All in all, I would be satisfied with my choices.
SAT2 | The configuration of the attributes is exactly what I wanted.
SAT3 | The choices I made would not meet my expectations. (R)
SAT4 | Would I have to choose among the same alternatives again, I would make the same decisions.
SAT5 | I have a good feeling considering the choices I just made.
Process Complexity | Dellaert and Stremersch (2005)
COMPL1 | I perceived the decision-making process as effortful.
COMPL2 | The configuration of the attributes was pleasant. (R)
COMPL3 | The decision-making was difficult for me.
COMPL4 | The configuration of the attributes was so much fun that I forgot about the time. (R)
COMPL5 | I perceived the decision-making process as complicated.
Preference Certainty | Argo et al. (2005)
PREF1 | I am sure that I made the right choices.
PREF2 | I am certain that the choices I made meet my expectations.
PREF3 | I am confident that I have identified the alternatives best meeting my needs.
Notes: All measures were assessed on seven-point scales, anchored by "strongly disagree" (1), and "strongly agree" (7). R = reverse scored.
Table A-5 Patronization measurement scale
Latent Variable with Indicators
Patronization
PATR1 | Through the pre-set options I felt constricted in my decisions.
PATR2 | Through the pre-set options I felt irritated.
PATR3 | Through the pre-set options I felt pressured in the decision-making process.
Notes: All measures were assessed on seven-point scales, anchored by "strongly disagree" (1), and "strongly agree" (7).
Figure A-1 Estimated Densities for the Model Parameters
[Figure: 21 panels, "Density of beta 1" to "Density of beta 21"; each panel plots the estimated posterior density f(βk) against βk, k = 1, ..., 21.]
Figure A-2 Densities - R2 and R2adj.
[Figure: densities f(R2) and f(R2adj.) plotted over the range 0.58 to 0.62; 95% HPD Interval [0.59,0.61].]
B. MCMC sampling
As introduced in sections 2.2 and 2.3, the MVMNP model assumes that, given
a set of explanatory variables, the multivariate multinomial response is an
indicator of the event that some unobserved latent variable vector falls within a
certain region. The latent variable is assumed to arise from the multivariate
normal distribution, $z_i \sim N(X_i\beta, \Sigma)$. The likelihood of the observed discrete
data $d = (d_1, \ldots, d_n)$ is then obtained by integrating over the multidimensional
constrained space of latent variables.
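$$
L(d \mid X, \beta, \Sigma) = \prod_{i=1}^{n} \int_{A_{i,1}} \cdots \int_{A_{i,J}}
\frac{1}{(2\pi)^{\frac{1}{2}\sum_{j=1}^{J}(k_j-1)} \, |\Sigma|^{1/2}}
\exp\left(-\frac{1}{2}(z_i - X_i\beta)^T \Sigma^{-1} (z_i - X_i\beta)\right) dz_i
\tag{B-1}
$$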
where the Ai,j’s are the intervals of compatible values for the latent variables
associated with the discrete choices di. Following the notation of Zhang et al.
(2008), the joint posterior density of β,Σ, and Z = (z1, . . . , zn), given the dis-
crete data d and its likelihood (B-1), is characterized as
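$$
p(\beta, \Sigma, Z \mid d) \propto p(\beta) \times p(\Sigma) \times \prod_{i=1}^{n} \left( I_i \times \phi(z_i \mid X_i, \beta, \Sigma) \right)
\tag{B-2}
$$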
where $\phi$ is the multivariate normal density function and
$$
I_i = \prod_{j=1}^{J} \left( 1_{[d_{i,j}=0,\; z_{i,j,l}<0,\; l=1,\ldots,k_j-1]}
+ \sum_{r=1}^{k_j-1} 1_{[d_{i,j}=r,\; z_{i,j,r}=\max_{1 \le l \le k_j-1}(z_{i,j,l},\,0)]} \right)
$$
with 1[E] the indicator function equal to 1 when the expression E is true and
0 otherwise. Thus, the function Ii is simply an indicator evaluating to 1 if
the choice vector di is compatible with the underlying latent vector zi (Zhang
et al. 2008). The likelihood function in (B-1) involves multidimensional inte-
grals, making classical inferences difficult. Therefore, we use MCMC meth-
ods yielding random draws from the joint posterior distribution of the parame-
ters. Inference is based on the distribution of the drawn sample. In our MCMC
sampling algorithm, we proceed with a combination of data augmentation (Al-
bert and Chib 1993), the Gibbs sampler (Geman and Geman 1984) and the
Metropolis-Hastings algorithm (Chib and Greenberg 1995). The algorithm consists
of three steps. First, we sample the parameter vector $\beta$ conditional on $\Sigma$,
$Z$ and $d$. Assuming a prior distribution $\beta \sim N(b, C)$ for $\beta$ and using standard
Bayesian linear model results, $\beta \mid \Sigma, Z, d$ has a multivariate normal distribution
$\beta \mid \Sigma, Z, d \sim N(\hat{\beta}, V_\beta)$, where
$V_\beta = \left(\sum_{i=1}^{n} X_i^T \Sigma^{-1} X_i + C^{-1}\right)^{-1}$ and
$\hat{\beta} = V_\beta \left(\sum_{i=1}^{n} X_i^T \Sigma^{-1} z_i + C^{-1} b\right)$. Second, we draw samples for the latent
variables $z_{i,j,l}$ $\forall i$ conditional on $X_i$, $\beta$, $\Sigma$, $d_i$ and
$z_{i,j(-l)} = (z_{i,j,1}, \ldots, z_{i,j,l-1}, z_{i,j,l+1}, \ldots, z_{i,j,k_j-1})$.
The latent variable $z_{i,j,l}$ follows a truncated normal distribution,
$z_{i,j,l} \sim N_{Trunc}(X_{i,j}\beta, \{\Sigma\}_{(q,q)})$, with lower bound equal to
$\max(z_{i,j(-l)}, 0)$ and upper bound equal to $\infty$, if $d_{i,j} = l$, and
$z_{i,j,l} \sim N_{Trunc}(X_{i,j}\beta, \{\Sigma\}_{(q,q)})$, with lower bound equal to $-\infty$ and upper
bound equal to $\max(z_{i,j(-l)}, 0)$ otherwise; $q = 1 + \sum_{s=1}^{j-1}(k_s - 1)$. The third and
last step of the algorithm samples the variance-covariance matrix Σ. For the
constrained variance-covariance matrix Σ, we use an adjustment of the param-
eter expanded re-parameterization and Metropolis-Hastings (PX-RPMH) algo-
rithm proposed by Liu and Daniels (2006) for correlation matrices. The idea
of this sampling algorithm for correlation matrices, and constrained variance-
covariance matrices respectively, is to relax the constraints of diagonal elements
set to one, and to freely sample a variance-covariance matrix that then follows
an inverse Wishart distribution (for details, see Liu and Daniels (2006)) [20]. In
our MCMC framework, we use a diffuse but proper prior for β: the multivariate
normal distribution with mean vector b = 0 and variance-covariance matrix
C = I · 10^6 (I the identity matrix). The estimated probability model can then
be used to determine those key-attribute choices that are most likely to generate
the maximum joint profit together with the target-attributes.
[20] A different sampling algorithm has been proposed by Zhang et al. (2006).
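The first of the three sampling steps, the draw of β conditional on Σ and Z, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the sampler used for the reported estimates: the function name `draw_beta`, the array layout (design matrices stacked along the first axis) and the simulated toy data are all hypothetical, and the truncated normal and PX-RPMH steps are left out.

```python
import numpy as np

def draw_beta(X, Z, Sigma, b, C, rng):
    """One Gibbs draw of beta | Sigma, Z under the prior beta ~ N(b, C).

    X: (n, m, p) stacked design matrices X_i (layout is an assumption)
    Z: (n, m) latent vectors z_i
    Sigma: (m, m) covariance of the latent vectors
    """
    Sigma_inv = np.linalg.inv(Sigma)
    C_inv = np.linalg.inv(C)
    # Posterior precision: sum_i X_i' Sigma^{-1} X_i + C^{-1}
    precision = np.einsum("nmp,mk,nkq->pq", X, Sigma_inv, X) + C_inv
    # Linear term: sum_i X_i' Sigma^{-1} z_i + C^{-1} b
    linear = np.einsum("nmp,mk,nk->p", X, Sigma_inv, Z) + C_inv @ b
    V_beta = np.linalg.inv(precision)
    V_beta = (V_beta + V_beta.T) / 2.0  # remove tiny numerical asymmetry
    beta_hat = V_beta @ linear
    return rng.multivariate_normal(beta_hat, V_beta), beta_hat, V_beta

# Toy check on simulated data (all values are made up for illustration).
rng = np.random.default_rng(1)
n, m, p = 500, 2, 3
beta_true = np.array([0.5, -1.0, 0.25])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
X = rng.normal(size=(n, m, p))
Z = X @ beta_true + rng.multivariate_normal(np.zeros(m), Sigma, size=n)
b, C = np.zeros(p), np.eye(p) * 1e6  # diffuse but proper prior, as in the text
beta_draw, beta_hat, V_beta = draw_beta(X, Z, Sigma, b, C, rng)
print(np.round(beta_hat, 3))
```

With the diffuse prior C = I · 10^6 the prior terms are negligible, so the conditional mean is close to the generalized least squares estimate based on the latent data.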
Article III
Stadel, D. P. (submitted). Online Data: Predictive Power or Obscure Delusion?
International Journal of Research in Marketing.
Online Data:
Predictive Power or Obscure Delusion?
Daniel Stadel ∗
∗Daniel Stadel ([email protected]) is a Ph.D. candidate at the University of St. Gallen, 9000 St. Gallen,
Switzerland.
Abstract
Nowadays, the internet is one of the most important information sources, and its popularity is
still growing. People increasingly include the internet, provided they have access to it, in their
information search. For example, people spend time searching for all kinds of information about
products ranging from groceries, vacations, and luxury products to cars and even real estate.
Companies by now usually provide all relevant information online so that it can easily be found
by their customers. Potential customers can then either visit the companies’ websites directly, or
use search engines such as Google and Yahoo to find the respective information on independent
third-party webpages. In either case, many different types of data can be obtained on consumers’
online search behavior, such as clickstream data, search queries, blogs, and even real product
choices (e.g., online product configurators). Thus, the world wide web is rapidly developing into
the world’s biggest data playground. In this paper, I investigate whether online data have
predictive power and can be utilized by companies to improve business forecast models, or
whether they are misleading. In particular, I consider weekly car orders of a renowned premium
European car manufacturer over a time period from June 2007 to December 2009. The online data
considered in the analysis are (i) online car configurations from the manufacturer’s own
webpage, and (ii) online search query data from Google Insights for Search. The online data
range from January 2007 to December 2009. Due to the nature of the data, time series models
are applied. First, a baseline model is specified without any covariates, and the car orders
are simply modeled as autoregressive processes. Second, in a time series regression framework,
I consider online car configurations as well as online search queries as possible predictors
for the variation in the original car order series. I also introduce a simple measure to
quantify the impact of covariates on predictive performance, the Forecast Impact Factor (FIF).
The results indicate that predictive performance, measured by forecast error, can be
significantly improved by incorporating online data. The findings suggest that the internet
should be considered for business forecasts, especially if no other data are available. All
model parameters are estimated within a Bayesian framework.
Key words: Online Data, Car Configurator, Google Insights for Search, Time Series, Forecast-
ing, Forecast Impact Factor, Bayesian Methods
1. Introduction
"If you can look into the seeds of time and say, which grain will grow,and which will not, speak then to me". This introductory quote by William
Shakespeare poetically hits the mark with respect to forecasting. The main objective
of the forecasting discipline in general is to predict the future of specific variables
of interest as accurately as possible; in the area of business forecasting, these range
from interest rate predictions to product sales forecasts. Forecasting has occupied the
research literature for over half a century (e.g., Winters 1960) and remains highly
relevant, as knowing what is going to happen can be the key to successful management
decisions, including inventory planning, production scheduling, and expense budgeting.
To best achieve the forecasting goal, one always draws on available information that is
assumed to be informative about the future. In weather forecasting, for example, certain
"signs" that occur in advance of the actual event of interest are used, such as
low-pressure areas to predict bad weather. This simply shows that indicators available in
advance of the events themselves are a main source of information and significantly
influence the forecast. With respect to business forecasting, including product sales,
future revenue volume, market shares, etc., one could either resort to information such
as past product sales, or to more general information such as inflation rates, consumer
satisfaction indices, sectoral indices, etc. Relying on past information is a
retrospective approach and assumes that the future behaves like the past, which is a main
weakness. Global information, such as indices, can only vaguely be linked to specific
brand performances or product sales. Prospective, closely consumer-related information on
specific topics, such as information on possible purchase intentions, can be linked
directly to possible future events. With growing technological development and the
growing integration of the developed world through the internet, there might be a chance
to open up a new valuable source of such consumer-related information. Such data then
might have enormous potential to indicate future developments. For example, as the
internet is becoming increasingly important for information search, it
is not unlikely that consumers research product characteristics such as prices,
quality, etc. in advance of an upcoming purchase. For example, Ratchford et al.
(2003) study the use of the internet as an information source among car buyers.
Their results indicate that, especially for internet-affine consumers, the online
source is of considerable importance for their information search prior to the
purchase (Ratchford et al. 2001). Klein and Ford (2003) also show that the
internet is an important factor within the information search, since more than
half of the automobile buyers use the internet in their search process. As the
internet is a continuously growing source of information, and by the end of 2009
on average 64% [21] of the population in the developed countries used the internet
(ITU 2010), the impact can be assumed to be significantly stronger than
back in the year 2000. Thus, the picture drawn from online search behavior
can be assumed to mirror a population’s interest and preference structure. For
example, Decker and Trusov (2010) estimate consumer preferences based on
online product reviews and offer an econometric framework for turning the
plenitude of online information into aggregate user preferences. This
indicates that data available online provide new opportunities to extract information
that can be used to better predict future events. Nevertheless, such a data surge
inevitably also raises new challenges. Heil et al. (2010) mention the
phenomenon of ROPO (= Research Online, Purchase Offline) in their note on the
new challenges the 21st century brings to the field of marketing. This
phenomenon simply implies that consumers, as already mentioned, use the internet
in their information search prior to their offline product purchases. The main
task analysts face with respect to online data is to extract valuable
information. Montgomery (2001) discusses quantitative marketing techniques that can
be used to solve internet marketing problems, e.g., banner targeting, consumer
online behavior, and trend tracking, including autoregressive models to predict
web usage (Montgomery 1999).
Access to such data on consumers’ online behavior can be used for analyses
of possible future offline behaviors and respective trends. The motto
"Reading the right signs right" then implies improved business forecasts.
Basically, there are three main aspects to achieve this goal: (i) the right data, (ii)
the right method, and (iii) the right context (variable to be predicted). In the
recent marketing literature, online data have been discussed for different
applications. Biyalogorsky and Naik (2003) study clickstream data to analyze whether
[21] 26% of the world’s population use the internet, with 64% in developed countries and 18% in developing countries (ITU 2010).
a firm’s online activity cannibalizes offline sales, and whether these activities
can also build (long-term) online equity. Bucklin and Sismeiro (2003), Sismeiro
and Bucklin (2004), Montgomery et al. (2004), Moe (2003, 2006) also analyze
clickstream data with respect to website browsing and purchase behavior. In
this context, another type of data, namely online product (movie) review (cri-
tiques) has also been investigated in order to forecast product sales (ticket-box
performance). Dellarocas and Awad (2007) show that professional critic re-
views substantially increase forecasting accuracy for movie sales, whereas Zhu
and Zhang (2010) also include consumer characteristics in a moderating role
into their analysis of the impact of online consumer reviews on product sales in
the video gaming industry. The research conducted by Chintagunta et al. (2010)
investigates the effect of online-word-of-mouth on movie ticket sales. The pre-
dictive power of online chatter has been investigated by Gruhl et al. (2005) who
show that the volume of online blog postings can be used to predict spikes in
actual consumer purchase decisions, and Dhar and Chang (2009) who broach
the issue of user-generated content in blogs and social networks for prediction
purpose of music sales.
Common ground with respect to this past research is the usage of online
data, such as clickstream data or online reviews, for predictive purposes. This
basically means, that online available information on consumer-product rela-
tions have been used to predict product related outcomes. In this paper, I follow
this background and study the impact of online product configurator informa-
tion on offline sales for a premium car manufacturer. In addition, information on
consumer online search behavior, as part of information search prior to product
purchases, is also included into the analysis. The data have been obtained from
Google Insights for Search, and include online search intensities for key-words
concerning the product of interest. For example, Ginsberg et al. (2009) showed
that such online data can adequately be used to predict probabilities for doctor’s
appointments based on search queries including influenza symptoms. Conse-
quently, in this paper, I assess the predictive power of online car configurations
and search engine queries with respect to two different car models for offline
sale forecasts. Due to the nature of sequentially observed data, all investiga-
tions are conducted in a time series framework. Forecasting and time series
approaches have to some extent been applied to analyze data in the field of mar-
keting for a long time (e.g., Makridakis and Wheelwright 1977, Hanssens 1980,
1998, Aaker et al. 1982, Franses 1991, 1994, Cain 2005, Lim et al. 2005, Wang
and Zhang 2008, Deleersnyder et al. 2009, Srinivasan et al. 2010). Dekimpe and
Hanssens (2000) review the usage of time series models in the marketing field
and advocate the application of such techniques due to increasing sizes of data sets,
the dynamics of the environment and the emergence of internet data sources.
The remainder of the paper is organized as follows. Section 2 introduces
the methodology of a simple autoregressive time series model without consid-
eration of online data. This model serves as the baseline to which the extended
model incorporating online data is then compared. In Section 2, the model
is also applied to the car order time series for the two car models discussed
throughout this paper, and the respective results are reviewed. In Section 3, I
discuss the time series regression approach under consideration of data available
online, and show how this method is applied to our real-world observations.
Subsequently, in Section 4, both approaches are compared, and the findings are
discussed with respect to the usability of online data as possible
predictors for offline car orders. I also introduce the Forecast Impact Factor as
a simple out-of-sample R2-based measure for predictive power. Finally, in
Section 5, I conclude with discussion, implications, limitations and further research
in the respective and also related directions.
2. Time series methodology for car orders
In this section, I will apply a simple time series model to weekly observed
(offline) orders of two car models of a renowned premium car manufacturer
without incorporation of any external variables such as online data. This model
serves as the baseline model throughout the paper. Let us now consider two time
series of weekly car orders for two different car models. The first car model
considered in my analysis is from the compact luxury car segment, termed as
"model I", and the second car model is from the mid-luxury car segment, termed
as "model II". The time series of the weekly car orders have been obtained over
a period of time, ranging from February 2008 to December 2009 (85 weeks)
Figure 1 Car orders for model I and model II
[Figure: two panels, a) Car Orders − Model I (weeks 9/2008 to 53/2009) and b) Car Orders − Model II (weeks 25/2007 to 53/2009); x-axis: Week/Year, y-axis: car orders. The legend distinguishes the time series data used for model estimation from the observations to be predicted.]
for model I, and over a period of time ranging from June 2007 to December
2009 (135 weeks) for model II. Figure 1 displays the time series of the car
orders for model I and model II, respectively. The figure shows considerable
variation in the weekly car orders for both models across
the periods for which the data have been observed. Consequently, at a later
stage in this research it is of interest what portion of this variation can be
explained by online data. In order to model the car orders as time series, I first
take a closer look at their respective structure, i.e., test for stationarity and
determine the orders of the autoregressive and moving average components. To
assess whether the series are stationary or not, Dickey-Fuller tests have been
conducted (Dickey and Fuller 1979). Based on a p-value of less than 0.05 [22] for
car model I, and a respective p-value of less than 0.01 [23] for car model II, the
null hypotheses of non-stationary processes could be rejected for both series.
Therefore, and in line with a simple time series framework, no transformations,
such as differencing and/or taking logarithms, have been applied to the data. As
a next step, the orders of the autoregressive and moving average components
need to be determined. Here, I build on standard time series techniques and
visually investigate the autocorrelation functions and the partial
autocorrelation functions, respectively. Briefly, whereas the autocorrelation function of
an autoregressive process of order p, AR(p), tails off, its partial
autocorrelation function cuts off sharply after lag p. Conversely, the
[22] Test-Statistic: τI = −2.0947; the p-value was obtained from Table 4.2, p. 103 of Banerjee et al. (1993).
[23] Test-Statistic: τII = −2.9851; the p-value was obtained from Table 4.2, p. 103 of Banerjee et al. (1993).
Figure 2 ACF and PACF for the car orders of model I and model II
[Figure: four panels, a) ACF: Car orders − Model I, b) PACF: Car orders − Model I, c) ACF: Car orders − Model II, d) PACF: Car orders − Model II; x-axis: Lag (up to 20), y-axis: ACF or Partial ACF.]
autocorrelation function of a moving average process of order q, MA(q), cuts off
after lag q, and the respective partial autocorrelation function tails off. For a
mixed process (ARMA) both the autocorrelation function and the partial
autocorrelation function decay. For a detailed discussion of time series analysis, and the
respective methods for model selection and identification, please refer to Box
et al. (2008). Figure 2 provides the autocorrelation and partial autocorrelation
functions for the two car order series. The figure suggests that for both
car order processes, for model I and model II, there is only one autoregressive
component (p = 1) and no moving average component (q = 0). Therefore, both
series can simply be modeled as AR(1)-processes. Another issue to be faced in
the context of the analysis of sequentially observed car orders is the fact that
integer-valued time series are being investigated. The time-series literature has
extensively discussed the analysis of count data in various applications
within the last decade (e.g., Jung and Tremayne 2003, Freeland and McCabe
2004a,b, Jung and Tremayne 2006, Karlis and Ntzoufras 2006, Zhu and Joe
2006, Kim and Park 2008, Davis and Wu 2009, Drost et al. 2009, Freeland
2009, Millar 2009, Silva et al. 2009, Weiss 2009), and consequently provides
the methodology for handling them properly. The car order observations for
model I range from 170 to 4084 (mean=1341.62), and the observed weekly car
orders for model II range from 66 to 2509 (mean=606.04), respectively. For
time series of such magnitudes, approximations using continuous time-series
models such as the autoregressive (AR) process with Gaussian errors are usually
adequate (Enciso-Mora et al. 2009). Thus, both car order series focused
on throughout this paper can appropriately be analyzed by applying traditional
time series techniques (see Box et al. 2008), namely assuming a normal error
distribution, $\varepsilon_t \overset{iid}{\sim} N(0, \sigma^2)$ $\forall t = 1, \ldots, T$.
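The identification rule described above (for an AR(1)-process the autocorrelation function tails off while the partial autocorrelation function cuts off after lag 1) can be checked on simulated data. The sketch below is illustrative only: it uses a simulated series with an assumed coefficient of 0.45 and a much longer sample than the car order data, and the helper names `sample_acf` and `sample_pacf` are hypothetical. The partial autocorrelations are obtained with the Durbin-Levinson recursion.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    lags = [np.dot(y[k:], y[:-k]) / denom for k in range(1, max_lag + 1)]
    return np.array([1.0] + lags)

def sample_pacf(y, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(y, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.array([rho[1]])  # coefficients of the order-1 fit
    pacf[1] = rho[1]
    for k in range(2, max_lag + 1):
        num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, rho[1:k])
        phi_kk = num / den
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

# Simulated AR(1) series; phi = 0.45 and n = 5000 are assumptions.
rng = np.random.default_rng(0)
n, phi = 5000, 0.45
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = phi * u[t - 1] + eps[t]

acf = sample_acf(u, 10)
pacf = sample_pacf(u, 10)
print(np.round(acf[1:4], 2), np.round(pacf[1:4], 2))
```

For such a series the sample ACF declines roughly geometrically, whereas the sample PACF is close to zero beyond lag 1, which is the pattern read off Figure 2 for both car order series.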
In summary, all necessary preliminary investigations of the two time series
to be analyzed have been conducted. The models have been identified as
stationary AR(1)-processes, and it has been shown that traditional time series
techniques are sufficient. In the following, I introduce the technical details of the
AR(1)-model, estimate the parameters, and discuss the respective results.
2.1 AR(1)-process for car orders
As derived above, we can model both time series as continuous AR(1)-processes.
Therefore, for both car order series, let us consider the following time series
model:
$$
y_t = \phi y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2), \quad \forall t = 1, \ldots, T \tag{1}
$$
For the stationary series of car orders, I now want to model the deviation from a
constant mean with an autoregressive error process. Including a constant mean
in the model formulation (1) above, we get $y_t = \mu + \phi(y_{t-1} - \mu) + \varepsilon_t$.
Replacing $y_{t-1} - \mu$ by $u_{t-1}$, and $y_t - \mu$ by $u_t$, respectively, this leads
to $u_t = \phi u_{t-1} + \varepsilon_t$. Thus, we can rewrite our AR(1)-model in (1) as a local level
model with autoregressive errors of order one such that
$$
\begin{aligned}
y_t &= \mu + u_t \\
u_t &= \phi u_{t-1} + \varepsilon_t \\
\varepsilon_t &\sim N(0, \sigma^2), \quad \forall t = 1, \ldots, T
\end{aligned}
\tag{2}
$$
The reason for this model formulation, i.e., a local level with the deviation
modeled as AR(1)-errors becomes more obvious in the time series regression
section, when online data covariates are introduced. Then, it is attempted to ex-
plain the variation around the local level μ by external variables such as online
car configurations and online search queries. But first, I estimate the simple
AR(1)-model for both car order series, assess the model fit and reflect on the
results.
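The equivalence between the mean-adjusted AR(1) recursion and the local level formulation (2) can be verified numerically. The following sketch is an illustration under assumptions: the parameter values (μ = 1300, φ = 0.45, σ = 500), the series length and the function names are made up and do not come from the estimation.

```python
import numpy as np

def simulate_mean_adjusted(mu, phi, eps, y0):
    """y_t = mu + phi * (y_{t-1} - mu) + eps_t, started at y0."""
    y = np.empty(len(eps))
    prev = y0
    for t, e in enumerate(eps):
        prev = mu + phi * (prev - mu) + e
        y[t] = prev
    return y

def simulate_local_level(mu, phi, eps, u0):
    """y_t = mu + u_t with u_t = phi * u_{t-1} + eps_t, started at u0."""
    u = np.empty(len(eps))
    prev = u0
    for t, e in enumerate(eps):
        prev = phi * prev + e
        u[t] = prev
    return mu + u

# Illustrative values only (assumptions, not estimates from the paper).
mu, phi, sigma = 1300.0, 0.45, 500.0
rng = np.random.default_rng(42)
eps = rng.normal(0.0, sigma, size=85)

u0 = 200.0
y_a = simulate_mean_adjusted(mu, phi, eps, y0=mu + u0)
y_b = simulate_local_level(mu, phi, eps, u0=u0)
print(np.allclose(y_a, y_b))
```

Both recursions generate the same path when fed the same innovations, which is why covariates can later be added to explain the deviations u_t around the level μ without changing the baseline model.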
2.2 Parameter estimates and car order forecasts with the simple AR(1)-process
One main goal of time series modeling is the provision of forecasts for future
observations in the series. In order to do so, the parameters have been estimated
within a Bayesian framework. The MCMC sampling algorithm applied follows
the approach of Chib (1993) and can be reviewed in appendix B. I ran sam-
pling chains for 12,000 iterations and assessed the convergence by monitoring
the time-series of the draws. The results are reported based on 10,000 draws re-
tained after discarding the first 2,000 draws as burn-in iterations. The diagnostic
plots, such as trace plots and posterior densities for the model parameters, can
be obtained from appendix A (Figure A-1 for car model I, and Figure A-2 for
car model II, respectively). The relevant statistics for the parameter estimates
are displayed in Table 1. The table shows that the posterior mean
for the autoregression coefficient is significantly different from zero for both
car models. The estimates also confirm the results from the unit root tests as
the autoregression coefficients have not been estimated to be close to 1. The
variance estimates are in line with the magnitude of the estimated constants,
as for both models they indicate a standard deviation of approximately 40% to
50% from the series means μI and μII , respectively. This implies a rather large
variation in the observed series. The exciting part is the question whether this
variation can sufficiently be reduced by the consideration of online data.
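The post-processing described above (discard the burn-in, then summarize the retained draws by a posterior mean and a 95% HPD interval) can be sketched as follows. The draws here are stand-ins generated from a normal distribution, not output of the MCMC sampler, and the function name `hpd_interval` is hypothetical; the interval is taken as the shortest window that contains the requested share of the sorted draws.

```python
import numpy as np

def hpd_interval(draws, prob=0.95):
    """Shortest interval containing a share `prob` of the draws."""
    x = np.sort(np.asarray(draws, dtype=float))
    n = len(x)
    m = int(np.ceil(prob * n))          # number of draws inside the interval
    widths = x[m - 1:] - x[:n - m + 1]  # width of every window of m draws
    i = int(np.argmin(widths))
    return x[i], x[i + m - 1]

# Stand-in chain of 12,000 "draws" (assumption: normal with mean 0.45, sd 0.1).
rng = np.random.default_rng(7)
chain = rng.normal(0.45, 0.1, size=12_000)

kept = chain[2_000:]                    # discard the first 2,000 as burn-in
lo, hi = hpd_interval(kept, prob=0.95)
print(round(float(kept.mean()), 3), round(float(lo), 3), round(float(hi), 3))
```

For a symmetric, unimodal posterior the HPD interval is close to the equal-tailed interval; the two differ for skewed posteriors such as that of a variance parameter.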
Using this baseline model, we can now calculate the one-step ahead predic-
tions as well as the n-step ahead predictions. Thus, the one-step ahead predic-
tions are given by
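$$
\hat{y}_t = E\left(\mu + \phi(y_{t-1} - \mu) + \varepsilon_t\right) = \mu + \phi(y_{t-1} - \mu) \tag{3}
$$
and the n-step-ahead predictions are given by
$$
\hat{y}_{t+n} = E\left(\mu + \phi^n (y_t - \mu) + \sum_{j=1}^{n} \phi^{n-j} \varepsilon_{t+j}\right) = \mu + \phi^n (y_t - \mu) \tag{4}
$$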
Table 1 Parameter estimates - Time series model without online data
Parameter | Car model I estimates [95%-HPD] | Car model II estimates [95%-HPD]
μ | 1306.42 [1096.26;1518.70] | 602.03 [515.55;695.41]
φ | 0.45419 [0.27765;0.65604] | 0.3926 [0.22689;0.56948]
σ2 | 263188.3 [182807.9;346532.5] | 88389.53 [68675.02;112781.34]
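with the corresponding normal predictive distributions
$$
y_t \sim N\left(\mu + \phi(y_{t-1} - \mu), \sigma^2\right) \tag{5}
$$
for the one-step-ahead predictions, and
$$
y_{t+n} \sim N\left(\mu + \phi^n (y_t - \mu), \; \sigma^2 \frac{1 - \phi^{2n}}{1 - \phi^2}\right) \tag{6}
$$
for the n-step-ahead predictions.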
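The n-step-ahead predictive mean and variance can be computed directly. In the sketch below the variance is obtained by summing the squared weights of the innovations in (4), that is σ² Σ_{j=1}^{n} φ^{2(n−j)} = σ² (1 − φ^{2n}) / (1 − φ²). The posterior means for car model I from Table 1 are used as plug-in values, which ignores parameter uncertainty; the last observed order count `y_last` and the function name are hypothetical.

```python
import numpy as np

def ar1_forecast(y_last, mu, phi, sigma2, horizon):
    """Predictive mean and variance for 1..horizon steps ahead."""
    n = np.arange(1, horizon + 1)
    mean = mu + phi ** n * (y_last - mu)
    # Sum of squared innovation weights: (1 - phi^(2n)) / (1 - phi^2)
    var = sigma2 * (1.0 - phi ** (2 * n)) / (1.0 - phi ** 2)
    return mean, var

# Plug-in values: posterior means for car model I (Table 1).
mu, phi, sigma2 = 1306.42, 0.45419, 263188.3
y_last = 2000.0  # hypothetical last observation, for illustration only

mean, var = ar1_forecast(y_last, mu, phi, sigma2, horizon=12)
print(np.round(mean[[0, 1, 11]], 1))
```

With φ around 0.45 the influence of the last observation decays quickly, so the forecasts approach the level μ within a few weeks, while the predictive variance rises toward its stationary limit σ² / (1 − φ²).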
Figure 3 provides the fitted series and the respective out-of-sample
predictions. The out-of-sample car order forecasts for both car models, together
with the respective observations, are listed in Table A-1. As the out-of-sample
forecasting horizon, I chose 12 weeks (3 months): companies report their
performance on a quarterly basis, so this was intended to be a managerially
relevant planning horizon. The figure and Table A-1 show that the out-of-sample
forecasts rapidly converge toward the local level μ of the series, as is common
for such time series models. This means that the mean of the series is the best
forecast, as we do not have any additional information to explain the variation
around the local level of the series. Section 3 attempts to explain this
variation with additional information available online.
In order to assess the goodness of fit of this baseline model (and to later
compare it to the extended model), we calculate the mean absolute percentage
error (MAPE) for the in-sample one-step-ahead predictions (for the T − 1
[Figure 3 Orders for car model I and car model II - Fitted Time Series. Panels: a) Car orders model I - Fitted series and predictions; b) Car orders model II - Fitted series and predictions. x-axis: Week/Year; y-axis: Car Orders with fitted values. Legend: Time series data; To be predicted; Out-of-sample forecasts.]
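predictions) as well as for the out-of-sample n-step-ahead predictions by
$$\text{In-sample MAPE} = \frac{1}{T-1} \sum_{t=2}^{T} \frac{|y_t - \hat{y}_t|}{y_t} \tag{7}$$
$$\text{Out-of-sample MAPE} = \frac{1}{n} \sum_{t=T+1}^{T+n} \frac{|y_t - \hat{y}_t|}{y_t} \tag{8}$$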
with $\hat{y}_t$ the respective forecast for time t. The calculated in-sample
MAPE for car model I was 0.3119, and the in-sample MAPE for car model II was
0.4087. These values are in line with the estimated variances expressed as
percentage deviations. Generally, in-sample fit measures tend to be reasonably
good because the model parameters are estimated from exactly that sample. The
actual degree of model fit is revealed by the out-of-sample fit measures, which
are especially important in the context of forecasting and time series modeling.
The respective out-of-sample MAPE values were 0.7738 for car model I and 0.8461
for car model II. This implies a rather large deviation of the out-of-sample
forecasts from the actually observed values. Consequently, it is desirable to
have additional information that reduces the uncertainty about future
observations. In the subsequent section, I extend the time series approach for
the car order series with two additional series of information as explanatory
variables.
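The MAPE in (7) and (8) is the same computation applied to different index ranges. A minimal sketch follows; the function name `mape` and the numbers are hypothetical and are not the car order data. The check for strictly positive observations is an added assumption that holds for order counts.

```python
def mape(actual, forecast):
    """Mean absolute percentage error: the mean of |y_t - yhat_t| / y_t."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(y <= 0 for y in actual):
        raise ValueError("this sketch assumes strictly positive observations")
    return sum(abs(y - f) / y for y, f in zip(actual, forecast)) / len(actual)


# Hypothetical weekly orders and a flat forecast at the series level.
observed = [1200.0, 800.0, 1500.0, 1000.0]
forecasts = [1300.0, 1300.0, 1300.0, 1300.0]
print(round(mape(observed, forecasts), 4))  # 0.2854
```

Passing the in-sample observations for t = 2, ..., T with their one-step-ahead forecasts gives (7); passing the n hold-out observations with the n-step-ahead forecasts gives (8).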
[Figure 4 Online configurations for car model I and car model II. Panels: a) Online Car Configurations - Model I; b) Online Car Configurations - Model II. x-axis: Week/Year; y-axis: Car Configurations.]
3. Time series regression for car orders and online data as predictors
In this section, I show how two different types of online data can
appropriately be incorporated into my analysis framework. To this end, I
obtained two time series of online data closely related to the corresponding car
model orders. The first data series for car model I and car model II are the
weekly car configurations of the respective models on the car manufacturer’s
website. In other words, the manufacturer knows how many cars of model I and II,
respectively, have been configured on its website. Figure 4 displays the time
series of the car configurations from June 2007 to December 2009 for model I and
from January 2007 to December 2009 for model II. The figure shows that the
online configurations vary around the series means. They are thus expected to
have some explanatory power for the variation in the car order series, as they
are assumed to reflect consumers’ average interest in a certain car model.
Because of missing values, both time series of online car configurations have
been interpolated from June 2008 to November 2008. The values are missing
because of software changes, during which the configurator was offline and could
not be used by customers. In order to include the online car configurations in
the analysis, they have been mean-centered with respect to the series mean.
Hence, I use the variation around a baseline level observed
[Figure 5 Google search intensity for car model I and car model II. Panels: a) Online Search Intensity - Model I; b) Online Search Intensity - Model II. x-axis: Week/Year; y-axis: Online Search Intensity.]
from the online configurations to explain the variation observed in the car
orders themselves. Recalling the formulation of our simple AR(1)-process
including a local level μ, the online configurations are used to explain a
portion of the variance in the car order series.
The second data series for the two car models are online search inquiries
recorded by Google Insights for Search. These data reflect the online search
intensity for certain key words over a period of time. In our case, the time
series show the search intensity for the key words that stand for "car model I"
and "car model II", respectively. Figure 5 shows the corresponding data series.
Here, it is also expected that information about consumers’ interests can
account for variation in the car order series. Like the online car
configurations, the time series of search intensities have been mean-centered.
The observations for the search intensities are normalized to range from 0 to
100, with 100 reflecting the maximum value within the considered period of time
(for details on the normalizing procedure, please refer to Google Insights for
Search). Before I continue with the technical details of the extended time
series regression model, I address the issue of matching the period in which the
online data have been observed to the corresponding offline actions (here: car
orders). In this paper, I do not have a theory-based assumption about specific
time lags for the different online data. That is, no hypothesis is tested as to
whether consumers configure their car, or search for product-related
[Figure 6 Local level model with deviations. x-axis: Time (1, 2, 3, ..., t, ..., T); y-axis: Car Orders. The car order series is drawn around the local level μ with deviations Δ1, Δ2, Δ3, ..., Δt, ..., ΔT, where Δt = deviation from the local level μ at time t.]
information at any specific time in advance. Although it is certainly worthwhile
to address how far in advance of a possible purchase new car buyers usually
visit third-party websites and online configuration tools on manufacturers’
websites, I proceed by estimating a model for each lag combination of online
configurations and online search intensity. The interesting question of what
determines the respective online search behavior is beyond the objective of this
paper. This means that a model is estimated for each combination of the 24
weekly lags considered for the two online data series, which implies a total of
24 × 24 = 576 models. The best-performing model can then be chosen for
forecasting purposes. In the following, I derive the regression model and review
the results.
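The lag search can be sketched as a plain grid search. The code below is an illustration under several assumptions, not the procedure used in this paper: ordinary least squares without AR(1) errors stands in for the Bayesian regression, the lag grid is taken to run from 1 to `max_lag` weeks, every lag pair is scored on the same observations so that the MAPEs are comparable, and the data are synthetic. All function names are hypothetical.

```python
import itertools
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def in_sample_mape(y, oc, si, lag_oc, lag_si, start):
    """Regress y_t on oc_{t-lag_oc} and si_{t-lag_si}; return the in-sample MAPE."""
    rows = range(start, len(y))
    X = [[1.0, oc[t - lag_oc], si[t - lag_si]] for t in rows]
    target = [y[t] for t in rows]
    beta = ols(X, target)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum(abs(a - f) / a for a, f in zip(target, fitted)) / len(target)


def best_lags(y, oc, si, max_lag):
    """Score every lag pair on a common sample; return the MAPE-minimizing pair."""
    grid = itertools.product(range(1, max_lag + 1), repeat=2)
    return min(grid, key=lambda lags: in_sample_mape(
        y, oc, si, lags[0], lags[1], max_lag))


# Synthetic data (not the car order data): orders react to configurations
# with a lag of 3 weeks and to search intensity with a lag of 5 weeks.
random.seed(1)
T = 150
oc = [random.gauss(0, 100) for _ in range(T)]
si = [random.gauss(0, 10) for _ in range(T)]
y = [1000.0 + (0.8 * oc[t - 3] - 5.0 * si[t - 5] if t >= 5 else 0.0)
     + random.gauss(0, 20) for t in range(T)]

print(best_lags(y, oc, si, max_lag=8))
```

Because the synthetic signal is strong relative to the noise, the search should recover the generating lags (3, 5). With real data and 576 candidate models, the minimum in-sample MAPE is more exposed to chance, which is one reason to check the selected model out of sample.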
3.1 Time series regression with AR(1)-errors
In the preceding sections, I have discussed a local level model for the car
order series. More technically, the dependent series of interest was assumed to
be constant with some variation around the series mean, say
$y_t = \mu + \Delta_t$. Figure 6 schematically displays this interpretation of
the car order series. In the baseline model, the deviation from the local level
μ at time t, $\Delta_t$, was modeled as a stationary AR(1)-process,
$\Delta_t = u_t$ with $u_t = \phi u_{t-1} + \varepsilon_t$. In this section,
I want to use online observables as covariates to explain part of that variation,
i.e., $\Delta_t = x_t'\beta + u_t$. The AR(1)-structure of the error terms is
maintained, as the data are still observed sequentially. Thus, to incorporate
covariates into the analysis, I consider a time series regression model
including an intercept and an error structure arising from a stationary
AR(1)-process. The model can be set up as follows:
$$\begin{aligned} y_t &= \mu + x_t'\beta + u_t \\ u_t &= \phi u_{t-1} + \varepsilon_t \\ \varepsilon_t &\sim N(0, \sigma^2), \quad \forall t = 1, \dots, T \end{aligned} \tag{9}$$
When we compare the two model formulations in (2) and (9), we see that the only
difference is the additional regression component $x_t'\beta$, which is meant to
explain part of the variation in the original time series. Next, I estimate the
model, derive forecasts and discuss the results.
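The link between (9) and a standard linear model can be made explicit by quasi-differencing. The following is a sketch of the usual argument for a given φ; that the sampling scheme in appendix B proceeds in exactly these steps is an assumption here.

```latex
\begin{aligned}
y_t - \phi y_{t-1}
  &= (1-\phi)\mu + (x_t - \phi x_{t-1})'\beta + (u_t - \phi u_{t-1}) \\
  &= (1-\phi)\mu + (x_t - \phi x_{t-1})'\beta + \varepsilon_t,
  \qquad t = 2, \dots, T.
\end{aligned}
```

Conditional on φ, the transformed data therefore follow a regression with independent N(0, σ²) errors. Setting β = 0 gives y_t = μ + φ(y_{t−1} − μ) + ε_t, which is the baseline AR(1) model around the local level μ.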
3.2 Car order forecasts with the time series regression model
This section presents the estimation of the time series regression model and
the respective results. In total, 576 regression models have been estimated in
a Bayesian framework. For each lag combination of the two online data sources,
the in-sample MAPE is reported in appendix A (Table A-2 for car model I and
Table A-3 for car model II). For the further analysis, only the models
minimizing the in-sample MAPE are considered. For car model I, a time lag of 14
weeks for car configurations and a time lag of 18 weeks for the search queries
performed best, with the in-sample MAPE as the measure of accuracy. For car
model II, we get a different lag structure. The lag combination performing best
with respect to the in-sample MAPE implies that, on average, customers configure
their car 23 weeks in advance of a purchase, and that they consult the internet
in their information search only 10 weeks prior to the order. The determination
of the lag structure from the data is certainly an issue for further
discussion, but it is beyond the scope of this paper. Here, I only want to show
that information accessible online can be used to improve forecasting
performance. Although a more theory-based approach might be desirable, I
continue with these data-driven results. As in Section 2.2, I used MCMC
sampling following Chib (1993, see appendix
Table 2 Parameter estimates - Time series regression with online data

Parameter            Car model I                      Car model II
                     estimate [95%-HPD]               estimate [95%-HPD]
μ                    1279.15 [1033.65; 1561.43]       578.2 [513.26; 648.59]
β_OC                 0.54959 [-0.13684; 1.32860]      0.74787 [0.12967; 1.24680]
β_SI                 -3.2163 [-15.8470; 11.1279]      -14.4326 [-21.0154; -7.4852]
φ                    0.49546 [0.30139; 0.69581]       0.25200 [0.07873; 0.42324]
σ²                   262342.8 [189475.9; 347134.3]    73950.78 [56503.31; 93591.92]
Time lag (weeks)
- Configurations     14                               23
- Search Intensity   18                               10
B). I ran sampling chains for 12,000 iterations and assessed convergence by
monitoring the time series of the draws. The posterior inference is again based
on 10,000 draws retained after discarding the first 2,000 draws as burn-in
iterations. The respective diagnostic plots can be found in appendix A
(Figure A-3 for car model I and Figure A-4 for car model II).
Table 2 provides the characteristics of the posterior parameter distributions.
The table shows that the estimates for the local levels $\mu_I$ and $\mu_{II}$
are close to the estimates from the baseline model. This suggests that the
mean-centered covariates are reasonably specified to explain a portion of the
variance around the series means. For car model I, the parameter
characteristics of the explanatory variables, online car configurations and
online search intensity, indicate weak predictive power: the 95% HPD regions of
both coefficients include zero, so either variable may affect car orders in
either direction. This is also confirmed by the large estimate for the variance,
which is close to the original estimate from the autoregressive model (1). For
car model II, we get better results. Table 2 shows that the parameter estimates
for both covariates are significantly different from zero, as their 95% highest
probability density regions do not change sign. Online car configurations
positively affect car orders, whereas online search intensity negatively
affects offline purchases. To
[Figure 7 Orders for car model I and car model II - Fitted Time Series. Panels: a) Car orders model I - Fitted series and predictions; b) Car orders model II - Fitted series and predictions. x-axis: Week/Year; y-axis: Car Orders with fitted values. Legend: Time series data; To be predicted; Out-of-sample forecasts.]
venture an explanation of these results, one could possibly say that customers
who planned to buy a new car, say 23 weeks in advance of the likely purchase,
and then changed their mind use the internet to search for alternatives.
However, interpreting the results in terms of behavioral decision theory is
beyond the scope of the present study. Figure 7 displays the fitted series as
well as the 12-week-ahead forecasts. The goodness-of-fit assessment yields an
in-sample MAPE of 0.3010 for car model I and an in-sample MAPE of 0.3547 for car
model II. The out-of-sample MAPEs are 0.7320 and 0.3482, respectively. To better
understand the benefits of including external variables such as online data in
the analysis, the next section provides a detailed comparison of the two
presented approaches.
4. Comparison of the car order forecasts with and without consideration of online data
In the previous two sections, I set up two different models for business
forecasting of car orders for two models, termed car model I and car model II,
respectively: (i) a simple stationary autoregressive time series model of order
one (Section 2), set up as a constant level model with autoregressive error
structure, and (ii) a time series regression model with autoregressive error
structure of order one (Section 3), incorporating online data such as online car
Table 3 Mean absolute percentage errors (MAPE)

                        Car model I    Car model II
With online data
  In-sample MAPE        0.3010         0.3547
  Out-of-sample MAPE    0.7320         0.3482
Without online data
  In-sample MAPE        0.3119         0.4087
  Out-of-sample MAPE    0.7738         0.8461
configurations and online search intensity as possible predictors. In this
section, I compare the predictive performance of the two approaches with respect
to forecast errors. The measure of forecasting accuracy is the mean absolute
percentage error (MAPE), as already used in Section 3 to determine the optimal
lag structure of the online predictors. Table 3 provides the calculated MAPE
values for the simple time series model of Section 2 as well as for the time
series regression model of Section 3 for the two car models. The table shows
that the time series regression approach including the online data outperforms
the simple time series approach without online data with respect to the MAPE for
both car models. For car model I, the in-sample MAPE of the one-step-ahead
forecasts is reduced by 3.49% (from 0.3119 to 0.3010), and the out-of-sample
MAPE of the 12-week forecasts is reduced by 5.4% (from 0.7738 to 0.7320). These
relative reductions indicate little predictive power of the online data for car
model I. This can also be seen from the process variances estimated by the two
approaches. Without consideration of online data, the process variance is
estimated to be $\sigma^2_I = 263188.3$ (see Table 1), compared to an estimated
variance of $\sigma^2_I = 262342.8$ (see Table 2) when online data are included
via time series regression. Thus, the estimated process variance could only be
reduced by 0.32% through the incorporation of the discussed online predictors.
This result is also confirmed by the adjusted $R^2$ obtained from the sampling
scheme provided in appendix B, which follows a data transformation and standard
linear model results (Chib 1993). The resulting $R^2_{adj}$ of 0.0039 from the
time series regression approach also indicates that only 0.39% of the variation
in the time series of the car orders for model I can be
[Figure 8 Comparison of Forecasts - with and without online data. Panels: a) Car order predictions - Model I; b) Car order predictions - Model II. x-axis: Week/Year (42/2009 to 53/2009); y-axis: Car orders, predictions. Legend: Observed car orders; Predictions without online data; Predictions with online data.]
explained by the online data. Hence, the online data did not contribute
significantly to the predictive performance for car model I. Figure 8a) shows
the 12 out-of-sample observations and the predictions of the two considered
approaches for car model I. Although the results do not support a sufficient
improvement of forecast performance through the incorporation of online data,
the figure still shows that the variation in the predicted series from the time
series regression is more in line with the variation in the observed data than
the predictions of the simple AR(1)-model, which rapidly converge toward a
constant mean. The predicted series from the model with online data reflects a
see-saw pattern similar to that of the real series. This might indicate that
online data can in fact be used to predict changes and variations in the car
order series for model I. For car model II, we find better predictive power of
the online data. The MAPE values in Table 3 show a reduction of the in-sample
forecast error by 13.21% and of the out-of-sample forecast error by 58.85%.
This indicates a strong predictive performance of the model estimated with
online data as explanatory variables. The estimated variances for the car order
series under the two approaches also confirm this improvement in model
performance. The variance for the simple AR(1)-model was estimated to be
$\sigma^2_{II} = 88389.53$ (see Table 1), compared to an estimated variance of
$\sigma^2_{II} = 73950.78$ (see Table 2) for the time series regression model
with online data as predictors. This shows that, by incorporating online car
configurations and online search intensity, the variance could be reduced by
16.34% ((88389.53 − 73950.78)/88389.53). The respective adjusted $R^2$ from the
regression-based estimation (see appendix B) was calculated to be 0.1769. This
also implies that the online data could explain over 17% of the variation in the
originally observed car order series. Figure 8b) displays the 12 out-of-sample
observations and the predictions of the two considered approaches for car model
II. As for car model I, and in an even more clear-cut fashion, the order
forecasts provided by the regression model with online data are closer to the
actually observed values. Again, the time series regression approach better
reflects the see-saw variation in the data. For model II, the results indicate
significant forecasting potential of online configurations and online search
intensity.
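The relative improvements discussed in this section can be recomputed from Tables 1 to 3. The sketch below assumes that an improvement is measured as (baseline − extended) / baseline; the function name `relative_reduction` and the dictionary labels are illustrative only.

```python
def relative_reduction(baseline, extended):
    """Reduction achieved by the extended model, as a fraction of the baseline."""
    return (baseline - extended) / baseline


# (without online data, with online data), values copied from Tables 1 to 3
comparisons = {
    "Car model I,  in-sample MAPE": (0.3119, 0.3010),
    "Car model I,  out-of-sample MAPE": (0.7738, 0.7320),
    "Car model I,  process variance": (263188.3, 262342.8),
    "Car model II, in-sample MAPE": (0.4087, 0.3547),
    "Car model II, out-of-sample MAPE": (0.8461, 0.3482),
    "Car model II, process variance": (88389.53, 73950.78),
}

for label, (baseline, extended) in comparisons.items():
    print(f"{label}: {100 * relative_reduction(baseline, extended):.2f}%")
```

Note that these are relative changes: a drop of the out-of-sample MAPE from 0.8461 to 0.3482 is a reduction of about 50 percentage points in the error itself.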
Forecast Impact Factor (FIF)
In order to operationalize the determination of the predictive power of external
variables for forecasting purposes, I introduce an $R^2$-based measure of
forecasting performance, the Forecast Impact Factor (FIF). The FIF is defined as
an out-of-sample $R^2$: it is the share of the baseline model's squared forecast
errors over the n out-of-sample observations that is accounted for by the
regression variables. Thus,
$$FIF_n(x) = 1 - \frac{\sum_{t=T+1}^{T+n} (\Delta_t^{REG})^2}{\sum_{t=T+1}^{T+n} (\Delta_t^{AR(1)})^2}, \tag{10}$$
with $\Delta_t = y_t - \hat{y}_t$ and $\hat{y}_t$ the forecast of the applied
model (REG or AR(1)). The FIF depends on the number n of out-of-sample
observations over which the squared errors are averaged, and on the explanatory
variables x. In addition to the overall FIF for a set of explanatory variables
x, it can also be of interest to determine the predictive power of a single
variable, given that other variables have been available and used for model
estimation. The conditional FIF for an explanatory variable $x_j$, given a set
of J explanatory variables, can then be calculated as
$$FIF_n(x_j \mid x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_J) = 1 - \frac{\sum_{t=T+1}^{T+n} (\Delta_t^{REG} + \beta_j x_{j,t})^2}{\sum_{t=T+1}^{T+n} (\Delta_t^{AR(1)})^2}. \tag{11}$$
Table 4 Unconditional and conditional Forecast Impact Factors

                        Car model I    Car model II
FIF_12(x_OC, x_SI)      0.0990         0.7595
FIF_12(x_OC | x_SI)     0.1119         0.3900
FIF_12(x_SI | x_OC)     0.0943         0.6048
The calculation of the conditional FIF for a set of covariates is
straightforward. With the Forecast Impact Factor, one can easily calculate the
forecast impact of one or more variables of interest, unconditionally or
conditional on a given set of other regressors, and thus the impact of each
variable in a set of information can be disentangled. For the application in
this paper, the unconditional and conditional Forecast Impact Factors for both
car models are reported in Table 4. The results are in line with those from the
preceding sections: for car model I we get rather low FIFs, compared to rather
high ones for car model II.
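Both Forecast Impact Factors follow directly from the forecast errors. The following is a minimal sketch of (10) and (11) as stated; the function names and the toy numbers are hypothetical and unrelated to the values in Table 4.

```python
def fif(y, yhat_reg, yhat_ar1):
    """Forecast Impact Factor (10): 1 - SSE(REG) / SSE(AR(1))."""
    sse_reg = sum((a - f) ** 2 for a, f in zip(y, yhat_reg))
    sse_ar1 = sum((a - f) ** 2 for a, f in zip(y, yhat_ar1))
    return 1.0 - sse_reg / sse_ar1


def conditional_fif(y, yhat_reg, yhat_ar1, beta_j, x_j):
    """Conditional FIF (11): beta_j * x_j is added to the regression
    forecast error at each time point before squaring."""
    sse_cond = sum((a - f + beta_j * x) ** 2
                   for a, f, x in zip(y, yhat_reg, x_j))
    sse_ar1 = sum((a - f) ** 2 for a, f in zip(y, yhat_ar1))
    return 1.0 - sse_cond / sse_ar1


# Toy out-of-sample data: four observations, two competing forecasts.
y = [10.0, 12.0, 8.0, 11.0]
yhat_ar1 = [10.0, 10.0, 10.0, 10.0]   # baseline errors: 0, 2, -2, 1
yhat_reg = [10.0, 11.0, 9.0, 11.0]    # regression errors: 0, 1, -1, 0
x_j = [0.0, 2.0, -2.0, 0.0]           # one mean-centered covariate
beta_j = 0.5

print(round(fif(y, yhat_reg, yhat_ar1), 4))                           # 0.7778
print(round(conditional_fif(y, yhat_reg, yhat_ar1, beta_j, x_j), 4))  # 0.1111
```

A FIF of 0 means the regression forecasts are no better than the baseline in squared-error terms, and negative values are possible when they are worse.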
To summarize, the results, although less convincingly for car model I,
generally indicate a large information potential of online data with respect to
forecasting performance. Thus, in the future, firms should embrace the challenge
of correctly incorporating online data into their forecasting procedures in
order to improve their predictive performance. In the subsequent final section,
I discuss general aspects of online data, the general findings of this study,
research and managerial implications, and motivate further research in related
directions.
5. Conclusion
Forecasting has long been an issue in the economic literature. In this paper,
I show by example how online data can appropriately be incorporated into a
simple business forecast model. Because the car order data are observed
sequentially (weekly), simple time series methods are applied. I investigate how
online car configurations available from a manufacturer’s webpage, as well as
freely available data on internet search behavior, improve car order forecasts. I
can show that information from these online data can, for one car model, account
for over 17% of the variation in the original car order series. The
corresponding Forecast Impact Factor of 0.76 provides evidence for the
predictive power of online data. Thus, there exist online data that companies
can use to improve their forecasting models. The goal of this paper is not to
postulate a monopoly of online data as predictors of future events, but simply
to demonstrate that, compared to a baseline model without any predictors, online
data can improve predictive performance, as they can account for a significant
portion of the variation in the series of interest.
From a methodological perspective, although methodology is not the main
objective of this paper, one may ask whether the models can be further improved
by using more complex approaches. For example, in a time series setting,
parameters that vary over time could account for changes in the importance of
the predictors. Such dynamics in the parameters could more precisely reflect the
changing relevance of the internet and consequently of online data; for example,
the predictive power of online search queries could increase over time as more
people gain regular access to the internet. State-space models in general, and
dynamic linear models in particular, which are closely related to time series
regression, may be better suited to cope with the additional challenge of system
dynamics that cannot be denied in an environment changing and developing as fast
as the world wide web. The methodology was certainly not the main contribution
of this paper, as I just wanted to indicate the growing relevance of online data
with respect to predictive power for future trends and events. Further research
on related topics incorporating sequentially observed online data should
therefore apply methods that better cope with the challenges of a fast-growing
and fast-changing relation structure. Though only simple time series methods
have been applied throughout this paper, they sufficed to provide evidence for
the predictive power of online data.
Another aspect to be discussed is the determination of the time lags for
the explanatory variables, online car configurations and online search
intensity. In this research, I simply used a data-driven approach, choosing the
time lags that minimize the in-sample mean absolute percentage error (MAPE).
Future work could consider more theoretically driven approaches that draw on
information about general online search behavior and about how far in advance of
a purchase people use the internet in their product information search. The
database for such analyses certainly exists and is growing every day. Another
critique might arise from a possible lack of reliability in the data. This issue
is difficult to address. But the use of online search queries and configuration
frequencies presupposes only that people show interest in certain topics and/or
products. This differs from user-generated content such as product reviews and
movie critiques, which could easily be manipulated to induce certain opinions.
The critical part there is that, as shown by Chintagunta et al. (2010), the
valence and not the volume tends to be the key driver for predictions. With
respect to online search queries, this issue seems to be less critical: the
influence of any manipulation of search intensity, if possible at all, is
marginal, because in this area it is the volume that matters.
In general, this research stream can be taken a lot further in very different
directions. First, the predictive power of online data should be investigated in
competition with traditional data incorporated into forecasting models. Such
results would establish the absolute benefit of recording customers’ online
behavior. Applying the conditional Forecast Impact Factor introduced in this
paper can then reveal the relevance of online data when competing with
traditionally considered indicators, e.g., gasoline prices in the automotive
industry. Second, having access to individual-level and sociodemographic data
can provide the basis for segment-specific Forecast Impact Factors. One starting
point here could be emerging social networks, such as Facebook or Myspace, as
users provide a lot of relevant information on their profiles. Recent
developments in the field of social networks, such as companies having Facebook
profiles, or third-party websites providing analytics for social media, offer a
great playground for the kind of analysis motivated in this paper.
To summarize, the emerging online community, and hence the corresponding
available data, offer a variety of interesting fields for further research. In
this paper, my aim was to provide evidence for the predictive power of online
data, and to demonstrate how such data can be used in forecasting models in a
simple way. The Forecast Impact Factor is a hands-on tool for assessing
predictive power and can be used to easily compare competing alternatives. I
hope that I could motivate further research in this and related directions.
References
Aaker, D. A., J. M. Carman, R. Jacobson. 1982. Modeling advertising-sales relationships involving feedback: A time series analysis of six cereal brands. Journal of Marketing Research 19(1) 116–125.
Banerjee, A., J. J. Dolado, J. W. Galbraith, D. F. Hendry. 1993. Cointegration, Error Correction, and the Econometric Analysis of Non-Stationary Data. Oxford University Press, Oxford.
Box, G. E. P., G. M. Jenkins, G. C. Reinsel. 2008. Time Series Analysis: Forecasting and Control. 4th ed. John Wiley & Sons, Inc.
Biyalogorsky, E., P. Naik. 2003. Clicks and mortar: The effect of on-line activities on off-line sales. Marketing Letters 14(1) 21–32.
Bucklin, R. E., C. Sismeiro. 2003. A model of web site browsing behavior estimated on clickstream data. Journal of Marketing Research 40(3) 249–267.
Cain, P. M. 2005. Modelling and forecasting brand share: A dynamic demand system approach. International Journal of Research in Marketing 22(2) 203–220.
Chib, S. 1993. Bayes regression with autoregressive errors: A Gibbs sampling approach. Journal of Econometrics 58(3) 275–294.
Chintagunta, P. K., S. Gopinath, S. Venkataraman. 2010. The effects of online user reviews on movie box office performance: Accounting for sequential rollout and aggregation across local markets. Marketing Science 29(5) 944–957.
Davis, R. A., R. Wu. 2009. A negative binomial model for time series of counts. Biometrika 96(3) 735–749.
Decker, R., M. Trusov. 2010. Estimating aggregate consumer preferences from online product reviews. International Journal of Research in Marketing 27(4) 293–307.
Dekimpe, M. G., D. M. Hanssens. 2000. Time-series models in marketing: Past, present and future. International Journal of Research in Marketing 17(2-3) 183–193.
Deleersnyder, B., M. G. Dekimpe, J.-B. E. M. Steenkamp, P. S. H. Leeflang. 2009. The role of national culture in advertising’s sensitivity to business cycles: An investigation across continents. Journal of Marketing Research 46(5) 623–636.
Dellarocas, C., X. Zhang, N. Awad. 2007. Exploring the value of online product reviews in forecasting sales: The case of motion pictures. Journal of Interactive Marketing 21(4) 23–45.
Dhar, V., E. A. Chang. 2009. Does chatter matter? The impact of user-generated content on music sales. Journal of Interactive Marketing 23(4) 300–307.
Dickey, D. A., W. A. Fuller. 1979. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74(366) 427–431.
Drost, F. C., R. van den Akker, B. J. M. Werker. 2009. Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models. Journal of the Royal Statistical Society / Series B 71(2) 467–485.
Enciso-Mora, V., P. Neal, T. Subba Rao. 2009. Efficient order selection algorithms for integer-valued ARMA processes. Journal of Time Series Analysis 30(1) 1–18.
Franses, P. H. 1991. Primary demand for beer in the Netherlands: An application of ARMAX model specification. Journal of Marketing Research 28(2) 240–245.
Franses, P. H. 1994. Modeling new product sales; an application of cointegration analysis. International Journal of Research in Marketing 11(5) 491–502.
Freeland, R. K. 2009. True integer value time series. AStA Advances in Statistical Analysis 94(3) 217–229.
Freeland, R. K., B. P. M. McCabe. 2004a. Analysis of low count time series data by Poisson autoregression. Journal of Time Series Analysis 25(5) 701–722.
Freeland, R. K., B. P. M. McCabe. 2004b. Forecasting discrete valued low count time series. International Journal of Forecasting 20(3) 427–434.
Geman, S., D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Analysis and Machine Intelligence 6 721–741.
Ginsberg, J., M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski, L. Brilliant. 2009. Detecting influenza epidemics using search engine query data. Nature 457(7232) 1012–1014.
Gruhl, D., R. Guha, R. Kumar, J. Novak, A. Tomkins. 2005. The predictive power of online chatter. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. KDD ’05, ACM, New York, NY, USA, 78–87.
Hanssens, D. M. 1980. Market response, competitive behavior, and time series analysis. Journal of Marketing Research 17(4) 470–485.
Hanssens, D. M. 1998. Order forecasts, retail sales, and the marketing mix for consumer durables. Journal of Forecasting 17(3-4) 327–346.
Heil, O., D. Lehmann, S. Stremersch. 2010. Marketing competition in the 21st century. International Journal of Research in Marketing 27(2) 161–163.
ITU (International Telecommunication Union). 2010. Measuring the information society. Tech. rep., UN Agency for Information and Communication Technologies.
Jung, R. C., A. R. Tremayne. 2003. Testing for serial dependence in time series models of counts. Journal of Time Series Analysis 24(1) 65–84.
Jung, R. C., A. R. Tremayne. 2006. Binomial thinning models for integer time series. Statistical Modelling 6 81–96.
Karlis, D., I. Ntzoufras. 2006. Bayesian analysis of the differences of count data. Statistics in Medicine 25(11) 1885–1905.
Kim, H., Y. Park. 2008. A non-stationary integer-valued autoregressive model. Statistical Papers 49(3) 485–502.
Klein, L. R., G. T. Ford. 2003. Consumer search for information in the digital age: An empirical study of prepurchase search for automobiles. Journal of Interactive Marketing 17(3) 29–49.
Lim, J., I. S. Currim, R. L. Andrews. 2005. Consumer heterogeneity in the longer-term effects of price promotions. International Journal of Research in Marketing 22(4) 441–457.
Makridakis, S., S. C. Wheelwright. 1977. Forecasting: Issues & challenges for marketing management. Journal of Marketing 41(4) 24–38.
Millar, R. B. 2009. Comparison of hierarchical Bayesian models for overdispersed count data using DIC and Bayes’ factors. Biometrics 65(3) 962–969.
Moe, W. W. 2003. Buying, searching, or browsing: Differentiating between online shoppers using in-store navigational clickstream. Journal of Consumer Psychology 13(1/2) 29–39.
Moe, W. W. 2006. An empirical two-stage choice model with varying decision rules applied to internet clickstream data. Journal of Marketing Research 43(4) 680–692.
Montgomery, A. L. 1999. Using clickstream data to predict WWW usage. Working paper, Graduate School of Industrial Administration, Carnegie Mellon University.
Montgomery, A. L. 2001. Applying quantitative marketing techniques to the internet. Interfaces 31(2) 90–108.
Montgomery, A. L., S. Li, K. Srinivasan, J. C. Liechty. 2004. Modeling online browsing and path analysis using clickstream data. Marketing Science 23(4) 579–595.
Ratchford, B. T., M.-S. Lee, D. Talukdar. 2003. The impact of the internet on information search for automobiles. Journal of Marketing Research 40(2) 193–209.
Ratchford, B. T., D. Talukdar, M.-S. Lee. 2001. A model of consumer choice of the internet as an information source. International Journal of Electronic Commerce 5(3) 7–21.
Silva, N., I. Pereira, M. E. Silva. 2009. Forecasting in INAR(1) model. REVSTAT – Statistical Journal 7(1) 119–134.
Sismeiro, C., R. E. Bucklin. 2004. Modeling purchase behavior at an e-commerce web site: A task-completion approach. Journal of Marketing Research 41(3) 306–323.
Srinivasan, S., M. Vanhuele, K. Pauwels. 2010. Mind-set metrics in market response models: An integrative approach. Journal of Marketing Research 47(4) 672–684.
Wang, F., X.-P. (Steven) Zhang. 2008. Reasons for market evolution and budgeting implications. Journal of Marketing 72(5) 15–30.
Weiss, C. 2009. Modelling time series of counts with overdispersion. Statistical Methods & Applications 18 507–519.
Winters, P. R. 1960. Forecasting sales by exponentially weighted moving averages. Management Science 6(3) 324–342.
Zhu, F., X. (Michael) Zhang. 2010. Impact of online consumer reviews on sales: The moderating role of product and consumer characteristics. Journal of Marketing 74(2) 133–148.
Zhu, R., H. Joe. 2006. Modelling count data time series with Markov processes based on binomial thinning. Journal of Time Series Analysis 27(5) 725–738.
A. Tables and figures
Table A-1 Out-of-sample forecasts and 95%-upper and lower bounds - without online data -

Car model I
n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
1             768          1079.59   74.09            2085.09
2             777          1203.40   99.05            2307.74
3             780          1259.63   135.97           2383.28
4             788          1285.17   157.56           2412.77
5             787          1296.76   168.35           2425.18
6             751          1302.03   173.45           2430.61
7             999          1304.43   175.81           2433.04
8             773          1305.51   176.89           2434.13
9             739          1306.01   177.38           2434.63
10            948          1306.23   177.61           2434.85
11            534          1306.33   177.71           2434.96
12            419          1306.38   177.76           2435.00

Car model II
n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
1             340          505.04    0 [-77.66]       1087.75
2             325          563.95    0 [-62.05]       1189.96
3             339          587.08    0 [-45.33]       1219.49
4             348          596.16    0 [-37.24]       1229.56
5             353          599.72    0 [-33.82]       1233.27
6             330          601.12    0 [-32.45]       1234.70
7             425          601.67    0 [-31.90]       1235.25
8             305          601.89    0 [-31.69]       1235.47
9             296          601.97    0 [-31.60]       1235.55
10            403          602.01    0 [-31.57]       1235.58
11            243          602.02    0 [-31.56]       1235.60
12            234          602.03    0 [-31.55]       1235.60

Note: Negative values have been truncated to zero as car orders are always greater than or equal to zero; the untruncated values are given in brackets.
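The forecasts in Table A-1 flatten out quickly, which is what a regression with AR(1) errors and no covariates implies: each further step shrinks the distance to the level μ by the factor φ, that is f(n+1) = μ + φ (f(n) − μ). The sketch below is only a plausibility check under an assumption made here: it plugs in the posterior means reported in Figure A-1 (μ = 1306.42, φ = 0.45419) and starts from the tabulated one-step forecast of Car model I. The dissertation's forecasts may instead be averaged over posterior draws, and the function names are hypothetical.

```python
def ar1_forecast_path(first_forecast, mu, phi, steps):
    """Iterate f(n+1) = mu + phi * (f(n) - mu), starting from the 1-step forecast."""
    path = [first_forecast]
    for _ in range(steps - 1):
        path.append(mu + phi * (path[-1] - mu))
    return path


def truncate_at_zero(bound):
    """Car orders cannot be negative, so negative interval bounds are reported as zero."""
    return max(0.0, bound)


# Plug-in values: posterior means from Figure A-1, first forecast from Table A-1.
path = ar1_forecast_path(1079.59, mu=1306.42, phi=0.45419, steps=12)
for n, forecast in enumerate(path, start=1):
    print(n, round(forecast, 2))
```

Under these plug-in values the recursion reproduces the tabulated Car model I forecasts to within rounding, which suggests that the point forecasts revert geometrically to the estimated level.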
Table A-2 In-sample mean absolute percentage error (MAPE) - Car model I

Rows: lag of Google Search Intensity. Columns: lag of Online Car Configurations.

Online Car Configurations, Lag 1 to Lag 12
        Lag 1   Lag 2   Lag 3   Lag 4   Lag 5   Lag 6   Lag 7   Lag 8   Lag 9   Lag 10  Lag 11  Lag 12
Lag 1   0.3086  0.3096  0.3093  0.3093  0.3150  0.3125  0.3102  0.3065  0.3112  0.3090  0.3082  0.3071
Lag 2   0.3170  0.3166  0.3165  0.3141  0.3181  0.3160  0.3138  0.3118  0.3112  0.3105  0.3119  0.3091
Lag 3   0.3141  0.3150  0.3160  0.3130  0.3173  0.3133  0.3135  0.3085  0.3098  0.3091  0.3085  0.3086
Lag 4   0.3117  0.3119  0.3102  0.3100  0.3153  0.3164  0.3127  0.3101  0.3124  0.3109  0.3090  0.3077
Lag 5   0.3158  0.3159  0.3169  0.3142  0.3168  0.3159  0.3112  0.3115  0.3120  0.3108  0.3115  0.3103
Lag 6   0.3163  0.3162  0.3162  0.3146  0.3192  0.3167  0.3127  0.3116  0.3110  0.3100  0.3103  0.3094
Lag 7   0.3128  0.3121  0.3119  0.3109  0.3158  0.3141  0.3109  0.3085  0.3084  0.3105  0.3075  0.3074
Lag 8   0.3161  0.3162  0.3161  0.3131  0.3186  0.3156  0.3138  0.3118  0.3102  0.3098  0.3107  0.3096
Lag 9   0.3145  0.3152  0.3143  0.3133  0.3183  0.3160  0.3131  0.3111  0.3100  0.3111  0.3112  0.3097
Lag 10  0.3163  0.3157  0.3166  0.3147  0.3188  0.3150  0.3138  0.3126  0.3114  0.3113  0.3101  0.3104
Lag 11  0.3162  0.3170  0.3170  0.3163  0.3207  0.3169  0.3128  0.3124  0.3097  0.3099  0.3123  0.3098
Lag 12  0.3117  0.3119  0.3106  0.3100  0.3141  0.3130  0.3106  0.3062  0.3073  0.3064  0.3034  0.3030
Lag 13  0.3158  0.3154  0.3170  0.3145  0.3172  0.3156  0.3122  0.3121  0.3092  0.3101  0.3114  0.3125
Lag 14  0.3148  0.3162  0.3155  0.3119  0.3162  0.3155  0.3114  0.3114  0.3098  0.3091  0.3091  0.3104
Lag 15  0.3141  0.3155  0.3159  0.3130  0.3178  0.3161  0.3119  0.3113  0.3112  0.3101  0.3117  0.3096
Lag 16  0.3164  0.3154  0.3166  0.3138  0.3153  0.3165  0.3118  0.3122  0.3094  0.3096  0.3124  0.3086
Lag 17  0.3135  0.3114  0.3101  0.3092  0.3139  0.3108  0.3072  0.3087  0.3065  0.3060  0.3078  0.3049
Lag 18  0.3161  0.3155  0.3157  0.3141  0.3184  0.3159  0.3104  0.3108  0.3100  0.3097  0.3116  0.3096
Lag 19  0.3117  0.3111  0.3113  0.3063  0.3091  0.3081  0.3036  0.3045  0.3044  0.3052  0.3039  0.3032
Lag 20  0.3167  0.3151  0.3152  0.3132  0.3184  0.3151  0.3116  0.3106  0.3089  0.3110  0.3107  0.3112
Lag 21  0.3150  0.3137  0.3164  0.3136  0.3170  0.3141  0.3118  0.3098  0.3082  0.3092  0.3102  0.3103
Lag 22  0.3178  0.3146  0.3171  0.3142  0.3150  0.3166  0.3128  0.3098  0.3102  0.3097  0.3109  0.3076
Lag 23  0.3179  0.3169  0.3161  0.3126  0.3167  0.3161  0.3118  0.3109  0.3103  0.3103  0.3107  0.3103
Lag 24  0.3174  0.3153  0.3164  0.3135  0.3178  0.3156  0.3136  0.3111  0.3097  0.3105  0.3114  0.3099

Online Car Configurations, Lag 13 to Lag 24
        Lag 13  Lag 14  Lag 15  Lag 16  Lag 17  Lag 18  Lag 19  Lag 20  Lag 21  Lag 22  Lag 23  Lag 24
Lag 1   0.3082  0.3050  0.3092  0.3081  0.3059  0.3079  0.3102  0.3093  0.3149  0.3040  0.3135  0.3215
Lag 2   0.3111  0.3038  0.3152  0.3128  0.3095  0.3129  0.3180  0.3162  0.3153  0.3096  0.3173  0.3190
Lag 3   0.3089  0.3034  0.3132  0.3113  0.3060  0.3112  0.3139  0.3129  0.3164  0.3082  0.3170  0.3168
Lag 4   0.3091  0.3079  0.3120  0.3120  0.3107  0.3113  0.3139  0.3121  0.3124  0.3092  0.3151  0.3197
Lag 5   0.3103  0.3067  0.3143  0.3130  0.3096  0.3158  0.3182  0.3150  0.3170  0.3097  0.3193  0.3205
Lag 6   0.3111  0.3046  0.3149  0.3137  0.3092  0.3119  0.3182  0.3156  0.3181  0.3108  0.3173  0.3195
Lag 7   0.3084  0.3030  0.3112  0.3100  0.3066  0.3104  0.3114  0.3127  0.3143  0.3063  0.3133  0.3181
Lag 8   0.3107  0.3045  0.3141  0.3128  0.3101  0.3129  0.3188  0.3149  0.3177  0.3107  0.3174  0.3200
Lag 9   0.3101  0.3058  0.3147  0.3142  0.3087  0.3139  0.3166  0.3158  0.3151  0.3096  0.3171  0.3201
Lag 10  0.3114  0.3052  0.3149  0.3139  0.3092  0.3140  0.3163  0.3167  0.3168  0.3104  0.3176  0.3199
Lag 11  0.3118  0.3039  0.3140  0.3120  0.3090  0.3127  0.3182  0.3155  0.3174  0.3109  0.3182  0.3196
Lag 12  0.3061  0.3011  0.3074  0.3092  0.3036  0.3069  0.3110  0.3105  0.3108  0.3042  0.3116  0.3140
Lag 13  0.3113  0.3027  0.3153  0.3150  0.3099  0.3148  0.3187  0.3170  0.3157  0.3116  0.3181  0.3187
Lag 14  0.3117  0.3048  0.3142  0.3132  0.3094  0.3138  0.3161  0.3157  0.3151  0.3110  0.3158  0.3188
Lag 15  0.3118  0.3046  0.3149  0.3146  0.3094  0.3147  0.3175  0.3153  0.3159  0.3105  0.3163  0.3200
Lag 16  0.3111  0.3043  0.3156  0.3140  0.3096  0.3136  0.3181  0.3171  0.3172  0.3104  0.3173  0.3196
Lag 17  0.3089  0.3033  0.3110  0.3114  0.3078  0.3099  0.3126  0.3124  0.3116  0.3096  0.3130  0.3171
Lag 18  0.3102  0.3010  0.3149  0.3142  0.3071  0.3150  0.3176  0.3163  0.3169  0.3092  0.3172  0.3197
Lag 19  0.3052  0.3010  0.3080  0.3088  0.3046  0.3081  0.3094  0.3114  0.3102  0.3072  0.3115  0.3123
Lag 20  0.3118  0.3011  0.3150  0.3134  0.3096  0.3146  0.3206  0.3160  0.3179  0.3101  0.3186  0.3204
Lag 21  0.3093  0.3047  0.3145  0.3125  0.3087  0.3135  0.3169  0.3164  0.3160  0.3105  0.3165  0.3184
Lag 22  0.3102  0.3055  0.3140  0.3149  0.3100  0.3142  0.3183  0.3175  0.3153  0.3127  0.3185  0.3211
Lag 23  0.3113  0.3043  0.3132  0.3141  0.3096  0.3120  0.3189  0.3180  0.3168  0.3106  0.3174  0.3203
Lag 24  0.3105  0.3039  0.3137  0.3149  0.3095  0.3135  0.3216  0.3176  0.3190  0.3109  0.3187  0.3211
Table A-3 In-sample mean absolute percentage error (MAPE) - Car model II

Rows: lag of Google Search Intensity. Columns: lag of Online Car Configurations.

Online Car Configurations, Lag 1 to Lag 12
        Lag 1   Lag 2   Lag 3   Lag 4   Lag 5   Lag 6   Lag 7   Lag 8   Lag 9   Lag 10  Lag 11  Lag 12
Lag 1   0.3895  0.3985  0.3981  0.3960  0.4044  0.3989  0.3988  0.4064  0.4029  0.3995  0.3993  0.4034
Lag 2   0.3766  0.3873  0.3913  0.3860  0.3959  0.3899  0.3923  0.4016  0.3960  0.3927  0.3923  0.3993
Lag 3   0.3843  0.3890  0.3917  0.3865  0.3993  0.3939  0.3912  0.4035  0.3983  0.3926  0.3963  0.4004
Lag 4   0.3920  0.3972  0.3981  0.3975  0.4049  0.4002  0.3992  0.4050  0.4055  0.4027  0.4026  0.4055
Lag 5   0.3740  0.3810  0.3789  0.3753  0.3879  0.3832  0.3832  0.3916  0.3851  0.3840  0.3839  0.3865
Lag 6   0.3817  0.3873  0.3882  0.3822  0.3910  0.3863  0.3884  0.3973  0.3926  0.3913  0.3922  0.3938
Lag 7   0.3891  0.3936  0.3934  0.3939  0.3975  0.3927  0.3928  0.4036  0.3989  0.3961  0.3958  0.4017
Lag 8   0.3765  0.3821  0.3809  0.3785  0.3849  0.3773  0.3794  0.3894  0.3849  0.3835  0.3818  0.3887
Lag 9   0.3895  0.3969  0.3973  0.3942  0.4020  0.3960  0.3969  0.4076  0.4024  0.4023  0.3999  0.4045
Lag 10  0.3605  0.3719  0.3706  0.3685  0.3736  0.3671  0.3668  0.3761  0.3689  0.3695  0.3698  0.3751
Lag 11  0.3787  0.3819  0.3824  0.3807  0.3883  0.3828  0.3853  0.3948  0.3888  0.3853  0.3870  0.3906
Lag 12  0.3829  0.3877  0.3862  0.3866  0.3908  0.3876  0.3914  0.3928  0.3904  0.3861  0.3852  0.3926
Lag 13  0.3720  0.3787  0.3797  0.3755  0.3814  0.3794  0.3826  0.3908  0.3843  0.3806  0.3780  0.3819
Lag 14  0.3843  0.3933  0.3934  0.3920  0.3958  0.3902  0.3938  0.4032  0.3999  0.3940  0.3931  0.3987
Lag 15  0.3687  0.3724  0.3793  0.3708  0.3820  0.3742  0.3746  0.3828  0.3818  0.3755  0.3740  0.3804
Lag 16  0.3892  0.3969  0.3965  0.3940  0.4018  0.3947  0.3932  0.4036  0.4013  0.3985  0.4004  0.4018
Lag 17  0.3891  0.3925  0.3948  0.3899  0.3979  0.3954  0.3934  0.3998  0.3969  0.3937  0.3933  0.3981
Lag 18  0.3859  0.3880  0.3865  0.3856  0.3891  0.3895  0.3877  0.3965  0.3912  0.3899  0.3873  0.3933
Lag 19  0.3910  0.3995  0.4031  0.3984  0.4048  0.3991  0.3993  0.4099  0.4068  0.3998  0.4009  0.4065
Lag 20  0.3705  0.3819  0.3839  0.3804  0.3835  0.3857  0.3798  0.3837  0.3850  0.3848  0.3814  0.3855
Lag 21  0.3784  0.3811  0.3857  0.3829  0.3854  0.3848  0.3833  0.3926  0.3890  0.3851  0.3832  0.3889
Lag 22  0.3932  0.4025  0.4016  0.3970  0.4049  0.4024  0.3984  0.4071  0.4073  0.4041  0.4014  0.4075
Lag 23  0.3873  0.3920  0.3957  0.3923  0.3971  0.3928  0.3899  0.4013  0.3966  0.3920  0.3948  0.3977
Lag 24  0.3889  0.3951  0.3964  0.3952  0.3980  0.3950  0.3946  0.4008  0.3989  0.3994  0.3949  0.4020

Online Car Configurations, Lag 13 to Lag 24
        Lag 13  Lag 14  Lag 15  Lag 16  Lag 17  Lag 18  Lag 19  Lag 20  Lag 21  Lag 22  Lag 23  Lag 24
Lag 1   0.4099  0.3938  0.3988  0.4050  0.4000  0.4034  0.4029  0.3978  0.4051  0.4019  0.3893  0.3952
Lag 2   0.4071  0.3872  0.3939  0.3951  0.3969  0.3988  0.3987  0.3892  0.3994  0.3941  0.3842  0.3889
Lag 3   0.4075  0.3908  0.3946  0.3987  0.3962  0.4013  0.4001  0.3926  0.4007  0.3958  0.3837  0.3909
Lag 4   0.4142  0.3946  0.4020  0.4050  0.4015  0.4063  0.4063  0.3970  0.4050  0.4012  0.3883  0.3960
Lag 5   0.3949  0.3743  0.3818  0.3882  0.3844  0.3899  0.3895  0.3821  0.3868  0.3844  0.3722  0.3780
Lag 6   0.4017  0.3848  0.3877  0.3932  0.3915  0.3918  0.3937  0.3871  0.3919  0.3872  0.3764  0.3816
Lag 7   0.4070  0.3866  0.3965  0.3981  0.3948  0.3998  0.4022  0.3916  0.4008  0.3955  0.3805  0.3886
Lag 8   0.3929  0.3789  0.3788  0.3847  0.3821  0.3851  0.3844  0.3775  0.3848  0.3825  0.3708  0.3728
Lag 9   0.4125  0.3941  0.3958  0.4019  0.3997  0.4017  0.4046  0.3962  0.3993  0.3974  0.3849  0.3911
Lag 10  0.3793  0.3633  0.3685  0.3678  0.3706  0.3749  0.3732  0.3646  0.3686  0.3648  0.3547  0.3668
Lag 11  0.3981  0.3810  0.3833  0.3890  0.3858  0.3882  0.3902  0.3784  0.3855  0.3838  0.3660  0.3742
Lag 12  0.3939  0.3799  0.3853  0.3901  0.3890  0.3879  0.3904  0.3805  0.3830  0.3828  0.3748  0.3701
Lag 13  0.3909  0.3744  0.3784  0.3822  0.3817  0.3854  0.3843  0.3761  0.3837  0.3757  0.3708  0.3755
Lag 14  0.4055  0.3858  0.3897  0.3972  0.3949  0.3994  0.3971  0.3873  0.3951  0.3907  0.3781  0.3879
Lag 15  0.3820  0.3660  0.3704  0.3764  0.3774  0.3763  0.3781  0.3709  0.3743  0.3698  0.3608  0.3648
Lag 16  0.4090  0.3898  0.3945  0.3987  0.3971  0.4022  0.4004  0.3927  0.4012  0.3968  0.3830  0.3878
Lag 17  0.4047  0.3889  0.3916  0.3943  0.3904  0.3942  0.3967  0.3843  0.3933  0.3915  0.3773  0.3848
Lag 18  0.4010  0.3788  0.3881  0.3892  0.3864  0.3872  0.3914  0.3826  0.3827  0.3823  0.3701  0.3763
Lag 19  0.4122  0.3944  0.3998  0.4028  0.3982  0.3994  0.4029  0.3913  0.3990  0.3947  0.3830  0.3906
Lag 20  0.3921  0.3791  0.3813  0.3838  0.3815  0.3802  0.3827  0.3716  0.3723  0.3778  0.3555  0.3663
Lag 21  0.3961  0.3744  0.3811  0.3842  0.3805  0.3815  0.3818  0.3713  0.3720  0.3740  0.3618  0.3651
Lag 22  0.4121  0.3938  0.4004  0.4046  0.4020  0.4065  0.4079  0.3962  0.4034  0.4011  0.3866  0.3959
Lag 23  0.4069  0.3871  0.3937  0.3954  0.3954  0.3980  0.3944  0.3905  0.3905  0.3939  0.3755  0.3845
Lag 24  0.4074  0.3888  0.3944  0.3984  0.3942  0.3990  0.3996  0.3878  0.3959  0.3909  0.3774  0.3842
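Tables A-2 and A-3 report one in-sample MAPE per combination of lags for the two online covariates. The sketch below illustrates the two steps involved: computing a MAPE and picking the lag pair with the smallest value. It assumes the usual definition of MAPE as the mean of |actual − fitted| / |actual|, expressed as a fraction. The function names are hypothetical, and the grid is a small excerpt of Table A-3, not the full table.

```python
def mape(actual, fitted):
    """Mean absolute percentage error, returned as a fraction (0.31 means 31%)."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, fitted)) / len(actual)


def best_lag_pair(mape_grid):
    """Return (search_lag, config_lag, value) with the smallest MAPE in a {(i, j): value} grid."""
    (search_lag, config_lag), value = min(mape_grid.items(), key=lambda item: item[1])
    return search_lag, config_lag, value


# Excerpt of Table A-3 (Car model II): Google search lags 9-11, configuration lags 22-24.
excerpt = {
    (9, 22): 0.3974, (9, 23): 0.3849, (9, 24): 0.3911,
    (10, 22): 0.3648, (10, 23): 0.3547, (10, 24): 0.3668,
    (11, 22): 0.3838, (11, 23): 0.3660, (11, 24): 0.3742,
}
print(best_lag_pair(excerpt))
```

Within this excerpt the smallest value is 0.3547, at search lag 10 and configuration lag 23. The differences between neighboring cells are small, so a selection based on in-sample MAPE alone should be read with some caution.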
Table A-4 Out-of-sample forecasts and 95%-upper and lower bounds - with online data -

Car model I
n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
1             768          1070.06   66.17            2073.94
2             777          1199.41   79.06            2319.75
3             780          1363.70   216.58           2510.83
4             788          1307.93   154.32           2461.53
5             787          1237.06   81.87            2392.25
6             751          1210.82   55.24            2366.40
7             999          1271.86   116.19           2427.54
8             773          1295.27   139.57           2450.97
9             739          1282.71   127.01           2438.41
10            948          1236.68   80.98            2392.39
11            534          1224.59   68.88            2380.29
12            419          1245.34   89.63            2401.04

Car model II
n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
1             340          438.51    0 [-94.48]       971.50
2             325          409.61    0 [-140.04]      959.28
3             339          362.76    0 [-187.94]      913.45
4             348          344.63    0 [-206.13]      895.39
5             353          381.00    0 [-169.76]      931.77
6             330          419.15    0 [-131.61]      969.92
7             425          427.60    0 [-123.16]      978.37
8             305          493.79    0 [-56.98]       1044.55
9             296          482.34    0 [-68.42]       1033.11
10            403          410.63    0 [-140.14]      961.39
11            243          419.68    0 [-131.08]      970.45
12            234          514.50    0 [-36.26]       1065.27

Note: Negative values have been truncated to zero as car orders are always greater than or equal to zero; the untruncated values are given in brackets.
Figure A-1 Traceplots and posterior densities for model parameters - Car model I - without online data -
[Figure: six panels; traceplots are drawn against the sampler iteration. Annotations recoverable from the plots:]
a) Traceplot of μ_I
b) Posterior density of μ_I: E(μ_I) = 1306.42, 95% HPD interval [1096.26, 1518.7]
c) Traceplot of φ_I
d) Posterior density of φ_I: E(φ_I) = 0.45419, 95% HPD interval [0.27765, 0.65604]
e) Traceplot of σ²_I
f) Posterior density of σ²_I: E(σ²_I) = 263188.3, 95% HPD interval [182807.89, 346532.52]
Figure A-2 Traceplots and posterior densities for model parameters - Car model II - without online data -
[Figure: six panels; traceplots are drawn against the sampler iteration. Annotations recoverable from the plots:]
a) Traceplot of μ_II
b) Posterior density of μ_II: E(μ_II) = 602.03, 95% HPD interval [515.55, 695.41]
c) Traceplot of φ_II
d) Posterior density of φ_II: E(φ_II) = 0.3926, 95% HPD interval [0.22689, 0.56948]
e) Traceplot of σ²_II
f) Posterior density of σ²_II: E(σ²_II) = 88389.53, 95% HPD interval [68675.02, 112781.34]
Figure A-3 Traceplots and posterior densities for model parameters - Car model I - with online data -
[Figure: ten panels; traceplots are drawn against the sampler iteration. Annotations recoverable from the plots:]
a) Traceplot of μ_I
b) Posterior density of μ_I: E(μ_I) = 1279.15, 95% HPD interval [1033.65, 1561.43]
c) Traceplot of β_I(OC)
d) Posterior density of β_I(OC): E(β_I(OC)) = 0.54959, 95% HPD interval [−0.1368, 1.3286]
e) Traceplot of β_I(SI)
f) Posterior density of β_I(SI): E(β_I(SI)) = −3.2163, 95% HPD interval [−15.85, 11.13]
g) Traceplot of φ_I
h) Posterior density of φ_I: E(φ_I) = 0.49546, 95% HPD interval [0.27765, 0.65604]
i) Traceplot of σ²_I
j) Posterior density of σ²_I: E(σ²_I) = 262342.8, 95% HPD interval [189475.93, 347134.3]
Figure A-4 Traceplots and posterior densities for model parameters - Car model II - with online data -
[Figure: ten panels; traceplots are drawn against the sampler iteration. Annotations recoverable from the plots:]
a) Traceplot of μ_II
b) Posterior density of μ_II: E(μ_II) = 578.2, 95% HPD interval [513.26, 648.59]
c) Traceplot of β_II(OC)
d) Posterior density of β_II(OC): E(β_II(OC)) = 0.74787, 95% HPD interval [0.13, 1.25]
e) Traceplot of β_II(SI)
f) Posterior density of β_II(SI): E(β_II(SI)) = −14.4326, 95% HPD interval [−21.02, −7.49]
g) Traceplot of φ_II
h) Posterior density of φ_II: E(φ_II) = 0.252, 95% HPD interval [0.07873, 0.42324]
i) Traceplot of σ²_II
j) Posterior density of σ²_II: E(σ²_II) = 73950.78, 95% HPD interval [56503.31, 93591.92]
B. MCMC sampling
I consider the following model, in which an observation $y_t$ at time $t$ is generated by a regression model with autocorrelated errors of order one:

\[
\begin{aligned}
y_t &= x_t'\theta + u_t \\
u_t &= \phi u_{t-1} + \varepsilon_t \\
\varepsilon_t &\sim N(0, \sigma^2), \quad \forall\, t \in \{1, \dots, T\}
\end{aligned}
\tag{B-1}
\]

where $x_t$ is a $((k+1) \times 1)$-vector containing a constant for the intercept and $k$ covariates, $x_t = (1, x_1, \dots, x_k)'$, $\theta = (\mu, \beta_1, \dots, \beta_k)' \in S_\theta$, and $\phi \in S_\phi$ is the coefficient of the AR(1) error process. The parameters $\theta$ and $\phi$ are defined on their supports $S_\theta$ and $S_\phi$, respectively. I follow a Bayesian approach to estimate the posterior distributions of the model parameters $\theta = (\mu, \beta_1, \dots, \beta_k)'$, $\phi$ and $\sigma^2$. Let $y_{1:t}$ denote all observations up to time $t$, and $y_{t|1:t-1}$ the expected value of the $t$th observation given all past information on the time series and the covariates. Because I only consider an autocorrelation structure of order one, $y_{t|1:t-1} = y_{t|t-1}$, with $y_{t|t-1} = x_t'\theta + \phi u_{t-1}$ and $u_{t-1} = y_{t-1} - x_{t-1}'\theta$. The likelihood of the observed time series $Y = \{y_t\}_{t=2}^{T}$ of car orders, conditional on the initial observation $y_1$ and the parameters $\theta$, $\phi$, and $\sigma^2$, is then given by

\[
L(Y \mid y_1, \theta, \phi, \sigma^2) = \prod_{t=2}^{T} f(y_t \mid y_{t-1}, \theta, \phi, \sigma^2)
\propto \sigma^{-(T-1)} \exp\left( -\frac{1}{2\sigma^2} \sum_{t=2}^{T} \left( y_t - y_{t|t-1} \right)^2 \right)
\tag{B-2}
\]
In order to sample from the joint posterior distribution of the model parameters $\theta = (\mu, \beta)'$, $\phi$ and $\sigma^2$, I use Gibbs sampling (Geman and Geman 1984) as described in Chib (1993) for Bayesian regression models with autocorrelated errors. Given the prior information on the unknown parameters, $\pi(\theta, \phi, \sigma^2)$, and applying Bayes' theorem, the joint posterior distribution of interest is given by

\[
\pi(\theta, \phi, \sigma^2 \mid Y) \propto L(Y \mid y_1, \theta, \phi, \sigma^2)\, \pi(\theta, \phi, \sigma^2)
\tag{B-3}
\]
with a normalizing constant given by the often analytically intractable integral

\[
K = \int L(Y \mid y_1, \theta, \phi, \sigma^2)\, \pi(\theta, \phi, \sigma^2)\, d\theta\, d\phi\, d\sigma^2 .
\]

To circumvent these difficult posterior computations, I exploit the convenient conditional structure of the model, which allows me to use a Gibbs sampler as an iterative Monte Carlo method (Geman and Geman 1984). That is, I draw samples from the full conditional distributions of the parameters; once the sampler has converged, these are draws from the joint posterior distribution of the parameters. Suppose that the prior distribution factorizes as

\[
\pi(\theta, \phi, \sigma^2) = \pi(\theta)\, \pi(\phi)\, \pi(\sigma^2)
\tag{B-4}
\]

which asserts that $\theta$, $\phi$, and $\sigma^2$ are a priori independent. Let then

\[
\theta \sim N_{k+1}(\eta, C)\, I(\theta \in S_\theta), \quad
\phi \sim N(\phi_0, \Phi_0)\, I(\phi \in S_\phi), \quad
\sigma^2 \sim IG(a_0, b_0)
\tag{B-5}
\]
be the respective prior distributions for the unknown model parameters, with $I(E)$ equal to one if the event $E$ is true, and zero otherwise. These prior distributions combine a multivariate normal distribution for $\theta$ truncated to the region $S_\theta$, a normal distribution for $\phi$ truncated to the region $S_\phi$, and an inverse gamma distribution for the variance parameter $\sigma^2$. In the specific application to weekly car orders, I set $S_\theta = [0, \infty) \times \mathbb{R}^2$, which ensures that the local level constant remains non-negative, as car orders cannot become negative. The truncation region for $\phi$ is set to $S_\phi = (-1, 1)$, which implies a stationary error process. The indicator functions can simply be dropped if the restrictions are not imposed. I use diffuse, non-informative priors over the model parameters $\theta$, $\phi$ and $\sigma^2$ by setting the hyperparameters in (B-5) to $\eta = (0, 0, 0)'$, $C_{(1,1)} = 10^6$, $C_{(2,2)} = C_{(3,3)} = 10^2$, $C_{(i,j)} = 0$ $\forall\, i \neq j$, $i, j = 1, 2, 3$, $a_0 = 1$, $b_0 = 10$, $\phi_0 = 0$, and $\Phi_0 = 1$.
In order to apply a simple Gibbs sampling algorithm, as pointed out in Chib (1993), the variables $y_t$ are transformed to $y_t^* = y_t - \phi y_{t-1}$. Thus, the model in (B-1) becomes

\[
y_t - \phi y_{t-1} = x_t'\theta - \phi x_{t-1}'\theta + u_t - \phi u_{t-1}
\tag{B-6}
\]
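The transformation in (B-6) is a quasi-differencing of both the response and the covariates for a given value of $\phi$. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def quasi_difference(y, X, phi):
    """Return (y_star, X_star) with y*_t = y_t - phi*y_{t-1} and x*_t = x_t - phi*x_{t-1}, t = 2..T."""
    y_star = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    X_star = [
        [cur - phi * prev for cur, prev in zip(X[t], X[t - 1])]
        for t in range(1, len(X))
    ]
    return y_star, X_star


y = [10.0, 12.0, 11.0]
X = [[1.0, 3.0], [1.0, 4.0], [1.0, 2.0]]  # first column is the intercept
y_star, X_star = quasi_difference(y, X, phi=0.5)
# y_star == [7.0, 5.0]; X_star == [[0.5, 2.5], [0.5, 0.0]]
```

Note that the intercept column becomes $1 - \phi$ after the transformation, and that one observation is lost because the first value has no predecessor.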
From (B-1), it also follows that $\varepsilon_t = u_t - \phi u_{t-1}$, with $\varepsilon_t \sim N(0, \sigma^2)$. Using this result together with (B-6), it is easy to confirm that the $y_t^* \mid y_{1:t-1} \sim N(x_t^{*\prime}\theta, \sigma^2)$ are independently normally distributed, where $x_t^* = x_t - \phi x_{t-1}$, $t = 2, \dots, T$. Then, due to the independence of the $\{y_t^*\}_{t=2}^{T}$, the model in terms of the transformed variables is given by the simple regression model

\[
Y^* = X^*\theta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_{T-1}),
\tag{B-7}
\]

where $I_{T-1}$ is the $(T-1) \times (T-1)$ identity matrix, $Y^* = (y_2^*, \dots, y_T^*)'$ and $X^* = (x_2^*, \dots, x_T^*)'$. Combining the normal prior for $\theta$ in (B-5) with the likelihood of the normal regression model in (B-7), standard Bayesian linear model results give

\[
\theta = (\mu, \beta_1, \dots, \beta_k)' \mid y_{1:T}, \phi, \sigma^2 \sim N_{k+1}(\hat{\theta}, V_\theta)\, I(\theta \in S_\theta),
\tag{B-8}
\]

where $\hat{\theta} = V_\theta \left( C^{-1}\eta + \sigma^{-2} X^{*\prime} Y^* \right)$ and $V_\theta = \left( C^{-1} + \sigma^{-2} X^{*\prime} X^* \right)^{-1}$. Given $\theta$ and $\phi$, and including the respective prior, the full conditional distribution of $\sigma^2$ is obtained in closed form as a standard result,

\[
\sigma^2 \mid y_{1:T}, \theta, \phi \sim IG\left( a_0 + \frac{T-1}{2},\; b_0 + \frac{1}{2} SS_{X^*Y^*} \right),
\tag{B-9}
\]

where $SS_{X^*Y^*} = (Y^* - X^*\theta)'(Y^* - X^*\theta)$.
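The two draws in (B-8) and (B-9) can be sketched for the simplest case, in which $\theta$ consists of the level $\mu$ only (no covariates, as in the models without online data), so that all quantities are scalars. The sketch uses the standard conjugate form with precision $C^{-1} + \sigma^{-2} X^{*\prime} X^*$ and imposes the truncation to $[0, \infty)$ by simple rejection. Function names and data are made up for illustration; with covariates, the same formulas apply in matrix form.

```python
import random


def mu_conditional_moments(y_star, x_star, sigma2, eta, C):
    """Mean and variance of the normal in (B-8) when theta is the scalar level mu."""
    precision = 1.0 / C + sum(x * x for x in x_star) / sigma2
    variance = 1.0 / precision
    mean = variance * (eta / C + sum(x * y for x, y in zip(x_star, y_star)) / sigma2)
    return mean, variance


def draw_mu(y_star, x_star, sigma2, eta, C, rng):
    """Draw mu from the normal in (B-8), truncated to [0, inf) by simple rejection."""
    mean, variance = mu_conditional_moments(y_star, x_star, sigma2, eta, C)
    while True:
        candidate = rng.gauss(mean, variance ** 0.5)
        if candidate >= 0.0:
            return candidate


def draw_sigma2(y_star, x_star, mu, a0, b0, rng):
    """Draw sigma^2 from the inverse gamma in (B-9) as the reciprocal of a gamma draw."""
    ss = sum((y - x * mu) ** 2 for x, y in zip(x_star, y_star))
    shape = a0 + len(y_star) / 2.0  # len(y_star) equals T - 1
    rate = b0 + ss / 2.0
    # random.gammavariate takes (shape, scale), so the scale is 1 / rate.
    return 1.0 / rng.gammavariate(shape, 1.0 / rate)


rng = random.Random(0)
phi = 0.4
y = [1300.0, 1250.0, 1400.0, 1320.0, 1280.0]  # made-up weekly orders
y_star = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
x_star = [1.0 - phi] * len(y_star)  # transformed intercept column
mu = draw_mu(y_star, x_star, sigma2=2500.0, eta=0.0, C=1e6, rng=rng)
sigma2 = draw_sigma2(y_star, x_star, mu, a0=1.0, b0=10.0, rng=rng)
```

Rejection is efficient here because almost all of the conditional mass lies above zero; if the conditional mean were far below zero, a dedicated truncated normal sampler would be preferable.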
The conditional posterior distribution of $\phi$ is, as discussed in Chib (1993), also straightforward to derive. The key step is to treat the errors $u_t = y_t - x_t'\theta$ themselves as the data of a linear regression model. Thus,

\[
u = U\phi + \varepsilon
\tag{B-10}
\]

where $u = (u_2, \dots, u_T)'$ and $U = (u_1, \dots, u_{T-1})'$. Again using standard results from linear regression, the conditional posterior distribution of the autoregressive parameter $\phi$ is obtained as a truncated normal distribution

\[
\phi \mid y_{1:T}, \theta, \sigma^2 \sim N(\hat{\phi}, v_\Phi)\, I(\phi \in S_\phi),
\tag{B-11}
\]

where $\hat{\phi} = v_\Phi \left( \Phi_0^{-1}\phi_0 + \sigma^{-2} U'u \right)$ and $v_\Phi = \left( \Phi_0^{-1} + \sigma^{-2} U'U \right)^{-1}$. Alternatively, $\phi$ can be drawn from the untruncated normal distribution, retaining the draw only if it lies in the open interval $(-1, 1)$. One additional result of this strategy is the provision
of an estimate of the conditional probability that the error process is stationary (Chib 1993). This estimate is simply the proportion of accepted draws from the untruncated normal distribution.
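A minimal sketch of the draw in (B-11) by rejection, together with the acceptance proportion described above. It proposes from the untruncated normal until a value falls inside $(-1, 1)$ and counts the proposals. The function names and the error series are hypothetical, and in an actual sampler the errors $u_t$ would be recomputed from the current $\theta$ in every iteration rather than held fixed.

```python
import random


def phi_conditional_moments(u, sigma2, phi0, Phi0):
    """Mean and variance of the normal in (B-11), from the regression of u_t on u_{t-1}."""
    lagged, current = u[:-1], u[1:]
    variance = 1.0 / (1.0 / Phi0 + sum(a * a for a in lagged) / sigma2)
    mean = variance * (phi0 / Phi0 + sum(a * b for a, b in zip(lagged, current)) / sigma2)
    return mean, variance


def draw_phi(u, sigma2, phi0, Phi0, rng):
    """Propose from the untruncated normal until a draw lies in (-1, 1).

    Returns the accepted draw and the number of proposals it took.
    """
    mean, variance = phi_conditional_moments(u, sigma2, phi0, Phi0)
    proposals = 0
    while True:
        proposals += 1
        candidate = rng.gauss(mean, variance ** 0.5)
        if -1.0 < candidate < 1.0:
            return candidate, proposals


rng = random.Random(1)
u = [1.0, 0.5, 0.25, 0.125]  # made-up regression errors u_1..u_T
accepted = total = 0
for _ in range(1000):
    phi, proposals = draw_phi(u, sigma2=1.0, phi0=0.0, Phi0=1.0, rng=rng)
    accepted += 1
    total += proposals
stationarity_prob = accepted / total  # share of proposals inside (-1, 1)
```

Alternating this draw with the draws of $\theta$ and $\sigma^2$ from their full conditionals gives one sweep of the Gibbs sampler.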
With all full conditional distributions available, the draws obtained after convergence of the sampler are draws from the joint posterior distribution of the parameters.
CURRICULUM VITAE
Personal Information
Name Daniel Philipp Stadel
Date/Place of Birth February 23, 1982, Munich
Education
Conferral of a Doctorate (PhD in Management)
12/2008 – 04/2011 University of St. Gallen, Switzerland
Research Institute for Customer Insight
07/2009 – 08/2009 University of Michigan, Ann Arbor, USA
Summer Program in Quantitative Research Methods
Degree Program Economical Mathematics (Dipl.-Math. oec.)
10/2002 – 08/2008 Ulm University, Germany
Focus: Financial Mathematics and Statistics
08/2006 – 07/2007 University of West Florida, Pensacola, USA
School Education
09/1992 – 06/2001 Nikolaus-Kopernikus-Gymnasium, Weissenhorn
Work Experience
12/2008 – 04/2011 Research Assistant and Doctoral Candidate,
Institute for Customer Insight
(former: Center for Business Metrics)
11/2007 – 05/2008 Working Student at the Risk-Controlling Division,
Savings Bank Ulm
09/2007 – 10/2007 Intern at the Risk-Controlling Division,
Savings Bank Ulm
08/2006 – 07/2007 Assistant at the Statistics Center,
University of West Florida, Pensacola, USA
05/2004 – 07/2006 Student Assistant at the Ulm University,
Department for Mathematics and Business Studies