
Advanced Statistical Models for Pricing, Mass Customization and Forecasting

- A Bayesian Approach -

DISSERTATION

of the University of St. Gallen,

School of Management,

Economics, Law, Social Sciences

and International Affairs

to obtain the title of

Doctor of Philosophy in Management

submitted by

Daniel Philipp Stadel

from

Germany

Approved on the application of

Prof. Dr. Andreas Herrmann

and

Prof. Dr. Torsten Tomczak

Dissertation no. 3937

Difo-Druck GmbH, Bamberg 2011

The University of St. Gallen, School of Management, Economics, Law,

Social Sciences and International Affairs hereby consents to the printing of the

present dissertation, without hereby expressing any opinion on the views herein

expressed.

St. Gallen, May 13, 2011

The President:

Prof. Dr. Thomas Bieger

To my family


Acknowledgment

This cumulative dissertation has been a very challenging and interesting en-

deavor, which would not have been possible without the contributions of a num-

ber of persons. Therefore, I would like to thank all the people who supported

me throughout this ambitious project.

First of all, I would like to thank my primary advisor, Professor Andreas

Herrmann, and my secondary advisor, Professor Torsten Tomczak, for their

guidance, encouragement, and supervision. They have provided vital support

for my study and research. I am also indebted to numerous individuals at the

University of St. Gallen. Particularly, I would like to thank Dr. Julia Stefanides,

Antonia Erz, Christian Hildebrand and Christian Purucker for their ongoing

willingness to discuss issues related to my dissertation project, their valuable

comments and the great working atmosphere. I also would like to thank my co-

authors Professor Florian Stahl, Professor Raghuram Iyengar, Professor Bene-

dict Dellaert and Dr. Jan Landwehr for their support and effort in completing

the papers.

Sustaining me throughout has been the ever present support and understand-

ing of my family and friends. Therefore, I would like to thank my friends for

their comprehension and patience and my family for their continuous encour-

agement and support. Without them, the completion of this dissertation would

not have been possible.

St. Gallen, May 2011

Daniel P. Stadel


Table of Contents

A. Summary - Zusammenfassung

B. Article I

Stahl, F., Stadel, D. P., Iyengar, R., and Herrmann, A. (in preparation for submission).

Subscriptions Pricing and Intertemporal Tradeoffs. Management Science.

C. Article II

Stadel, D. P., Dellaert, B. G. C., Herrmann, A., and Landwehr, J. R. (submit-

ted). Locked-In To Luxury: First- and Second-Order Default Effects in Mass

Customization. Marketing Science.

D. Article III

Stadel, D. P. (submitted). Online Data: Predictive Power or Obscure Delusion?

International Journal of Research in Marketing.

E. Curriculum Vitae


Summary

In many fields of business studies such as finance, econometrics and quantita-

tive marketing the importance of advanced statistical methods steadily increases

as the problems gain complexity. With faster computers and new sources for

vast amounts of data, statistical approaches are challenged to cope with these

aspects, and must therefore be improved. For quantitative marketing, the in-

formation gained by complex models, and the insights given by new advanced

methods strengthen the ability of companies to react faster to their customers’

needs. This dissertation discusses the application of advanced statistical meth-

ods to a variety of research objectives, such as tariff design, mass customization

profitability, and forecasting.

The first essay broaches the issue of consumers’ intertemporal tradeoffs

among subscription plans and the respective consequences for their optimal pricing.

A further subject of investigation is the individual’s discounting behavior,

which has a significant influence on the perceived value for the customer. We

augment a general discount function by an additional parameter to account for

flexibility preferences in the individual’s discounting behavior. Such behavior

has a tremendous influence on optimal pricing strategies for service providers.

The second essay investigates default-based upselling potentials in mass cus-

tomization systems such as online car configurators. We study whether compa-

nies can start their customers off on a high-margin decision path based on a few

high-end default selections early on in the configuration process. We analyze

whether default attribute levels within the customization process can increase

consumers’ choices for high-margin attribute levels (first-order default effects)

and whether these effects help or hurt margins on subsequent attribute

level choices (second-order default effects). We offer a conceptual framework

to managerially guide default selection to accommodate these two effects. The

third essay provides an analysis of online data from a car configurator and web

search queries to assess its usefulness as input for forecasting models. Due to

large amounts of data available online, companies can benefit from proper analyses

of such data pools. Therefore, time series methods are applied to the data.

The forecasting performance is compared to models without incorporating the

online data. It is shown that such data can significantly improve the forecasting

performance and, hence, companies should face the challenge to cope with the

task of utilizing available online data.

The intended research projects, and therefore the resulting essays show ap-

plications of advanced statistical tools to cover complex but important issues

among the economic interaction of companies and their customers. Based on

these methods, I am able to conduct several analyses to draw conclusions of

high importance and relevance for managerial implications.


Zusammenfassung

In a wide range of subfields of economics and business studies, statistical methods are becoming ever more popular as the questions addressed grow steadily more complex. In quantitative marketing, sophisticated methods can provide the knowledge advantage needed to respond as well as possible to changing customer requirements. This dissertation discusses statistical models for applications in tariff design, configurator optimization, and forecasting.

The first essay discusses the design of subscription tariffs with respect to duration and price, taking into account customers' individual planning horizons. The focus is on consumers' individual discounting behavior. The second essay examines the upselling potential of product configurators, using a car configurator as an example. It analyzes how consumers react to preset options within the configuration process (first-order effect) and whether these options affect later decisions within the same configuration process (second-order effect). We develop a conceptual framework for the optimal selection of defaults that takes both effects into account. The last article deals with forecasting sales volumes. For two passenger car models, future orders are predicted using online data and a statistical time series model. It is shown that forecast quality can be improved significantly by taking the online data into account, which gives manufacturers a further reliable data source for forecasting models.

The scientific exposition shows the application of statistical methods to address and solve complex questions concerning the economic interaction between companies and their customers. Based on these methods and their results, implications and recommendations of great importance for management-relevant decisions can be derived.

Article I

Stahl, F., Stadel, D. P., Iyengar, R., and Herrmann, A. (in preparation for submission).

Subscriptions Pricing and Intertemporal Tradeoffs. Management Science.

Subscriptions Pricing and

Intertemporal Tradeoffs

Florian Stahl ∗

Daniel P. Stadel †

Raghuram Iyengar ‡

Andreas Herrmann §

∗Florian Stahl is Assistant Professor of Marketing at the University of Zurich, 8032 Zurich, Switzerland.

†Daniel P. Stadel is Ph.D. candidate at the University of St. Gallen, 9000 St. Gallen, Switzerland.

‡Raghuram Iyengar is Assistant Professor of Marketing at The Wharton School, Philadelphia, PA - 19104.

§Andreas Herrmann is Professor of Marketing at the University of St. Gallen, 9000 St. Gallen, Switzerland.


Abstract

A common form of subscription to many services (e.g., health clubs, Internet access) is characterized

by the duration for which a consumer has access to a service and a one-time flat fee for unlimited

use. A key aspect of such subscriptions is that the price per-time unit declines with longer du-

rations. Such a pricing mechanism forces consumers to face a tradeoff in their choice among

plans - a short membership plan gives the flexibility to switch plans or providers while a long

one provides a price discount. For consumers, such decisions involve the flat fee, discounting of

future service benefits and their valuation of flexibility. For a firm offering a subscription-based

service, it is important to understand how consumers discount future benefits as it impacts their

willingness-to-pay. Using experimental data, we find that consumers’ discounting pattern is

inverse N-shaped (decrease-increase-decrease) with respect to membership duration. We also

show that a key driver of this pattern is the maximum contract duration that consumers typically

subscribe to a service. To determine the implications of our findings for managerial decisions,

we parameterize the observed discounting pattern and incorporate it within a model of con-

sumer choice among plans. The model is estimated using experimental data on consumers’

willingness-to-pay for membership plans for a health club. We compare the optimal menu of

plans predicted from our model with those based on an alternative model, which assumes only

hyperbolic discounting. Our results show that firms would give much smaller price discounts

to customers for longer membership durations if they ignore the inverse N-shaped discounting

pattern. Translated in terms of profitability, the failure to account for the observed discounting

leads to a reduction of 19% in firm profit.

Key words: Subscriptions, Membership Plans, Pricing, Intertemporal Choice, Preference for

Flexibility


1. Introduction

Subscriptions are a popular pricing practice used by many business-to-consumer

companies. A common form of such subscriptions is a membership plan, which

is characterized by the length of time a customer can access a service (member-

ship duration) and a one-time flat fee (membership fee) for its unlimited use.

For example, an online newspaper, Radiance Weekly, charges a one-time fee

of $85, $125, $175 and $225 for unlimited access to articles for 1, 2, 3 and 5

years, respectively (see radianceweekly.com). Similarly, Greyhound bus service

charges $239, $439 and $539 for unlimited rides for 7, 30 and 60 days, respec-

tively as part of their Discovery package (see discoverypass.com). Such flat rate

plans are becoming increasingly popular as compared to usage-based pricing for

a variety of services such as Internet access, fixed-line telephone and access to

many online services (OECD 2009). A key aspect of such subscriptions is that

the price per-time unit declines with a longer subscription period. For example,

customers pay only $45 per year to Radiance Weekly if they subscribe for 5

years as opposed to $85 for a one year subscription.1
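The declining price per time unit can be checked directly from the Radiance Weekly prices quoted above. The following is a minimal sketch; the helper name `price_per_year` is introduced here for illustration only.

```python
# Radiance Weekly price points quoted in the text:
# subscription length in years -> one-time flat fee in USD.
plans = {1: 85, 2: 125, 3: 175, 5: 225}


def price_per_year(fee, years):
    """Average price per year of a flat-fee plan (illustrative helper)."""
    return fee / years


per_year = {years: price_per_year(fee, years) for years, fee in plans.items()}

for years, price in per_year.items():
    print(f"{years}-year plan: ${price:.2f} per year")

# The per-year price should fall strictly as the subscription gets longer.
prices = [per_year[y] for y in sorted(per_year)]
is_declining = all(a > b for a, b in zip(prices, prices[1:]))
print("declining:", is_declining)
```

The 5-year plan works out to $45 per year, matching the figure in the text, and each longer plan is cheaper per year than the one before it.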

With plans of this type, consumers face a tradeoff in their choice among

them - a membership plan of short duration has a high price per-time unit but

gives consumers the flexibility to switch plans or providers. With a long mem-

bership plan, customers lose their flexibility but benefit from the lower price

per-time unit. For consumers, a choice of a plan involves consideration of im-

mediate costs (flat fee), future benefits from use of service and their valuation

of flexibility (DellaVigna and Malmendier 2006). How consumers discount fu-

ture benefits in this context has an impact on their willingness-to-pay (WTP)

for subscriptions of differing lengths and, consequently, for the optimal design

of subscriptions.

A rich stream of past literature on consumers’ intertemporal preferences has

shown that individuals discount future utility according to a hyperbolic func-

tion (Ariely and Loewenstein 2000, Ariely and Zauberman 2000, Laibson 1997,

Loewenstein and Prelec 1992, Thaler 1981). A majority of this work has focused on consumers’ discounting of utility at discrete future time points. More recent literature has considered the impact of “duration” and “intervals” on individuals’ discount patterns (Ariely and Loewenstein 2000, LeBoeuf 2006, Overton and MacFadyen 1998, Read et al. 2005, Scholten and Read 2006, 2009). However, such investigations have been in the context of how far future outcomes are removed from the present, how far these outcomes are removed from one another and their effect on the evaluation of a sequence of outcomes. Please see Berns et al. (2007) and Frederick et al. (2002) for a more detailed discussion of past work about intertemporal discounting, and DellaVigna and Malmendier (2006) for a discussion of consumers’ discounting of future benefits from product usage. As described above, subscriptions are also characterized by a time duration for which customers have access to a service. However, in the case of subscriptions, consumers have to discount future benefits from a service over a continuous duration (length of membership) rather than at discrete future time points. While it is plausible that consumers may still discount such future benefits using a hyperbolic function, it is not obvious how their valuation for flexibility would impact their discounting pattern.

1 There are other types of subscriptions in which consumers are charged for both access and usage using either a two-part tariff (Danaher 2002, Essegaier et al. 2002) or multi-part tariff (Iyengar et al. 2007, 2008). In this paper, we focus on a popular pricing plan used by many types of services, where there is a one-time access fee charged for giving consumers unlimited usage for a given time duration.

Several researchers have also focused on tariff design but typically with-

out considering consumers’ discounting behavior (Dolan 1987, Miravete 2009,

Räsänen et al. 1997). For example, Dolan (1987) provides guidelines to design

quantity discount schedules. Miravete (1999) considers the design of optimal

menu of nonlinear tariffs when consumers are uncertain about their future con-

sumption. Within an analytical framework, Essegaier et al. (2002) investigate

the effect of capacity constraints and heterogeneity in consumers’ usage for

pricing of access services. Other research has also explored how consumers’

usage of a service differs under tariffs of varying durations and its implications

for service renewal but has not considered how consumers discount future util-

ity from the service (Gourville and Soman 2002, Soman and Gourville 2001).

Please see Wilson (1993) for a detailed discussion of past research on the design

of pricing plans. As noted earlier, it is important to determine how consumers’

discounting pattern of future benefits may impact the optimal design of sub-

scriptions.


In summary, there are two related issues we address herein: (1) how con-

sumers discount future benefits from a subscription-based service and (2) the

managerial implications for optimal design and pricing of tariffs. While previ-

ous research has investigated some of these issues separately, subscriptions are

an appropriate context to investigate their interplay. To this end, we use three

experimental studies to explore consumers’ preferences for future benefits from

a subscription-based service. In all three studies, we find that the monthly dis-

count rate has an inverse N-shaped pattern with respect to membership duration

- the discount rate initially decreases (consistent with hyperbolic discounting),

after a certain length of membership, it actually shows an increase and then re-

verts to a decreasing pattern. We also propose and find confirming evidence that

a key driver of this pattern, and a significant predictor of the time of increase

in the discount rate, is the maximum contract duration that consumers typically

subscribe to a service. The latter duration is a measure of how much consumers

value flexibility. Thus, consumers require large price discounts for choosing

plans with membership duration exceeding their maximum considered service

length. To quantify the impact of our findings on optimal pricing, we param-

eterize the observed discounting scheme and incorporate it within a model of

consumer choice among plans. The model is estimated using our experimental

data. We contrast the optimal menu of plans based on our model and an al-

ternative model that considers only hyperbolic discounting. Our results show

that firms would give much smaller price discounts to customers for longer du-

rations if they ignore the inverse N-shaped discounting pattern. Translated in

terms of profitability, the failure to account for the observed discounting leads

to a reduction of 19% in firm profit.

The remainder of the paper is organized as follows. We begin by investi-

gating how consumers discount their future benefits for subscription-based ser-

vices. Thereafter, we assess the managerial relevance of our findings for opti-

mal pricing of subscriptions. To do so, we describe a model of consumer choice

among subscriptions, incorporate the observed discounting pattern within this

choice model and discuss pricing policy optimization. The paper concludes

with a summary of our findings, limitations, and directions for future research.


2. Discounting pattern for subscription-based services

Our key objective is to explore consumers’ discounting patterns of future bene-

fits from ongoing access to a subscription-based service. In this section, we de-

scribe three studies. In all three studies, we collect data using surveys in which

we ask participants to state the maximum price they are willing to pay to switch

to various membership durations of a service conditional on their willingness-

to-pay (WTP) for a baseline membership duration of that service. We ask such

conditional questions as we are interested in understanding the pattern of con-

sumers’ discounting and not in determining their absolute WTP for a specific

service. In what follows, we describe three studies and their overall findings.

2.1 Study 1

Study 1 explores the discounting behavior of consumers while they decide

whether to remain with a given baseline subscription plan or switch to an al-

ternative plan. We ask consumers to state their WTP to switch to a proposed

alternative membership plan from the given baseline plan. Given the vast lit-

erature on hyperbolic discounting, a similar pattern is plausible, which would

indicate that consumers are discounting their future benefits from a service at

higher rates for shorter membership durations than for longer durations. How-

ever, it is not obvious how consumers’ valuation for flexibility will affect their

discounting pattern.

Method

One hundred and five undergraduate and masters level students participated in

the study. As the context, we considered membership plans to a health club. For

all participants, we specified that they were willing to pay an initial one-time fee

of $300 for a membership plan that gave 3 months of unlimited use of service

(i.e., a price of $100 per month for 3 months). In addition, respondents were

told that the payment of the one-time flat fee corresponding to any subscrip-

tion would be at the start of membership. This is typically how firms charge


subscribing consumers. Participants then had to state the maximum price they

were willing to pay to switch to an alternative of longer duration, i.e., "For which monthly price p would you choose a subscription of duration T months [rather] than T* months? $X.−", with p being the monthly payment, T > T* the subscription period (e.g., 12 months) and T* the baseline subscription period (i.e., 3 months). By stating their WTP for switching to longer subscriptions, participants provide information on how they trade off between being flexible and

benefiting from a price discount. Consequently, with this information we can

determine the discount pattern for each participant. Each respondent answered

nine such questions and we obtained the maximum price that they are willing

to pay to switch to contracts with durations of 6, 9, 12, 18, 24, 36, 48, 60 and

72 months (i.e., ΔDuration ∈ {3, 6, 9, 15, 21, 33, 45, 57, 69}). The order of the

questions was counterbalanced across respondents.2

As described, we asked participants to state their WTP for a subscription

duration T months of a service given that they are willing to pay $100 per

month for 3 months of the same service. Thus, for a membership duration of T

months, we can estimate participants’ monthly discount rate using the following

equation:

$$\delta_T \, (100 \cdot T) - p \cdot T = 0, \qquad (1)$$

where $\delta_T = \exp(-rT)$ is the discount factor, which indicates the level of discount

on the monthly price that a consumer requires to switch to a plan longer

than the baseline duration, and r is the monthly discount rate, which may vary

with membership duration. In this way, we can determine the variation of con-

sumers’ monthly discount rate with membership duration.
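Equation (1) can be solved for the monthly discount rate in two steps: the duration T cancels, giving a discount factor of p/100, and inverting the exponential gives r = −ln(p/100)/T. The sketch below illustrates this inversion. The function names and the example respondent are assumptions made for illustration and are not taken from the study.

```python
import math

BASE_MONTHLY_PRICE = 100.0  # baseline price per month given to respondents


def discount_factor(p, base=BASE_MONTHLY_PRICE):
    """Solve delta_T * (base * T) - p * T = 0 for delta_T; T cancels out."""
    return p / base


def monthly_discount_rate(p, T, base=BASE_MONTHLY_PRICE):
    """Invert delta_T = exp(-r * T) to obtain r = -ln(delta_T) / T."""
    if p <= 0 or T <= 0:
        raise ValueError("p and T must be positive")
    return -math.log(discount_factor(p, base)) / T


# Hypothetical respondent (illustrative numbers, not data from the study):
# willing to pay $70 per month for a plan of T = 12 months.
r = monthly_discount_rate(p=70.0, T=12)
print(f"discount factor = {discount_factor(70.0):.2f}")
print(f"monthly discount rate = {100 * r:.2f}%")
```

A stated price equal to the baseline of $100 yields a rate of zero, and lower stated prices yield positive rates.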

2 The detailed survey can be provided by the authors upon request.

Results

Table 1 contains the average (across respondents) discount rate for all offered durations. The table also shows the difference in the discount rates between successive membership durations. As the data may contain correlated observations within individuals, we determine the statistical significance of such differences using paired t-tests.

Table 1 Study 1 - monthly discount rates

Gym Subscriptions

Δ Duration    Mean r (Sd)       Decrease in r
in months     in %
 3             4.49 (4.44)
 6             3.60 (2.72)       0.89 *
 9             3.84 (3.94)      -0.24
15             2.53 (1.92)       1.31 **
21             2.36 (1.42)       0.17
33             5.08 (16.84)     -2.72
45            13.44 (32.31)     -8.36 **
57            13.19 (32.19)      0.25
69            13.11 (32.22)      0.08

*p-value < .1 **p-value < .05

For the health club membership, we obtain a decreasing pattern in the monthly discount rates until the contract duration exceeds thirty-six months. This negative pattern is indicative of hyperbolic discounting (Zauberman et al. 2009). After thirty-six months, we observe a significant increase of around 8 percentage points in the monthly discount rate. For membership plans that exceed 36 months, we again get a decreasing pattern. To summarize, the study provides initial evidence that consumers show hyperbolic discounting even for future benefits over an entire duration of service membership. However, as the inverse N-shaped pattern for monthly discount rates indicates, hyperbolic discounting alone cannot explain the observed behavior.
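The "Decrease in r" column of Table 1 is the difference between successive mean discount rates, so it can be recomputed from the reported means. The sketch below does this and locates the largest rise in the rate. It uses only the means from Table 1; the significance stars come from the authors' paired t-tests on respondent-level data, which this sketch does not reproduce.

```python
# Mean monthly discount rates (in %) from Table 1, keyed by delta duration in months.
mean_rate = {3: 4.49, 6: 3.60, 9: 3.84, 15: 2.53, 21: 2.36,
             33: 5.08, 45: 13.44, 57: 13.19, 69: 13.11}

durations = sorted(mean_rate)

# Decrease in r between successive durations:
# positive means the rate fell, negative means the rate rose.
decrease = {
    cur: round(mean_rate[prev] - mean_rate[cur], 2)
    for prev, cur in zip(durations, durations[1:])
}

# The largest rise in the rate is the most negative "decrease".
jump_at = min(decrease, key=decrease.get)

for d in durations[1:]:
    print(f"delta duration {d:>2}: decrease in r = {decrease[d]:+.2f}")
print("largest increase at delta duration", jump_at)
```

The sign sequence (falling, then a sharp rise, then falling again) is what the text describes as the inverse N-shaped pattern.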

Our first study has some limitations. First, it can be argued that our use of a

sample of undergraduate and masters level students may have biased the results

as, being college students, they inherently have limited need for membership

services for a long duration.3 Second, it is conceivable that our use of 3 months

as the duration of the baseline membership plan may have affected the results.

Third, it is unclear what the underlying drivers of the observed discounting

pattern are. We address these concerns in the following study.

3For the students who participated in the study, the average time remaining to graduate was 2 years.


2.2 Study 2

Our second study is broadly designed to be similar to the first one and addresses

its limitations. We expect to find evidence for the inverse N-shaped discounting

pattern similar to that from the first study.

To hypothesize on the underlying drivers of the observed discounting pattern,

we consider past research on intertemporal preferences and reference points.

Such research suggests that differing reference points used to evaluate alterna-

tives can significantly alter the choice among them (Kahneman 1992, Loewen-

stein 1988, Ordonez et al. 2000). In the current context, respondents may use

two reference points for making their WTP decisions for each offered subscription:

(1) the baseline duration provided in the survey and (2) the maximum contract

duration for which they typically subscribe to a service (termed the “critical”

duration). The latter reflects how much consumers value flexibility - the shorter

(longer) is this duration, the more (less) they value flexibility. We propose that

respondents’ critical duration should be a significant predictor of the timing of

increase in their monthly discount rate.

Method

Forty-nine professionals (Executive MBAs) participated in the study. They were

randomly assigned to one of two versions of the survey. The two surveys had

different services – health club and online-video-rental service. We chose two

different services to explore whether there were any differences in discounting

behavior between a more common subscription service, such as a health club

membership, and a more innovative one such as online-video-rental service. As

noted earlier, the design for this study is similar to that in study 1, with the dif-

ference being that we considered several durations for the baseline subscription

tariff, e.g. 3, 6 or 12 months. The monthly fee for the baseline tariff was kept

constant at $100. For each baseline tariff, the respondents were asked for their

WTP to switch to alternative tariffs. Each participant had to answer twenty such

questions and the order of questions was counterbalanced across respondents.

At the end of the survey, we asked respondents to state their critical duration for

the offered service.


Table 2 Study 2 - monthly discount rates

              Gym Subscriptions                Online-Video-Rental Service

Δ Duration    Mean r (Sd)     Decrease in r    Mean r (Sd)     Decrease in r
in months     in %                             in %
 3            4.04 (0.98)                      9.43 (6.26)
 6            2.87 (0.92)      1.16 ***        7.30 (4.29)      2.13 ***
 9            1.98 (1.09)      0.89 ***        5.47 (2.63)      1.83 **
12            2.58 (0.90)     -0.59 **         6.09 (3.05)     -0.62 +
15            2.07 (0.88)      0.51 ***        5.22 (2.81)      0.86 ***
18            2.10 (0.59)     -0.03            4.71 (2.51)      0.51 **
24            1.95 (0.78)      0.16 *          3.85 (1.78)      0.86 ***
30            1.77 (0.74)      0.17 ***        3.63 (1.80)      0.22 **
36            1.80 (0.69)     -0.02            3.29 (1.18)      0.34 **
48            1.51 (0.59)      0.29 ***        2.70 (0.80)      0.59 ***

*p-value < .1 **p-value < .05 ***p-value < .01 +p-value = .13

Results

For each respondent, we calculate the monthly discount rate for every offered

membership duration. Table 2 shows the discount rates averaged across all

baseline membership durations and respondents.4 As before, we determine the

underlying discount pattern for the two services using paired t-tests. The re-

sults corroborate those from study 1 – we observe an inverse N-shaped pattern

in the monthly discount rates for the health club membership as well as for the

online-video-rental service. For the latter, the monthly discount rate declines

until the membership duration exceeds 15 months (ΔDuration = 12 months).

We then find a small increase in the monthly discount rate (p = 0.13). Thereafter,

the declining pattern resumes. Note that, consistent with past research

(Kalish 1985, Mukherjee and Hoyer 2001), the discount rates for the online-

video-rental service (a more innovative product) are higher than those for a more

common service such as health club memberships (p < 0.001). A probable rea-

son is that, for an innovative service, consumers are more uncertain about their

future use of service and require much larger discounts to make it attractive for

them to switch to longer membership durations. Later, we discuss the issue of uncertainty in future usage in greater detail. Figure 1 summarizes these patterns graphically.

4 We average the data across all baseline durations as the pattern of discounting was very similar. The discount rates for each baseline duration are available from the authors upon request.

Figure 1 Study 2 - observed pattern of discounting

[Figure 1: line plot of the monthly discount rate r (axis from 2 % to 10 %) against Δ Duration in months (axis from 10 to 40), with one series for Gym Subscriptions and one for Online-Video-Rental Service]

Next, we use the critical duration from each respondent and investigate its relationship with the timing of increase in their monthly discount rates.

For the health club membership, the average critical duration was 23.28 months

while for the online-video-rental service, it was 15.88 months. Consistent with

intuition that consumers will be less likely to subscribe to longer durations for

a more innovative service, the latter is significantly smaller than the former

(p < 0.001). We find that for 64.60% (64.71%) of respondents, the increase in

their monthly discount rates for health club memberships (online-video-rental

service) either coincides with their self-stated critical duration or is the previous

shorter or next longer membership duration (health club: $\chi^2_{df=1} = 9.64$,

p-value < .01; online-video-rental service: $\chi^2_{df=1} = 10.29$, p-value < .01). This

provides confirming evidence that consumers’ critical duration is a significant

driver for determining when their discount rate will increase.
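The matching rule used above (the rise in the discount rate falls on the critical duration or on the adjacent shorter or longer offered duration) can be sketched as follows. The grid of offered durations, the snapping of a self-stated critical duration to the nearest offered duration, and the respondent data are all assumptions made for illustration; the source does not spell out these details.

```python
# Offered membership durations in months (hypothetical grid, an assumption).
OFFERED = [6, 9, 12, 15, 18, 24, 30, 36, 48]


def matches_critical(increase_at, critical, offered=OFFERED):
    """Return True if the duration at which the discount rate rises equals the
    offered duration nearest to the critical duration or one of its neighbours."""
    # Snap the self-stated critical duration to the nearest offered duration
    # (ties go to the shorter duration).
    idx = min(range(len(offered)), key=lambda i: abs(offered[i] - critical))
    # The nearest duration plus the previous shorter and next longer ones.
    neighbours = offered[max(idx - 1, 0): idx + 2]
    return increase_at in neighbours


# Hypothetical respondents (not data from the study):
# (duration at which r first rises, self-stated critical duration).
respondents = [(24, 24), (18, 24), (48, 12), (15, 18)]

share = sum(matches_critical(i, c) for i, c in respondents) / len(respondents)
print(f"share of respondents matching: {share:.2%}")
```

Applied to the actual survey responses, a share computed in this way would correspond to the 64.60% and 64.71% reported above, provided the authors' matching rule agrees with the assumptions made here.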


2.3 Study 3

We performed a third study to emphasize the robustness of the inverse N-shaped

discounting pattern and test its relationship with respondents’ critical duration.

In this study, we replicate Study 2 with membership plans to a health club as a

context.

Method

This study was carried out as a web survey with fifty-five respondents. The

sample consisted of senior research associates. As in study 2, the last ques-

tion asked respondents to state their critical duration. In this study, we used six

months as baseline membership duration. The monthly fee was set at $100 per

month. Similar to the calculation in the first two studies, we determined the

monthly discount rates for all individuals and every offered membership dura-

tion (9, 12, 18, 24, 36 and 48 months).

Results

Table 3 contains the discount rates. Our finding of the inverse N-shaped pattern

in monthly discount rates is robust (see also Figure 2). This reaffirms the im-

portance of both hyperbolic discounting and customers’ valuation of flexibility

to understand how they discount future benefits from a subscription-based ser-

vice. The critical duration from respondents has an average of 25.42 months

with a standard deviation of 12.38 months. From Table 3, we note that the significant

increase in the monthly discount rates occurs around a duration of 18

months (ΔDuration = 12 months) and falls within one standard deviation

of the average. We also find that for 83.64% of respondents, the increase

in their monthly discount rates for health club memberships either coincides

with their critical duration or is the previous shorter or next longer membership

duration ($\chi^2_{df=1} = 24.89$, p-value < .01). This strengthens our earlier finding

that consumers’ critical duration is an important factor for determining when

their discount rate will increase.
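The reported test statistic can be reproduced under one plausible reading of the test. The sketch below assumes (this is not stated in the text) that 83.64% of the fifty-five respondents corresponds to 46 people and that the test is a one-degree-of-freedom goodness-of-fit test against an even split between matching and non-matching respondents. It also checks that 18 months lies within one standard deviation of the mean critical duration.

```python
# Reported summary statistics from Study 3.
n_respondents = 55
share_match = 0.8364
mean_crit, sd_crit = 25.42, 12.38

# Assumption: the reported share corresponds to a whole number of respondents.
n_match = round(share_match * n_respondents)
n_other = n_respondents - n_match

# Assumption: null hypothesis of an even split (expected count 27.5 per group).
expected = n_respondents / 2
chi2 = ((n_match - expected) ** 2 + (n_other - expected) ** 2) / expected

# Is 18 months within one standard deviation of the mean critical duration?
within_one_sd = mean_crit - sd_crit <= 18 <= mean_crit + sd_crit

print(n_match, round(chi2, 2), within_one_sd)  # 46 24.89 True
```

Under these assumptions the statistic matches the reported value of 24.89, which suggests, but does not prove, that this is the test the authors ran.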

Table 3 Study 3 - monthly discount rates for health club subscriptions

Gym Subscriptions

Δ Duration    Mean r (Sd)      Decrease in r
in months     in %
 3            2.036 (2.055)
 6            1.876 (1.668)     0.160 +
12            2.200 (1.592)    -0.324 **
18            1.979 (1.219)     0.221 **
30            1.888 (0.999)     0.091
42            1.691 (0.801)     0.197 **

**p-value < .05, +p-value = .13

2.4 Discussion of experimental results

The results from all three studies suggest that customers experience a loss of

flexibility with long contract durations and it is reflected in how they discount

future benefits from a service. Our finding of how consumers’ critical duration

impacts their discounting pattern is consistent with past research on how con-

sumers may use multiple reference points while making intertemporal choices

(Kahneman 1992, Loewenstein 1988, Ordonez et al. 2000). In our context, con-

sumers may use (a) the baseline duration that is provided in the survey and (b)

the critical duration as two reference durations for making their decisions. The

observed inverse N-shaped monthly discounting pattern can arise from the in-

terplay between these two reference durations with the first (second) region of

decreasing monthly discount rate arising from the comparison of each offered

duration with the baseline (critical) duration. The effect of consumers’ critical

duration on their discounting pattern may also be due to consumers’ uncertainty

in future use of service. It is likely that as consumers evaluate subscriptions with

durations greater than their critical duration, they are more uncertain about their

future usage and hence require much larger discounts for these subscriptions

to be attractive. This is consistent with past work of Jones and Ostroy (1984)

which notes that the more uncertain consumers are about their future beliefs

(e.g., future use of service), the more flexible they would like to be (e.g., use a

shorter contract length).

Figure 2 Study 3 - discounting behavior

[Figure: discount rate r (in %) plotted against contract duration (in months).]

Thus far, our investigation has documented an inverse N-shaped discount pattern for subscription-based services and shown that this pattern emerges largely due to consumers’ critical duration. The remainder of the paper focuses on the implications of this finding for managerial decision making, in particular for the optimal pricing of subscriptions. To do so, we begin by describing a model of how consumers choose among subscriptions. Thereafter, we use the model to determine optimal prices for subscriptions.

3. Model for consumer choice among subscriptions

In this section we present a managerial application of our empirical findings.

We begin with a model for consumer choice among subscriptions. The model

formalizes how consumers’ discounting pattern affects their willingness-to-pay

for subscriptions. Thereafter, we parameterize the observed discounting pattern

and incorporate it in the model.

3.1 Utility model

Consider a single firm offering a product or service based on J subscription

plans. Each alternative j (j = 1, . . . , J) is described in terms of length of time

a customer can access the service and an initial, one-time flat fee (e.g., a health

club membership for 6 months for a one-time fee of $600). For an alternative

j, let the membership duration be Tj . We assume that the utility consumer i

associates with product j, vij(Tj), increases with duration. In addition, we set

the utility of zero duration to zero. This means that the consumer derives no

utility if s/he does not subscribe to the service (i.e., vij(0) = 0). We formulate a

discounted utility type specification (Samuelson 1937, Koopmans 1960) where

the utility from the alternative j depends on the duration that a consumer may

access the service. That is:

\[ v_{ij}(T_j) = \int_{t=0}^{T_j} \nu_i \, \delta_i(t) \, dt, \tag{2} \]

where νi ≥ 0 is the utility consumer i derives from consuming service j for

a unit time interval and δi(t) is the consumer-level discount factor for future

utility from service, which may be a function of time. The utility function in

equation (2) is the discounted utility from accessing service j for duration Tj .

Note that the utility function in equation (2) reduces to vij(Tj) = νi · Tj when

consumers do not discount future utility (i.e., δi(t) = 1.0).

We assume that consumer i (i = 1, . . . , I) cannot choose more than one alternative. Let \(p(T_j)\) be the price associated with duration \(T_j\) of service j. Consistent with economic theory, we assume that there is an individual-specific composite (outside) good with unit price \(p_i^y\) and that consumer i has a budget \(y_i\). A consumer can spend the entire budget on the composite good, or spend some of it on the composite good and the rest to buy one of the J choice options (e.g., service j with duration \(T_j\)). Let \(z_{ij}\) denote the number of units of the composite good.

Let \(u_{ij}(T_j, z_{ij})\) represent the utility consumer i obtains from duration \(T_j\) of service j and \(z_{ij}\) units of the composite good. We assume that the consumer maximizes his or her utility, subject to the budget constraint \(p(T_j) + z_{ij} \, p_i^y = y_i\). Without loss of generality, we normalize the price of the composite good to unity, i.e., \(p_i^y = 1\). Hence the number of units of the composite good is given by \(z_{ij} = y_i - p(T_j)\). We specify the following quasilinear utility function for consumer i:

\[ u_{ij}(T_j, z_{ij}) = v_{ij}(T_j) + \beta_i \bigl(y_i - p(T_j)\bigr), \tag{3} \]

where vij(Tj) is specified in equation (2) and βi > 0 is the income effect or

price sensitivity.

Let j = 0 denote the no-choice option. Then the utility of allocating the

whole budget to the composite good (i.e., no-choice) for consumer i reduces to

ui0(0, yi) = βiyi since vi0(0) = 0. Thus a utility maximizing consumer would

choose alternative j if it has the maximum utility {uij > uik, k = 0, . . . , J, k ≠ j} and would choose none of the alternatives if the no-choice option (j = 0) has

the maximum utility {ui0 > uij, j = 1, . . . , J}. Note that in a choice context,

the term βiyi is irrelevant to the choice decision since it is a consumer-specific

constant across alternatives. Consequently, the utility of the no-choice option is

set to zero.

Our interest is in determining consumers’ willingness-to-pay for a given du-

ration of service. When the utility function is quasilinear, utility maximization

is equivalent to surplus maximization (Jedidi and Zhang 2002). Thus dividing

uij(Tj, zij) in equation (3) by the price coefficient βi gives the consumer surplus

function:

\[ s_{ij}(T_j) = \frac{v_{ij}(T_j)}{\beta_i} - p(T_j) = \theta_i \int_{t=0}^{T_j} \delta_i(t) \, dt - p(T_j), \tag{4} \]

where \(s_{ij}(T_j)\) is the surplus (WTP − price) that consumer i derives from choosing alternative j and \(\theta_i = \nu_i / \beta_i\) is consumer i’s WTP for a unit time interval of the service with no discounting.

The first term on the right-hand side of equation (4) is the WTP function, which describes the maximum price a consumer is willing to pay for a given duration of service j [5]. This function is given by:

\[ \mathrm{WTP}_{ij}(T_j) = \theta_i \int_{t=0}^{T_j} \delta_i(t) \, dt. \tag{5} \]

[5] WTP or reservation price is the price point that equates the utility of consuming \(T_j\) units of service j to the no-choice utility, which we set to zero (see Jedidi and Zhang (2002)).

Figure 3 WTP function with different types of discounting

[Figure: WTP (in $, axis from 400 to 4800) plotted against duration t (6 to 48) for six discount functions:]

(1) \(\delta(t) = \delta^t\) with \(\delta = 1.00\)
(2) \(\delta(t) = \delta^t\) with \(\delta = 0.80\)
(3) \(\delta(t) = \beta \cdot \delta^t\) with \(\beta = 0.90\) and \(\delta = 0.90\)
(4) \(\delta(t) = \beta \cdot \delta^t\) with \(\beta = 0.85\) and \(\delta = 0.70\)
(5) \(\delta(t) = (1 + \alpha \cdot t)^{-\beta/\alpha}\) with \(\alpha = 0.50\) and \(\beta = 0.060\)
(6) \(\delta(t) = (1 + \alpha \cdot t)^{-\beta/\alpha}\) with \(\alpha = 0.03\) and \(\beta = 0.025\)

Let θi = 100 per month for alternative j. Figure 3 depicts the shape of the WTP function for different types of discount functions (i.e., different discount factors). The WTP function is linear when δi = 1. As the figure suggests, when the discount factor varies with duration (i.e., δi(t)), the WTP function takes various shapes. To accurately determine the WTP for a subscription plan, and hence its optimal price, it is therefore important to correctly capture the underlying discount pattern. To this end, we describe how we parameterize the discount function observed in the experiments.
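To make the link between the discount function and the WTP function in equation (5) concrete, the following minimal sketch evaluates the integral numerically for the six discount functions listed in the Figure 3 legend, with θ = 100. The helper names (`exponential`, `quasi_hyperbolic`, `generalized_hyperbolic`, `wtp`) and the midpoint integration rule are choices made here for illustration; they are not taken from the original study.

```python
import math

THETA = 100.0  # WTP per month without discounting, as assumed in the text


def exponential(delta):
    """delta(t) = delta^t."""
    return lambda t: delta ** t


def quasi_hyperbolic(beta, delta):
    """delta(t) = beta * delta^t."""
    return lambda t: beta * delta ** t


def generalized_hyperbolic(alpha, beta):
    """delta(t) = (1 + alpha * t)^(-beta / alpha)."""
    return lambda t: (1.0 + alpha * t) ** (-beta / alpha)


def wtp(discount, duration, theta=THETA, steps=10_000):
    """Approximate theta * integral_0^T discount(t) dt with the midpoint rule."""
    h = duration / steps
    return theta * h * sum(discount((k + 0.5) * h) for k in range(steps))


# Parameter values as given in the Figure 3 legend.
curves = {
    "(1)": exponential(1.00),
    "(2)": exponential(0.80),
    "(3)": quasi_hyperbolic(0.90, 0.90),
    "(4)": quasi_hyperbolic(0.85, 0.70),
    "(5)": generalized_hyperbolic(0.50, 0.060),
    "(6)": generalized_hyperbolic(0.03, 0.025),
}

for label, discount in curves.items():
    print(label, [round(wtp(discount, T)) for T in (6, 12, 24, 48)])
```

Without discounting, curve (1) grows linearly to 4800 at 48 months, while the other curves flatten at different rates, which is the qualitative point the figure makes.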

3.2 Augmented discount function

We begin with a description of the generalized hyperbolic function,

\[ \delta_i(t) = (1 + \alpha_i t)^{-\beta_i/\alpha_i}, \quad \alpha_i, \beta_i > 0, \tag{6} \]

which was proposed by Loewenstein and Prelec (1992). In this function, the coefficient \(\alpha_i\) captures the divergence from exponential discounting. As the coefficient \(\alpha_i\) goes to 0, the discount function becomes an exponential function with a parameter \(\beta_i\), i.e., \(\delta_i(t) = \exp(-\beta_i t)\). When the coefficient \(\alpha_i\) becomes large, the discount function becomes a step function. Note that a hyperbolic function imposes that the discount pattern is decreasing with duration.

Figure 4 Schematic diagram for discounting behavior

[Figure: discount rate r (in %) plotted against contract duration, with regions labeled considered durations, critical durations, and unconsidered durations.]
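The limiting behavior of the generalized hyperbolic function in equation (6) can be checked numerically. The sketch below compares it with the exponential function exp(−βt) for shrinking values of α; the values of β and t are illustrative choices, not estimates from the paper.

```python
import math


def generalized_hyperbolic(t, alpha, beta):
    """Equation (6): (1 + alpha * t)^(-beta / alpha)."""
    return (1.0 + alpha * t) ** (-beta / alpha)


beta, t = 0.025, 24.0                  # illustrative values
target = math.exp(-beta * t)           # exponential discounting benchmark

gaps = []
for alpha in (1.0, 0.1, 0.01, 0.001, 0.0001):
    gap = generalized_hyperbolic(t, alpha, beta) - target
    gaps.append(gap)
    print(f"alpha={alpha:<7} gap to exp(-beta*t) = {gap:.6f}")
```

The gap is positive for every α and shrinks towards zero as α decreases, in line with the statement that the function approaches exponential discounting.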

Our studies provide evidence for an inverse N-shaped discounting behavior,

which is schematically illustrated in Figure 4. Such a pattern cannot be cap-

tured by just using a hyperbolic function. To describe this pattern, consider the

exponential discount function, namely,

\[ \delta_i(t) = \exp(-r_i t), \tag{7} \]

where ri is the constant discount rate. As discussed in past research (Zauberman

et al. 2009), a hyperbolic discount function can be expressed in terms of the

exponential discount function in the following manner.

\[ \delta_i(t + \Delta t) = \exp\bigl(-(r_i - \Delta r_{i\Delta t})(t + \Delta t)\bigr), \quad \text{where } \Delta r_{i\Delta t} > 0. \tag{8} \]

For consumer i, let \(t_i^{crit.}\) be the critical duration. For parsimony, we assume

that the loss of flexibility affects the discount rates only through a shift, i.e.,

a positive shock during the time period that exceeds the critical value. After

the shock, the discount scheme is again consistent with hyperbolic discounting.

Figure 5 Characterization of shift in discount rates

[Figure: discount rate r (in %) plotted against contract duration, with regions labeled considered durations, mean critical duration, and unconsidered durations.]

Thus, when \(t > t_i^{crit.}\), there is a shift in the discount rates [6]. Figure 5 shows our characterization of this shift in the discount rates. Given this assumption, for \(t + \Delta t > t_i^{crit.}\), we obtain the following discount function:

\[ \delta_i(t + \Delta t) = \exp\bigl(-(r_i - \Delta r_{i\Delta t}) \, t_i^{crit.} - (r_i - \Delta r_{i\Delta t} + r_i^{crit.}) \cdot 1 - (r_i - \Delta r_{i\Delta t})(t + \Delta t - t_i^{crit.} - 1)\bigr), \tag{9} \]

with \(t + \Delta t > t_i^{crit.}\) and \(r_i^{crit.} > \Delta r_{i\Delta t}\). A simple algebraic manipulation leads to the following:

\[ \delta_i(t + \Delta t) = \exp\bigl(-(r_i - \Delta r_{i\Delta t})(t + \Delta t)\bigr) \cdot \exp(-r_i^{crit.}), \tag{10} \]

with \(t + \Delta t > t_i^{crit.}\) and \(r_i^{crit.} > \Delta r_{i\Delta t}\). Note that the first term, \(\exp(-(r_i - \Delta r_{i\Delta t})(t + \Delta t))\), implies hyperbolic discounting (Zauberman et al. 2009) and we can replace the exponential discount function with the generalized hyperbolic discount function. Thus, we get

\[ \delta_i(t) = f_i^{I(t > t_i^{crit.})} \cdot (1 + \alpha_i t)^{-\beta_i/\alpha_i}, \tag{11} \]

[6] It is possible that after the critical duration, the hyperbolic discounting pattern has a different slope as well.

For parsimony, we capture the loss of flexibility only by a shift in the discount pattern. Such a parsimonious

specification requires the estimation of only one additional parameter. As the results of model estimation later

show, this specification fits the empirical discounting pattern well.

with I the indicator function, such that

\[ I(t > t_i^{crit.}) = \begin{cases} 1, & \text{if } t > t_i^{crit.} \\ 0, & \text{else} \end{cases} \]

and \(f_i = \exp(-r_i^{crit.})\). We denote \(f_i\) as the flexibility parameter.
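The algebraic step from equation (9) to equation (10) can be spelled out as follows. This is a worked sketch that writes \(\rho_i = r_i - \Delta r_{i\Delta t}\) as shorthand; the symbol \(\rho_i\) is introduced here for brevity only and does not appear in the original derivation.

```latex
\begin{align*}
& \rho_i \, t_i^{crit.} + (\rho_i + r_i^{crit.}) \cdot 1
  + \rho_i \, (t + \Delta t - t_i^{crit.} - 1) \\
& \quad = \rho_i \bigl( t_i^{crit.} + 1 + t + \Delta t - t_i^{crit.} - 1 \bigr) + r_i^{crit.} \\
& \quad = \rho_i \, (t + \Delta t) + r_i^{crit.},
\end{align*}
```

so that \(\delta_i(t + \Delta t) = \exp(-\rho_i (t + \Delta t)) \cdot \exp(-r_i^{crit.})\), which is equation (10) with \(f_i = \exp(-r_i^{crit.})\) as the multiplicative shift.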

To summarize, we have parsimoniously incorporated the impact of critical

duration on the discount function as a multiplicative factor. For a consumer,

this multiplicative factor becomes applicable when the offered membership du-

ration exceeds their critical duration. We refer to our parameterized form as the

augmented discount function. Next, we include this discount function in the

WTP expression.
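The augmented discount function in equation (11) is simple to express in code. The sketch below is a minimal illustration; the parameter values are the posterior means for the model with flexibility reported in Table 4, while the critical duration of 24 months is a hypothetical value chosen for the example.

```python
def augmented_discount(t, alpha, beta, f, t_crit):
    """Equation (11): generalized hyperbolic discounting, scaled by f once t > t_crit."""
    hyperbolic = (1.0 + alpha * t) ** (-beta / alpha)
    return f * hyperbolic if t > t_crit else hyperbolic


# alpha, beta, f: posterior means from Table 4; t_crit: hypothetical example value.
alpha, beta, f, t_crit = 0.0246, 0.0236, 0.9391, 24.0

for t in (12, 24, 25, 36):
    print(t, round(augmented_discount(t, alpha, beta, f, t_crit), 4))
```

The function is continuous and decreasing up to the critical duration, drops by the factor f immediately after it, and then continues along a scaled hyperbolic path.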

3.3 Willingness-to-pay model with flexibility

We can incorporate the augmented discount function in our WTP specification. Thus, for consumer i and service j with duration \(T_j\), we obtain:

\[ \mathrm{WTP}_{ij}(T_j) = \theta_i \int_{t=0}^{T_j} f_i^{I(t > t_i^{crit.})} \cdot (1 + \alpha_i t)^{-\beta_i/\alpha_i} \, dt. \tag{12} \]

This completes the description of our model, which we denote as the "WTP model with flexibility". When consumers’ preference for flexibility is not included in the discount function, their discount pattern of future benefits may be captured by a generalized hyperbolic function. We denote this alternative model as the "WTP model without flexibility". Next, we use the models to determine optimal prices and durations for a menu of subscription plans.
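Equation (12) can be evaluated in closed form by splitting the integral at the critical duration. The sketch below is one way to do this, assuming α ≠ β so that the antiderivative \((1 + \alpha t)^{1 - \beta/\alpha} / (\alpha - \beta)\) is well defined. The function names and the example parameter values (Table 4 posterior means, θ = 100 and a hypothetical critical duration of 24 months) are illustrative choices.

```python
def hyperbolic_integral(a, b, alpha, beta):
    """Integral of (1 + alpha*t)^(-beta/alpha) over [a, b]; requires alpha != beta."""
    k = 1.0 - beta / alpha
    return ((1.0 + alpha * b) ** k - (1.0 + alpha * a) ** k) / (alpha - beta)


def wtp_with_flexibility(T, theta, alpha, beta, f, t_crit):
    """Equation (12): the factor f applies only to the part of [0, T] beyond t_crit."""
    if T <= t_crit:
        return theta * hyperbolic_integral(0.0, T, alpha, beta)
    before = hyperbolic_integral(0.0, t_crit, alpha, beta)
    after = hyperbolic_integral(t_crit, T, alpha, beta)
    return theta * (before + f * after)


# Illustrative inputs: theta and t_crit are hypothetical, the rest are Table 4 means.
theta, alpha, beta, f, t_crit = 100.0, 0.0246, 0.0236, 0.9391, 24.0
for T in (6, 12, 24, 36, 48):
    with_flex = wtp_with_flexibility(T, theta, alpha, beta, f, t_crit)
    without_flex = wtp_with_flexibility(T, theta, alpha, beta, 1.0, t_crit)
    print(T, round(with_flex, 2), round(without_flex, 2))
```

Setting f = 1 recovers the "WTP model without flexibility", so the two models give identical WTP up to the critical duration and diverge only beyond it.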

4. Optimal design of membership plans

In this section, we discuss the optimal design of membership plans. First, we describe how the parameters of the two models, the WTP models with and without flexibility, can be estimated. As an illustration, we use the data from Study 3 for model estimation. Next, using the estimated parameters, we determine the optimal menu of plans from each of the two models. A comparison of the optimal menus shows how the pattern of discounting affects both the characteristics of the offering and firm profitability.

Table 4 Parameter estimates - posterior means and 95% posterior intervals

Parameter    Model without Flexibility    Model with Flexibility
μα           0.0233                       0.0246
             (0.0146, 0.0332)             (0.0160, 0.0414)
μβ           0.0280                       0.0236
             (0.0206, 0.0291)             (0.0193, 0.0281)
μf           -                            0.9391
             -                            (0.9105, 0.9714)

4.1 Parameter estimates

We use Markov Chain Monte Carlo (MCMC) methods to estimate the two mod-

els using the data from Study 3. Our approach follows the standard Bayesian

estimation for hierarchical models (Rossi and Allenby 2003). Please see the

appendix for details on model estimation. Table 4 provides the posterior means

of the parameters, with 95% posterior intervals in parentheses. The estimate

for μα, which corresponds to the divergence from exponential discounting, is

marginally higher in the model with flexibility as compared to that in the model

without flexibility. The estimate of μβ in the latter model is higher than in the

former. This can be explained by the higher discount rates for longer durations,

which are captured in the model with flexibility by inclusion of a parameter

for flexibility. The average flexibility preference (\(\mu_f\)) is 0.9391, which suggests that, on average, the positive shock in the discount rate for any membership duration that exceeds a consumer’s critical duration is around 6% (\(r^{crit.} = -\ln(0.9391) = 0.0628\)). In other words, on average, consumers require an extra 6% discount on the monthly fee for any subscription that exceeds their critical duration. Finally, from the consumer-level estimates, we find that across consumers the flexibility factor (f) ranges from 0.92 to 0.96. This suggests that consumers vary in their requirements of additional monthly discounts, ranging from around 4% (\(= -\ln(0.96)\)) to 8% (\(= -\ln(0.92)\)) [7].

Figure 6 The predicted discount functions (for six randomly chosen individuals)

[Figure: two panels plotting the discount function δ(t) (axis from 0.4 to 1.0) against duration t (6 to 48) for six randomly chosen individuals, labeled (1) to (6). Panel (a): Model without Flexibility. Panel (b): Model with Flexibility, showing the augmented discount function.]
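The conversion from the flexibility factor f to the implied shift in the discount rate, \(r^{crit.} = -\ln(f)\), can be verified directly. The short sketch below uses only the values of f quoted in the text.

```python
import math

# Flexibility factors quoted in the text: posterior mean and the reported range.
for label, f in (("mean", 0.9391), ("upper f", 0.96), ("lower f", 0.92)):
    r_crit = -math.log(f)
    print(f"{label}: f = {f}, r_crit = {r_crit:.4f}")
```

The three values come out at roughly 0.063, 0.041 and 0.083, which correspond to the "around 6%", "around 4%" and "8%" figures in the text.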

To assess goodness of fit of the two models, we calculate the mean squared

error (MSE) for both models based on a comparison of the model-predicted

discount pattern with the observed discount rate for our sample. For the model

without flexibility, the MSE is \(5.8 \cdot 10^{-4}\) while for the model with flexibility, it

is \(3.0 \cdot 10^{-4}\). Consistent with expectations, the MSE comparison suggests that

the latter model better explains the pattern in the monthly discount rates.

We also contrast the individual-level discount functions (δi(t)) predicted by

the two models. For this comparison, we randomly choose six individuals (from

fifty-five respondents) and predict their discount function using both the model

with and without flexibility. Figure 6 shows the predicted discount functions

for the six individuals, labeled from 1...6, with the left (right) panel based on

the model without (with) flexibility. Recall that as the model without flexibility

assumes hyperbolic discounting, the discount function decreases with duration.

In the right panel, there are three points to note. First, for each individual, there is a drop in the discount function (i.e., an increase in the monthly discount rate) when the membership duration is greater than their critical duration. For instance, person 3 has a critical duration of 24 months and we see a drop in their discount function thereafter. Second, as individuals vary in their critical duration, the drop in their discount function occurs at different durations. As an example, persons 3 and 5 have critical durations of 24 and 36 months, respectively. Finally, the magnitude of the drop in the discount function varies across individuals, e.g., person 2 has a much bigger drop in their discount function and person 5 a much smaller one. In sum, our comparison indicates that the individual-level discount functions based on the two models are different. Such differences across the two models will affect the predicted optimal menu of plans, which is discussed next.

[7] We also explored whether there was any relationship between individual demographics (age, gender) and the required monthly discounts. We found that age had no effect and women require marginally higher discounts than men (p < 0.1).

4.2 Optimal tariff structure

For this illustration, we assume that the menu consists of two membership plans, each with a specific contract period and an initial, one-time flat fee [8]. Let t represent the period of time of a particular subscription plan and p be the fee associated with the plan. We define t1 (t2) as the “shorter” (“longer”) contract period and p1 (p2) as its price. The key decision variables for the firm are the contract periods t1 and t2 and the related prices p1 and p2. For profit calculations, we assume that for a given membership duration (t), the cost per customer, c(t), is a linearly increasing function of the membership duration, i.e., c(t) = c · t. Thus, the variable cost (c) to produce the product or service per time unit is constant for all durations. Such a simple assumption for the cost function is useful because it allows us to focus our analysis on the impact of consumers’ discounting patterns on the optimal design of subscription plans. It is straightforward to incorporate other types of cost functions within our model framework.

[8] The number of offered plans is clearly an important managerial decision in the context of designing product lines (Iyengar and Lepper 2000, Lim and Ho 2007). Our focus on how consumers may choose only among two offered plans helps sharpen our investigation of the impact of intertemporal discounting on plan choice and optimal pricing.

Table 5 Optimal tariff structure

                                Model without Flexibility    Model with Flexibility
Optimal Duration t1             8                            4
Optimal Duration t2             12                           16
Price p1 per month              93.97                        107.14
Price p2 per month              85.77                        81.24
Expected Profit per customer    1012.24                      1268.20
Discount (t1 → t2)              8.71 %                       24.17 %

Note that our model with flexibility has four individual-level parameters, namely, αi, βi, fi and θi. The first three parameters are estimated using the pattern of discounting from the survey. The remaining individual-level parameter, θi, is estimated in the following manner. As described earlier, we told the respondents that they were willing to pay a maximum of $100 per month for a six-month baseline plan, i.e., a total price of $600 for the membership. Consequently, using the WTP model and imposing \(T_j = 6\), this assumption has to be consistent with the following:

\[ \theta_i \int_{0}^{6} f_i^{I(t > t_i^{crit.})} (1 + \alpha_i t)^{-\beta_i/\alpha_i} \, dt - 600 = 0. \]

For each consumer i, the parameter θi can be estimated by solving this equation.
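Because θi enters the calibration equation linearly, it can be solved for directly as 600 divided by the discounted duration of the six-month baseline plan. The sketch below illustrates this under the assumption α ≠ β, using a closed-form integral. The function names are hypothetical, and the example parameters are the Table 4 posterior means combined with a hypothetical critical duration.

```python
def hyperbolic_integral(a, b, alpha, beta):
    """Integral of (1 + alpha*t)^(-beta/alpha) over [a, b]; requires alpha != beta."""
    k = 1.0 - beta / alpha
    return ((1.0 + alpha * b) ** k - (1.0 + alpha * a) ** k) / (alpha - beta)


def discount_integral(T, alpha, beta, f, t_crit):
    """Integral of the augmented discount function over [0, T]."""
    if T <= t_crit:
        return hyperbolic_integral(0.0, T, alpha, beta)
    return (hyperbolic_integral(0.0, t_crit, alpha, beta)
            + f * hyperbolic_integral(t_crit, T, alpha, beta))


def solve_theta(alpha, beta, f, t_crit, baseline_months=6.0, baseline_price=600.0):
    """Solve theta * integral_0^6 delta(t) dt = 600 for theta."""
    return baseline_price / discount_integral(baseline_months, alpha, beta, f, t_crit)


# Table 4 posterior means with a hypothetical critical duration of 24 months.
theta = solve_theta(alpha=0.0246, beta=0.0236, f=0.9391, t_crit=24.0)
print(round(theta, 2))
```

Since discounting makes the integral smaller than six, the calibrated θi exceeds the stated $100 per month; for the model without flexibility the same function can be called with f = 1.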

The estimation procedure for parameter θi within the model without flexibility is very similar, the sole difference being that there is no flexibility parameter (fi) to be estimated from the survey.

If the health club offers two membership plans then, to maximize profits, the optimal levels of the fee and membership duration for each plan have to be determined. To do so, we perform the following grid search. For each of the two offered plans, we vary the duration in increments of 1 month from 1 to 48 months. We also vary the monthly prices of the plans from $45 to $125 [9]. From each menu of two plans, individuals choose the tariff that provides the higher positive surplus. If neither of the two plans gives a surplus greater than zero, the customer will not subscribe to the service. We assume a $1 monthly variable cost per customer (i.e., c = $1) and zero fixed cost. The optimal prices are determined by maximizing the sum of profits over all respondents in our sample using a simulated annealing optimization algorithm. Table 5 displays the results for the optimal menu of plans with the highest expected contribution per customer from the WTP model with and without flexibility. The model without flexibility gives an optimal menu of plans that have durations of 8 and 12 months with monthly fees of $93.97 and $85.77, respectively. The model with flexibility gives plans with durations of 4 and 16 months with monthly fees of $107.14 and $81.24, respectively. A comparison of the optimal menus derived from the two models indicates that the model with (without) flexibility predicts a shorter (longer) subscription duration for Tariff 1. This is reasonable because when consumers value flexibility, they are less willing to subscribe to a long membership duration. Also consistent with intuition, the model with flexibility predicts that Tariff 2 should be offered with a much larger price discount. This quantifies the impact that the augmented discount function has on the characteristics of the offered menu of plans.

[9] The self-stated monthly WTP from respondents falls within our chosen range of monthly price.
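The menu search described above can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: it replaces simulated annealing with an exhaustive search over a coarse grid (durations in steps of 4 months, monthly prices in steps of $5), and the three consumers are hypothetical parameter tuples rather than the survey respondents.

```python
from itertools import product

COST_PER_MONTH = 1.0  # variable cost c = $1 per month, as in the text


def hyperbolic_integral(a, b, alpha, beta):
    """Integral of (1 + alpha*t)^(-beta/alpha) over [a, b]; requires alpha != beta."""
    k = 1.0 - beta / alpha
    return ((1.0 + alpha * b) ** k - (1.0 + alpha * a) ** k) / (alpha - beta)


def wtp(T, theta, alpha, beta, f, t_crit):
    """WTP model with flexibility, equation (12), in closed form."""
    if T <= t_crit:
        return theta * hyperbolic_integral(0.0, T, alpha, beta)
    return theta * (hyperbolic_integral(0.0, t_crit, alpha, beta)
                    + f * hyperbolic_integral(t_crit, T, alpha, beta))


def menu_profit(menu, consumers):
    """Total profit when each consumer picks the plan with the highest positive surplus."""
    total = 0.0
    for person in consumers:
        best_surplus, best_profit = 0.0, 0.0  # outside option: no subscription
        for duration, monthly_price in menu:
            surplus = wtp(duration, *person) - monthly_price * duration
            if surplus > best_surplus:
                best_surplus = surplus
                best_profit = (monthly_price - COST_PER_MONTH) * duration
        total += best_profit
    return total


# Hypothetical consumers: (theta, alpha, beta, f, t_crit). Not the survey data.
consumers = [
    (105.0, 0.020, 0.030, 0.92, 12.0),
    (110.0, 0.030, 0.020, 0.96, 24.0),
    (100.0, 0.025, 0.024, 0.94, 36.0),
]

durations = range(4, 49, 4)   # coarse grid instead of 1-month steps
prices = range(45, 126, 5)    # monthly prices from $45 to $125

best_menu, best_value = None, float("-inf")
for t1, t2 in product(durations, repeat=2):
    if t1 >= t2:
        continue  # plan 1 is the shorter contract
    for p1, p2 in product(prices, repeat=2):
        value = menu_profit([(t1, p1), (t2, p2)], consumers)
        if value > best_value:
            best_menu, best_value = ((t1, p1), (t2, p2)), value

print(best_menu, round(best_value / len(consumers), 2))
```

An exhaustive search is feasible here only because the grid is coarse; with 1-month duration steps and finer price steps, a stochastic optimizer such as the simulated annealing mentioned in the text becomes the more practical choice.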

To show the profit implications from ignoring the flexibility effect in the dis-

count function, we use the model with flexibility to assess the profitability of the

optimal pricing plans identified by the model without flexibility. This mimics

a scenario in which a firm may erroneously set the optimal prices without con-

sidering the flexibility effect, when, in reality, customers behave as observed in

our experiments. Thus, this analysis indicates the magnitude of profit reduction

that will ensue from using a misspecified model. We find that the firm would

make an expected profit of $1017.18 per customer. The firm would therefore

be forgoing a $251.02 (=1268.20-1017.18) profit per customer. Put differently,

the failure to account for the flexibility effect leads to about a 19% (=[1268.20-

1017.18]/1268.20) reduction in the firm’s profit.
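The profit comparison can be retraced with the two figures quoted in the text. The sketch below only repeats that arithmetic; the variable names are chosen here for illustration.

```python
# Expected profit per customer under the model with flexibility (Table 5).
profit_optimal = 1268.20
# Profit of the misspecified menu, evaluated with the model with flexibility.
profit_misspecified = 1017.18

forgone = profit_optimal - profit_misspecified
share = forgone / profit_optimal

print(round(forgone, 2), f"{100 * share:.2f}%")
```

The forgone profit is $251.02 per customer, and the ratio evaluates to slightly under one fifth of the attainable profit, which is the magnitude the text reports as a reduction of about 19%.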

5. Conclusion

Subscriptions are an often used pricing strategy by business-to-consumer com-

panies. A common type of subscription is characterized by a length of time

(e.g., a month or a year) a customer has access to a service and a corresponding

one-time flat fee for its unlimited use. A key aspect of such pricing is that the

price per-time unit declines with a longer subscription period. Given such plans,

customers face a tradeoff in their choice among them. With a choice of short

membership duration, customers incur a high average price per unit of time

but retain their flexibility to either drop the service altogether or switch service

providers. Upon choosing a subscription with a long duration, customers lose

their flexibility but benefit from price discounts. The critical information for designing such plans is how consumers trade off their cost, future benefits from a service and their valuation of flexibility.

In this paper, we explore the relationship between consumers’ discounting of future benefits and the pricing of subscriptions. To this end, we conduct several experiments. Across all studies, we find that the discount rate for consumers has an inverse N-shaped pattern with respect to membership duration: the discount rate initially decreases, increases after a certain length of membership, and then reverts to a decreasing pattern. We also propose and find confirming evidence that a significant predictor of when the discount rate increases is the maximum contract duration for which consumers typically subscribe to a service. The latter is a measure of how much consumers value flexibility, and our finding indicates that consumers require large price discounts for choosing plans with a membership duration exceeding their maximum considered service length.

To draw managerially relevant implications, we parsimoniously specify the

inverse N-shaped discount function and incorporate it in a consumer-level model

for subscription choice. Using data from an experiment, we estimate individual-

level parameters of the model. We compare the optimal menu of two subscrip-

tion plans determined by our model to those from a model that assumes only

hyperbolic discounting. There are two key findings. First, we find that the menu

based on the model with the observed discounting pattern contains plans with

larger price discounts than those in the menu from a model with the hyperbolic

discounting scheme. Second, in terms of profitability, the failure to account for

the inverse N-shaped discounting pattern leads to a reduction of 19% in firm

profit.

In this paper, we empirically investigated the discounting behavior of con-

sumers in the context of subscriptions. Further research could examine the im-

pact of usage uncertainty and contract obligations for any product and/or service

on the observed discounting behavior. Another interesting area of future work

would be to explore the relationship between multiple reference points and the

discounting pattern. From a modeling perspective, we examined the design of

plans assuming that the service provider is a monopoly that offers a menu of

two plans. Future research could generalize our investigation by considering

the impact of competition on tariff design as well as the design of an offered

menu with more than two tariffs. Finally, we restricted our focus to the initial

purchase of a subscription. Future research may consider issues related to re-

purchase of plans, customer retention and actual usage in consumers’ choice of

tariffs. We hope this paper encourages work in these and related directions.

References

Ariely, D., G. Loewenstein. 2000. When does duration matter in judgment and decision making? Journal of Experimental Psychology: General 129(4) 508–523.

Ariely, D., G. Zauberman. 2000. On the making of an experience: The effects of breaking and combining experiences on their overall evaluation. Journal of Behavioral Decision Making 13(2) 219–232.

Berns, G. S., D. Laibson, G. Loewenstein. 2007. Intertemporal choice - toward an integrative framework. Trends in Cognitive Sciences 11(11) 482–488.

Danaher, P. J. 2002. Optimal pricing of new subscription services: Analysis of a market experiment. Marketing Science 21(2) 119–138.

DellaVigna, S., U. Malmendier. 2006. Paying not to go to the gym. American Economic Review 96(3) 694–719.

Dolan, R. J. 1987. Quantity discounts: Managerial issues and research opportunities. Marketing Science 6(1) 1–22.

Essegaier, S., S. Gupta, Z. J. Zhang. 2002. Pricing access services. Marketing Science 21(2) 139–159.

Frederick, S., G. Loewenstein, T. O’Donoghue. 2002. Time discounting and time preference: A critical review. Journal of Economic Literature 40(2) 351–401.

Gourville, J. T., D. Soman. 2002. Pricing and the psychology of consumption. Harvard Business Review 80(9) 90–96.

Iyengar, R., A. Ansari, S. Gupta. 2007. A model of consumer learning for service quality and usage. Journal of Marketing Research 44(4) 529–544.

Iyengar, R., K. Jedidi, R. Kohli. 2008. A conjoint approach to multi-part pricing. Journal of Marketing Research 45(2) 195–210.

Iyengar, S. S., M. R. Lepper. 2000. When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology 96(6) 995–1006.

Jedidi, K., Z. J. Zhang. 2002. Augmenting conjoint analysis to estimate consumer reservation price. Management Science 48(10) 1350–1368.

Jones, R. A., J. M. Ostroy. 1984. Flexibility and uncertainty. The Review of Economic Studies 51(1) 13–32.

Kahneman, D. 1992. Reference points, anchors, norms, and mixed feelings. Organizational Behavior And Human Decision Processes 51(2) 296–312.

Kalish, S. 1985. A new product adoption model with price, advertising, and uncertainty. Management Science 31(12) 1569–1585.

Koopmans, T. C. 1960. Stationary ordinal utility and impatience. Econometrica 28(2) 287–309.

Laibson, D. 1997. Golden eggs and hyperbolic discounting. Quarterly Journal of Economics 112(2) 443–477.

LeBoeuf, R. A. 2006. Discount rates for time versus dates: The sensitivity of discounting to time-interval description. Journal of Marketing Research 43(1) 59–72.

Lim, N., T.-H. Ho. 2007. Designing price contracts for boundedly rational customers: Does the number of blocks matter? Marketing Science 26(3) 312–326.

Loewenstein, G. 1988. Frames of mind in intertemporal choice. Management Science 34(2) 200–214.

Loewenstein, G., D. Prelec. 1992. Anomalies in intertemporal choice: Evidence and an interpretation. Quarterly Journal of Economics 107(2) 573–597.

Miravete, E. J. 1999. Quantity discounts for taste-varying consumers. CARESS Working Papers from University of Pennsylvania Center for Analytic Research and Economics in the Social Sciences. Downloaded from http://www.econ.upenn.edu/caresspapers.

Miravete, E. J. 2009. Competing with menus of tariff options. Journal of the European Economic Association 7 188–205.

Mukherjee, A., W. D. Hoyer. 2001. The effect of novel attributes on product evaluation. Journal of Consumer Research 28(3) 462–472.

OECD. 2009. OECD communications outlook 2009. Organisation for Economic Co-Operation and Development, Report.

Ordonez, L. D., T. Connolly, R. Coughlan. 2000. Multiple reference points in satisfaction and fairness assessment. Journal of Behavioral Decision Making 13(3) 329–344.

Overton, A. A., A. J. MacFadyen. 1998. Time discounting and the estimation of loan duration. Journal of Economic Psychology 19(5) 607–618.

Räsänen, M., J. Ruusunen, R. P. Hämäläinen. 1997. Optimal tariff design under consumer self-selection. Energy Economics 19(2) 151–167.

Read, D., S. Frederick, B. Orsel, J. Rahman. 2005. Four score and seven years from now: The date/delay effect in temporal discounting. Management Science 51(9) 1326–1335.

Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing. Marketing Science 22(3) 304–328.

Samuelson, P. A. 1937. A note on measurement of utility. Review of Economic Studies 4(2) 155–161.

Scholten, M., D. Read. 2006. Discounting by intervals: A generalized model of intertemporal choice. Management Science 52(9) 1424–1436.

Scholten, M., D. Read. 2009. The Psychology of Intertemporal Tradeoffs. SSRN eLibrary, available at http://ssrn.com/paper=1444094 .

Soman, D., J. T. Gourville. 2001. Transaction decoupling: How price bundling affects the decision to consume. Journal of Marketing Research 38(1) 30–44.

Thaler, R. H. 1981. Some empirical evidence on dynamic inconsistency. Economics Letters 8(3) 201–207.

Wilson, R. B. 1993. Nonlinear pricing. Oxford University Press, New York, NY.

Zauberman, G., B. K. Kim, S. A. Malkoc, J. R. Bettman. 2009. Discounting time and time discounting: Subjective time perception and intertemporal preferences. Journal of Marketing Research 46(4) 543–556.


Appendix. Model estimation

In this appendix, we describe the estimation of parameters for the “WTP model with flexibility”. For the “WTP model without flexibility”, the estimation method is very similar.

Consider a sample of $N$ customers with each customer giving $K$ observations of their willingness-to-pay to switch to an alternative membership plan from a baseline membership duration. Let $d_{it}$ denote the discount factor for the monthly price $p$ of individual $i$ and subscription duration $t$ ($i = 1, \ldots, N$). The assumed relationship between the discount factor $d_{it}$ and the membership duration $t$ is:

$$d_{it} = f_i^{\,I(t > t_i^{\text{crit.}})} \, (1 + \alpha_i t)^{-\beta_i/\alpha_i} + \varepsilon_{it}. \tag{A-1}$$

Here, for individual $i$, the parameter $f_i$ captures the need for flexibility and $t_i^{\text{crit.}}$ is the critical duration. We assume that $\varepsilon_{it}$ is normally distributed with zero mean and variance $\sigma^2$. The conditional likelihood $L_i|(\alpha_i, \beta_i, f_i, \sigma^2)$ of observing the discounting behavior of consumer $i$ across the $K$ membership durations is as follows:

$$L_i|(\alpha_i, \beta_i, f_i, \sigma^2) = (2\pi)^{-K/2} \, \sigma^{-K} \exp\left(-\frac{1}{2\sigma^2} \sum_{t \in T} \left(d_{it} - f_i^{\,I(t > t_i^{\text{crit.}})} (1 + \alpha_i t)^{-\beta_i/\alpha_i}\right)^2\right), \tag{A-2}$$

with $T = \{9, 12, 18, 24, 36, 48\}$. To allow for correlation among parameters we set $\gamma_i = (\alpha_i, \beta_i, f_i)^T$, and to account for customer heterogeneity, we assume that the individual-level parameter vector $\gamma_i$ follows a multivariate normal distribution with mean vector $\mu_\gamma = (\mu_\alpha, \mu_\beta, \mu_f)^T$ and variance-covariance matrix $\Sigma$, and is restricted to the space $(0,\infty) \times (0,\infty) \times [0, 1]$.¹⁰ Such a restriction ensures positive discounting.
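As an illustration, the deterministic part of (A-1) and the logarithm of the conditional likelihood (A-2) can be sketched in Python. This is a minimal sketch and not the estimation code used in the study: the function names are hypothetical, and the critical duration is treated as a known input `t_crit`, which is an assumption made only for this example.

```python
import math

# Membership durations T from the appendix (in months).
DURATIONS = (9, 12, 18, 24, 36, 48)


def discount_factor(t, alpha, beta, f, t_crit):
    """Deterministic part of (A-1): f^I(t > t_crit) * (1 + alpha * t)^(-beta / alpha)."""
    # The indicator exponent switches the flexibility parameter on only for t > t_crit.
    flexibility = f if t > t_crit else 1.0
    return flexibility * (1.0 + alpha * t) ** (-beta / alpha)


def conditional_log_likelihood(d_obs, alpha, beta, f, t_crit, sigma2, durations=DURATIONS):
    """Log of (A-2) for one customer; d_obs maps each duration t to the observed d_it."""
    k = len(durations)
    # Sum of squared deviations between observed and predicted discount factors.
    sse = sum(
        (d_obs[t] - discount_factor(t, alpha, beta, f, t_crit)) ** 2 for t in durations
    )
    # log[(2*pi)^(-K/2) * sigma^(-K)] = -K/2 * log(2*pi) - K/2 * log(sigma^2)
    return -0.5 * k * math.log(2.0 * math.pi) - 0.5 * k * math.log(sigma2) - sse / (2.0 * sigma2)
```

Working on the log scale avoids numerical underflow when the likelihood contributions of many customers are combined.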

¹⁰ These distributional specifications were made to ensure parameter constraints such as $\alpha, \beta > 0$ and $f \in [0, 1]$. The censoring can be done without loss of generality since we do not get any modes in the posterior distributions at the specified boundaries. If values outside these bounds were likely to occur, we would obtain modes at those bounds. The value of the draw would be set to the bound since the contribution to the likelihoods remains the same.


The unconditional likelihood $L$ for a random sample of $N$ consumers is:

$$L = \prod_{i=1}^{N} \int L_i|(\gamma_i, \sigma^2) \, f(\gamma_i|\mu_\gamma, \Sigma) \, d\gamma_i, \tag{A-3}$$

where the density function $f(\gamma_i|\mu_\gamma, \Sigma)$ is $MVN(\mu_\gamma, \Sigma)$. We use Markov Chain Monte Carlo (MCMC) methods to estimate the model. This approach follows the standard Bayesian estimation for hierarchical models (Rossi and Allenby 2003). We use the following set of proper but noninformative priors for all population-level parameters. As a prior for the hyperparameter vector $\mu_\gamma$, we use a multivariate normal distribution censored to the space $(0,\infty) \times (0,\infty) \times [0, 1]$ with mean vector equal to $(0, 0, 0.5)^T$ and a variance-covariance matrix with diagonal elements equal to 1000 and off-diagonal elements equal to zero. As a prior for $\sigma^2$, we use an inverse Gamma distribution $IG(a, b)$ with $a = 0.01$ and $b = 0.01$.¹¹ For the variance-covariance hyperparameter $\Sigma$, we use an inverse Wishart prior with degrees of freedom equal to 4 and a scale matrix with diagonal elements equal to 1000 and off-diagonal elements equal to zero. We ran sampling chains for 30,000 iterations and assessed convergence by monitoring the time series of the draws. We report results based on 15,000 draws retained after discarding the first half of the draws as burn-in iterations.
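Two of the ingredients above are simple enough to sketch in Python: the log density of the inverse Gamma prior in the $IG(a, b)$ parameterization used here, and the removal of burn-in draws from a chain. The function names are hypothetical and the sketch is not the sampler used in the study; it only illustrates the stated prior and the burn-in rule.

```python
import math


def inverse_gamma_logpdf(x, a=0.01, b=0.01):
    """Log density of IG(a, b): f(x) = b^a / Gamma(a) * x^(-a-1) * exp(-b / x), x > 0."""
    if x <= 0:
        return float("-inf")  # the density has no mass at or below zero
    return a * math.log(b) - math.lgamma(a) - (a + 1.0) * math.log(x) - b / x


def discard_burn_in(draws, burn_in_fraction=0.5):
    """Drop the initial part of a chain; 0.5 mirrors discarding the first half."""
    start = int(len(draws) * burn_in_fraction)
    return draws[start:]
```

With 30,000 stored draws and the default fraction, `discard_burn_in` retains the last 15,000, matching the numbers reported above.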

¹¹ Parameterization of the inverse Gamma distribution with density $f(x) = \frac{b^a}{\Gamma(a)} x^{-a-1} \exp\left(-\frac{b}{x}\right)$.


Article II

Stadel, D. P., Dellaert, B. G. C., Herrmann, A., and Landwehr, J. R. (submitted). Locked-In To Luxury: First- and Second-Order Default Effects in Mass Customization. Marketing Science.

Locked-In To Luxury:

First- and Second-Order Default Effects

in Mass Customization

Daniel P. Stadel ∗

Benedict G. C. Dellaert †

Andreas Herrmann ‡

Jan R. Landwehr §

∗ Daniel P. Stadel ([email protected]) is a Ph.D. candidate at the University of St. Gallen, 9000 St. Gallen, Switzerland.

† Benedict G.C. Dellaert ([email protected]) is Professor of Marketing at the Erasmus School of Economics, Erasmus University Rotterdam, 3000 DR Rotterdam, The Netherlands.

‡ Andreas Herrmann ([email protected]) is Professor of Marketing at the University of St. Gallen, 9000 St. Gallen, Switzerland.

§ Jan R. Landwehr ([email protected]) is Assistant Professor of Marketing at the University of St. Gallen, 9000 St. Gallen, Switzerland.


Abstract

Mass customization is a growing business practice and a strategy for firms to simultaneously support consumer choice and increase profits. An important business objective of mass customization is to increase sales of high-margin products by selling products that are more closely in line with consumer preferences. In this study, we use a field experiment approach to investigate how consumers can be directed toward high-margin decision paths in mass-customized product choices through the use of defaults. We propose that default attribute levels in mass customization affect consumers’ choices of high-margin attribute levels not only for the attributes for which the defaults are set (first-order default effects) but also for subsequent attributes (second-order default effects). The impact of second-order default effects is tested in an online mass customization configurator in the car industry. Based on existing manufacturer data, we first build a multivariate multinomial probit model that takes into account both first- and second-order default effects on consumer product choices. We use this model to define defaults that promote high-margin mass-customization choices by consumers. Next, we implement these defaults in an online car configurator. An analysis of the resulting real-world consumer choices demonstrates significant increases in product margins for the attributes for which the defaults are set and, more critically, also significant effects on later attribute choices for which no defaults were set. Finally, we conduct a satisfaction survey among users of the online car configurator in which the defaults are implemented. The results show no negative effects of the proposed high-margin defaults on user satisfaction. This alleviates concerns of customer defection due to high-margin product sales. Therefore, this study provides empirical evidence for the theoretical relevance of second-order default effects and their managerial impact in designing mass-customization system configurators for greater profitability.

Key words: Mass Customization, Consumer Choice, Defaults, Online Configurators, Multivariate Multinomial Probit Model


1. Introduction

In the age of mass customization, there are many opportunities for consumers to customize products according to their own preferences. Consumers can design their own personal items, such as watches, shirts, bags, jackets, and shoes. It is estimated that worldwide, over 40,000 configurators are in place across virtually all industries that allow customers to design their own products (www.configurator-database.com). For example, in certain segments of the European car market, over 70% of car buyers already configure their vehicles online.

When designing a product with a configurator, customers are typically asked to select their desired level for many attributes of the product (e.g., in the case of cars, consumers select the level of horsepower for the engine, the color for the exterior, the type of rims, etc.). Jointly, these choices result in the consumers’ desired customized product, which is defined by the set of attribute levels selected for each of the attributes. An important way in which companies can support customers in this decision process is by providing defaults or starting points for the attributes (Dellaert and Stremersch 2005, Randall et al. 2005). Defaults are the pre-set attribute levels that customers obtain in their product unless they make an active choice to select another level (Brown and Krishna 2004, Goldstein et al. 2008, Park et al. 2000). They are commonly observed in online configurators, which typically present consumers with firm-specified default levels for each attribute. Aside from the technical necessity of implementing defaults in mass-customization configurators to be able to automatically complete the product specification, defaults also play a decisive role in the customer decision-making process. Empirical studies show that defaults are commonly selected by decision makers and that they serve as a reference to which other available options are compared (Brown and Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park et al. 2000). Therefore, firms can have a strong influence on consumers’ mass-customization choices by the defaults that they offer.

In this study, we specifically focus on the question of how defaults can be used by firms to direct consumers towards more profitable high-margin decision paths.


Configuring a product such as a car is a multi-attribute decision process (Seetharaman et al. 2005). Therefore, we investigate in particular whether a default in mass customization influences not only the consumer’s level choice for the attribute for which the default was set but also the level choices they make for other attributes that come up later on in the mass-customization process. For example, in an online car configurator, consumers need to make up to 60 consecutive attribute-level choices, and the consumer’s choice in a later attribute may be affected by the selection of a level in an earlier attribute (e.g., leather seats, leather steering wheel). Due to such cross-category dependencies among attributes (Gentzkow 2007, Manchanda et al. 1999, Song and Chintagunta 2006), we expect that the effect of a change in the default for one attribute will carry over to the consumer’s choices for other attributes (Sriram et al. 2010). Indeed, anecdotal evidence from the car market shows that the engine selection has a considerable impact on many other decisions regarding car components (Wedel and Zhang 2004).

We expect this effect across attribute-level choices to have an upward profit potential. First, because of positive cross-category dependencies among the attributes (Gentzkow 2007, Manchanda et al. 1999, Song and Chintagunta 2006), a high-margin change in the default for one attribute is likely to increase high-margin choices for other attributes later on (Sriram et al. 2010). Second, we expect consumers to form mental images (selective accessibility mechanism) that drive the consistent selection of their current attribute levels (Mussweiler 2003). Therefore, once high-margin attribute levels are selected early on in the mass-customization process, this is likely to increase consumers’ later choices for high-margin attribute levels. Third, we also expect consumers to be subject to focalism when going through the mass-customization decision process (Häubl et al. 2010). This implies that after selecting high-margin levels due to the defaults set in the early stages of the mass-customization process, consumers are likely to not fully take into account their past spending commitments when selecting attribute levels later on in the mass-customization process (Levav et al. 2010). The reason is that consumers initially do not take into account all costs of buying a high-level product but later are less sensitive to their earlier choices and more focused on their current choices.


We empirically investigate the default-based up-selling potential in mass-customization configurators. Our aim is to investigate how profitability in mass customization can be increased by the extent to which default effects extend across attributes. To better understand the effects of default selection in mass customization, we address the first- and second-order effects of defaults early in the process on the profitability of mass customization. The first-order effect causes consumers to select higher-margin attribute levels within the attributes for which the defaults were set. The second-order effect operates through changes in consumers’ selections for attributes in the later choices of the mass-customization process. We offer a conceptual framework to managerially guide the selection of defaults to accommodate these two effects. Therefore, in contrast to past research on mass customization that addressed profitability mainly by focusing on benefits such as organizational learning and lean management (Alford et al. 2000, Silveira et al. 2001, Kotha 1995, 1996), we analyze the potential of the profitability of a firm’s mass customization based on consumers’ choices within the mass-customization process. To do so, we model the interplay of pre-set defaults and subsequent decisions and their effects on the revenue-generating process.

The remainder of the paper is organized as follows. Section 2 discusses the effects of defaults in mass-customization configurators, a utility model specification of consumer choice in mass customization, and the multivariate multinomial Probit model for the analysis of how consumers choose among specific target attributes conditional on choices they have already made. We also propose a procedure to maximize profitability based on the consumer preference model. In Section 3, we provide a context for the methodology and use an example of online configuration from the automotive industry as an explicit application to real data. We analyze product selections in the data to estimate a preference model as a basis for optimal default selection. Then, we test the specified optimal defaults in a field experiment with the same configurator. With a follow-up survey, we also address the question of whether customer satisfaction is affected by the high-margin defaults in the product configurator. We conclude with a discussion and managerial implications of our findings in Section 4.


2. High-Margin Default Selection in Mass-Customization Configurators

Mass-customization strategies offer a high variety in product design with the aim of matching customers’ product preferences more closely than when only a few product options are available (e.g., Davis 1987, Duray et al. 2000, Kotha 1995). A commonly used approach to achieve this aim is to provide consumers with access to online product configurators (Randall et al. 2007, Salvador et al. 2009). These are in essence elaborate choice boards with which customers can compose products according to their own needs. Product designs can then either be saved to the customer’s own account or be directly placed as an order to the firm (e.g., www.nikeid.com).

Recent research has investigated the (economic) value of mass-customized products from a customer’s perspective (Franke et al. 2009, Franke and Schreier 2010, Franke et al. 2010, Schreier 2006). Mass customization can provide a higher preference fit (Franke et al. 2009, Franke and Schreier 2010, Franke et al. 2010, Schreier 2006), process benefits (Franke and Schreier 2010, Schreier 2006), a self-design effect (Franke et al. 2010, Schreier 2006), design effort (Franke et al. 2010), and product uniqueness (Schreier 2006). All these aspects directly affect the perceived value of a customized product. Thus, companies have to face the challenge of striking the balance between complexity and utility in mass customization (Dellaert and Stremersch 2005). One opportunity to address both the preference fit and the design effort is the use of intentionally set defaults for several options within the customization process. Such defaults lower the complexity of the decision-making process, as they serve as a reference to which other options are compared (Brown and Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park et al. 2000), but do not limit the products’ individuality, as the defaults can be changed when customers make an active choice (Brown and Krishna 2004, Goldstein et al. 2008, Park et al. 2000).

2.1 First- and Second-Order Default Effects

In an online configurator, the customer has to go through several choices in order to select the attribute levels they prefer for the product that is being customized. Theoretically, we expect a strong direct effect of defaults among the attributes for which they are set, as this is shown in past research (Brown and Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park et al. 2000). Therefore, we expect that consumers frequently follow the manufacturer’s pre-set levels when making their mass-customization choices (in our field study, the default acceptance rates range from 9% to 87%).

In this study, we are specifically interested in the effects that defaults may have on subsequent option choices because such sequential decisions often are not made independently (Gabaix et al. 2006, Iyengar and Lepper 2000, Muraven and Baumeister 2000). Although one might expect that budget considerations should drive consumers to select less expensive attribute levels later on in the customization process once they have made more expensive choices in the beginning, we propose that there is a likely positive effect of initial high-margin defaults on later high-margin attribute choices. This is due in part to cross-category complementarities among the attributes (Gentzkow 2007, Manchanda et al. 1999, Song and Chintagunta 2006): a change in the default for one attribute is likely to also affect choices for other attributes later on (Sriram et al. 2010).

Furthermore, we also expect behavioral influences to increase the selections of high-margin attribute levels later on. From a psychological perspective, defaults constitute the point of reference that people have in their mind when they evaluate the other attribute levels of the considered attribute. They thus engage in a comparison process, where they evaluate the expected utility of each alternative attribute level in comparison to the default level to come up with their decision. Such comparison mechanisms are very well understood in the psychological literature, and it has been shown that people usually engage in a form of hypothesis-consistent testing (Mussweiler and Strack 1999, Mussweiler 2003). That is, people start to mentally activate images and associations that are consistent with choosing the default. For instance, a high level for horsepower might activate mental images of luxury, freedom, and youthfulness. The alternatives are then judged in comparison to these mental images and only regarding the dimensions that have been activated by the default (Houston and Roskos-Ewoldsen 1998). The default thus has an advantage over the alternatives that are only chosen if they are considerably superior. For the present context, the most important prediction of this comparison process is that mental images that have been activated for the comparison process stay activated in the mind and are therefore likely to influence later choices (Mussweiler and Strack 1999). That is, the activated mental images and associations stay highly activated in the person’s working memory and work as priming for later decisions that thus have an increased likelihood of being consistent with the activated mental images. Therefore, if an image of luxury, freedom, and youthfulness has been activated, later decisions are made to be consistent with that image and thus correspond to the initial default.

Finally, due to focalism in the consumer decision process, we expect consumers to cognitively focus more strongly on their current attribute-level choices than on previous or later attribute-level choices (Häubl et al. 2010). This effect is bidirectional in that consumers (i) do not fully take into account their future attribute-level choices when selecting their current attribute levels (Levav et al. 2010), and (ii) are not likely to fully take into account all past attribute-level choices when making their current decisions (cf. Miller 1956). We expect that this focalism effect will lead consumers to select relatively more expensive attribute levels in their mass-customization choices than when they choose between complete product alternatives simultaneously. The reason is that consumers initially do not take into account all costs of choosing a high-level attribute option but later on forget about the costs of the earlier high-level attribute choices they have already made.

Conceptual framework

Using the data from their online configurator, firms can specify two sets of attributes for different purposes in the default-setting process: (i) the set of attributes among which they offer the default levels and which we refer to as key attributes, and (ii) the set of attributes among which they expect second-order effects and which we refer to as target attributes. Most desirably, the category of key attributes is characterized by its importance for the product; that is, one level of each attribute is essential for the product functionality (e.g., engine for a car), and the category of target attributes is characterized by high margins.


Figure 1 Default Effects Framework

A general framework helps to understand the relationships that are being analyzed. The framework demonstrates the expected association between key and target attributes. We expect the default selection among key attributes to have significant carry-over effects on the subsequent level choices among the target attributes. Figure 1 summarizes the position of key and target attributes within the analysis and displays the relationship we explore. There are two optimization considerations. First, one can analyze which default set among the key attributes leads to optimal consumer choices among the key attributes themselves from the firm’s perspective (which here is profitability), termed first-order optimization. Second, one can analyze which pre-set options among key attributes lead to optimal decisions within the sets of both key and target attributes, termed second-order optimization (the second-order effect included). In other words, the second-order optimization incorporates the interplay between pre-specified and subsequent attribute choices within the product design process into the calculation. We address the joint first- and second-order optimization problem in this study (see Figure 1) and determine the best default set among key attributes in terms of resulting level choices, taking into account the second-order effects within the category of target attributes. That is, we analyze which defaults are best in terms of profitability with respect to the joint revenue from the attributes to which the pre-set options are directly applied (key attributes) and from the attributes whose choice probability is most likely to depend on choices among the key attributes (target attributes). In Section 2.4, we approach the question of how to find an optimal default set that maximizes profitability across the two effects. Sections 2.2 and 2.3 develop the underlying utility model to drive this optimization.
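The difference between the two optimization considerations can be illustrated with a small Python sketch. It is not the procedure of Section 2.4: it simply enumerates all candidate default sets and keeps the one with the highest value of a caller-supplied objective. The function `best_default_set`, the objective functions, and all numbers in the usage example are hypothetical and serve only to show that the first-order and the joint objective can select different defaults.

```python
from itertools import product


def best_default_set(levels_per_key_attribute, expected_margin):
    """Enumerate every default set (one level per key attribute) and keep the best one.

    levels_per_key_attribute: number of levels of each key attribute.
    expected_margin: hypothetical function mapping a tuple of default levels
    (1-based) to the expected margin under that default set.
    """
    candidates = product(*(range(1, k + 1) for k in levels_per_key_attribute))
    return max(candidates, key=expected_margin)


# Hypothetical expected margins for a single key attribute with two levels.
key_margin = {1: 10.0, 2: 9.0}      # margin earned on the key attribute itself
target_margin = {1: 1.0, 2: 5.0}    # margin earned on subsequent target attributes

# First-order optimization: only the key attribute's own margin counts.
first_order = best_default_set([2], lambda d: key_margin[d[0]])

# Joint first- and second-order optimization: key plus target margins.
joint = best_default_set([2], lambda d: key_margin[d[0]] + target_margin[d[0]])
```

In this toy setting, the first-order objective selects level 1, whereas the joint objective selects level 2 because its carry-over on the target attribute outweighs the lower margin on the key attribute. Exhaustive enumeration is only feasible for a small number of key attributes.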

2.2 Utility specification

For the utility specification, we consider a utility surplus model such that we calculate the differences between two aspects of utility for each target attribute level: (i) the joint utility from the target attribute level in combination with all key attribute level choices ($u_{K+T}$), and (ii) the utility from the key attribute level choices only ($u_K$). The resulting utility component $u_T = u_{K+T} - u_K$ is then the relevant combination of the target attribute utility and its interactions with previous key attribute choices. With such a model formulation, we separate out the key attributes’ effects from the joint product utility. Hence, with the resulting utility we can capture the key attributes’ influence on the target choice. As a consequence, we can also determine the different effects of different choices among key attributes on target attribute utility, which allows us to select the optimal default setting mix to affect both the key attributes for which they are set and the subsequent target attribute choices. Before we formulate the surplus model in all technical details, we start with the notation and handling of the different variables in our model.

To develop our utility model, we consider each attribute level for the key attributes for which the defaults are set as a single dichotomous variable in the decision process. Therefore, let $x_{lv}$ be the indicator for choosing the $v$th level of the $l$th attribute, namely,

$$x_{lv} = \begin{cases} 1, & \text{if the $v$th level of attribute $l$ is chosen} \\ 0, & \text{if the $v$th level of attribute $l$ is not chosen} \end{cases}, \tag{1}$$

where $l = 1, \ldots, m$, with $m$ representing the number of attributes in the data set (or considered in the analysis, respectively), and $v = 1, \ldots, k_l$, with $k_l$ the number of levels for attribute $l$.

For the target attributes, we take a different approach and consider each attribute as a polychotomous (multinomial) variable. Let $d_j$ be the chosen level for attribute $j$ such that

$$d_j = \begin{cases} 1, & \text{if the 1st level of attribute $j$ is chosen} \\ 2, & \text{if the 2nd level of attribute $j$ is chosen} \\ \;\vdots & \\ k, & \text{if the $k$th level of attribute $j$ is chosen} \\ 0, & \text{if no level of attribute $j$ is chosen} \end{cases}, \tag{2}$$

where $j = 1, \ldots, J$, with $J$ representing the number of attributes in the data set (or considered in the analysis, respectively), and $k_j = |\{0, 1, \ldots, k\}|$, with $k_j$ representing the number of levels available for attribute $j$.¹² The data can be defined and formatted in any way that is convenient for a specific analysis. Generally, as output from a configurator, the data consist of discrete, dichotomous, and/or polychotomous entries in one of the given structures above and can be easily transformed into one another.
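The transformation between the two data structures can be sketched as follows. This is a minimal Python illustration with hypothetical function names; it assumes that the dichotomous indicators of one attribute are stored as a list ordered by level and that at most one level per attribute is chosen.

```python
def indicators_to_level(indicators):
    """Map dichotomous indicators (x_l1, ..., x_lk) to the chosen level d (0 = no level chosen)."""
    if sum(indicators) > 1:
        raise ValueError("at most one level per attribute may be chosen")
    for v, chosen in enumerate(indicators, start=1):
        if chosen:
            return v
    return 0


def level_to_indicators(level, n_levels):
    """Inverse mapping: chosen level d in {0, 1, ..., k} to k dichotomous indicators."""
    if not 0 <= level <= n_levels:
        raise ValueError("level out of range")
    return [1 if v == level else 0 for v in range(1, n_levels + 1)]
```

Here `n_levels` counts only the levels 1 to k; the no-choice option corresponds to an all-zero indicator list.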

To analyze the choice behavior through a probability model, we translate the choice behavior into a utility context. First, as each level of a target attribute, in combination with levels of key attributes, has a certain utility (either positive or negative) for each customer $i$, we formulate a constant utility specification for each level of the target attributes. That is:

$$v_{ij}(t_j) = \beta^{Config}_{jt_j}, \tag{3}$$

where $v_{ij}(t_j)$ is the joint utility consumer $i$ perceives from choosing level $t_j$ of the $j$th target attribute in combination with the previous choices among the key attributes. Second, we define a linear utility specification for the set of key attributes. Therefore, according to equation (1), let $x$ be the indicator vector whose entries are set to one if the corresponding attribute level is chosen and zero otherwise. Furthermore, let $\beta^{Key} = (\beta^{Key}_{11}, \ldots, \beta^{Key}_{1k_1}, \ldots, \beta^{Key}_{mk_m})^T$ be the vector of utilities associated with each of the $k_{\cdot}$ levels of the $m$ key attributes represented in the choice vector $x = (x_{11}, \ldots, x_{1k_1}, \ldots, x_{mk_m})^T$. Therefore, assuming a linear-additive utility function for each customer $i$, we get

$$u_i(x) = x^T \beta^{Key}, \tag{4}$$

¹² Using this structure of the data, the no-choice option is included in the number of $k_j$ attribute levels (e.g., three choice options: no choice, standard, or advanced → $k_j = 3$).


where $u_i(x)$ is the utility consumer $i$ perceives from choosing a specific combination $x$ from the key attribute levels. We assume that consumer $i$ ($i = 1, \ldots, n$) cannot choose more than one level of each attribute. A consumer $i$ will then only choose a certain attribute level if it maximizes his/her utility surplus. In other words, the choice for a specific level $t_j$ of a target attribute $j$ has to provide the maximal possible increase of his/her perceived joint utility from the set of chosen key attribute levels $x$ and the chosen level of the specific target attribute. Consequently, the resulting utility surplus from equations (3) and (4) for consumer $i$ and level $t_j$ of the $j$th target attribute is given by

$$z_{ij}(t_j) = v_{ij}(t_j) - u_i(x) = \beta^{Config}_{jt_j} - x^T \beta^{Key}. \tag{5}$$

Therefore, a utility-maximizing consumer $i$ would choose level $t_j^*$ of the $j$th target attribute if it has the maximum utility surplus $\{z_{ij}(t_j^*) > z_{ij}(t_j),\ \forall t_j \neq t_j^*\}$, and would choose none of the alternatives if the no-choice option ($t_j = 0$) has the maximum utility $\{z_{ij}(0) = 0 > z_{ij}(t_j),\ \forall t_j \neq 0\}$.
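The surplus in equation (5) and the resulting choice rule can be sketched in a few lines of Python. The sketch is deterministic, that is, it leaves out the error term that enters the probit model, and the function names and argument layout are assumptions made for illustration.

```python
def utility_surplus(beta_config_j, x, beta_key):
    """Equation (5) for every level t_j = 1, ..., k of target attribute j.

    beta_config_j[t - 1] is the joint utility of level t; x is the key-attribute
    indicator vector and beta_key the matching vector of key-attribute utilities.
    """
    key_utility = sum(x_lv * b for x_lv, b in zip(x, beta_key))  # u_i(x) = x^T beta_key
    return [b_conf - key_utility for b_conf in beta_config_j]


def choose_level(surpluses):
    """Return the level with the largest surplus, or 0 (no choice) if none is positive."""
    best = max(range(len(surpluses)), key=lambda idx: surpluses[idx])
    return best + 1 if surpluses[best] > 0 else 0
```

For example, with `beta_config_j = [1.0, 2.5]`, `x = [1, 0, 1]`, and `beta_key = [0.5, 0.2, 1.0]`, the key utility is 1.5, the surpluses are -0.5 and 1.0, and level 2 is chosen.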

Our utility surplus specification in equation (5) is considered separately for each of the $J$ choices among the target attributes within the multivariate multinomial probit model, as it is a combination of $J$ multinomial probit models. This implies an assumption of no interaction effects between target attributes, as the choices among them are made sequentially. That is, we have no extra parameters in the model that account for interaction effects between the target attribute currently being chosen and previous choices among them. Possible interaction effects are then included in the attribute level intercepts ($\beta^{Config}_{jt_j}$), as these also account for the average effect of earlier target attribute selections. The reason for this assumption is a practical one: we do not want to overwhelm the model with parameters that significantly increase the model complexity and the resulting computational effort for parameter estimation. In order to accommodate such interactions between target attributes, we further model a full correlation structure; that is, we allow for correlations among alternatives within decisions and among alternatives between all other choices, as we have already pointed out that choices are not made independently. Consequently, if a target attribute level in choice step 1 has a positive interaction with a target attribute level in choice step 3, the corresponding correlation with respect to the underlying utility is significantly greater than zero. In this way, we still incorporate associations between target attribute levels although we have not specifically addressed them in the formulation of the surplus model.

As already mentioned, we use a probit model approach, as such models are common practice in the analysis of polychotomous response data (Albert and Chib 1993). Due to our definition and the nature of the data, the actual choices being made among the target attributes, according to (2), can be considered multinomial outcomes. Because we have more than one target attribute (at least in the general case), we end up having a multivariate multinomial outcome vector. To address this issue, in the next section, we specify a multivariate multinomial Probit model. To account for heterogeneity among individuals, we follow a standard Bayesian approach for parameter estimation (Rossi and Allenby 2003).

2.3 Multivariate multinomial Probit model

Starting from amass of n customers, suppose that each individual i (i = 1, ..., n)

has to make J choices within the set of target attributes. For the first choice, k1

options are offered, for the next, k2 options, and so on up to the last choice with

kJ options (Recall: The no-choice option is included in the number of options

kj). Let, then, according to equation (2), di = (di,1, ..., di,J)T denote the index

vector of the alternatives the ith individual chooses for the J decisions. Assume

that each of these J choices follows a multinomial Probit model (Zhang et al.

2008)13. As explanatory variables for all J decisions, we introduce the previous

choices among a specified set of key attributes into the model, as defined in

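equation (1), $x = (x_{11}, \ldots, x_{1k_1}, \ldots, x_{mk_m})^T$. Therefore, following our utility specification in equation (5), for the jth choice, $j = 1, \ldots, J$, there exists a $(k_j - 1)$-dimensional underlying utility vector $z_{i,j}$, with

\[
z_{i,j} =
\begin{pmatrix}
1 & 0 & \cdots & 0 & -x^T \\
0 & 1 & \cdots & 0 & -x^T \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -x^T
\end{pmatrix}
\times
\begin{pmatrix}
\beta^{\mathrm{Config}}_{j1} \\
\beta^{\mathrm{Config}}_{j2} \\
\vdots \\
\beta^{\mathrm{Config}}_{j(k_j-1)} \\
\beta^{\mathrm{Key}}
\end{pmatrix}
+ \varepsilon_{i,j}
= X_{i,j}\beta_j + \varepsilon_{i,j},
\]

where $\beta_j = (\beta^{\mathrm{Config}}_{j1}, \beta^{\mathrm{Config}}_{j2}, \ldots, \beta^{\mathrm{Config}}_{j(k_j-1)}, (\beta^{\mathrm{Key}})^T)^T$, with $\beta^{\mathrm{Key}} = (\beta^{\mathrm{Key}}_{11}, \ldots, \beta^{\mathrm{Key}}_{1k_1}, \ldots, \beta^{\mathrm{Key}}_{mk_m})^T$, satisfying

\[
d_{i,j} =
\begin{cases}
0 & \text{if } \max_{1 \le l \le k_j-1}(z_{i,j,l}) < 0 \\
r & \text{if } \max_{1 \le l \le k_j-1}(z_{i,j,l}) = z_{i,j,r} > 0
\end{cases}
\qquad (6)
\]

where $z_{i,j,l}$ is the lth component of the utility vector $z_{i,j}$, $z_{i,j} \sim N(X_{i,j}\beta_j, \Sigma_j)$ and $\{\Sigma_j\}_{(1,1)} = 1$ for $i = 1, \ldots, n$ and $j = 1, \ldots, J$; $\varepsilon_{i,j} \sim N(0, \Sigma_j)$. Equation (6) describes a fully identifiable multivariate multinomial Probit model (MVMNP; see Zhang et al. 2008).

13 For detailed discussions of multinomial Probit models, and the treatment of additive and multiplicative redundancy, see for example McCulloch and Rossi (1994), McCulloch et al. (2000) and Imai and van Dyk (2005a,b).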
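The decision rule in equation (6) can be sketched in a few lines of Python. This is only an illustration of the mapping from latent utility surpluses to an observed choice index; the function name is hypothetical and not part of the original work.

```python
def choice_from_utilities(z):
    """Map latent utility surpluses z = (z_1, ..., z_{k-1}) to a choice index.

    Returns 0 (the no-choice option) if every surplus is negative, otherwise
    the 1-based index r of the largest surplus, following equation (6).
    A surplus of exactly zero has probability zero under the normal model;
    here it is treated like a positive surplus.
    """
    best = max(range(len(z)), key=lambda l: z[l])
    if z[best] < 0:
        return 0
    return best + 1


print(choice_from_utilities([-0.4, -1.2, -0.1]))  # all surpluses negative -> 0
print(choice_from_utilities([-0.4, 0.7, 0.3]))    # second surplus is largest -> 2
```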

Putting all of this together, we can now set up our MVMNP for the J choices within the target category as follows:

\[
z_i = X_i\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \Sigma),
\]

where $z_i = (z_{i,1}, \ldots, z_{i,J})^T$ are the utility surpluses for each alternative in the several decisions with $z_{i,j} = (z_{i,j,1}, \ldots, z_{i,j,(k_j-1)})$, $j = 1, \ldots, J$, $X_i = (X_{i,1}^T, \ldots, X_{i,J}^T)^T$, $\beta = (\beta^{\mathrm{Config}}_{11}, \beta^{\mathrm{Config}}_{12}, \ldots, \beta^{\mathrm{Config}}_{1(k_1-1)}, \ldots, \beta^{\mathrm{Config}}_{J1}, \beta^{\mathrm{Config}}_{J2}, \ldots, \beta^{\mathrm{Config}}_{J(k_J-1)}, (\beta^{\mathrm{Key}})^T)^T$ and $\varepsilon_i \sim N(0, \Sigma)$ with diagonal elements of $\Sigma$ equal to one, i.e. $\sigma_{q,q} = 1$, where $q = 1$, $(1 + k_1 - 1) = k_1$, $(1 + k_1 - 1 + k_2 - 1) = (k_1 + k_2 - 1)$, $\ldots$, $(1 + k_1 - 1 + k_2 - 1 + \ldots + k_{J-1} - 1) = (k_1 + k_2 + \ldots + k_{J-1} - (J-2))$. More generally, $q = 1 + \sum_{s=1}^{j-1}(k_s - 1)$, with $j = 1, \ldots, J$. Our explanatory variables can be represented by a $\left(\sum_{j=1}^{J}(k_j - 1)\right) \times \left(\sum_{j=1}^{J}(k_j - 1) + \sum_{l=1}^{m} k_l\right)$-matrix;

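\[
X_i =
\left(
\underbrace{
\begin{matrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{matrix}
}_{A}
\;\;
\underbrace{
\begin{matrix}
-x_{i,11} & \cdots & -x_{i,1k_1} & \cdots & -x_{i,mk_m} \\
-x_{i,11} & \cdots & -x_{i,1k_1} & \cdots & -x_{i,mk_m} \\
\vdots & & \vdots & & \vdots \\
-x_{i,11} & \cdots & -x_{i,1k_1} & \cdots & -x_{i,mk_m}
\end{matrix}
}_{B}
\right)
\]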

with m the number of key attributes. The matrix A is a $\sum_{j=1}^{J}(k_j - 1) \times \sum_{j=1}^{J}(k_j - 1)$ identity matrix and includes the constants equal to 1 for the configuration utilities, $v_{ij}(t_j) = \beta^{\mathrm{Config}}_{jt_j}$, of the choice options for all J possible

choices. The matrix B includes the actual choices individual i made within the

set of key attributes. Obviously, the choices among the key attributes are the

same for all subsequent decisions within the target category. Consequently, the

rows of matrix B are identical. The parameter estimation process, a Bayesian

sampling algorithm for MVMNP models, is presented in appendix B.
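To make the block structure of the design matrix concrete, the following minimal Python sketch builds the identity block A and the key-attribute block B for one individual, and lists the positions q of the diagonal elements of Σ that are fixed to one. The function names and the example inputs are illustrative assumptions, not code from the original work.

```python
def design_matrix(k, x):
    """Build X_i = [A | B] for one individual.

    k : option counts k_1..k_J (each including the no-choice option)
    x : 0/1 indicators for the key attribute levels chosen by individual i
    """
    n_rows = sum(kj - 1 for kj in k)
    rows = []
    for r in range(n_rows):
        identity_part = [1 if c == r else 0 for c in range(n_rows)]  # block A
        key_part = [-v for v in x]                                   # block B, same in every row
        rows.append(identity_part + key_part)
    return rows


def unit_variance_positions(k):
    """1-based positions q = 1 + sum_{s<j}(k_s - 1), j = 1..J."""
    positions, q = [], 1
    for kj in k:
        positions.append(q)
        q += kj - 1
    return positions


# Illustrative inputs: four decisions and 13 key attribute levels.
k = [3, 2, 4, 3]
x = [1] + [0] * 12
X = design_matrix(k, x)
print(len(X), len(X[0]))            # rows and columns of X_i
print(unit_variance_positions(k))   # positions of the fixed diagonal entries
```

With these inputs the matrix has 8 rows and 8 + 13 = 21 columns, in line with the dimension formula given above.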

The corresponding choice probabilities $P(d_i)$ for the outcome vectors $d_i = (d_{i,1}, \ldots, d_{i,J})^T$ are then simply the probabilities that the latent utility surpluses $z_i = (z_{i,1}, \ldots, z_{i,J})^T$ fall within the ranges that produce the choice vector $d_i$. They are straightforward to compute because, in our multivariate multinomial Probit model, the latent utility surpluses are assumed to arise from a multivariate normal distribution, $z_i \sim N(X_i\beta, \Sigma)$.
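One simple way to approximate such a choice probability is plain Monte Carlo simulation: draw latent utility surpluses from the multivariate normal distribution, apply the choice rule of equation (6) to each decision, and count how often the target choice vector results. The sketch below uses only the Python standard library; it is an illustration under these assumptions, not necessarily the procedure used in the original work, and all function names are hypothetical.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][t] * L[j][t] for t in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def choice_from_utilities(z):
    """Choice rule of equation (6): 0 if all surpluses are negative, else 1-based argmax."""
    best = max(range(len(z)), key=lambda l: z[l])
    return 0 if z[best] < 0 else best + 1


def simulate_choice_probability(mean, sigma, k, d, n_draws=20000, seed=1):
    """Estimate P(d_i = d) by simulating z_i ~ N(mean, sigma).

    mean : the vector X_i beta, stacked over the J decisions
    k    : option counts k_1..k_J (each including the no-choice option)
    d    : target choice vector (d_1, ..., d_J)
    """
    rng = random.Random(seed)
    L = cholesky(sigma)
    n = len(mean)
    hits = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = [mean[i] + sum(L[i][t] * e[t] for t in range(i + 1)) for i in range(n)]
        choices, start = [], 0
        for kj in k:
            choices.append(choice_from_utilities(z[start:start + kj - 1]))
            start += kj - 1
        hits += choices == list(d)
    return hits / n_draws


# Two binary decisions with strongly correlated surpluses (illustrative values).
p = simulate_choice_probability([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], k=[2, 2], d=[1, 1])
print(p)
```

For this two-decision example the exact value is 1/4 + arcsin(0.9)/(2π), roughly 0.43, which shows how a positive correlation raises the probability of jointly choosing both options above the 0.25 obtained under independence.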

2.4 Profitability maximization

We propose a two-tiered attribute selection procedure as a profitability maximization strategy. First, the knowledge, experience, and expertise of company authorities and/or area experts are used to limit the pool of possible attributes to be considered in the analysis. This qualitative survey for attribute selection also ensures that it is managerially feasible to use the levels of possible key attributes as pre-set options in the online configurator, and it identifies the most promising target attributes to influence. In other words, company experts and authorities can give helpful information about expected associations between attributes, the willingness to pre-set levels for the specified attributes and whether

these pre-set options are practicable from a technical perspective. Second, a

contingency analysis can be conducted to investigate the relationships of the

qualitatively appointed attributes. This analysis reveals which relations among

the appointed attributes are significant. This attribute selection procedure is an

iterative process and results in a set of attributes best meeting the managerial

expectations and requirements. To do this, we use a simple maximization pro-

cedure.

Once we have estimated the probability model, we can go one step further

and determine the choices within the set of key attributes that are most likely

to increase the joint profitability of mass-customization systems with respect to

the appointed key and target attributes. The result will then be referred to as an

optimal default set. As discussed in the introduction, we build on the results of

previous research that has shown that pre-set options are a very powerful tool

to support consumers’ choices (e.g., Goldstein et al. 2008, Brown and Krishna

2004). Therefore, we attempt to provide a default combination among the key

attributes that has the highest probability of putting consumers on a high-margin

decision path jointly considering key and target attributes, that is, not neglecting

possible second-order effects depending on previous decisions. We use a sim-

ple iterative maximization procedure. The object to be maximized is the joint

expected profit E(profit|x) coming from the key and target attributes. In other words, it is the profit from the target attributes, weighted by the corresponding probabilities determined in the previous section conditional on the choices among the key attributes, plus the profit coming from the chosen key attribute levels. Thus, we have to solve the following maximization problem:

\[
\max_x \left[ E(\text{profit} \mid x) \right] =
\max_x \left[ \sum_{l_1=1}^{k_1} \cdots \sum_{l_J=1}^{k_J} P(d_1 = l_1, \ldots, d_J = l_J) \cdot v(d_1 = l_1, \ldots, d_J = l_J) + x^T (p(x) - c(x)) \right],
\qquad (7)
\]

with $v(d_1 = l_1, \ldots, d_J = l_J) = p(d_1 = l_1, \ldots, d_J = l_J) - c(d_1 = l_1, \ldots, d_J = l_J)$, where $p(d_1 = l_1, \ldots, d_J = l_J)$ simply denotes the sum of the prices for

the corresponding choices among the target attributes to be paid by the customers and $c(d_1 = l_1, \ldots, d_J = l_J)$ the sum of costs for the car manufacturer for the chosen target attributes. The vector x, according to (1), stands for the specific key attribute levels, p(x) is the corresponding price vector and c(x) is the cost vector associated with the key attribute levels in x. The probabilities $P(d_1 = l_1, \ldots, d_J = l_J)$ are calculated from the multivariate multinomial Probit

model (6) presented earlier in Section 2.3. The corresponding optimal default set resulting from our model is then given by

\[
x_{opt} = \arg\max_x \left[ \sum_{l_1=1}^{k_1} \cdots \sum_{l_J=1}^{k_J} P(d_1 = l_1, \ldots, d_J = l_J) \cdot v(d_1 = l_1, \ldots, d_J = l_J) + x^T (p(x) - c(x)) \right].
\qquad (8)
\]

Consequently, the optimal default combination is determined via an iterative

grid search algorithm that evaluates the expectation function for each possible

combination x of default setups within the set of key attributes (or any other

defined support for x over which one wishes to optimize the profit). To illustrate the approach and to provide evidence for the relevance of such an analysis, we place the multivariate multinomial Probit model in the context of an online car configurator. Therefore, in the next sections, we refer to real data from the automotive industry.
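The exhaustive grid search behind equations (7) and (8) can be sketched as follows. The sketch assumes that a default combination is coded as one level index per key attribute (0 meaning no default) and that the choice probabilities are supplied by a user-provided function; all names and the toy numbers are invented for illustration and are not estimates from the study.

```python
from itertools import product


def expected_profit(x, outcome_probs, target_margin, key_margin):
    """E(profit | x): probability-weighted target margins plus key-attribute margins.

    x             : tuple with the pre-set level per key attribute (0 = none)
    outcome_probs : function x -> {target choice vector d: P(d | x)}
    target_margin : function d -> v(d) = p(d) - c(d)
    key_margin    : function x -> margin from the chosen key attribute levels
    """
    probs = outcome_probs(x)
    return sum(p * target_margin(d) for d, p in probs.items()) + key_margin(x)


def grid_search(levels_per_attribute, outcome_probs, target_margin, key_margin):
    """Evaluate E(profit | x) for every combination x and return the maximizer."""
    grid = product(*[range(n + 1) for n in levels_per_attribute])
    return max(grid, key=lambda x: expected_profit(x, outcome_probs, target_margin, key_margin))


# Toy inputs, invented for illustration only.
def toy_probs(x):
    p_upgrade = 0.2 + 0.3 * x[0] + (0.2 if x[1] == 1 else 0.0)
    return {(1,): p_upgrade, (0,): 1.0 - p_upgrade}


def toy_target_margin(d):
    return 1000.0 if d == (1,) else 0.0


def toy_key_margin(x):
    return 100.0 * x[0] + 50.0 * x[1]


best = grid_search([1, 2], toy_probs, toy_target_margin, toy_key_margin)
print(best)  # (1, 1)
```

In the toy example the highest level of the second attribute brings the largest key-attribute margin but lowers the probability of the target upgrade, so the search settles on the middle level. This mirrors the point that first-order and second-order effects have to be weighed jointly.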

3. Online car configurator: Existing real-world data and a field experiment

We now apply the proposed probability and optimization model to real data in

an application in the automotive industry. In particular, we analyze existing data

and conduct a field experiment on the basis of the online car configurator of a

renowned premium car manufacturer. First, we describe the existing configurator

consumer choice data for 2,500 real-world customers, the parameter estimation

of the multivariate multinomial Probit model based on this data and the resulting default level selection. Next, we discuss a field experiment with 8,608

customers that was designed on the basis of the initial analysis and that allows

for a more controlled test for the first- and second-order effects. Finally, we

introduce the results of a customer satisfaction survey to investigate whether

the selection of a high-margin default level might harm customers’ satisfaction

with the product and the decision-making process.

3.1 Existing online configurator customer choice data

A large data set containing 2,500 real online car configurations was available

for analysis. These online car configurations contain the choices of real poten-

tial customers who are very likely to be interested in purchasing a car, as the

data in our analysis is the actual output of a functioning online car configurator

provided on the website of a renowned premium car manufacturer. This is a

unique source of data regarding real decision behavior in a mass customization

environment for durable goods, particularly cars. The data contain individual

car designs customers have stored to their account. The company can obtain

these configurations from its own website on which the co-design platform is

implemented.

The data is structured as follows. For each stored configuration, all attribute

levels to be individually chosen within the configuration process are stored with

either a "chosen" label or a "not-chosen" label. Consequently, we can set up a

data structure as described in Section 2.2. We define the variables representing

the key attribute levels according to Equation (1) as dichotomous variables, and

the variables representing the target attribute levels according to Equation (2) as

polychotomous (multinomial) responses, respectively. This allows us to set up

the multivariate multinomial Probit model as covered in detail in Section 2.3.
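As a rough illustration of this data structure, the sketch below turns one stored configuration with "chosen" / "not-chosen" labels into a dichotomous key-attribute vector x and a polychotomous target choice vector d. The attribute-level labels and the record layout are hypothetical; the actual export format of the configurator is not described in the text.

```python
# Hypothetical labels, used only to illustrate the encoding.
KEY_LEVELS = ["business_package", "rims_18", "rims_19"]
TARGET_LEVELS = {"sound": ["dsp", "bose"], "radio": ["radio_1"]}


def encode_configuration(record):
    """Turn a stored configuration into (x, d).

    record : dict mapping attribute-level label -> "chosen" / "not-chosen"
    x      : 0/1 indicator per key attribute level (dichotomous variables)
    d      : per target attribute, 0 for no choice or the 1-based chosen level
    """
    x = [1 if record.get(level) == "chosen" else 0 for level in KEY_LEVELS]
    d = []
    for levels in TARGET_LEVELS.values():
        chosen = [i + 1 for i, level in enumerate(levels) if record.get(level) == "chosen"]
        d.append(chosen[0] if chosen else 0)
    return x, d


record = {
    "business_package": "chosen",
    "rims_18": "not-chosen",
    "rims_19": "chosen",
    "dsp": "not-chosen",
    "bose": "chosen",
    "radio_1": "not-chosen",
}
print(encode_configuration(record))  # ([1, 0, 1], [2, 0])
```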

In our field study, several default combinations determined by using the pro-

posed model are embedded in the manufacturer’s real online car configurator.

When starting the configuration process, a customer was randomly assigned to

one out of five configuration conditions. The five conditions include one con-

dition without any defaults being used and four sets of a varying number of

defaults of different levels. If a customer was assigned to one of the four default

conditions, she was asked whether the car configuration should start with pre-set

options or not. A reasonably large number (15%) decided to start with defaults.

The data set for our field study contains 8,608 observations. In Section 3.2,

we describe the experiment in detail and conduct several tests. We test whether

there are significant differences among the five configuration conditions with

respect to profit from key and target attributes. The results provide evidence

for first- and second-order default effects. In the next section, we start with the

determination of the key and target attributes relevant for our application.

3.1.1 Key and target attributes

A first step in our analysis is the specification of all variables of interest, that is, the dependent variables (the choices among target attributes) and the independent variables (the choices among key attributes). The criteria for selecting the key and target attributes are set by the firm and can be selected to match the specific application. Therefore, we first conducted several interviews with company authorities of the premium car manufacturer. The basic principles, at this step, for the selection of key and target attributes, were the

following: (i) the experts’ opinions and appraisals concerning relations between

consecutive attributes in the configuration process; (ii) the company’s willing-

ness to test several pre-set options for the key attributes in the real online car

customization platform (i.e., to test the pre-set options with real potential cus-

tomers), and (iii) the margins with respect to the target attributes. In more detail,

for the target attributes, the company authorities focused on extra equipment

and accessories such as multimedia attributes. A characteristic of such attributes is that the car manufacturer orders them in large quantities at a considerably low price. Therefore, they potentially generate high profit levels when sold

with cars. The specified set of five key attributes includes the business pack-

age (1 level), rims (2 levels), upholstery (4 levels), steering wheels (3 levels)

and seats (3 levels) with a total of 13 different attribute levels. Therefore, there

are 13 variables for the set of key attributes whose levels can either be chosen

or not (dichotomous variables). The infotainment-attributes are the appointed

target attributes in our analysis. We have four target attributes, namely, naviga-

tion systems, radio, phone equipment and sound systems, each with more than

one level (polychotomous variables). Figure 2 summarizes the model setup for

Figure 2 Default Effects Framework - Online Car Configurator

our analysis according to Figure 1. In a second, successive step, we conducted

4×5 = 20 contingency analyses to statistically investigate the relations between

the key and target attributes. As a result, we obtain significant relations between

the appointed sets of key and target attributes. Table A-1 provides the results of

the respective contingency analysis. Accounting for α-error inflation, we still get an overall significance level of $\alpha^* = 1 - (1 - 0.0001)^{19}(1 - 0.0283) = 0.03$ for all $\chi^2$-tests. The $\chi^2$-based measure Cramér’s V shows the degree of association (see Table A-1). We have thus statistically justified the attribute selection used in our model, so significant parameter estimates are to be expected. A further aspect considered in the selection of the key attributes was the actual position of the attributes within the product (car) configuration process. Our aim was to avoid default crowding in one single configuration screen, as using only a few pre-set options can prevent customers from forming negative metacognitive impressions (Wright 2002). The appointed key attributes are evenly distributed over the entire customization sequence, but are placed before the target attributes. In this way, we make sure that consumers do not feel patronized in their choices

by a large number of pre-set options. Finally, we have adequately set up the

probability model. In the following sections, we estimate the corresponding

model parameters, describe a field study, and assess further consequences of

defaults in the car configurator.
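The overall significance level quoted for the 20 contingency analyses can be reproduced with a short calculation. The sketch below uses the product form given in the text, which treats the individual tests as independent; the function name is an illustrative choice.

```python
def familywise_alpha(levels):
    """Overall error level 1 - prod(1 - alpha_t) over the individual test levels."""
    keep = 1.0
    for a in levels:
        keep *= 1.0 - a
    return 1.0 - keep


# 19 tests at level 0.0001 and one test at level 0.0283, as in the text.
levels = [0.0001] * 19 + [0.0283]
alpha_star = familywise_alpha(levels)
print(round(alpha_star, 2))  # 0.03
```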

3.1.2 Parameter estimation

As noted above, we have access to a large data set from a renowned premium car manufacturer. The data consist of 2,500 car configurations stored from the company’s real online car configurator on its website. We follow a standard Bayesian approach for parameter estimation (Rossi and Allenby 2003; see also appendix B). To estimate our model parameters, we

ran sampling chains for 100,000 iterations and assessed the convergence by

monitoring the time series of the draws. We report results based on 50,000

draws retained after discarding the first 50,000 of the draws as burn-in iterations. Table A-2 in appendix A provides the posterior means and 95% posterior intervals for the parameters $\beta^{\mathrm{Config}}_{11}$, $\beta^{\mathrm{Config}}_{12}$, $\beta^{\mathrm{Config}}_{21}$, $\beta^{\mathrm{Config}}_{31}$, $\beta^{\mathrm{Config}}_{32}$, $\beta^{\mathrm{Config}}_{33}$, $\beta^{\mathrm{Config}}_{41}$, $\beta^{\mathrm{Config}}_{42}$, $(\beta^{\mathrm{Key}})^T$. We can see that almost all parameters, except $\beta^{\mathrm{Config}}_{42}$ (Configuration with - Bose Surround Sound), are significantly different from zero (i.e., zero is not included in the 95% posterior interval).

The parameter estimates should only be interpreted in relative terms, that is, by how they change the surplus function (5). Considering the table more closely, we see that all estimated configuration utilities ($\beta^{\mathrm{Config}}_{jk}$, $k = 1, \ldots, k_j - 1$, $j = 1, \ldots, 4$) are smaller than zero (except $\beta^{\mathrm{Config}}_{41}$), that is, the utility surplus is smaller than zero if no key attribute is chosen at all. This implies that there is a strong second-order effect, as the specification is only chosen if the surplus becomes positive. In other words, the choices of the target attributes certainly depend on the chosen levels of key attributes. Negative estimates for the key attribute level coefficients imply that a chosen key attribute level increases the utility surplus. This confirms the reasonability of the selected sets of attributes and the effect of the appointed key attributes on subsequent level choices according to our MVMNP model. Only for the DSP sound system level do we get a positive surplus even if no key attribute is chosen.14 Looking at the margins of the parameter estimates for the attribute sound systems ($\beta^{\mathrm{Config}}_{41}$ and $\beta^{\mathrm{Config}}_{42}$) and all key attributes ($\beta^{\mathrm{Key}}_{\cdot\cdot}$), we see that no combination x of key attribute levels affects the order of the utility surpluses for the DSP sound system and the Bose sound system, that is, $\max_x z_{i4(2)} = \max_x(\beta^{\mathrm{Config}}_{42} - x^T\beta^{\mathrm{Key}}) = 0.0988 < \min_x z_{i4(1)} = \min_x(\beta^{\mathrm{Config}}_{41} - x^T\beta^{\mathrm{Key}}) = 0.283$. This is an indicator that for the sound system attributes the predictive power of the key attribute levels might be weak according to our model. The corresponding posterior distributions (densities) can be obtained from Figure A-1 in appendix A.

14 The DSP sound system was the only attribute level within the target attributes that could be chosen by the customers at no additional costs.

To assess the

goodness of fit for the model, we could use the pseudo-$R^2$ of Nagelkerke (1991) and Cragg and Uhler (1970), a measure of determination based on the idea of the $R^2$ from OLS regression. However, Hagle and Mitchell II (1992) state that if the underlying latent variable is available, the usual OLS $R^2$ itself is the measure of determination to be used. In the MCMC sampling algorithm described in appendix B, we sample the underlying latent variable $z_i$, with $z_i = X_i\beta$. Therefore, we can simply calculate the OLS $R^2$ to determine the variability explained by our model. Because we have MCMC samples, we can actually determine an empirical distribution of the $R^2$ of the underlying regression problem and determine a confidence region for this measure of determination. The resulting $R^2$ and $R^2_{adj.}$, respectively, have a 95% highest probability density region ranging from 0.59 to 0.61. Accordingly, on average, 60% of the variation in our data can be explained by our proposed model. This is reasonably good for real-life data. The corresponding densities of the empirical distributions of $R^2$ and $R^2_{adj.}$ are provided in Figure A-2 in appendix A. Next, we determine default sets that are expected to influence the profitability of the car configurator. These defaults are then tested in the associated field study.
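The post-processing of the sampling chains described in this section (discard the burn-in draws, then summarize the rest by a posterior mean and a 95% interval) can be sketched as follows. The sketch assumes an equal-tailed interval, which may differ from the interval type used in Table A-2, and it runs on a simulated stand-in chain rather than on output of the actual sampler; the function names are hypothetical.

```python
import random


def posterior_summary(draws, burn_in, level=0.95):
    """Posterior mean and equal-tailed interval from the draws kept after burn-in."""
    kept = sorted(draws[burn_in:])
    n = len(kept)
    mean = sum(kept) / n
    lower = kept[int((1 - level) / 2 * n)]
    upper = kept[min(n - 1, int((1 + level) / 2 * n))]
    return mean, (lower, upper)


def differs_from_zero(interval):
    """True if zero lies outside the posterior interval."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Stand-in chain of 100,000 simulated draws, NOT output of the actual sampler.
rng = random.Random(0)
chain = [rng.gauss(-0.5, 0.1) for _ in range(100_000)]
mean, interval = posterior_summary(chain, burn_in=50_000)
print(mean, interval, differs_from_zero(interval))
```

The same summary can be applied draw by draw to any derived quantity, for example an $R^2$ value computed from each sampled latent vector, to obtain its empirical distribution.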

3.1.3 Optimal default selection

In this section, we determine the optimal default set (8) as a solution of our max-

imization problem (7) and four other sets, which we will later use in the field

study. Due to the lack of cost data, we will neglect the costs (i.e., c(·) = 0) in

our calculations15 and simply focus on the joint turnover (sum of prices) coming

from the key and target attributes.16 The grid search evaluates the expectation

function E(profit|x) for the 480 possibilities of different default combinations

15 This can be done without loss of generality since (i) we want to investigate the interplay between previous and later choices conditional on defaults, and (ii) its effect on managerially relevant variables such as profit. If cost data are available, the results might be different in terms of different defaults and different predicted values but the implications remain the same.

16 In the following, we still refer to profit as the variable of interest, as in the general case the costs are not equal to zero.

Figure 3 Expected Profit Levels for Default-Combinations

a) Key-Attributes - Profit Directions (x-axis: Default Sets I to IV; y-axis: Predicted Directions; baseline: No Target Attributes Profits)

b) Target-Attributes - Profit Directions (x-axis: Default Sets; y-axis: Predicted Directions; baseline: Sample Average Profit for Target Attributes)

x. Table A-3 in appendix A provides the results. The optimal default set most

likely to maximize the expected profit E(profit|x) includes the business pack-

age, 19-inch aluminum rims, leather upholstery, a multi-function steering wheel

with a wheel mounted shifter and integrated heating, and, finally, a memory

function for the front seats. This basically tells us that in our obtained data

set, configurations including exactly these levels among the key attributes, on

average, result in the maximal joint profit from both the key and target attributes. It can also be seen that higher (lower) valued attribute levels are included in the optimal set (set I) for pre-set options. This is an indicator that these customers configured their cars on a luxury (economy) path. The other four combinations in Table A-3 were determined to provide a variety of expected outcomes for the field study: one that is likely to lead to a small profit, one that is likely to generate a high profit, and two that were specified to give insights into the sensitivity to changes among the pre-set options, that is, to lie between the expectations of the low-level (set I) and high-level (set IV) defaults. The performance of these four sets is investigated in the field study presented in the following section. We will later compare

the realized profit levels from the field study. Figure 3a) displays the expected

directions for the profit levels of the key attributes according to the default sets

I to IV. Figure 3b) provides the corresponding expected directions of the profit

levels as well as the no-default baseline for the target attributes according to

our model. As we can see (from Figure 3a)), the expectations for the profit in-

creases with respect to the key attributes are all positively directed which is due

to the fact that a consumer’s choice of one attribute level generates more profit

than no choice. We also obtain an increasing pattern within the key attributes

profit levels. This is simply explained by the increasing number of defaults and

the higher default levels, respectively; therefore, we see that the default levels

increase, say, from economy to luxury. The expected profit levels for the target

attributes (see Figure 3b)) draw a slightly different picture. We obtain nega-

tively directed profit increases, i.e. profit decreases, for the default sets I and II,

and we expect positive influences on the target attributes’ profits for the default

sets III and IV. In other words, we simply expect the average return from the

target attributes to be lower for the default sets I and II, and to be higher for the

default sets III and IV as compared to the average return for the target attributes

from configurations in the "no default"-condition.
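As a plausibility check on the size of the search grid, assume that each key attribute is either left without a default or pre-set to exactly one of its levels. This coding is an assumption made here, although it reproduces the reported number of combinations. With the five key attributes and their 1, 2, 4, 3 and 3 levels, the count is

```latex
\[
\underbrace{(1+1)}_{\text{business package}} \cdot
\underbrace{(2+1)}_{\text{rims}} \cdot
\underbrace{(4+1)}_{\text{upholstery}} \cdot
\underbrace{(3+1)}_{\text{steering wheels}} \cdot
\underbrace{(3+1)}_{\text{seats}}
= 2 \cdot 3 \cdot 5 \cdot 4 \cdot 4 = 480 .
\]
```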

In summary, we expect the first-order default effect to be positively directed throughout and to increase for higher-level defaults, and we expect the second-order

default effect to be negatively directed for the sets I and II, as well as positively

directed for the sets III and IV. In general, we expect different results for differ-

ent attribute levels being pre-set in the online car configurator. Consequently,

the presence of such expected differences in the performance of defaults should

motivate companies to conduct such types of analysis, as the results can lead

to improvements of product configurators and reveal sensitivities to different

default levels being used. We subsequently use the results of our model to test

the proposed default sets in the real online car configurator of the premium car

manufacturer. In the following, we provide the field study and the respective

results.

3.2 Field experiment: The impact of profit margin increasing defaults in a real online car configurator

In the previous section, we estimated our model and determined the default sets

to be tested in a real-life mass customization configurator. We now describe

how we proceeded with our field study, and discuss the corresponding results.

Furthermore, we discuss an additional survey, which was designed to investigate

how defaults affect customer satisfaction. The goal of this additional survey was

simply to see whether customers form a negative impression when confronted

with defaults within the customization process. The results show that we can

replicate the rising pattern of the expected profit levels in the actually realized profit levels for the default sets I to IV with respect to the key

attributes. For the target attributes, we can also replicate the decrease-increase

pattern and show that we get strong (statistically significant) second-order ef-

fects for the default sets I to IV. In addition, we also show that customers do not

form negative impressions due to defaults in the mass-customization process.

Therefore, we offer a reliable foundation for managerial usage in companies

offering product configurators as a sales and distribution strategy.

Modus operandi

For the field study, the authors were actually allowed to embed the several sets

of pre-set options (i.e., four default sets - I, II, III and IV) into the real online

car configurator. We randomly assigned one of the specified sets (see Table A-

3) to consumers using the car configurator of our premium manufacturer. To

recall the research objective: we investigate whether we can set consumers off on a high-margin path using appropriately determined pre-set options. We expect to replicate the rising pattern for the key attributes (first-order effect) in the generated profit levels conditional on the different sets of defaults displayed in Figure 3a), namely, for the default sets I, II, III and IV. For the target attributes, we expect a decrease-increase pattern as in Figure 3b). In the end,

we compare the average profit levels from key and target attributes of the five

conditions (I, II, III, IV and no defaults) with each other. The study lasted 49

weeks.

Results

After the 49 weeks (and 8,608 car configurations), we have 292 configurations

with Default Set I, 267 configurations with Default Set II, 248 configurations

with Default Set III, 238 configurations with Default Set IV and 7,149

configurations with no pre-set options at all. The remaining 414 configurations

have been removed from the data set due to the fact that these customers skipped

the infotainment attributes within their configuration process, and therefore an

analysis of the second-order effect was not possible. This means that 12.75% of the retained configurations (1,045 of 8,194) started with defaults. This implies that with approximately 8,700 stored car configurations per year (based on roughly 8,200 configurations in 49 weeks), the yearly profit multiplier for different

Figure 4 Generated Profit Levels for Default-Combinations

a) Realized Key-Attributes Profits, incl. Predicted Profit Directions (x-axis: Default Sets; y-axis: Profits|x). Profits|x for Default Sets I to IV: 3672.39, 3933.62, 4136.11, 4497.71. Average Profit without Defaults: Profits|0 = 3077.27.

b) Realized Target-Attributes Profits, incl. Predicted Profit Directions (x-axis: Default Sets; y-axis: Profits|x). Profits|x for Default Sets I to IV: 3748.25, 3658.91, 3854.46, 3575.39. No Defaults: Profits|0 = 3673.65.

default sets adds up to 1,109 (12.75% × 8,700; that is, about 1,109 customers per year accept starting their configuration with a set of pre-set options). The realized profit levels generated under the named conditions are displayed in Figure 4a). The figure shows that we fully replicated the increasing pattern of the four default sets in the generated profit levels for the key attributes. In other words, the different default sets influenced the average profit in the predicted direction, i.e., they increased the generated revenue. We also see that the effects differ in strength, which is consistent with our prediction that higher-level defaults increase the profit more strongly. We assessed the statistical significance of the key attribute profits through pairwise comparisons.

The results are displayed in Table 1. From the table we see that there exist

significant differences with respect to different default levels, and most impor-

tantly all default sets generate significantly higher profits for the key attributes

than the configurations without any pre-set options. This provides evidence

for the first-order default effect already discussed in various past research (e.g.,

Brown and Krishna 2004). This research has shown that consumers frequently

follow defaults when making their actual attribute level choice. Looking closer

at our results, we can confirm the findings of these previous studies. Table 2

provides the frequencies in percentages of how often customers have kept our

default suggestions within the different default sets. The first column of this ta-

ble contains the choice frequencies in the "no-default"-condition. First, we see

that the default acceptance rates for the different attribute levels range from 9%

to 87%, and are always higher than the choice frequencies in the "no default"-

condition. Depending on the attribute level, we consequently get rather high

Table 1 Pairwise Profit Comparisons

Pairwise t-tests for key attribute profit levels (p-values)

                   Default Set I   Default Set II   Default Set III   Default Set IV
No Defaults        <0.0001         <0.0001          <0.0001           <0.0001
Default Set I                      0.0566           0.0081            <0.0001
Default Set II                                      0.4469            0.0123
Default Set III                                                       0.0826

Pairwise t-tests for target attribute profit levels (p-values)

                   Default Set I   Default Set II   Default Set III   Default Set IV
No Defaults        0.2256          0.8196           0.0050            0.1851
Default Set I                      0.3062           0.2194            0.0676
Default Set II                                      0.0278            0.3869
Default Set III                                                       0.0037

acceptance rates. Second, we conducted several χ2-tests to determine the sta-

tistical significance of the higher choice frequencies for the four default sets.

All tests revealed significant differences for all attribute levels compared to the

baseline at significance levels less than α = 0.01. Finally, these results for the

key attributes provide strong additional evidence for the first-order effect of defaults in mass-customization. The generated profit levels for the target attributes

are provided in Figure 4b). The profit levels only match the predictions of our

model for default sets II and III. For the remaining default sets I and IV, we get

the opposite directions in the results. The corresponding significance levels for

the profit differences for the four default sets and the "no default"-baseline can,

as for the key attributes, be obtained from Table 1. Here, the only significant

differences occur between default set III and the "no default"-condition, set II

and III, and set III and IV, respectively. From these results, we can conclude

that there exists a "window" for defaults in mass customization for which our

model is able to perform the correct prediction on profit directions for the target

attributes. A company’s challenge then is to determine this window in order to

adequately apply pre-set options in their product configurator. As we can see, an

overly extreme high-end default set can have the opposite of the intended effect. This

might be explained by boundary effects: at the higher default levels, customers

remain aware of their high spending early in the mass-customization process and

take it into account in the current choice. Unfortunately, we


Table 2 Comparison of Default Choice Frequency with No Default Baseline

Attribute Level                   No Defaults   Default Sets I-IV (acceptance rates where the level is pre-set)
Business Package                  71.86%        87.33%***   83.90%***   81.93%***
18-Inches Aluminum Rims           15.29%        34.08%***
19-Inches Aluminum Rims           11.81%        26.61%***   20.17%***
Upholstery Fabric Mistral         16.97%        25.09%***
Upholstery Fabric Arkana          1.19%         9.59%***
Upholstery Leather Valcona        16.39%        27.42%***
Upholstery Leather Milano         9.37%         30.67%***
Multi-Function Steering Wheel     37.68%        54.68%***   51.21%***   65.13%***
Memory-Function for Frontseats    5.86%         32.60%***   44.96%***

*** p-value<0.001

did not have access to individual-level data that would have allowed us to incorporate

budget constraints into our analysis. Other reasons for the mismatch in

prediction for default set I and default set IV might simply be that not

all customers kept all default levels as choices within their car configuration, or

that they made additional choices within the key attributes, or even among attributes

not considered in the analysis and therefore not captured by the model. At any rate,

the aggregate results for the target attributes are worth a closer look.

As previously mentioned, the default acceptance rates under single-attribute

consideration for the several pre-set options range from 9% to 87%. But considering

the default sets as a whole, and separating the customers into "accepters"


and "rejecters" can then reveal the true model performance. Therefore, we separated

each data set for each default condition (set I, set II, set III and set IV)

into those configurations that accepted more than half of the defaults proposed

in the condition and those configurations that accepted less than

half of the pre-set options. Figure 5 shows the split profits for all

four default sets with respect to the two groups. From the figure, we

see that the predicted directions of profit increases for all default sets match

the realized profit levels for the group of default-accepters. For the default sets

I, II and IV, we see that the mean profits from the target attributes for the "re-

jecters" are significantly17 greater (set I and II), and significantly less (set IV)

than the mean profit from the "no default"-baseline, as opposed to the predic-

tions by our model. For default set III the effect for the "rejecters" is in the

same direction as for the "accepters". This explains the correct prediction for

the aggregate data for default set III. The correct prediction for default set II is

due to the interplay of sample size and difference margins such that the opposite

direction of the effect for the "rejecters" is of no consequence. Also considering

the sample sizes in the conditions and difference margins for default sets I and

IV, the contrary second-order effects obtained for the "rejecters" overwhelm the

predicted second-order effects. What we have shown here is the existence of

second-order default effects in mass-customization. For customers that follow

the defaults, our model accurately predicts the direction of the second-order

effect. The "rejecters" may spend considerable cognitive effort on rejecting

the defaults and consequently do not adopt the underlying

mindset that produces a positive second-order effect. Altogether, this is evidence that

with intentionally chosen pre-set options the company can achieve a customer

lock-in to high margin decision paths. In other words, if companies are able to

steer their customers into such a high margin path early on in the configuration

process, they can benefit later on from easily generated profit at no additional

cost. The results also show that one should be careful in the use of defaults and

that the agenda definitely cannot follow the belief that "a higher default level

generally translates into higher profit levels". Our analysis provides evidence

for two types of responses to defaults, (i) the acceptance response which can be

17 The significance of the differences in the target attribute profits between the separated groups was assessed

by two-sample statistical tests. For the "accepters" in default sets I and III, we conducted Wilcoxon rank-sum tests

because the sample sizes are small (n=24, 31); for the "rejecters", we conducted t-tests. Significance levels in Figure 5

are indicated as follows: * p-value<.10, ** p-value<.05, *** p-value<.01


Figure 5 Generated Profit Levels of Target Attributes for Default-Combinations - Profit Split -

[Figure: four bar-chart panels. Vertical axis: Profits|x. Horizontal axis: "More than half the defaults accepted" ('ACCEPTERS') and "Less than half the defaults accepted" ('REJECTERS'). Each panel marks the predicted direction.]

a) Default set I (N=292), profit split: 'ACCEPTERS' n1=24, 'REJECTERS' n2=268; Profits|x=2968.75 ***, Profits|x=3818.06 ***

b) Default set II (N=267), profit split: Profits|x=3271.47 ***, Profits|x=3810.26 **

c) Default set III (N=248), profit split: Profits|x=3763.71, Profits|x=3867.42 ***

d) Default set IV (N=238), profit split: Profits|x=3698.16, Profits|x=3481.74 *

utilized according to our model, and (ii) the rejection response, which leads to a

second-order effect in the opposite direction. The second type of response to the

defaults is critical and can have several different origins. One explanation

might be the boundary effect that is responsible for the contrasting shift of

profit levels among the target attributes when the defaults become excessive.

There could be two arguments for this boundary effect: (i) the first-order default

effect is too strong, and consequently focalism and/or the selective accessibility

mechanism cannot explain the decision behavior, as customers are still aware

of high-level attribute choices made previously (due to high-level defaults), and

(ii) because customers frequently follow the manufacturer’s pre-specified lev-

els (Brown and Krishna 2004, Johnson et al. 2002, McKenzie et al. 2006, Park

et al. 2000), the design effort is not high enough to positively affect the willing-

ness to pay (Franke and Schreier 2010, Franke et al. 2010). The investigation of

the origins for such boundary effects and/or the assessment of the explanations

for the different response types would be beyond the scope of this paper, but is


certainly a subject for further research. Here, we simply focus on the existence of

such first- and second-order default effects and provide a conceptual framework

to utilize these effects from a firm’s perspective. The effects are not limited to

positive or negative directions only, as our model can handle both directions of

first- and second-order default effects.
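The split into "accepters" and "rejecters" described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code of the study: the record layout (the fields `accepted` and `profit`), the helper names and the toy values are hypothetical. The last part uses group sizes and means reported in Figure 5 and footnote 17; it assumes that the two values of each panel are listed in the order accepters, rejecters, and that every configuration falls into one of the two groups (as it does for set I, where 24 + 268 = 292).

```python
from statistics import mean


def split_by_acceptance(configs, n_defaults):
    """Split configurations by how many of the pre-set options they kept.

    Accepters kept more than half of the defaults, rejecters less than half.
    A configuration that kept exactly half lands in neither group (assumption,
    following the wording of the text).
    """
    half = n_defaults / 2
    accepters = [c for c in configs if c["accepted"] > half]
    rejecters = [c for c in configs if c["accepted"] < half]
    return accepters, rejecters


def pooled_mean(groups):
    """Weighted mean of (group size, group mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# Hypothetical toy records: number of accepted defaults and target-attribute profit.
toy_configs = [
    {"accepted": 4, "profit": 3000.0},
    {"accepted": 1, "profit": 3900.0},
    {"accepted": 0, "profit": 3700.0},
    {"accepted": 3, "profit": 3100.0},
]
accepters, rejecters = split_by_acceptance(toy_configs, n_defaults=5)
accepter_mean = mean(c["profit"] for c in accepters)
rejecter_mean = mean(c["profit"] for c in rejecters)

# Group sizes and means as read from Figure 5 (see the assumptions above).
set_i_aggregate = pooled_mean([(24, 2968.75), (268, 3818.06)])
set_iii_aggregate = pooled_mean([(31, 3763.71), (248 - 31, 3867.42)])

print(accepter_mean, rejecter_mean)
print(round(set_i_aggregate, 2), round(set_iii_aggregate, 2))
```

Because the "rejecters" make up most of each sample, their mean dominates the aggregate. Under the assumptions above, the pooled value for default set III is 3,854.46, the aggregate target-attribute profit level used in the yearly profit calculation.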

To better quantify these results in a managerially relevant measure, the overall

profit level per year, recall that the yearly multiplier for the additional profit

generated by different default sets is 1,109. Considering Figure 4a), we see

that between approximately 659,988.08 (= [3,672.39 − 3,077.27] × 1,109) and

1,530,907.96 (= [4,457.71 − 3,077.27] × 1,109) of additional yearly profit can

be realized among the key attributes, depending on the default set, compared

to no fixed pre-set options. For the target attributes, the profit levels are a

mixed blessing (see Figure 4b)). Here, the additional yearly profit ranges from

a loss of 108,970.34 (= [3,575.39 − 3,673.65] × 1,109) to a gain of 200,518.29

(= [3,854.46 − 3,673.65] × 1,109). This indicates that it is necessary to conduct

such an analysis when using default options in mass customization. Consequently,

with proper defaults, the additionally generated profit can be

fairly large (e.g., 1,374,771.85 in joint profit for default set III compared to the

no-default results), keeping in mind that the profit driver, setting the defaults,

does not cost the company a single cent.
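The yearly figures in this paragraph follow from one formula: the difference between the mean profit under a default set and the no-default baseline, multiplied by the yearly multiplier of 1,109. The short sketch below recomputes the four reported bounds. The function and variable names are illustrative assumptions, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

YEARLY_MULTIPLIER = Decimal("1109")  # yearly multiplier stated in the text


def additional_yearly_profit(profit_with_defaults, profit_no_defaults):
    """(mean profit with defaults - mean profit without defaults) x multiplier."""
    difference = Decimal(profit_with_defaults) - Decimal(profit_no_defaults)
    return difference * YEARLY_MULTIPLIER


# Key attributes (Figure 4a): baseline 3,077.27
key_low = additional_yearly_profit("3672.39", "3077.27")
key_high = additional_yearly_profit("4457.71", "3077.27")

# Target attributes (Figure 4b): baseline 3,673.65
target_low = additional_yearly_profit("3575.39", "3673.65")
target_high = additional_yearly_profit("3854.46", "3673.65")

for label, value in [
    ("key attributes, lower bound", key_low),
    ("key attributes, upper bound", key_high),
    ("target attributes, lower bound", target_low),
    ("target attributes, upper bound", target_high),
]:
    print(f"{label}: {value:,.2f}")
```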

3.3 Customer Satisfaction

Customer satisfaction should also be taken into consideration when considering

the use of defaults to increase profit margins. The question to be investigated

is how pre-set options, as an encroachment on the customer’s free decision-making

process, affect satisfaction. One could assume that, besides feeling supported in their

decisions by defaults, individuals might also feel patronized in their decision-making,

which could, as a consequence, lower customer satisfaction and hurt

the company’s image. This would be a negative side effect of default strategies

in mass customization. To address this issue, we designed a follow-up

study in which we asked people to go through the exact same decision-making

process as in the real online car configurator. For this study, 175 respondents were

drawn from an automotive panel that consisted of people who were planning


on buying a new car within the next 12 months (average age: 45.64, 48.81%

females). They were randomly assigned to one of five conditions consisting

of "no defaults", "set I", "set II", "set III" and "set IV". At the end, we asked

people to answer several questions on seven-point Likert scales. These ques-

tions are indicators for three latent constructs, namely customer satisfaction,

process complexity, and preference certainty. Together, these three dimensions

are used as an overall measure of customer satisfaction in our specific context.

The initial measurement scales can be obtained from Table A-4. The internal

consistency was reasonably good for all three scales (customer satisfaction:

Cronbach’s α =0.88, AVE=0.62; process complexity: Cronbach’s α =0.74,

AVE=0.44; preference certainty: Cronbach’s α =0.91, AVE=0.78). There-

fore, for further analysis, we calculated the mean scores and conducted a mul-

tivariate analysis of variance (MANOVA). The MANOVA was applied to see

whether there are any differences in the overall customer satisfaction scores

between customers that configured their car in the different conditions. We

could not determine any statistically significant differences (Pillai’s trace=0.04,

F(12,492)=0.61, p-value=0.84).18 This implies that intentionally set defaults

do not negatively affect customer satisfaction. In addition to determining the

effect of defaults on customer satisfaction, we also assessed whether the customers

felt patronized when they started their configurations with pre-set options.

Respondents in the default conditions answered additional questions beyond those

on customer satisfaction. These additional questions are indicators for the

latent construct of perceived patronization when configuring the vehicle. The

measurement scale can be obtained from Table A-5. The reliability of the scale

was reasonably good (Cronbach’s α=0.85, AVE=0.87). An analysis of vari-

ance (ANOVA) did not detect any significant differences among the different

default conditions with respect to perceived patronization (F(3,130)=1.07, p-

value=0.37). With a mean score of 2.81, this indicates that people do not feel

patronized in their decision-making when they are confronted with intentionally

set defaults. As a last question, all of the respondents (all conditions) were asked

to also state their buying probability on a seven-point scale anchored at (1) very

unlikely and (7) very likely. An Analysis of Variance (ANOVA) for the buy-

18 The other three test statistics for MANOVA (Wilks’ λ, Hotelling-Lawley trace and Roy’s maximum root)

also indicate no significant differences (p-values>0.27).


ing probability also failed to detect any significant differences (F(1,167)=0.61,

p-value=0.44). Altogether, we provide strong evidence that customer satisfaction

and perceived patronization are not an issue for default strategies in mass

customization when they are applied properly.
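The scale reliabilities reported above are Cronbach’s α values. As a reminder of how this coefficient is obtained from item-level responses, the following minimal sketch computes α = k/(k−1) × (1 − Σ item variances / variance of the sum score) for a small hypothetical data set. The responses are invented for illustration and are not the study’s data; the function name is likewise an assumption.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                    # one tuple of scores per item
    item_variances = [variance(item) for item in items]
    total_scores = [sum(row) for row in responses]   # sum score per respondent
    return k / (k - 1) * (1 - sum(item_variances) / variance(total_scores))


# Hypothetical seven-point Likert responses: 5 respondents, 3 items.
toy_responses = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))

# Identical items are perfectly consistent, so alpha equals 1.
identical_items = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha_identical = cronbach_alpha(identical_items)
```

The mean scores that enter the MANOVA and ANOVA are then simply the per-respondent averages of the items belonging to each scale.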

So far, we have shown that the methodology developed in Section 2

can be adequately used for analyzing mass-customization systems with respect

to first- and second-order effects. The main goal was to show how such ef-

fects can influence the generated profit levels of attribute combinations. We

show that the results can be used to determine optimal strategies for pre-set op-

tions, assuming that consumers frequently follow the manufacturer’s proposed

attribute level as shown in various past research (e.g., Johnson et al. 2002). We

can confirm this effect and extend its influence on subsequent decisions and

provide evidence for a second-order default effect. In addition, we have shown

that defaults do not have a negative effect on customer satisfaction, which would

be critical when implementing defaults in online configurators. In the following, we

conclude with a discussion, implications and directions for further research.

4. Conclusion

Mass customization is a growing business practice and a common strategy in the

automotive industry to simultaneously support consumer choice and increase

firm profits. Default strategies are commonly used to support the customer’s

decision-making process. Hereafter, we discuss our findings, draw implications,

reflect on limitations and motivate further research.

4.1 Discussion

In a mass-customization environment, customers have to make several decisions

regarding the desired levels of all required attributes. It is common business

practice for companies offering product configurators to support customers in

such decisions. Key drivers for customers’ willingness to pay are the prefer-

ence fit and design effort. Companies are challenged to strike a balance be-

tween these two dimensions (Dellaert and Stremersch 2005). To do so, defaults


can be used that play a decisive role in the customer decision-making process,

and companies provide such defaults for selected attribute levels to support cus-

tomers with their choices. Product configurations can be seen as multicategory

decision processes where due to the complementary nature of attribute levels,

a choice in one category affects the selection of attribute levels in other cate-

gories. When consumers design their desired product, it comprises individual

selections of attributes and their corresponding levels from each category. We

propose a statistical model incorporating the interplay between previous and

subsequent attribute selections. For our statistical analyses and empirical in-

vestigations, we do not use hypothetical data and convenience samples as in

various past studies. We analyze real life data from customers of a premium car

manufacturer. We show that previously chosen attribute levels have a significant

effect on subsequent option choices. The customers’ willingness to pay

is positively affected by a higher preference fit and adequate design effort

(Franke et al. 2010)19. Because in a mass-customization system the customer

participation happens during the configuration process (Piller et al. 2004), this

effect can be utilized from a firm’s perspective to increase the profitability of

mass customization. We provide additional evidence for strong first-order de-

fault effects as already shown in various past research as well as second-order

default effects of opposite directions (for two different response types) for these

pre-set options on later choices in the decision sequence.

4.2 Managerial and Research Implications

For the implementation of mass customization, companies can use our model to

specify combinations of pre-set options that are most likely to generate high net

profit levels according to their individual cost structure. Such profit levels can

easily be generated by simply using intentionally set defaults, that is, defaults

determined by our proposed methodology. Put differently, and using the ter-

minology of path dependence, we show how consumers can be "locked-in" to

high-margin decision paths, leading to optimal outcomes from firms’ perspec-

tives, utilizing the strong first- and second-order default effects. Further, the

model discussed in this paper is extremely flexible in its application, as it can

19Since a car is a high involvement product, we can assume the outcome for the customer is satisfactory (Franke

et al. 2009) and therefore design effort also positively affects the willingness to pay.


be applied to only a limited set of attributes of interest as well as to the entire

set of attributes within the configuration tool. We also provide an easily

implementable grid search optimization to determine optimal pre-set options for

the configurator. Furthermore, constraints such as a limited number of defaults

and/or restricted positioning of defaults within the configurator can easily be

achieved by adjusting the support set of possible pre-set combinations within

the optimization procedure. The results also strengthen the necessity of such

an analysis for default application in mass-customization systems. Companies

cannot simply apply as many defaults as possible to skim additional profits due

to the discussed default effects. We provide evidence for two different response

types to defaults that lead to different directions of the second-order effect, and for

the possibility of boundary effects; consequently, defaults must be used

carefully. To utilize the discussed effects with respect to the

profitability of mass customization, we recommend that companies conduct

such an analysis of the direction of first- and especially second-order

default effects.
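A grid search over pre-set combinations, including a cap on the number of defaults, can be sketched as follows. This is a hedged illustration and not the optimization code of the paper: the attribute names, the margin table and `toy_expected_profit` are hypothetical stand-ins for the expected profit that would come from the fitted model, and the penalty term is only a stylized boundary effect.

```python
from itertools import product


def grid_search_defaults(levels_per_attribute, expected_profit, max_defaults=None):
    """Enumerate all default combinations and return the most profitable one.

    None stands for "no default pre-set for this attribute". Restricting the
    support set (here: at most max_defaults pre-set options) implements constraints.
    """
    attributes = list(levels_per_attribute)
    candidates = [[None] + list(levels_per_attribute[a]) for a in attributes]
    best_combo, best_profit = None, float("-inf")
    for choice in product(*candidates):
        n_set = sum(level is not None for level in choice)
        if max_defaults is not None and n_set > max_defaults:
            continue  # combination lies outside the allowed support set
        combo = dict(zip(attributes, choice))
        profit = expected_profit(combo)
        if profit > best_profit:
            best_combo, best_profit = combo, profit
    return best_combo, best_profit


# Hypothetical margins per pre-set level (illustration only).
TOY_MARGIN = {
    ("rims", "18in"): 100.0,
    ("rims", "19in"): 180.0,
    ("upholstery", "fabric"): 50.0,
    ("upholstery", "leather"): 220.0,
    ("steering_wheel", "multi_function"): 90.0,
}


def toy_expected_profit(combo):
    """Stand-in for model-based expected profit, with a stylized boundary effect."""
    profit = sum(TOY_MARGIN[(a, lvl)] for a, lvl in combo.items() if lvl is not None)
    if combo["rims"] == "19in" and combo["upholstery"] == "leather":
        profit -= 150.0  # too extreme a high-end default set backfires
    return profit


levels = {
    "rims": ["18in", "19in"],
    "upholstery": ["fabric", "leather"],
    "steering_wheel": ["multi_function"],
}
best, best_value = grid_search_defaults(levels, toy_expected_profit)
best_two, best_two_value = grid_search_defaults(levels, toy_expected_profit, max_defaults=2)
print(best, best_value)
print(best_two, best_two_value)
```

In this toy setting the most expensive combination is not the most profitable one, which mirrors the point that companies cannot simply apply as many high-level defaults as possible.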

From a scientific perspective, we show that there exists a window in which

the first- and second-order effects have the same direction for both response

types, and lead to higher-level attribute choices. The rationale for the occur-

rence of such a high-margin window might be a combination of three different

mechanisms: focalism, the selective accessibility mechanism, and customization

effects. Focalism operates against budget constraints, as customers focus heavily

on the most recent decision and give less weight to the fact that

they have already made expensive high-level choices. Selective accessibility

mechanisms also weaken budget constraints and support complementary effects,

such that attribute levels are chosen according to their combined fit, and

attribute levels of equal rank usually tend to fit better together (e.g., high level with

high level). Customization effects, such as preference fit and design effort, in-

crease the customers’ willingness to pay and therefore also lower the sensitivity

to budget constraints. The equidirectionality of the first- and second-order ef-

fects cannot be exploited infinitely. Consequently, there has to be a boundary

mechanism whose rationale cannot be explained by our approach.


4.3 Limitations and Future Research

One limitation of our findings is that there may be significant effects

among attribute combinations not captured in our model because, in total, only

nine attributes were considered. Another aspect to mention is that

we did not explicitly model interaction effects between subsequent option

choices in the group of target attributes. We only accommodated such possible

effects through a full correlation structure. In future work, it could be

investigated how such interaction effects influence subsequent option choices.

To do so, one would have to model the interaction effects explicitly, which

would make it possible to disentangle the attribute level associations more precisely. Our

approach was targeted more toward managerially relevant output variables such

as profit. An analysis of the entire attribute set (approximately

250 with accessories) and the incorporation of explicit interaction parameters

(in our case 23 more parameters when considering only two-way interactions) is

possible but remains a challenging task, as it requires intense computational

effort. As discussed throughout the paper, we analyze field data from an online

car configurator. This means that the configured cars have not necessarily been

purchased in the exact same setup, although these configurations are closer to

reality than hypothetically elicited data. A question for further investigation

is how well actual orders match the corresponding previously made car

configurations. Another direction for further research is to determine and

disentangle exactly which mechanisms are responsible for equidirectional effects and

how they are related. Here, an interesting question concerns the determinants

that define the different response types to defaults. Additionally, as discussed to

some extent in this paper, we observe boundary effects or directional changes for

the second-order default effects. The origins of such boundary effects

are clearly a subject for further research. One approach to analyzing

the occurring interplay is to incorporate defaults in the analyses conducted by

Franke and Schreier (2010), and Franke et al. (2010) regarding the economic

value of mass-customized products from a customer’s perspective. Also, due

to a lack of data, we did not incorporate budget constraints into our analysis.

Further research should attempt to account for such mechanisms as well. An-

other body of research approaches mass customization within social networks

(Franke et al. 2008, Moreau and Herd 2010). Here, one could investigate how feedback

systems influence customer choices at several stages in the customization

process. We hope to motivate further research in this and related directions.

To summarize, we give evidence for strong effects of pre-set options on the

attributes themselves as well as significant effects of those pre-set options on

subsequent choices (first- and second-order default effects). Further, we pro-

vide the methodology for practitioners to conduct a sufficient analysis of such

effects in product configurators. We describe the procedure from the attribute

selection to default optimization. Therefore, the results of such an analysis can

support companies in developing an elaborate strategy for pre-setting options in

mass-customization systems. Supported by our field study, we show that such

a strategy is required to skim additional profit levels at almost no cost. Finally,

we confirm that customer satisfaction does not suffer from intentionally set de-

faults in product configurators. Additionally, we could also show that defaults

in a product configurator do not have a negative impact on perceived patroniza-

tion and buying probability. This means that, for our application, individuals

configuring their car online did not negatively perceive the pre-set options in

the decision-making process.


References

Albert, J. H., S. Chib. 1993. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association 88(422) 669–679.

Alford, D., P. Sackett, G. Nelder. 2000. Mass customisation – an automotive perspective. International Journal of Production Economics 65(1) 99–110.

Argo, J. J., D. W. Dahl, R. V. Manchanda. 2005. The influence of a mere social presence in a retail context. Journal of Consumer Research 32(2) 207–212.

Brown, C. L., A. Krishna. 2004. The skeptical shopper: A metacognitive account for the effects of default options on choice. Journal of Consumer Research 31(3) 529–539.

Chib, S., E. Greenberg. 1995. Understanding the Metropolis-Hastings algorithm. The American Statistician 49(4) 327–335.

Cragg, J. G., R. S. Uhler. 1970. The demand for automobiles. The Canadian Journal of Economics 3(3) 386–406.

Davis, S. M. 1987. Future Perfect. Addison-Wesley Publishing, Reading, MA.

Dellaert, B. G. C., S. Stremersch. 2005. Marketing mass-customized products: Striking a balance between utility and complexity. Journal of Marketing Research 42(2) 219–227.

Duray, R., P. T. Ward, G. W. Milligan, W. L. Berry. 2000. Approaches to mass customization: configurations and empirical validation. Journal of Operations Management 18(6) 605–625.

Franke, N., P. Keinz, M. Schreier. 2008. Complementing mass customization toolkits with user communities: How peer input improves customer self-design. Journal of Product Innovation Management 25(6) 546–559.

Franke, N., P. Keinz, C. J. Steger. 2009. Testing the value of customization: When do customers really prefer products tailored to their preferences? Journal of Marketing 73(5) 103–121.

Franke, N., M. Schreier. 2010. Why customers value self-designed products: The importance of process effort and enjoyment. Journal of Product Innovation Management 27(7) 1020–1031.

Franke, N., M. Schreier, U. Kaiser. 2010. The "I designed it myself" effect in mass customization. Management Science 56(1) 125–140.

Gabaix, X., D. Laibson, G. Moloche, S. Weinberg. 2006. Costly information acquisition: Experimental analysis of a boundedly rational model. The American Economic Review 96(4) 1043–1068.

Geman, S., D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Analysis and Machine Intelligence 6 721–741.

Gentzkow, M. 2007. Valuing new goods in a model with complementarity: Online newspapers. The American Economic Review 97(3) 713–744.

Goldstein, D. G., E. J. Johnson, A. Herrmann, M. Heitmann. 2008. Nudge your customers toward better choices. Harvard Business Review 86(12) 99–105.

Hagle, T. M., G. E. Mitchell II. 1992. Goodness-of-fit measures for probit and logit. American Journal of Political Science 36(3) 762–784.

Häubl, G., B. G. C. Dellaert, B. Donkers. 2010. Tunnel vision: Local behavioral influences on consumer decisions in product search. Marketing Science 29(3) 438–455.

Homburg, C., N. Koschate, W. D. Hoyer. 2005. Do satisfied customers really pay more? A study of the relationship between customer satisfaction and willingness to pay. Journal of Marketing 69(2) 84–96.

Houston, D. A., D. R. Roskos-Ewoldsen. 1998. Cancellation and focus model of choice and preferences for political candidates. Basic & Applied Social Psychology 20(4) 305–312.


Imai, K., D. A. van Dyk. 2005a. A Bayesian analysis of the multinomial probit model using marginal data augmentation. Journal of Econometrics 124(2) 311–334.

Imai, K., D. A. van Dyk. 2005b. MNP: R package for fitting the multinomial probit model. Journal of Statistical Software 14(3) 1–32.

Iyengar, S., M. R. Lepper. 2000. When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology 96(6) 995–1006.

Johnson, E. J., S. Bellman, G. L. Lohse. 2002. Defaults, framing and privacy: Why opting in-opting out. Marketing Letters 13(1) 5–15.

Kotha, S. 1995. Mass customization: Implementing the emerging paradigm for competitive advantage. Strategic Management Journal 16 21–42.

Kotha, S. 1996. Mass-customization: a strategy for knowledge creation and organizational learning. Int. J. Technology Management 11(7/8) 846–858.

Levav, J., M. Heitmann, A. Herrmann, S. S. Iyengar. 2010. Order in product customization. The Journal of Political Economy 118(2) 274–299.

Liu, X., M. J. Daniels. 2006. A new algorithm for simulating a correlation matrix based on parameter expansion and re-parameterization. Journal of Computational and Graphical Statistics 15(4) 897–914.

Manchanda, P., A. Ansari, S. Gupta. 1999. The "shopping basket": A model for multicategory purchase incidence decisions. Marketing Science 18(2) 95–114.

McCulloch, R. E., N. G. Polson, P. E. Rossi. 2000. A Bayesian analysis of the multinomial probit model with fully identified parameters. Journal of Econometrics 99(1) 173–193.

McCulloch, R. E., P. E. Rossi. 1994. An exact likelihood analysis of the multinomial probit model. Journal of Econometrics 64(1–2) 207–240.

McKenzie, C. R. M., M. J. Liersch, S. R. Finkelstein. 2006. Recommendations implicit in policy defaults. Psychological Science 17(5) 414–420.

Miller, G. A. 1956. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review 63(2) 81–97.

Moreau, C. P., K. B. Herd. 2010. To each his own? How comparisons with others influence consumers’ evaluations of their self-designed products. Journal of Consumer Research 36(5) 806–819.

Muraven, M., R. F. Baumeister. 2000. Self-regulation and depletion of limited resources: Does self-control resemble a muscle? Psychological Bulletin 126(2) 247–259.

Mussweiler, T. 2003. Comparison processes in social judgment: Mechanisms and consequences. Psychological Review 110(3) 472–489.

Mussweiler, T., F. Strack. 1999. Hypothesis-consistent testing and semantic priming in the anchoring paradigm: A selective accessibility model. Journal of Experimental Social Psychology 35(2) 136–164.

Nagelkerke, N. J. D. 1991. A note on a general definition of the coefficient of determination. Biometrika 78(3) 691–692.

Park, C. W., S. Y. Jun, D. J. MacInnis. 2000. Choosing what I want versus rejecting what I do not want: An application of decision framing to product option choice decisions. Journal of Marketing Research 37(2) 187–202.

Piller, F. T., K. Moeslein, C. M. Stotko. 2004. Does mass customization pay? An economic approach to evaluate customer integration. Production Planning & Control 15(4) 435–444.

Randall, T., C. Terwiesch, K. T. Ulrich. 2005. Principles for user design of customized products. California Management Review 47(4) 68–85.


Randall, T., C. Terwiesch, K. T. Ulrich. 2007. User design of customized products. Marketing Science 26(2) 268–280.

Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing. Marketing Science 22(3) 304–328.

Salvador, F., P. M. de Holan, F. T. Piller. 2009. Cracking the code of mass customization. MIT Sloan Management Review 50(3) 71–78.

Schreier, M. 2006. The value increment of mass-customized products: an empirical assessment. Journal of Consumer Behaviour 5(4) 317–327.

Seetharaman, P. B., S. Chib, A. Ainslie, P. Boatwright, T. Chan, S. Gupta, N. Mehta, V. Rao, A. Strijnev. 2005. Models of multi-category choice behavior. Marketing Letters 16(3-4) 239–254.

Silveira, G. Da, D. Borenstein, F. S. Fogliatto. 2001. Mass customization: Literature review and research directions. International Journal of Production Economics 72(1) 1–13.

Song, I., P. K. Chintagunta. 2006. Measuring cross-category price effects with aggregate store data. Management Science 52(10) 1594–1609.

Sriram, S., P. K. Chintagunta, M. K. Agarwal. 2010. Investigating consumer purchase behavior in related technology product categories. Marketing Science 29(2) 291–314.

Wedel, M., J. Zhang. 2004. Analyzing brand competition across subcategories. Journal of Marketing Research 41(4) 448–456.

Wright, P. 2002. Marketplace metacognition and social intelligence. The Journal of Consumer Research 28(4) 677–682.

Zhang, X., W. J. Boscardin, T. R. Belin. 2006. Sampling correlation matrices in Bayesian models with correlated latent variables. Journal of Computational Graphics and Statistics 15 880–896.

Zhang, X., W. J. Boscardin, T. R. Belin. 2008. Bayesian analysis of multivariate nominal measures using multivariate multinomial probit models. Computational Statistics & Data Analysis 52(7) 3697–3708.


A. Tables and Figures

Table A-1 Attribute Specification: Confirmatory Contingency Analysis

Relation                                p-value (χ2-Test)   Cramer’s V
Business Package × Navigation System    <0.0001             0.212
Business Package × Radio                <0.0001             0.221
Business Package × Phone Equipment      <0.0001             0.126
Business Package × Sound System         0.0283              0.053
Rims × Navigation System                <0.0001             0.147
Rims × Radio                            <0.0001             0.142
Rims × Phone Equipment                  <0.0001             0.104
Rims × Sound System                     <0.0001             0.230
Upholstery × Navigation System          <0.0001             0.159
Upholstery × Radio                      <0.0001             0.202
Upholstery × Phone Equipment            <0.0001             0.217
Upholstery × Sound System               <0.0001             0.153
Steering Wheel × Navigation System      <0.0001             0.100
Steering Wheel × Radio                  <0.0001             0.129
Steering Wheel × Phone Equipment        <0.0001             0.094
Steering Wheel × Sound System           <0.0001             0.195
Frontseats × Navigation System          <0.0001             0.204
Frontseats × Radio                      <0.0001             0.180
Frontseats × Phone Equipment            <0.0001             0.136
Frontseats × Sound System               <0.0001             0.250

Table A-2 Parameter Estimates: Posterior Means and 95% Posterior Intervals

Variable Label                                                                  Parameter  Posterior Mean  95% Posterior Interval
Configuration with - MMI Navigation                                             βConfig11  -2.5952         (-2.7931, -2.4132)
Configuration with - MMI Navigation Plus                                        βConfig12  -0.0040         (-0.0056, -0.0023)
Configuration with - MMI Radio Plus                                             βConfig21  -0.4063         (-0.4534, -0.3521)
Configuration with - Cell Phone Setup                                           βConfig31  -0.0791         (-0.1284, -0.0299)
Configuration with - Bluetooth Phone                                            βConfig32  -0.0293         (-0.0315, -0.0271)
Configuration with - Bluetooth Phone (Wireless Remote)                          βConfig33  -0.0426         (-0.0455, -0.0398)
Configuration with - DSP-Soundsystem                                            βConfig41   0.3081         (0.2563, 0.3570)
Configuration with - BOSE Surround Sound                                        βConfig42  -0.0003         (-0.0023, 0.0019)
Business-Package                                                                βKey11      0.0105         (0.0090, 0.0120)
18-Inch Aluminum Rims                                                           βKey21     -0.0079         (-0.0098, -0.0057)
19-Inch Aluminum Rims                                                           βKey22     -0.0096         (-0.0118, -0.0076)
Upholstery Fabric Mistral                                                       βKey31      0.0046         (0.0025, 0.0067)
Upholstery Fabric Arkana                                                        βKey32      0.0100         (0.0047, 0.0152)
Upholstery Leather Valcona                                                      βKey33     -0.0068         (-0.0086, -0.0047)
Upholstery Leather Milano                                                       βKey34     -0.0051         (-0.0077, -0.0026)
Multi-Function Steering Wheel (4 Crossing Design)                               βKey41     -0.0046         (-0.0064, -0.0031)
Multi-Function Steering Wheel + Shift Compensator (4 Crossing Design)           βKey42     -0.0081         (-0.0109, -0.0055)
Multi-Function Steering Wheel (heated) + Shift Compensator (4 Crossing Design)  βKey43     -0.0095         (-0.0130, -0.0053)
Electric Frontseats                                                             βKey51     -0.0151         (-0.0178, -0.0124)
Memory-Function for Driver’s Seat (with electric adjustable Frontseats)         βKey52     -0.0155         (-0.0183, -0.0126)
Memory-Function for Frontseats                                                  βKey53     -0.0169         (-0.0201, -0.0139)

Table A-3 Default Sets

Key-Item                                                                        Default Set I    Default Set II   Default Set III  Default Set IV   Optimum
Business-Package                                                                YES              YES              NO               YES              YES
18-Inch Aluminum Rims                                                           NO               YES              NO               NO               NO
19-Inch Aluminum Rims                                                           NO               NO               YES              YES              YES
Upholstery Fabric Mistral                                                       NO               YES              NO               NO               NO
Upholstery Fabric Arkana                                                        YES              NO               NO               NO               NO
Upholstery Leather Valcona                                                      NO               NO               YES              NO               YES
Upholstery Leather Milano                                                       NO               NO               NO               YES              NO
Multi-Function Steering Wheel (4 Crossing Design)                               NO               YES              YES              YES              NO
Multi-Function Steering Wheel + Shift Compensator (4 Crossing Design)           NO               NO               NO               NO               NO
Multi-Function Steering Wheel (heated) + Shift Compensator (4 Crossing Design)  NO               NO               NO               NO               YES
Electric Frontseats                                                             NO               NO               NO               NO               NO
Memory-Function for Driver’s Seat (with electric adjustable Frontseats)         NO               NO               NO               NO               NO
Memory-Function for Frontseats                                                  NO               NO               YES              YES              YES

Table A-4 Initial measurement scales

Latent Variables with Indicators (Scale Based on)

Satisfaction (Homburg et al. 2005)
SAT1    All in all, I would be satisfied with my choices.
SAT2    The configuration of the attributes is exactly what I wanted.
SAT3    The choices I made would not meet my expectations. (R)
SAT4    Would I have to choose among the same alternatives again, I would make the same decisions.
SAT5    I have a good feeling considering the choices I just made.

Process Complexity (Dellaert and Stremersch 2005)
COMPL1  I perceived the decision-making process as effortful.
COMPL2  The configuration of the attributes was pleasant. (R)
COMPL3  The decision-making was difficult for me.
COMPL4  The configuration of the attributes was so much fun that I forgot about the time. (R)
COMPL5  I perceived the decision-making process as complicated.

Preference Certainty (Argo et al. 2005)
PREF1   I am sure that I made the right choices.
PREF2   I am certain that the choices I made meet my expectations.
PREF3   I am confident that I have identified the alternatives best meeting my needs.

Notes: All measures were assessed on seven-point scales, anchored by "strongly disagree" (1), and "strongly agree" (7). R = reverse scored.

Table A-5 Patronization measurement scale

Latent Variable with Indicators

Patronization
PATR1   Through the pre-set options I felt constricted in my decisions.
PATR2   Through the pre-set options I felt irritated.
PATR3   Through the pre-set options I felt pressured in the decision-making process.

Notes: All measures were assessed on seven-point scales, anchored by "strongly disagree" (1), and "strongly agree" (7).

Figure A-1 Estimated Densities for the Model Parameters

[Figure: 21 panels titled "Density of beta 1" to "Density of beta 21"; each panel plots the estimated density f(βk) against βk, k = 1, ..., 21.]

Figure A-2 Densities - R2 and R2adj.

[Figure: densities f(R2) and f(R2adj.) plotted over the range 0.58 to 0.62; 95% HPD Interval [0.59, 0.61].]

B. MCMC sampling

As introduced in sections 2.2 and 2.3, the MVMNP model assumes that, given a set of explanatory variables, the multivariate multinomial response is an indicator of the event that some unobserved latent variable vector falls within a certain region. The latent variable is assumed to arise from the multivariate normal distribution, $z_i \sim N(X_i\beta, \Sigma)$. The likelihood of the observed discrete data $d = (d_1, \ldots, d_n)$ is then obtained by integrating over the multidimensional constrained space of the latent variables:

$$L(d \mid X, \beta, \Sigma) = \prod_{i=1}^{n} \int_{A_{i,1}} \cdots \int_{A_{i,J}} \frac{1}{(2\pi)^{\frac{1}{2}\sum_{j=1}^{J}(k_j-1)}\, |\Sigma|^{\frac{1}{2}}} \exp\left(-\frac{1}{2}(z_i - X_i\beta)^T \Sigma^{-1} (z_i - X_i\beta)\right) dz_i \tag{B-1}$$

where the $A_{i,j}$'s are the intervals of compatible values for the latent variables associated with the discrete choices $d_i$. Following the notation of Zhang et al. (2008), the joint posterior density of $\beta$, $\Sigma$, and $Z = (z_1, \ldots, z_n)$, given the discrete data $d$ and its likelihood (B-1), is characterized as

$$p(\beta, \Sigma, Z \mid d) \propto p(\beta) \times p(\Sigma) \times \prod_{i=1}^{n} \left( I_i \times \phi(z_i \mid X_i, \beta, \Sigma) \right) \tag{B-2}$$

where $\phi$ is the multivariate normal density function with mean $X_i\beta$ and variance-covariance matrix $\Sigma$, and

$$I_i = \prod_{j=1}^{J} \left( 1_{[d_{i,j}=0,\; z_{i,j,l}<0,\; l=1,\ldots,k_j-1]} + \sum_{r=1}^{k_j-1} 1_{[d_{i,j}=r,\; z_{i,j,r}=\max_{1 \le l \le k_j-1}(z_{i,j,l},\,0)]} \right)$$

with $1_{[E]}$ the indicator function equal to 1 when the expression $E$ is true and 0 otherwise. Thus, the function $I_i$ is simply an indicator evaluating to 1 if the choice vector $d_i$ is compatible with the underlying latent vector $z_i$ (Zhang et al. 2008). The likelihood function in (B-1) involves multidimensional integrals, making classical inference difficult. Therefore, we use MCMC methods yielding random draws from the joint posterior distribution of the parameters. Inference is based on the distribution of the drawn sample.

In our MCMC sampling algorithm, we proceed with a combination of data augmentation (Albert and Chib 1993), the Gibbs sampler (Geman and Geman 1984) and the Metropolis-Hastings algorithm (Chib and Greenberg 1995). The algorithm consists of three steps. First, we sample the parameter vector $\beta$ conditional on $\Sigma$, $Z$ and $d$. Assuming a prior distribution $\beta \sim N(b, C)$ for $\beta$ and using standard Bayesian linear model results, $\beta \mid \Sigma, Z, d$ has a multivariate normal distribution $\beta \mid \Sigma, Z, d \sim N(\hat{\beta}, V_\beta)$, where $V_\beta = \left(\sum_{i=1}^{n} X_i^T \Sigma^{-1} X_i + C^{-1}\right)^{-1}$ and $\hat{\beta} = V_\beta \left(\sum_{i=1}^{n} X_i^T \Sigma^{-1} z_i + C^{-1} b\right)$. Second, we draw samples for the latent variables $z_{i,j,l}\ \forall i$ conditional on $X_i$, $\beta$, $\Sigma$, $d_i$ and $z_{i,j(-l)} = (z_{i,j,1}, \ldots, z_{i,j,l-1}, z_{i,j,l+1}, \ldots, z_{i,j,k_j-1})$. The latent variable $z_{i,j,l}$ follows a truncated normal distribution, $z_{i,j,l} \sim N_{Trunc}(X_{i,j}\beta, \{\Sigma\}_{(q,q)})$, with lower bound equal to $\max(z_{i,j(-l)}, 0)$ and upper bound equal to $\infty$ if $d_{i,j} = l$, and $z_{i,j,l} \sim N_{Trunc}(X_{i,j}\beta, \{\Sigma\}_{(q,q)})$, with lower bound equal to $-\infty$ and upper bound equal to $\max(z_{i,j(-l)}, 0)$ otherwise; $q = 1 + \sum_{s=1}^{j-1}(k_s - 1)$. The third and last step of the algorithm samples the variance-covariance matrix $\Sigma$. For the constrained variance-covariance matrix $\Sigma$, we use an adjustment of the parameter expanded re-parameterization and Metropolis-Hastings (PX-RPMH) algorithm proposed by Liu and Daniels (2006) for correlation matrices. The idea of this sampling algorithm for correlation matrices, and for constrained variance-covariance matrices respectively, is to relax the constraint that the diagonal elements are set to one, and to freely sample a variance-covariance matrix that then follows an inverse Wishart distribution (for details, see Liu and Daniels (2006)).²⁰

In our MCMC framework, we use a diffuse but proper prior for $\beta$: the multivariate normal distribution with mean vector $b = 0$ and variance-covariance matrix $C = I \cdot 10^6$ ($I$ the identity matrix). The estimated probability model can then be used to determine those key-attribute choices that are most likely to generate the maximum joint profit together with the target-attributes.

²⁰ A different sampling algorithm has been proposed by Zhang et al. (2006).
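The second step of the sampler, the draw of a single latent component from a truncated normal distribution, can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions and not the implementation used for this paper: the function names `latent_bounds` and `draw_truncated_normal` are hypothetical, the draw uses the inverse-CDF method, and the mean and variance passed in stand for $X_{i,j}\beta$ and $\{\Sigma\}_{(q,q)}$ as written above.

```python
import random
from statistics import NormalDist


def latent_bounds(chosen, others_max):
    """Truncation interval for one latent component z_{i,j,l}.

    chosen     -- True if d_{i,j} = l (alternative l was picked)
    others_max -- max(z_{i,j(-l)}, 0), the largest competing value or 0
    """
    if chosen:
        return others_max, float("inf")
    return float("-inf"), others_max


def draw_truncated_normal(mean, var, lower, upper, rng=random):
    """Inverse-CDF draw from N(mean, var) restricted to [lower, upper].

    Assumption: the interval is not far out in the tails; for extreme
    bounds the inverse-CDF method loses accuracy and the final clamp
    below would dominate the result.
    """
    dist = NormalDist(mean, var ** 0.5)
    p_lo = 0.0 if lower == float("-inf") else dist.cdf(lower)
    p_hi = 1.0 if upper == float("inf") else dist.cdf(upper)
    u = p_lo + (p_hi - p_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf needs 0 < u < 1
    z = dist.inv_cdf(u)
    return min(max(z, lower), upper)  # guard against rounding error


if __name__ == "__main__":
    random.seed(2011)
    # Hypothetical values: alternative l was chosen, competitors peak at 0.5.
    lo, hi = latent_bounds(chosen=True, others_max=0.5)
    print(draw_truncated_normal(mean=0.0, var=1.0, lower=lo, upper=hi))
```

In a full Gibbs sweep this draw would be repeated for every respondent, attribute, and latent component, each time recomputing the bound from the current values of the competing components.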


Article III

Stadel, D. P. (submitted). Online Data: Predictive Power or Obscure Delusion?

International Journal of Research in Marketing.

Online Data:

Predictive Power or Obscure Delusion?

Daniel Stadel ∗

∗Daniel Stadel ([email protected]) is a Ph.D. candidate at the University of St. Gallen, 9000 St. Gallen, Switzerland.

2 ARTICLE III

Abstract

Nowadays, the internet is one of the most important sources of information, and its popularity is still growing every day. Provided they have access to it, people include the internet in their information search more and more frequently. For example, people spend time searching for all kinds of information about products ranging from groceries, vacations, and luxury products to cars and even real estate. By now, companies usually provide all relevant information online, mostly so that it can easily be found by their customers. Potential customers can then either visit the companies’ websites directly, or use search engines such as Google and Yahoo to find the respective information on independent third-party webpages. In either case, many different types of data on consumers’ online search behavior can be obtained, such as clickstream data, search queries, blogs, and even real product choices (e.g., from online product configurators). Thus, the world wide web is rapidly developing into the world’s biggest data playground. In this paper, I investigate whether online data have predictive power and can be utilized by companies to improve business forecast models, or whether they are misleading. In particular, I consider weekly car orders of a renowned premium European car manufacturer over the period from June 2007 to December 2009. The online data considered in the analysis are (i) online car configurations from the manufacturer’s own webpage, and (ii) online search query data from Google Insights for Search. The online data range from January 2007 to December 2009. Due to the nature of the data, time series models are applied. First, a baseline model is specified without any covariates, and the car orders are simply modeled as autoregressive processes. Second, in a time series regression framework, I consider online car configurations as well as online search queries as possible predictors of the variation in the original car order series. I also introduce a simple measure to quantify the impact of covariates on predictive performance, the Forecast Impact Factor (FIF). The results indicate that incorporating online data significantly improves predictive performance with respect to forecast error. The findings suggest that the internet should be considered for business forecasts, especially if no other data are available. All model parameters are estimated within a Bayesian framework.

Key words: Online Data, Car Configurator, Google Insights for Search, Time Series, Forecasting, Forecast Impact Factor, Bayesian Methods


1. Introduction

"If you can look into the seeds of time and say, which grain will grow, and which will not, speak then to me". This introductory quote by William Shakespeare poetically hits the mark with respect to forecasting. The main objective of the forecasting discipline in general is to predict the future of specific variables of interest as accurately as possible; in the area of business forecasting, these range from interest rate predictions to product sales forecasts. Forecasting has occupied the research literature for over half a century (e.g., Winters 1960) and remains highly relevant, as knowing what is going to happen can be the key to successful management decisions, including inventory planning, production scheduling, and expense budgeting.

In order to best achieve the forecasting goal, one always draws on available information that is assumed to be informative about the future. In weather forecasting, for example, certain "signs" that occur in advance of the actual event of interest are used, such as low-pressure areas to predict bad weather. This simply shows that indicators available in advance of the events themselves are a main source of information and significantly influence the forecast. With respect to business forecasting, including product sales, future revenue volume, market shares, etc., one could rely either on information such as past product sales, or on more general information such as inflation rates, consumer satisfaction indices, sectoral indices, etc. Relying on past information is a retrospective approach and assumes that the future behaves like the past, which is a main weakness. Global information, such as indices, can only vaguely be linked to specific brand performances or product sales. Prospective, closely consumer-related information on specific topics, in contrast, can be linked directly to possible future events, for example information on possible purchase intentions. With ongoing technological development and the growing integration of the developed world through the internet, there might be a chance to open up a new valuable source of such consumer-related information. Such data might then have enormous potential to indicate future developments. For example, as the internet is becoming increasingly important for information search, it is not unlikely that consumers research product characteristics such as prices, quality, etc. in advance of an upcoming purchase. For example, Ratchford et al. (2003) study the use of the internet as an information source among car buyers. Their results indicate that, especially for internet-savvy consumers, the online source is of considerable importance for their information search prior to the purchase (Ratchford et al. 2001). Klein and Ford (2003) also show that the internet is an important factor in the information search, since more than half of the automobile buyers use the internet in their search process. As the internet is a constantly growing source of information, and by the end of 2009 on average 64%²¹ of the population in the developed countries used the internet (ITU 2010), the impact can be assumed to be significantly stronger than back in the year 2000. Thus, the picture drawn from online search behavior can be assumed to mirror a population’s interest and preference structure. For example, Decker and Trusov (2010) estimate consumer preferences based on online product reviews and offer an econometric framework for how the plenitude of online information can be turned into aggregate user preferences. This indicates that data available online provide new opportunities to extract information that can be used to better predict future events. Nevertheless, such a data surge inevitably also brings new challenges. Heil et al. (2010) mention the phenomenon of ROPO (= Research Online Purchase Offline) in their note on the new challenges the 21st century brings to the field of marketing. This phenomenon simply implies that consumers, as already mentioned, use the internet in their information search prior to their offline product purchases. The main task analysts face with respect to online data is to extract valuable information. Montgomery (2001) discusses quantitative marketing techniques that can be used to solve internet marketing problems, e.g., banner targeting, consumer online behavior, and trend tracking, including autoregressive models to predict web usage (Montgomery 1999).

Access to such data on consumers’ online behavior can be used to analyze possible future offline behavior and the respective trends. The motto "Reading the right signs right" then implies improved business forecasts. Basically, there are three main aspects to achieving this goal: (i) the right data, (ii) the right method, and (iii) the right context (variable to be predicted). In the recent marketing literature, online data have been discussed for different applications. Biyalogorsky and Naik (2003) study clickstream data to analyze whether a firm’s online activity cannibalizes offline sales, and whether these activities can also build (long-term) online equity. Bucklin and Sismeiro (2003), Sismeiro and Bucklin (2004), Montgomery et al. (2004), and Moe (2003, 2006) also analyze clickstream data with respect to website browsing and purchase behavior. In this context, another type of data, namely online product reviews (movie critiques), has also been investigated in order to forecast product sales (box-office performance). Dellarocas and Awad (2007) show that professional critic reviews substantially increase forecasting accuracy for movie sales, whereas Zhu and Zhang (2010) also include consumer characteristics in a moderating role in their analysis of the impact of online consumer reviews on product sales in the video gaming industry. The research conducted by Chintagunta et al. (2010) investigates the effect of online word-of-mouth on movie ticket sales. The predictive power of online chatter has been investigated by Gruhl et al. (2005), who show that the volume of online blog postings can be used to predict spikes in actual consumer purchase decisions, and by Dhar and Chang (2009), who address user-generated content in blogs and social networks for the purpose of predicting music sales.

²¹ 26% of the world’s population use the internet, with 64% in developed countries and 18% in developing countries (ITU 2010).

The common ground of this past research is the use of online data, such as clickstream data or online reviews, for predictive purposes. This basically means that information available online on consumer-product relations has been used to predict product-related outcomes. In this paper, I follow this line of research and study the impact of online product configurator information on offline sales for a premium car manufacturer. In addition, information on consumer online search behavior, as part of the information search prior to product purchases, is also included in the analysis. The data have been obtained from Google Insights for Search, and include online search intensities for key-words concerning the product of interest. For example, Ginsberg et al. (2009) showed that such online data can adequately be used to predict probabilities for doctor’s appointments based on search queries including influenza symptoms. Consequently, in this paper, I assess the predictive power of online car configurations and search engine queries for offline sales forecasts of two different car models. Due to the nature of sequentially observed data, all investigations are conducted in a time series framework. Forecasting and time series approaches have to some extent been applied to analyze data in the field of marketing for a long time (e.g., Makridakis and Wheelwright 1977, Hanssens 1980, 1998, Aaker et al. 1982, Franses 1991, 1994, Cain 2005, Lim et al. 2005, Wang and Zhang 2008, Deleersnyder et al. 2009, Srinivasan et al. 2010). Dekimpe and Hanssens (2000) review the usage of time series models in the marketing field and call for the application of such techniques due to the increasing size of data sets, the dynamics of the environment, and the emergence of internet data sources.

The remainder of the paper is organized as follows. Section 2 introduces the methodology of a simple autoregressive time series model without consideration of online data. This model serves as the baseline to which the extended model incorporating online data is then compared. In Section 2, the model is also applied to the car order time series for the two car models discussed throughout this paper, and the respective results are reviewed. In Section 3, I discuss the time series regression approach that takes the available online data into account, and show how this method is applied to our real-world observations. Subsequently, in Section 4, both approaches are compared, and the findings are discussed with respect to the usability of online data as possible predictors for offline car orders. I also introduce the Forecast Impact Factor as a simple out-of-sample R2-based measure of predictive power. Finally, in Section 5, I conclude with a discussion, implications, limitations, and directions for further research.

2. Time series methodology for car orders

In this section, I apply a simple time series model to the weekly observed (offline) orders of two car models of a renowned premium car manufacturer, without incorporating any external variables such as online data. This model serves as the baseline model throughout the paper. Let us now consider two time series of weekly car orders for two different car models. The first car model considered in my analysis is from the compact luxury car segment and is termed "model I"; the second car model is from the mid-luxury car segment and is termed "model II". The time series of the weekly car orders have been obtained over a period ranging from February 2008 to December 2009 (85 weeks)

Figure 1 Car orders for model I and model II

[Figure: two time series panels, a) Car Orders - Model I (weeks 9/2008 to 53/2009) and b) Car Orders - Model II (weeks 25/2007 to 53/2009); x-axis: Week/Year, y-axis: car orders; the legend distinguishes "Data for model estimation" from the data "To be predicted".]

for model I, and over a period ranging from June 2007 to December 2009 (135 weeks) for model II. Figure 1 displays the time series of the car orders for model I and model II, respectively. The figure shows considerable variation in the weekly car orders for both models across the observation periods. Consequently, at a later stage of this research it is of interest what portion of this variation can be explained by online data. In order to model the car orders as time series, I first take a closer look at their structure, i.e., I test for stationarity and determine the orders of the autoregressive and moving average components. To assess whether the series are stationary, Dickey-Fuller tests have been conducted (Dickey and Fuller 1979). Based on a p-value of less than 0.05²² for car model I and a p-value of less than 0.01²³ for car model II, the null hypotheses of non-stationary processes could be rejected for both series.
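The stationarity check described above can be sketched in a few lines of Python. The snippet is a minimal illustration and not the procedure used for this paper: it assumes the Dickey-Fuller regression with a constant, Δy_t = α + γ y_{t−1} + e_t (the variant actually used is not stated here), and returns the t-ratio of γ as the test statistic τ. The function names, the simulated series, and all parameter values are hypothetical. The statistic has to be compared with Dickey-Fuller critical values, such as those tabulated in Banerjee et al. (1993), and not with Student-t quantiles.

```python
import math
import random


def dickey_fuller_tau(y):
    """t-ratio of gamma in the OLS fit  dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                   # lagged levels y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences
    n = len(x)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = dy_bar - gamma * x_bar
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)    # OLS standard error of gamma
    return gamma / se_gamma


def simulate_series(phi, n, rng):
    """Hypothetical data: y_t = phi * y_{t-1} + e_t with standard normal e_t."""
    y, value = [], 0.0
    for _ in range(n):
        value = phi * value + rng.gauss(0.0, 1.0)
        y.append(value)
    return y


if __name__ == "__main__":
    rng = random.Random(2011)
    print("stationary AR(1):", dickey_fuller_tau(simulate_series(0.5, 300, rng)))
    print("random walk:     ", dickey_fuller_tau(simulate_series(1.0, 300, rng)))
```

A strongly negative τ speaks against a unit root, while a random walk typically yields a value much closer to zero.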

Therefore, and in line with a simple time series framework, no transformations, such as differencing or taking logarithms, have been applied to the data. As a next step, the orders of the autoregressive and moving average components need to be determined. Here, I build on standard time series techniques and visually inspect the autocorrelation functions and the partial autocorrelation functions, respectively. Briefly, whereas the autocorrelation function of an autoregressive process of order p, AR(p), decays gradually, its partial autocorrelation function cuts off sharply after lag p. Conversely, the autocorrelation function of a moving average process of order q, MA(q), cuts off after lag q, and the respective partial autocorrelation function tails off. For a mixed process (ARMA), both the autocorrelation function and the partial autocorrelation function decay. For a detailed discussion of time series analysis, and the respective methods for model selection and identification, please refer to Box et al. (2008). Figure 2 provides the autocorrelation and partial autocorrelation functions for the two car order series. The figure shows that for both car order processes, for model I and model II, there is only one autoregressive component (p = 1) and no moving average component (q = 0). Therefore, both series can simply be modeled as AR(1)-processes.

Figure 2 ACF and PACF for the car orders of model I and model II

[Figure: four panels, a) ACF: Car orders - Model I, b) PACF: Car orders - Model I, c) ACF: Car orders - Model II, d) PACF: Car orders - Model II; x-axis: Lag (up to 20), y-axis: ACF or Partial ACF.]

Another issue in the analysis of sequentially observed car orders is the fact that integer-valued time series are being investigated. The time-series literature has discussed the analysis of count data extensively in various applications within the last decade (e.g., Jung and Tremayne 2003, Freeland and McCabe 2004a,b, Jung and Tremayne 2006, Karlis and Ntzoufras 2006, Zhu and Joe 2006, Kim and Park 2008, Davis and Wu 2009, Drost et al. 2009, Freeland 2009, Millar 2009, Silva et al. 2009, Weiss 2009), and consequently provides the methodology for handling them properly. The car order observations for model I range from 66 to 2509 (mean=606.04), and the observed weekly car orders for model II range from 170 to 4084 (mean=1341.62), respectively. For time series of such magnitudes, approximations using continuous time-series models such as the autoregressive (AR) process with Gaussian errors are usually adequate (Enciso-Mora et al. 2009). Thus, both car order series considered throughout this paper can appropriately be analyzed by applying traditional time series techniques (see Box et al. 2008), namely assuming a normal error distribution: $\varepsilon_t \overset{iid}{\sim} N(0, \sigma^2)\ \forall t = 1, \ldots, T$.

²² Test-Statistic: τI = −2.0947; the p-value was obtained from Table 4.2, p. 103 of Banerjee et al. (1993).

²³ Test-Statistic: τII = −2.9851; the p-value was obtained from Table 4.2, p. 103 of Banerjee et al. (1993).

In summary, all necessary preliminary investigations of the two time series have been conducted. The series have been identified as stationary AR(1)-processes, and it has been shown that traditional time series techniques are sufficient. In the following, I introduce the technical details of the AR(1)-model, estimate the parameters, and discuss the respective results.
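The identification rule used above, a decaying autocorrelation function combined with a partial autocorrelation function that cuts off after lag 1, can be checked numerically. The following Python sketch is an illustration only and is not taken from the analysis in this paper: `sample_acf` computes the usual sample autocorrelations, `pacf_from_acf` applies the Durbin-Levinson recursion, and both function names as well as the coefficient 0.6 are hypothetical.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [value - mean for value in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 (Durbin-Levinson).

    acf[0] is expected to equal 1.
    """
    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(acf)):
        num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update the lower-order coefficients, then append phi_{k,k}
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)]
        phi_new.append(phi_kk)
        phi_prev = phi_new
        pacf.append(phi_kk)
    return pacf


if __name__ == "__main__":
    # Theoretical ACF of an AR(1)-process with a hypothetical phi = 0.6.
    theoretical_acf = [0.6 ** k for k in range(6)]
    print(pacf_from_acf(theoretical_acf))
```

For the theoretical AR(1) autocorrelations the recursion returns the coefficient itself at lag 1 and values that are zero up to rounding error at all higher lags, which is the cut-off pattern described in the text.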

2.1 AR(1)-process for car orders

As derived above, we can model both time series as continuous AR(1)-processes. Therefore, for both car order series, let us consider the following time series model:

$$y_t = \phi y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2), \quad \forall t = 1, \ldots, T \tag{1}$$

For the stationary series of car orders, I now want to model the deviation from a constant mean with an autoregressive error process. Including a constant mean $\mu$ in the model formulation (1) above, we get $y_t = \mu + \phi(y_{t-1} - \mu) + \varepsilon_t$. Replacing $y_{t-1} - \mu$ by $u_{t-1}$, and $y_t - \mu$ by $u_t$, respectively, this leads to $u_t = \phi u_{t-1} + \varepsilon_t$. Thus, we can rewrite our AR(1)-model in (1) as a local level model with autoregressive errors of order one such that

$$\begin{aligned} y_t &= \mu + u_t \\ u_t &= \phi u_{t-1} + \varepsilon_t \\ \varepsilon_t &\sim N(0, \sigma^2), \quad \forall t = 1, \ldots, T \end{aligned} \tag{2}$$
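To make formulation (2) concrete, the following Python sketch simulates a series from the local level model with AR(1)-errors and computes the implied point forecasts, E[y_{T+h} | y_T] = μ + φ^h (y_T − μ), which follow from iterating the error equation forward. It is a minimal sketch under stated assumptions: the function names and all numerical values (μ = 600, φ = 0.5, σ = 250) are hypothetical and are not estimates from this paper.

```python
import random


def simulate_local_level_ar1(mu, phi, sigma, n, rng):
    """Simulate y_t = mu + u_t with u_t = phi * u_{t-1} + eps_t."""
    u = 0.0  # simplifying assumption: start the error process at zero
    series = []
    for _ in range(n):
        u = phi * u + rng.gauss(0.0, sigma)
        series.append(mu + u)
    return series


def forecast_local_level_ar1(last_y, mu, phi, horizon):
    """Point forecasts mu + phi**h * (y_T - mu) for h = 1..horizon."""
    return [mu + phi ** h * (last_y - mu) for h in range(1, horizon + 1)]


if __name__ == "__main__":
    rng = random.Random(2011)
    orders = simulate_local_level_ar1(mu=600.0, phi=0.5, sigma=250.0,
                                      n=85, rng=rng)
    print(forecast_local_level_ar1(orders[-1], mu=600.0, phi=0.5, horizon=4))
```

For |φ| < 1 the forecasts revert geometrically towards μ, so the further ahead one predicts, the less the last observation matters.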

The reason for this model formulation, i.e., a local level with the deviation modeled as AR(1)-errors, becomes more obvious in the time series regression section, when online data covariates are introduced. There, I attempt to explain the variation around the local level μ by external variables such as online car configurations and online search queries. But first, I estimate the simple


AR(1)-model for both car order series, assess the model fit and reflect on the

results.
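To make the local level formulation (2) concrete, the following is a minimal Python sketch that simulates such a series. The parameter values (mu = 600, phi = 0.4, sigma = 300) and the function name are hypothetical choices for illustration, not estimates from this paper. Note that a Gaussian simulation can produce negative values, which real order counts cannot.

```python
import random

def simulate_local_level_ar1(mu, phi, sigma, T, seed=0):
    """Simulate y_t = mu + u_t with u_t = phi * u_{t-1} + eps_t, eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    # Draw the starting deviation from the stationary distribution
    # N(0, sigma^2 / (1 - phi^2)), which requires |phi| < 1.
    u = rng.gauss(0.0, sigma / (1.0 - phi ** 2) ** 0.5)
    series = []
    for _ in range(T):
        u = phi * u + rng.gauss(0.0, sigma)  # AR(1) deviation from the level
        series.append(mu + u)                # observation = local level + deviation
    return series

# Hypothetical parameter values, chosen only for illustration.
y = simulate_local_level_ar1(mu=600.0, phi=0.4, sigma=300.0, T=5000)
sample_mean = sum(y) / len(y)
print(round(sample_mean, 1))  # should lie close to the local level mu
```

For a stationary series (|phi| < 1), the sample mean of a long simulated path settles near the local level, which is the property the forecasts in the next subsection rely on.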

2.2 Parameter estimates and car order forecasts with the simple AR(1)-process

One main goal of time series modeling is the provision of forecasts for future observations in the series. To this end, the parameters have been estimated within a Bayesian framework. The MCMC sampling algorithm applied follows the approach of Chib (1993) and can be reviewed in appendix B. I ran sampling chains for 12,000 iterations and assessed convergence by monitoring the time series of the draws. The results are reported based on the 10,000 draws retained after discarding the first 2,000 draws as burn-in iterations. The diagnostic plots, such as trace plots and posterior densities for the model parameters, can be found in appendix A (Figure A-1 for car model I and Figure A-2 for car model II). The relevant statistics for the parameter estimates are displayed in Table 1. The table shows that the autoregression coefficient is clearly different from zero for both car models, as the 95%-HPD intervals exclude zero. The estimates also confirm the results from the unit root tests, as the autoregression coefficients are not estimated to be close to 1. The variance estimates are in line with the magnitude of the estimated constants: for both models they imply a standard deviation of approximately 40% to 50% of the series means μI and μII, respectively. This implies a rather large variation in the observed series. The interesting question is whether this variation can be reduced sufficiently by taking online data into account.
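The post-processing of the draws described above (discarding the burn-in and reporting a posterior mean with a 95%-HPD interval) can be sketched as follows. This is an illustrative sketch only: the chain is filled with independent normal random numbers as a stand-in for real MCMC output, the function names are hypothetical, and the HPD interval is computed as the shortest interval covering 95% of the sorted draws, which assumes a unimodal posterior.

```python
import random

def hpd_interval(draws, prob=0.95):
    """Shortest interval containing a share `prob` of the draws (unimodal posterior assumed)."""
    s = sorted(draws)
    n = len(s)
    k = max(1, int(round(prob * n)))  # number of draws inside the interval
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]

def posterior_summary(chain, burn_in=2000, prob=0.95):
    """Posterior mean and HPD interval after discarding the burn-in draws."""
    kept = chain[burn_in:]
    mean = sum(kept) / len(kept)
    lower, upper = hpd_interval(kept, prob)
    return mean, lower, upper

# Stand-in for a chain of 12,000 draws; NOT output of the sampler in appendix B.
rng = random.Random(42)
chain = [rng.gauss(0.45, 0.1) for _ in range(12000)]
mean, lower, upper = posterior_summary(chain)
print(round(mean, 3), round(lower, 3), round(upper, 3))
```

A coefficient is then read as clearly different from zero when its interval does not contain zero, which is how the HPD columns of Table 1 are interpreted in the text.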

Using this baseline model, we can now calculate the one-step ahead predic-

tions as well as the n-step ahead predictions. Thus, the one-step ahead predic-

tions are given by

$$\hat{y}_t = E\big(\mu + \phi(y_{t-1} - \mu) + \varepsilon_t\big) = \mu + \phi(y_{t-1} - \mu) \tag{3}$$

and the n-step-ahead predictions are given by

$$\hat{y}_{t+n} = E\Big(\mu + \phi^n (y_t - \mu) + \sum_{j=1}^{n} \phi^{n-j} \varepsilon_{t+j}\Big) = \mu + \phi^n (y_t - \mu) \tag{4}$$


Table 1 Parameter estimates - Time series model without online data

| Parameter | Car model I: estimate [95%-HPD] | Car model II: estimate [95%-HPD] |
|---|---|---|
| μ | 1306.42 [1096.26; 1518.70] | 602.03 [515.55; 695.41] |
| φ | 0.45419 [0.27765; 0.65604] | 0.3926 [0.22689; 0.56948] |
| σ² | 263188.3 [182807.9; 346532.5] | 88389.53 [68675.02; 112781.34] |

with the corresponding normal predictive distributions

$$y_{t+1} \sim N\big(\mu + \phi(y_t - \mu),\; \sigma^2\big) \tag{5}$$

for the one-step-ahead predictions, and

$$y_{t+n} \sim N\Big(\mu + \phi^n (y_t - \mu),\; \sigma^2 \, \frac{1 - \phi^{2n}}{1 - \phi^2}\Big) \tag{6}$$

for the n-step-ahead predictions.
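The forecast mean in (4) and the predictive variance can be evaluated directly, as in the following minimal sketch. It uses the posterior means for car model I from Table 1 as plug-in values, which ignores parameter uncertainty, and the last observation `y_last` is a hypothetical value. For the variance it uses the standard AR(1) result σ²(1 − φ^(2n))/(1 − φ²), which equals σ² for n = 1.

```python
def ar1_forecast(y_t, mu, phi, sigma2, n):
    """Mean and variance of y_{t+n} given y_t for the local level AR(1) model."""
    mean = mu + phi ** n * (y_t - mu)
    # Sum of phi^(2j) for j = 0..n-1 times sigma^2 (accumulated shock variance).
    var = sigma2 * (1.0 - phi ** (2 * n)) / (1.0 - phi ** 2)
    return mean, var

# Plug-in values: posterior means for car model I from Table 1.
mu, phi, sigma2 = 1306.42, 0.45419, 263188.3
y_last = 2000.0  # hypothetical last in-sample observation, for illustration only

for n in (1, 2, 4, 12):
    m, v = ar1_forecast(y_last, mu, phi, sigma2, n)
    print(n, round(m, 1), round(v ** 0.5, 1))
```

Because φ^n shrinks geometrically, the forecast mean approaches μ within a few weeks, while the predictive standard deviation grows toward its stationary value.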

Figure 3 shows the fitted series and the respective out-of-sample predictions. The out-of-sample car order forecasts for both car models, together with the respective observations, are reported in Table A-1. As the out-of-sample forecasting horizon, I chose 12 weeks (3 months): companies report their performance on a quarterly basis, so this is also a managerially relevant planning horizon. The figure and Table A-1 show that the out-of-sample forecasts rapidly converge toward the local level μ of the series, as is common for such time series models. This means that the mean of the series is the best forecast, as we do not have any additional information to explain the variation around the local level of the series. Later, in Section 3, I attempt to explain this variation by additional information available online.

In order to assess the goodness of fit of this baseline model (and to later compare it to the extended model), we calculate the mean absolute percentage error (MAPE) for the in-sample one-step ahead predictions (for the T − 1


[Figure 3: Orders for car model I and car model II - Fitted time series. Panel a) Car orders model I, fitted series and predictions; panel b) Car orders model II, fitted series and predictions. x-axis: Week/Year; y-axis: car orders with fitted values. Legend: time series data, to be predicted, out-of-sample forecasts.]

predictions) as well as for the out-of-sample n-step ahead predictions by

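$$\text{In-sample MAPE} = \frac{1}{T-1} \sum_{t=2}^{T} \frac{|y_t - \hat{y}_t|}{y_t} \tag{7}$$

$$\text{Out-of-sample MAPE} = \frac{1}{n} \sum_{t=T+1}^{T+n} \frac{|y_t - \hat{y}_t|}{y_t} \tag{8}$$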

with $\hat{y}_t$ the respective forecast for time t. The calculated in-sample MAPE was 0.3119 for car model I and 0.4087 for car model II. These values are in line with the estimated variances expressed as percentage deviations. Generally, in-sample fit measures tend to be reasonably good, as the model parameter estimates are based on exactly that sample. The actual degree of model fit is better judged by the out-of-sample fit measures, which are especially important in the context of forecasting and time series modeling. The respective out-of-sample MAPE values were 0.7738 for car model I and 0.8461 for car model II. This implies a rather large deviation of the out-of-sample forecasts from the actually observed values. Consequently, it is clearly desirable to have additional information that reduces the uncertainty about future observations. In the subsequent section, I discuss the time series approach applied to the car order series with two additional series of information as explanatory variables.
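The MAPE in (7) and (8) is the same computation applied to different index ranges. A minimal Python sketch, assuming strictly positive observations (as is the case for order counts) and using made-up numbers for the usage example:

```python
def mape(observed, forecast):
    """Mean absolute percentage error; observations must be nonzero (here: positive counts)."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("observed and forecast must be non-empty and of equal length")
    return sum(abs(y - f) / y for y, f in zip(observed, forecast)) / len(observed)

# Made-up numbers for illustration only (not data from this paper).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(mape(observed, forecast))  # average of the relative errors 0.1, 0.1 and 0.0
```

For the in-sample measure, `observed` would hold y_2, ..., y_T with the one-step-ahead predictions as `forecast`; for the out-of-sample measure, it would hold the n held-out observations with the n-step-ahead predictions.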


[Figure 4: Online configurations for car model I and car model II. Panel a) Online car configurations, model I; panel b) Online car configurations, model II. x-axis: Week/Year; y-axis: car configurations.]

3. Time series regression for car orders and online data as predictors

In this section, I show how two different types of online data can appropriately be incorporated into my analysis framework. To this end, I obtained two time series of online data closely related to the corresponding car model orders. The first data series for car model I and car model II are the weekly car configurations of the respective models on the car manufacturer's website. In other words, the manufacturer has information about how many cars of model I and II, respectively, have been configured on its website. Figure 4 displays the time series of the car configurations from June 2007 to December 2009 for model I and from January 2007 to December 2009 for model II. The figure shows that the time series of online configurations vary around their series means. They are thus expected to have some explanatory power for the variation in the car order series, as they are assumed to reflect consumers' average interest in a certain car model. Because of missing data values, both time series of online car configurations have been interpolated from June 2008 to November 2008. Those values are missing because of software changes, during which the configurator was offline and could not be used by customers. In order to include the online car configurations in the analysis, they have been mean-centered with respect to the series mean. Hence, I use the variation around a baseline level observed


[Figure 5: Google search intensity for car model I and car model II. Panel a) Online search intensity, model I; panel b) Online search intensity, model II. x-axis: Week/Year; y-axis: online search intensity.]

from the online configurations to explain the variation observed in the car orders

themselves. Recalling the model formulation of our simple AR(1)-process in-

cluding a local level μ, the online configurations are utilized to explain a portion

of the variance in the car order series.

The second data series for the two car models are online search queries recorded by Google Insights for Search. These data reflect the online search intensity for certain key words over a period of time. In our case, the time series show, figuratively speaking, the search intensity for the key words "car model I" and "car model II", respectively. Figure 5 shows the corresponding data series.

Here, it is also expected that information about consumers' interests can account for variation in the car order series. Like the online car configurations, the time series of search intensities have been mean-centered. The observations for the search intensities are normalized to range from 0 to 100, with 100 reflecting the maximum value within the considered period of time.24 Before I continue with the technical details of the extended time series regression model, I address the issue of assigning the period in which the online data have been observed to the corresponding offline actions (here: car orders). In this paper, I do not have a theory-based assumption about specific time lags for the different online data. This means that no hypothesis is tested as to whether consumers configure their car, or search for product related

24 For details on the normalizing procedure, please refer to Google Insights for Search.


[Figure 6: Local level model with deviations. Schematic plot of car orders over time (t = 1, ..., T) around the local level μ, with Δt the deviation from the local level μ at time t. x-axis: Time; y-axis: Car orders.]

information a specific amount of time in advance. Although it is certainly worthwhile to investigate how far in advance of a possible purchase new car buyers usually visit third party websites and online configuration tools on manufacturers' websites, the determinants of such online search behavior are beyond the objective of this paper. Instead, I estimate a model for each lag combination of online configurations and online search intensity. That is, a model is estimated for each combination of 24 weekly lags for both online data series, which implies a total of 24 × 24 = 576 models. The model performing best can then be chosen for forecasting purposes. In the following, I derive the regression model and review the results.
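The lag search described above can be organized as a simple grid. The sketch below is illustrative only: it assumes lags of 1 to 24 weeks, assumes that the three series are indexed by the same weeks, and takes an arbitrary `score` function as a placeholder for "estimate the model and return its in-sample MAPE". The toy score and the toy data are hypothetical and do not involve any model estimation.

```python
import random
from itertools import product

def align_lagged(y, x_conf, x_search, lag_conf, lag_search):
    """Rows (y_t, x_conf[t - lag_conf], x_search[t - lag_search]) where both lags exist."""
    start = max(lag_conf, lag_search)
    return [(y[t], x_conf[t - lag_conf], x_search[t - lag_search])
            for t in range(start, len(y))]

def best_lag_combination(y, x_conf, x_search, score, max_lag=24):
    """Evaluate score(rows) for every lag pair in 1..max_lag; return the minimizer."""
    results = {}
    for lag_conf, lag_search in product(range(1, max_lag + 1), repeat=2):
        rows = align_lagged(y, x_conf, x_search, lag_conf, lag_search)
        results[(lag_conf, lag_search)] = score(rows)
    best = min(results, key=results.get)
    return best, results

# Toy data: orders depend on configurations five weeks earlier (hypothetical).
rng = random.Random(1)
T = 120
x_conf = [rng.uniform(-100.0, 100.0) for _ in range(T)]
x_search = [rng.uniform(-20.0, 20.0) for _ in range(T)]
y = [500.0 + 2.0 * x_conf[t - 5] if t >= 5 else 500.0 for t in range(T)]

def toy_score(rows):
    # Placeholder for the in-sample MAPE of a fitted model: here the
    # coefficients (500, 2) are simply assumed known instead of estimated.
    return sum(abs(obs - (500.0 + 2.0 * xc)) / obs for obs, xc, _ in rows) / len(rows)

best, results = best_lag_combination(y, x_conf, x_search, toy_score)
print(best, len(results))
```

One design point this makes visible: longer lags leave fewer usable observations, so the candidate models are scored on slightly different samples unless the estimation window is fixed in advance.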

3.1 Time series regression with AR(1)-errors

In the preceding sections, for the car order series, I have discussed a local

level model. Considered more technically, the dependent series of interest

was assumed to be constant with some variation around the series mean, say

yt = μ + Δt. Figure 6 schematically displays such an interpretation of the car

order series. In the baseline model, the deviation from the local level μ at time t,

Δt, was modeled as a stationary AR(1)-process, ut = φut−1+εt. In this section,

I want to use online observables as covariates to explain part of that variation,


i.e., Δt = x′tβ + ut. The AR(1)-structure of the error terms is maintained, as the data are still observed sequentially. To incorporate covariates into the analysis, I therefore consider a time series regression model including an intercept and an error structure arising from a stationary AR(1)-process. The model can be set up as follows

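$$\begin{aligned}
y_t &= \mu + x_t'\beta + u_t \\
u_t &= \phi u_{t-1} + \varepsilon_t \\
\varepsilon_t &\sim N(0, \sigma^2), \quad \forall t = 1, \dots, T
\end{aligned} \tag{9}$$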

When we compare the two model formulations in (2) and (9), we can easily see that the only difference is the additional regression component $x_t'\beta$, which is meant to explain part of the variation in the original time series. Next, I estimate the model, derive forecasts, and discuss the results.
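One way to see how model (9) connects to a standard linear regression is the following short derivation. It is offered as a sketch of the usual quasi-differencing argument and not as a restatement of the sampling scheme in appendix B. Substituting $u_{t-1} = y_{t-1} - \mu - x_{t-1}'\beta$ into the error equation gives, for $t = 2, \dots, T$:

```latex
\begin{align*}
y_t &= \mu + x_t'\beta + \phi\,(y_{t-1} - \mu - x_{t-1}'\beta) + \varepsilon_t \\
y_t - \phi y_{t-1} &= (1 - \phi)\,\mu + (x_t - \phi x_{t-1})'\beta + \varepsilon_t ,
  \qquad \varepsilon_t \sim N(0, \sigma^2) \\
\hat{y}_t &= \mu + x_t'\beta + \phi\,(y_{t-1} - \mu - x_{t-1}'\beta)
\end{align*}
```

For a given φ, the second line is an ordinary linear model with independent normal errors in the transformed data. The third line is the implied one-step-ahead prediction, which reduces to (3) when β = 0.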

3.2 Car order forecasts with the time series regression model

This section presents the estimation of the time series regression model and the respective results. In total, 576 regression models have been estimated using a Bayesian framework. For each lag combination of the two online data sources, the in-sample MAPE is reported in appendix A (Table A-2 for car model I and Table A-3 for car model II). For the further analysis, only the models minimizing the in-sample MAPE are considered. For car model I, a time lag of 14 weeks for car configurations and a time lag of 18 weeks for the search queries performed best with the in-sample MAPE as the measure of accuracy. For car model II, we get a different lag structure: the best-performing lag combination is 23 weeks for car configurations and 10 weeks for search queries. Taken at face value, this would suggest that on average customers configure their car 23 weeks in advance of a purchase, and that they consult the internet in their information search only 10 weeks prior to the order. The determination of the lag structure certainly deserves further discussion, but is beyond the scope of this paper. Here, I only want to show that information accessible online can be utilized to improve forecasting performance. Although a more theory-based approach might be desirable, I continue with these data-driven results. As in Section 2.2, I used MCMC sampling following Chib (1993, see appendix


Table 2 Parameter estimates - Time series regression with online data

| Parameter | Car model I: estimate [95%-HPD] | Car model II: estimate [95%-HPD] |
|---|---|---|
| μ | 1279.15 [1033.65; 1561.43] | 578.2 [513.26; 648.59] |
| βOC | 0.54959 [-0.13684; 1.32860] | 0.74787 [0.12967; 1.24680] |
| βSI | -3.2163 [-15.8470; 11.1279] | -14.4326 [-21.0154; -7.4852] |
| φ | 0.49546 [0.30139; 0.69581] | 0.25200 [0.07873; 0.42324] |
| σ² | 262342.8 [189475.9; 347134.3] | 73950.78 [56503.31; 93591.92] |
| Time lag (weeks): configurations | 14 | 23 |
| Time lag (weeks): search intensity | 18 | 10 |

B). I ran sampling chains for 12,000 iterations and assessed convergence by monitoring the time series of the draws. The posterior inference is again based on the 10,000 draws retained after discarding the first 2,000 draws as burn-in iterations. The respective diagnostic plots can be found in appendix A (Figure A-3 for car model I and Figure A-4 for car model II).

Table 2 provides the characteristics of the posterior parameter distributions. The table shows that the estimates for the local levels μI and μII are close to the estimates from our baseline model. This suggests that the specification with mean-centered covariates is reasonable for explaining a portion of the variance around the series means. For car model I, the estimates for the explanatory variables, online car configurations and online search intensity, indicate weak predictive power: the 95%-HPD intervals of both coefficients include zero, so neither effect has a clearly determined sign. This is also confirmed by the large estimate for the variance, which is close to the original estimate from the baseline autoregressive model (Table 1). For car model II, we get better results. Table 2 shows that the parameter estimates for both covariates are clearly different from zero, as their 95% highest posterior density intervals do not include zero. It can also be seen that online car configurations positively affect car orders, whereas online search intensity negatively affects offline purchases. To


[Figure 7: Orders for car model I and car model II - Fitted time series. Panel a) Car orders model I, fitted series and predictions; panel b) Car orders model II, fitted series and predictions. x-axis: Week/Year; y-axis: car orders with fitted values.]

venture an explanation of these results, one could say that customers who planned to buy a new car, say 23 weeks ahead of the likely purchase, and then changed their minds use the internet to search for alternatives. However, an interpretation of the results in terms of behavioral decision theory is beyond the scope of the present study. Figure 7 displays the fitted series as well as the 12-week-ahead forecasts. The goodness-of-fit measures yield an in-sample MAPE of 0.3010 for car model I and 0.3547 for car model II. The out-of-sample MAPEs are 0.7320 and 0.3482, respectively. To better understand the benefits of including external variables such as online data in the analysis, the next section provides a detailed comparison of the two presented approaches.

4. Comparison of the car order forecasts with and without consideration of online data

In the previous two sections, I set up two different business forecasting models for the orders of two car models, termed car model I and car model II: (i) a simple stationary autoregressive time series model of order one (Section 2), set up as a constant level model with autoregressive error structure, and (ii) a time series regression model with autoregressive error structure of order one (Section 3), incorporating online data such as online car


Table 3 Mean absolute percentage errors (MAPE)

| | Car model I | Car model II |
|---|---|---|
| With online data: in-sample MAPE | 0.3010 | 0.3547 |
| With online data: out-of-sample MAPE | 0.7320 | 0.3482 |
| Without online data: in-sample MAPE | 0.3119 | 0.4087 |
| Without online data: out-of-sample MAPE | 0.7738 | 0.8461 |

configurations and online search intensity as possible predictors. In this section, I compare the predictive performance of the two approaches with respect to forecast errors. The measure of forecasting accuracy is the mean absolute percentage error (MAPE), as already used in Section 3 to determine the optimal lag structure of the online predictors. Table 3 provides the MAPE values for the simple time series model of Section 2 and for the time series regression model of Section 3 for the two car models. As the table shows, the time series regression approach including the online data outperforms the simple time series approach without online data with respect to the MAPE for both car models. For car model I, the in-sample MAPE of the one-step-ahead forecasts is reduced by 3.49% in relative terms, and the out-of-sample MAPE of the 12-week forecasts by 5.4%. This indicates little predictive power of the online data for car model I. This can also be seen by comparing the process variances estimated by the two approaches. Without online data, the process variance is estimated to be $\sigma^2_I = 263188.3$ (see Table 1), compared to $\sigma^2_I = 262342.8$ (see Table 2) when online data are included via time series regression. Thus, the estimated process variance could only be reduced by 0.32% through the incorporation of the discussed online predictors. This result is also confirmed by the respective adjusted $R^2$ obtained from the sampling scheme provided in appendix B, which relies on a data transformation and standard linear model results (Chib 1993). The resulting $R^2_{adj.}$ of 0.0039 from the time series regression approach likewise indicates that only 0.39% of the variation in the time series of the car orders for model I can be


[Figure 8: Comparison of forecasts - with and without online data. Panel a) Car order predictions, model I; panel b) Car order predictions, model II. x-axis: Week/Year (42/2009 to 53/2009); y-axis: car orders. Legend: observed car orders, predictions without online data, predictions with online data.]

explained by the online data. Hence, the online data did not contribute significantly to the predictive performance for car model I. Figure 8a) shows the 12 out-of-sample observations and the predictions of the two approaches for car model I. Although the results do not support a substantial improvement in forecast performance through the incorporation of online data, the figure still shows that the variation in the predicted series from the time series regression is more in line with the variation in the observed data than the predictions of the simple AR(1)-model, which rapidly converge toward a constant mean. The predicted series from the model with online data reflects a see-saw pattern similar to the one observed in the real series. This might indicate that online data can in fact be used to predict changes and variations in the car order series for model I. For car model II, we find better predictive power of the online data. The MAPE values in Table 3 show a reduction of the in-sample forecast error by 13.21% and of the out-of-sample forecast error by 58.85%. This indicates a strong predictive performance of the model estimated with online data as explanatory variables. The estimated variances of the car order series under the two approaches also confirm this improvement in model performance. The variance for the simple AR(1)-model was estimated to be $\sigma^2_{II} = 88389.53$ (see Table 1), compared to an estimated variance of $\sigma^2_{II} = 73950.78$ (see Table 2) for the time series regression model


with online data as predictors. This shows that by incorporating online car configurations and online search intensity, the variance could be reduced by 16.34% (from 88389.53 to 73950.78). The respective adjusted $R^2$ from the regression-based estimation (see

appendix B) was calculated to be 0.1769. This also implies that the online data could explain over 17% of the variation in the originally observed car order series. Figure 8b) displays the 12 out-of-sample observations and the predictions of the two approaches for car model II. As for car model I, and even more clearly, the order forecasts provided by the regression model with online data are closer to the actually observed values. Again, the time series regression approach better reflects the see-saw variation in the data. For model II, the results indicate considerable forecasting potential of online configurations and online search intensity.
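The relative improvements discussed in this section follow from the entries of Tables 1 to 3 as (baseline − extended) / baseline. The short Python sketch below recomputes them from the published table values; the function and variable names are hypothetical.

```python
def relative_reduction(baseline, extended):
    """Relative reduction in percent when moving from the baseline to the extended model."""
    return 100.0 * (baseline - extended) / baseline

# (without online data, with online data), values taken from Table 3.
mape_values = {
    ("I", "in-sample"): (0.3119, 0.3010),
    ("I", "out-of-sample"): (0.7738, 0.7320),
    ("II", "in-sample"): (0.4087, 0.3547),
    ("II", "out-of-sample"): (0.8461, 0.3482),
}
# Estimated process variances (Table 1, Table 2).
variances = {"I": (263188.3, 262342.8), "II": (88389.53, 73950.78)}

for (model, sample), (base, ext) in mape_values.items():
    print(f"car model {model}, {sample} MAPE: {relative_reduction(base, ext):.2f}%")
for model, (base, ext) in variances.items():
    print(f"car model {model}, variance: {relative_reduction(base, ext):.2f}%")
```

These are relative changes in the error measure, not percentage-point differences: the out-of-sample MAPE for car model II, for instance, falls by roughly 50 percentage points, which is about 59% of its baseline value.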

Forecast Impact Factor (FIF)

In order to operationalize the predictive power of external variables for forecasting purposes, I introduce an $R^2$-based measure of forecasting performance, the Forecast Impact Factor (FIF). The FIF is defined as an out-of-sample $R^2$: it is the share of the squared forecast errors of the baseline model over the n out-of-sample observations that is accounted for by the regression variables. Thus,

$$FIF_n(x) = 1 - \frac{\sum_{t=T+1}^{T+n} \big(\Delta_t^{REG}\big)^2}{\sum_{t=T+1}^{T+n} \big(\Delta_t^{AR(1)}\big)^2}, \tag{10}$$

with $\Delta_t = y_t - \hat{y}_t$, where $\hat{y}_t$ is the forecast of the applied model (REG or AR(1)). The FIF depends on the number n of out-of-sample observations over which the sums are taken, and on the explanatory variables x. In addition to the overall FIF for a set of explanatory variables x, it could also be of interest to determine the predictive power of a single variable of interest, given that other variables have been available and used for model estimation. The conditional FIF for an explanatory variable $x_j$ given a set of J explanatory variables can then be calculated as

$$FIF_n(x_j \mid x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_J) = 1 - \frac{\sum_{t=T+1}^{T+n} \big(\Delta_t^{REG} + \beta_j x_j\big)^2}{\sum_{t=T+1}^{T+n} \big(\Delta_t^{AR(1)}\big)^2}. \tag{11}$$


Table 4 Unconditional and conditional Forecast Impact Factors

| | Car model I | Car model II |
|---|---|---|
| $FIF_{12}(x_{OC}, x_{SI})$ | 0.0990 | 0.7595 |
| $FIF_{12}(x_{OC} \mid x_{SI})$ | 0.1119 | 0.3900 |
| $FIF_{12}(x_{SI} \mid x_{OC})$ | 0.0943 | 0.6048 |

The calculation of the conditional FIF for a set of covariates is straightforward. With the Forecast Impact Factor, one can easily calculate the forecast impact of one or more variables of interest, unconditionally or conditional on a given set of other regressors, and thus disentangle the impact of each variable in a set of information. For the application in this paper, the unconditional and conditional Forecast Impact Factors for both car models are reported in Table 4. The results are in line with those from the preceding sections: for car model I we get rather low FIFs, compared to rather high ones for car model II.
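Formulas (10) and (11) translate almost literally into code. The following Python sketch takes the out-of-sample forecast errors of the two models as inputs; the function names and the toy numbers are hypothetical and serve only to show the mechanics.

```python
def fif(errors_reg, errors_ar1):
    """Forecast Impact Factor (10): 1 - SSE(regression model) / SSE(AR(1) baseline)."""
    if len(errors_reg) != len(errors_ar1):
        raise ValueError("both error series must cover the same out-of-sample weeks")
    sse_reg = sum(e * e for e in errors_reg)
    sse_ar1 = sum(e * e for e in errors_ar1)
    return 1.0 - sse_reg / sse_ar1

def conditional_fif(errors_reg, errors_ar1, beta_j, x_j):
    """Conditional FIF (11): add beta_j * x_j to the regression errors before squaring."""
    adjusted = [e + beta_j * x for e, x in zip(errors_reg, x_j)]
    return fif(adjusted, errors_ar1)

# Toy forecast errors y_t - yhat_t for n = 4 out-of-sample weeks (made-up numbers).
errors_ar1 = [2.0, -2.0, 1.0, -1.0]
errors_reg = [1.0, -1.0, 0.0, 0.0]
x_j = [2.0, -2.0, 2.0, -2.0]  # covariate values of variable j (hypothetical)
beta_j = 0.5                  # its coefficient (hypothetical)

print(fif(errors_reg, errors_ar1))
print(conditional_fif(errors_reg, errors_ar1, beta_j, x_j))
```

A value near 1 means the regression errors are small relative to the baseline errors, a value near 0 means no gain over the baseline, and negative values are possible when the extended model forecasts worse than the AR(1) baseline.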

To summarize, the results, although less clear-cut for car model I, generally indicate an enormous information potential of online data with respect to forecasting performance. Thus, in the future, firms should embrace the challenge of correctly incorporating online data into their forecasting procedures in order to improve their predictive performance. In the subsequent final section, I discuss general aspects of online data, the general findings of this study, and research and managerial implications, and I motivate further research in related directions.

5. Conclusion

Forecasting has long been an issue in the economic literature. In this paper, I show by example how online data can appropriately be incorporated into a simple business forecast model. Due to the nature of sequentially (weekly) observed car order data, simple time series methods are applied. I investigate how online car configurations available from a manufacturer's webpage, as well as freely available data on internet search behavior, improve car order forecasts. I


can show that information from these online data for one car model can account for over 17% of the variation in the original car order series. The corresponding Forecast Impact Factor of 0.76 provides evidence for the predictive power of online data. Thus, there exist online data that companies can utilize to improve their forecasting models. The goal of this paper is not to postulate a monopoly of online data as predictors of future events, but simply to demonstrate that, compared to a baseline model without any predictors, online data can improve predictive performance, as they can account for a significant portion of the variation in the series of interest.

From a methodological perspective, although methodology is not the main objective of this paper, one may ask whether the models can be further improved by using more complex approaches. For example, in a time series setting, parameters that vary over time could account for changes in the importance of the predictors. Such dynamics in the parameters could more precisely reflect the changing relevance of the internet and consequently of online data; for example, the predictive power of online search queries could increase over time as more people gain regular access to the internet. State-space models in general, and dynamic linear models in particular, which are closely related to time series regression, may be better suited to cope with the additional challenge of system dynamics, which cannot be denied in an environment that changes and develops as fast as the world wide web. Since I only wanted to indicate the growing relevance of online data with respect to predictive power for future trends and events, methodology was not the main contribution of this paper. Further research on related topics incorporating sequentially observed online data should therefore apply methods that cope better with the challenges of a fast growing and fast changing relation structure. Although only simple time series methods have been applied throughout this paper, they sufficed to provide evidence for the predictive power of online data.

Another aspect to be discussed is the determination of the time lags for the explanatory variables, namely online car configurations and online search intensity. In this research, I simply used a data-driven approach, choosing those time lags that minimize the in-sample mean absolute percentage error (MAPE). Future work could consider more theoretically driven approaches, for example by drawing on information about general online search behavior and how far in advance people use the internet in their information search for products prior to purchase. The database for such analyses certainly exists and is growing every day. Another critique might arise from a possible lack of reliability in the data. This issue is difficult to address. But using online search queries and configuration frequencies implies that people show interest in certain topics and/or products. This is different from user-generated content such as product reviews and movie critiques, as those could easily be manipulated to induce certain opinions. The critical part here is that, as shown by Chintagunta et al. (2010), the valence and not the volume tends to be the key driver for predictions. With respect to online search queries this issue seems to be less critical, as the influence of manipulating search intensity, if possible at all, is marginal because in this area it is the volume that matters.
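The data-driven lag choice described above can be illustrated with a minimal sketch. It assumes a plain least-squares fit in place of the Bayesian regression with AR(1) errors used in this paper, and the names `select_lags`, `configs` and `search` are hypothetical. For every pair of lags it fits the regression on a common sample and keeps the pair with the smallest in-sample MAPE.

```python
import numpy as np

def mape(actual, fitted):
    """Mean absolute percentage error, returned as a fraction."""
    actual = np.asarray(actual, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return float(np.mean(np.abs((actual - fitted) / actual)))

def select_lags(y, configs, search, max_lag=24):
    """Grid search over lag pairs; returns (best_mape, config_lag, search_lag).

    Assumption: ordinary least squares stands in for the paper's model.
    """
    y, configs, search = (np.asarray(a, dtype=float) for a in (y, configs, search))
    n = len(y)
    start = max_lag                      # same estimation sample for every pair
    target = y[start:]
    best = None
    for lag_c in range(1, max_lag + 1):
        for lag_s in range(1, max_lag + 1):
            X = np.column_stack([
                np.ones(n - start),                  # intercept
                configs[start - lag_c:n - lag_c],    # configurations, lagged
                search[start - lag_s:n - lag_s],     # search intensity, lagged
            ])
            coef, *_ = np.linalg.lstsq(X, target, rcond=None)
            score = mape(target, X @ coef)
            if best is None or score < best[0]:
                best = (score, lag_c, lag_s)
    return best
```

Using one common estimation sample for all lag pairs keeps the MAPE values comparable, at the cost of discarding the first `max_lag` observations.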

In general, this research stream can be taken a lot further in very different directions. First, the predictive power of online data should definitely be investigated in competition with traditional data incorporated into forecasting models. Such results would reveal the absolute benefit of recording customers' online behavior. Applying the conditional Forecast Impact Factor, introduced in this paper, can then reveal the relevance of online data when competing with traditionally considered indicators, e.g., gasoline prices in the automotive industry. Second, having access to individual-level and sociodemographic data can provide the basis for segment-specific Forecast Impact Factors. One starting point here could be emerging social networks, such as Facebook or Myspace, as users provide a lot of relevant information on their profiles. Recent developments in the field of social networks, such as companies having Facebook profiles, or third-party websites providing analytics for social media, offer a great playground for the kind of analysis motivated in this paper.

In summary, the emerging online community, and hence the corresponding available data, offer a variety of interesting fields for further research. In this paper, my aim was to provide evidence for the predictive power of online data, and to demonstrate how easily such data can be used in forecasting models. The Forecast Impact Factor is a hands-on tool to assess predictive power, and can be used to compare competing alternatives easily. I hope that this work motivates further research in this and related directions.


References

Aaker, D. A., J. M. Carman, R. Jacobson. 1982. Modeling advertising-sales relationships involving feedback: A time series analysis of six cereal brands. Journal of Marketing Research 19(1) 116–125.

Banerjee, A., J. J. Dolado, J. W. Galbraith, D. F. Hendry. 1993. Cointegration, Error Correction, and the Econometric Analysis of Non-Stationary Data. Oxford University Press, Oxford.

Box, G. E. P., G. M. Jenkins, G. C. Reinsel. 2008. Time Series Analysis: Forecasting and Control. 4th ed. John Wiley & Sons, Inc.

Biyalogorsky, E., P. Naik. 2003. Clicks and mortar: The effect of on-line activities on off-line sales. Marketing Letters 14(1) 21–32.

Bucklin, R. E., C. Sismeiro. 2003. A model of web site browsing behavior estimated on clickstream data. Journal of Marketing Research 40(3) 249–267.

Cain, P. M. 2005. Modelling and forecasting brand share: A dynamic demand system approach. International Journal of Research in Marketing 22(2) 203–220.

Chib, S. 1993. Bayes regression with autoregressive errors: A Gibbs sampling approach. Journal of Econometrics 58(3) 275–294.

Chintagunta, P. K., S. Gopinath, S. Venkataraman. 2010. The effects of online user reviews on movie box office performance: Accounting for sequential rollout and aggregation across local markets. Marketing Science 29(5) 944–957.

Davis, R. A., R. Wu. 2009. A negative binomial model for time series of counts. Biometrika 96(3) 735–749.

Decker, R., M. Trusov. 2010. Estimating aggregate consumer preferences from online product reviews. International Journal of Research in Marketing 27(4) 293–307.

Dekimpe, M. G., D. M. Hanssens. 2000. Time-series models in marketing: Past, present and future. International Journal of Research in Marketing 17(2-3) 183–193.

Deleersnyder, B., M. G. Dekimpe, J.-B. E. M. Steenkamp, P. S. H. Leeflang. 2009. The role of national culture in advertising's sensitivity to business cycles: An investigation across continents. Journal of Marketing Research 46(5) 623–636.

Dellarocas, C., X. Zhang, N. Awad. 2007. Exploring the value of online product reviews in forecasting sales: The case of motion pictures. Journal of Interactive Marketing 21(4) 23–45.

Dhar, V., E. A. Chang. 2009. Does chatter matter? The impact of user-generated content on music sales. Journal of Interactive Marketing 23(4) 300–307.

Dickey, D. A., W. A. Fuller. 1979. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74(366) 427–431.

Drost, F. C., R. van den Akker, B. J. M. Werker. 2009. Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models. Journal of the Royal Statistical Society / Series B 71(2) 467–485.

Enciso-Mora, V., P. Neal, T. Subba Rao. 2009. Efficient order selection algorithms for integer-valued ARMA processes. Journal of Time Series Analysis 30(1) 1–18.

Franses, P. H. 1991. Primary demand for beer in the Netherlands: An application of ARMAX model specification. Journal of Marketing Research 28(2) 240–245.

Franses, P. H. 1994. Modeling new product sales; an application of cointegration analysis. International Journal of Research in Marketing 11(5) 491–502.

Freeland, R. K. 2009. True integer value time series. AStA Advances in Statistical Analysis 94(3) 217–229.

Freeland, R. K., B. P. M. McCabe. 2004a. Analysis of low count time series data by Poisson autoregression. Journal of Time Series Analysis 25(5) 701–722.

Freeland, R. K., B. P. M. McCabe. 2004b. Forecasting discrete valued low count time series. International Journal of Forecasting 20(3) 427–434.

Geman, S., D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Analysis and Machine Intelligence 6 721–741.

Ginsberg, J., M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski, L. Brilliant. 2009. Detecting influenza epidemics using search engine query data. Nature 457(7232) 1012–1014.

Gruhl, D., R. Guha, R. Kumar, J. Novak, A. Tomkins. 2005. The predictive power of online chatter. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. KDD '05, ACM, New York, NY, USA, 78–87.

Hanssens, D. M. 1980. Market response, competitive behavior, and time series analysis. Journal of Marketing Research 17(4) 470–485.

Hanssens, D. M. 1998. Order forecasts, retail sales, and the marketing mix for consumer durables. Journal of Forecasting 17(3-4) 327–346.

Heil, O., D. Lehmann, S. Stremersch. 2010. Marketing competition in the 21st century. International Journal of Research in Marketing 27(2) 161–163.

ITU, (International Telecommunication Union). 2010. Measuring the information society. Tech. rep., UN Agency for Information and Communication Technologies.

Jung, R. C., A. R. Tremayne. 2003. Testing for serial dependence in time series models of counts. Journal of Time Series Analysis 24(1) 65–84.

Jung, R. C., A. R. Tremayne. 2006. Binomial thinning models for integer time series. Statistical Modelling 6 81–96.

Karlis, D., I. Ntzoufras. 2006. Bayesian analysis of the differences of count data. Statistics in Medicine 25(11) 1885–1905.

Kim, H., Y. Park. 2008. A non-stationary integer-valued autoregressive model. Statistical Papers 49(3) 485–502.

Klein, L. R., G. T. Ford. 2003. Consumer search for information in the digital age: An empirical study of prepurchase search for automobiles. Journal of Interactive Marketing 17(3) 29–49.

Lim, J., I. S. Currim, R. L. Andrews. 2005. Consumer heterogeneity in the longer-term effects of price promotions. International Journal of Research in Marketing 22(4) 441–457.

Makridakis, S., S. C. Wheelwright. 1977. Forecasting: Issues & challenges for marketing management. Journal of Marketing 41(4) 24–38.

Millar, R. B. 2009. Comparison of hierarchical Bayesian models for overdispersed count data using DIC and Bayes' factors. Biometrics 65(3) 962–969.

Moe, W. W. 2003. Buying, searching, or browsing: Differentiating between online shoppers using in-store navigational clickstream. Journal of Consumer Psychology 13(1/2) 29–39.

Moe, W. W. 2006. An empirical two-stage choice model with varying decision rules applied to internet clickstream data. Journal of Marketing Research 43(4) 680–692.

Montgomery, A. L. 1999. Using clickstream data to predict WWW usage. Working paper, Graduate School of Industrial Administration, Carnegie Mellon University.

Montgomery, A. L. 2001. Applying quantitative marketing techniques to the internet. Interfaces 31(2) 90–108.

Montgomery, A. L., S. Li, K. Srinivasan, J. C. Liechty. 2004. Modeling online browsing and path analysis using clickstream data. Marketing Science 23(4) 579–595.

Ratchford, B. T., M.-S. Lee, D. Talukdar. 2003. The impact of the internet on information search for automobiles. Journal of Marketing Research 40(2) 193–209.

Ratchford, B. T., D. Talukdar, M.-S. Lee. 2001. A model of consumer choice of the internet as an information source. International Journal of Electronic Commerce 5(3) 7–21.

Silva, N., I. Pereira, M. E. Silva. 2009. Forecasting in INAR(1) model. REVSTAT – Statistical Journal 7(1) 119–134.

Sismeiro, C., R. E. Bucklin. 2004. Modeling purchase behavior at an e-commerce web site: A task-completion approach. Journal of Marketing Research 41(3) 306–323.

Srinivasan, S., M. Vanhuele, K. Pauwels. 2010. Mind-set metrics in market response models: An integrative approach. Journal of Marketing Research 47(4) 672–684.

Wang, F., X.-P. (Steven) Zhang. 2008. Reasons for market evolution and budgeting implications. Journal of Marketing 72(5) 15–30.

Weiss, C. 2009. Modelling time series of counts with overdispersion. Statistical Methods & Applications 18 507–519.

Winters, P. R. 1960. Forecasting sales by exponentially weighted moving averages. Management Science 6(3) 324–342.

Zhu, F., X. (Michael) Zhang. 2010. Impact of online consumer reviews on sales: The moderating role of product and consumer characteristics. Journal of Marketing 74(2) 133–148.

Zhu, R., H. Joe. 2006. Modelling count data time series with Markov processes based on binomial thinning. Journal of Time Series Analysis 27(5) 725–738.


A. Tables and figures

Table A-1 Out-of-sample forecasts and 95% upper and lower bounds - without online data -

Car model I

n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
 1            768          1079.59    74.09           2085.09
 2            777          1203.40    99.05           2307.74
 3            780          1259.63   135.97           2383.28
 4            788          1285.17   157.56           2412.77
 5            787          1296.76   168.35           2425.18
 6            751          1302.03   173.45           2430.61
 7            999          1304.43   175.81           2433.04
 8            773          1305.51   176.89           2434.13
 9            739          1306.01   177.38           2434.63
10            948          1306.23   177.61           2434.85
11            534          1306.33   177.71           2434.96
12            419          1306.38   177.76           2435.00

Car model II

n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
 1            340          505.04    0 [-77.66]       1087.75
 2            325          563.95    0 [-62.05]       1189.96
 3            339          587.08    0 [-45.33]       1219.49
 4            348          596.16    0 [-37.24]       1229.56
 5            353          599.72    0 [-33.82]       1233.27
 6            330          601.12    0 [-32.45]       1234.70
 7            425          601.67    0 [-31.90]       1235.25
 8            305          601.89    0 [-31.69]       1235.47
 9            296          601.97    0 [-31.60]       1235.55
10            403          602.01    0 [-31.57]       1235.58
11            243          602.02    0 [-31.56]       1235.60
12            234          602.03    0 [-31.55]       1235.60

Note: Negative values have been truncated to zero as car orders are always greater than or equal to zero.
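The truncation described in the note can be illustrated with a small sketch. It assumes the textbook n-step-ahead forecast of an AR(1) process around a constant mean, with plug-in parameter values and normal-approximation bounds; the function name `ar1_forecast` and all input values are hypothetical, and this is not necessarily the exact procedure behind the table.

```python
import math

def ar1_forecast(last_obs, mu, phi, sigma2, horizon=12, z=1.96):
    """n-step-ahead forecasts with bounds truncated at zero.

    Assumption: y_t - mu follows an AR(1) process with coefficient phi
    and innovation variance sigma2.
    """
    rows = []
    for n in range(1, horizon + 1):
        # Point forecast decays geometrically towards the mean mu.
        point = mu + phi ** n * (last_obs - mu)
        # Forecast-error variance: sigma2 * (1 + phi^2 + ... + phi^(2(n-1))).
        var_n = sigma2 * (1.0 - phi ** (2 * n)) / (1.0 - phi ** 2)
        half_width = z * math.sqrt(var_n)
        lower = max(0.0, point - half_width)   # orders cannot be negative
        rows.append((n, point, lower, point + half_width))
    return rows

# Hypothetical inputs, for illustration only.
for n, point, lower, upper in ar1_forecast(400.0, 600.0, 0.39, 88000.0, horizon=3):
    print(f"{n:2d}  {point:8.2f}  {lower:8.2f}  {upper:8.2f}")
```

Under these assumptions the point forecast converges to the mean level as the horizon grows, and the interval width converges to a constant.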

Table A-2 In-sample mean absolute percentage error (MAPE) - Car model I

Rows: lag of Google search intensity. Columns: lag of online car configurations.

Online car configurations, Lag 1 to Lag 12

         Lag 1   Lag 2   Lag 3   Lag 4   Lag 5   Lag 6   Lag 7   Lag 8   Lag 9   Lag 10  Lag 11  Lag 12
Lag 1    0.3086  0.3096  0.3093  0.3093  0.3150  0.3125  0.3102  0.3065  0.3112  0.3090  0.3082  0.3071
Lag 2    0.3170  0.3166  0.3165  0.3141  0.3181  0.3160  0.3138  0.3118  0.3112  0.3105  0.3119  0.3091
Lag 3    0.3141  0.3150  0.3160  0.3130  0.3173  0.3133  0.3135  0.3085  0.3098  0.3091  0.3085  0.3086
Lag 4    0.3117  0.3119  0.3102  0.3100  0.3153  0.3164  0.3127  0.3101  0.3124  0.3109  0.3090  0.3077
Lag 5    0.3158  0.3159  0.3169  0.3142  0.3168  0.3159  0.3112  0.3115  0.3120  0.3108  0.3115  0.3103
Lag 6    0.3163  0.3162  0.3162  0.3146  0.3192  0.3167  0.3127  0.3116  0.3110  0.3100  0.3103  0.3094
Lag 7    0.3128  0.3121  0.3119  0.3109  0.3158  0.3141  0.3109  0.3085  0.3084  0.3105  0.3075  0.3074
Lag 8    0.3161  0.3162  0.3161  0.3131  0.3186  0.3156  0.3138  0.3118  0.3102  0.3098  0.3107  0.3096
Lag 9    0.3145  0.3152  0.3143  0.3133  0.3183  0.3160  0.3131  0.3111  0.3100  0.3111  0.3112  0.3097
Lag 10   0.3163  0.3157  0.3166  0.3147  0.3188  0.3150  0.3138  0.3126  0.3114  0.3113  0.3101  0.3104
Lag 11   0.3162  0.3170  0.3170  0.3163  0.3207  0.3169  0.3128  0.3124  0.3097  0.3099  0.3123  0.3098
Lag 12   0.3117  0.3119  0.3106  0.3100  0.3141  0.3130  0.3106  0.3062  0.3073  0.3064  0.3034  0.3030
Lag 13   0.3158  0.3154  0.3170  0.3145  0.3172  0.3156  0.3122  0.3121  0.3092  0.3101  0.3114  0.3125
Lag 14   0.3148  0.3162  0.3155  0.3119  0.3162  0.3155  0.3114  0.3114  0.3098  0.3091  0.3091  0.3104
Lag 15   0.3141  0.3155  0.3159  0.3130  0.3178  0.3161  0.3119  0.3113  0.3112  0.3101  0.3117  0.3096
Lag 16   0.3164  0.3154  0.3166  0.3138  0.3153  0.3165  0.3118  0.3122  0.3094  0.3096  0.3124  0.3086
Lag 17   0.3135  0.3114  0.3101  0.3092  0.3139  0.3108  0.3072  0.3087  0.3065  0.3060  0.3078  0.3049
Lag 18   0.3161  0.3155  0.3157  0.3141  0.3184  0.3159  0.3104  0.3108  0.3100  0.3097  0.3116  0.3096
Lag 19   0.3117  0.3111  0.3113  0.3063  0.3091  0.3081  0.3036  0.3045  0.3044  0.3052  0.3039  0.3032
Lag 20   0.3167  0.3151  0.3152  0.3132  0.3184  0.3151  0.3116  0.3106  0.3089  0.3110  0.3107  0.3112
Lag 21   0.3150  0.3137  0.3164  0.3136  0.3170  0.3141  0.3118  0.3098  0.3082  0.3092  0.3102  0.3103
Lag 22   0.3178  0.3146  0.3171  0.3142  0.3150  0.3166  0.3128  0.3098  0.3102  0.3097  0.3109  0.3076
Lag 23   0.3179  0.3169  0.3161  0.3126  0.3167  0.3161  0.3118  0.3109  0.3103  0.3103  0.3107  0.3103
Lag 24   0.3174  0.3153  0.3164  0.3135  0.3178  0.3156  0.3136  0.3111  0.3097  0.3105  0.3114  0.3099

Online car configurations, Lag 13 to Lag 24

         Lag 13  Lag 14  Lag 15  Lag 16  Lag 17  Lag 18  Lag 19  Lag 20  Lag 21  Lag 22  Lag 23  Lag 24
Lag 1    0.3082  0.3050  0.3092  0.3081  0.3059  0.3079  0.3102  0.3093  0.3149  0.3040  0.3135  0.3215
Lag 2    0.3111  0.3038  0.3152  0.3128  0.3095  0.3129  0.3180  0.3162  0.3153  0.3096  0.3173  0.3190
Lag 3    0.3089  0.3034  0.3132  0.3113  0.3060  0.3112  0.3139  0.3129  0.3164  0.3082  0.3170  0.3168
Lag 4    0.3091  0.3079  0.3120  0.3120  0.3107  0.3113  0.3139  0.3121  0.3124  0.3092  0.3151  0.3197
Lag 5    0.3103  0.3067  0.3143  0.3130  0.3096  0.3158  0.3182  0.3150  0.3170  0.3097  0.3193  0.3205
Lag 6    0.3111  0.3046  0.3149  0.3137  0.3092  0.3119  0.3182  0.3156  0.3181  0.3108  0.3173  0.3195
Lag 7    0.3084  0.3030  0.3112  0.3100  0.3066  0.3104  0.3114  0.3127  0.3143  0.3063  0.3133  0.3181
Lag 8    0.3107  0.3045  0.3141  0.3128  0.3101  0.3129  0.3188  0.3149  0.3177  0.3107  0.3174  0.3200
Lag 9    0.3101  0.3058  0.3147  0.3142  0.3087  0.3139  0.3166  0.3158  0.3151  0.3096  0.3171  0.3201
Lag 10   0.3114  0.3052  0.3149  0.3139  0.3092  0.3140  0.3163  0.3167  0.3168  0.3104  0.3176  0.3199
Lag 11   0.3118  0.3039  0.3140  0.3120  0.3090  0.3127  0.3182  0.3155  0.3174  0.3109  0.3182  0.3196
Lag 12   0.3061  0.3011  0.3074  0.3092  0.3036  0.3069  0.3110  0.3105  0.3108  0.3042  0.3116  0.3140
Lag 13   0.3113  0.3027  0.3153  0.3150  0.3099  0.3148  0.3187  0.3170  0.3157  0.3116  0.3181  0.3187
Lag 14   0.3117  0.3048  0.3142  0.3132  0.3094  0.3138  0.3161  0.3157  0.3151  0.3110  0.3158  0.3188
Lag 15   0.3118  0.3046  0.3149  0.3146  0.3094  0.3147  0.3175  0.3153  0.3159  0.3105  0.3163  0.3200
Lag 16   0.3111  0.3043  0.3156  0.3140  0.3096  0.3136  0.3181  0.3171  0.3172  0.3104  0.3173  0.3196
Lag 17   0.3089  0.3033  0.3110  0.3114  0.3078  0.3099  0.3126  0.3124  0.3116  0.3096  0.3130  0.3171
Lag 18   0.3102  0.3010  0.3149  0.3142  0.3071  0.3150  0.3176  0.3163  0.3169  0.3092  0.3172  0.3197
Lag 19   0.3052  0.3010  0.3080  0.3088  0.3046  0.3081  0.3094  0.3114  0.3102  0.3072  0.3115  0.3123
Lag 20   0.3118  0.3011  0.3150  0.3134  0.3096  0.3146  0.3206  0.3160  0.3179  0.3101  0.3186  0.3204
Lag 21   0.3093  0.3047  0.3145  0.3125  0.3087  0.3135  0.3169  0.3164  0.3160  0.3105  0.3165  0.3184
Lag 22   0.3102  0.3055  0.3140  0.3149  0.3100  0.3142  0.3183  0.3175  0.3153  0.3127  0.3185  0.3211
Lag 23   0.3113  0.3043  0.3132  0.3141  0.3096  0.3120  0.3189  0.3180  0.3168  0.3106  0.3174  0.3203
Lag 24   0.3105  0.3039  0.3137  0.3149  0.3095  0.3135  0.3216  0.3176  0.3190  0.3109  0.3187  0.3211

Table A-3 In-sample mean absolute percentage error (MAPE) - Car model II

Rows: lag of Google search intensity. Columns: lag of online car configurations.

Online car configurations, Lag 1 to Lag 12

         Lag 1   Lag 2   Lag 3   Lag 4   Lag 5   Lag 6   Lag 7   Lag 8   Lag 9   Lag 10  Lag 11  Lag 12
Lag 1    0.3895  0.3985  0.3981  0.3960  0.4044  0.3989  0.3988  0.4064  0.4029  0.3995  0.3993  0.4034
Lag 2    0.3766  0.3873  0.3913  0.3860  0.3959  0.3899  0.3923  0.4016  0.3960  0.3927  0.3923  0.3993
Lag 3    0.3843  0.3890  0.3917  0.3865  0.3993  0.3939  0.3912  0.4035  0.3983  0.3926  0.3963  0.4004
Lag 4    0.3920  0.3972  0.3981  0.3975  0.4049  0.4002  0.3992  0.4050  0.4055  0.4027  0.4026  0.4055
Lag 5    0.3740  0.3810  0.3789  0.3753  0.3879  0.3832  0.3832  0.3916  0.3851  0.3840  0.3839  0.3865
Lag 6    0.3817  0.3873  0.3882  0.3822  0.3910  0.3863  0.3884  0.3973  0.3926  0.3913  0.3922  0.3938
Lag 7    0.3891  0.3936  0.3934  0.3939  0.3975  0.3927  0.3928  0.4036  0.3989  0.3961  0.3958  0.4017
Lag 8    0.3765  0.3821  0.3809  0.3785  0.3849  0.3773  0.3794  0.3894  0.3849  0.3835  0.3818  0.3887
Lag 9    0.3895  0.3969  0.3973  0.3942  0.4020  0.3960  0.3969  0.4076  0.4024  0.4023  0.3999  0.4045
Lag 10   0.3605  0.3719  0.3706  0.3685  0.3736  0.3671  0.3668  0.3761  0.3689  0.3695  0.3698  0.3751
Lag 11   0.3787  0.3819  0.3824  0.3807  0.3883  0.3828  0.3853  0.3948  0.3888  0.3853  0.3870  0.3906
Lag 12   0.3829  0.3877  0.3862  0.3866  0.3908  0.3876  0.3914  0.3928  0.3904  0.3861  0.3852  0.3926
Lag 13   0.3720  0.3787  0.3797  0.3755  0.3814  0.3794  0.3826  0.3908  0.3843  0.3806  0.3780  0.3819
Lag 14   0.3843  0.3933  0.3934  0.3920  0.3958  0.3902  0.3938  0.4032  0.3999  0.3940  0.3931  0.3987
Lag 15   0.3687  0.3724  0.3793  0.3708  0.3820  0.3742  0.3746  0.3828  0.3818  0.3755  0.3740  0.3804
Lag 16   0.3892  0.3969  0.3965  0.3940  0.4018  0.3947  0.3932  0.4036  0.4013  0.3985  0.4004  0.4018
Lag 17   0.3891  0.3925  0.3948  0.3899  0.3979  0.3954  0.3934  0.3998  0.3969  0.3937  0.3933  0.3981
Lag 18   0.3859  0.3880  0.3865  0.3856  0.3891  0.3895  0.3877  0.3965  0.3912  0.3899  0.3873  0.3933
Lag 19   0.3910  0.3995  0.4031  0.3984  0.4048  0.3991  0.3993  0.4099  0.4068  0.3998  0.4009  0.4065
Lag 20   0.3705  0.3819  0.3839  0.3804  0.3835  0.3857  0.3798  0.3837  0.3850  0.3848  0.3814  0.3855
Lag 21   0.3784  0.3811  0.3857  0.3829  0.3854  0.3848  0.3833  0.3926  0.3890  0.3851  0.3832  0.3889
Lag 22   0.3932  0.4025  0.4016  0.3970  0.4049  0.4024  0.3984  0.4071  0.4073  0.4041  0.4014  0.4075
Lag 23   0.3873  0.3920  0.3957  0.3923  0.3971  0.3928  0.3899  0.4013  0.3966  0.3920  0.3948  0.3977
Lag 24   0.3889  0.3951  0.3964  0.3952  0.3980  0.3950  0.3946  0.4008  0.3989  0.3994  0.3949  0.4020

Online car configurations, Lag 13 to Lag 24

         Lag 13  Lag 14  Lag 15  Lag 16  Lag 17  Lag 18  Lag 19  Lag 20  Lag 21  Lag 22  Lag 23  Lag 24
Lag 1    0.4099  0.3938  0.3988  0.4050  0.4000  0.4034  0.4029  0.3978  0.4051  0.4019  0.3893  0.3952
Lag 2    0.4071  0.3872  0.3939  0.3951  0.3969  0.3988  0.3987  0.3892  0.3994  0.3941  0.3842  0.3889
Lag 3    0.4075  0.3908  0.3946  0.3987  0.3962  0.4013  0.4001  0.3926  0.4007  0.3958  0.3837  0.3909
Lag 4    0.4142  0.3946  0.4020  0.4050  0.4015  0.4063  0.4063  0.3970  0.4050  0.4012  0.3883  0.3960
Lag 5    0.3949  0.3743  0.3818  0.3882  0.3844  0.3899  0.3895  0.3821  0.3868  0.3844  0.3722  0.3780
Lag 6    0.4017  0.3848  0.3877  0.3932  0.3915  0.3918  0.3937  0.3871  0.3919  0.3872  0.3764  0.3816
Lag 7    0.4070  0.3866  0.3965  0.3981  0.3948  0.3998  0.4022  0.3916  0.4008  0.3955  0.3805  0.3886
Lag 8    0.3929  0.3789  0.3788  0.3847  0.3821  0.3851  0.3844  0.3775  0.3848  0.3825  0.3708  0.3728
Lag 9    0.4125  0.3941  0.3958  0.4019  0.3997  0.4017  0.4046  0.3962  0.3993  0.3974  0.3849  0.3911
Lag 10   0.3793  0.3633  0.3685  0.3678  0.3706  0.3749  0.3732  0.3646  0.3686  0.3648  0.3547  0.3668
Lag 11   0.3981  0.3810  0.3833  0.3890  0.3858  0.3882  0.3902  0.3784  0.3855  0.3838  0.3660  0.3742
Lag 12   0.3939  0.3799  0.3853  0.3901  0.3890  0.3879  0.3904  0.3805  0.3830  0.3828  0.3748  0.3701
Lag 13   0.3909  0.3744  0.3784  0.3822  0.3817  0.3854  0.3843  0.3761  0.3837  0.3757  0.3708  0.3755
Lag 14   0.4055  0.3858  0.3897  0.3972  0.3949  0.3994  0.3971  0.3873  0.3951  0.3907  0.3781  0.3879
Lag 15   0.3820  0.3660  0.3704  0.3764  0.3774  0.3763  0.3781  0.3709  0.3743  0.3698  0.3608  0.3648
Lag 16   0.4090  0.3898  0.3945  0.3987  0.3971  0.4022  0.4004  0.3927  0.4012  0.3968  0.3830  0.3878
Lag 17   0.4047  0.3889  0.3916  0.3943  0.3904  0.3942  0.3967  0.3843  0.3933  0.3915  0.3773  0.3848
Lag 18   0.4010  0.3788  0.3881  0.3892  0.3864  0.3872  0.3914  0.3826  0.3827  0.3823  0.3701  0.3763
Lag 19   0.4122  0.3944  0.3998  0.4028  0.3982  0.3994  0.4029  0.3913  0.3990  0.3947  0.3830  0.3906
Lag 20   0.3921  0.3791  0.3813  0.3838  0.3815  0.3802  0.3827  0.3716  0.3723  0.3778  0.3555  0.3663
Lag 21   0.3961  0.3744  0.3811  0.3842  0.3805  0.3815  0.3818  0.3713  0.3720  0.3740  0.3618  0.3651
Lag 22   0.4121  0.3938  0.4004  0.4046  0.4020  0.4065  0.4079  0.3962  0.4034  0.4011  0.3866  0.3959
Lag 23   0.4069  0.3871  0.3937  0.3954  0.3954  0.3980  0.3944  0.3905  0.3905  0.3939  0.3755  0.3845
Lag 24   0.4074  0.3888  0.3944  0.3984  0.3942  0.3990  0.3996  0.3878  0.3959  0.3909  0.3774  0.3842

Table A-4 Out-of-sample forecasts and 95% upper and lower bounds - with online data -

Car model I

n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
 1            768          1070.06    66.17           2073.94
 2            777          1199.41    79.06           2319.75
 3            780          1363.70   216.58           2510.83
 4            788          1307.93   154.32           2461.53
 5            787          1237.06    81.87           2392.25
 6            751          1210.82    55.24           2366.40
 7            999          1271.86   116.19           2427.54
 8            773          1295.27   139.57           2450.97
 9            739          1282.71   127.01           2438.41
10            948          1236.68    80.98           2392.39
11            534          1224.59    68.88           2380.29
12            419          1245.34    89.63           2401.04

Car model II

n-step-ahead  Observation  Forecast  95%-Lower bound  95%-Upper bound
 1            340          438.51    0 [-94.48]        971.50
 2            325          409.61    0 [-140.04]       959.28
 3            339          362.76    0 [-187.94]       913.45
 4            348          344.63    0 [-206.13]       895.39
 5            353          381.00    0 [-169.76]       931.77
 6            330          419.15    0 [-131.61]       969.92
 7            425          427.60    0 [-123.16]       978.37
 8            305          493.79    0 [-56.98]       1044.55
 9            296          482.34    0 [-68.42]       1033.11
10            403          410.63    0 [-140.14]       961.39
11            243          419.68    0 [-131.08]       970.45
12            234          514.50    0 [-36.26]       1065.27

Note: Negative values have been truncated to zero as car orders are always greater than or equal to zero.

Figure A-1 Traceplots and posterior densities for model parameters - Car model I - without online data -

[Figure: panels a) to f) show, for each parameter, a traceplot (parameter value against iteration) and the posterior density. Posterior summaries printed in the panels:]

$\mu_I$ (panels a, b): $E(\mu_I) = 1306.42$, 95% HPD interval [1096.26, 1518.7]
$\phi_I$ (panels c, d): $E(\phi_I) = 0.45419$, 95% HPD interval [0.27765, 0.65604]
$\sigma_I^2$ (panels e, f): $E(\sigma_I^2) = 263188.3$, 95% HPD interval [182807.89, 346532.52]
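The 95% HPD intervals reported in the figure panels can be approximated directly from posterior draws. The following is a minimal sketch, assuming a unimodal posterior and using a hypothetical helper `hpd_interval`; it returns the shortest interval that contains the requested share of the sorted draws.

```python
import math

def hpd_interval(draws, mass=0.95):
    """Shortest interval covering `mass` of the draws (unimodal case)."""
    xs = sorted(draws)
    n = len(xs)
    k = max(1, math.ceil(mass * n))          # draws that must lie inside
    # Width of every window of k consecutive sorted draws.
    widths = [(xs[i + k - 1] - xs[i], i) for i in range(n - k + 1)]
    _, i = min(widths)                       # narrowest window wins
    return xs[i], xs[i + k - 1]
```

For a skewed posterior, such as that of a variance parameter, this interval generally differs from the equal-tailed interval obtained from the 2.5% and 97.5% quantiles.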

Figure A-2 Traceplots and posterior densities for model parameters - Car model II - without online data -

[Figure: panels a) to f) show, for each parameter, a traceplot (parameter value against iteration) and the posterior density. Posterior summaries printed in the panels:]

$\mu_{II}$ (panels a, b): $E(\mu_{II}) = 602.03$, 95% HPD interval [515.55, 695.41]
$\phi_{II}$ (panels c, d): $E(\phi_{II}) = 0.3926$, 95% HPD interval [0.22689, 0.56948]
$\sigma_{II}^2$ (panels e, f): $E(\sigma_{II}^2) = 88389.53$, 95% HPD interval [68675.02, 112781.34]

Figure A-3 Traceplots and posterior densities for model parameters - Car model I - with online data -

[Figure: panels a) to j) show, for each parameter, a traceplot (parameter value against iteration) and the posterior density. Posterior summaries printed in the panels:]

$\mu_I$ (panels a, b): $E(\mu_I) = 1279.15$, 95% HPD interval [1033.65, 1561.43]
$\beta_I^{(OC)}$ (panels c, d): $E(\beta_I^{(OC)}) = 0.54959$, 95% HPD interval [−0.1368, 1.3286]
$\beta_I^{(SI)}$ (panels e, f): $E(\beta_I^{(SI)}) = -3.2163$, 95% HPD interval [−15.85, 11.13]
$\phi_I$ (panels g, h): $E(\phi_I) = 0.49546$, 95% HPD interval [0.27765, 0.65604]
$\sigma_I^2$ (panels i, j): $E(\sigma_I^2) = 262342.8$, 95% HPD interval [189475.93, 347134.3]

Figure A-4 Traceplots and posterior densities for model parameters - Car model II - with online data -

[Figure: panels a) to j) show, for each parameter, a traceplot (parameter value against iteration) and the posterior density. Posterior summaries printed in the panels:]

$\mu_{II}$ (panels a, b): $E(\mu_{II}) = 578.2$, 95% HPD interval [513.26, 648.59]
$\beta_{II}^{(OC)}$ (panels c, d): $E(\beta_{II}^{(OC)}) = 0.74787$, 95% HPD interval [0.13, 1.25]
$\beta_{II}^{(SI)}$ (panels e, f): $E(\beta_{II}^{(SI)}) = -14.4326$, 95% HPD interval [−21.02, −7.49]
$\phi_{II}$ (panels g, h): $E(\phi_{II}) = 0.252$, 95% HPD interval [0.07873, 0.42324]
$\sigma_{II}^2$ (panels i, j): $E(\sigma_{II}^2) = 73950.78$, 95% HPD interval [56503.31, 93591.92]


B. MCMC sampling

I consider the following model, in which an observation $y_t$ at time $t$ is generated by a regression model with autocorrelated errors of order one:

$$
\begin{aligned}
y_t &= x_t'\theta + u_t \\
u_t &= \phi u_{t-1} + \varepsilon_t \\
\varepsilon_t &\sim N(0, \sigma^2), \quad \forall\, t \in \{1, \dots, T\}
\end{aligned}
\tag{B-1}
$$

where $x_t = (1, x_1, \dots, x_k)'$ is a $((k+1) \times 1)$-vector of $k$ covariates and a constant for the intercept, $\theta = (\mu, \beta_1, \dots, \beta_k)' \in S_\theta$, and $\phi \in S_\phi$ is the coefficient of the AR(1) error process. The parameters $\theta$ and $\phi$ are defined on their supports $S_\theta$ and $S_\phi$, respectively. I follow a Bayesian approach to estimate posterior distributions for the model parameters $\theta = (\mu, \beta_1, \dots, \beta_k)'$, $\phi$ and $\sigma^2$. Let therefore $y_{1:t}$ denote all the observations up to time $t$, and $y_{t|1:t-1}$ the expected value of the $t$th observation given all the past information on the time series and the covariates. Because I only consider a process with an autocorrelated structure of order one, this implies that $y_{t|1:t-1} = y_{t|t-1}$, with $y_{t|t-1} = x_t'\theta + \phi u_{t-1}$. The likelihood of the observed time series $Y = \{y_t\}_{t=2}^{T}$ of car orders, conditional on the initial observation $y_1$ and the parameters $\theta$, $\phi$, and $\sigma^2$, is then given by

$$
\begin{aligned}
L(Y \mid y_1, \theta, \phi, \sigma^2) &= \prod_{t=2}^{T} f(y_t \mid y_{t-1}, \theta, \phi, \sigma^2) \\
&\propto \sigma^{-(T-1)} \exp\left( -\frac{1}{2\sigma^2} \sum_{t=2}^{T} \left( y_t - y_{t|t-1} \right)^2 \right)
\end{aligned}
\tag{B-2}
$$
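As an illustration of the conditional likelihood, the following minimal sketch evaluates its logarithm, including the normalizing constant. It assumes that the one-step-ahead prediction error equals the innovation $u_t - \phi u_{t-1}$ with $u_t = y_t - x_t'\theta$; the function name `conditional_loglik` and the array layout (one row of `X` per time point, first column equal to one) are hypothetical.

```python
import numpy as np

def conditional_loglik(y, X, theta, phi, sigma2):
    """Log-likelihood of y[1:], conditional on y[0], under AR(1) errors."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    u = y - X @ theta                 # regression errors u_t
    eps = u[1:] - phi * u[:-1]        # innovations, t = 2, ..., T
    m = len(eps)                      # T - 1 terms enter the likelihood
    return float(-0.5 * m * np.log(2.0 * np.pi * sigma2)
                 - 0.5 * (eps @ eps) / sigma2)
```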

In order to sample from the joint posterior distribution of the model parameters $\theta = (\mu, \beta)'$, $\phi$ and $\sigma^2$, I use Gibbs sampling (Geman and Geman 1984) as described in Chib (1993) for Bayesian regression models with autocorrelated errors. Given the prior information on the unknown parameters, $\pi(\theta, \phi, \sigma^2)$, and applying Bayes' theorem, the joint posterior distribution of interest is given by

$$
\pi(\theta, \phi, \sigma^2 \mid Y) \propto L(Y \mid y_1, \theta, \phi, \sigma^2)\, \pi(\theta, \phi, \sigma^2)
\tag{B-3}
$$

with a normalizing constant given by the often analytically intractable integral

$$
K = \int L(Y \mid y_1, \theta, \phi, \sigma^2)\, \pi(\theta, \phi, \sigma^2)\, d\theta\, d\phi\, d\sigma^2 .
$$

To circumvent these difficult posterior computations, I exploit the convenient conditional structure of the model, which allows me to use a Gibbs sampler as an iterative Monte Carlo method (Geman and Geman 1984). Hence, I can draw samples from the full conditional distributions of the parameters, which, once the sampler has converged, yields draws from the joint posterior distribution of the parameters. Suppose that, for our model, the prior distributions are such that

$$
\pi(\theta, \phi, \sigma^2) = \pi(\theta)\, \pi(\phi)\, \pi(\sigma^2)
\tag{B-4}
$$

which asserts that $\theta$, $\phi$, and $\sigma^2$ are a priori independent. Let then

$$
\theta \sim N_{(k+1)}(\eta, C)\, I(\theta \in S_\theta), \quad
\phi \sim N(\phi_0, \Phi_0)\, I(\phi \in S_\phi), \quad
\sigma^2 \sim IG(a_0, b_0)
\tag{B-5}
$$

be the respective prior distributions for the unknown model parameters, with $I(E)$ equal to one if the event $E$ is true, and zero otherwise. These prior distributions are a combination of a multivariate normal distribution for $\theta$ truncated to the region $S_\theta$, a normal distribution for $\phi$ truncated to the region $S_\phi$, and an inverse gamma distribution for the variance parameter $\sigma^2$. In our specific application to weekly car orders, I set $S_\theta = [0, \infty) \times \mathbb{R}^2$, which ensures that the local level constant remains non-negative, as car orders cannot become negative. The truncation region for $\phi$ is set to $S_\phi = (-1, 1)$, which implies a stationary error process. The indicator functions can simply be dropped if the restrictions are not imposed. I use diffuse, non-informative priors over the model parameters $\theta$, $\phi$ and $\sigma^2$, setting the hyperparameters in (B-5) to $\eta = (0, 0, 0)'$, $C_{(1,1)} = 10^6$, $C_{(2,2)} = C_{(3,3)} = 10^2$, $C_{(i,j)} = 0$ $\forall\, i \neq j$, $i, j = 1, 2, 3$, $a_0 = 1$, $b_0 = 10$, $\phi_0 = 0$, and $\Phi_0 = 1$.

In order to apply a simple Gibbs sampling algorithm, as pointed out in Chib (1993), the variables $y_t$ are transformed to $y_t^* = y_t - \phi y_{t-1}$. Thus, the model in (B-1) becomes

$$
y_t - \phi y_{t-1} = x_t'\theta - \phi x_{t-1}'\theta + u_t - \phi u_{t-1}
\tag{B-6}
$$

From (B-1), it also follows that $\varepsilon_t = u_t - \phi u_{t-1}$, with $\varepsilon_t \sim N(0, \sigma^2)$. Using this result together with (B-6), it is easy to confirm that the $y_t^* \mid y_{1:t-1} \sim N(x_t^{*\prime}\theta, \sigma^2)$ are independently normally distributed, where $x_t^* = x_t - \phi x_{t-1}$, $t = 2, \dots, T$. Then, due to the independence of the $\{y_t^*\}_{t=2}^{T}$, the model in terms of the transformed variables is given by the simple regression model

$$
Y^* = X^*\theta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_{T-1}),
\tag{B-7}
$$

where $I_{T-1}$ is the $(T-1) \times (T-1)$ identity matrix, $Y^* = (y_2^*, \dots, y_T^*)'$ and $X^* = (x_2^*, \dots, x_T^*)'$, respectively. Combining the normal prior for $\theta$ in (B-5) with the likelihood of the normal regression model in (B-7), and in line with standard Bayesian linear model results, it is easily derived that

$$
\theta = (\mu, \beta_1, \dots, \beta_k)' \mid y_{1:T}, \phi, \sigma^2 \sim N_{(k+1)}(\hat{\theta}, V_\theta)\, I(\theta \in S_\theta),
\tag{B-8}
$$

where $\hat{\theta} = V_\theta (C^{-1}\eta + \sigma^{-2} X^{*\prime} Y^*)$ and $V_\theta = (C^{-1} + \sigma^{-2} X^{*\prime} X^*)^{-1}$. Given $\theta$ and $\phi$, and including the respective prior, the full conditional distribution of $\sigma^2$ is obtained in closed form as a standard result:

$$
\sigma^2 \mid y_{1:T}, \theta, \phi \sim IG\left( a_0 + \frac{T-1}{2},\; b_0 + \frac{1}{2} SS_{X^*Y^*} \right),
\tag{B-9}
$$

where $SS_{X^*Y^*} = (Y^* - X^*\theta)'(Y^* - X^*\theta)$.
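The two conjugate updates for $\theta$ and $\sigma^2$ can be sketched as one function. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name `draw_theta_sigma2`, the flag `nonneg_intercept` and the array layout are hypothetical, the posterior precision is taken as $C^{-1} + \sigma^{-2} X^{*\prime} X^*$, and the truncation of the intercept to $[0, \infty)$ is imposed by simple rejection.

```python
import numpy as np

def draw_theta_sigma2(y, X, phi, sigma2, eta, C, a0, b0, rng,
                      nonneg_intercept=True):
    """One Gibbs update of theta and then sigma2, given phi."""
    y_star = y[1:] - phi * y[:-1]            # quasi-differenced response
    X_star = X[1:] - phi * X[:-1]            # quasi-differenced regressors
    C_inv = np.linalg.inv(C)
    V = np.linalg.inv(C_inv + X_star.T @ X_star / sigma2)
    mean = V @ (C_inv @ eta + X_star.T @ y_star / sigma2)
    chol = np.linalg.cholesky(V)
    while True:
        theta = mean + chol @ rng.standard_normal(len(mean))
        if not nonneg_intercept or theta[0] >= 0.0:
            break                            # keep draws inside S_theta
    resid = y_star - X_star @ theta
    shape = a0 + 0.5 * len(y_star)           # a0 + (T - 1) / 2
    rate = b0 + 0.5 * (resid @ resid)        # b0 + SS / 2
    # If G ~ Gamma(shape, scale=1/rate), then 1/G ~ IG(shape, rate).
    sigma2_new = 1.0 / rng.gamma(shape, 1.0 / rate)
    return theta, sigma2_new
```

Rejection is cheap here as long as the posterior mass of the intercept lies well above zero; otherwise a dedicated truncated-normal sampler would be the better choice.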

As discussed in Chib (1993), the conditional posterior distribution of $\phi$ is also straightforward to derive. The key step is to treat the errors $u_t = y_t - x_t'\theta$ themselves as the observations of a linear regression model. Thus,

$$u = U\phi + \varepsilon, \tag{B-10}$$

where $u = (u_2, \ldots, u_T)'$ and $U = (u_1, \ldots, u_{T-1})'$. Again using standard results from linear regression, the conditional posterior distribution for the autoregressive parameter $\phi$ is obtained as a truncated normal distribution,

$$\phi \mid y_{1:T}, \theta, \sigma^2 \sim N(\hat\phi, v_\Phi)\, I(\phi \in S_\phi), \tag{B-11}$$

where $\hat\phi = v_\Phi(\Phi_0^{-1}\phi_0 + \sigma^{-2}U'u)$ and $v_\Phi = (\Phi_0^{-1} + \sigma^{-2}U'U)^{-1}$. Equivalently, a draw of $\phi$ can be generated from the untruncated normal distribution and retained only if it lies in the open interval $(-1, 1)$. A useful by-product of this strategy is an estimate of the conditional probability that the error process is stationary (Chib 1993). This conditional probability is simply the proportion of accepted draws from the untruncated normal distribution.
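A minimal sketch of this accept/reject draw for φ is given below. It assumes the residuals u_t = y_t − x_t'θ are already available as an array; the function name `draw_phi`, the simulated AR(1) errors and the number of repetitions are hypothetical. For simplicity the residuals are held fixed across repetitions, whereas in the full sampler they change with every new draw of θ.

```python
import numpy as np

def draw_phi(resid, sigma2, phi0, Phi0, rng, max_tries=10_000):
    """Draw phi from N(phi_hat, v) restricted to (-1, 1) by rejection.
    Returns the accepted draw and the number of untruncated proposals used."""
    u = resid[1:]    # u_2, ..., u_T
    U = resid[:-1]   # u_1, ..., u_{T-1}
    v = 1.0 / (1.0 / Phi0 + (U @ U) / sigma2)
    phi_hat = v * (phi0 / Phi0 + (U @ u) / sigma2)
    for n in range(1, max_tries + 1):
        phi = rng.normal(phi_hat, np.sqrt(v))
        if -1.0 < phi < 1.0:
            return phi, n
    raise RuntimeError("no proposal fell inside (-1, 1)")

# Simulated AR(1) errors with a known coefficient (illustrative only).
rng = np.random.default_rng(1)
T, phi_true = 300, 0.6
u = np.zeros(T)
for t in range(1, T):
    u[t] = phi_true * u[t - 1] + rng.normal()

draws, proposals = [], 0
for _ in range(500):
    phi, n = draw_phi(u, 1.0, 0.0, 1.0, rng)  # sigma2 = 1, phi0 = 0, Phi0 = 1
    draws.append(phi)
    proposals += n

# Share of untruncated proposals that landed in (-1, 1).
accept_rate = len(draws) / proposals
```

Here `accept_rate` plays the role of the proportion of accepted draws mentioned above; when the posterior of φ sits well inside (−1, 1), it is close to one.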

With all full conditional distributions available, the Gibbs sampler cycles through (B-8), (B-9) and (B-11); once the chain has converged, the draws come from the joint posterior distribution of the parameters.


CURRICULUM VITAE

Personal Information

Name Daniel Philipp Stadel

Date/Place of Birth February 23, 1982, Munich

Education

Conferral of a Doctorate (PhD in Management)

12/2008 – 04/2011 University of St. Gallen, Switzerland

Research Institute for Customer Insight

07/2009 – 08/2009 University of Michigan, Ann Arbor, USA

Summer Program in Quantitative Research Methods

Degree Program Economical Mathematics (Dipl.-Math. oec.)

10/2002 – 08/2008 Ulm University, Germany

Focus: Financial Mathematics and Statistics

08/2006 – 07/2007 University of West Florida, Pensacola, USA

School Education

09/1992 – 06/2001 Nikolaus-Kopernikus-Gymnasium, Weissenhorn

Work Experience

12/2008 – 04/2011 Research Assistant and Doctoral Candidate,

Institute for Customer Insight

(former: Center for Business Metrics)

11/2007 – 05/2008 Working Student at the Risk-Controlling Division,

Savings Bank Ulm

09/2007 – 10/2007 Intern at the Risk-Controlling Division,

Savings Bank Ulm

08/2006 – 07/2007 Assistant at the Statistics Center,

University of West Florida, Pensacola, USA

05/2004 – 07/2006 Student Assistant at the Ulm University,

Department for Mathematics and Business Studies
