
Team incentives and performance: Evidence from a retail chain1

November 2014

Guido Friebel, Matthias Heinz, Miriam Krüger, Nick Zubanov

Abstract: There is substantial field evidence that incentive pay increases the performance

of workers when individual performance is measurable. Comparable evidence for teams,

however, is scarce. We fill this gap by a randomized experiment on team incentives in a

retail chain of roughly 200 shops and 1200 employees. It is technologically impossible to

measure individual performance, but the firm measures team (shop) performance along

various dimensions. Using stratified randomization, we introduced a team bonus

conditioned on sales targets fixed well before the team incentive was discussed. Treated

shops increase their sales on average by around three percent, wages increase by around

two percent on average (and up to 13 percent). The team incentive works best for (i) shops

in larger towns and cities where arguably the marginal productivity of effort is highest; (ii)

shops with younger employees, for whom the marginal cost of effort is likely to be

lowest, and (iii) shops that did not reach their targets regularly before the introduction of

the bonus, for whom the effect of effort on the marginal probability of success is likely to

be largest.

Keywords: randomized experiment, controlled trial (RCT), natural field experiment, team

bonus, insider econometrics

JEL codes: D23, J33, M52

1 All authors are at Goethe University Frankfurt, except for Heinz who is affiliated with the University

of Cologne. Friebel is also affiliated with CEPR and IZA, and Zubanov with IZA. We are grateful for

the support of Deutsche Forschungsgemeinschaft (DFG). We would like to thank the following for their comments:

Iwan Barankay, Oriana Bandiera, Nick Bloom, Johan Lagerlöf, John List, and participants in seminars

at Bergen, Cologne, Columbia, Copenhagen, Frankfurt, Maastricht, the European Bank for

Reconstruction and Development, London, a conference organized by the University of Aarhus, a

workshop organized by LMU Munich, the Annual GEABA conference and the NBER Organizational

Economics Working Group Meeting 2014. We would also like to praise the team spirit of our partners

in the retail chain, and of Artur Anschukov, Sandra Fakiner, Larissa Fuchs, Andre Groeger, Daniel

Herbold, Malte Heisel, Robin Kraft, Stefan Pasch, Jutta Preussler, Elsa Schmoock, Patrick Schneider,

Sonja Stamness, Sascha Wilhelm, who provided excellent research assistance.


1. Introduction

“How can members of a team be rewarded and induced to work efficiently?” This is the

question that Alchian and Demsetz (1972) asked more than 40 years ago in one of the most

influential contributions to the economic literature on organizations. Alchian and Demsetz’

focus was on input monitoring; an alternative would be incentives conditioned on joint output.

The very nature of teamwork, however, blurs the performance of individuals into a common

performance signal, weakening the effect of monetary incentives (Holmström, 1982). While

there is substantial evidence that incentives work quite well provided individual performance

is measurable (Lazear, 2000, Shearer, 2004, Bandiera et al, 2009), a number of questions are

open: under what conditions do team incentives raise efficiency in the field, and by how much

(Bloom and van Reenen, 2011), do such incentives lead to unintended reactions and gaming,

and what mechanisms may affect performance?

We address this research gap through a randomized, controlled experiment in a retail

chain in Germany. Our study is the first one in which the effect of team incentives is analyzed

in a natural field experiment combining both realism and randomization (List and Rasul,

2011). In particular, the employees are working in an ongoing firm and they are carrying out

their normal day-to-day job. Besides the change in the compensation scheme, there is no other

intervention, and we took great care in ensuring that employees would not consider

themselves as part of an experiment. Except for our partners in management and the worker

council, no one was aware of our involvement and communication was taken care of by

management, not by us. The firm used the term “pilot”, which they also use when introducing

new HR or marketing practices for a limited period of time, and we conditioned incentive pay

on the existing performance measurement system used for the compensation of middle and

lower management.

Using the stratified randomization method developed by Barrios (2014), we

introduced a team bonus conditioned on sales targets that were fixed well before the team

incentive was discussed. Treated shops increase their sales on average by around three

percent, wages increase by around two percent on average (and up to 13 percent). The team

incentive works best for (i) shops in larger towns and cities where arguably the marginal

productivity of effort is highest; (ii) shops with younger employees, for whom the marginal

cost of effort is likely to be lowest, and (iii) shops that did not reach their targets regularly

before the introduction of the bonus, for whom the effect of effort on the marginal probability

of success is likely to be largest. A fourth result is owing to an institutional specificity in

Germany: roughly a third of the workers in our shops are so-called mini-jobbers, registered


unemployed with an income (on top of unemployment benefits) of 450 Euros or less. For tax

reasons, these employees were not eligible for the bonus. We find that the treatment effect is

lower in shops with a higher proportion of mini-job workers, although the eligible team

members would receive a larger share of the team bonus (holding other things equal). This

result points to the importance of complementarities between team members.

The main effect of the team incentive consists of an increase in the customers served,

so incentives seem to increase operational efficiency, rather than increasing sales by up-

selling activities. We find no effect of team size on the treatment effect, which seems

counterintuitive from a free-riding perspective, but is in line with the “group-size paradox”

analyzed by Esteban and Ray (2001). We also find no evidence that the incentive is gamed as

measured by additional orders of bread or higher return rates of unsold bread.

Our study distinguishes itself from the existing literature on the effect of group

incentives and team work. First, in contrast to much of the literature, we look at small work

teams in which people interact on a regular basis, and not at agencies or divisions as in

Propper et al (2011), Courty and Marschke (1997), or other papers surveyed by Prendergast

(1997). Second, we are dealing with a technology in which job design necessarily builds on

team rather than individual work. (We explain why this is the case in the paragraph below.)

Our question is not whether team rather than individual work is preferable for incentive and

efficiency reasons (Itoh, 1991, Che and Yoo, 2001). Rather, we ask whether, given a

technologically determined work organization in teams, an incentive can raise output (sales),

by how much, and what the impact of incentives depends on. Third, we control for

variables widely believed to interact with team incentives, such as organizational

commitment, job and context satisfaction, and perception of leadership, but find no significant

results.

Our retail chain consists of shops where employees bake and sell pre-fabricated bread

and cakes, prepare and sell sandwiches, snacks, and hot and cold beverages. On average,

seven full- and part-time employees work in each shop; a third of the employees are mini-jobbers.

Wages are low (on average around ten Euros), at a level slightly above the currently

debated, but not yet implemented, nation-wide minimum wage. Individual work organization

and performance measurement are impossible, because there is a broad variety of tasks each

person has to carry out, including handling the goods delivered, preparing food in the oven,

taking care of the customers, and handling the cash register. The time workers spend on each

task varies greatly; people work in overlapping shifts and are supposed to help each other. The

need to deal with different tasks of high volatility makes it too costly to have highly


specialized agents who would be idle most of their time. Furthermore, providing individual

incentives would lead to measurement and gaming problems and productivity losses because

of forgone help efforts among the members of the team (cf. Itoh, 1991, Auriol et al, 2002).

For many years prior to our intervention, shop performance has been measured along a

number of dimensions, such as sales, personnel costs, and qualitative indicators, all of which

have traditionally been used to incentivize top, middle and shop managers. Prior to the

intervention, however, the more than 1,000 sales agents in the shops never received any

performance related pay. In April 2014, in half of the almost 200 shops, we introduced a

monthly team bonus. Shops were assigned to treatment and control groups through an optimal

stratification procedure developed by Barrios (2014). Management informed the teams in the

treatment group through personal communication, letters and posters about the incentive

scheme.

As is usual in the compensation of salespeople, we used a step bonus function (Figure 1). We

were aware that step functions have many issues, but linear compensation [...]. Teams that reached

or surpassed by up to one percent the sales target defined by top management at the beginning

of the year (well before the decision in favor of the team incentive was made) would receive a

100 Euros bonus. For each additional one percent of sales beyond the target, an extra bonus of

50 Euros was offered. The bonus was capped at 300 Euros when the sales exceeded the target

by more than four percent. Teams were initially informed that the bonus would be paid for a

pilot period of three months ending on the 30th of June 2014. The teams were informed that

the bonus would be shared among the full- and part-time employees, including the shop

manager, according to the hours they worked in the respective month, compared to the total

work hours of the team.
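
To make the scheme concrete, the following is a minimal sketch in Python of the step bonus and the hours-based split described above. It is an illustration, not the firm's actual payroll rule. The function names are hypothetical, and two details are assumptions made for the sketch: each full percentage point above the target adds 50 Euros (so sales exactly one percent above the target pay 150 Euros), and only the hours of bonus-eligible employees enter the split.

```python
def team_bonus(sales: float, target: float) -> int:
    """Monthly team bonus in Euros under the step scheme (hypothetical helper)."""
    if sales < target:
        return 0
    # Whole percentage points by which sales exceed the target
    # (assumption: a step is reached at each full percentage point).
    points_above = int((sales - target) * 100 // target)
    # 100 Euros at the target, 50 Euros per additional point, capped at 300 Euros.
    return min(100 + 50 * points_above, 300)


def split_bonus(bonus: float, hours: dict[str, float]) -> dict[str, float]:
    """Share the team bonus in proportion to hours worked in the month.

    Assumption: `hours` lists only bonus-eligible employees.
    """
    total_hours = sum(hours.values())
    return {name: bonus * h / total_hours for name, h in hours.items()}


if __name__ == "__main__":
    bonus = team_bonus(sales=27_900, target=27_000)  # a little over 3 percent above target
    print(bonus, split_bonus(bonus, {"manager": 160, "part_timer": 80}))
```

Under these assumptions a shop that hits the cap in each of the three pilot months earns the maximum of 900 Euros for the quarter.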

We find a treatment effect of roughly three percent on sales over the period between

April and June 2014. Many of the teams reach sales levels beyond which the bonus is capped.

Interestingly, and contrary to the free-riding argument but in line with the literature on the

group size paradox (Esteban and Ray, 2001), larger shops in the treatment group fare no

worse in terms of sales than smaller ones. Shops in cities compared to smaller municipalities

feature treatment effects of around six percent, arguably because consumers react more

intensively to increased sales activities. Shops that, in the past, were less likely to reach the

sales targets react more intensively to the bonus than shops that were more likely. Job and

context satisfaction, and organizational commitment, as measured by a firm-wide survey prior

and unrelated to the introduction of the team bonus, play no role for the effect of the team

bonus. Neither does the treatment affect these measures in a second survey during the


treatment period. The firm made extra profits of around 50,000 Euros per month from the

treated shops, and the wage payments to the sales agents increased by around 12,000 Euros

per month. We carry out a comparison with investments in renovations of the points of sale

and argue that the team bonus provides much higher returns to the firm. Following a

successful implementation of the team bonus, the firm decided to roll out the scheme to all

shops for the period from July to December 2014, signing a respective agreement with the

worker council at the end of June.

For many reasons retailing is a natural candidate to test for the effectiveness of team

incentives. Many people in retail chains work in teams, that is, the individual efforts can only

be mapped into some joint output signals, such as sales.2 Demand volatility is high, so people

must be prepared to switch tasks frequently, and even if sales people can specialize to some

extent, they must be willing to help each other, all of which makes it complicated, if not

impossible, to provide them with individual incentives. A study of team incentives in

retailing, such as ours, will also make a rather general case because retailing is one of the

largest industries in the world in terms of employment. In Germany, the country in which our

experiment took place, more than 4 million people work in retailing, that is, almost 10% of

the country’s active labor force. Most of these people work in larger groups or chains,3 just as

in our study. Many other service industries, in particular restaurants, hotels, airlines, are

similar because, again, technology forbids individual incentives, people carry out many tasks,

individual performance measurement is difficult, if not impossible, and people do not get

individual incentives.4

Our paper contributes to the existing literature on incentives by providing clean causal

evidence on the performance effects of team incentives, thus filling an important gap in

empirical research. In our study, all shops have the same technology, team incentives are

randomly assigned, people cannot self-select into teams, because hiring is centralized, and

there are no moves between shops. This makes our study different from other papers in which

the adoption of teamwork or team incentives may be endogenous. Boning et al (2007),

Hamilton et al (2003), Bandiera et al (2009) and Bandiera et al (2013) all find supportive

evidence that team work and incentives raise efficiency. However, the first study shows that

the decision in favor of teamwork and its effects depend on technology, while the others

2 http://job-descriptions.careerplanner.com/Retail-Salespersons.cfm

3 http://de.statista.com/statistik/daten/studie/261517/umfrage/beschaeftigte-im-deutschen-einzelhandel-nach-unternehmensgroessen/

4 Other types of restaurants and cafés are a very different type of business. There, workers are much more specialized and they receive individual incentive pay in the form of tips.


observe sorting of workers of similar productivities into teams. In particular, Hamilton et al

(2003) are instructive as they find that people even forgo individual earnings in exchange for

working in a team, which is a question orthogonal to ours, where team work is

technologically fixed, but compensation schemes vary. Also complementary to our work is

the paper by Delfgaauw et al (2013), who look at competition between teams,

incentivized through tournaments. The authors focus on gender and on the effects of prize spread,

of how far shops are from their targets, and of social cohesiveness in teams. Our paper rather

shows the pure effects of an incentive conditioned on team output (not on relative output), has

teams that almost entirely consist of women, finds substantial effects of the bonus of six

percent in bigger municipalities, but no measurable interactions with organizational

commitment, perceived leadership and other factors. Delfgaauw et al (2014) find in a

tournament that shops that are far behind do not respond to relative incentives; we find that

shops that lag behind respond the most.

Is a six percent sales increase a large effect? We would argue yes. First of all, as

noted in a substantial literature by Bloom and co-authors, Germany is one of the countries

with the highest level of managerial efficiency, second only to the US, and that also applies to

retailing (Bailey and Solow, 2001). Second, in stark contrast to France (Bertrand and

Kramarz, 2002), entry barriers are low, and competitive pressure, in particular triggered by

aggressive discounters such as Aldi and Lidl, is high. It is actually precisely the entry of these

firms into the market of “our” chain that triggered the change in incentives that we analyze

here.

2. Background

In 2013, we were contacted by the general manager of a bakery chain who sought

advice on how to cope with the challenges of a rapidly changing market. Since the 1980s,

bakery chains, some of them owning hundreds of shops and with sales numbers of up to a

billion Euro, had successfully built their business model on attractive locations (including

supermarkets and malls), and economies of scale. The chains had crowded out many of the

existing small master bakeries whose number and market shares had steadily declined. In

2011, however, discounters Aldi and Lidl had begun to bake and sell fresh bread, rolls and

related products in their dense network of shops, with large success. Their bread is widely

believed to be of similar quality to that of the chains, but is sold at much lower prices, hence

forcing the incumbent chains to rethink their business model. As a consequence, many of the


chains were differentiating their product portfolio moving into the market for snacks and

sandwiches and beverages, traditionally covered by cafés and fast food chains. They also tried

to intensify their sales activities on freshly baked cakes, a market the discounters have not

(yet) entered. These moves were accompanied by substantial investments in points of sale,

making them more modern, better designed and often equipping them with a “café section”

where customers can sit down to eat and drink.

The manager told us that for this strategic move to be successful, the behavior of the

personnel should be changed, and that they needed to be more actively involved in sales

activities. The English saying “something sells like hotcakes” has a German equivalent

“something sells like warm rolls” or “something sells like sliced bread”, and, indeed, many of

our partners in the firm (“the bakery”) complained that sales agents took a rather passive

stance in their interactions with customers. Turnover in the bakery was relatively high,

making training activities a questionable investment, also because personnel consisted mainly

of very low-skilled workers without comprehensive vocational training (only one fourth of the

staff have finished an apprenticeship). The bakery had experimented before with hiring more

qualified employees, but concluded that there was little if any increase in the quality of

service, at the cost of a substantially increasing wage bill. So, the challenge was to find a lever

through which the existing staff could be motivated to spend more effort on their sales

activities.

We agreed to help the bakery under the condition that we would have access to all

relevant data, and could design a possible intervention through a randomized controlled

experiment. In particular, we explained that our randomization would be more successful, if

we had access to historic sales data. We received sales data over more than two years,

allowing us to carry out a very precise stratified assignment, which is explained in more detail

in the next section. In exchange we offered our advice free of charge, and covered some of the

costs, in particular those of the research assistants needed to carry out an employee survey

before and during the treatment period.

When our partners presented the existing compensation structure to us, we realized

that there was a very detailed system of key performance indicators (KPI) according to which

managers were evaluated, and on the basis of which they were compensated. These KPIs

include sales, personnel costs, specific strategic goals such as enhancing the share of a certain

product in total sales etc. Most importantly, at the end of each year, sales targets are

determined for the next year, and these targets are never adjusted during the year. For the year

2014, management had based targets on 2013 sales, corrected by an expected decrease of 2%


on average. District managers, who are responsible for 10 to 15 shops, are incentivized

through a number of targets along these dimensions, and the 193 shop managers receive

bonuses when they reach certain sales and personnel cost targets and certain grades in

regularly carried out mystery shopper visits (the topic of another paper of ours). However,

sales agents, representing the bulk of staff, only receive fixed wages, which for most workers

are regulated by collective agreements. There is also a second group of workers, on so-called

mini jobs who can earn up to 450 Euro on top of their welfare benefits.

At the end of February 2013, we proposed introducing a team bonus conditional on

reaching or passing the existing sales targets. Our partners were at first surprised by this

suggestion. A member of the management team put it bluntly: “We have never seriously

thought about this.” Other members of the management team were afraid that payments could

turn out to be a burden on the firm. We provided our partners with some simulations showing

that the expected payments would likely be lower than 20,000 Euro per month in case (i)

half of the shops were treated; (ii) a step function capped at four percent sales above the target

was used; with (iii) a top monthly bonus of 300 Euro. While this convinced top managers to

try a “pilot” study with half of the shops assigned to the team bonus, district managers were

afraid that wage costs would rise, meaning that they could not reach the targets of keeping

personnel costs low. The general manager reacted by suggesting that the extra bonus

payments would be paid from a different budget and would not affect the personnel costs

relevant for district managers’ performance. District managers were quick to realize that in

such a setting they were likely to benefit as well, if the team bonus resulted in an increase of

sales of the shops under their supervision. The worker council also was in favor of the

bonus, in particular, because it was designed as a pure add-on payment. Also, trust between

worker council and management was high, and a new collective agreement had been written

concerning fixed wages, such that ratchet effects were unlikely.

To reduce the risk of information leakage, middle management was informed about

treatment and control shops under their purview only some days before introduction of the

team bonus. Another concern was raised about the possibility of envy between treatment and

control shops. We discussed this issue with the worker council who suggested that we should

explain that the intervention was “just a pilot” and that everybody would have the same 50-50

chance of taking part in it. This, the worker council argued, would be acceptable for the non-

treated shops in case they would learn about the bonus scheme in other shops. In the next

section, we explain in more detail how we communicated the scheme and how we minimized

the risk of contamination between the treated and non-treated shops.


3. The experiment

In March 2014, a month prior to the introduction of the team bonus, and before its

announcement to the employees and managers of the treatment group of shops, we carried out

an employee survey in order to measure employee satisfaction with the job context, general

situation, and organizational commitment, following some influential work in industrial

psychology (Allen and Meyer, 1990). In May, we carried out a second wave. This latter

survey entailed the same questions plus additional items in relation to the team bonus, and

some social interactions within and between shops.

Our partners and we thought the survey would be useful for a number of reasons. First,

it would allow an additional check whether treatment and control groups were balanced

samples; second, we hypothesized that the treatment effect may depend on employees’ pre-

treatment satisfaction and commitment levels; third, because of the team bonus, employees’

satisfaction levels could increase. The survey was distributed through the district managers

and collected by our research assistants in sealed envelopes. Arguably, because the

anonymous surveys were collected on site, and because of our own and the district managers’

substantial communication efforts, the response rates were almost 80 percent in the first and

above 60 percent in the second wave of the survey.

In preparation for the introduction of the team bonus, we designed information leaflets to be

posted in the back offices of the treatment shops, and letters that were distributed by the

district managers to the employees. In contrast to the employee survey, the logo of Goethe

University did not show on these materials (see Appendix 2), so that people would not

perceive themselves as part of an experiment. In fact, there was no mention of our research

team in any communication regarding the bonus.

We trained district managers in how to explain the team bonus to shop managers. We

also instructed them about how to react to questions of employees in control group shops

about the bonus. In that case, district managers would explain that “this is a pilot. Everybody

had the same chance to be drawn with a 50-50 chance. The work council agreed to this

procedure.” We were afraid that there could be contamination between the treatment and

control group. Envy or frustration of control group shops could lead them to reduce their

efforts, which could then be picked up erroneously as a positive treatment effect. To monitor

this risk, we regularly called the district managers and asked them whether employees in the

control group had asked them questions about the team bonus. We heard about only

three cases in April, in all of which the employees were satisfied with their district manager’s answer. In May


and June, nobody inquired. In addition, we also regularly checked the Facebook page of the

bakery on which customers and employees alike tend to discuss issues such as product quality

but also (sometimes to the dissatisfaction of management) internal issues such as stress at the

workplace, quality issues of products, or problems of leadership and organizational culture.

We could not find a single entry on the team bonus. We also built some questions

into the second wave of the employee survey to check for potential channels of

contamination. In particular, we asked people how frequently they interacted with colleagues

in the same and in other shops, and we found that contacts with employees in other shops are

very rare: only 20% of respondents indicated that they ever spoke to a colleague from another

shop. Hence, it seems fair to say that contamination between treatment and control shops is

unlikely to be an issue.

FIGURE 1 ABOUT HERE

Figure 1 depicts the bonus offered to the treatment shops. Shops that reach the sales

target for the month receive a bonus of 100 Euro to be shared between the part-time and full-

time employees in the shop. The bonus increases by 50 Euro for each percentage point above the

target and is capped at 300 Euro per month at 4 percentage points above the target. Hence, the

team in a shop can make extra earnings of up to 900 Euro in the treatment quarter April to

June. We provided the employees with examples of what sales increases would mean in terms

of additional goods to be sold per day (for instance a one-percent increase above the sales

target for a mid-sized shop would be tantamount to selling per day ten additional rolls, two

loaves of bread, some sandwiches and some cups of coffee). The fact that a number of shops

failed to reach the target by small amounts (for instance, in April, one shop failed to reach the

target by 16 Euros, and another one by 8 Euros) is an indicator that there was no manipulation

and that at least in the beginning of the treatment phase, employees found it hard to estimate

their likelihood to reach the various targets, although district managers regularly communicate

to all shops, treatment and control shops alike, their current sales figures.

What remains to be explained is the randomization procedure. We follow Barrios (2014)

who shows that randomizing pairwise by using the predicted outcome variable, in our case

sales, minimizes the variance of the differences in the treatment and control outcomes post

treatment. We use historic observations between January 2012 and December 2013 to run a

regression of log sales on labor input with month and shop fixed effects, from which we

obtain predicted sales. We then rank the shops according to the predicted sales and randomize

within the pairs of shops with adjacent ranks (1-2, 3-4, ..., 192-193). Because we had an odd

number of shops in our sample, we excluded the shop with the median rank from the pairs and


assigned it randomly. The resulting treatment and control groups comprised 97 and 96 shops,

respectively. Table 1 summarizes their pre-treatment characteristics.
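
The pairing step can be sketched as follows. This is a minimal illustration in Python rather than the implementation actually used: it takes the predicted sales as given (the fixed-effects regression that produces them is not reproduced), and the function name, group labels, and seed are hypothetical.

```python
import random


def pairwise_assign(predicted_sales: dict[str, float], seed: int = 0) -> dict[str, str]:
    """Assign shops to treatment or control within pairs of adjacent predicted-sales ranks."""
    rng = random.Random(seed)
    # Rank shops by predicted sales, lowest first.
    ranked = sorted(predicted_sales, key=predicted_sales.get)
    assignment: dict[str, str] = {}

    # With an odd number of shops, take out the median-ranked shop
    # and assign it by a fair coin flip.
    if len(ranked) % 2 == 1:
        median_shop = ranked.pop(len(ranked) // 2)
        assignment[median_shop] = rng.choice(["treatment", "control"])

    # Within each pair of adjacent ranks, one shop is treated and the other is not.
    for i in range(0, len(ranked), 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        assignment[pair[0]] = "treatment"
        assignment[pair[1]] = "control"
    return assignment


if __name__ == "__main__":
    # Made-up predicted sales for 193 shops, for illustration only.
    shops = {f"shop{i}": 20_000.0 + 50.0 * i for i in range(193)}
    groups = pairwise_assign(shops, seed=1)
    print(sum(g == "treatment" for g in groups.values()), "treated shops")
```

With 193 shops, the sketch yields groups of 97 and 96 shops; which group receives the extra shop depends on the draw for the median-ranked shop.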

TABLE 1 ABOUT HERE

Thanks to our randomization procedure, the treatment and control samples are

balanced in the average pre-treatment sales, our key outcome variable. They are also similar

in other potentially relevant characteristics, such as the percentage of unsold goods, number

of customer-visits, frequency of achieving the sales target, location, and employee attitudes.

In fact, none of the averages reported in Table 1 differ significantly between the groups. An

average shop sells over 27,000 Euros worth of goods per month (see footnote 5), employs 7 people, most of whom are

female in their late 30s, unskilled, and working part-time. There is a sizeable share of workers

on a mini-job, around 30%, who for tax reasons were excluded from the team bonus scheme.

Sales are quite variable, with location and size differences explaining 90% of the variance.

There is also a considerable variation within shops, much of which is due to seasonal demand,

temporary closures for renovation, and market dynamics, such as the entry and exit of

competitors, all of which we control for in our statistical analysis.

FIGURE 2 ABOUT HERE

Figure 2 displays the spatial distribution of our control and treatment shops. The region in

which our partner firm operates spans roughly 100 km from West to East and 60 km from

North to South, an economy of more than 3 million inhabitants. Shop locations vary in

population size. However, all shops are placed on the premises of supermarkets owned by the

parent company, or in their immediate vicinity, relying thus mostly on the customer traffic to

and from grocery shopping.

4. Baseline results

Table 2 reports the treatment and control shops' characteristics in the treatment period (April

to June 2014), giving a first impression of the treatment effect. Sales and customer-visits have

gone down, reflecting the secular downward trend in the bakery business. Yet, the drop in

sales and customer-visits being more pronounced in the control than in the treatment group

suggests a positive treatment effect. In fact, the difference-in-difference estimated effects on
the log sales and customer-visits are 3.3% and 2.8%, respectively, both significant at
conventional levels. Since there is no significant treatment effect on other outcomes, we
proceed with a more in-depth analysis of sales and customer-visits.

5 One shop, located in the busy Frankfurt airport and assigned randomly to the treatment group, sold
on average 118,000 Euros worth of goods per month in the pre-treatment period and employed 22
people. Excluding this shop, the average pre-treatment sales in the treatment group are 27,176 Euros
per month with a standard deviation of 10,885 Euros, which is much closer to the corresponding figures for
the control group. Removing this shop from our regression sample does not change the estimated
treatment effects.

TABLE 2 ABOUT HERE

To visualize the treatment effect on sales, Figure 3 plots the treatment and control

groups' year-on-year sales growth in the treatment months versus the sales levels in the same
months (April to June) of 2013. Additionally, Figure 4 displays the kernel density graphs of
the year-on-year sales growth for the two groups. The treatment group's sales growth
distribution is shifted to the right of the control group's, and the shift is fairly uniform across
the growth rates and initial sales levels.

FIGURE 3 ABOUT HERE

FIGURE 4 ABOUT HERE

To identify the treatment effect in a more systematic way, we run the difference-in-

difference estimator in several regression specifications where we control for other factors

that may affect sales and address the estimation issues most frequently discussed in the

experimental econometrics literature. The first issue is serial correlation in the residuals,

which leads to underestimated coefficient standard errors and false positives as a result

(Bertrand, Duflo and Mullainathan, 2004). The second is the correlation between the

treatment status and the baseline outcome, which, despite randomization, may occur in finite

samples, causing the “regression towards the mean” problem (Stigler, 1997).

We start with a specification with shop and time fixed effects:

$$\ln(\mathit{sales}_{it}) = \beta_0 + \beta_1\,\mathit{treatment}_i + \beta_2\,\mathit{after}_t + \beta_3\,\mathit{treatment}_i \times \mathit{after}_t + \mathit{period}_t + \mathit{shop\ fixed\ effect}_i + \mathit{controls}_{it} + \mathit{error}_{it} \qquad (1)$$

where $\ln(\mathit{sales}_{it})$ is the log sales in shop i and period t, the treatment dummy takes the values
1 for the treatment and 0 for the control group shops, the after dummy is 0 for the periods
before treatment and 1 thereafter, $\mathit{controls}_{it}$ include the log total hours worked and dummies
for renovation within the last two months, and $\mathit{error}_{it}$ is the idiosyncratic error term which we
allow to correlate within each shop using the Stata cluster option. Coefficient $\beta_3$ is the

difference-in-difference estimate of the treatment effect. Columns 1 and 2 in Table 3 are

based on equation (1).
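The logic behind the difference-in-difference coefficient can be illustrated with a stripped-down Python sketch. It leaves out the shop fixed effects, month dummies and controls of equation (1) and simply compares the change in mean log sales across the two groups, which is what the interaction coefficient reduces to in a regression with only the two dummies and their product. The function name `diff_in_diff` and the numbers are made up for illustration.

```python
from statistics import mean


def diff_in_diff(rows):
    """Return the 2x2 difference-in-difference estimate.

    rows: iterable of (log_sales, treated, after) tuples, where treated and
    after are 0/1 indicators as in equation (1).
    """
    cell = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for log_sales, treated, after in rows:
        cell[(treated, after)].append(log_sales)
    change_treated = mean(cell[(1, 1)]) - mean(cell[(1, 0)])
    change_control = mean(cell[(0, 1)]) - mean(cell[(0, 0)])
    return change_treated - change_control


# Made-up log sales: both groups decline, the treated group by less.
rows = [
    (10.10, 0, 0), (10.20, 0, 0), (10.02, 0, 1), (10.12, 0, 1),
    (10.12, 1, 0), (10.18, 1, 0), (10.07, 1, 1), (10.13, 1, 1),
]
effect = diff_in_diff(rows)
print(round(effect, 3))
```

In this toy example the control group's mean log sales fall by 0.08 and the treated group's by 0.05, so the estimate is 0.03, a positive effect despite falling sales in both groups.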

TABLE 3 ABOUT HERE

In addition to clustering errors at the shop level, which may still underestimate
coefficient standard errors in small samples, we implement another solution proposed in
Bertrand, Duflo and Mullainathan (2004): collapsing all observations into the pre- and post-
treatment periods and estimating

$$\ln(\mathit{sales}_{i\tau}) = \beta_0 + \beta_1\,\mathit{treatment}_i + \beta_2\,\mathit{after}_\tau + \beta_3\,\mathit{treatment}_i \times \mathit{after}_\tau + \mathit{shop\ fixed\ effect}_i + \mathit{controls}_{i\tau} + \mathit{error}_{i\tau} \qquad (2)$$

where $\ln(\mathit{sales}_{i\tau})$ is the average log sales pre ($\tau=0$) and post ($\tau=1$) treatment. Column 3 in
Table 3 reports the treatment effect estimated with equation (2). Bootstrapping, another
recommended solution, produces standard errors of similar magnitude.

To control for regression to the mean, we augment eq. (2) with past sales:

$$\ln(\mathit{sales}_{i1}) = \beta_0 + \gamma \ln(\mathit{sales}_{i0}) + \beta_1\,\mathit{treatment}_i + \beta_2\,\mathit{after}_t + \beta_3\,\mathit{treatment}_i \times \mathit{after}_t + \mathit{error}_{it} \qquad (3)$$

where regression errors are still clustered at the shop level. The estimates from equation (3)
are reported in column 4 of Table 3.

In the next series of specifications, we estimate the treatment effect by comparing
post-treatment sales growth relative to a chosen baseline b in the treatment and control
groups,

$$\ln(\mathit{sales}_{it}) - \ln(\mathit{sales}_{ib}) = \beta_0 + \gamma \ln(\mathit{sales}_{ib}) + \beta_3\,\mathit{treatment}_i + \mathit{error}_{it} \qquad (4)$$

In principle, specification (4) is similar to (3), but some extra flexibility in regression
specification is achieved by varying the baseline, $\ln(\mathit{sales}_{ib})$, specifying it as the average
sales across all months before the start of the treatment (column 5 in Table 3), the same
months in 2013 (column 6), and the average sales in January-March 2014 (column 7).

Whatever specification we use, we obtain average treatment effect estimates of
similar magnitude, around 3%, and significance. This uniformity suggests that neither of
the estimation issues mentioned above and addressed in our analysis is important in our
sample. Indeed, simply clustering the errors by shop is sufficient in a relatively large
sample such as ours. Regression to the mean is not a concern either since our sample is well
balanced. The estimated average treatment effect on sales of 3% implies an extra 820 Euros
(=[exp(0.03)-1]*27000) worth of sales per month in the average shop. Calculating the
treatment effect in each month with our preferred specification (1) (results reported in Table
4), we find it to be 2.9% in April 2014, 3.7% in May, and 2.9% in June, a steady effect
without noticeable abatement.

TABLE 4 ABOUT HERE

Turning to the treatment effect on salespeople's income, more than 50% of the workers in
the treatment group received a bonus in the treatment period, which averaged 114.7 Euros

or 3.9% of the average recipient's quarterly earnings. The total bonus payments made by the

company in April to June 2014 amounted to 35,150 Euros.

To gauge the profitability of our team bonus scheme, we compare the implied gains from
it with its implied total costs. With the treatment effect of 820 Euros per average shop per
month, the extra sales amount to 474,780 Euros per quarter (=820 Euros times 3 months times
193 shops). Given the historic share of 0.56 of value added in sales, the implied operational
profit gain is 265,880 Euros. On the costs side, in addition to the 35,150 Euros of team bonus
payments, there are additional bonus payments to shop
managers and higher ranks (which we estimate at 20,000 Euros per quarter), and our research
expenses, which, though not billed to the company, need to be included in the costs. We
estimate around 60 person-days of senior researchers at a daily rate of 1,000 Euros (60,000 Euros in total)
and 50 research assistant days at the going daily rate of 110 Euros (5,500 Euros in total).
Material and travel costs were around 12,000 Euros. The total implementation costs of the
bonus scheme are thus 132,650 Euros, or about half its implied gains. Put differently, the

bonus scheme as an “investment in people” project would break even within one quarter from

its start.
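The arithmetic behind this comparison can be retraced with a short Python sketch. All figures are taken from the text; the variable names are ours, and the per-shop gain is rounded to 820 Euros as in the paper.

```python
import math

effect_log_points = 0.03      # estimated treatment effect on log sales
avg_monthly_sales = 27000     # Euros, average shop
shops, months = 193, 3
value_added_share = 0.56

# Convert the log-point effect into Euros per shop and month (about 822).
extra_per_shop = (math.exp(effect_log_points) - 1) * avg_monthly_sales
rounded_extra_per_shop = 820  # rounded figure used in the text

extra_sales = rounded_extra_per_shop * months * shops
profit_gain = value_added_share * extra_sales

costs = {
    "team bonuses": 35150,
    "bonuses to shop managers and higher ranks": 20000,
    "senior researchers": 60 * 1000,
    "research assistants": 50 * 110,
    "materials and travel": 12000,
}
total_costs = sum(costs.values())

print(round(extra_per_shop), extra_sales, round(profit_gain), total_costs)
print(round(total_costs / profit_gain, 2))
```

The ratio of total costs to the implied profit gain comes out at roughly one half, which is the basis for the break-even statement.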

In a meeting in June 2014, the management team decided to roll out the team bonus to

all shops as of July 1, 2014. A collective agreement was written with the works council,

according to which the team bonus would be granted until the end of 2014.

5. Treatment effect heterogeneity

Although our treatment and control groups are balanced across a number of characteristics

that might affect sales, the treatment effect may still vary in magnitude between different

shops in the treatment group. We expect the treatment effect to vary along several

dimensions, among which we analyze the following: shop location, workforce size and

composition, success in reaching the sales target in the past, and employee attitudes.

5.1. Shop location

Shop location affects the magnitude of effort’s response to a given incentive by changing the

marginal product of effort. Thus, extra effort pays more in populous, urban locations that have

office workers who might come in for lunch, and visitors who might buy a snack on the go;

incentivized sales agents may “up-sell” to both these groups. On the other hand, smaller

locations have mostly regular shoppers whose demand for bread is harder to affect - hence the

lower marginal product of sales effort in those locations. Besides, shops in urban locations

have more competitors nearby, whose customers may be won over.


TABLE 5 ABOUT HERE

Table 5 reports the treatment effect by shop location. As expected, the treatment effect is

largest, at about 5.5%, in shops located in big towns (>60000 inhabitants), going down to 3.8% in
midsize towns, and zero in villages. As before, the treatment effect is fairly stable over time.

5.2. Workforce size and composition

Shop workforce size will influence the magnitude of the treatment effect by increasing

the total effort as the sum of individual efforts, as well as by decreasing the individual effort

through free-riding. Which of these two opposite tendencies will prevail depends on the team

production technology and the individual costs of effort function. Thus, Esteban and Ray
(2001) show, in a collective action setting, that when the costs of effort are quadratic or
steeper, larger teams will outperform smaller ones in total effort even if there are no individual
effort complementarities. The presence of complementarities, that is, rewards to individual
team members being nonexcludable, or, equivalently, the total effort being more than the sum
of individual efforts, will reduce the “steepness” of the costs of effort function required to
deliver the total effort growing with team size.

TABLE 6 ABOUT HERE

To measure the variation in the treatment effect with shop size, we interact the

treatment dummy with the dummies for the quartiles of the shop-average number of workers

not on a mini-job (the mini job workers did not receive a bonus). Table 6 shows that the

treatment effect is larger in bigger shops (column 1), and that the observed differences in the

treatment effect are not due to bigger shops being located in bigger towns (column 2). In fact,

the treatment effect increases with shop size faster in big towns than elsewhere.
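A simplified version of this quartile analysis can be sketched in Python. Rather than interacting the treatment dummy with quartile dummies inside the full regression, the sketch computes, within each size quartile, the difference in mean sales growth between treated and control shops. The function names, the dictionary keys and the data are hypothetical.

```python
from statistics import mean, quantiles


def size_quartile(value, cutoffs):
    """Return 1..4 for the quartile of value, given three ascending cut points."""
    return 1 + sum(value > c for c in cutoffs)


def effect_by_quartile(shops):
    """Difference in mean growth (treated minus control) within each size quartile.

    shops: list of dicts with keys "size", "treated" (0/1) and "growth"
    (post-treatment minus pre-treatment log sales).
    """
    cutoffs = quantiles([s["size"] for s in shops], n=4)
    effects = {}
    for q in (1, 2, 3, 4):
        in_q = [s for s in shops if size_quartile(s["size"], cutoffs) == q]
        treated = [s["growth"] for s in in_q if s["treated"] == 1]
        control = [s["growth"] for s in in_q if s["treated"] == 0]
        if treated and control:  # skip quartiles lacking one of the groups
            effects[q] = mean(treated) - mean(control)
    return effects


# Made-up shops: control growth is flat at -0.08, treated growth rises with size.
shops = [
    {"size": n, "treated": n % 2, "growth": -0.08 + (0.005 * n if n % 2 else 0.0)}
    for n in range(1, 17)
]
effects = effect_by_quartile(shops)
print({q: round(e, 3) for q, e in effects.items()})
```

In this toy data the estimated effect rises from 0.01 in the smallest quartile to 0.07 in the largest, mimicking the qualitative pattern of a treatment effect that grows with shop size.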

Turning to the shop workforce composition, we explore treatment effect heterogeneity

with shop workforce tenure, age, and the share of mini-job workers. We expect the treatment

effect to be larger for a younger workforce, since younger workers are on average poorer and

thus more susceptible to material incentives. Besides, holding wealth fixed, there may be an

element of resistance to change, which is weaker for younger workers, in the individual

responses to our innovative treatment.

TABLE 7 ABOUT HERE

Table 7 reports treatment effects in the shops below and above the median workforce age and

tenure, on the whole sample as well as separately in big towns and elsewhere. Consistent
with our expectations, “younger” shops respond to treatment more strongly. A further
analysis suggests that the differential response to treatment by age and tenure is driven mainly

by age: running our preferred difference-in-difference specification (1) with the treatment

effect being interacted with age and tenure separately as well as jointly produces a significant

interaction with age but not with tenure.

The treatment effect should decrease linearly with the share of mini-job workers in
the shop team, reflecting a decrease in the size of the team that is incentivized. There will also be
an additional negative influence if there are effort complementarities between mini-job and
ordinary workers, since stronger complementarities increase the weight of each worker's
contribution to the total output (see footnote 6). To accommodate the latter, nonlinear, effect, we rerun our

regression specification with the treatment dummy interacted with the quartiles of the shop-

average share of mini-job workers. The results are reported in Table 8.

TABLE 8 ABOUT HERE

We find that the treatment effect goes down with the share of mini-job employees, especially

in the shops located outside big towns. The abrupt drop in the treatment effect to zero past the

second (whole sample) or first (shops outside big towns) quartile of the average mini-job

worker share implies a steeper than linear decrease, which suggests effort complementarities

between mini-job and ordinary workers in the shop team.

5.3. Past sales target achievement

We expect the treatment effect to vary with the past performance around the sales

target. The historic record of achieving sales targets is informative for shop teams to gauge their

probability of success in the future, since the targets are largely based on past sales (with a

correction for the overall trend, hence the higher frequency of reaching the target in both

groups, recall Table 2) and set in the beginning of the year. However, the pattern of the

treatment effect's variation with past performance is hard to predict, since the marginal utility

of effort, and hence the effort's response to treatment, is influenced by at least two opposing

considerations. First, although for the more successful shops the expected bonus is higher, the

marginal utility of putting in more effort than before the treatment is lower because the bonus is

capped. Hence, shops historically performing closer to the target would respond to treatment

less strongly. On the other hand, shops that have been too unsuccessful in reaching their

targets in the past may not react because reaching the target is not realistic enough.

TABLE 9 ABOUT HERE

6 As an example of the empirical framework required here, Iranzo, Schivardi and Tosetti (2008)

estimate a constant elasticity of substitution production function of different workers' skills within

their firms. They find skill complementarity between, and substitutability within, occupational groups.


Table 9 reports treatment estimates by quartile of historic distance to the sales target,

which is measured in two ways: i) as the difference between actual sales and sales target

averaged for each shop over the pre-treatment period; and ii) as the frequency of each shop

achieving its target in the pre-treatment period. Shops in the bottom three quartiles of the

distance to the target reacted to the treatment more strongly than did those in the top quartile,

suggesting that rewarding the attainment of too easily achievable targets is not an effective

motivator.
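The two measures of historic distance to the target can be computed as in the following Python sketch. The function name `target_history` and the monthly figures are hypothetical, and treating a month as a success when sales are at least equal to the target is our assumption.

```python
from statistics import mean


def target_history(sales, targets):
    """Summarise a shop's pre-treatment record against its monthly sales targets.

    Returns (average of sales minus target, share of months with target met).
    """
    gaps = [s - t for s, t in zip(sales, targets)]
    avg_gap = mean(gaps)
    # Assumption: a target counts as met when sales are at least the target.
    hit_rate = mean(1 if g >= 0 else 0 for g in gaps)
    return avg_gap, hit_rate


# Hypothetical monthly figures in Euros for one shop.
sales = [26000, 27500, 25992, 28100]
targets = [26500, 27000, 26000, 28000]
avg_gap, hit_rate = target_history(sales, targets)
print(avg_gap, hit_rate)
```

Shops would then be sorted into quartiles on either measure before the treatment effect is estimated within each quartile.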

5.4. Employee attitudes

We expect the effort response to treatment to vary with the workers' attitudes towards

their employer. To investigate this possibility, as well as to gather important background data

for our study, we ran questionnaires among the employees before (March) and after (May) the
start of the treatment. In the cover letter sent to every shop, we emphasized that this survey was for our
research purposes and had nothing to do with their employer, and guaranteed the anonymity of
employees' responses; we also distributed and picked up the questionnaires ourselves rather
than let the company do so for us, as an extra guarantee of anonymity. We had an 80%

response rate for the first survey, and 65% for the second.

There are three aggregate attitude scores we measured in both waves of the

questionnaire: commitment to the firm, satisfaction with the work context, and overall job

satisfaction. None of these scores were affected by our treatment (recall Table 2). Moreover,

none of them moderates the treatment effect, implying that workers' response to treatment

does not depend on their attitudes towards their firm or their job.

6. Mechanisms

The extra 3% of sales in the treatment shops compared to control may have been achieved by

serving more customers, selling more per customer, having a broader range of goods to better

serve diverse customer demands, selling more side goods (drinks, snacks, etc.), or through a

combination of the above mechanisms. In this section, we go through available empirical

evidence to ascertain the likely role of each of these mechanisms.

TABLE 10 ABOUT HERE

We find a treatment effect on the number of customer-visits that is commensurate with

that on sales (Table 10), so that the effect on sales per customer visit is virtually nil. This

finding implies that the role of mechanisms leading to more sales per customer visit, such as

broader assortment and more side sales, is marginal. With most of the extra sales driven by


the increase in the number of transactions, is it greater operational efficiency or better

customer service that drove this increase? Greater efficiency would be part of the explanation

if there were longer queues before the treatment than thereafter. However, comparing the

mystery shopping reports in April to June 2014 with those in January to March 2014

(immediately before treatment), we see no change in the reports on salesperson availability,

which was generally rated as high. Hence, there is no evidence to suggest a significant

decrease in the customer queues.

Turning to better customer service as the remaining explanation, we ran our own

mystery shopping tour of 140 randomly selected shops in our sample (capacity constraints

prevented us from touring every shop). Our research assistants were instructed to act like

ordinary customers and to buy the “bread of the month” or the closest substitute to it. After
leaving the shop, they were asked to take note of how friendly the sales staff were, and
whether the question “Would you like anything else?” or similar was asked. We found that
while the frequency of asking the “anything else?” question was comparable in the control
and treatment groups overall (0.72 control vs. 0.79 treatment), in big towns, where the
treatment effect was largest, the treatment shops asked this question significantly more
frequently: 0.82 of the time vs. 0.61 in the control shops. We have also found a positive
correlation, though insignificant, between asking this question and monthly sales in the
treatment period. However, rerunning our regressions with the “anything else?” question
included as a control, we see no reduction in the treatment effect. In sum, although there are

signs pointing to the importance of enhanced customer experience in shaping the treatment

effect, we do not have strong enough evidence to precisely identify the mechanism that drove

our results.

7. Conclusion

Teams are a ubiquitous feature of modern production, and so are monetary incentives.

While the knowledge about the effectiveness of individual incentives is both broad and deep,

much less is known about team incentives. Problems of endogeneity, complementarities and

self-selection into teams make causally interpretable evidence about the effectiveness of team

incentives hard to obtain. We contribute to the incentives literature by providing evidence on

the effectiveness of team incentives. We have designed a fairly large randomized controlled

experiment with 193 shops of a bakery chain in Germany. Power calculations on the basis of

27 months of observations pre-treatment and 3 months post-treatment informed us that we
would need 70 shops in each group to detect a 3 percent treatment effect at a significance
level of 5 percent with probability 0.9. Our estimated treatment effect is indeed around

3%, and is highly significant. There is also substantial heterogeneity, with the treatment effect

being largest in big towns and in shops with a younger workforce and few mini-job employees. The

single most important immediate cause of the treatment effect on sales we find is increased

customer traffic; there is no effect on sales per customer visit. We are unable to precisely

distinguish between greater operational efficiency (that is, smaller queues) and better

customer service as the two mechanisms that led to higher sales through increased customer

traffic.


References

Alchian, A. A., & Demsetz, H. (1972). Production, information costs, and economic

organization. The American Economic Review, 62(5), 777-795.

Allen, N. J. & Meyer, J. P. (1990). The measurement and antecedents of affective,

continuance and normative commitment to the organization. Journal of Occupational and

Organizational Psychology. 63(1), 1-18.

Auriol, E, Friebel, G. & Pechlivanos, L. (2002). Career concerns in teams. Journal of Labor

Economics, 20(2), 289-307.

Baily, M. N. & Solow, R. M. (2001). International productivity comparisons built from
the firm level. Journal of Economic Perspectives, 15(3), 151-172.

Bandiera, O., Barankay, I & Rasul, I. (2013). Team Incentives: Evidence from a Field

Experiment. Journal of the European Economic Association, 11(5), 1079-1114.

Bandiera, O., Barankay, I & Rasul, I. (2009). Social Connections and Incentives in the

Workplace: Evidence from Personnel Data. Econometrica, 77(4), 1047-1094.

Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How much should we trust differences-in-differences
estimates? Quarterly Journal of Economics, 119(1), 249-275.

Bertrand, M., & Kramarz, F. (2002). Does entry regulation hinder job creation? Evidence

from the French retail industry. Quarterly Journal of Economics, 117(4), 1369-1413.

Barrios, T. (2014). Optimal stratification in randomized experiments. Mimeo, Harvard

University.

Bloom, N. & Van Reenen, J. (2011). Human resource management and productivity.

Handbook of Labor Economics, 4B(19), 1697-1767.

Boning, B., Ichniowski, C. & Shaw, K. (2007). Opportunity counts: Teams and the

effectiveness of production incentives. Journal of Labor Economics, 25(4), 613-650.

Courty, P. & Marschke, G. (1997). Measuring government performance: Lessons from a
federal job-training program. American Economic Review, 87(2), 383-388.

Delfgaauw, J., Dur, R., Non, A. & Verbeke, W. (2014). Dynamic incentive effects of relative

performance pay: A field experiment. Labour Economics, 28, 1-13.

Delfgaauw, J., Dur, R., Sol, J. & Verbeke, W. (2013). Tournament Incentives in the Field:

Gender Differences in the Workplace. Journal of Labor Economics, 31(2), 305-326.

Esteban, J., & Ray, D. (2001). Collective action and the group size paradox. American

Political Science Review, 95(3), 663-672.

European Foundation for the Improvement of Working and Living Conditions (2007).
Teamwork and high performance work organisation.


Hamilton, B., Nickerson, J. & Owan, H. (2003). Team Incentives and Worker Heterogeneity:

An Empirical Analysis of the Impact of Teams on Productivity and Participation. Journal of

Political Economy, 111(3), 465-497.

Holmström, B. (1982). Moral hazard in teams. The Bell Journal of Economics, 13(2), 324-

340.

Ichniowski, C., Shaw, K. & Prennushi, G. (1997). The Effects of Human Resource
Management Practices on Productivity. The American Economic Review, 87(3), 291-313.

Iranzo, S., Schivardi, F., & Tosetti, E. (2008). Skill Dispersion and Firm Productivity: An
Analysis with Employer-Employee Matched Data. Journal of Labor Economics, 26(2), 247-285.

Itoh, H. (1991). Incentives to help in multi-agent situations. Econometrica, 59(3), 611-636.

Lazear, E. P. (2000). Performance Pay and Productivity. The American Economic Review,

90(5), 1346-1361.

List, J. & Rasul, I. (2011). Field Experiments in Labor Economics. Handbook of Labor

Economics, 4A, 103-228.

Shearer, B. (2004). Piece rates, fixed wages and incentives: Evidence from a field experiment.

Review of Economic Studies, 71(2), 513-534.

Stigler, S. (1997). Regression towards the mean, historically considered. Statistical Methods

in Medical Research, 6(2), 103-114.


Appendix I: Tables and Figures

Figure 1: The Team Bonus

Figure 2: A map of shops by treatment (red) and control group (blue)


Figure 3: Scatter plot, year on year sales growth on log sales, April, May and June 2013

Figure 4: Kernel distribution sales growth treatment versus control group


Table 1: Characteristics of the control and treatment shops before the treatment

                                                      Control         Treatment       t-test
                                                      (n = 96)        (n = 97)        p-value

Panel A: Quantitative performance indicators
Mean monthly sales (SD)                               27453 (11481)   28194 (14542)   0.695
Mean monthly sales (in logs, SD)                      10.14 (0.39)    10.15 (0.41)    0.846
Unsold goods as % of sales (SD)                       16.16 (7.0)     15.54 (6.9)     0.331
Mean number of customer-visits (SD)                   10028 (3921)    10131 (4018)    0.856
Mean monthly quit rate (SD)                           1.9% (4.1%)     1.8% (4.1%)
Frequency of achieving the sales target               35.8%           35.2%           0.860

Panel B: Qualitative performance indicators
Mean mystery shopping score 2013 (SD)                 96.1%           95.5%
Mean mystery shopping score 2014 (SD)                 32.2            32.2

Panel C: Shop location
Big town                                              37.6%           33.6%
Medium/small town                                     26.0%           29.6%
Village                                               36.4%           36.7%

Panel D: Characteristics of shop managers
Mean age, years                                       39.8 (6.4)      40.9 (6.3)
Share of females                                      94.9%           93.0%
Share of full-time employees                          71.8%           64.8%

Panel E: Characteristics of sales agents
Total number of sales agents                          552             580
Mean number of agents per shop (SD)                   7.4 (3.2)       7.4 (3.2)
Mean age, years                                       39.5 (6.1)      39.9 (6.0)
Share of females                                      93.1%           92.4%
Share of employees with a permanent contract          66.6%           67.9%
Share of full-time employees                          9.7%            10.4%
Share of part-time employees                          56.7%           59.7%
Share of employees with a "mini-employment" contract  33.6%           29.9%
Share of unskilled workers                            77.5%           72.3%

Panel F: Employee attitudes
Mean commitment score (SD)                            4.50 (1.55)     4.42 (1.69)     0.523
Mean work satisfaction score (SD)                     4.45 (1.51)     4.33 (1.57)     0.422
Mean overall satisfaction score (SD)                  4.98 (1.63)     4.90 (1.70)     0.548

Standard deviations are in parentheses. Column 3 reports the p-values of the two-sided t-test of
equality of the means for a selection of variables. "Big town", "medium/small town" and "village"
refer to municipalities with more than 60,000; 5,000 to 60,000; and fewer than 5,000 inhabitants,
respectively. Panels D and E are based on the personnel records from the firm as of July 1 2014,
excluding apprentices and interns (18 in the control and 11 in the treatment group). Panel F reports
the means of the commitment, work satisfaction and overall satisfaction scores constructed
according to Allen and Meyer (1990) from the employee survey administered in March 2014. In
total, 563 employees in the control, and 580 employees in the treatment group participated in the
survey (response rate 79.5%).

Table 2: Characteristics of the control and treatment shops in the treatment period
(April - June 2014)

                                                      Control         Treatment       Diff-in-Diff t-test
                                                      (n = 96)        (n = 97)        p-value

Panel A: Quantitative performance indicators
Mean monthly sales (SD)                               25376 (10708)   26995 (15036)   0.061
Mean monthly sales (in logs, SD)                      10.06 (0.40)    10.10 (0.42)    0.034
Unsold goods as % of sales (SD)                       22.88 (9.8)     22.35 (13.3)    0.940
Mean number of customer-visits (SD)                   9115 (3582)     9465 (3790)     0.062
Mean monthly quit rate (SD)                           1.42% (4.89)    1.69% (5.64)    0.336
Sales targets achieved                                44.8%           49.1%           0.442

Panel B: Qualitative performance indicators
Mean mystery shopping score 2014 (SD)                 32.4% (1.0%)    32.2% (1.2%)    0.295

Panel F: Employee attitudes
Mean commitment score (SD)                            4.20 (1.28)     4.24 (1.35)     0.468
Mean work satisfaction score (SD)                     4.39 (1.34)     4.48 (1.20)     0.245
Mean overall satisfaction score (SD)                  3.59 (1.12)     3.72 (1.02)     0.162

Standard deviations are in parentheses. Column 3 reports the p-values of the two-sided significance test
for the difference-in-difference estimate of the treatment effect. The second employee survey was
administered in May 2014 with a response rate of 76%.

Table 3: Treatment effect estimates

Specification (1) (2) (3) (4) (5) (6) (7)

Treatment effect 0.032 0.033 0.030 0.030 0.032 0.026 0.027

(.013) (.011) (.014) (.014) (.014) (.014) (.014)

Shop fixed effects yes yes yes no no no no

Month dummy vars yes yes no no no no no

Other controls yes yes no yes yes yes yes

Observations 4916 4904 386 193 577 561 577

The table shows the difference-in-difference treatment effect estimates based on several regression specifications with the log sales as the

dependent variable. In all specifications the unit of observation is individual shop. In specification 1, we regress monthly sales from January

2012 until June 2014 on the "treatment group" and "after treatment" dummies and their cross-product. Specification 2 is the same but omits the

outliers, defined as year-on-year sales change exceeding 30% (roughly the top and bottom 1% of the sales growth distribution). The reasons for

such substantial increases or decreases in sales are construction sites close to the bakeries, competitors who enter or leave the market,

temporary closures of shops because of renovations, or sunny weather, which affects sales in bakeries located in shopping centers. Specification

3 is the same as 1, except that we use log average sales over the periods before and after the treatment (hence two observations per shop).

Specification 4 includes past sales as an additional control, hence one observation per shop. In specification 5, we regress the log monthly sales

in April, May and June 2014 (the treatment period) on the treatment dummy and the baseline sales in the respective shop, defined as the log

average sales over the pre-treatment period. In specification 6, we regress the log monthly sales in the treatment period on the treatment dummy

and the log sales in the respective months in 2013. Specification 7 is the same as 5 except that we use the log average sales in January-March

2014 as the baseline. Standard errors are clustered by shop. Cluster-bootstrapped standard errors (available on request) are similar in

magnitude.


Table 4: Treatment effect by month

Specifications April 2014 May 2014 June 2014

Treatment effect 0.029 0.037 0.029

(.011) (.022) (.014)

Observations 4532 4532 4532

The regression specification is the same as spec. 1 in Table 3: The log

monthly sales regressed on the "treatment group" and "after treatment"

dummies, their cross-product, and controls.


Table 5: Treatment effect by shop location

April 2014 May 2014 June 2014 Overall

Shops located in big towns 0.059 0.055 0.049 0.055

(.019) (.051) (.024) (.025)

Shops in midsize towns 0.023 0.049 0.045 0.038

(.018) (.023) (.026) (.02)

Shops in villages 0.004 0.011 -0.001 0.005

(.019) (.02) (.022) (.019)

The regression specification is the same as spec. 1 in Table 3. The cells in the table give the estimated treatment effect in a given month and location. For example, 0.059 is the treatment effect in April 2014 in shops located in big towns. Standard errors are clustered by shop.


Table 6: Treatment effect by quartile of shop size (number of workers)

                Whole sample    Big towns    Elsewhere

Quartile 1         0.001          0.016       -0.006
                  (.024)         (.041)       (.029)

Quartile 2         0.022          0.005        0.027
                  (.022)         (.038)       (.027)

Quartile 3         0.041          0.046        0.056
                  (.027)         (.054)       (.023)

Quartile 4         0.059          0.125       -0.029
                  (.025)         (.043)       (.024)

Observations        4916           1760         3156

Shop size is defined as the number of workers employed in a shop excluding those on a mini job. Quartile of shop size is defined separately for each subsample (shops located in big towns tend to employ more workers). Standard errors are clustered by shop.


Table 7: Treatment effect by shop-average employee age and tenure in January-March 2014 (the quarter before the treatment)

                          Whole Sample                  Big towns                    Elsewhere
                    Below median  Above median   Below median  Above median   Below median  Above median

Treatment effect       0.043         0.021          0.068         0.033          0.022         0.013
                      (.019)        (.017)         (.043)        (.032)         (.018)        (.022)
Observations            2446          2470            873           887           1599          1557

                    Below median  Above median   Below median  Above median   Below median  Above median

Treatment effect       0.061         0.001          0.063         0.043          0.034         0.003
                      (.019)        (.017)         (.036)        (.031)         (.024)        (.016)
Observations            2453          2463            894           866           1601          1555

The samples are split into below and above the median age/tenure of the workforce excluding workers employed in a mini job. Standard errors are clustered by shop.

Tenure

Whole Sample Big towns Elsewhere

Age


Table 8: Treatment effect by the average share of mini-job employees

                              Whole sample    Big towns    Elsewhere

Quartile 1 (<0.06)               0.071          0.052        0.079
                                (.033)         (.076)       (.036)

Quartile 2 (0.06 - 0.11)         0.050          0.098        0.005
                                (.026)         (.042)       (.031)

Quartile 3 (0.11 - 0.16)         0.003          0.053       -0.011
                                (.019)         (.039)       (.02)

Quartile 4 (>0.16)              -0.003         -0.019        0.002
                                (.021)         (.035)       (.027)

Observations                      4916           1760         3156

The share of mini-job workers is defined as the ratio of the hours worked by these workers to the total hours worked. Quartiles of the share of mini-job workers are very similar for every location, and so are defined on the whole sample. Standard errors are clustered by shop.


Table 9: Treatment effect by pre-treatment deviation of sales from the target

Distance measure: pre-treatment average sales/target difference

                     Quartile 1      Quartile 2        Quartile 3      Quartile 4
                      (<-8%)       (-8% to -4.5%)    (-4.5% to 0%)       (>0%)

Treatment effect       0.046           0.036             0.047           0.003
                      (.026)          (.028)            (.027)          (.017)

Observations            1202            1242              1246            1226

Distance measure: pre-treatment frequency of achieving the target

                     Quartile 1      Quartile 2        Quartile 3      Quartile 4
                      (<16%)        (16% to 30%)      (30% to 50%)       (>50%)

Treatment effect       0.052           0.048             0.026          -0.009
                      (.022)          (.025)            (.03)           (.016)

Observations            1256            1255              1228            1117

The regression specification is the same as spec. 1 in Table 3. Standard errors are clustered by shop.


Table 10: Treatment effect on the number of customer visits and sales per customer visit

                                       All shops    Big towns    Other locations    Village

Treatment effect on customer visits      0.027        0.046          0.032           0.006
                                        (.011)       (.02)          (.019)          (.017)

Treatment effect on sales per visit      0.004        0.008          0.004           0.000
                                        (.007)       (.018)         (.005)          (.007)
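Because sales equal customer visits times sales per visit, log sales are the sum of log visits and log sales per visit. If both outcomes in Table 10 are measured in logs and estimated with the same specification and sample (an assumption, since the table does not say so), the two effects add up to the implied effect on log sales. The short Python sketch below does this arithmetic for the "All shops" column; the variable names are illustrative.

```python
# Point estimates from the "All shops" column of Table 10.
effect_visits = 0.027     # treatment effect on customer visits
effect_per_visit = 0.004  # treatment effect on sales per visit

# Assumption: both outcomes are in logs, so the effects are additive.
implied_sales_effect = effect_visits + effect_per_visit

# Share of the implied sales effect that comes from additional visits.
share_from_visits = effect_visits / implied_sales_effect

print(f"{implied_sales_effect:.3f} {share_from_visits:.0%}")  # prints "0.031 87%"
```

Under these assumptions, most of the implied sales effect comes from additional customer visits rather than from higher sales per visit.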


Appendix II

Information leaflet

<LOGO OF THE BAKERY>

TO ALL FULL-TIME AND PART-TIME EMPLOYEES: EARN YOUR TEAM BONUS

In April, May and June 2014, the team of your shop will receive a team bonus for reaching or exceeding the sales targets. This is how the bonus program for full-time and part-time employees works:

For reaching the sales target or exceeding it by up to 1%, the shop team receives a bonus of €100 for the respective month.

At 1% to 2% above the sales target, the shop team receives a bonus of €150.

At 2% to 3%, the team bonus is €200.

At 3% to 4%, the team bonus is €250.

At 4% or more, there is a team bonus of €300.

Each shop team can thus earn a bonus of up to €900 per quarter! Please note:

Details on how the bonus is divided among team members and on absences can be found in the information letter.

Unfortunately, for tax reasons, we cannot apply this scheme to marginally employed staff (mini-job workers).

If you have any questions, please contact your district managers, who will be happy to help and will regularly let you know whether you have reached your sales targets.
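The bonus schedule in the leaflet is a step function of the percentage by which monthly sales exceed the target. A minimal Python sketch follows. The leaflet does not say which bonus applies exactly at the 1%, 2% and 3% boundaries, so the sketch assumes that each step starts at its lower bound, in line with the "4% or more" wording. The function names and the example figures are hypothetical.

```python
def monthly_team_bonus(sales, target):
    """Team bonus in euros for one month under the leaflet's schedule.

    Assumption: a boundary value (exactly 1%, 2%, 3% above target) earns
    the higher step, mirroring the "4% or more" rule for the top step.
    """
    if sales < target:
        return 0
    excess_pct = (sales / target - 1.0) * 100.0
    if excess_pct >= 4.0:
        return 300
    if excess_pct >= 3.0:
        return 250
    if excess_pct >= 2.0:
        return 200
    if excess_pct >= 1.0:
        return 150
    return 100


def quarterly_team_bonus(months):
    """Sum of monthly bonuses; months is a list of (sales, target) pairs."""
    return sum(monthly_team_bonus(sales, target) for sales, target in months)


# Made-up example: target of 10,000 in each of the three months.
example_quarter = [(9_900, 10_000), (10_150, 10_000), (10_500, 10_000)]
example_bonus = quarterly_team_bonus(example_quarter)  # 0 + 150 + 300

# Exceeding the target by 4% or more in all three months gives the maximum.
max_bonus = quarterly_team_bonus([(10_500, 10_000)] * 3)
```

The maximum of €900 per quarter stated in the leaflet corresponds to earning the top step of €300 in each of the three months.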