TRANSCRIPT
Agent Sorting by Incentive Systems in Mission Firms: Implications for Health Care and Other Credence Goods Markets
December 2, 2019
Abstract
Intrinsic motivation and monetary incentives affect an expert's choice of firm, and thus also the performance of the firm. These effects are particularly important for consumers of credence goods in mission industries, where experts perform duties that align with their own personal belief systems and quality cannot be directly incentivized. In this environment, different contract designs may attract experts with different intrinsic motivations to select into a mission industry, and therefore likely play a significant role in determining the quantity and quality of labor supply. We use a model of expert decision-making to predict how three different contract designs (capitation, fee-for-service, and salary) affect experts' decisions to enter mission firms with credence goods and the performance of the firm thereafter. The model demonstrates that contracts that constrain maximum earnings (e.g., capitation and salary) will likely reduce supply but retain experts who are highly mission motivated, thereby increasing industry performance. To test the model's ability to predict the marginal effect of limiting the available contracts in a mission firm on expert preferences and performance, we constructed a laboratory experiment. Our results demonstrate the potential for a tradeoff between labor supply and quality of performance across different contract designs.
Key words: Credence Goods, Mission, Incentive, Labor-Supply, Health Care
JEL Codes: C91, D82, J22, I18
Acknowledgements The authors would like to thank participants at the University of Virginia's JEL seminar, the North American Economic Science Association Meeting, and the Southern Economic Association's Annual Meeting for their helpful feedback.
1 Introduction
In standard labor markets, monetary incentives determine which firms attract workers
and, after finding employment, the amount of effort the workers provide. However, in
mission industries, intrinsic motivation for the firm’s mission is an additional factor for
these two outcomes. The countervailing influence of monetary incentives and intrinsic
motivation is particularly important for consumers of credence goods in mission indus-
tries. The characteristic trait of a credence good market is that customers do not observe
whether the services they were provided were of the type or quality that they needed.
In contrast, the expert service providers have better information in both the assessment
of the services needed ex ante and whether or not the services were appropriate ex post,
presenting them with a nearly ideal environment to place their own interests ahead of their
customers' (Emons, 1997; Dulleck and Kerschbamer, 2006; Dulleck et al., 2011). For example,
physicians have better information regarding diagnosis and appropriate treatments
than the patient. Therefore, payments cannot be directly tied to quality and the patient’s
well-being is largely determined by physicians’ intrinsic motivations to help their patients.
More generally, the performance of mission firms that produce credence goods is deter-
mined by how closely aligned the firms’ missions are with their employees’ personal belief
systems.
Many credence goods firms and policy makers use contract design to motivate employ-
ees under the constraint that monetary payments cannot be directly tied to quality. In
the 1990s, many health care providers adopted a capitated incentive system in place of
a fee-for-service incentive system to reduce health care expenditures (Frakt and Mayes,
2012). In higher education, piece rate bonuses have become a popular way to motivate
more research output (Franzoni et al., 2011). And, of course, salaries are commonplace
across many industries. However, if and how these contract designs attract motivated
workers into mission firms is understudied and, as noted above, pivotal for the success
of the contract or policy. We investigate the interplay of contract design with the effects
of mission-motivation along two dimensions: 1) selection into a mission industry or a
standard labor market and 2) firm performance in the mission industry conditional on
this selection.
Motivated by the healthcare application, we consider the market effects of restricting
payment to one of three types of contracts for the mission firm: fee-for-service, where the
expert is paid a piece rate per service provided; capitation, where the expert is paid a piece
rate per customer they serve; and salary, where the expert is paid a flat rate regardless of
provided services. We develop a model with experts of varying levels of mission-motivation
to predict the labor supply and quality of services provided under the different incentive
systems.
Broadly speaking, we demonstrate a tradeoff between labor supply and quality to the
mission firm. We show that fee-for-service attracts the most experts and so implies the
largest labor supply to the mission firm. This is because the fee-for-service incentive
system allows all types to maximize their utility by entering the mission firm. Profit-
driven types can provide too many services in order to maximize earnings, while mission-
driven types can still provide the desired appropriate services. However, the monetary
incentives in capitation and salary are not great enough to attract profit-driven types
leaving only experts with mission motivation in the mission firm. Further, in exploring
how selection impacts quality of service, we show that it is exactly the profit-driven types
that drive quality down, so while there is greater labor supply in fee-for-service, we show
that performance will be lower in fee-for-service than in salary and capitation. Though we
will introduce the subtle details in the model section, we predict that the supply of labor
will be the greatest in fee-for-service followed by salary and then capitation. Conversely,
we predict that quality conditional on selecting into the mission firm will be greatest in
salary followed by capitation and then fee-for-service.
To test the model and the predicted comparative statics, we conducted a real-effort
credence goods laboratory experiment. In our experiment, participants played the role of
experts and were asked to choose between a mission firm and a non-mission firm. The
task, proofreading math quizzes, was the same for both firms. In the mission firm, an
expert worked for a customer, another subject in the room, whose payment was depen-
dent on the expert’s performance in the real-effort task. In the non-mission firm, the
expert’s performance had no impact. We varied the set of available incentive systems
offered in the mission firm across treatments, while the non-mission firm always offered a
salary. Specifically, in the baseline treatment, experts could select among fee-for-service,
capitation, or salary in the mission firm. In the other three main treatments, the mission
firm offered only one of the incentive systems.
Our experiments provided several insights into the accuracy of our model in predicting
the interaction among mission drive, incentive system, and selection into a mission firm
producing credence goods. First, when incentive systems were restricted in the mission
firm, labor supply was the greatest when the fee-for-service incentive system was offered,
followed by salary, and lastly capitation. Second, we found significant differences in
customer outcomes across the payment structures; customer payoffs were largest for salary,
followed by capitation, and lastly fee-for-service. In short, we found support for both of
the predicted comparative statics. However, we also found that the latter result was
largely driven by the incentive system itself, rather than the differing distributions of
mission motivated experts that select into the various incentive systems. This finding is
consistent with our model; however, our model also predicts that the distributions may
influence customer payoffs as well, and, if they did, the negative impact on customer payoff
from fee-for-service would only be more pronounced. This leads us to the recommendation
for social planners to push for a salary incentive system in credence goods markets with
missions. The salary incentive system produces the highest quality outcomes for customers
regardless of the distribution of mission-driven experts in the labor supply. We explore
this recommendation in more detail in our conclusions.
The remainder of the paper is organized as follows. Section 2 provides an overview of
existing literature exploring financial incentives, intrinsic motivation, and sorting. Section
3 introduces our model. Section 4 discusses our experimental design, predictions, and
protocol. Section 5 provides our results. Section 6 relates our results to observations from
the health care industry’s labor market and concludes.
2 Financial Incentives, Intrinsic Motivation, and Expert Sorting
in Mission Firms
The study contributes to the literature on contracts in mission-based firms. Previous work
has demonstrated that the alignment of firm and expert mission improves expert perfor-
mance without additional costs to the firm (Tonin and Vlassopoulos, 2010). Besley and
Ghatak (2005) and Carpenter and Gong (2016) found that when workers were matched
in mission with their organizations, financial and mission preferences could be substi-
tutes. Likewise, several field studies have found that varying salaries offered by mission
firms alters the characteristics of the applicant pool (Barfort et al., 2015; Delfgaauw and
Dur, 2007; Dal Bo et al., 2013; Deserranno, 2019). Specifically, mission firms offering
lower wages were still able to attract motivated applicants and as firms increased wages
the number of non-motivated applicants increased as well. Delfgaauw and Dur (2007) provide
a model that predicts this tradeoff by classifying workers as lazy or motivated. In the
model, higher salary wages make the unverifiable public sector more attractive to lazy
workers leading to crowding out of motivated workers and worse outcomes. We expand
on this work by considering incentive systems other than salary, which are common in
credence goods markets. Further, our experimental design allows us to not only measure
the incentive system’s impact on agent selection, but also the performance of the agents
in the mission firm.
Laboratory experiments on sorting have found that self-selection of incentive system
interacts with experts’ skills and performance (Cadsby et al., 2007; Dohmen and Falk,
2011; Macpherson et al., 2014). Workers who were given the opportunity to select their
incentive system were more productive with piece rate than flat rate (as expected), but
another interpretation of this result is that piece rate payment can attract more productive
workers (Lazear, 2000). However, these results are developed for firms with no mission
and without credence goods and therefore, focus on effort and ability as they relate to
performance-based incentive systems (and also ignore intrinsic motivation). This is a
particularly important distinction, as performance-based pay is unattainable due to the
information asymmetries present in credence goods markets.
Experimental studies of intrinsic motivation and contract design in mission-based firms
suggest that experts in credence goods environments are influenced by incentive system
and the firm’s mission (Hennig-Schmidt et al., 2011; Green, 2014; Banuri and Keefer, 2016;
Huck et al., 2016). In their experimental studies, Green (2014) and Hennig-Schmidt
et al. (2011) both show that experts deviate from their profit-maximizing actions to
improve their customers’ well-being. In a more recent experiment, Bejarano et al. (2017)
allowed experts to select among incentive systems: fee-for-service, salary, and capitation.
Though there was no non-mission firm, the experiment revealed that experts had a strong
preference for fee-for-service and salary. This suggests that if the payment system in a
mission firm is restricted to capitation, there may be a significant drop in labor supply.
What we add to the literature is selection between mission and non-mission firms and
a theory based on expert decision-making describing the selection process. The above
experimental studies laid the groundwork for our investigation by showing that incentive
systems and selection within a mission industry for credence goods influence behavior
in these markets. However, in the real world, experts can choose among jobs in mission
or non-mission firms based on their preferences for particular incentive systems. We
want to understand more about how the incentive systems influence this decision, and
how this may impact productivity of the mission industry. Further, while past models
demonstrate how labor supply to public service varies with different wages, their models
are not amenable to firms that incentivize experts with contracts other than salary. By
incorporating mission-motivation and personal profit into the expert’s utility function, we
are able to predict selection and performance in a non-salary environment.
3 Model
In this section we develop a model of an expert i providing services to J customers in
a mission firm, and the ultimate effects for selection between a mission firm and a non-
mission firm. In the mission firm, each customer j ∈ J asks i to evaluate and provide a
service to address aj tasks. Not all tasks need to be addressed with a service. Indeed,
while a service provides a benefit to the customer in half of the tasks it is harmful in the
other half. For notation, let the total number of tasks be A = ∑_{j=1}^{J} aj.
Initially, neither the expert nor the customer know whether a service is beneficial or
not. However, for each task aj, the expert can exert effort e at a cost of ci to generate
a signal σaj regarding whether a service would be beneficial, σaj = ben, or harmful,
σaj = harm. The signal is not perfectly informative, and we do not specify a distribution
here, although we make an assumption on payoffs below that is inherently an assumption
on the signal structure. For simplicity, we do assume that Pr(σaj = ben) = Pr(σaj =
harm) = 1/2. Alternatively, the expert can exert no effort ne, which is costless and
provides no information. Due to ability and/or time constraints, the expert can exert
effort at most mi times.
For each task aj, after making the effort choice and receiving the signal if i exerted
effort, i decides whether to provide a service s or no service ns. Denote the customer’s
payoff from i’s address to aj by xj; it is a positive value η if a beneficial service is provided,
a negative value ν if a harmful service is provided, and 0 if no service is provided. Let
E[xj] = ∑_{aj} E[xj] denote customer j's total expected payoff from all services provided
and vi = ∑_j E[xj] denote total customer expected payoffs from i's decisions.
Our main modeling assumption is that experts have the utility function
ui = xi − mci + ∑_{j=1}^{J} ( αi·E[xj]·1{E[xj] ≥ 0} + βi·E[xj]·1{E[xj] < 0} ),

where
• xi is i’s monetary payoff.
• m ≤ mi is the number of e’s that i chooses.
• αi, βi ≥ 0 are preference parameters we call i’s preference type.
– Larger αi means i cares more about the mission of providing benefits to j.
– Larger βi means i hates harming j more (note that this is the case where E[xj]
is negative so large βi corresponds to a big disutility).
Finally, the expert has an outside option, the non-mission firm, which pays them k; the
expert therefore chooses to enter the mission firm if and only if ui ≥ k.
The three incentive systems determining xi that we consider are: 1) fee-for-service, where
i receives pf for each service provided; 2) capitation, where i receives pc for each customer
for whom they provide at least one service; and 3) salary, where i receives ps.
For the analysis, a few pieces of additional notation are convenient. When needed to
distinguish among the incentive systems, we will write i's utility as u_i^k and total expected
customer payoffs as v_i^k, where k ∈ {FFS, CAP, SAL} denotes the incentive systems fee-for-service,
capitation, and salary, respectively. Also, we drop all i subscripts hereafter to
ease the exposition.
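To make the participation margin concrete, the utility function and entry rule above can be sketched in a few lines of code. This is an illustration only: the function names and parameter values are ours, not part of the model.

```python
# Sketch of the expert's utility
#   ui = xi - m*c + sum_j [ alpha*E[xj] if E[xj] >= 0 else beta*E[xj] ]
# and the entry rule ui >= k. All names and numbers are illustrative.

def expert_utility(x_i, m, c, expected_customer_payoffs, alpha, beta):
    """Monetary payoff, minus effort costs, plus the mission terms."""
    mission = sum(alpha * v if v >= 0 else beta * v
                  for v in expected_customer_payoffs)
    return x_i - m * c + mission

def enters_mission_firm(u_i, k):
    """The expert enters the mission firm iff utility beats the outside option k."""
    return u_i >= k

# A mission-motivated expert (alpha = 2, beta = 5) with effort cost c = 0.1:
u = expert_utility(x_i=10.0, m=8, c=0.1,
                   expected_customer_payoffs=[0.5, 0.5, -0.2],
                   alpha=2.0, beta=5.0)
# u = 10 - 0.8 + (2*0.5 + 2*0.5) + 5*(-0.2) = 10.2
```

Note how the mission terms weight positive and negative expected customer payoffs asymmetrically, which is what lets β = ∞ later rule out harmful behavior entirely.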
Before going into the details, we make a comment on the strategy of the expert. For
each aj, i will choose one of the following four options:
• Option 1: Choose effort and then provide a service if and only if σaj = ben. Denote
j’s expected payoff from this option by
y1 = E[xj|e, s iff σaj = ben]
• Option 2: Choose effort and then provide a service regardless of σaj . Denote j’s
expected payoff from this option by
y2 = E[xj|e, s]
• Option 3: Choose no effort and then provide a service. Denote j’s expected payoff
from this option by
y3 = E[xj|ne, s]
• Option 4: Choose no effort and then provide no service. The expected payoff to j is
0.
We consider cases where y1 > y2, y3 < 0, and E[xj | e, s, σaj = harm] < 0 (see footnote 1). The case
y3 < 0 is natural, because otherwise there is little incentive to exert effort. Furthermore,
the other two are also natural so that signals are informative. In some applications, it
might be reasonable to additionally assume that y2 = y3 because services are provided
for sure in both cases. This would make the analysis simpler (Option 3 would dominate
Option 2), but in our application we think they are likely different and even that y2 > 0
(see footnote 3).
There are two other strategies i could choose to address a given aj, but they are
dominated because i’s payoff is (weakly) increasing in services provided, not choosing
effort, and the customers’ payoffs. The first other strategy is effort followed by a service
if and only if σaj = harm, but this is dominated by Option 2, because the customer’s
payoff is larger from a service following the signal ben and so if it is optimal to provide a
service after harm it is also optimal after ben. The second is effort followed by no service
regardless of σaj , but this is dominated by Option 4 which only differs in that i does not
pay the cost of effort.
3.1 Analysis
We analyze behavior for the three different incentive systems by considering two speci-
fications for α and β. First, we look at α = β = 0 (profit-maximizers). For this type,
maximizing x is the only priority so i never exerts effort and provides any number of
services in salary, at least one service per customer in capitation, and all A services in
fee-for-service. For concreteness, we assume that i performs the smallest number of services
that maximizes x. Expert payoff is ps in salary, Jpc in capitation, and Apf in fee-for-service.
Total customer expected payoffs are 0 in salary, Jy3 in capitation, and Ay3 in fee-for-
service. Note that y3 < 0 and J ≤ A (strictly less as long as some customer has at least
two tasks) so, from the customer’s perspective, salary is best and fee-for-service is worst.
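The profit-maximizer case can be tabulated directly. The sketch below uses the payment parameters from the experiment described in Section 4 (ps = $24.64, pc = $1.54, pf = $0.56, J = 16, A = 88) and the y3 = −.05 implied by footnote 3; the function name is ours.

```python
# Payoffs when alpha = beta = 0: no effort, and the least services needed
# to maximize the monetary payoff x under each incentive system.

def profit_maximizer_payoffs(p_s, p_c, p_f, J, A, y3):
    """Return {system: (expert pay, total expected customer payoff)}."""
    return {
        "salary": (p_s, 0.0),                  # no services at all
        "capitation": (J * p_c, J * y3),       # one uninformed service per customer
        "fee_for_service": (A * p_f, A * y3),  # an uninformed service on every task
    }

payoffs = profit_maximizer_payoffs(p_s=24.64, p_c=1.54, p_f=0.56,
                                   J=16, A=88, y3=-0.05)
```

Since y3 < 0 and J < A, the tabulation reproduces the ranking in the text: customers do best under salary and worst under fee-for-service when the expert is a pure profit-maximizer.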
We assume that for everyone else, β =∞. The assumption β =∞ simply means that
hurting customers (in expectation) is never an option and then α measures how much
the expert cares about them ranging from α = 0 (do-no-harm) to α very large (mission
1 The expected values y1, y2, and y3 are ex ante to decision-making. This last expected value is ex post: it is the expected value after i exerts effort and receives the signal harm.
driven). For ease of exposition we will identify profit-maximizers by α = β = 0 and all
others by α (with the implicit assumption that β = ∞ for these types). We now go
through each of the three incentive systems in closer detail for this case.
3.1.1 Salary
In salary, one’s own monetary payoff is fixed so it does not enter the decision. Option 2 is
dominated by Option 1, because Option 1 provides a higher customer payoff for the same
cost c. Option 3 is dominated by Option 4 because Option 3 provides a negative expected
customer payoff. Hence, the expert decides between Options 1 and 4 which give i the
payoffs −c+ αy1 and 0 respectively. So i chooses Option 1 as many times as possible (m
times) if α ≥ c/y1. Payoffs for the expert (uSAL) and expected payoffs for the customers
(vSAL) are summarized as follows:
uSAL = ps                   if α < c/y1
uSAL = ps + m(−c + αy1)     if α ≥ c/y1

vSAL = 0       if α < c/y1
vSAL = my1     if α ≥ c/y1
We are considering a continuum of preference types, but there are just two broad types
of behavior we predict (profit-maximization is the same as Option 4 here, although it will
be distinct for the other incentive systems). Hence, we define what we call a behavior
type as the behavior induced by some underlying preference types. We will give them a
descriptive name, put the preference types that display the behavior in parentheses, and
then describe their behavior. For salary, the two behavior types are
• Profit-Maximizer/Do-Nothing: (α = β = 0 or α < c/y1) Perform no services and
deliver an expected payoff of 0 to the customers.
• Do-Best: (α ≥ c/y1) Perform m/2 services in expectation and deliver a positive expected
payoff to customers.
One final important note is that behavior can differ within behavioral types because
experts have different values of m. So different experts within the group may perform a
different number of services, but they will provide them using the same option.
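A minimal sketch of the salary decision rule, with illustrative values of c and y1 (these numbers are ours, chosen only to show the threshold at work):

```python
# Salary: behavior reduces to comparing alpha with the threshold c/y1.
# The expert either does Option 1 m times (Do-Best) or nothing.

def salary_outcomes(alpha, c, y1, m, p_s):
    """Return (expert utility uSAL, total expected customer payoff vSAL)."""
    if alpha >= c / y1:                        # Do-Best type
        return p_s + m * (-c + alpha * y1), m * y1
    return p_s, 0.0                            # Profit-Maximizer / Do-Nothing

# With c = 0.1 and y1 = 0.15 the threshold is c/y1 = 2/3, so an expert with
# alpha = 1 works (Do-Best) while one with alpha = 0.5 does nothing:
u, v = salary_outcomes(alpha=1.0, c=0.1, y1=0.15, m=10, p_s=24.64)
```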
3.1.2 Capitation
In capitation, the expert’s monetary payoff only depends on doing at least one service per
customer. It may be tempting to think that Options 2 or 3 could be optimal (at least
until a service has been provided for a given customer), but this is never the case. If i
provides a first service to customer j after no effort or after σaj = harm, then E[xj] < 0
for this service so i will have to provide more services (to avoid a negative infinity payoff).
But then, i should just wait until σaj = ben, because providing the service now does not
affect i’s final monetary payoff and negatively affects j. The expert will again do only
Option 1 or Option 4.
Since Pr(σaj = ben) = 1/2, the expected number of tasks per customer needed to provide a
service with Option 1 is 2. The expert will never do Option 1 if pc + 2(−c + αy1) < 0.
Otherwise, they will do Option 1 until σaj = ben and a service is provided. Once
they have done a service for j, pc drops out of the calculation, and so continuing to do Option
1 comes down to the sign of −c + αy1, just as under salary.2 Expected payoffs are
summarized as follows
uCAP = 0                                            if α < (c − pc/2)/y1
uCAP = min{m/2, J}pc + min{m, 2J}(−c + αy1)         if (c − pc/2)/y1 ≤ α < c/y1
uCAP = min{m/2, J}pc + m(−c + αy1)                  if α ≥ c/y1

vCAP = 0                    if α < (c − pc/2)/y1
vCAP = min{m, 2J}y1         if (c − pc/2)/y1 ≤ α < c/y1
vCAP = my1                  if α ≥ c/y1
There are four behavioral types here:
• Profit-Maximizer: (α = β = 0) Perform J services and deliver a negative expected
payoff to the customers.
• Do-Nothing: (α < (c − pc/2)/y1) Perform no services and deliver an expected payoff
of 0 to the customers.
• Do-One: ((c − pc/2)/y1 ≤ α < c/y1) Perform (at most) 1 service per customer, which
delivers a small positive expected payoff to any customer who gets a service.
• Do-Best: (α ≥ c/y1) Perform m/2 services in expectation, which delivers a positive
expected payoff to any customer who gets a service.
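The capitation thresholds can be sketched the same way. The values of c and y1 below are illustrative; pc = 1.54 matches the experiment described later, and m/2 is the expected number of customers served because each Option 1 service takes 2 tasks in expectation.

```python
# Capitation: two thresholds, (c - pc/2)/y1 and c/y1, split behavior into
# Do-Nothing, Do-One, and Do-Best (beta = infinity types). Sketch only.

def capitation_outcomes(alpha, c, y1, m, J, p_c):
    """Return (uCAP, vCAP) for a beta = infinity expert."""
    served = min(m / 2, J)                     # expected customers served
    if alpha >= c / y1:                        # Do-Best
        return served * p_c + m * (-c + alpha * y1), m * y1
    if alpha >= (c - p_c / 2) / y1:            # Do-One
        tasks = min(m, 2 * J)
        return served * p_c + tasks * (-c + alpha * y1), tasks * y1
    return 0.0, 0.0                            # Do-Nothing

# A weakly motivated expert (alpha = 0.3) still serves customers here
# because the per-customer payment pc subsidizes the first service:
u, v = capitation_outcomes(alpha=0.3, c=0.1, y1=0.15, m=10, J=16, p_c=1.54)
```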
3.1.3 Fee-For-Service
Fee-for-service is more complicated, because every service provides a monetary payoff to
the expert and so all four Options are potentially optimal. First, note that the payoff
from doing Option 3 is pf +αy3 as long as the total expected payoff for j is non-negative.
2 The order in which i goes through the tasks also matters here, and we assume that this is a choice variable for the expert (it is in the experiment). The optimal thing to do is obviously to go through each customer's tasks until a service is provided and then move on to the next customer. If all customers are served, then, time permitting and if i's preference is such, i will return to previously served customers to provide more services later.
Hence, if i does Option 1 or Option 2 for customer j and is delivering a positive payoff,
they will want to follow with Option 3 if pf + αy3 > 0 until j’s payoff is lowered to 0 or i
runs out of tasks.
Each Option 1 raises customer payoff by y1 and so can be accompanied by y1/(−y3)
Option 3's (we assume partial tasks are possible for simplicity, though the analysis is
basically the same without this assumption). The associated utility is

(1/2)pf − c + αy1 + (y1/(−y3))(pf + αy3) = (1/2 − y1/y3)pf − c
Each Option 2 raises customer payoff by y2 and so (if y2 > 0; otherwise this option is
never optimal) can be accompanied by y2/(−y3) Option 3's. The associated utility is

pf − c + αy2 + (y2/(−y3))(pf + αy3) = (1 − y2/y3)pf − c
Comparing, Option 1 is optimal if and only if (y1 − y2)/(−y3) ≥ 1/2. Now, suppose either Option
1 is optimal and m(1 + y1/(−y3)) ≤ A, or Option 2 is optimal and m(1 + y2/(−y3)) ≤ A.
Then, behavior is fully characterized: the expert does Option 1 m times and Option 3
(y1/(−y3))m times, or Option 2 m times and Option 3 (y2/(−y3))m times.
However, when m is large, this is not feasible because it would amount to more
than A services. The expert then has to decide whether to do fewer Option 1's/Option 2's
or fewer Option 3's. For the Option 1 case, the former gives (1/2)pf − c + αy1 and the latter
pf + αy3, so the expert will do fewer Option 3's when α ≥ ((1/2)pf + c)/(y1 − y3). For the
Option 2 case, the former gives pf − c + αy2 and the latter pf + αy3, so the expert will do
fewer Option 3's when α ≥ c/(y2 − y3). Finally, when pf + αy3 ≤ 0, the expert will
only do Option 1 or Option 2, which is a comparison of (1/2)pf − c + αy1 to pf − c + αy2,
so Option 1 is better when α ≥ (1/2)pf/(y1 − y2).
For simplicity, we will not specify behavior types based on Option 1 vs. Option 2. It is
hard to observe effort on a specific problem in the experiment, so it is hard to know whether
a non-service happened because of effort and a harmful signal or just no effort. Hence,
the behavior types are
• Profit-Maximizer: (α = β = 0) Perform A services and deliver a very negative payoff
to customers.
• Not-Max-Services-Do-No-Harm: (α < pf/(−y3), and either m(1 + y1/(−y3)) ≤ A with
(y1 − y2)/(−y3) ≥ 1/2, or m(1 + y2/(−y3)) ≤ A with (y1 − y2)/(−y3) ≤ 1/2) Do fewer than
A services and deliver a customer payoff equal to 0 in expectation.
• Max-Services-Do-No-Harm: (α < pf/(−y3), and either m(1 + y1/(−y3)) ≥ A with
(y1 − y2)/(−y3) ≥ 1/2 and α ≤ ((1/2)pf + c)/(y1 − y3), or m(1 + y2/(−y3)) ≥ A with
(y1 − y2)/(−y3) ≤ 1/2 and α ≤ c/(y2 − y3)) Do A services and deliver a customer payoff
equal to 0 in expectation.
[Figure 1: Experimental Protocol. Phase 1: 12-question math quiz; Phase 2: subjects select payment scheme and mission using the strategy method; Phase 3: all subjects proofread 16 math quizzes over 28 minutes; Phase 4: other-regarding risk preferences (MPL); Phase 5: dictator game.]
• Max-Services-Do-Good: (α < pf/(−y3), and either m(1 + y1/(−y3)) ≥ A with
(y1 − y2)/(−y3) ≥ 1/2 and α ≥ ((1/2)pf + c)/(y1 − y3), or m(1 + y2/(−y3)) ≥ A with
(y1 − y2)/(−y3) ≤ 1/2 and α ≥ c/(y2 − y3)) Do A services and deliver a customer payoff
greater than 0 in expectation.
• Do-Best: (α ≥ pf/(−y3)) Do m/2 or m services and deliver a customer payoff
greater than 0 in expectation.
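The bundle comparison at the heart of the fee-for-service analysis can be checked numerically. The y-values below are illustrative (they match the parameterization in footnote 3 with p = 1), and the function name is ours. The point of the sketch is that the α terms cancel, so the Option 1 vs. Option 2 ranking does not depend on mission motivation.

```python
# Fee-for-service: each Option 1 (or Option 2) can be padded with
# y1/(-y3) (or y2/(-y3)) Option 3's before the customer's expected payoff
# is pulled back down to zero. Sketch with illustrative y-values.

def ffs_bundle_utilities(p_f, c, alpha, y1, y2, y3):
    """Utilities of a padded Option-1 bundle and a padded Option-2 bundle."""
    pad = p_f + alpha * y3                     # utility of one Option 3
    opt1 = 0.5 * p_f - c + alpha * y1 + (y1 / -y3) * pad
    opt2 = p_f - c + alpha * y2 + (y2 / -y3) * pad
    return opt1, opt2

# The alpha terms cancel, so which bundle wins is independent of alpha;
# Option 1 is optimal iff (y1 - y2)/(-y3) >= 1/2 (here that ratio is 1):
low = ffs_bundle_utilities(p_f=0.56, c=0.1, alpha=0.0, y1=0.15, y2=0.1, y3=-0.05)
high = ffs_bundle_utilities(p_f=0.56, c=0.1, alpha=5.0, y1=0.15, y2=0.1, y3=-0.05)
```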
4 Experimental Design
We utilized an established experimental design for investigating a mission firm producing
credence goods introduced in Green (2014) and modified in Agrawal et al. (2019). This
design utilizes a real effort task that captures the key characteristics of a credence goods
market and creates a mission firm where actions impact other subjects in the experiment.
Our innovation to this design is to add selection into the mission firm. An overview of
the design is illustrated in Figure 1 and we describe the details in the rest of this section.
4.1 Experimental Phases
There are five phases in the experiment as illustrated in Figure 1. The main phases are
Phase 2 and Phase 3 so we begin with those. It is easier to start with Phase 3 and then
work backwards to describe Phase 2.
Phase 3: Credence Goods Provision
We begin with Phase 3, which corresponds to expert decision-making after entry into
the mission firm in the model, and we will fill in the model parameters as we explain the
design in order to fit the design to the model.
There were 4 identical rounds in this phase, and, in each round, the participants were
asked to proofread 4 math quizzes, each with 12 multiple choice questions taken from
the SAT math test (Official SAT Practice Test, 2015). Each quiz corresponds to a single customer j
and so, in total, there are J = 16 customers. We call this the expert math task, because
the participants were taking on the role of an expert providing the credence good to
customers.
One of the multiple choice answers to each SAT question was provided to the partici-
pants by us, the experimenters. Some of the answers we provided were the correct answer
(where a service would be harmful) while others were an incorrect answer (where a service
could be beneficial). The experts' math task was to edit (i.e., provide a service to) the
provided answers to the math quizzes. We only allowed them to make edits to a subset
of the questions which we call the highlighted questions. The highlighted questions for
each quiz j are the aj tasks in the model. The subset contained all the questions where
we provided an incorrect answer and an equal number of questions where we provided a
correct answer. For example, if a 12-question quiz had 2 incorrect answers, then there
were 4 total highlighted questions: the two with incorrect answers and two with correct
answers. This highlighting protocol ensured that participants knew that there
was exactly a 1/2 chance that any question they could edit had an incorrect answer and
that they could easily calculate the total number of incorrect answers (divide the number
of highlighted questions by two).
The 4 quizzes were identical from round to round, although the answers we filled in
for the questions varied. In total, there were 44 incorrect answers, so participants could
edit up to A = 88 questions. The participants were given this number in the instructions.
They had 7 minutes to make edits in each round. In summary, they had 28 minutes to
proofread and make edits to 88 highlighted questions across 16 quizzes.
Participants who had selected into the mission firm (this selection is Phase 2, see below)
were paid according to one of the three incentive systems from the model.
1. Fee-for-service: The payment was pf = $0.56 for every edit made.
2. Capitation: The payment was pc = $1.54 for every quiz in which at least one edit
had been made.
3. Salary: The payment was ps = $24.64, independent of any edits.
Notice that this means that a participant who correctly edited exactly the questions
with incorrect answers would make $24.64 under all three incentive systems. There is
the potential to make up to $49.28 in fee-for-service if every answer was edited. Also,
payment is never dependent on accuracy of the edits (i.e. whether the incorrect answers
were changed to the correct answer). This feature comes from our service being a credence
good where quality is not observable and so contracts cannot condition on it.
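The calibration of the three payment rates can be verified with a quick arithmetic check (Python shown; math.isclose guards against floating-point rounding):

```python
import math

# Editing exactly the 44 incorrect answers (and thus serving all 16 quizzes)
# pays $24.64 under every incentive system, while editing all 88 highlighted
# questions maximizes fee-for-service earnings at $49.28.
fee_for_service = 44 * 0.56   # 44 edits at pf = $0.56 per edit
capitation = 16 * 1.54        # 16 quizzes with at least one edit at pc = $1.54
salary = 24.64                # flat ps

assert math.isclose(fee_for_service, 24.64)
assert math.isclose(capitation, 24.64)
assert math.isclose(88 * 0.56, 49.28)
```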
Moving on to how we induced a mission firm: for participants who had selected into
[Figure 2: Price List Screen Shot. Each row represents a different scenario; subjects select their preferred job type (Job A or Job B) and payment method for each scenario.]
the mission firm, the edits impacted the earnings of a customer who was a randomly
selected participant in the room. In the mission firm, the customer’s earnings decreased
by ν = $0.10 for each highlighted question in the quizzes that was wrongly edited by the
expert (an incorrect or correct answer was changed to an incorrect answer) and increased
by η = $0.30 for each question in the quizzes that was rightly edited (an incorrect answer
was changed to the correct answer).3 For participants who have selected into the non-
mission firm, the edits have no impact on customer earnings.
The customers were endowed with $24.00 less $0.50 for each incorrect answer that we
had provided to the expert in one round of 4 quizzes. These earnings were then altered
as detailed above if the participant in the expert role selected into the mission firm. They
are the final earnings if the expert selected into the non-mission firm.
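A sketch of the customer-earnings rule just described (our illustration; the figure of 11 incorrect answers per round is an assumption, chosen to reproduce the $18.50 starting point reported in Section 5.3):

```python
# Customer earnings for one round of 4 quizzes, per the parameters in the text.
NU = 0.10   # loss per wrongly edited question (mission firm only)
ETA = 0.30  # gain per rightly edited question (mission firm only)

def customer_earnings(n_incorrect_provided, right_edits, wrong_edits, mission=True):
    endowment = 24.00 - 0.50 * n_incorrect_provided
    if not mission:
        return endowment            # non-mission firm: edits have no effect
    return endowment + ETA * right_edits - NU * wrong_edits

# With 11 incorrect answers provided, the endowment is the $18.50 starting
# point; fixing all 11 raises the customer's payoff, each wrong edit lowers it.
print(customer_earnings(11, 0, 0))              # 18.5
print(round(customer_earnings(11, 11, 0), 2))
print(round(customer_earnings(11, 9, 4), 2))
```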
Phase 2: Incentive System and Mission Selection
In Phase 2, the participants decided when to select into the mission firm and, in the
treatment where multiple incentive systems were offered, selected the incentive system
that would be used in Phase 3. The incentive system in the non-mission firm was also a
salary. Henceforth, we will call this salary in the non-mission firm the outside option as
3 We think effort and signal generation works in approximately the following way. There are two plausible answers toeach question and effort narrows down the set of answers to these two and selects the right one with some probabilityp > 3/4. No effort means an edit is a random guess which (with 5 potential answers) is a beneficial service 1/8 of the time(1/4 chance of a right guess times 1/2 chance the provided answer was wrong). Then y1 = .5p(.3)+.5(1−p)(−.1) = .2p−.05,y2 = .5(.3) + .5(−.1) = .1, and y3 = .125(.3) + .875(−.1) = −.05.
Figure 3: Threshold Screen Shot
in the model in order to distinguish it from the salary in the mission firm.
We used a “price list” strategy method design where participants were asked to choose
between the mission firm (and among the incentive systems offered) and the non-mission
firm for various outside options. The outside options ranged from k = $9.64 to k = $49.64
in increments of $5.00. Hence, they made 9 choices. A rational decision-maker would
choose a threshold, which we call a switching point, for which they would choose the
mission firm for all outside options below the threshold and the non-mission firm for
all outside options equal to or above the threshold. Figure 2 provides the screen the
participants saw when they could choose among all three incentive systems in the mission
firm or the non-mission firm. They had to choose one of the 4 options in each of the 9
rows.4
In addition to the 9 choices in the price list, we asked them to make one more choice.
This choice was a minimum outside option (discretized, using the same 9 outside options
in the price list) for which they would enter the non-mission firm rather than take the
salary of $24.64 and enter the mission firm. This choice was the same for a rational
decision-maker as choosing in the price list when the incentive system in the mission firm
was restricted to salary only. It (if properly understood) elicited the participant's mission
motivation, abstracting away from the incentive system. For example, if the participant
chose a minimum outside option of $34.64, then the value they attached to helping others'
earnings was $10.00. Figure 3 provides the screen participants saw for this
choice. They were asked to click the right and left buttons to choose a minimum outside
option.
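For a rational decision-maker, the minimum-outside-option choice pins down an implied mission value; a minimal sketch of the arithmetic in the example above:

```python
# Implied mission value: a rational decision-maker is indifferent between the
# mission salary plus the value of helping and the chosen minimum outside option.
MISSION_SALARY = 24.64

def implied_mission_value(min_outside_option):
    return round(min_outside_option - MISSION_SALARY, 2)

print(implied_mission_value(34.64))  # 10.0, the example in the text
```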
One of the 10 choices was randomly selected and implemented. For the 9 cases in the
price list, this means implementing the choice made by the participant for that case. If
the minimum outside option choice was selected, another random number was drawn to
determine the outside option, and the participant was then placed in the non-mission firm
if and only if this outside option was greater than or equal to the minimum they
had selected.
Prior to engaging in these decisions, participants completed a quiz to ensure task
4The mission firm was called Job A in the instructions while the non-mission firm was called Job B.
understanding. The quiz asked subjects to calculate their change in earnings and their
customer’s change in earnings in both the mission and non-mission firms. We did not
continue with the instructions until all participants had correctly made these calculations.
Expert and Customer Roles
All of the participants completed phases 2 and 3 in the role of expert. At the end
of the experiment, half of the participants were assigned to the passive role of customer.
Each customer was matched with one expert, and their earnings were determined
as detailed above based on the decision of the expert they were matched to. The experts
were paid based on their decisions in all 4 rounds (all 16 quizzes) according to the incentive
system they had selected. Customers were paid for 1 round (4 quizzes) of edits made by
their matched expert, and that round was determined by a random number generator.
Phases 1, 4, and 5:
Phases 1, 4, and 5 were shorter secondary experiments to elicit three additional pa-
rameters useful for behavioral analysis.
Phase 1: Ability and Overconfidence
To assess ability at math tasks, at the start of the experimental session, the experts
were asked to answer a short 12-question SAT math quiz in 12 minutes (none of the
questions were used in Phase 3). Subjects received $0.20 for each correctly answered math
question and an additional $1.00 if they correctly predicted the number of questions they
got correct. Overconfidence was calculated as the difference between the subject’s ability
(number of correct answers) in the math quiz and their prediction of how many questions
they answered correctly (presumably negative, indicating overconfidence, for most subjects,
although it could be positive, indicating underconfidence, for some). Subjects were
made aware of their overconfidence prior to Phase 2.
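The Phase 1 payoff and the overconfidence measure can be sketched as follows (our illustration with hypothetical inputs, not the authors' code):

```python
# Phase 1: $0.20 per correct answer plus a $1.00 bonus for an exact prediction.
def phase1_payoff(n_correct, prediction):
    bonus = 1.00 if prediction == n_correct else 0.0
    return round(0.20 * n_correct + bonus, 2)

def overconfidence(n_correct, prediction):
    # ability minus prediction: negative = overconfident, positive = underconfident
    return n_correct - prediction

print(phase1_payoff(8, 8))       # 2.6 (8 x $0.20 + $1.00 bonus)
print(overconfidence(6, 9))      # -3: an overconfident subject
```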
Phase 4: Altruistic Preferences
Altruistic preferences were measured through a dictator game. Participants were asked
to complete a $4 dictator game (giving allowed in increments of $0.50) with an anonymous
individual from the experimental session. This individual was randomly selected from all
the participants in the room.
Phase 5: Other-Regarding Risk Preferences
Finally, other-regarding risk preferences were assessed using the standard Holt-Laury multiple price list (MPL),
except that the selected lottery was run for an anonymous individual as in Chakravarty
et al. (2011). This individual was randomly selected from all the participants in the room.
Figure 4: Treatments. Panel (a) Baseline: the mission firm offers fee-for-service, salary, and capitation, while the non-mission firm offers a flat rate. Panel (b) Other: the mission firm offers the treatment payment, while the non-mission firm offers a flat rate.
4.2 Treatments
There were 4 main treatments (and 2 additional robustness treatments that we will de-
scribe later). The treatment variable was the set of available incentive systems in the
mission firm, and we considered all three together (Figure 4(a)) and then each of the three
in isolation (Figure 4(b)). This leads to four treatments.
1. Baseline: Participants choose among fee-for-service, salary, and capitation in the
mission firm.
2. Fee-for-service: Participants receive fee-for-service in the mission firm.
3. Capitation: Participants receive capitation in the mission firm.
4. Salary: Participants receive salary in the mission firm.
We will use italics (Fee-for-service, Capitation, and Salary) when referencing these
treatments and regular text (fee-for-service, capitation, and salary) when referencing the
incentive systems below. Comparisons between Baseline and each treatment, and among
the treatments allows us to both identify the marginal effect of limiting available incentive
systems in a mission firm on agent preferences, and the marginal effect on performance
in the mission firm when available payment structures are limited.
4.3 Hypotheses
In this section, we return to the model and consider two comparative statics that provide
hypotheses to test in the experiment: labor supply to the mission firm (types with u ≥ k)
and quality conditional on entering the mission firm (average v conditional on u ≥ k).
Remark 1
The customers’ final payoffs are jointly determined by the underlying preference types
of experts and the incentive system in the mission firm. This will lead to two effects for
quality in the mission firm for different incentive systems: the distribution of underlying
preference types that select into the mission firm differs across incentive systems and
the behavior of each preference type that has selected in differs too. For example, the
profit-maximizing type will enter the mission firm in Fee-for-service but not in Salary or
Capitation when k > 24.64 (the distribution of underlying preference types that select into
the mission firm differs) and will do a different number of services with different ultimate
effects for the customers in all three when they all do enter the mission firm (the behavior
of each preference type differs). We will try to tease out which effects are important in
the experimental results, but for the following comparative statics, we consider the joint
effect.
Labor Supply
To get a broad overview of which preference types enter the mission firm, the following
table summarizes which agents enter (up to k = 49.28), by dividing up the preference type
space into the regions that characterize the behavioral type space in salary and capitation
and considering values of the outside option below the profit-maximizing value in salary
and capitation and above it.
Labor Supply

Types                    Baseline    Salary      Capitation    Fee-for-service
k ≤ 24.64:
  α = β = 0              In          In          In            In
  α ≤ c/y1               In          In          Some In       Some In
  α > c/y1               In          In          Some In       Some In
k > 24.64:
  α = β = 0              In          Out         Out           In
  α ≤ c/y1               Some In     Out         Out           Some In
  α > c/y1               Some In     Some In     Some In       Some In
First, labor supply is always largest in Baseline. This is obvious as the agents choose
among the three incentive systems and so receive the maximum of the three utilities
by entering the mission firm. Second, labor supply is always larger in Salary than in
Capitation, because profit-maximizers earn the same and uSAL ≥ uCAP for all other
types.
It remains to determine where Fee-for-service fits in relative to Salary and
Capitation. We start with k ≤ 24.64. Then everyone enters in Salary, and so labor supply
is greater in Salary than in Fee-for-service. The comparison between Capitation and Fee-
for-service is trickier. The conditions for entry depend on α, m, c, y1, y2, and y3 but in
different ways and so we make no prediction.
Now, we consider k > 24.64. All the comparisons here depend on the underlying
distribution of preference types and other variables just as above. Intuitively, large α
types should generally prefer Salary as they want to help the customer but may not be
able to do so and get 24.64 in the other treatments if they have a small m. However,
profit-maximizers and some low-α types only enter in Fee-for-service. We think the key
type here is profit-maximizers and make a distributional assumption that there are enough
of them to outweigh the other considerations. In particular, we predict that labor supply
is greatest in Fee-for-service.
The switching point for profit-maximizers to enter the non-mission firm is 49.28 under
fee-for-service but only 24.64 under capitation and salary. Our assumption that there are
sufficient profit-maximizers translates to the highest average switching point in Baseline
followed by Fee-for-service and then Salary and Capitation.
The following hypothesis summarizes.
Hypothesis 1. Comparative Statics for Quantity of Labor Supplied:
• The average switching point at which experts select into the non-mission firm is
ranked
Baseline ≥ Fee-for-service ≥ Salary ≥ Capitation
• For k ≤ 24.64, the proportion of experts who enter the mission firm is ranked
Baseline = Salary ≥ Capitation, Fee-for-service
• For 49.28 ≥ k > 24.64, the proportion of experts who enter the mission firm is ranked
Baseline ≥ Fee-for-service ≥ Salary ≥ Capitation
Service Quality
The average of v is trickier, because of the two effects noted in Remark 1. The following
table then summarizes v for experts in the categories defined in the labor supply table.
The reason that we write ≈ 0 for α ≤ c/y1 in Fee-for-service is that only Max-Services-Do-Good
types deliver a positive payoff, but such types require a delicate balance between α ≤ c/y1
and α ≥ ((1 − q)pf + c)/(y1 − y3) or α ≥ c/(y2 − y3). From a distributional perspective,
we assume few (if any) α types fit this bill.
Customer Payoff v

Types            Baseline    Salary    Capitation      Fee-for-service
Profit Max       -4.40       0         -.80            -4.40
α ≤ c/y1         0           0         min{m, 32}y1    ≈ 0
α > c/y1         ≥ 0         my1       my1             ≥ 0
For k ≤ 24.64, the table suggests that Fee-for-service is the worst, and the comparison
of Salary and Capitation comes down to whether there are many profit-maximizers or
many low-α types. For all comparisons, we also need to be a little careful with the
high-α types, because they select in for different values of m so the expected value of m
conditional on entering is not equal across the two incentive systems.5 Nevertheless, we
maintain the assumption that the number of profit-maximizers is large, which means that
quality in Salary is better than in Capitation. The assumption also means that quality
will be worse in Baseline (where profit-maximizers choose fee-for-service) and Fee-for-
service, because the -4.40 will dominate any positive payoffs from the high-α types in
Capitation and Salary. Finally, quality will be better in Baseline than in Fee-for-service,
because higher-α types choose salary in Baseline where they can deliver a larger payoff
to the customer without sacrificing their own payoff.
For k > 24.64 only the altruistic types remain in Salary and Capitation while everyone
remains in Fee-for-service so the quality differences are only exacerbated.
Hypothesis 2. Comparative Statics for Quality of Services:
• The average of v conditional on entering the mission firm is ranked
Salary ≥ Capitation ≥ Baseline ≥ Fee-for-service
5 Results
5.1 Procedural Details
A total of 252 subjects were recruited from Chapman University and from the University
of Virginia. Approximately the same number from each university participated in each
of the treatments. All analysis was pooled across the universities. In total, we had
66 subjects in Baseline, 46 in Fee-for-service, 46 in Capitation, and 48 in Salary (the
remaining were in the robustness treatments described below).
The participants learned the number of questions they correctly answered in the math
quiz at the end of Phase 1 (and thus they could calculate their overconfidence). They
learned the draw for the strategy method in Phase 2, and therefore they knew whether
they would be in the mission firm or not and the incentive system under which they would
be paid in Phase 3. Finally, they saw how many right and wrong edits they made and
their own resulting payoff at the end of each round in Phase 3. No other feedback was
given until the end of the experiment.
5In fact, it is larger in Capitation than Salary, because the low m-types are the ones that select out.
The experiments took approximately 1 hour and 45 minutes to complete, and no one
participated in more than one session. The experiment was programmed and run in z-
tree (Fischbacher, 2007) with neutral framing. The instructions for Baseline and Fee-for-
service are in the appendix (and the other two treatments are similar to Fee-for-service).
Finally, subjects received their earnings from each Phase at the end of the experiment,
which in total averaged $35.57.
In our analysis, we dropped our measures of mission drive (the outside option minimum
determined alongside the price list in Phase 2) and altruism (dictator game giving in Phase
4). Upon review of the data there were several indications that the tasks did not properly
elicit these preferences. We discuss these indications in the appendix.
All analysis was conducted using subjects who had switching points for their preferences
between the mission and non-mission firm. Again, a switching point is a cutoff
outside option for the non-mission firm such that the subject chose the non-mission firm
if and only if the outside option was at least this cutoff. There were 8 subjects who did
not have switching points, leaving a total of 198 for the analysis. We will have to further
eliminate data for some of the analyses, and we will document this as needed below.
Table 1: Switching Points
Treatment Avg. Switching Point N
Baseline 5.75 63
Fee-For-Service 5.68 44
Capitation 4.56 45
Salary 4.57 46
Treatments Ranksum Test p-values N
Baseline vs.
Fee-For-Service 0.9615 107
Capitation 0.0063 108
Salary 0.0033 109
Fee-For-Service vs.
Capitation 0.0192 89
Salary 0.0125 90
Capitation vs.
Salary 0.9966 91
5.2 Labor Supply to the Mission Firm
Hypothesis 1. Comparative Statics for Quantity of Labor Supplied:
• The average switching point at which experts select into the non-mission firm is
ranked
Baseline ≥ Fee-for-service ≥ Salary ≥ Capitation
• For k ≤ 24.64, the proportion of experts who enter the mission firm is ranked
Baseline = Salary ≥ Capitation, Fee-for-service
• For 49.28 ≥ k > 24.64, the proportion of experts who enter the mission firm is ranked
Baseline ≥ Fee-for-service ≥ Salary ≥ Capitation
To test the first statement of our first hypothesis, we compared the average switching
points across treatments. The switching point we report is the option in the price list for
which the subject switched rather than the value of the outside option (i.e., 1 = $9.64,
2 = $14.64, etc.). The results are presented in Table 1. As the table indicates, participants
switched at larger outside options in Baseline and Fee-for-service than in Capitation and
Salary. That is, the mission firm was chosen more often in Baseline and Fee-for-service.
Statistics with Wilcoxon ranksum tests confirm these observations (see Table 1, Ranksum
tests). Overall, average changes to labor supply to the mission firm across varying outside
options were in line with the predictions made by Hypothesis 1.
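The Wilcoxon ranksum comparisons reported in Table 1 can be sketched with the standard library alone (normal approximation with average ranks for ties; no tie-variance or continuity correction; the data below are hypothetical, not from the experiment):

```python
import math

def ranksum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation."""
    pooled = sorted((v, i) for i, v in enumerate(x + y))
    ranks = {}
    j = 0
    while j < len(pooled):
        k = j
        while k + 1 < len(pooled) and pooled[k + 1][0] == pooled[j][0]:
            k += 1                      # extend over a run of tied values
        avg_rank = (j + k) / 2 + 1      # average 1-based rank of the tie group
        for t in range(j, k + 1):
            ranks[pooled[t][1]] = avg_rank
        j = k + 1
    n1, n2 = len(x), len(y)
    w = sum(ranks[i] for i in range(n1))        # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value

# Hypothetical switching points (price-list rows) for two treatments:
high = [6, 6, 5, 7, 5, 6, 4, 7]
low = [4, 5, 4, 3, 5, 4, 3, 4]
print(ranksum_p(high, low) < 0.05)  # True: the shift is detected
```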
Table 2: Selection into the Mission Firm
Proportion Who Choose the Mission Firm
Treatment $9.64 $14.64 $19.64 $24.64 $29.64 $34.64 $39.64 $44.64 $49.64 N
Baseline 0.9841 0.9841 0.9683 0.6825 0.4286 0.3175 0.2222 0.1111 0.0476 63
Fee-for-service 0.9545 0.9318 0.8864 0.6591 0.4773 0.3409 0.2273 0.1818 0.0227 44
Capitation 0.9111 0.9111 0.8889 0.5556 0.1778 0.0444 0.044 0.022 0 45
Salary 0.9783 0.9783 0.913 0.6087 0.087 0 0 0 0 46
Chi-square Test p-values
Treatments $9.64 $14.64 $19.64 $24.64 $29.64 $34.64 $39.64 $44.64 $49.64 N
Baseline vs.
Fee-for-service 0.362 0.16 0.092 0.799 0.618 0.799 0.951 0.3 0.504 107
Capitation 0.075 0.075 0.099 0.178 0.006 0.001 0.01 0.082 0.138 108
Salary 0.822 0.822 0.212 0.424 0 0 0.001 0.019 0.133 109
Fee-for-service vs.
Capitation 0.414 0.717 0.97 0.317 0.003 0 0.012 0.013 0.309 89
Salary 0.531 0.285 0.673 0.62 0 0 0.001 0.002 0.304 90
Capitation vs.
Salary 0.16 0.16 0.7 0.607 0.2 0.148 0.148 0.309 - 91
In order to assess the latter statements and to get a more nuanced understanding of
switching behavior, we look at the proportion of participants that select into the mission
firm for each outside option. This is reported in Table 2 and Figure 5 (next page). Sev-
eral findings are worth highlighting. Most notably, the proportion was much larger in
Baseline and Fee-for-service than in Capitation and Salary for outside options between
$29.64 and $44.64. These are the outside options for which profit-maximizers select into
Figure 5: Selection into Mission Firm By Outside Option (proportion of subjects in the mission firm at each outside option from $9.64 to $49.64, by treatment)
the mission firm in the former treatments and take the outside option in the latter treat-
ments, which suggests that there were a significant number of profit-maximizers in our
sample. The results are consistent with the third statement in Hypothesis 1. All of the
differences between proportions except one were statistically significant, and that one was
only marginally significant (see Table 2, chi-square tests).
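The chi-square tests in Table 2 compare two proportions, which with one degree of freedom is equivalent to a two-sided z-test; a stdlib sketch (the counts below are our back-calculation from the reported Baseline vs. Capitation proportions at $29.64, i.e. roughly 27 of 63 vs. 8 of 45):

```python
import math

def prop_chi2_p(s1, n1, s2, n2):
    # pooled two-proportion z-test; the chi-square(1) statistic is z**2
    p1, p2, pooled = s1 / n1, s2 / n2, (s1 + s2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(round(prop_chi2_p(27, 63, 8, 45), 3))  # close to the 0.006 in Table 2
```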
Two other results were also interesting. First, for the lowest outside options between
$9.64 and $19.64, essentially all of the participants selected the mission firm in Baseline
and Salary (slightly fewer than everyone in Salary for $19.64), while some experts did not in
Capitation (and, to a lesser extent, some did not in Fee-for-service). This is in line with the
second statement in Hypothesis 1, which is based on the model prediction that experts who
wanted to do no harm and were of low ability would almost never select the mission
firm when the incentive system offered was capitation or fee-for-service. This is because
these experts are not able to conduct enough services to receive capitation payments
or fees in an amount greater than the flat rate offered by the non-mission firm. As a
caveat, the differences in proportions across treatments were only marginally significant
between Baseline and Capitation for all outside options and between Baseline and Fee-
for-service for $19.64 (see Table 2, chi-square tests). We think this is due to a small effect
size, implying that this type of expert was not a large contingent of our sample in these
treatments.
Second, the proportion of participants that selected the mission firm in Fee-for-service
was still fairly large for outside options up to $44.64, but dropped essentially to zero when
the payment was $49.64. This is exactly the switching point where profit-maximizers select
out of the mission firm under this incentive system, providing further evidence in favor
of profit-maximizers in our sample. Indeed, there were no significant differences across
treatments when the outside option was $49.64.
As shown in Figure 5, most switching points in Salary and Capitation are at or near
$24.64. We had $5.00 increments to accommodate the fact that profit-maximization
required choosing the mission firm all the way until the outside option was $49.28 in
Fee-for-service. However, this could be the reason that selection behavior in Salary and
Capitation looks largely the same. To better understand whether any difference exists,
we lowered the increments to $1.00 and ran new treatments, Salary low and Capitation
low, where the outside option ranged from $21.64 to $29.64. We still did not see any
statistical difference between the two treatments. See Appendix B for the details.
Table 3: Switching Point Regressions
By Treatment By incentive system
Baseline Fee-for-service Capitation Salary fee-for-service salary
(1) (2) (3) (4) (5) (6)
Risk Switch. Pt. -0.155 0.0475 0.337∗ -0.0959 -0.0938 -0.0963
(0.155) (0.167) (0.177) (0.153) (0.143) (0.114)
Ability -0.0193 0.168 0.299 -0.0985 0.179 -0.147
(0.159) (0.169) (0.184) (0.128) (0.137) (0.105)
Overconfidence 0.610∗∗ 0.158 0.413 0.0681 0.108 0.427∗∗
(0.255) (0.220) (0.257) (0.267) (0.202) (0.193)
Female 0.000731 -0.430 0.0683 -0.159 -0.504 0.0685
(0.531) (0.570) (0.667) (0.617) (0.471) (0.461)
Non-profit -0.996 -0.545 -0.104 2.474∗ -0.547 1.453
(0.916) (0.700) (0.724) (1.351) (0.575) (1.049)
Fee-for-service 2.620∗∗∗
(0.670)
Capitation 25.42
(15991.5)
Baseline 0.871∗ -0.161
(0.493) (0.471)
N 57 40 43 45 60 79
Notes. Ordered logits with dependent variable Switching Point. Regressions (1)-(4) disaggregate the
data by treatment and dummy variables for the selected incentive system in Baseline are included.
Regressions (5)-(6) disaggregate the data by incentive system and a dummy variable for when it was
the chosen incentive system in Baseline is included. Standard errors are reported in parentheses. ∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
We now analyze regressions to further understand the potential drivers behind the
observed selection. This approach allows us to determine how drivers such as mission
preferences, risk attitudes to others, ability, and confidence jointly affected job selection
and draw comparisons across treatments and incentive systems. Table 3 presents the re-
sults of ordered logits where the dependent variable is switching point and the independent
variables are potential drivers. Regressions (1)-(4) disaggregate the data by treatment and
a dummy variable for the selected incentive system in Baseline is included. Regressions
(5)-(6) disaggregate the data by incentive system and a dummy variable for whether
it was the chosen incentive system in Baseline or from the treatment is included. As there
are only two subjects who chose capitation in Baseline, capitation is excluded from the
latter set of regressions. Finally, we would have liked to use minimum outside option and
dictator game giving as measures of mission, however, as noted above, these measures
were problematic. Instead, we consider a dummy for considering a career in a non-profit
industry as a potential proxy.6
The positive effects associated with Overconfidence in the Baseline treatment and
salary incentive system indicate a stronger willingness to enter the mission firm. This
makes sense for subjects who think they can help others more than they actually can.
Similarly, the marginally positive effect associated with Non-profit for Salary indicates
that more mission-driven subjects were more willing to join the mission firm. On the other
hand, the marginally significant positive effect of Risk Switching Point in Capitation is
hard to understand as we expected more risk aversion for others would move participants
away from the mission firm.
The positive effect of fee-for-service in Baseline just confirms our previous findings
that switching points are largest when the incentive system is fee-for-service. However,
the marginally positive effect of the Baseline dummy when considering all the partici-
pants that consider fee-for-service is more subtle and interesting. In our model, when
given the choice, profit-maximizers choose fee-for-service while others choose salary. This
means that fee-for-service in Baseline is filled with profit-maximizers who all choose large
switching points while fee-for-service in Fee-for-service has a mix of people who choose
large and smaller switching points. The positive coefficient on the Baseline dummy is
consistent with exactly this behavior. It should be noted that the same story implies a
negative coefficient on this dummy for salary, which is what we find although it is not
significant.
6Thirteen additional subjects are dropped from the regression analysis; those who did not choose a switching point inthe risk aversion for others task and those who chose multiple incentive systems for the mission firm for different outsideoptions in Baseline.
Figure 6: Customer Payoff by Round (final customer payoff across Rounds 1-4, by treatment)
Table 4: Customer Payoffs
Avg. Customer Payoff By Round
Treatment 1 2 3 4 All N
Baseline 18.26 18.37 18.63 18.62 73.87 31
  chose fee-for-service 18.01 17.95 18.26 18.04 72.27 17
  chose capitation 18.35 18.15 18.5 18.7 73.7 2
  chose salary 18.6 19 19.17 19.41 76.18 12
Fee-for-Service 18.07 18.16 18.67 18.34 73.24 21
Capitation 18.68 18.91 19.4 19.21 76.19 16
Salary 18.98 19.13 19.8 19.60 77.5 21
Ranksum Test p-values
Treatments 1 2 3 4 All N
Baseline vs.
Fee-For-Service 0.2733 0.2009 0.9553 0.3999 0.3509 52
Capitation 0.0870 0.0669 0.0226 0.0814 0.0105 47
Salary 0.0015 0.0037 0.001 0.0019 0.0001 52
Fee-for-Service vs.
Capitation 0.0186 0.0072 0.0269 0.0082 0.0028 37
Salary 0.0002 0.0001 0.0009 0.0001 0.0000 42
Capitation vs.
Salary 0.2467 0.5187 0.2896 0.1487 0.2629 37
5.3 Quality
Hypothesis 2. The average quality of service provided to the customer in the mission
firm is ranked
Salary ≥ Capitation ≥ Baseline ≥ Fee-for-service
We now turn to discussing the quality of service. We measured this by the customer’s
final earnings (equivalently, we could look at the net number of correct edits). Table 4 and
Figure 6 show the customers’ average earnings across the 4 rounds by treatment for all
participants who worked in the mission firm. As a point of comparison, customers start
with $18.50 in each round before services were provided. Customers did best in Salary
and worst in Fee-for-service. The comparative static predicted by Hypothesis
2 is supported in the behavior observed in each round. Also, by the last round, customers
were better off than if no services had been provided in all treatments except Fee-for-service,
indicating that the experts' edits were, on net, beneficial to customers outside of
fee-for-service. Statistically, earnings in Salary are larger than in Baseline and
Fee-for-service in all 4 rounds, while earnings in Capitation are larger than in Fee-for-service
in all 4 rounds and larger than in Baseline for Round 3 and marginally larger for the
other three rounds (see Table 4, Ranksum tests).
Following the approach to analyzing switching points, we also ran OLS regressions with
customer payoff as the dependent variable and potential behavioral drivers as independent
variables. We add several variables to the regressions with intuitive interpretations and
also add Job Number which is the randomly drawn outside option. This variable is
irrelevant once the mission firm has been selected, but we think there could be a framing
effect. The results are presented in Table 5 (next page).
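The regressions in Table 5 are multivariate OLS; purely to illustrate the mechanics, here is a univariate sketch with hypothetical data (not the paper's specification or results):

```python
# Univariate OLS via the textbook closed form: slope = cov(x, y) / var(x).
def ols_fit(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical ability scores and total customer payoffs:
ability = [4, 6, 8, 10, 12]
payoff = [70, 72, 73, 75, 77]
slope, intercept = ols_fit(ability, payoff)
print(round(slope, 2))  # 0.85: higher ability, higher customer payoff
```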
Several results are evident in the table. First, the dummy for Baseline is marginally
significant and negative in regression 5, in which we restricted our analysis to those
subjects who selected fee-for-service in the mission firm. This confirms the prediction
of our model that customer payoffs would be better in the Fee-for-service treatment than
in Baseline. This is because mission-motivated experts would join the mission firm in
Fee-for-service, as they have no other way to contribute to the mission, and in joining
they increase the average performance of the firm under fee-for-service. In Baseline, by
contrast, the mission-motivated experts would select either salary or capitation, leaving
only the profit-driven experts in the fee-for-service incentive system. Conversely, we find
no difference between performance under salary in Baseline and in the Salary treatment.
Second, the only variable that has a systematic effect is Ability, which increases quality
in the four specifications other than Capitation and Fee-for-service. The dummy for
fee-for-service in Baseline is significant, confirming that fee-for-service negatively impacts quality.
Also, the coefficient on Job Number is negative for the treatment Fee-for-service and the
incentive system fee-for-service indicating a framing effect is present. In this case, the
negative sign indicates that larger foregone outside options led to lower quality, which could be
due to subjects regretting the lost outside option and acting more selfishly in fee-for-service
to make up for it.
Table 5: Customer Payoff Regressions
By Treatment By incentive system
Baseline Fee-for-service Capitation Salary Fee-for-service Salary
(1) (2) (3) (4) (5) (6)
RiskSwitchPoint 0.132 -0.0206 0.987∗∗ -0.131 0.195 -0.215
(0.283) (0.293) (0.330) (0.234) (0.221) (0.166)
Ability 0.692∗∗ 0.468 0.564 0.636∗∗∗ 0.454∗∗ 0.636∗∗∗
(0.254) (0.311) (0.322) (0.215) (0.155) (0.0591)
Overconfidence 0.0585 0.212 0.831 0.0161 0.200 0.00648
(0.413) (0.315) (0.711) (0.329) (0.229) (0.150)
Job Number -0.202 -0.565∗ -0.296 -0.164 -0.567∗∗ 0.0944
(0.248) (0.262) (0.420) (0.408) (0.222) (0.271)
Fee-for-service -4.025∗∗∗
(1.012)
Capitation 0.0860
(2.777)
Baseline -1.100∗ -0.499
(0.533) (0.475)
cons 71.45∗∗∗ 72.47∗∗∗ 67.07∗∗∗ 73.96∗∗∗ 71.42∗∗∗ 73.64∗∗∗
(2.734) (2.520) (3.585) (3.001) (2.251) (1.228)
N 30 18 15 20 35 32
adj. R2 0.455 0.100 0.390 0.288 0.236 0.424
Notes: OLS with dependent variable Customer Payoff All Rounds. Regressions (1)-(4) disaggregate the data
by treatment and a dummy variable for the selected incentive system in Baseline is included. Regressions
(5)-(6) disaggregate the data by incentive system and a dummy variable for whether it was the chosen
incentive system in Baseline or from the treatment is included. Standard errors are reported in parentheses. ∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
5.4 Expert Types
The final section of our results provides a closer look at individual behavior to determine
to what extent it resembles the behavioral types identified in our model. We used the
k-median algorithm with Euclidean distances (Calinski and Harabasz, 1974). The way
it works is that we specify m characteristics and n clusters, and the algorithm then
groups individuals into the n clusters to minimize the sum, over characteristics, of the
distance between each individual's characteristic and the cluster median.
Here, it allows us to partition experts into groups based on similarities in their strategic
behavior. We clustered experts based on the two characteristics number of correct edits
and number of incorrect edits. This means that we will find groups of participants who
chose similar numbers of these edits.
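The clustering step can be sketched as follows; this is a toy k-medians implementation run on simulated edit counts, purely our own illustration and not the estimation code we used.

```python
import numpy as np

def k_medians(X, n_clusters, n_iter=100, seed=0):
    """Assign rows of X to clusters whose component-wise medians
    minimize the summed distance to their members (Lloyd-style)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each point to the nearest current center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the median of its members.
        new = np.array([np.median(X[labels == k], axis=0)
                        if np.any(labels == k) else centers[k]
                        for k in range(n_clusters)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Simulated (correct edits, incorrect edits) pairs for illustration:
# "guessers" edit nearly everything and are mostly wrong, while
# "do-best" subjects make few, mostly correct, edits.
rng = np.random.default_rng(1)
guessers = rng.normal([12, 75], 3, size=(20, 2))
do_best = rng.normal([19, 4], 2, size=(20, 2))
X = np.vstack([guessers, do_best])
labels, centers = k_medians(X, n_clusters=2)
```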
Determining the correct number of clusters is a somewhat tricky task: too few clusters
groups together potentially different behaviors, while too many clusters makes the results
difficult to interpret. Based on the F-statistic, the optimal number of clusters is 4. However,
our goal in this section is to document behavior, and such a small number of clusters makes
this impossible: with only four clusters the within-group variation is very large, and it was
immediately clear to us that average behavior was not reflective of individual behavior. For
this reason, we increased the number of clusters significantly, finding that 7-9 worked well.
We investigate 9 clusters here because this better helps us identify different ability levels
(fewer clusters do not change the main conclusions). Even with this number, average behavior
in a few of the clusters can be misleading, as we discuss below.
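The cluster-count comparison can be sketched with the Calinski and Harabasz (1974) pseudo-F statistic. The snippet below uses scikit-learn and simulated data purely for illustration; scikit-learn has no k-medians, so its k-means stands in for our procedure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

# Simulated two-dimensional behavior data with four well-separated
# groups (illustrative cluster centers, not our edit counts).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 2.0, size=(15, 2))
               for c in [(8, 80), (15, 70), (8, 30), (19, 4)]])

# Compare the pseudo-F across candidate cluster counts; a higher
# value indicates more compact, better-separated clusters.
scores = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = calinski_harabasz_score(X, labels)
best = max(scores, key=scores.get)
print(best, round(scores[best], 1))
```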
Table 6 provides the summary statistics for correct and incorrect edits for each of
the nine groups as well as the number of participants in each treatment that fall into
the group. They are organized by the resulting average customer payoff, from lowest to
highest.
Table 6: K-median Cluster Analysis Summary Statistics

                                           Number of Subjects
Cluster            Mean (SD)     Min  Max  Baseline  Baseline Fee./Cap./Sal.  Fee-for-service  Capitation  Salary  Total  Customer Payoff All
1       Correct    8 (3.74)      3    12   3         3/0/0                    0                1           0       4      68.4
        Incorrect  79.75 (3.3)   76   84
        Total      87.75 (0.5)   87   88
2       Correct    15.43 (3.5)   10   26   7         7/0/0                    13               0           0       20     71.7
        Incorrect  70.10 (3.91)  62   75
        Total      85.52 (4.11)  74   88
3       Correct    8 (2.24)      6    11   4         2/0/2                    0                1           1       6      73.3
        Incorrect  30.57 (2.51)  27   35
        Total      38.57 (3.64)  35   46
4       Correct    14.5 (4.1)    8    21   2         2/0/0                    3                0           0       5      73.8
        Incorrect  23.58 (4.25)  18   30
        Total      66.8 (10.69)  55   81
5       Correct    4.4 (2.75)    0    8    4         1/2/1                    0                3           5       12     74.5
        Incorrect  6.47 (4.94)   0    18
        Total      10.87 (5.38)  0    20
6       Correct    14.5 (4.1)    8    21   3         0/0/3                    4                2           2       10     76.0
        Incorrect  23.58 (4.25)  18   30
        Total      38.08 (7.46)  26   48
7       Correct    11.29 (2.02)  9    15   6         1/0/5                    1                3           4       14     76.8
        Incorrect  6.82 (2.6)    2    11
        Total      18.12 (3.67)  11   24
8       Correct    19.73 (4.31)  13   26   2         1/0/1                    0                4           3       9      78.3
        Incorrect  13.18 (2.71)  9    18
        Total      32.91 (4.7)   28   40
9       Correct    19 (3.07)     15   25   1         0/0/1                    0                3           7       11     79.3
        Incorrect  3.92 (2.75)   0    8
        Total      22.92 (4.34)  16   29
Participants with the incentive system fee-for-service are primarily found in groups
1, 2, 4, and 6. The 23 participants in groups 1 and 2 do all (or almost all) of the edits
and get 9% and 18% correct on average (recall that random guessing should get 12.5%
right), indicating that they are engaging in the profit-maximizing behavior of randomly
guessing (group 1 contains the less lucky guessers, group 2 the luckier ones). The 5
participants in group 4 do fewer edits, and the average customer payoff is $73.78. This is
remarkably close to the initial payoff of $74.00, indicating that this behavior is consistent
with the not-max-services-do-no-harm type. Finally, the 4 participants in group 6 have a
positive impact on customers' payoffs, and they also do significantly fewer than 88 edits,
indicating that their behavior is consistent with the do-best type.
Participants with the incentive system capitation are primarily found in groups 5-9.
The 5 participants in group 5 appear to do fewer than 16 edits, but this is a case where
the cluster average of 10.87 edits is misleading: the group also contains a number of
salary participants who made 0 edits (see the next paragraph), and the average number of
edits for the 5 capitation participants is actually 14.2, much closer to the 16 edits consistent
with profit maximization. Hence, this group probably captures the profit-maximizers. The 2
participants in group 6 are hard to identify with the model as they do more than 16 edits,
but the quality of their edits indicates that they are not do-best types. The 3 participants
in group 7 do about 16 edits and do them mostly correctly, which is consistent with the
minimalist type. Finally, the 7 participants in groups 8 and 9 do more than 16 edits, and
do them well, indicating that they are consistent with the do-best type.
Finally, participants with the incentive system salary are primarily found in groups 3
and 5-9. The behavior of the 3 participants in group 3 is a little hard to reconcile, but it
is consistent with profit-maximizers who do a few edits just to use up some time (rather
than the 0 edits we predicted). The 6 participants in group 5 are mostly the ones who
do 0 edits and are therefore the profit-maximizers. Those in groups 6-9 all have a positive
impact on customers' payoffs, which is consistent with the do-best type. The differences
across these groups can be attributed to ability, going from low ability in group 6 to high
ability in group 9.
6 Discussion and Conclusions
Similar to field experiments on how wages impact selection into the public sector, we found
that labor supply to mission firms increased when incentive systems with higher potential
earnings were offered. The higher earnings attracted both mission motivated experts and
experts motivated by monetary rewards. We also saw that, while there were selection
differences across the incentive systems, the distribution of mission-driven experts who
entered the mission firm did not seem to be the primary driver of customer outcomes;
rather, it was the differing motivations induced by the incentive systems. However, the
fact that there were distributional differences suggests that under different circumstances
(i.e., different parameters for earnings than we chose in the experiment, or a different
mission than helping another subject in the room) selection could have a greater impact
on quality.
Our model and experiments produce results that parallel observations in real-world
credence goods markets and provide important considerations for policy makers. For ex-
ample, in our experiment there was a strong preference for salary and fee-for-service over
capitation when all incentive systems were offered in the mission firm. Profit-driven sub-
jects tended towards fee-for-service while mission-driven subjects tended towards salary,
leaving no one to choose capitation. With the caveat that our parameters perhaps
made capitation less attractive, this provides a potential explanation for the move away
from capitation in the health care industry. Although proposed as a solution to rising health
care costs in the 1990s, the payment system did not gain much momentum. Our experiment
suggests that the lack of uptake could have been partially due to a general dislike of the
incentive system relative to fee-for-service and salary. That strong preference may have
made it difficult for health care providers reimbursing under capitation to recruit new
physicians, and/or generated pushback from current physicians, causing the payment
system never to become as popular as fee-for-service.
While one study is never enough to make strong policy recommendations, our results
speak to the broader goals of developing good policy in the healthcare market. In our
experiment, customers receive the highest quality of services from an expert paid by
salary. Hence, our results suggest that social planners should encourage firms producing
credence goods to offer salary, and only salary, to encourage mission-motivated agents
to perform quality services and profit-driven agents to enter a non-mission firm or, if
they do enter the mission firm, at least to not perform costly and unnecessary services.
However, fee-for-service dominates the healthcare industry, which is possibly attributable
to profit-maximizing CEOs/insurance companies preferring the incentive system. This
tension between profit-maximization and social optimality underscores the importance
of our results, and more generally, the importance of studying the impacts of different
incentive systems in the healthcare setting.
Appendix A: Mission Drive Measure
We had a measure of mission drive (outside option minimum) and a measure of altruism
(dictator game giving) that we dropped from the analysis. In this appendix, we argue
that these measures were unfortunately not accurate.
Our first indication of a problem came from the subjects in Fee-for-service who chose
a Switching Point equal to 9. There are 7 such subjects, and this behavior is consistent
with either profit-maximization or strong mission drive. Looking at the outside option
minimum, it was $24.64 for 3 of them and $44.64 or $49.64 for the other 4, suggesting
that the first 3 are profit-maximizers and the last 4 are mission-driven. However, 3 of
those 4 subjects actually did fee-for-service in the mission firm, and all 3 delivered negative
profits. This is inconsistent with mission drive. As this is just 3 subjects, we did a broader
analysis of customer payoff; the results are in Table 7. The table compares average customer
payoffs for those with an outside option minimum of at least 6 to those with a minimum
below 6. If the measure is correct, those at or above 6 are motivated by mission drive
while those below 6 are driven by profit-maximization.
Table 7: Customer Payoffs by Outside Option Minimum
Outside option minimum ≥ 6 Outside option minimum < 6
Treatment Customer Payoff N Customer Payoff N
Fee-for-service 73.23 13 73.25 8
Capitation 75.34 5 76.42 12
Salary 77.01 7 77.59 19
If the outside option minimum truly reflected mission drive, then larger minimums would
correspond to larger customer payoffs in each treatment. The table indicates that this
is clearly not the case. If anything, customer payoffs are larger for those whose outside
option minimum indicates they are maximizing profits. We think that some people
misinterpreted the difficult elicitation method.
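The split in Table 7 amounts to a simple threshold comparison; it can be sketched in pandas with made-up numbers (all values and column names below are hypothetical):

```python
import pandas as pd

# Hypothetical subject-level records; `oo_min` is the elicited outside
# option minimum (in switch-point units) and payoffs are illustrative.
df = pd.DataFrame({
    "treatment": ["Fee-for-service"] * 4 + ["Salary"] * 4,
    "oo_min":    [7, 8, 3, 2, 9, 6, 4, 1],
    "customer_payoff": [73.1, 73.4, 73.2, 73.3, 77.0, 77.1, 77.5, 77.7],
})

# Mean customer payoff by treatment for high vs. low elicited minimums,
# mirroring the >= 6 / < 6 split used in Table 7.
df["mission_driven"] = df["oo_min"] >= 6
table = df.groupby(["treatment", "mission_driven"])["customer_payoff"].agg(["mean", "size"])
print(table)
```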
The measure of altruism is also problematic. We repeat the exercise from the table
above, but split instead on dictator game giving greater than 0 versus equal to 0. The
former are supposedly altruists while the latter are profit-maximizers. The results are
presented in Table 8.
Table 8: Customer Payoffs by Dictator Game Giving
Dictator Game Giving > 0 Dictator Game Giving = 0
Treatment Customer Payoff N Customer Payoff N
Fee-for-service 73.48 9 73.05 12
Capitation 77.43 7 75.17 10
Salary 76.60 6 77.69 20
With the possible exception of Capitation, it is clear that giving in the dictator game
is not associated with delivering larger payoffs to the customer. Rather than an issue
of misunderstanding (the dictator game is very simple to understand), we think that,
because the dictator game occurred after the main task, some people saw it as a way to
make up for not helping the customer enough in the main task (even though the customer
and the recipient of the dictator game were different people).
Appendix B: Low increment Salary and Capitation
To better understand whether any difference between salary and capitation exists, we
lowered the increments to $1.00 and ran new treatments, Salary low and Capitation low,
where the outside option ranged from $21.64 to $29.64. We recruited 24 and 22 subjects
at Chapman and UVA for these additional treatments, dropping 1 subject from each
treatment because they did not have a switching point. The average switching points were
4.52 in Capitation low and 5.26 in Salary low; the proportion of subjects who chose the
mission firm at each outside option is presented in Table 9. With the smaller increments,
we still did not see a significantly larger average switching point in Salary low than in
Capitation low. We also saw no significant differences between the proportions who select
in at any outside option. This indicates that there was no difference in labor supply to
our mission firm when payment was restricted to either salary or capitation.
Table 9: Selection into the Mission Firm, Low Increment Treatments
Proportion Who Choose the Mission Firm
Treatment $21.64 $22.64 $23.64 $24.64 $25.64 $26.64 $27.64 $28.64 $29.64 N
Capitation low 0.9048 0.9048 0.9048 0.5238 0.1429 0.0476 0.0476 0.0476 0 21
Salary low 1 1 0.913 0.6087 0.2174 0.1739 0.1739 0.087 0.087 23
Chi-square Test p-values
Treatments $21.64 $22.64 $23.64 $24.64 $25.64 $26.64 $27.64 $28.64 $29.64 N
Capitation low vs.
Salary low 0.13 0.13 0.924 0.57 0.522 0.187 0.187 0.605 0.167 44
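Each p-value in Table 9 can in principle be reproduced as a chi-square test on the 2x2 table of mission-firm choices at a given outside option. The sketch below uses the counts implied by the first column (19 of 21 and 23 of 23 is our reading of the reported proportions, so an assumption):

```python
from scipy.stats import chi2_contingency

# Counts implied by the first outside option ($21.64): our reading of the
# reported proportions (0.9048 of 21 and 1.0 of 23) — an assumption.
chose_mission = [19, 23]          # Capitation low, Salary low
declined = [21 - 19, 23 - 23]
table = [[chose_mission[0], declined[0]],
         [chose_mission[1], declined[1]]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```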
Appendix C: Instructions for Baseline

Introduction: This is an experiment in decision-making. You will get 6 dollars for participating and you will have an opportunity to earn a considerable amount of cash through your participation in this experiment. You will be required to complete the experiment individually. You are not allowed to communicate with any other participant at any point during the course of the experiment. All responses and decisions will be anonymous. Before we begin, please set your cell phones to silent. Please put any personal calculators and other electronic devices away. We ask that you not make calls or send text messages until the experiment is complete. We also ask that you not talk to other participants in the experiment until after the experiment is complete.

Basic Overview: You will be asked to complete three tasks in this experiment. In the first task, you will be asked to complete a 12-question math quiz. In the second task, you will be asked to revise 16 short math quizzes. The number of quizzes that you revise and the number of revisions per quiz that you complete is up to you. In the third task you will be asked to provide your answer to two incentivized questions. In the end you will be asked to complete a brief survey. Your income earned in this experiment will be based on your choices in the tasks.
Task 1: You will complete a math quiz with 12 questions. For each problem that you correctly solve you will receive $0.20. The task is not graded based on completion, but instead on correctness. You will not be penalized for answering the questions incorrectly. For example:
• If you correctly solve 5 out of 12 questions, you will receive $1.00. • If you correctly solve 12 out of 12 questions, you will receive $2.40.
You have 12 minutes to complete the quiz. After you complete the quiz, we will ask you how many questions you think you got right and give you one dollar if you correctly guess the number you got right, so before clicking finish you might want to go over your quiz to count how many you think you got right. Please raise your hand if you have any questions at this time. If nobody has any questions, you may begin.
Task 2: We outline Task 2 in four sections: 1. The Task; 2. Payment; 3. Selecting Your Job and Method of Payment; and 4. Completion of Task.

1. The Task: In this task, you will act in a role we call Service Provider. You will proofread 16 math quizzes with 12 multiple choice questions in each. We have filled out an answer for each question, but some of the answers are wrong. Your task is to find these errors and correct them.

You will be given assistance with your proofreading task. For each error in the quiz, we have highlighted 2 questions, only one of which has the actual error. For example, if the quiz has 3 errors you will see 6 highlighted questions, only 3 of which contain actual errors. The highlighted questions will have the initial answer highlighted in blue, and if you edit the answer, your new answer will be highlighted in orange. You cannot edit un-highlighted questions, for which the initial answer is shaded light gray.

The quizzes will be presented to you over 4 rounds; each round will have 4 quizzes. The quizzes across the rounds are the same, but the filled answers are different. You will have a total of 7 minutes to complete each round. During those 7 minutes you can toggle among the 4 quizzes for that round by clicking on the buttons on the left side of your screen. At the end of the 7-minute period you will no longer be able to proofread quizzes from that round; we will provide you feedback on how many correct and incorrect changes you made, and then you will move on to the next round. You have a total of 28 minutes to revise all 16 quizzes. There is a total of 44 actual errors across the 16 quizzes, so you will see 88 total highlighted questions.
2. Payment: Before engaging in the task, you will be prompted to select between Job A and Job B, as well as among three alternative methods of payment for Job A, which we discuss now.
[Screenshot of the proofreading interface: time remaining in seconds; answers that you can edit highlighted blue; answers that you have edited highlighted orange; buttons to move between quizzes within the same round; a "finish" button to complete the current round; answers that cannot be edited shown in gray.]
Job A Payment Method 1 ($0.56 per question): You will be paid $0.56 for each of the highlighted questions where you have changed the answer within the 28-minute time frame. You do not have to correctly revise the answer to be paid. For instance, suppose you make 2 changes to Quiz 1, correcting an error in 1 of the highlighted questions and making an incorrect change in 1 of the highlighted questions:
§ You will receive $0.56 for each of the 2 changes made, and your total payment for Quiz 1 will be $1.12.
Job A Payment Method 2 ($1.54 per quiz): You will be paid $1.54 for each quiz where you have changed at least one answer in the 28-minute time frame. You do not have to correctly revise the answer to be paid. For instance, in the same example as above:
§ You will receive $1.54 in total for the 2 changes and your total payment for Quiz 1 will be $1.54.
Job A Payment Method 3 ($24.64 Total): You will be paid $24.64 for participating in this task.

Job B Payment Method (Flat Rate Total): You will be paid a flat rate for participating in this task. We will discuss the value of the flat rate in Section 3.

Customer's Earnings: In addition to your own payment, if you select Job A, then your choices in the task will affect another participant in this room's earnings. We call this participant the Customer, and they start with $24.00 less $0.50 for each error in 1 round of 4 quizzes. For instance:
• If there are 5 errors in Quiz 1, 4 errors in Quiz 2, 1 error in Quiz 3, and 4 errors in Quiz 4, then the total number of errors is 14 and the Customer will start with $24.00-$0.50*14=$17.00.
Your revisions can help them earn a portion of the lost earnings back. For each error that you correctly revise they will be given back $.30. However, for every edit you make that is incorrect, they will lose an additional $.10. For instance:
• In the example above where you made 2 changes, 1 error was corrected and 1 answer was incorrectly changed, the customer whose quiz you are revising will receive $0.30 for the correctly identified error and lose $0.10 for the incorrectly changed answer, so their earnings will change by $0.30-$0.10=$0.20.
If you select Job B, your choices do not affect the customer's earnings, and they receive the earnings they started with: $24.00 less $0.50 for each error in 1 round of 4 quizzes.

Instructions Quiz: Before we discuss payment selection, we would like you to complete a quiz to ensure your understanding of how your payment will be calculated. Please answer the following questions to the best of your ability and raise your hand once you are done. The instructor will come around and check your answers for correctness. Everybody must answer all of the questions correctly before we move on to the next section.
3. Selecting Your Job and Method of Payment:
We will now explain how you will decide on your Job and Method of Payment. Pay close attention as this decision is complicated and will be made once and only once at the beginning of the task and will have a highly significant impact on your earnings. You make the decision for 10 different scenarios, and then we will randomly pick 1 scenario to use to determine your job and payment method.
Scenario 1: In scenario 1, you will be asked to determine the minimum amount for which you would prefer Job B over $24.64 in Job A. You will be presented with a row of amounts (see figure above) and you select the minimum dollar amount that you would have to be paid in Job B to choose Job B over $24.64 in Job A by using the Left and Right buttons to increase and decrease your selection. When making your selection, you start all the way to the left and move right by thinking as follows: If I am offered $X.XX (next amount to the right) in Job B, I would prefer Job B over Job A. If that is true, click the Right button to move your selection to the right. Stop when that is false. You don't really need to use the Left button, although we have included it in case you change your mind and want to go back. Do you have any questions about how to make this choice?

Scenarios 2-10: In scenarios 2-10, the first three options (columns) are the different methods of payment for Job A ($0.56 per Question, $1.54 per Quiz, and $24.64 Total) and only the last option is for Job B ($X.XX Total). For each scenario, you select which of the four methods of payment you would like to be paid with. The only difference among scenarios 2-10 is the amount offered in Job B. When making a selection for each of the scenarios, you should select the job/payment method that most appeals to you. For example, in scenario 5, where you can pick between Job A: $0.56 per Question, Job A: $1.54 per Quiz, Job A: $24.64 Total, or Job B: $24.64 Total, you should click on the payment method and Job type that you would want in that scenario.
Do you have any questions about how to select your preferred payment method in scenarios 2-10?

Once you have made your selection for the 10 different scenarios, we will use a random number generator to select one of the 10 scenarios to determine your job and payment method in the task. Each of the scenarios is equally likely, so you should treat each scenario as if it will be the one we select. If the random number generator selects the first scenario, we will run another random number generator to pick the payment amount for Job B. If the amount is above where you placed the slider, you will go to Job B and be paid this amount; if it is below, you will go to Job A and be paid $24.64. This method ensures that you should truthfully select the minimum amount you would be willing to accept to do Job B so that you end up with the correct job.

Example 1: If the random number generator produced option 4 and you selected Job B ($19.64) for scenario 4, you would be paid $19.64 and the customer would not be impacted by your decisions. On the other hand, if you had selected Job A and $1.54 per quiz, you would be paid $1.54 per quiz in which you made at least 1 edit, and your actions would impact the customer as detailed above.

Example 2: If the random number generator produced option 1, we will run another random number generator to determine the amount offered in Job B. If you stated that you would prefer Job B at amounts greater than or equal to $14.64 and the random number generator determined the payment was $24.64, then you would be paid $24.64 and do Job B. If, on the other hand, the random number generator determined the payment would be $9.64 for Job B, you would be paid $24.64 and do Job A.

Who is a Service Provider and who is a Customer? Everyone will complete the task as Service Provider, but at the end of the task, we will randomly assign half of the participants in this room to the role of Customer. Each of the customers will be matched with one Service Provider, and their earnings will be determined as detailed above based on the decisions of the Service Provider they are matched to. The Service Providers will be paid based on their decisions in all 4 rounds (all 16 quizzes) according to the payment scheme they
have selected. Customers will be paid for 1 round (4 quizzes) of edits made by their matched Service Provider, and that round will be determined by a random number generator.

4. Completion of Task: You will be given 7 minutes to complete each of the 4 rounds of the proofreading task; however, if you are satisfied with the number of corrections that you have made, you can leave the task by clicking the "Finish" button on the lower left-hand side of your screen before the 7 minutes are up. If you click the "Finish" button in one of the first 3 rounds, you will automatically move to the next round. If you are in the 4th round and click the "Finish" button, you will be directed to two incentivized decision-making questions to complete. Once you have completed the incentivized questions, please raise your hand and wait quietly at your seat. Once we have confirmed you are done, you are welcome to work on homework, surf the web on your phone, etc. while we wait for everybody to finish before having you complete a short survey and disbursing payments.
[Screenshot of the selection screen: each row represents a different scenario; select your preferred job type (Job A or Job B) and payment method for each scenario.]
Appendix D: Instructions for Fee-for-service

Introduction: This is an experiment in decision-making. You will get 6 dollars for participating and you will have an opportunity to earn a considerable amount of cash through your participation in this experiment. You will be required to complete the experiment individually. You are not allowed to communicate with any other participant at any point during the course of the experiment. All responses and decisions will be anonymous. Before we begin, please set your cell phones to silent. Please put any personal calculators and other electronic devices away. We ask that you not make calls or send text messages until the experiment is complete. We also ask that you not talk to other participants in the experiment until after the experiment is complete.

Basic Overview: You will be asked to complete three tasks in this experiment. In the first task, you will be asked to complete a 12-question math quiz. In the second task, you will be asked to revise 16 short math quizzes. The number of quizzes that you revise and the number of revisions per quiz that you complete is up to you. In the third task you will be asked to provide your answer to two incentivized questions. In the end you will be asked to complete a brief survey. Your income earned in this experiment will be based on your choices in the tasks.
Task 1: You will complete a math quiz with 12 questions. For each problem that you correctly solve you will receive $0.20. The task is not graded based on completion, but instead on correctness. You will not be penalized for answering the questions incorrectly. For example:
• If you correctly solve 5 out of 12 questions, you will receive $1.00. • If you correctly solve 12 out of 12 questions, you will receive $2.40.
You have 12 minutes to complete the quiz. After you complete the quiz, we will ask you how many questions you think you got right and give you one dollar if you correctly guess the number you got right, so before clicking finish you might want to go over your quiz to count how many you think you got right. Please raise your hand if you have any questions at this time. If nobody has any questions, you may begin.
Task 2: We outline Task 2 in four sections: 1. The Task; 2. Payment; 3. Selecting Your Job and Method of Payment; and 4. Completion of Task.

1. The Task: In this task, you will act in a role we call Service Provider. You will proofread 16 math quizzes with 12 multiple choice questions in each. We have filled out an answer for each question, but some of the answers are wrong. Your task is to find these errors and correct them.

You will be given assistance with your proofreading task. For each error in the quiz, we have highlighted 2 questions, only one of which has the actual error. For example, if the quiz has 3 errors you will see 6 highlighted questions, only 3 of which contain actual errors. The highlighted questions will have the initial answer highlighted in blue, and if you edit the answer, your new answer will be highlighted in orange. You cannot edit un-highlighted questions, for which the initial answer is shaded light gray.

The quizzes will be presented to you over 4 rounds; each round will have 4 quizzes. The quizzes across the rounds are the same, but the filled answers are different. You will have a total of 7 minutes to complete each round. During those 7 minutes you can toggle among the 4 quizzes for that round by clicking on the buttons on the left side of your screen. At the end of the 7-minute period you will no longer be able to proofread quizzes from that round; we will provide you feedback on how many correct and incorrect changes you made, and then you will move on to the next round. You have a total of 28 minutes to revise all 16 quizzes. There is a total of 44 actual errors across the 16 quizzes, so you will see 88 total highlighted questions.
2. Payment: Before engaging in the task, you will be prompted to select between Job A and Job B in 10 scenarios. You will be paid differently in Job A and Job B for scenarios 2-10, as we describe here.
[Screenshot of the proofreading interface: time remaining in seconds; answers that you can edit highlighted blue; answers that you have edited highlighted orange; buttons to move between quizzes within the same round; a "finish" button to complete the current round; answers that cannot be edited shown in gray.]
Job A Payment Method ($0.56 per question): You will be paid $0.56 for each of the highlighted questions where you have changed the answer within the 28-minute time frame. You do not have to correctly revise the answer to be paid. For instance, suppose you make 2 changes to Quiz 1, correcting an error in 1 of the highlighted questions and making an incorrect change in 1 of the highlighted questions:
§ You will receive $0.56 for each of the 2 changes made, and your total payment for Quiz 1 will be $1.12.
Job B Payment Method (Flat Rate Total): You will be paid a flat rate for participating in this task. We will discuss the value of the flat rate in Section 3.

Customer's Earnings: In addition to your own payment, if you select Job A, then your choices in the task will affect another participant in this room's earnings. We call this participant the Customer, and they start with $24.00 less $0.50 for each error in 1 round of 4 quizzes. For instance:
• If there are 5 errors in Quiz 1, 4 errors in Quiz 2, 1 error in Quiz 3, and 4 errors in Quiz 4, then the total number of errors is 14 and the Customer will start with $24.00-$0.50*14=$17.00.
Your revisions can help them earn a portion of the lost earnings back. For each error that you correctly revise they will be given back $.30. However, for every edit you make that is incorrect, they will lose an additional $.10. For instance:
• In the example above where you made 2 changes, 1 error was corrected and 1 answer was incorrectly changed, the customer whose quiz you are revising will receive $0.30 for the correctly identified error and lose $0.10 for the incorrectly changed answer, so their earnings will change by $0.30-$0.10=$0.20.
If you select Job B, your choices do not affect the Customer's earnings, and they receive the earnings they started with: $24.00 less $0.50 for each error in 1 round of 4 quizzes.
Instructions Quiz: Before we discuss payment selection, we would like you to complete a quiz to ensure you understand how your payment will be calculated. Please answer the following questions to the best of your ability and raise your hand once you are done. The instructor will come around and check your answers for correctness. Everybody must answer all of the questions correctly before we move on to the next section.
3. Selecting Your Job and Method of Payment: We will now explain how you will decide on your Job and Method of Payment. Pay close attention: this decision will be made once and only once at the beginning of the task, and it will have a significant impact on your earnings. You will make the decision for 10 different scenarios, and then we will randomly pick 1 scenario to determine your job and payment method.
Scenario 1: In scenario 1, you will be asked to determine the minimum amount for which you would prefer Job B over $24.64 in Job A. You will be presented with a row of amounts (see figure above), and you will select the minimum dollar amount that you would have to be paid in Job B to choose Job B over $24.64 in Job A, using the Left and Right buttons to adjust your selection. Start all the way to the left and move right by reasoning as follows: "If I am offered $X.XX (the next amount to the right) in Job B, I would prefer Job B over Job A." If that is true, click the Right button to move your selection to the right; stop when it is false. You should not need the Left button, although we have included it in case you change your mind and want to go back. Do you have any questions about how to make this choice?
Scenarios 2-10: In scenarios 2-10, the first option (column) is Job A ($0.56 per Question) and the second option (column) is Job B ($X.XX Total). For each scenario, you select which job/payment method you would like (Job A at $0.56 per Question or Job B at $X.XX Total). The only difference among scenarios 2-10 is the amount offered in Job B. When making a selection for each scenario, you should choose the job/payment method that most appeals to you. For example, in scenario 5, where you can pick between Job A ($0.56 per question) and Job B ($24.64 Total), you should click on the job type that you would want in that scenario.
Do you have any questions about how to select your preferred payment method in scenarios 2-10? Once you have made your selections for the 10 different scenarios, we will use a random number generator to select one of the 10 scenarios to determine your job and payment method in the task. Each of the scenarios is equally likely, so you should treat each scenario as if it will be the one we select. If the random number generator selects the first scenario, we will run another random number generator to pick the payment amount for Job B. If the amount is at or above where you placed the slider, you will go to Job B and be paid this amount; if it is below, you will go to Job A and be paid $24.64. This method ensures that you should truthfully select the minimum amount you would be willing to accept to do Job B, so that you end up with the correct job.
Example 1: If the random number generator produced option 4 and you selected Job B ($19.64) for scenario 4, you would be paid $19.64 and the Customer would not be impacted by your decisions. On the other hand, if you had selected Job A ($0.56 per Question), you would be paid $0.56 per question in which you made at least 1 edit, and your actions would impact the Customer as detailed above.
Example 2: If the random number generator produced option 1, we will run another random number generator to determine the amount offered in Job B. If you stated that you would prefer Job B at amounts greater than or equal to $14.64 and the random number generator determined the payment was $24.64, then you would be paid $24.64 and do Job B. If, on the other hand, the random number generator determined the payment for Job B would be $9.64, you would be paid $24.64 and do Job A.
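The scenario-1 resolution rule described above (a random Job B offer compared against your stated minimum) can be sketched as follows; the function names are hypothetical and the amounts come from Example 2:

```python
import random

JOB_A_AMOUNT = 24.64  # fixed payment under Job A in scenario 1

def resolve_scenario_one(offered_b: float, min_willing_b: float):
    """Compare a randomly drawn Job B offer with the stated minimum.

    An offer at or above your minimum sends you to Job B at that amount;
    a lower offer sends you to Job A at $24.64. Because the offer is drawn
    independently of your report, truthfully stating your minimum is optimal.
    """
    if offered_b >= min_willing_b:
        return ("B", offered_b)
    return ("A", JOB_A_AMOUNT)

def draw_scenario(rng=random) -> int:
    """Each of the 10 scenarios is equally likely to be the one that counts."""
    return rng.randint(1, 10)

# Example 2 from the instructions, with a stated minimum of $14.64:
print(resolve_scenario_one(24.64, 14.64))  # ('B', 24.64)
print(resolve_scenario_one(9.64, 14.64))   # ('A', 24.64)
```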
[Screenshot of the scenario-selection screen: each row represents a different scenario, with columns for Job A and Job B; select your preferred job type/payment method for each scenario.]
Who is a Service Provider and who is a Customer? Everyone will complete the task as a Service Provider, but at the end of the task, we will randomly assign half of the participants in this room to the role of Customer. Each Customer will be matched with one Service Provider, and their earnings will be determined as detailed above by the decisions of the Service Provider they are matched to. The Service Providers will be paid based on their decisions in all 4 rounds (all 16 quizzes) according to the payment scheme they have selected. Customers will be paid for 1 round (4 quizzes) of edits made by their matched Service Provider; that round will be determined by a random number generator.
4. Completion of Task: You will be given 7 minutes to complete each of the 4 rounds of the proofreading task. However, if you are satisfied with the number of corrections you have made, you can leave a round early by clicking the "Finish" button on the lower left-hand side of your screen before the 7 minutes are up. If you click the "Finish" button in one of the first 3 rounds, you will automatically move to the next round. If you click the "Finish" button in the 4th round, you will be directed to two incentivized decision-making questions. Once you have completed the incentivized questions, please raise your hand and wait quietly at your seat. Once we have confirmed you are done, you are welcome to work on homework, surf the web on your phone, etc. while we wait for everybody to finish, after which you will complete a short survey and payments will be disbursed.