1
Field Experiment on Incentive of Communication, Compilation and Evaluation
1. Motivation for the study
2. What is a field experiment?
3. Framework of the experiment
4. The flow of the experiment
5. Analysis of results
Takuya Nakaizumi 2009.6.10
2
Motivation for the study
1. Does a monetary incentive affect conversation or editing?
   1. Editors and writers have their own opinions. They tend to reject others' opinions on their writing or compilation, especially when they disagree with or dislike them.
   2. Even in such circumstances, incentives may make them edit differently from how they otherwise would have.
2. In this experiment, we tested whether a monetary incentive affected the editing of a conversation on a BBS.
3
Evaluation of the editing
1. To give an editor an adequate incentive, we need an evaluation of the performance on which the reward depends.
2. The editing must have a true value, but that value is quite difficult to estimate.
3. Thus, in this study we employ a peer review system. The editing was evaluated by all the participants in the BBS conversation, and the editor's reward depended on their evaluation.
4
Evaluation Behavior
1. Participants in the BBS conversation might appreciate the editing more if it appeals to them.
2. The reward for the editor might affect only the editor's effort. The reward itself might not affect the evaluation by the other participants.
3. Thus, higher rewards should lead to more attractive editing. → We check this by experiment.
5
What is a Field Experiment?
Harrison and List [2004]: "Our primary point is that dissecting the characteristics of field experiments helps define what might be better called an ideal experiment, in the sense that one is able to observe a subject in a controlled setting but where the subject does not perceive any of the controls as being unnatural and there is no deception being practiced."
The distinction between the laboratory and the field
1. In a field experiment, sample populations are drawn from many different domains, whereas the sample in an ordinary laboratory experiment consists of students.
2. The laboratory environment might not be fully representative of the field environment (cf. the winner's curse).
6
Determine the Field context of an Experiment
Harrison and List [2004]:
1. the subject (participant) pool,
2. the information that the subjects bring to the task,
3. the commodity,
4. the task or trading rules applied,
5. the stakes,
6. the environment in which the subjects operate.
7
Taxonomy of Field Experiments
Harrison and List [2004]:
1. 'Artefactual field experiment': 'non-standard' subjects, or experimental participants from the market of interest.
2. 'Framed field experiment': the same as an artefactual field experiment but with field context in the commodity, task, stakes, or information set of the subjects.
3. 'Natural field experiment': the same as a framed field experiment but where the environment is one where the subjects naturally undertake these tasks and where the subjects do not know that they are participants in an experiment.
Our field experiment
8
The Flow of the Experiment 1
1. Pre-survey of the topic and attitude of the participants
2. Discussion on the BBS on a specific topic: Religion, Politics, Marketing, Finance (Marketing and Finance in Japan only). Participants are rewarded for the discussion if they contribute more than four times.
3. Choice of the editor of the discussion by calculating the Social Influence Score
4. Editing by the editor
5. Evaluation of the editing by the participants; the editor is then rewarded according to the evaluation.
9
The Flow of the Experiment 2
1. Pre-survey
   1. Opinions of the participants on which topics they prefer to discuss
   2. Variance of the opinions in each domain
2. Discussion on the BBS
   1. Religion, Politics (Marketing and Finance in Japan only)
   2. The most and least conflicting topics, based on the variance of opinions in each domain
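The topic selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey answers below are invented, the 7-point answer scale is assumed, and the use of the population variance (`statistics.pvariance`) is a guess, since the slides do not say which variance estimator was used. The function name `most_and_least_conflicting` is hypothetical.

```python
from statistics import pvariance

def most_and_least_conflicting(survey):
    """Return (most conflicting, least conflicting) topic by variance of opinions.

    `survey` maps a topic name to the list of numeric answers given in the
    pre-survey (hypothetical layout, not taken from the experiment).
    """
    variances = {topic: pvariance(answers) for topic, answers in survey.items()}
    most = max(variances, key=variances.get)
    least = min(variances, key=variances.get)
    return most, least

# Invented answers for three topics within one domain.
example_survey = {
    "topic A": [1, 7, 1, 7, 2, 6],   # opinions split: high variance
    "topic B": [4, 4, 5, 4, 3, 4],   # opinions agree: low variance
    "topic C": [2, 5, 3, 6, 4, 4],
}

print(most_and_least_conflicting(example_survey))  # ('topic A', 'topic B')
```

In the experiment this selection would be repeated once per domain, giving one "difficult" and one "easy" topic for each.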
10
The Flow of the Experiment 3
1. Calculate the Social Influence Score (SIS)
   - Based on mutual evaluation
   - As in Google, an evaluation carries more weight if the evaluator has a higher SIS.
   - Long-tail distribution
2. Editing/Summarizing
   - Assigned to the discussant with the highest SIS
   - Two types of reward for the editing: $20 or $80
   - If the others' assessment is low, the reward is reduced.
   - Evaluation question: "How satisfied are you with the summary by the selected editor? In order to guarantee the quality of the discussion summary, if the editor is rated less than 'not satisfactory' (A 2 out of a scale of 7) on average, then the bonus for editing will be reduced by half."
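The slides do not give the SIS formula, so the following Python sketch is only one possible reading of "an evaluation carries more weight if the evaluator has a higher SIS": a PageRank-style power iteration over the mutual ratings. The function names, the damping factor of 0.85, and the example ratings are all assumptions made for illustration. The `editor_bonus` helper encodes the stated rule that the bonus is halved when the average rating falls below 2 on the 7-point scale.

```python
def social_influence_scores(ratings, damping=0.85, iterations=100):
    """PageRank-style scores from mutual ratings (assumed form, not the study's formula).

    ratings[i][j] is the non-negative rating that participant i gives to j.
    Each evaluator's current score is split across the people they rated,
    so ratings from high-scoring evaluators count for more.
    """
    n = len(ratings)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [(1.0 - damping) / n] * n
        for i, row in enumerate(ratings):
            total = sum(row)
            for j, value in enumerate(row):
                # An evaluator who rated nobody spreads weight evenly.
                share = value / total if total > 0 else 1.0 / n
                new_scores[j] += damping * scores[i] * share
        scores = new_scores
    return scores

def editor_bonus(base_reward, average_rating, threshold=2.0):
    """Halve the bonus if the average rating is below the threshold (2 of 7)."""
    return base_reward / 2 if average_rating < threshold else base_reward

# Invented ratings: participants 1 and 2 both rate participant 0 highly.
example_ratings = [
    [0, 1, 1],
    [5, 0, 1],
    [5, 1, 0],
]
scores = social_influence_scores(example_ratings)
editor = scores.index(max(scores))  # discussant with the highest SIS
print(editor, editor_bonus(80, 1.5), editor_bonus(80, 4.2))
```

The scores always sum to one, which makes them easy to compare across discussion groups of the same size.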
11
Experimental conditions
Domains
  Economic: Marketing, Investment (in Japan only)
  Non-economic: Politics, Religion
Difficulty of topics (variance in the survey)
  High: the most conflicting topic
  Low: the least conflicting topic
Amount of reward to an editor
  High: $80 (¥8,000 in Japan)
  Low: $20 (¥2,000 in Japan)
12
Incentive to discuss and edit
Discussion on the BBS
  If participants post more than four comments, they receive the participation fee.
  A higher reward for the editor becomes an incentive for participants to discuss the topic.
Editing of the BBS
  High: $80 (¥8,000 in Japan) or Low: $20 (¥2,000 in Japan)
  A higher reward lets the editor flatter the participants more, which may raise the actual score.
Evaluation of the editing
  There is no monetary incentive to evaluate the editing!
13
Summary of Experiment Data: Japan

| Reward to each editor | Domains | Difficulty of topics | Total BBS | Participants in each group | Participants at the beginning | Participants who continued to the end |
|---|---|---|---|---|---|---|
| ¥8,000 | 4 | 2 types | 8×3=24 | 10 | 240 (24 editors) | 148 + 20 (editors) |
| ¥2,000 | 4 | 2 types | 8×3=24 | 10 | 240 (24 editors) | 137 + 20 (editors) |
| Total | 4 | 2 types | 48 | 10 | 480 (48 editors) | 285 + 40 (editors) |
14
Summary of Experiment Data: the U.S. (and combined total)

| Reward to each editor | Domains (religion, politics) | Difficulty of topics | Total BBS | Participants in each group | Participants at the beginning | Participants who continued to the end |
|---|---|---|---|---|---|---|
| $80 | 2 | 2 types | 4×3=12 | 20 | 240 (12 editors) | 138 + 12 (editors) |
| $20 | 2 | 2 types | 4×3=12 | 20 | 240 (12 editors) | 122 + 8 (editors) |
| Total | 2 | 2 types | 24 | 20 | 480 (24 editors) | 260 + 20 (editors) |
| U.S. and Japan | 2, 4 | 2 types | 24 or 48 | 20 or 10 | 840 (72 editors) | 545 + 60 (editors) |
15
Basic Model (1)
Evaluation depends on the quality of the editing by editor 0, denoted $X_0$.
Assumption 1: participant $i$'s utility is
$$u_i = u_i(X_0)$$
Thus the evaluation score that participant $i$ gives to editor 0 is
$$s_{0i} = s(u_i(X_0))$$
When the editor edits the conversation, she/he does not know how the other participants will evaluate it. She/he therefore faces uncertainty, which we describe by $\varepsilon$:
$$s_0 = \frac{1}{n}\sum_{j} s_{0j} + \varepsilon = r(x) + \varepsilon$$
where $x$ is the editor's effort and
$$\varepsilon \sim F(\cdot), \qquad F'(\cdot) = f(\cdot)$$
16
Basic Model (2)
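Quality of editing is assumed to be a function of the editor's effort, $X_0 = X_0(x) = x$, and the reward of the editor, $W(s_0)$, is
$$W(s_0) = \begin{cases} w & \text{if } s_0 \ge \bar{s} \\ w/2 & \text{if } s_0 < \bar{s} \end{cases}$$
where $\bar{s}$ is the threshold score below which the reward is halved.
The cost function is $\alpha c(x)$, with $\alpha > 0$, $c'(x) > 0$, $c''(x) > 0$. Thus the editor's expected benefit of editing, $E[B(x)]$, is
$$E[B(x)] = E\left[ W(r(x) + \varepsilon) - \alpha c(x) \right]$$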
17
Basic Model(3)
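$$\begin{aligned}
E[B(x)] &= w\,P(s_0 \ge \bar{s}) + \frac{w}{2}\,P(s_0 < \bar{s}) - \alpha c(x) \\
&= w\,P(\varepsilon \ge \bar{s} - r(x)) + \frac{w}{2}\,P(\varepsilon < \bar{s} - r(x)) - \alpha c(x) \\
&= w - \frac{w}{2}\,F(\bar{s} - r(x)) - \alpha c(x)
\end{aligned}$$
Then the first-order condition is
$$\frac{dE[B(x)]}{dx} = \frac{w}{2}\, f(\bar{s} - r(x))\, r'(x) - \alpha c'(x) = 0 \tag{1}$$
From $f(\bar{s} - r(x)) > 0$, $r' > 0$, $\alpha c' > 0$:
→ Proposition 1: There is an interior solution of (1).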
18
Basic Model(4)
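That means the editor's effort $x$ is a non-decreasing function of $w$,
$$\frac{\partial x(w, \alpha)}{\partial w} \ge 0$$
and a non-increasing function of $\alpha$,
$$\frac{\partial x(w, \alpha)}{\partial \alpha} \le 0$$
and
$$E(s_0) = r(x(w, \alpha))$$
Thus
$$\frac{\partial s_i}{\partial w_i} \ge 0 \quad \text{and} \quad \frac{\partial s_i}{\partial \alpha} \le 0$$
Hypothesis:
1) A higher reward with an easy topic leads to the highest score.
2) A lower reward with a difficult topic leads to the lowest score.
3) The scores for a higher reward with a difficult topic and a lower reward with an easy topic lie between them.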
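The comparative statics of the basic model can be illustrated numerically. The Python sketch below is not part of the study: it assumes specific functional forms that the slides leave open, namely r(x) = x, c(x) = x², a standard normal distribution for ε, and a threshold score of 2. It maximizes the expected benefit w − (w/2)·F(threshold − r(x)) − α·c(x) on a grid and compares the optimal effort across rewards and cost parameters.

```python
import math

def normal_cdf(z):
    """CDF of the standard normal distribution (assumed distribution of epsilon)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_benefit(x, w, alpha, threshold=2.0):
    """E[B(x)] = w - (w/2) F(threshold - r(x)) - alpha c(x), with assumed r and c."""
    r = x            # assumed: expected score equals effort
    cost = x ** 2    # assumed: convex cost, c' > 0 and c'' > 0 for x > 0
    return w - (w / 2.0) * normal_cdf(threshold - r) - alpha * cost

def optimal_effort(w, alpha, threshold=2.0, x_max=10.0, steps=2000):
    """Grid search for the effort level that maximizes the expected benefit."""
    grid = [x_max * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: expected_benefit(x, w, alpha, threshold))

# Effort rises with the reward w and falls with the cost parameter alpha.
for w, alpha in [(20, 1.0), (80, 1.0), (20, 2.0), (80, 2.0)]:
    print(f"w={w}, alpha={alpha}: optimal effort = {optimal_effort(w, alpha):.3f}")
```

Under these assumptions the optimal effort is higher with the 80 reward than with the 20 reward, and lower when α doubles, which matches the signs of the derivatives stated in the model.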
19
Results of Experiment (1): Evaluation Score (1)
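| Difficulty of topic | Statistic | Reward for editing: 20 | Reward for editing: 80 | Difference (p-value of GWT) |
|---|---|---|---|---|
| Easy | average | 0.889 | 0.817 | -0.072 (0.1635) |
| Easy | variance | 1.5356 | 1.612 | |
| Easy | sample | 117 | 131 | |
| Difficult | average | 1.127 | 0.690 | -0.437* (0.0017) |
| Difficult | variance | 1.315 | 1.577 | |
| Difficult | sample | 142 | 155 | |
| Total | average | 1.019 | 0.748 | -0.271* (0.03) |
| Total | variance | 1.421 | 1.592 | |
| Total | sample | 259 | 286 | |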
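As a quick consistency check on the table above, the reported differences and the "Total" averages can be recomputed from the per-cell averages and sample sizes. The short Python sketch below only redoes that arithmetic; the variable and function names are illustrative, and the numbers are copied from the table.

```python
# (average score, sample size) per cell, copied from the table above.
low = {"easy": (0.889, 117), "difficult": (1.127, 142)}    # reward 20
high = {"easy": (0.817, 131), "difficult": (0.690, 155)}   # reward 80

def pooled_average(cells):
    """Sample-size-weighted average over the easy and difficult cells."""
    total_n = sum(n for _, n in cells.values())
    return sum(avg * n for avg, n in cells.values()) / total_n

def difference(level):
    """High-reward average minus low-reward average for one difficulty level."""
    return high[level][0] - low[level][0]

print(round(difference("easy"), 3))        # -0.072
print(round(difference("difficult"), 3))   # -0.437
print(round(pooled_average(low), 3))       # 1.019
print(round(pooled_average(high), 3))      # 0.748
```

The pooled averages reproduce the "Total" row, so the total difference of -0.271 is the gap between the two weighted averages.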
20
Results of Experiment (2): length of editing (effort)
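| Reward and difficulty | Japan | The U.S. | Adjusted total |
|---|---|---|---|
| 20 High | 1158.66667 | 1130.44943 | 1068.267153 |
| 20 Low | 2065.875 | 1035.57311 | 1514.725605 |
| 80 High | 1467.45455 | 2582.21076 | 1735.243332 |
| 80 Low | 1323.44444 | 4764.94228 | 1635.854311 |
| Average of total | 1462.1 | 1684.84211 | 1488.5226 |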
21
Results of Experiment (3): Analysis of Evaluation Score (1)
Ordinary incentive theory (conditions compared):
- Higher reward with easy topic
- Lower reward with difficult topic
- Lower reward with easy topic
- Higher reward with difficult topic
Experimental results (average score):
- 1.127: lower reward with difficult topic
- 0.889: lower reward with easy topic
- 0.817: higher reward with easy topic
- 0.748: higher reward with difficult topic
22
Results of Experiment (4): Analysis of Evaluation Score (2)
Incentive theory with spite (conditions compared):
- Higher reward with easy topic
- Lower reward with difficult topic
- Lower reward with easy topic
- Higher reward with difficult topic
Experimental results (average score):
- 1.127: lower reward with difficult topic
- 0.889: lower reward with easy topic
- 0.817: higher reward with easy topic
- 0.748: higher reward with difficult topic
23
Results of Experiment (5): Analysis of Evaluation Score (3)
Incentive theory with fairness evaluation (conditions compared):
- Higher reward with easy topic
- Lower reward with difficult topic
- Lower reward with easy topic
- Higher reward with difficult topic
Experimental results (average score):
- 1.127: lower reward with difficult topic
- 0.889: lower reward with easy topic
- 0.817: higher reward with easy topic
- 0.748: higher reward with difficult topic
24
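Results of Experiment (6): Evaluation Score (2)

| Country | Statistic | Low reward ($20 or ¥2,000) | High reward ($80 or ¥8,000) | Difference (p-value of GWT) |
|---|---|---|---|---|
| Japan | average | 0.854 | 0.365 | -0.489* |
| Japan | variance | 1.401 | 1.591 | |
| Japan | sample | 137 | 148 | |
| The U.S. | average | 1.205 | 1.160 | -0.045 |
| The U.S. | variance | 1.420 | 1.490 | |
| The U.S. | sample | 122 | 138 | |
| Total | average | 1.019 | 0.748 | -0.271* (0.03) |
| Total | variance | 1.421 | 1.591 | |
| Total | sample | 259 | 286 | |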
25
Results of Experiment (7): Evaluation Score of The U.S.
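| Difficulty of topic | Statistic | Reward for editing: 20 | Reward for editing: 80 | Difference |
|---|---|---|---|---|
| Easy | average | 1.28 | 1.29 | 0.01 |
| Easy | variance | 1.35 | 1.44 | |
| Easy | sample | 53 | 65 | |
| Difficult | average | 1.14 | 1.04 | -0.10 |
| Difficult | variance | 1.48 | 1.53 | |
| Difficult | sample | 69 | 73 | |
| Total | average | 1.20 | 1.16 | -0.04 |
| Total | variance | 1.42 | 1.491 | |
| Total | sample | 122 | 138 | |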
26
Results of Experiment (8): Evaluation Score of Japan
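| Difficulty of topic | Statistic | Reward for editing: 20 | Reward for editing: 80 | Difference |
|---|---|---|---|---|
| Easy | average | 0.563 | 0.348 | -0.215 |
| Easy | variance | 1.553 | 1.653 | |
| Easy | sample | 41 | 68 | |
| Difficult | average | 1.109 | 0.378 | -0.731* |
| Difficult | variance | 1.603 | 1.697 | |
| Difficult | sample | 58 | 62 | |
| Total | average | 0.854 | 0.365 | -0.489* |
| Total | variance | 1.578 | 1.669 | |
| Total | sample | 99 | 130 | |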
27
Results of Experiment (9): Evaluation Score (3)
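| Domain | Statistic | Low | High | Difference | GWT test |
|---|---|---|---|---|---|
| Product | average | 1.036 | 0.806 | -0.2302 | * |
| Product | variance | 1.347 | 1.628 | | |
| Product | sample | | 36 | | |
| Investment | average | 0.556 | 0.6 | 0.04444 | |
| Investment | variance | 1.501 | 1.567 | | |
| Investment | sample | 36 | 30 | | |
| Religion | average | 0.957 | 0.982 | 0.02501 | |
| Religion | variance | 1.537 | 1.523 | | |
| Religion | sample | 94 | 114 | | |
| Politics | average | 1.238 | 0.519 | -0.7188 | * |
| Politics | variance | 1.266 | 1.652 | | |
| Politics | sample | 101 | 106 | | |
| Total | average | 1.019 | 0.748 | -0.2711 | * |
| Total | variance | 1.421 | 1.592 | | |
| Total | sample | 259 | 286 | | |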
28
Results of Experiment (10): Evaluation Score of The U.S.
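| Domain | Statistic | Reward for editing: 20 | Reward for editing: 80 | Difference |
|---|---|---|---|---|
| Religion | average | 0.963 | 1.429 | 0.466 |
| Religion | variance | 1.553 | 1.325 | |
| Religion | sample | 54 | 70 | |
| Politics | average | 1.397 | 0.882 | -0.515* |
| Politics | variance | 1.603 | 1.697 | |
| Politics | sample | 68 | 66 | |
| Total | average | 1.205 | 1.159 | -0.046 |
| Total | variance | 1.578 | 1.669 | |
| Total | sample | 122 | 138 | |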
29
Analysis of the results
The hypothesis is rejected. Then:
1. How do the participants evaluate the editing? → Spiteful or altruistic?
2. Valuation of the editing by outsiders, or: can the market value of the editing be estimated?
3. Evaluation of evaluators
30
Possibility of the behavioral evaluation
Suppose the evaluation function depends not only on the quality of the editing but also on the reward the editor receives.
The evaluation score is then based on both the effort function of the editor and the evaluation function of the other participants acting as evaluators.
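One way to write this supposition down is sketched below. It is an illustrative assumption, not the author's specification: the evaluation function $s$ takes the reward $w$ as a second argument, and the total effect of the reward on the score then splits into an effort effect and a direct evaluation effect.

```latex
% Assumed form: the evaluator's score depends on quality and on the editor's reward.
s_i = s\bigl(u_i(X_0(x)),\, w\bigr), \qquad X_0(x) = x

% Total effect of the reward on the score (chain rule):
\frac{d s_i}{d w}
  = \underbrace{\frac{\partial s}{\partial u}\, u_i'(X_0)\, \frac{\partial x(w,\alpha)}{\partial w}}_{\text{effort effect } (\ge 0 \text{ in the basic model})}
  + \underbrace{\frac{\partial s}{\partial w}}_{\text{direct evaluation effect}}
```

If the direct effect is negative, for example because evaluators are spiteful toward a highly paid editor, the total effect can be negative even though a higher reward raises effort, which would be consistent with the lower scores observed under the high reward.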