Dynamic decision making

Pantelis P. Analytis

March 11, 2018

1 Introduction

2 One-shot time allocation decisions

3 Dynamic effort allocation

4 Extensions of the framework

Chasing life-changing goals

Festinger (1942) - A theoretical interpretation of shifts in aspiration level

[Page excerpt: SHIFTS IN LEVEL OF ASPIRATION, p. 241]

... success very strongly and failure rather weakly, the former curve will be high and the latter low. The reverse situation may also be the case. This lowering and raising of the heights of these curves does not affect the point at which the resultant force will reach its maximum as long as the slope of the curve is not altered. It does, however, affect the height of the resultant curve. The higher the Fa, curve, the higher will be the resultant curve; the higher the Fa/ curve, the lower will be the resultant curve. The height of the resultant curve at a given point shows the strength of the force toward this difficulty level (L). This resultant force will be positive if ...

[Figure: Fig. 2. Derivation of the resultant force (J*p, t) from a set of valence and potency curves of given value. The horizontal axis runs from Easy to Difficult; the plotted curve is labelled "Curve of Resultant Force".]

Atkinson (1957) - Motivational determinants of risk taking behavior

[Page excerpt: RISK-TAKING BEHAVIOR, p. 365]

... again in the literature on level of aspiration (12). Typically, groups of persons for whom the inference of greater anxiety about failure seems justified on the basis of some personality assessment show a much greater variance in level of aspiration than persons whose motivation is inferred to be more normal or less anxious. When the details of behavior are examined, it turns out that they are setting their aspiration level either defensively high or defensively low.

Without further assumptions, the theory of motivation which has been presented when applied to competitive-achievement activity implies that the relationship of constrained performance to expectancy of goal-attainment should take the bell-shaped form shown in Fig. 1, whether the predominant motive is to achieve or to avoid failure. Further, the theory leads to the prediction of exactly opposite patterns for setting the level of aspiration when the predominant motivation is approach and when it is avoidant, as shown in Fig. 2.

Both of these hypotheses have been supported in recent experiments. The writer [5] offered female college students a modest monetary prize for good performance at two 20-minute tasks. The probability of success was varied by instructions which informed the subject of the number of persons with whom she was in competition and the number of monetary prizes to be given. The stated probabilities were [four fractions, illegible in the scan]. The level of performance was higher at the intermediate probabilities than at the extremes for subjects having high thematic apperceptive n Achievement scores, and also for subjects who had low n Achievement scores, presumably a more fearful group.

McClelland [6] has shown the diametrically opposite tendencies in choice of level of aspiration in studies of children in kindergarten and in the third grade. One of the original level-of-aspiration experiments, the ring-toss experiment, was repeated with five-year-olds, and a non-verbal index of the strength of achieve- ...

[Figure: FIG. 1. Strength of motivation to achieve or to avoid failure as a function of the subjective probability of success, i.e., the difficulty of the task. Horizontal axis: Probability of Success (.50 to 1.00 visible).]

[Figure: FIG. 2. Relative attractiveness of tasks which differ in subjective probability of success (i.e., in difficulty). The avoidance curve has been inverted to show that very difficult and very easy tasks arouse less fear of failure and hence are less unattractive than moderately difficult tasks. Curves: Motive to Achieve, Motive to Avoid Failure. Horizontal axis: Probability of Success.]

[5] Atkinson, J. W. Towards experimental analysis of human motivation in terms of motives, expectancies, and incentives. To appear in Motives in fantasy, action, and society. Princeton: Van Nostrand (in preparation).

[6] McClelland, D. C. Risk-taking in children with high and low need for achievement. To appear in Motives in fantasy, action, and society. Princeton: Van Nostrand (in preparation).

A field in search of valence

Different perspectives on valence

Lewin (1942)

Rotter (1954)

Edwards (1954)

Tolman (1955)

Atkinson (1957)

Vroom (1964)

Relating expected utility theory and the different expectancy theories

Siegel (1957)

Feather (1959)

Atkinson (1982)


Decisions from description

Do you prefer 10 Euros with 10% probability, or 1 Euro for sure?
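A quick expected-value check, added here only as an illustration, suggests why this question isolates risk attitudes. It assumes the gamble pays nothing in the remaining 90% of cases, which the slide does not state explicitly:

\[
0.10 \times 10 + 0.90 \times 0 = 1 \text{ Euro}
\qquad \text{versus} \qquad
1.00 \times 1 = 1 \text{ Euro}.
\]

Under that assumption both options have the same expectation, so a preference for either one reflects how the decision maker treats risk rather than a difference in expected payoff.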

Setting up the problem

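The agent's productivity:

\[ q(t) = \lambda t + \sigma W_t \tag{1} \]

Make-or-break rewards:

\[ r_1(x_1) = \begin{cases} B, & \text{if } x_1 \ge \Delta \\ 0, & \text{otherwise} \end{cases} \tag{2} \]

Safe rewards:

\[ r_2(x_2) = v \cdot x_2 \tag{3} \]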
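The setup above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names are hypothetical, the threshold is written `delta`, and the default values are borrowed from the "Defaults" line that appears later in the deck. It relies on the standard fact that a Brownian motion at time t is normally distributed with mean 0 and variance t.

```python
import math
import random


def simulate_quality(lam, sigma, t, rng):
    """Draw one value of q(t) = lam * t + sigma * W_t, using W_t ~ Normal(0, t)."""
    return lam * t + sigma * rng.gauss(0.0, math.sqrt(t))


def make_or_break_reward(x, B=1000.0, delta=50.0):
    """Pay the full prize B only if performance x reaches the threshold delta."""
    return B if x >= delta else 0.0


def safe_reward(x, v=10.0):
    """Pay v per unit of performance."""
    return v * x


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draw
    x = simulate_quality(lam=5.0, sigma=5.0, t=10.0, rng=rng)
    print("performance:", x)
    print("make-or-break reward:", make_or_break_reward(x))
    print("safe reward:", safe_reward(x))
```

Setting `sigma=0.0` removes the noise term, which makes the step in the make-or-break reward easy to inspect.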

Rewards as a function of invested time

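\[ r_1(q_1(t_1)) = r_1(\lambda_1 t_1 + \sigma_1 W_{t_1}) = \begin{cases} B, & \text{if } \lambda_1 t_1 + \sigma_1 W_{t_1} \ge \Delta \\ 0, & \text{otherwise} \end{cases} \tag{4} \]

\[ r_2(q_2(t_2)) = v \cdot q_2(t_2) = v \cdot (\lambda_2 t_2 + \sigma_2 W_{t_2}) \tag{5} \]

The expectation of rewards can be written as:

\[ \begin{aligned} E[r_1(q_1(t_1))] &= B \cdot \Pr(\lambda_1 t_1 + \sigma_1 W_{t_1} \ge \Delta) \\ &= B \cdot \Pr\left( \frac{1}{\sqrt{t_1}} W_{t_1} \ge \frac{\Delta - \lambda_1 t_1}{\sigma_1 \sqrt{t_1}} \right) \\ &= B \cdot \left( 1 - \Phi\left( \frac{\Delta - \lambda_1 t_1}{\sigma_1 \sqrt{t_1}} \right) \right) \end{aligned} \tag{6} \]

\[ E[r_2(q_2(t_2))] = v \cdot \lambda_2 t_2 \tag{7} \]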
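The two expectations (Eqs. 6 and 7) are straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the threshold is written `delta`, the defaults are taken from the "Defaults" line later in the deck, and the standard normal CDF is built from `math.erfc`.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def expected_mob(t1, lam1=5.0, sigma1=5.0, B=1000.0, delta=50.0):
    """Expected make-or-break reward after investing t1 (Eq. 6)."""
    if t1 <= 0.0:
        # With no time invested, performance is 0 with certainty.
        return B if delta <= 0.0 else 0.0
    z = (delta - lam1 * t1) / (sigma1 * math.sqrt(t1))
    return B * (1.0 - norm_cdf(z))


def expected_safe(t2, lam2=5.0, v=10.0):
    """Expected safe reward after investing t2 (Eq. 7)."""
    return v * lam2 * t2


if __name__ == "__main__":
    for t in (2.5, 5.0, 7.5, 10.0):
        print(t, round(expected_mob(t), 2), expected_safe(t))
```

With these defaults, investing the full horizon in the make-or-break task puts the expected performance exactly on the threshold, so the success probability is one half.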

Rewards as a function of skill and uncertainty

Defaults: λ1 = λ2 = 5, B = 1000, Δ = 50, T = 10, v = 10, σ = 5

One shot time allocation

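\[ \begin{aligned} h(t_1, t_2) &= E[r_1(q_1(t_1))] + E[r_2(q_2(t_2))] \\ &= B \left( 1 - \Phi\left( \frac{\Delta - \lambda_1 t_1}{\sigma_1 \sqrt{t_1}} \right) \right) + v \cdot \lambda_2 t_2. \end{aligned} \tag{8} \]

An agent should then optimize this function (Eq. 8) under the time constraint, which is written formally as:

\[ \begin{aligned} \underset{t_1,\, t_2}{\text{maximize}} \quad & h(t_1, t_2) \\ \text{subject to} \quad & T = t_1 + t_2 \end{aligned} \tag{9} \]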
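One simple way to explore this problem is a grid search over t1, with t2 = T − t1 fixed by the constraint. The sketch below is only an illustration: the function names are hypothetical, the threshold is written `delta`, and the example call uses λ2 = 3 as an illustrative choice rather than the deck's default of 5.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def h(t1, T=10.0, lam1=5.0, lam2=5.0, sigma1=5.0, B=1000.0, delta=50.0, v=10.0):
    """Total expected reward (Eq. 8) when t1 goes to make-or-break and T - t1 to safe."""
    if t1 > 0.0:
        p = 1.0 - norm_cdf((delta - lam1 * t1) / (sigma1 * math.sqrt(t1)))
    else:
        p = 1.0 if delta <= 0.0 else 0.0
    return B * p + v * lam2 * (T - t1)


def best_allocation(T=10.0, steps=1000, **params):
    """Return (t1, h) for the best t1 on an evenly spaced grid over [0, T]."""
    grid = [T * i / steps for i in range(steps + 1)]
    t1 = max(grid, key=lambda t: h(t, T=T, **params))
    return t1, h(t1, T=T, **params)


if __name__ == "__main__":
    print(best_allocation(lam2=3.0))
```

A grid search is used instead of a gradient method because the objective need not be concave in t1, so the best allocation can sit at a corner. With the deck's defaults (λ2 = 5) the two all-in allocations both give an expected reward of 500.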

Returns from different time allocation policies

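Time allocation patterns (λ2 = 3)

[Figure: Optimal Time Allocation. Proportion of time allocated to make-or-break (0 to 1) as a function of skill level (0 to 10), for σ = 0, 5, 10.]

\[ B \cdot \Pr(\lambda_1 T + \sigma_1 W_T > \Delta) = v \cdot \lambda_2 T. \tag{10} \]

\[ \left[ E[r_1(q_1(t_1))] \right]' = v \cdot \lambda_2 \quad \text{and} \quad \left[ E[r_1(q_1(t_1))] \right]'' < 0, \tag{11} \]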
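Eq. 10 equates the expected reward from putting all of T into the make-or-break task with the expected reward from putting all of T into the safe task. As a hedged illustration, the skill level λ1 at which the two are equal can be found by bisection. The function names below are hypothetical, the threshold is written `delta`, and the defaults combine λ2 = 3 from this slide with the remaining values from the deck's "Defaults" line.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def all_in_gap(lam1, T=10.0, lam2=3.0, sigma1=5.0, B=1000.0, delta=50.0, v=10.0):
    """Left side minus right side of Eq. 10 for a given skill level lam1."""
    z = (delta - lam1 * T) / (sigma1 * math.sqrt(T))
    return B * (1.0 - norm_cdf(z)) - v * lam2 * T


def indifference_skill(lo=0.0, hi=10.0, tol=1e-9, **params):
    """Bisection on lam1; the gap is increasing in lam1, so the root is unique."""
    if all_in_gap(lo, **params) > 0.0 or all_in_gap(hi, **params) < 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if all_in_gap(mid, **params) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(indifference_skill())
```

Below the returned skill level the all-in safe allocation has the higher expected reward, and above it the all-in make-or-break allocation does. Interior allocations are governed by the first- and second-order conditions in Eq. 11 and are not covered by this sketch.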

Generalizing Kukla's theory of effort allocation

[Figure: Optimal Time Allocation. Proportion of time allocated to make-or-break (0 to 1) as a function of skill level (0 to 10), for σ = 0, 5, 10.]

[Page excerpt: ANDY KUKLA, p. 460]

[Figure: FIG. 2. Theoretical relationship between perceived task facility and intended effort (E*). Horizontal axis: PERCEIVED FACILITY.]

... perceived to be more difficult than Xi. The effort-outcome relationship for Xj is a discontinuous step function similar to the one shown in Figure 1. However, by the relationship between effort and perceived difficulty postulated in the previous section, the least effort which results in success at Xj will be greater than that which results in success at Xi. That is to say, the point of discontinuity in the effort-outcome function will be further to the right on the effort scale for Xj than it was for Xi. Again the theory predicts that the selected effort level Ej* for task Xj will be the point on the effort scale at which the discontinuity occurs. Thus, subjects will decide to expend more effort on Xj than on Xi.

As the tasks considered become more difficult, it is clear that the selected effort levels increase in magnitude. Eventually, the task is perceived to be so difficult that only the maximum possible exertion is expected to result in success. For any task Xk which is still more difficult than this, all possible effort levels are associated with the same failing outcome /*. In that case, all possible effort levels also result in the same SEU. It follows that subjects will select the least such effort level, that is, Ek* = 0. Evidently, the selected effort for any task more difficult than Xk will also be zero.

It is then possible to graph the relationship between perceived task difficulty and intended effort (Figure 2). As long as the task is perceived to be so difficult that not even maximal effort will result in success, the selected effort will be E* = 0. When the task is just easy enough for maximal effort to lead to success, the intended expenditure of effort will be E* = 1. As tasks become easier after this point, E* goes down (although not necessarily in the linear fashion shown in the figure). For the easiest possible task Xn, success is assured with no effort at all and the subject again does not exert himself.

Since it has been assumed that intensity is a positive linear function of intended effort, Figure 2 depicts the shape of the functional relationship between perceived facility and behavioral intensity as well as that between facility and E*.

Figure 2 is the intended effort and the intensity function for an individual subject. In a group of subjects, it is to be expected that different individuals will perceive the same task as having varying degrees of ...

Allocating effort dynamically

[Figure: Performance quality (0 to 50) against time invested in make-or-break (0 to 10).]


Start with the risky task (if at all).

Switch to the safe task in case of success or if the chances of success appear bleak.

Deriving the optimal giving-up threshold

The expected reward of the optimal policy for the last step will be:

\[ R_{n-1}(x) = \max\left\{ \int_{-\infty}^{\infty} r_1(y) \cdot f_x(y)\,dy,\; v \cdot \lambda_2 \cdot \frac{T}{n} \right\}. \tag{12} \]

We denote by c_{n−1} the break-even point, for which the expected rewards of the two tasks are equal. That is, c_{n−1} solves the equation

\[ \int_{-\infty}^{\infty} r_1(y) \cdot f_{c_{n-1}}(y)\,dy = v \cdot \lambda_2 \cdot \frac{T}{n}. \tag{13} \]

Deriving the optimal giving-up threshold

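The expected reward of the optimal policy at t = n − 2:

\[ R_{n-2}(x) = \max\left\{ \int_{-\infty}^{\infty} R_{n-1}(y) \cdot f_x(y)\,dy,\; v \cdot \lambda_2 \cdot \frac{2T}{n} \right\}. \tag{14} \]

The above procedure can be continued inductively to get:

\[ R_k(x) = \max\left\{ \int_{-\infty}^{\infty} R_{k+1}(y) \cdot f_x(y)\,dy,\; v \cdot \lambda_2 \cdot \frac{(n-k)T}{n} \right\}, \tag{15} \]

for any k = 0, ..., n − 1, with R_n = r_1 so that k = n − 1 recovers the last-step case. For any k, the performance break-even point c_k is defined by Eq. 16: it is the unique solution of the equation

\[ \int_{-\infty}^{\infty} R_{k+1}(y) \cdot f_{c_k}(y)\,dy = v \cdot \lambda_2 \cdot \frac{(n-k)T}{n}. \tag{16} \]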
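The recursion in Eqs. 12 to 16 can be approximated numerically by backward induction on a grid of performance levels. The sketch below is a rough illustration under several assumptions that are not stated on the slides: f_x is taken to be the normal density of performance one step ahead (mean x + λ1·T/n, variance σ1²·T/n), success is only checked at the n decision points, an agent who has reached the threshold collects B and spends the remaining time on the safe task, and states below the grid are treated like the lowest grid point. The function name, grid settings, and the symbol `delta` for the threshold are hypothetical.

```python
import math

import numpy as np


def giving_up_thresholds(lam1=5.0, lam2=5.0, sigma1=5.0, B=1000.0, delta=50.0,
                         T=10.0, v=10.0, n=50, grid_size=800, x_min=-100.0):
    """Approximate the break-even points c_0, ..., c_{n-1} by backward induction."""
    dt = T / n
    sd = sigma1 * math.sqrt(dt)
    # Performance states strictly below the success threshold.
    xs = np.linspace(x_min, delta, grid_size, endpoint=False)
    dx = xs[1] - xs[0]
    erf = np.vectorize(math.erf)

    def cdf(z):
        return 0.5 * (1.0 + erf(z / math.sqrt(2.0)))

    mean = xs + lam1 * dt
    # dens[i, j] approximates f_{x_i}(y_j), the one-step transition density.
    z = (xs[None, :] - mean[:, None]) / sd
    dens = np.exp(-0.5 * z ** 2) / (sd * math.sqrt(2.0 * math.pi))
    p_up = 1.0 - cdf((delta - mean) / sd)   # chance of reaching delta next step
    p_low = cdf((x_min - mean) / sd)        # chance of dropping below the grid

    r_next = np.zeros(grid_size)            # R_n(y) = r_1(y) = 0 for y < delta
    thresholds = np.full(n, np.nan)
    for k in range(n - 1, -1, -1):
        safe_now = v * lam2 * (n - k) * dt
        # Assumed value after success at step k + 1: prize plus safe work afterwards.
        success_next = B + v * lam2 * (n - k - 1) * dt
        cont = dens @ r_next * dx + p_up * success_next + p_low * r_next[0]
        worth_it = cont >= safe_now
        if worth_it.any():
            thresholds[k] = xs[np.argmax(worth_it)]  # lowest state worth continuing
        r_next = np.maximum(cont, safe_now)
    return thresholds


if __name__ == "__main__":
    c = giving_up_thresholds()
    print("threshold at the first step:", c[0])
    print("threshold at the last step:", c[-1])
```

Each entry of the returned array is the lowest grid state at which continuing with the make-or-break task is at least as good as switching to the safe task; an entry stays NaN if continuing is never worthwhile at that step. A finer grid or smaller step size changes the numbers, so the output should be read as an approximation.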

Deriving a simpler myopic giving-up threshold

Uncertainty unfolds and is replaced by an actual outcome y = q_1(t') experienced until time t' ∈ [0, T]. The agent needs to find the optimal allocation for the remaining time T_r = T − t' and the remaining threshold Δ − y.

The expected returns from t_1 invested in the make-or-break (MOB) task can be expressed as:

\[ \begin{aligned} E[r_1(q_1(t_1))] &= B \cdot \Pr(\lambda_1 t_1 + \sigma_1 W_{1,t_1} \ge \Delta - y) \\ &= B \cdot \left( 1 - \Phi\left( \frac{\Delta - y - \lambda_1 t_1}{\sigma_1 \sqrt{t_1}} \right) \right) \end{aligned} \tag{17} \]

and for the safe task E[r_2(q_2(t_2))] = v · λ2 t_2. A myopic agent seeks to optimize the sum of these rewards, under the constraint T_r = t_1 + t_2.
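The myopic rule can be sketched as a one-shot allocation that is re-solved with the remaining time and the remaining distance to the threshold. The code below is an illustration only: the function name is hypothetical, the threshold is written `delta`, the defaults come from the deck's "Defaults" line, and reading a planned t1 of zero as "give up" is this sketch's interpretation of the rule.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def myopic_plan(y, t_elapsed, T=10.0, lam1=5.0, lam2=5.0, sigma1=5.0,
                B=1000.0, delta=50.0, v=10.0, steps=1000):
    """Best split of the remaining time given current performance y (Eq. 17)."""
    t_rem = T - t_elapsed
    gap = delta - y  # remaining distance to the threshold

    def value(t1):
        if t1 > 0.0:
            p = 1.0 - norm_cdf((gap - lam1 * t1) / (sigma1 * math.sqrt(t1)))
        else:
            p = 1.0 if gap <= 0.0 else 0.0
        return B * p + v * lam2 * (t_rem - t1)

    grid = [t_rem * i / steps for i in range(steps + 1)]
    t1 = max(grid, key=value)
    return t1, value(t1)


if __name__ == "__main__":
    print(myopic_plan(y=25.0, t_elapsed=5.0))  # on track: keep going
    print(myopic_plan(y=0.0, t_elapsed=5.0))   # far behind: plan puts no time in MOB
```

In the first call the agent is exactly on the expected trajectory and the plan keeps all remaining time in the make-or-break task. In the second call no progress has been made by the halfway point and the plan allocates everything to the safe task.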

Threshold comparison

[Figure: Performance quality (0 to 50) against time invested in make-or-break (0 to 10). Legend: Sample, Myopic, Optimal, Success.]

Threshold comparison

[Figure: Performance quality (0 to 50) against time invested in make-or-break (0 to 10). Panel columns: λ = 4, λ = 5, λ = 6. Panel rows: σ = 5, σ = 10.]

Play to win

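Pursue stubbornly the make-or-break goal and never give up.

The time τ at which the threshold is first reached can be defined mathematically as:

\[ \tau = \min\{ t : q_1(t) \ge \Delta \} = \min\left\{ t : W^1_t \ge \frac{\Delta - \lambda_1 t}{\sigma_1} \right\}. \]

The expected reward from the make-or-break task is then

\[ E[r_1(t_1)] = B \cdot P(\tau \le T) \]

and the expected reward from the safe-reward task is

\[ E[r_2(t_2)] = E\left[ \lambda_2 \cdot v \cdot (T - \tau) \cdot 1_{\tau \le T} \right]. \]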
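Both expectations can be evaluated without simulation. The sketch below uses the standard first-passage formula for a Brownian motion with drift, P(τ ≤ t) = Φ((λ1 t − Δ)/(σ1 √t)) + exp(2 λ1 Δ / σ1²) · Φ((−Δ − λ1 t)/(σ1 √t)), and the identity E[(T − τ) · 1{τ ≤ T}] = ∫₀ᵀ P(τ ≤ t) dt. Neither formula appears on the slides, so both are assumptions of this illustration, as are the function names, the symbol `delta`, and the defaults taken from the deck's "Defaults" line.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def hit_probability(t, lam1=5.0, sigma1=5.0, delta=50.0):
    """P(tau <= t) for q_1(t) = lam1 * t + sigma1 * W_t and a threshold delta > 0."""
    if t <= 0.0:
        return 0.0
    s = sigma1 * math.sqrt(t)
    # The exponential factor can overflow for very large lam1 * delta / sigma1**2.
    reflected = math.exp(2.0 * lam1 * delta / sigma1 ** 2) * norm_cdf((-delta - lam1 * t) / s)
    return norm_cdf((lam1 * t - delta) / s) + reflected


def play_to_win(T=10.0, lam1=5.0, lam2=5.0, sigma1=5.0, B=1000.0,
                delta=50.0, v=10.0, steps=2000):
    """Expected make-or-break and safe rewards of the never-give-up policy."""
    mob = B * hit_probability(T, lam1, sigma1, delta)
    dt = T / steps
    # Midpoint rule for the integral of P(tau <= t) over [0, T].
    time_left = sum(hit_probability((i + 0.5) * dt, lam1, sigma1, delta)
                    for i in range(steps)) * dt
    safe = lam2 * v * time_left
    return mob, safe


if __name__ == "__main__":
    mob, safe = play_to_win()
    print("make-or-break:", round(mob, 1), "safe:", round(safe, 1))
```

Because the threshold can be crossed at any moment before T, the success probability here is higher than the probability of being above the threshold at time T itself.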

Lancaster - Stochastic model of duration of strikes (1972)

[Page excerpt: LANCASTER - Stochastic Model for Duration of a Strike, p. 260 (Part 2)]

The density function is unimodal and positively skew; some Inverse Gaussian density functions are sketched in Fig. 2 for values of μ and σ consistent with the observations on U.K. strike durations. The hazard or age specific settlement rate defined as b(t) = f/(1 - F) = -d log(1 - F)/dt increases from zero at time zero to a single maximum located in the time interval 1/3σ² < T_m < 2/3σ², then falls approaching the value μ²/2σ² as t → ∞.† Some plots of -log(1 - F), whose slope is +(t), are also given in Fig. 2.

[Figure: FIG. 1. Difference of the parties (X) against time (days), with an agreement barrier.]

3. THE DATA

The data consist of the list of strikes recorded by the Ministry of Labour as commencing in 1965 in the United Kingdom. The Ministry attempts to record all strikes other than those lasting less than a day or involving fewer than 10 men unless one of these involves a loss of more than 100 man days. This recording rule means that, if the proportion of stoppages involving fewer than 10 men varies systematically with duration, then the proportion of stoppages of different durations that are recorded will itself vary, thus biasing the recorded duration frequency distribution away from the true one. This point was tackled by attempting to see whether there was any evidence of lack of independence of duration and number of men involved, size, to be seen in the lists of recorded strikes. The data were divided into eight industries, whose definitions are provided in the appendix, and the scatter diagrams of duration and size inspected. For seven of the eight industries the evidence was consistent with stochastic independence of duration and size. The exception was the Construction industry whose scatter diagram is shown as Fig. 3 together with the slope coefficient and standard error in the regression of log size on log duration.‡

† For an extensive account of the Inverse Gaussian distribution see Tweedie (1957).

‡ In four industries the regression coefficient of log size on log duration was positive and in four negative. Only in construction did it differ significantly from zero on a two-tailed test.

Performance comparison

[Figure: Reward (0 to 1000) for the Myopic, Optimal and Play to win policies.]

Defaults: λ1 = λ2 = 5, B = 1000, Δ = 50, T = 10, v = 10, σ = 5

Performance comparison

[Figure: Reward (0 to 1000) for the Myopic, Optimal and Play to win policies. Panel columns: λ1 = 4, λ1 = 5, λ1 = 6. Panel rows: σ = 5, σ = 10.]

Possible experimental paradigms: Learning to wait

Oprea and Friedman

Based on Dixit's model of investment under uncertainty.

[Page excerpt: OPREA ET AL., LEARNING TO WAIT, p. 1109]

[Figure: Figure 2. Computerized display used in the experiment using Medium B parameters. The jagged line shows present (time = 0) and previous values of V in the current period. The subject invests by clicking the button near the bottom of the screen. The window on the right displays current period information and toggles to summarize results from previous periods. The screenshot is taken at the end of a period and shows feedback on the subject's decision and earnings. Because the software has the capacity to run experiments in which subjects compete for the investment opportunity, the information panel on subjects' screens notifies them of how many competitors they have - in this case, none.]

... approximation of Brownian motion described in the previous section. When a subject clicks the button to invest, she earns V - C points. Even after clicking, the subject can see the V line evolve until the random expiration time, controlled by the parameter q. Costs, value paths, and ending times were randomly generated in real time.

In addition to the graphical display, the subject's screen shows the numerical values of V and C, earnings in the current period, and cumulative earnings so far ("total score"). It also allows each subject to view her previous decisions and earnings at any time.

Subjects were 69 students (primarily) at the University of California, Santa Cruz, who were randomly assigned to the experiment using an online recruitment software. The subject pool included students with many different majors. Participation was purely voluntary. No subject participated in more than one treatment.

At the beginning of each of 10 sessions, each subject was seated at a visually isolated computer terminal and assigned to a treatment (e.g., Medium B). Instructions were read aloud and the interface was displayed on a wall screen. The binomial parameters for the chosen treatment were explained and written on a white board. Subjects participated in six practice periods. Each subject then participated in 80 periods for pay with no change in treatment. Sessions lasted between 80 and 120 minutes, depending on the treatment and the random draws made over the course of the session.

Nonoverlapping groups of 17 subjects participated in each of the Low, Medium A, and Medium B treatments and 18 in the High treatment. A subject with cumulative payoff n over all periods received a(n - b) cents in cash at the end of the session. To reduce the wide variation in expected earnings across treatments, we chose b = 200 in Low and Medium treatments and ...

© 2009 The Review of Economic Studies Limited

The way ahead

Extensions of the framework

Bounded rationality and risk taking in the wild

Subjective value of successes and failures

Self-efficacy and self-fulfilling prophecies

Learning processes

Deadlines and motivation

Contests
