
3. Decision making under uncertainty

Certainty and Uncertainty

Economic agents choose actions on the basis of the consequences that those actions produce. Other factors (the state of the world) may interact with an action to produce a particular consequence.

A = set of feasible actions

S = set of possible states of the world

C = set of consequences

A combination of an action a ∈ A and a state s ∈ S will produce a

particular consequence c ∈ C.

(s, a) → c = f(s, a).

Uncertainty about the state of the world is often modelled by a

probability measure on S.

• Choosing an action a determines a consequence for each state of the world, f(s, a). The decision over actions in A can therefore be viewed as a decision over state-dependent consequences.

Write (c11, c21, · · · , cS1) for the state-contingent consequences associated with action a1. Choosing a1 over a2 is the same as choosing (c11, · · · , cS1) over (c12, · · · , cS2).

• If f is constant with respect to the state of the world, then the decision is taken under certainty.

Alternative viewpoint – choice of probability distribution over outcomes

The relationship among actions, states of the world and consequences is described by f : S × A → C.

Since a probability measure is defined on S, there is an induced probability distribution on the set of consequences for each action. Consider an action a ∈ A and any (measurable) subset of consequences K ⊂ C,

Prob{K} := prob{s ∈ S | f(s, a) ∈ K}.

The probability of a particular consequence is equal to the probability of the states of the world which lead to this consequence, given a particular action.

Hence, the choice of an action amounts to the choice of a probability distribution on consequences (like the choice among different gambles or investment alternatives, that is, a choice among alternative probability distributions).

Example

Consider a price-taking firm which maximizes profit by choosing a single input, labor ℓ. Let φ(ℓ) denote the production function, and let w and p be the prices of the input and the output.

The firm's profit function: π(ℓ) = pφ(ℓ) − wℓ.

Action = choice of input level ℓ; consequence = profit π(ℓ).

Assume there are two states, s1 and s2, and two input levels, ℓ1 and ℓ2.

prob{π(s1, ℓ)} = prob{s1},  prob{π(s2, ℓ)} = prob{s2}.

Let the production function be

φ(s, ℓ) = √ℓ   for s = s1 (rainy),
φ(s, ℓ) = 2√ℓ  for s = s2 (sunny).

Assume prob{s1} = 3/4, prob{s2} = 1/4; p = 2 and w = 1.

Choosing ℓ = 1 implies prob{π = 1} = 3/4, prob{π = 3} = 1/4.
Choosing ℓ = 4 implies prob{π = 0} = 3/4, prob{π = 4} = 1/4.

In the state-space approach, a choice of action ℓi is a choice of the state-contingent profit level (π(s1, ℓi), π(s2, ℓi)).
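As a numerical check of the induced distributions in this example, here is a minimal Python sketch; the helper names phi, profit and induced_distribution are illustrative, not part of the notes.

```python
from math import sqrt

# States and their probabilities: s1 (rainy), s2 (sunny)
probs = {"s1": 3/4, "s2": 1/4}
p, w = 2.0, 1.0  # output price and wage, as in the example

def phi(state, labor):
    """State-dependent production function: sqrt(l) if rainy, 2*sqrt(l) if sunny."""
    return sqrt(labor) if state == "s1" else 2 * sqrt(labor)

def profit(state, labor):
    """Profit pi(s, l) = p*phi(s, l) - w*l."""
    return p * phi(state, labor) - w * labor

def induced_distribution(labor):
    """Probability distribution over profit levels induced by choosing `labor`."""
    dist = {}
    for state, prob in probs.items():
        pi = round(profit(state, labor), 10)
        dist[pi] = dist.get(pi, 0.0) + prob
    return dist

print(induced_distribution(1))  # {1.0: 0.75, 3.0: 0.25}
print(induced_distribution(4))  # {0.0: 0.75, 4.0: 0.25}
```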

Objects of choice can be viewed either as

• state-contingent outcomes

• probability distributions.

Formalism

Given a set of outcomes C and a probability distribution on the

set of states, each action induces a probability distribution on the

outcomes in C.

If the set of consequences is finite, C = {c1, · · · , cn}, then each action determines a vector of probabilities from the set

∆n = {(p1, · · · , pn) ∈ R^n_+ : ∑_{i=1}^n pi = 1},

with pi = prob({s ∈ S | f(s, a) = ci}). Here, ∆n is an (n − 1)-dimensional simplex.

Von Neumann-Morgenstern utility index – a utility function over outcomes, u(ci).

Given a von Neumann-Morgenstern utility function u, one can treat the expected utility representation ∑_{i=1}^n pi u(ci) as a function of the probability distribution (p1, · · · , pn). Define a utility function on probabilities as

U(p1, · · · , pn) = ∑_{i=1}^n pi u(ci).
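A small sketch of U viewed as a function of the probability vector alone, with an arbitrarily chosen utility index; all names and numbers below are illustrative.

```python
from math import log

def expected_utility(p, outcomes, u):
    """U(p1, ..., pn) = sum_i p_i * u(c_i) for a probability vector p over fixed outcomes."""
    assert abs(sum(p) - 1.0) < 1e-12 and all(pi >= 0 for pi in p)
    return sum(pi * u(ci) for pi, ci in zip(p, outcomes))

outcomes = [1.0, 3.0, 4.0]          # consequence levels c_i (illustrative)
u = lambda c: log(1.0 + c)          # an illustrative vNM utility index

print(expected_utility([0.75, 0.25, 0.0], outcomes, u))  # distribution induced by one action
print(expected_utility([0.0, 0.25, 0.75], outcomes, u))  # distribution induced by another
```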

Theorem (existence of representation of preferences by a continuous utility function)

Assumptions on a preference ordering over probability distributions:

1. Completeness: the ordering ranks any pair of probability distributions in ∆n.

2. Transitivity: if p ≽ q and q ≽ r, then p ≽ r.

3. Continuity: if a probability distribution p is transformed continuously into another probability distribution q with q ≻ p, the transformation passes through a probability distribution that is indifferent to any given probability distribution ranked between p and q. That is, preferences over probability distributions do not change abruptly.

Existence of utility function on ∆n

If a preference ordering over the probability distributions in ∆n satisfies completeness, transitivity and continuity, then there exists a utility function U : ∆n → R that represents this preference ordering. The utility function U(·) is unique up to a monotone transformation.

[One can take any strictly increasing function f : R → R, say f(x) = exp(x), to obtain another equivalent utility function Ũ(p) = f(U(p)).]

This representation evaluates a probability distribution (p1, · · · , pn) over outcomes (c1, · · · , cn) by forming a weighted average of the utilities u(ci) derived from the different outcomes, using the probabilities as weights, i.e. by computing the expected utility.

Independence axiom

The preference relation on ∆n represented by the utility function

U(·) satisfies for any p, q, r ∈ ∆n and any α ∈ [0,1]

U(αp + (1− α)r) ≥ U(αq + (1− α)r) iff U(p) ≥ U(q).

• One can decompose any two probability distributions into parts that are identical and parts that are different.

[Figure: ranking of the different parts of a compound probability distribution.]

Example

An investor is indifferent between X and Y; Z is a third prospect. The investor should then be indifferent between these two gambles:

X with prob p and Z with prob 1 − p;
Y with prob p and Z with prob 1 − p.

If a person were indifferent between having a Ford or a Datsun, she would be indifferent between buying a lottery ticket for $10 that gave a 1 in 500 chance of winning a Ford and a ticket for $10 that gave the same chance of winning a Datsun.

Allais Paradox (1952)

C1 = 5 million, C2 = 1 million, C3 = 0

        prob{C1}   prob{C2}   prob{C3}
  p        0          1          0
  q       0.10       0.89       0.01
  r       0.10        0         0.90
  s        0         0.11       0.89

Most people prefer p over q (they do not consider the 10% chance of winning 5 million worth the risk of losing the 1 million with 1% chance). Most people also prefer r over s. According to the independence axiom, one of the following must be true for the preferences:

(i) U(p) > U(q) and U(s) > U(r), or

(ii) U(q) > U(p) and U(r) > U(s), or

(iii) U(p) = U(q) and U(r) = U(s).

The actual behavior observed in experiments violates the independence axiom.
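The violation can also be seen algebraically: U(p) − U(q) and U(s) − U(r) are the same linear combination of u(C1), u(C2), u(C3), so no expected utility index can generate both p ≻ q and r ≻ s. A short sketch (the three utility indexes below are arbitrary choices):

```python
from math import log, sqrt

def eu(dist, u):
    """Expected utility of a lottery over C1 = 5, C2 = 1, C3 = 0 (amounts in millions)."""
    return sum(prob * u(c) for c, prob in dist.items())

p = {1: 1.0}
q = {5: 0.10, 1: 0.89, 0: 0.01}
r = {5: 0.10, 0: 0.90}
s = {1: 0.11, 0: 0.89}

for name, u in [("sqrt", sqrt), ("log(1+c)", lambda c: log(1 + c)), ("linear", lambda c: c)]:
    # U(p) - U(q) always equals U(s) - U(r), so p preferred to q forces s preferred to r.
    print(name, round(eu(p, u) - eu(q, u), 6), round(eu(s, u) - eu(r, u), 6))
```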

Theorem (expected utility representation)

A utility function U on ∆n satisfies the independence axiom iff there

is a utility function over outcomes u : C → R such that for all p and

q ∈ ∆n

U(p) ≥ U(q)  iff  ∑_{i=1}^n pi u(ci) ≥ ∑_{i=1}^n qi u(ci).

Remark

1. Unlike general utility functions, which are unique up to any monotone transformation, utility indexes are unique only up to a positive affine transformation: v(x) = a + b u(x), for a, b ∈ R and b > 0.

2. Utility indexes must be bounded in order that a well-defined expected utility function exists. An example is the failure in the St. Petersburg paradox when the unbounded linear utility index u(x) = x is used.

Certainty equivalent, risk premium and risk aversion

1. The certainty equivalent of a probability distribution F is the real number C(F) that satisfies

u(C(F)) = ∫_C u(x) dF(x) ≜ U(F).

2. The risk premium is the real number q(F) that satisfies

q(F) = µ(F) − C(F),

where µ(F) = ∫_C x dF(x) = expected value of F.
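For a discrete lottery, C(F) can be computed by inverting u. A minimal sketch with log utility; the lottery and helper names are illustrative.

```python
from math import log, exp

def certainty_equivalent_and_premium(lottery, u, u_inv):
    """C(F) solves u(C(F)) = E[u(x)]; risk premium q(F) = mu(F) - C(F)."""
    expected_u = sum(p * u(x) for x, p in lottery)   # U(F)
    mu = sum(p * x for x, p in lottery)              # mean of F
    ce = u_inv(expected_u)                           # certainty equivalent C(F)
    return ce, mu - ce

lottery = [(50.0, 0.5), (150.0, 0.5)]                # a 50/50 gamble (illustrative)
ce, premium = certainty_equivalent_and_premium(lottery, log, exp)
print(ce, premium)   # ce ~ 86.6 < mean 100, so q(F) > 0: log utility is risk-averse
```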

Would the agent prefer to receive the expected value of a lottery with certainty rather than the lottery itself? The agent is said to be

    risk-averse    if q(F) > 0
    risk-neutral   if q(F) = 0
    risk-loving    if q(F) < 0

for all probability distributions F.

Alternative approach: whether the investor prefers a probability distribution to its expected value.

Consider u(µ(F) − q(F)) = u(C(F)) = ∫ u(x) dF(x) ≜ U(F). Since u(x) is strictly increasing, we have

q(F) ⋛ 0 ⟺ u(µ(F)) ⋛ U(F),

where µ(F) denotes the expected value of the distribution F and U(F) is the expected utility of the distribution F.

Consider an arbitrary distribution F that is concentrated on the two outcomes x1 and x2:

u(µ(F)) = u(p1x1 + p2x2) ⋛ p1u(x1) + p2u(x2) = U(F)

depending on whether the agent is risk-averse, risk-neutral or risk-loving.

A function u : R → R is concave, linear or convex according as

u(λx1 + (1 − λ)x2) ⋛ λu(x1) + (1 − λ)u(x2),   0 ≤ λ ≤ 1.

Conclusion: An expected-utility maximizing agent is risk-averse, risk-neutral or risk-loving according as u(x) is concave, linear or convex.
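A quick numerical check of this equivalence on a two-point distribution, comparing u(µ(F)) with U(F) for concave, linear and convex u; the numbers are illustrative.

```python
from math import sqrt

x1, x2, p1 = 10.0, 90.0, 0.5
p2 = 1.0 - p1
mu = p1 * x1 + p2 * x2

for name, u in [("concave sqrt", sqrt), ("linear", lambda x: x), ("convex square", lambda x: x * x)]:
    u_of_mean = u(mu)                     # u(mu(F))
    expected_u = p1 * u(x1) + p2 * u(x2)  # U(F)
    print(f"{name:14s} u(mu)={u_of_mean:10.2f}  U(F)={expected_u:10.2f}")
# concave: u(mu) > U(F) (risk-averse); linear: equal; convex: u(mu) < U(F) (risk-loving)
```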

Stochastic dominance

• Knowing the utility function, we have full information on preferences. Using the maximum expected utility criterion, we obtain a complete ordering of all the investments under consideration.

• What happens if we have only partial information on preferences (say, prefer more to less and/or risk aversion)?

• For example, in the First Order Stochastic Dominance Rule, we only consider the class of utility functions, call it U1, such that u′ ≥ 0. This is a very general assumption and it does not assume any specific utility function.

Dominance in U1

Investment A dominates investment B in U1 if for all utility functions u ∈ U1, E_A u(x) ≥ E_B u(x), or equivalently, U(FA) ≥ U(FB), and for at least one utility function the inequality is strict.

Efficient set in U1 (not being dominated)

An investment is included in the efficient set if there is no other

investment that dominates it.

Inefficient set in U1 (being dominated)

The inefficient set includes all inefficient investments. An inefficient investment is one for which there is at least one investment in the efficient set that dominates it.

The partition into efficient and inefficient sets depends on the choice

of the class of utility functions. In general, the smaller the efficient

set relative to the feasible set, the better for the decision maker.

First order stochastic dominance

Can we argue that Investment A is better than Investment B? It

is still possible that the return from investing in B is 11% but the

return is only 8% from investing in A.

• By looking at the cumulative probability distributions, we observe that for every return level, the probability of obtaining that return or less is at least as high under B as under A.

Recall that for each action a ∈ A, there is an induced probability distribution on C (the set of all consequences). To compare two choices of action, we examine their corresponding probability distributions.

Definition

A probability distribution F dominates another probability distribu-

tion G according to the first-order stochastic dominance if

F (x) ≤ G(x) for all x ∈ C.

Lemma

F dominates G by FSD if and only if

∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x)

for all strictly increasing expected utility indexes u(x).

Proof

Let a and b be the smallest and largest values that the outcomes under F and G can take on. Consider

∫_a^b u(x) d[F(x) − G(x)] = u(x)[F(x) − G(x)]|_a^b − ∫_a^b u′(x)[F(x) − G(x)] dx,

where the first term vanishes since F(a) = G(a) = 0 and F(b) = G(b) = 1. Hence

∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x) ⟺ −∫_a^b u′(x)[F(x) − G(x)] dx ≥ 0.

Thus, for u′(x) > 0,

F(x) ≤ G(x) for all x ⟺ ∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x) for all such u.

A ≽_FSD B iff rA =_d rB + α, where α ≥ 0.

That is, asset A's rate of return is equal in distribution to asset B's rate of return plus a non-negative random variable α.

This arises from the relation

E[u(1 + rA)] = E[u(1 + rB + α)] ≥ E[u(1 + rB)].
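For discrete distributions, FSD can be checked directly by comparing cumulative distribution functions. A minimal sketch; the two return distributions below are illustrative, with rA constructed as rB shifted up by a non-negative amount.

```python
def cdf(dist, x):
    """CDF of a discrete distribution given as {outcome: probability}."""
    return sum(p for v, p in dist.items() if v <= x)

def dominates_fsd(F, G):
    """True if F(x) <= G(x) at every outcome point in either support."""
    grid = sorted(set(F) | set(G))
    return all(cdf(F, x) <= cdf(G, x) for x in grid)

# rA equals rB shifted up by a non-negative amount, so A should dominate B by FSD.
B = {0.05: 0.5, 0.09: 0.5}
A = {0.06: 0.5, 0.11: 0.5}
print(dominates_fsd(A, B), dominates_fsd(B, A))   # True False
```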

Second order stochastic dominance

If both investments turn out the worst, the investor obtains 6%

from A and only 5% from B. If the second worst return occurs, the

investor obtains 8% from A rather than 9% from B. If he is risk averse, he should be willing to give up 1% of return at the higher return level in order to obtain an extra 1% at the lower return level.

If risk aversion is assumed, then A is preferred to B.

Definition

A probability distribution F dominates another probability distribution G according to the second order stochastic dominance if, for all x ∈ C,

∫_{−∞}^x F(y) dy ≤ ∫_{−∞}^x G(y) dy.

According to SSD, A is preferred over B since the sum of cumulative

probability for A is always less than or equal to that for B.

Theorem

If F dominates G by SSD, then

∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x)

for all increasing and concave expected utility indexes u(x).

Proof

∫_a^b u(x) d[F(x) − G(x)] = −∫_a^b u′(x)[F(x) − G(x)] dx

  = −u′(x) ∫_a^x [F(y) − G(y)] dy |_a^b + ∫_a^b u″(x) ∫_a^x [F(y) − G(y)] dy dx

  = −u′(b) ∫_a^b [F(y) − G(y)] dy + ∫_a^b u″(x) ∫_a^x [F(y) − G(y)] dy dx.

Given that u′(b) > 0 and u″(x) < 0,

∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x)   if   ∫_a^x [F(y) − G(y)] dy ≤ 0 for all x.

Example

F(x) = { 0 if x < 1,   x − 1 if 1 ≤ x ≤ 2,   1 if x ≥ 2 },

G(x) = { 0 if x < 0,   x/3 if 0 ≤ x ≤ 3,   1 if x ≥ 3 }.

F dominates G by SSD since

∫_{−∞}^x F(y) dy ≤ ∫_{−∞}^x G(y) dy for all x.

F (x) is seen to be more concentrated (less dispersed).
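The SSD condition of this example can be verified numerically by comparing the integrated CDFs on a grid; the following sketch is a grid-based check under a small numerical tolerance, not a proof.

```python
def F(x):
    return 0.0 if x < 1 else (x - 1.0 if x <= 2 else 1.0)

def G(x):
    return 0.0 if x < 0 else (x / 3.0 if x <= 3 else 1.0)

# Cumulative (trapezoidal) integrals of the CDFs on a fine grid over [-1, 4].
lo, hi, n = -1.0, 4.0, 5000
h = (hi - lo) / n
xs = [lo + i * h for i in range(n + 1)]

def integrated_cdf(cdf):
    vals, acc = [0.0], 0.0
    for i in range(1, len(xs)):
        acc += 0.5 * (cdf(xs[i - 1]) + cdf(xs[i])) * h
        vals.append(acc)
    return vals

IF, IG = integrated_cdf(F), integrated_cdf(G)
print(all(f <= g + 1e-5 for f, g in zip(IF, IG)))   # True: F dominates G by SSD on the grid
```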

Sufficient rules and necessary rules for second order stochastic

dominance

Sufficient rule 1: FSD rule is sufficient for SSD

Proof: If F dominates G by FSD, then F(x) ≤ G(x), ∀x. This implies

∫_a^x [G(y) − F(y)] dy ≥ 0 for all x.

Remark

The efficient set according to SSD is smaller than (a subset of) that of FSD. Since the SSD rule assumes risk aversion in addition to the assumptions of the FSD rule, some elements in the efficient set according to FSD may not remain in the efficient set of SSD.

Sufficient rule 2:

Min_F(x) > Max_G(x) is a sufficient rule for SSD.

Example

        F                G
   x      p(x)      x      p(x)
   5      1/2       2      3/4
  10      1/2       4      1/4

Min_F(x) = 5 ≥ Max_G(x) = 4, so that F(x) ≤ G(x). Hence, F dominates G.

Min_F(x) ≥ Max_G(x) ⇒ FSD ⇒ SSD ⇒ E_F u(x) ≥ E_G u(x), ∀u ∈ U2.

Necessary rule 1 (Geometric means)

Given a risky project with the distribution (xi, pi), i = 1, · · · , n, the geometric mean, Xgeo, is defined as

Xgeo = x1^{p1} · · · xn^{pn} = ∏_{i=1}^n xi^{pi},   xi ≥ 0.

Taking logarithms on both sides,

ln Xgeo = ∑_i pi ln xi = E[ln X].

Xgeo(F ) ≥ Xgeo(G) is a necessary condition for dominance of F over G by SSD.

Proof

Suppose F dominates G by SSD; then we have

E_F u(x) ≥ E_G u(x), ∀u ∈ U2.

Since ln x = u(x) ∈ U2,

E_F[ln x] = ln Xgeo(F) ≥ E_G[ln x] = ln Xgeo(G),

so we obtain ln Xgeo(F) ≥ ln Xgeo(G). Since the logarithm function is strictly increasing, we deduce Xgeo(F) ≥ Xgeo(G). Therefore,

F dominates G by SSD ⇒ Xgeo(F ) ≥ Xgeo(G).
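A quick check of this necessary condition on the two discrete distributions of the earlier example (F: 5 or 10 with equal probability; G: 2 with probability 3/4, 4 with probability 1/4); the helper name is illustrative.

```python
from math import exp, log

def geometric_mean(dist):
    """X_geo = prod_i x_i^{p_i} = exp(E[ln X]) for a discrete distribution {x: p}."""
    return exp(sum(p * log(x) for x, p in dist.items()))

F = {5: 0.5, 10: 0.5}       # dominates G (by FSD, hence by SSD)
G = {2: 0.75, 4: 0.25}

print(geometric_mean(F), geometric_mean(G))   # ~7.07 >= ~2.38, as the necessary rule requires
```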

Necessary rule 2 (left-tail rule)

Suppose F dominates G by SSD, then

MinF (x) ≥ MinG(x),

that is, the left tail of G must be “thicker”.

Proof by contradiction: Suppose Min_F(x) < Min_G(x), and write xk = Min_F(x). Just above xk, G is still zero while F is already positive. Hence, for any x with xk < x ≤ Min_G(x),

∫_{−∞}^x [G(y) − F(y)] dy = ∫_{−∞}^x [0 − F(y)] dy < 0,

implying that F does not dominate G by SSD. Hence, if F dominates G, then Min_F(x) ≥ Min_G(x).

Skewness and portfolio analysis (Third-order stochastic dominance)

Skewness is a measure of the asymmetry of a distribution, defined by µ3/σ³, where µ3 is the third central moment. For example, the normal distribution has zero skewness.

A log-normal return distribution exhibits positive skewness.

Empirical studies show that investors prefer positive skewness: all else being equal, they prefer a portfolio with a higher probability of very large payoffs.
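A small Monte Carlo sketch of the skewness measure µ3/σ³, illustrating that log-normal draws show positive sample skewness; the sample sizes and parameters are arbitrary.

```python
import random

def sample_skewness(xs):
    """Sample skewness: third central moment divided by sigma^3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

random.seed(0)
normal_draws = [random.gauss(0.0, 1.0) for _ in range(100_000)]
lognormal_draws = [random.lognormvariate(0.0, 0.5) for _ in range(100_000)]
print(sample_skewness(normal_draws))      # close to 0
print(sample_skewness(lognormal_draws))   # clearly positive
```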

Portfolio analysis is based on the first three moments of return

distribution rather than just mean and variance.

The utility of an agent can be constructed in terms of the moments

of the probability distributions.

For any distribution function p,

µ(p) = ∫ w dp(w),

σ²(p) = ∫ [w − µ(p)]² dp(w).

Question How can a utility function V (µ, σ) be justified in terms

of the expected utility theory?

Two possibilities

1. Placing restrictions on the probability distribution p.

2. Placing restrictions on the expected utility function u(·) defined

on consequences.

Formulation

Consider an expected utility function defined on wealth levels u(w)

and a wealth distribution function p. Write p(·|M) to represent

the distribution function p determined by M , where M is the set of

moments of the distribution. The expected utility of a probability distribution is then

V(M) ≜ U(p(·|M)) = ∫ u(w) dp(w|M).

The expected utility is a function of all moments of the distribution

p.

If a distribution is completely described by its first two moments (µ, σ), then the expected utility function based upon this distribution will be a function of these two moments. The normal distribution is the standard example of a distribution that is fully characterized by its first two moments.

Quadratic utility

— places no constraints on the distribution function p.

u(w) = αw² + w,   α ∈ R.

For an arbitrary p,

∫ u(w) dp(w) = α ∫ w² dp(w) + ∫ w dp(w) = α[σ²(p) + µ(p)²] + µ(p).

Hence, for a quadratic expected utility index u(w), the expected utility function depends exclusively on µ(p) and σ²(p).

When α < 0, u(w) is decreasing in w for w > −1/(2α) (violating the axiom of non-satiation). Also, for α < 0, the quadratic utility exhibits increasing absolute risk aversion:

Ra(w) = −2α/(2αw + 1)   and   R′a(w) = 4α²/(2αw + 1)² > 0.
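The identity E[u(w)] = α[σ²(p) + µ(p)²] + µ(p) can be verified on an arbitrary discrete distribution; a minimal sketch, where the distribution and α are arbitrary (α chosen small enough that u is increasing on the support).

```python
alpha = -0.01
u = lambda w: alpha * w ** 2 + w      # quadratic utility index; u'(w) > 0 for w < 50 here

# An arbitrary discrete wealth distribution {w: prob}.
p = {10.0: 0.2, 20.0: 0.5, 40.0: 0.3}

mu = sum(prob * w for w, prob in p.items())
var = sum(prob * (w - mu) ** 2 for w, prob in p.items())

expected_u = sum(prob * u(w) for w, prob in p.items())
via_moments = alpha * (var + mu ** 2) + mu
print(expected_u, via_moments)   # identical: expected utility depends only on mu and sigma^2
```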

Two-asset portfolio analysis – risky asset and riskfree asset

* absolute risk aversion and demand function for risky asset

Let a denote the number of units of risky asset,

b denote the number of units of riskfree asset.

rs = return from the risky asset in state s

R = return from the riskless asset.

Return from the portfolio (a, b) in state s

Ws(a, b) = rsa + Rb.

Let the price of the risky asset be q and the price of the riskless

asset be the numeraire.

The investor’s budget constraint is W0 = aq + b, where W0 is the

initial wealth of the investor; b = W0 − qa. We assume no short

selling so that a > 0.

Assume a finite set of states S = {1, · · · , s} with probability distri-

bution p = (p1, · · · , ps).

The optimization problem of an expected utility-maximizing investor: choose (a, b) to maximize

∑_{s∈S} ps u(Ws(a, b))   subject to   qa + b = W0.

Equivalently, choose a to maximize

∑_{s∈S} ps u(RW0 + (rs − Rq)a).

The first order condition is

∑_{s∈S} ps u′(RW0 + (rs − Rq)a)(rs − Rq) = 0.

If the investor is risk-averse, u″(·) is strictly negative, and the second order condition

∑_{s∈S} ps u″(RW0 + (rs − Rq)a)(rs − Rq)² < 0

is satisfied.

A solution to the first order condition must be a maximum if the

investor is risk averse.
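The first order condition can be solved numerically for the optimal holding a. Below is a minimal bisection sketch with log utility and two states; all parameter values (returns, prices, wealth) are illustrative, and a no-borrowing bound a ≤ W0/q is imposed for convenience.

```python
def foc(a, W0, R, q, states, u_prime):
    """First order condition: sum_s p_s * u'(R*W0 + (r_s - R*q)*a) * (r_s - R*q)."""
    return sum(p * u_prime(R * W0 + (r - R * q) * a) * (r - R * q) for r, p in states)

def optimal_a(W0, R=1.05, q=1.0, states=((1.40, 0.5), (0.80, 0.5)),
              u_prime=lambda w: 1.0 / w, tol=1e-10):
    """Bisection on the FOC, which is strictly decreasing in a when u'' < 0."""
    lo, hi = 0.0, W0 / q               # no short selling and (for convenience) no borrowing
    if foc(hi, W0, R, q, states, u_prime) > 0:
        return hi                      # corner: invest all wealth in the risky asset
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, W0, R, q, states, u_prime) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(optimal_a(100.0))   # about 60 units of the risky asset when W0 = 100
```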

Question

Is the demand for number of units of the risky asset increasing or

decreasing in initial wealth?

Define a(W0) = demand function for the risky asset, which is the

optimal solution to the portfolio choice problem.

Lemma

a′(W0) > 0 if R′a(x) < 0

a′(W0) = 0 if R′a(x) = 0

a′(W0) < 0 if R′a(x) > 0

Proof

Consider the derivative with respect to W0 of the first order condi-

tion:∑

s∈S

psu′′(RW0 + (rs −Rq)a(W0))(rs −Rq)R

+∑

s∈S

psu′′(RW0 + (rs −Rq)a(W0))(rs −Rq)2a′(W0) = 0.

Solving for a′(W0):

a′(W0) = − ∑

s∈S

psu′′(RW0 + (rs −Rq)a(W0))(rs −Rq)2

−1

R

s∈S

psu′′(RW0 + (rs −Rq)a(W0))(rs −Rq)

.

If the investor is risk-averse, u′′(·) < 0. Hence, the sign of a′(W0)

should be the same as the sign of∑

s∈S

psu′′(RW0 + (rs −Rq)a(W0)) (rs −Rq)︸ ︷︷ ︸

can be positive or negative

.

Recall the definition: Ra(x) = −u′′(x)u′(x)

; the above term can be ex-

pressed as

−∑

s∈S

psu′(RW0 + (rs −Rq)a(W0))(rs −Rq)

Ra(RW0 + (rs −Rq)a(W0)).

For all s ∈ S,

(rs − Rq)Ra(RW0) is greater than, equal to, or less than (rs − Rq)Ra(RW0 + (rs − Rq)a(W0))

according as R′a(x) is negative, zero, or positive.

Take the case R′a(x) < 0,

(i) for rs −Rq > 0

Ra(RW0) > Ra(RW0 + (rs −Rq)a(W0))

(ii) for rs −Rq < 0

Ra(RW0) < Ra(RW0 + (rs −Rq)a(W0)).

Easier to visualize if we write

y = rs −Rq, x0 = RW0, λ = a(W0) > 0.

We have

yRa(x0) > yRa(x0 + λy) iff R′a(x) < 0.

Lastly, consider R′a(x) < 0. The sign of a′(W0) depends on the sign of

−∑_{s∈S} ps u′(RW0 + (rs − Rq)a(W0))(rs − Rq) Ra(RW0 + (rs − Rq)a(W0))

> −Ra(RW0) ∑_{s∈S} ps u′(RW0 + (rs − Rq)a(W0))(rs − Rq) = 0   [due to the first order condition].

Hence, a′(W0) > 0. When absolute risk aversion is a decreasing function of wealth, investors invest more in the risky asset when the initial wealth level is higher.
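A numerical illustration of the lemma: with log utility, for which Ra(w) = 1/w is decreasing, the optimal risky holding a(W0) obtained from the first order condition increases with initial wealth. This reuses the bisection idea sketched above; all numbers are illustrative.

```python
def foc(a, W0, R, q, states):
    """FOC for log utility: sum_s p_s * (r_s - R*q) / (R*W0 + (r_s - R*q)*a)."""
    return sum(p * (r - R * q) / (R * W0 + (r - R * q) * a) for r, p in states)

def optimal_a(W0, R=1.05, q=1.0, states=((1.40, 0.5), (0.80, 0.5)), tol=1e-10):
    lo, hi = 0.0, W0 / q
    if foc(hi, W0, R, q, states) > 0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if foc(mid, W0, R, q, states) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

for W0 in (50.0, 100.0, 200.0, 400.0):
    print(W0, round(optimal_a(W0), 4))
# a(W0) increases with W0; in fact it is proportional to W0 here, since log utility has
# constant relative risk aversion, consistent with the elasticity result discussed next.
```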

Consider the two-asset portfolio again, where one asset is risky and the other is riskless. Define the elasticity of demand of the risky asset with respect to wealth by

η = (da/a) / (dW0/W0).

For a risk averse investor, show that η is less than, equal to, or greater than 1 according as the relative risk aversion

RR(W) = −W u″(W)/u′(W)

is increasing, constant, or decreasing in W.

Proof

Recall that W = W0(1 + rf) + a(r − rf) and

η = 1 + [(da/dW0)W0 − a]/a.

From the previous result on da/dW0, we have

η = 1 + [W0(1 + rf)E[u″(W)(r − rf)] + aE[u″(W)(r − rf)²]] / (aE[−u″(W)(r − rf)²])

  = 1 + E[u″(W){W0(1 + rf) + a(r − rf)}(r − rf)] / (aE[−u″(W)(r − rf)²])

  = 1 + E[u″(W)W(r − rf)] / (aE[−u″(W)(r − rf)²])

  = 1 + E[RR(W)u′(W)(r − rf)] / (aE[u″(W)(r − rf)²]).

Since u″(W) < 0 for a concave utility function, we have

sign(η − 1) = −sign(E[RR(W)u′(W)(r − rf)]).

Suppose RR(W) is an increasing function; then

RR(W) = RR(W0(1 + rf) + a(r − rf)) ≥ RR(W0(1 + rf)) when r ≥ rf, and < RR(W0(1 + rf)) when r < rf.

By the law of total expectation, we have

E[RR(W)u′(W)(r − rf)] = E[RR(W)u′(W)(r − rf) | r − rf ≥ 0] Prob(r − rf ≥ 0) + E[RR(W)u′(W)(r − rf) | r − rf < 0] Prob(r − rf < 0).

Since u′(W) > 0, the above comparisons give RR(W)(r − rf) ≥ RR(W0(1 + rf))(r − rf) when r − rf ≥ 0, and RR(W)(r − rf) > RR(W0(1 + rf))(r − rf) when r − rf < 0 (multiplying the reversed inequality by the negative quantity r − rf). Hence

E[RR(W)u′(W)(r − rf)] > RR(W0(1 + rf))E[u′(W)(r − rf)] = 0   [by the first order condition],

so that η < 1.
