Efficiency in Repeated Games Revisited: The Role of Private Strategies


  • 8/6/2019 Efficiency in Repeated Games Revisited the Role of Private Strategies

    1/22

Efficiency in Repeated Games Revisited: The Role of Private Strategies
Author(s): Michihiro Kandori and Ichiro Obara
Source: Econometrica, Vol. 74, No. 2 (Mar., 2006), pp. 499-519
Published by: The Econometric Society
Stable URL: http://www.jstor.org/stable/3598808
Accessed: 12/05/2009 10:02


    Econometrica, Vol. 74, No. 2 (March, 2006), 499-519

    NOTES AND COMMENTS

EFFICIENCY IN REPEATED GAMES REVISITED: THE ROLE OF PRIVATE STRATEGIES

BY MICHIHIRO KANDORI AND ICHIRO OBARA¹

Most theoretical or applied research on repeated games with imperfect monitoring has focused on public strategies: strategies that depend solely on the history of publicly observable signals. This paper sheds light on the role of private strategies: strategies that depend not only on public signals, but also on players' own actions in the past. Our main finding is that players can sometimes make better use of information by using private strategies and that efficiency in repeated games can be improved. Our equilibrium private strategy for repeated prisoners' dilemma games consists of two states and has the property that each player's optimal strategy is independent of the other player's state.

KEYWORDS: Efficiency, imperfect public monitoring, mixed strategy, partnership game, private equilibrium, private strategy, repeated game, two-state machine.

    1. INTRODUCTION

THE THEORY OF REPEATED GAMES under imperfect public monitoring provides a formal framework with which to explore the possibility of cooperation in long-term relationships, where each agent's action is indirectly and imperfectly observed through a public signal. It has been successfully applied to a number of economic problems: cartel enforcement, internal labor markets, and international policy coordination, to name a few. However, almost all existing works (including Abreu, Pearce, and Stacchetti (1990) and Fudenberg, Levine, and Maskin (1994)) focus on a simple class of strategies known as public strategies. In this paper, we show how players can make better use of information by using a more general class of strategies, private strategies, and show that efficiency in repeated games can often be improved.

A public strategy specifies a current action conditional only on the history of the public signal. A private strategy, by contrast, depends on one's own actions in the past as well as on the history of the public signal. To see why this generalization helps to improve efficiency, consider a simple repeated partnership game with two actions {C, D} and two outcomes of public signal {good, bad}, where the stage game payoffs have the structure of the prisoners' dilemma. Assume that the signal is rather insensitive to a deviation at the cooperative

¹We are grateful to a co-editor and two anonymous referees for their thoughtful comments. We also thank Drew Fudenberg and David Levine for an informative discussion. This paper stems from two independent papers: "Check Your Partner's Behavior by Randomization" by Kandori (1999) and "Private Strategy and Efficiency: Repeated Partnership Game Revisited" by Obara (1999). An earlier version of this paper is included in Chapter 1 of Obara's (2001) doctoral thesis. Obara is grateful to George Mailath and Andrew Postlewaite for their advice and encouragement throughout this project.



LEMMA 1: Suppose the efficient PPE that maximizes v1 + v2 plays q ∈ Q in the first period.⁸ The maximum total payoff v1 + v2 is bounded above by 2v*, where v* is defined by

(5) v* = max_{q ∈ [0,1]} { g(C, q) - d(q)/(L(q) - 1) }.

We denote the maximizer of the right-hand side of (5) by q*. Provided that v* > 0 and δ is large enough, the upper bound (5) is exactly achieved by the following symmetric trigger-strategy equilibrium: (a) (q*, q*) is played in the first period, (b) permanent reversion to (D, D) is triggered with a certain probability p after observing X, and (c) p is chosen to satisfy the incentive constraint (2) with equality. Note that we have vi = vi′ with this strategy profile and (2) binds with equality. Hence the upper bound formula (3) is exactly achieved. When q* = 0, (5) reduces to APM's (1991) formula for the best pure trigger-strategy equilibrium payoff.

Our formula (5) clarifies why mixed (public) strategies may help to improve efficiency. We can interpret g(C, q) = 1 - q - qh as the stage game payoff to be sustained and the last term d(q)/(L(q) - 1) = {(1 - q)d + qh}/(L(q) - 1) as the efficiency loss associated with the inefficient punishment. While taking the inefficient action D with a larger probability q reduces the stage game payoff g(C, q), it may improve the quality of monitoring (increase L(q)) and reduce the inefficiency associated with the punishment (it may or may not reduce the deviation gain d(q)). Thus q* may not be 0 in general. That is, a mixed trigger strategy may achieve a better outcome than the pure one.
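Formula (5) is straightforward to evaluate numerically. The sketch below grid-searches over q for one set of hypothetical parameter values (d, h, p0, p1, p2 are all assumed for illustration, not taken from the paper), using the parameterization of L(q) introduced in Section 3.3; with these values the maximizer q* is interior, so the mixed trigger strategy strictly beats the pure one:

```python
import numpy as np

# Hypothetical parameters (for illustration only): d = deviation gain,
# h = loss from being cheated; p0, p1, p2 = Pr(bad signal X) when 0, 1,
# or 2 players play D, with p1**2 < p0*p2 so that L(q) is increasing.
d, h = 0.1, 0.1
p0, p1, p2 = 0.45, 0.5, 0.95

def L(q):
    # Likelihood ratio (4): detectability of a unilateral deviation
    # when the opponent plays D with probability q.
    return ((1 - q) * p1 + q * p2) / ((1 - q) * p0 + q * p1)

def payoff_bound(q):
    # Right-hand side of (5): the stage payoff g(C, q) minus the
    # efficiency loss d(q)/(L(q) - 1) from inefficient punishment.
    g = 1 - q - q * h              # g(C, q)
    dev = (1 - q) * d + q * h      # d(q), the deviation gain
    return g - dev / (L(q) - 1)

qs = np.linspace(0.0, 1.0, 100_001)
vals = payoff_bound(qs)
q_star = qs[np.argmax(vals)]
v_star = vals.max()
print(f"q* = {q_star:.3f}, v* = {v_star:.3f}")            # q* is interior here
print(f"pure trigger strategy (q = 0): {payoff_bound(0.0):.3f}")
```

With these assumed probabilities, L(q) rises quickly in q while the deviation gain stays flat (d = h), so a strictly mixed q* dominates the pure trigger strategy.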

When D makes the signal more informative, however, we may further improve efficiency by a private strategy that imposes a penalty only after D is played. In what follows, we construct private equilibria that take advantage of this potential source of efficiency.

    3.2. An Example of Efficient Private Equilibrium

We first present a special case of the model above, where private equilibria asymptotically achieve full efficiency, while the PPE payoffs are bounded away from the efficient frontier.

Let us assume 0 < p(Y|C, C) < 1, 0 < p(Y|D, D) < 1, and p(Y|C, D) = p(Y|D, C) = 0. Consider the following private strategy, which starts at phase (I) below.

    (I) Mix C and D. Choose D with a (small) probability q E (0, 1).

⁸The most efficient PPE may not use an action profile in Q in the first period. Corollary 1 addresses this issue and derives an upper bound in a special class of our model.


(II) If the signal is Y and one's own action was D, play D forever. Otherwise, go to (I).

Note that, under the assumption that p(Y|C, D) = p(Y|D, C) = 0, when a player chose D and observes Y, it is common knowledge that the other player also chose D (and, of course, observes Y). The symmetric equilibrium payoff v satisfies

(6) v = (1 - δ)(1 - q - qh) + δv

    and

(7) v = (1 - δ)(1 - q)(1 + d) + δ{1 - qp(Y|D, D)}v.

Equation (6) represents the average payoff when a player plays C today (while the opponent is employing the above strategy). Note that punishment is surely avoided in this case. Equation (7) shows the average payoff when the player chooses D today. In this case, punishment is triggered when the opponent also chooses D and the signal is Y, which happens with probability qp(Y|D, D). Equations (6) and (7), taken together, imply that the player is indifferent between choosing C and D. From (6), we have

(8) v = 1 - q - qh.

Also, by (6) and (7) we obtain (1 - δ){(1 - q)d + qh} = δqp(Y|D, D)v. This and (8) result in a quadratic equation in q, (1 - δ){(h - d)q + d} = δqp(Y|D, D)(1 - q - qh). Direct calculation shows that there is a root of this equation in (0, 1), which tends to 0 as δ → 1. Equation (8) then implies that, as q tends to 0, the average payoff tends to 1, the payoff from full cooperation.
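The quadratic above is easy to solve explicitly. The sketch below (with hypothetical values for d, h, and p(Y|D, D), chosen only for illustration) picks the smaller root in (0, 1) and confirms numerically that q → 0 and the payoff v = 1 - q - qh → 1 as δ → 1:

```python
import numpy as np

# Hypothetical parameters (for illustration only).
d, h = 0.3, 0.4
pDD = 0.6  # p(Y|D, D): probability of signal Y after mutual defection

def mixing_prob(delta):
    # Indifference condition rearranged as a quadratic in q:
    # delta*pDD*(1+h)*q**2 + ((1-delta)*(h-d) - delta*pDD)*q + (1-delta)*d = 0.
    a = delta * pDD * (1 + h)
    b = (1 - delta) * (h - d) - delta * pDD
    c = (1 - delta) * d
    roots = np.roots([a, b, c])
    # Take the smaller root in (0, 1); it tends to 0 as delta -> 1.
    return min(r.real for r in roots if 0 < r.real < 1)

for delta in (0.9, 0.99, 0.999):
    q = mixing_prob(delta)
    v = 1 - q - q * h  # equation (8): the symmetric equilibrium payoff
    print(f"delta = {delta}: q = {q:.5f}, v = {v:.5f}")
```

As δ rises toward 1, the printed q shrinks toward 0 and v approaches 1, the full-cooperation payoff, exactly as the text argues.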

Now we show that all the PPE payoffs are bounded away from (1, 1). Consider the most efficient PPE (v1⁰, v2⁰) that maximizes v1 + v2, and let us define V(Q) = {(g1(q), g2(q)) | q ∈ Q}. Note that V(Q) contains all feasible points sufficiently close to (1, 1). Suppose that (v1⁰, v2⁰) lies in this neighborhood of (1, 1). Now consider the current and continuation payoffs associated with the most efficient PPE: (v1⁰, v2⁰) = (1 - δ)(g1⁰, g2⁰) + δ(v1′, v2′). Since v1′ + v2′ ≤ v1⁰ + v2⁰ by definition, we have g1⁰ + g2⁰ ≥ v1⁰ + v2⁰. The last inequality implies that (g1⁰, g2⁰) also lies in the neighborhood of (1, 1), and hence in V(Q). In other words, the most efficient PPE plays a profile q ∈ Q in the first period. Then the payoff upper bound (5) applies, which contradicts our premise that v1⁰ + v2⁰ is arbitrarily close to 1 + 1. Hence we obtain the following result.

PROPOSITION 1: In the game defined above, there is a private equilibrium that asymptotically attains the efficient point (1, 1) as δ → 1, while any perfect public equilibrium payoff profile is bounded away from (1, 1) for all δ.


Whereas it is much easier to detect the opponent's defection when one defects herself, it is more efficient to trigger a punishment only after such a (private) history. More precisely, private strategies allow players to start a punishment after the realization of the action-signal pair for which the likelihood ratio (with respect to a defection) is maximized. This high likelihood ratio helps to reduce the inefficiency term in the PPE payoff formula (5). For this particular example, the inefficiency term indeed vanishes completely because the likelihood ratio p(Y|D, D)/p(Y|C, D) is infinite.

In this example, the public signal does not have a full support. As a result, it becomes common knowledge to start a mutual punishment after D is played and Y is observed. If the public signal has full support, neither player is sure whether the opponent is in the punishment mode. Therefore specifying the optimal action at each history can potentially be a formidable task. We address this issue next.

    3.3. Two-State Machine Private Equilibrium

In this section, we demonstrate how to construct a private equilibrium when the signal has full support. We assume 0 < p(X|CC) < p(X|DC) = p(X|CD) < p(X|DD) < 1, which implies that the bad signal X is more likely to realize when more players defect. This assumption is commonly employed in partnership games, including RMM (1986). We also assume that the opponent's defection is easier to detect when one is playing D:

(9) p(X|D, C)/p(X|C, C) < p(X|D, D)/p(X|C, D).

This is a natural assumption, which implies a form of decreasing returns to scale.⁹ Let us denote p(X|CC) = p0, p(X|CD) = p(X|DC) = p1, and p(X|DD) = p2 in this section. With this notation, the likelihood ratio defined by (4) is expressed as

L(q) = {(1 - q)p1 + qp2}/{(1 - q)p0 + qp1},

which is strictly increasing in q under our likelihood assumption (9). Now consider the following private strategy, which we call a two-state machine. The strategy has two states, R and P, and it begins with state R. Furthermore, it has the following structure (see Figure 1):

⁹Condition (9) implies that the probability of "success" Y increases more for the first input of "effort" C than for the second input of effort. That is, p(Y|CD) - p(Y|DD) > p(Y|CC) - p(Y|CD).
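A quick numerical check (with assumed values for p0, p1, p2, not taken from the paper) illustrates both claims: condition (9), i.e. p1/p0 < p2/p1, makes L(q) strictly increasing, and since p1 < sqrt(p0·p2) ≤ (p0 + p2)/2, it also delivers the decreasing-returns property of footnote 9:

```python
import numpy as np

# Assumed signal probabilities satisfying the likelihood condition (9):
# p1/p0 < p2/p1, equivalently p1**2 < p0*p2.
p0, p1, p2 = 0.1, 0.3, 0.95
assert p1 / p0 < p2 / p1

def L(q):
    # Likelihood ratio from Section 3.3.
    return ((1 - q) * p1 + q * p2) / ((1 - q) * p0 + q * p1)

qs = np.linspace(0.0, 1.0, 1001)
Ls = L(qs)
print("L(q) strictly increasing:", bool(np.all(np.diff(Ls) > 0)))

# Footnote 9: p1 < sqrt(p0*p2) <= (p0 + p2)/2 implies 2*p1 < p0 + p2,
# i.e. p(Y|CD) - p(Y|DD) > p(Y|CC) - p(Y|CD).
print("decreasing returns:", p2 - p1 > p1 - p0)
```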


FIGURE 1. The two-state machine. R is the reward state (qR is very small) and P is the punishment state (qP is very large); the transition from R to P occurs (with probability pR) after (D, X), and the transition from P to R occurs (with probability pP) after (D, Y).

* State R (state to reward the opponent): Choose D with probability qR (a small number). Go to state P with probability pR ∈ (0, 1) if D was taken and X was observed; otherwise, stay in state R.

* State P (state to punish the opponent): Choose D with probability qP (a large number). Go to state R with probability pP ∈ (0, 1) if D was taken and Y was observed; otherwise, stay in state P.

First note that this private strategy shares a feature similar to the strategy described in the previous section. Each player moves to state P only after (D, X): the most informative action-signal pair of defection. Similarly, the players move to state R only after (D, Y), which can be shown to be the most informative action-signal pair of cooperation. Second, note that there is always strategic uncertainty. Neither player knows exactly what the other player's current state is, and her belief is going to be updated all the time. How can we check if this machine is a best response to itself at every history given such ever-changing beliefs? To resolve this problem, we choose (qR, qP, pR, pP) in such a way that, no matter which state player 2 is in, player 1 is always indifferent between choosing C and D. This means that any repeated game strategy is a best response to the machine, hence so is the machine itself.

Since the game and the strategy are symmetric, we suppress subscripts whenever possible. A set of parameters (qR, qP, pR, pP) is chosen to satisfy the following four equations.

When player 2 is in state R, the equilibrium conditions for player 1 are as follows.

    * Player 1 plays C today:

(10) VR = (1 - δ)(1 - qR - qRh) + δ{(1 - qRp1pR)VR + qRp1pRVP}.


    * Player 1 plays D today:

(11) VR = (1 - δ)(1 - qR)(1 + d) + δ{(1 - qRp2pR)VR + qRp2pRVP}.

When player 2 is in state P, the equilibrium conditions for player 1 are as follows.

* Player 1 plays C today:

(12) VP = (1 - δ)(1 - qP - qPh) + δ[qP(1 - p1)pPVR + {1 - qP(1 - p1)pP}VP].

    * Player 1 plays D today:

(13) VP = (1 - δ)(1 - qP)(1 + d) + δ[qP(1 - p2)pPVR + {1 - qP(1 - p2)pP}VP],

where VZ is the average payoff for player 1 when player 2 is in state Z = R, P.

Equations (10) and (11) imply that player 1 is indifferent between C and D when player 2 is in state R. A similar explanation applies to (12) and (13). This system of equations implies that player 2's state alone determines player 1's continuation payoff completely, and vice versa.

Each solution for this system of equations corresponds to an equilibrium two-state machine. Since the above conditions consist of four equations (10)-(13) and six unknowns (VZ, qZ, pZ, Z = R, P), there is a manifold of two-state machine equilibria. Here we pick the solution that corresponds to the most efficient two-state machine.

In the Appendix, we prove the following results. When δ is close to 1, this system has a solution such that (i) the probabilities (qZ, pZ, Z = R, P) are in [0, 1], (ii) qP can be set to 1 and qR → 0 as δ → 1, and (iii) VR > VP under the assumption p2 - p1 > p1d + (1 - p2)h. By a manipulation similar to the one used to derive the formula for the trigger-equilibrium payoff (5), we can obtain

VR = 1 - qR - qRh - {(1 - qR)d + qRh}/(L(1) - 1).

Hence property (ii) means that a payoff arbitrarily close to 1 - d/(L(1) - 1) can be achieved as a PE as δ → 1. In summary we obtain the following result.

PROPOSITION 2: Suppose p2 - p1 > p1d + (1 - p2)h. Then there is a two-state machine private equilibrium whose payoff tends to 1 - d/(L(1) - 1) as δ → 1.

Note that our equilibrium payoff uses L(1), the highest likelihood ratio to detect deviation: L(1) = max_{q ∈ [0,1]} L(q). Otherwise it is exactly like the formula (5) for the best trigger-strategy equilibrium payoff. The advantage of the


i = 1, 2). Thus the Fudenberg-Levine-Maskin folk theorem applies. In this case, efficiency can be supported by asymmetric punishments in public strategies without using a mutual punishment. On the other hand, this example is similar to the one in Section 3.2, where signal Y arises only when both players take the same action.¹² Therefore, by a similar construction, (1, 1) can also be achieved by a PE in the limit as δ → 1. (See the on-line supplemental material (Kandori and Obara (2006)) for details.)

Hence, in this example, both PPE and PE asymptotically achieve efficiency as δ → 1. However, we can show that PE does better than any PPE for all sufficiently large δ < 1 if ε is small enough. Note that our PE triggers a punishment after a realization of Y, whose probability is independent of ε. Hence the payoff associated with our PE is not affected by ε. On the other hand, smaller ε requires a larger δ to approximate the efficiency with PPE for the following reason. When ε is small, different players' defections are statistically discriminated only by a small margin. This means that, for example, when signal X1 is realized, a large future payoff should be transferred from player 1 to player 2. For this to be feasible, the discount factor should be very close to 1. This is shown in Figure 2: the solid curve represents the private equilibrium payoff, while all the PPE payoffs are below the dotted line if δ is less than a certain threshold.

FIGURE 2. The private equilibrium payoff (solid curve) and the upper bound on PPE payoffs (dotted line), plotted against δ (horizontal axis from 0.995 to 1; vertical axis, payoff, from 0 to 0.8).

¹²We can modify this example so that the signal has full support and still show that a PE dominates any PPE by using the two-state machine equilibrium from Section 3.3.


Now we derive the upper bound of PPE in Figure 2. Let (v1⁰, v2⁰) be a maximizer of v1 + v2 over all PPE average payoff profiles. When v1⁰ ≠ v2⁰, the best symmetric PPE is achieved by publicly randomizing between (v1⁰, v2⁰) and (v2⁰, v1⁰) (the latter exists because of the symmetry of the game). Let gi(q1, q2) and vi(ω) be the current payoffs and the continuation payoffs to achieve (v1⁰, v2⁰). The best symmetric PPE payoff v satisfies

(14) v = (1 - δ){g1(q1, q2) + g2(q1, q2)}/2 + δ Σ_ω {(v1(ω) + v2(ω))/2} p(ω|q1, q2).

When the best symmetric PPE does not need public randomization, the same formula still applies. In either case, the best symmetric PPE payoff v satisfies

(15) 2v = v1⁰ + v2⁰ = max_{(v1, v2) ∈ V_PPE} v1 + v2,

where V_PPE is the set of all PPE payoff profiles. Rearranging formula (14), we obtain

(16) 2v = g1(q1, q2) + g2(q1, q2) + Σ_ω {Δ1(ω) + Δ2(ω)} p(ω|q1, q2),

where Δi(ω) = {δ/(1 - δ)}(vi(ω) - vi⁰) for i = 1, 2. The continuation payoffs (v1(ω), v2(ω)) must be in V_PPE and, by the definition of Δi(ω), this requirement is expressed as

(17) ∀ω, {(1 - δ)/δ}(Δ1(ω), Δ2(ω)) + (v1⁰, v2⁰) ∈ V_PPE.

The definition of Δi(ω) and v1(ω) + v2(ω) ≤ v1⁰ + v2⁰ (by (15)) imply Δ1(ω) + Δ2(ω) ≤ 0 for all ω. From equation (16), we can also derive a lower bound of Δ1(ω) + Δ2(ω): for some positive constant L > 0 that is independent of ε, we have -L ≤ Δ1(ω) + Δ2(ω) for all ω. This is because (i) 2v and g1(q1, q2) + g2(q1, q2) are bounded and (ii) p(ω|q1, q2) is bounded below by a positive constant that is independent of ε (Lemma 1 in Kandori and Obara (2006)).

Next we show that large variations in (Δ1(ω), Δ2(ω)) are required to sustain v > 0 when the distinguishability parameter ε is small. Consider

D = {(Δ1, Δ2) | -L ≤ Δ1 + Δ2 ≤ 0 and Δi ≥ K for i = 1 or 2}.

For any (large) constant K, we can find a (small enough) ε̄ > 0 such that sustaining any v > 0 requires (Δ1(ω), Δ2(ω)) ∈ D for some ω if ε < ε̄ (Lemma 2 in Kandori and Obara (2006)).


Hence, the feasibility condition (17) implies that v = (v1⁰ + v2⁰)/2 > 0 is sustained only if ({(1 - δ)/δ}D + (v1⁰, v2⁰)) ∩ V_PPE ≠ ∅. This, in turn, is satisfied only if ({(1 - δ)/δ}D + (v1⁰, v2⁰)) ∩ V_F ≠ ∅, where V_F is the feasible payoff set. For any given v > 0, direct calculation shows that a lower bound of δ to satisfy the condition is given by δ(v) = (3K - L)/{3K - L + 2(1 - v)} (the on-line proof contains the derivation). Then the inverse function of δ(·),

u(δ) = 1 - {(1 - δ)/(2δ)}(3K - L),

provides an upper bound of PPE payoffs. Note that, by construction, K can be arbitrarily large by a suitable choice of ε, while L is independent of ε. The dotted line in Figure 2 depicts u(δ) with (3K - L)/2 = 500.

    4. GENERAL TWO-STATE MACHINE

Is our two-state machine, defined so far for the prisoner's dilemma, applicable to a more general two-person game? More specifically, in any given game, which action profiles (if any) can be supported by a two-state machine equilibrium? To address those issues, we present a systematic way to find a private equilibrium for general two-person games.

We use R and P to denote "reward" and "punishment" states as in Section 3.3.¹³ Let g: A1 × A2 → ℝ² be the stage game payoff function, and let A_i^Z denote the support of the equilibrium mixed action α_i^Z for state Z = R, P, with A_i* = A_i^R ∪ A_i^P. As before, player i moves between the states based on realizations of the public signal and her private action so that player j is indifferent among A_j* = A_j^R ∪ A_j^P (and does not gain by playing any other action). In the partnership game example in Section 3.3, A^R = A* = {C, D} and A^P = {D}. Our characterization result shows that such a two-state machine is an equilibrium for large δ if and only if the following simple system of linear inequalities holds:

(LI) For i, j = 1, 2 and j ≠ i, there exist V_i^R, V_i^P ∈ ℝ, x_i^R: Ω × A_j^R → [0, ∞), and x_i^P: Ω × A_j^P → [0, ∞) such that

(18) ∀ a_i ∈ A_i*, V_i^R = g_i(a_i, α_j^R) - E[x_i^R(ω, a_j) | a_i, α_j^R],

(19) ∀ a_i ∉ A_i*, V_i^R ≥ g_i(a_i, α_j^R) - E[x_i^R(ω, a_j) | a_i, α_j^R],

¹³We can think of a more general machine that consists of N_i < ∞, i = 1, 2, states. In state n, player 1 randomizes over A_1^n and moves to the other states based on a realization of an action-signal pair so that player 2 is indifferent among all the actions in A_2* = ∪_{n=1}^{N_2} A_2^n at every state, and vice versa. In the discussion version of this paper (Kandori and Obara (2003)), we showed that such a machine with countable (possibly infinite) states can be reduced to a two-state machine without loss of generality.


violated because

max_{a_i ∈ A_i} g_i(a_i, α_j^P) ≥ min_{α_j ∈ ΔA_j} max_{a_i ∈ A_i} g_i(a_i, α_j)

= max_{α_j ∈ ΔA_j} min_{a_i ∈ A_i*(= A_i)} g_i(a_i, α_j)

≥ min_{a_i} g_i(a_i, α_j^P),

where ΔA_j is the set of mixed actions of player j.

    5. RELATED LITERATURE AND COMMENTS

    Private Monitoring

Our two-state machine also works for the private monitoring case, where each player i observes a private signal ω_i. If we modify our two-state machine strategy for player i in such a way that player i uses ω_i in place of the public signal ω, we can see that this constitutes an equilibrium under private monitoring as long as the marginal distribution of ω_i is identical to the public signal distribution. Our private strategies work because neither player needs to know the other player's state. Ely and Valimaki (2002) independently found a similar two-state machine strategy in the framework of repeated games with private monitoring. As in this paper, a player is indifferent among all the repeated game strategies, regardless of the state the opponent is in. The idea behind these strategies goes back to Piccione (2002), which is essentially based on a machine with a countably infinite number of states.

However, there is an important difference between our paper and that by Ely and Valimaki (2002). In their paper, a pure action is played at each state. Note that it is difficult to embody our efficient punishment (which occurs only when the "monitoring" action D is played) in such a formulation. This is because a player would be more tempted to defect when the opponent is not likely to be in the state to play the monitoring action D. To utilize the most informative action-signal pair to sustain cooperation without being noticed by the opponent, one needs to play a mixed action at the reward state. Indeed, Ely and Valimaki's two-state machine, which uses a pure action in each state, can sometimes be strictly improved by using a mixed action.

A subsequent work by Ely, Hörner, and Olszewski (2005) generalizes the above ideas for two-player repeated games with private monitoring. Their formulation is more general than ours given by (LI). In our two-state machine, A^R and A^P are fixed throughout the game, whereas Ely, Hörner, and Olszewski (2005) allow them to vary over time or according to a realization of a public


correlation device. With such a generalization, they showed that the folk theorem holds for some class of stage games (but not for a general game) when monitoring is private and almost perfect.

Private Strategy

Exercise 5.10 in Fudenberg and Tirole (1991) is an early example of a game in which a private strategy makes a difference. It is a two-period repeated game with three players. Using a combination of private actions and public signals in the first period as a correlation device, players 1 and 2 can punish player 3 with the correlated minmax action profile in the second period. This severe punishment, which is not available for any PPE, deters player 1's deviation from the efficient action in the first period.

Lehrer (1991) also used a private strategy as an endogenous correlation device (internal correlation) in repeated games without discounting to support correlated equilibrium payoffs.

Mailath, Matthews, and Sekiguchi (2002) showed three examples of two-period repeated games with imperfect public monitoring in which a PE dominates any PPE. In their first example, the first period serves as a correlation device, as in Fudenberg and Tirole's example, but the first period generates correlated signals to support the efficient correlated equilibrium in the second period (as in Lehrer (1991)). In their second and third examples, the efficient action profile can be supported in the first period only for some PE, because a harsh punishment is available for the PE. These two examples differ from Fudenberg and Tirole's example because they are based on an incorrect belief off the equilibrium path, not on correlated punishments. In both examples, when a player deviates, the other player responds with a tough punishment due to her incorrect belief. Our equilibrium is different from these and the Fudenberg and Tirole example, because we focus on the efficient use of information, whereas their examples emphasize the magnitude of possible punishments. Note that the magnitude of possible punishments becomes less of a problem as δ → 1 for infinitely repeated games.

Some papers use a certain type of private strategy whereby players randomize over their actions and send a message based on their realizations. Kandori (2003) used such a strategy to show that Fudenberg, Levine, and Maskin's sufficient condition for the folk theorem can be relaxed when players can communicate. Ben-Porath and Kahneman (2003) used a similar strategy to prove a folk theorem with costly private monitoring and communication. Obara (2003) also used a similar trick to achieve full surplus extraction with adverse selection and moral hazard in the context of mechanism design.

    Open Issue

One open question remains. Although we were able to show that a PE can be far more efficient than any PPE, we have not yet characterized the best private equilibrium payoff. This is due to the lack of recursive structure in private monitoring equilibria, which makes characterization of all private equilibria difficult (see Kandori (2002)). In general, when PPE are inefficient, is there also an efficiency bound for private equilibria? Alternatively, do private equilibria achieve full efficiency? This is an important topic for future research.

Faculty of Economics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan; kandori@e.u-tokyo.ac.jp

and

Dept. of Economics, UCLA, 405 Hilgard Ave., Los Angeles, CA 90095-1477, U.S.A.; iobara@econ.ucla.edu; http://www.econ.ucla.edu/iobara/.

    Manuscript received February, 2004; final revision received November, 2005.

    APPENDIX: PROOFS

PROOF OF PROPOSITION 2: From (10) and (11), we can obtain

(24) (1 - δ){(1 - qR)d + qRh} = δqRpR(p2 - p1)(VR - VP).

As before, we can use this equation to derive, from (10), the equation

(25) VR = 1 - qR - qRh - {(1 - qR)d + qRh}/(L(1) - 1).

Similarly, we can derive two equations from (12) and (13):

(26) VP = 1 - qP - qPh + [{(1 - qP)d + qPh}/(L(1) - 1)]·[(1 - p1)/p1],

(27) (1 - δ){(1 - qP)d + qPh} = δqPpP(p2 - p1)(VR - VP).

This system of equations is equivalent to (10)-(13). First note that pR should be set equal to 1. If there exists a solution of these equations with pR < 1, then we can reduce qR to increase VR via (25) while increasing pR and reducing pP so that (24) and (27) are still satisfied. Note also that qP can be set equal to 1. If not, we can increase qP to reduce VP via (26), while lowering pR and pP so that (24) and (27) are satisfied. This leads to

VP = (1 - p2)h/(p2 - p1)

from (26). Now we are left with three equations ((24), (25), and (27)) and three unknowns (qR, pP, VR). Once qR is obtained, VR is also obtained from (25), and

pP = qRh/{(1 - qR)d + qRh} ∈ [0, 1]


is obtained from (24) and (27). Thus we need only to find q_R ∈ [0, 1].

These three equations reduce to a quadratic equation for q_R: F(q_R, δ) ≡ c_2(δ)q_R^2 + c_1(δ)q_R + c_0(δ) = 0, where c_2(δ) = δ{p_2(1 + h) − p_1(1 + d)}, c_1(δ) = (1 − δ)(h − d) + δ{p_1d + (1 − p_2)h − (p_2 − p_1)}, and c_0(δ) = (1 − δ)d. One root of this quadratic equation is clearly q_R = 0 when δ = 1. Because ∂F/∂q_R|_{(q_R,δ)=(0,1)} ≠ 0 by the assumption p_2 − p_1 > p_1d + (1 − p_2)h, the implicit function theorem can be applied to obtain a C^1 function q_R(δ) around δ = 1 such that

dq_R(1)/dδ = −(∂F/∂δ)/(∂F/∂q_R)|_{(q_R,δ)=(0,1)} = d / [p_1d + (1 − p_2)h − (p_2 − p_1)],

which is negative by assumption. Thus there exists a q_R(δ) ∈ (0, 1) for large enough δ such that q_R(δ) → 0 as δ → 1. Hence we obtain a solution for (24)-(27) parameterized by δ around δ = 1. Because lim_{δ→1} V_R(δ) = 1 − d/(L(1) − 1) is larger than V_P = (1 − p_2)h/(p_2 − p_1) by the assumption p_2 − p_1 > p_1d + (1 − p_2)h, this two-state machine is a sequential equilibrium for large δ, combined with the belief obtained via Bayes' rule.14 Therefore, for any η > 0, we can find δ̄ such that the payoff of this PE exceeds 1 − d/(L(1) − 1) − η for any δ ∈ (δ̄, 1). Q.E.D.

    PROOF OF PROPOSITION 3: Consider the following transition rule for

player j: go to state P with probability p_z^j(ω, a_j) when the current state (for j) is z = R, P; otherwise, go to state R. The equilibrium condition for player i when j is in state z = R, P is

(28) V_z^i ≥ (1 − δ)g_i(a_i, a_j) + δE[(1 − p_z^j(ω, a_j))V_R^i + p_z^j(ω, a_j)V_P^i | a_i, a_j],

where the equality should be satisfied for a_i ∈ A* = A_R ∪ A_P. Consider first the case z = R. Subtracting δV_R^i from both sides and dividing by (1 − δ), we obtain

V_R^i ≥ g_i(a_i, a_j) − E[(δ/(1 − δ))p_R^j(ω, a_j)(V_R^i − V_P^i) | a_i, a_j],

where the equality holds for a_i ∈ A*. A similar manipulation for state z = P leads to

V_P^i ≥ g_i(a_i, a_j) + E[(δ/(1 − δ))(1 − p_P^j(ω, a_j))(V_R^i − V_P^i) | a_i, a_j],

14 Beliefs can be simply derived by Bayes' rule at any history. Because no deviation is observable to the opponent, a player always updates her belief by assuming that the opponent has never deviated.


where the equality holds for a_i ∈ A*. Hence, if we have a two-state machine strategy equilibrium, conditions (18)-(22) are satisfied with

(29) x_R^j(ω, a_j) = (δ/(1 − δ))p_R^j(ω, a_j)(V_R^i − V_P^i)

    and

(30) x_P^j(ω, a_j) = (δ/(1 − δ))(1 − p_P^j(ω, a_j))(V_R^i − V_P^i).

Note that V_R^i > V_P^i for a nontrivial two-state machine equilibrium.

Conversely, suppose that conditions (18)-(22) are satisfied. Then (29) and (30) can be satisfied for p_z^j(ω, a_j) ∈ [0, 1], z = R, P, for sufficiently high δ. Hence we obtain the equilibrium condition (28) and the two-state machine equilibrium to support payoffs (V_R^i, V_P^i) for i = 1, 2. Q.E.D.
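The change of variables in (29) and (30) can also be verified numerically. In the sketch below, all numbers are hypothetical stand-ins for one action profile (a_i, a_j), not values from the paper. It checks the identity behind the manipulation above: the gap between V_z^i and the right-hand side of (28) equals (1 − δ) times the gap in the transformed inequality, so the two incentive conditions are equivalent.

```python
# Hypothetical stand-ins for one action profile (a_i, a_j); none of
# these numbers come from the paper.
delta = 0.95
VR, VP = 0.8, 0.2          # continuation values with VR > VP
g = 1.3                    # stage payoff g_i(a_i, a_j)
sig = [0.7, 0.3]           # Pr(omega | a_i, a_j) over two signals

def gaps(p_to_P, z):
    """Compare (28) with its transformed version for state z.

    p_to_P[w] is j's probability of moving to state P after signal w.
    Returns (V_z - rhs of (28), (1 - delta) * (V_z - transformed rhs)),
    using x_R from (29) for z = 'R' and x_P from (30) for z = 'P'.
    """
    Ep = sum(q * p for q, p in zip(sig, p_to_P))
    rhs28 = (1 - delta) * g + delta * ((1 - Ep) * VR + Ep * VP)
    if z == 'R':
        Ex = (delta / (1 - delta)) * Ep * (VR - VP)       # E[x_R]
        return VR - rhs28, (1 - delta) * (VR - (g - Ex))
    Ex = (delta / (1 - delta)) * (1 - Ep) * (VR - VP)     # E[x_P]
    return VP - rhs28, (1 - delta) * (VP - (g + Ex))
```

Because the identity holds term by term for any signal distribution, any transition rule p_z^j(ω, a_j) ∈ [0, 1] generates admissible transfers x_z, and conversely the transfers can be converted back into probabilities once δ is high enough, as the proof states.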

    REFERENCES

ABREU, D., P. MILGROM, AND D. PEARCE (1991): "Information and Timing in Repeated Partnerships," Econometrica, 59, 1713-1733.

ABREU, D., D. PEARCE, AND E. STACCHETTI (1990): "Toward a Theory of Discounted Repeated Games with Imperfect Monitoring," Econometrica, 58, 1041-1063.

BEN-PORATH, E., AND M. KAHNEMAN (2003): "Communication in Repeated Games with Costly Monitoring," Games and Economic Behavior, 44, 227-250.

ELY, J. C., J. HÖRNER, AND W. OLSZEWSKI (2005): "Belief-Free Equilibria in Repeated Games," Econometrica, 73, 377-415.

ELY, J. C., AND J. VÄLIMÄKI (2002): "A Robust Folk Theorem for the Prisoner's Dilemma," Journal of Economic Theory, 102, 84-105.

FUDENBERG, D., D. K. LEVINE, AND E. MASKIN (1994): "The Folk Theorem with Imperfect Public Information," Econometrica, 62, 997-1040.

FUDENBERG, D., AND J. TIROLE (1991): Game Theory. Cambridge, MA: MIT Press.

KANDORI, M. (1999): "Check Your Partners' Behavior by Randomization: New Efficiency Results on Repeated Games with Imperfect Monitoring," Technical Report CIRJE-F-49, University of Tokyo.

(2002): "Introduction to Repeated Games with Private Monitoring," Journal of Economic Theory, 102, 1-15.

(2003): "Randomization, Communication, and Efficiency in Repeated Games with Imperfect Public Monitoring," Econometrica, 71, 345-353.

KANDORI, M., AND I. OBARA (2003): "Efficiency in Repeated Games Revisited: The Role of Private Strategies," UCLA Working Paper 826.

(2006): "Supplement to 'Efficiency in Repeated Games Revisited: The Role of Private Strategies'," Econometrica Supplementary Material, 74, http://www.econometricsociety.org/ecta/supmat/5074Ex2.pdf.

LEHRER, E. (1991): "Internal Correlation in Repeated Games," International Journal of Game Theory, 19, 431-456.

MAILATH, G. J., S. A. MATTHEWS, AND T. SEKIGUCHI (2002): "Private Strategies in Finitely Repeated Games with Imperfect Public Monitoring," Contributions to Theoretical Economics, 2, http://www.bepress.com/bejte/contributions/vol2/iss1/art2.
