
HENRY E. KYBURG

ACTS AND CONDITIONAL PROBABILITIES

1

The budget of problems and considerations raised by Gibbard and Harper [1] is a rich and entertaining one. Since I believe neither in the usefulness of counterfactuals, nor in conditionalization, but rather in the deep importance of a distinction that Gibbard and Harper flatly reject - that between epistemic and stochastic independence - it is both interesting and curious that my framework leads to many of the same conclusions they are led to. But in addition, it seems to me, I can throw light on matters that remain shadowed (or, what amounts to the same thing, a matter of 'intuition') for them. And I do this in a classical, simple, naive, simple-minded framework [2].

I construe a rational corpus K to consist of a set of statements in a language L. Since rational corpora are just sets of statements, we may consider arbitrary ones without indulging in counterfactuality; if K is Smith's corpus at a certain time, we may consider the corpus consisting of the deductive consequences of $K \cup \{S\}$; this set of statements 'exists' just as much as K exists. For the sake of simplicity, I shall suppose here that rational corpora are deductively closed, though I would not always make this assumption. Thus I shall represent the corpus consisting of the deductive closure of the set of statements consisting of K and S by 'K and S': $K \text{ and } S = \{x : x \in Cn(K \cup \{S\})\}$, provided S is consistent with K.

What Harper and Gibbard represent as $A \mathrel{\Box\!\!\rightarrow} B$, I represent as '$B \in K \text{ and } A$'. Their axioms, of course, do not hold for my notion, though there is a trivial analogue of Axiom 1 (if $A \in K$, and $B \in K \text{ and } A$, then $B \in K$) and a partial analogue of Axiom 2 (if $\neg S \in K \text{ and } A$, then it is not the case that $S \in K \text{ and } A$). The converse of this analogue does not hold - nor is it very plausible as a principle governing counterfactuals.

Theory and Decision 12 (1980) 149-171. 0040-5833/80/0122-0149502.30. Copyright © 1980 by D. Reidel Publishing Co., Dordrecht, Holland, and Boston, U.S.A.

The crucial distinction for me is that between stochastic and epistemic conditional probability. In order to explain this distinction, I must say something about probability in general. We say that the probability that the next toss of this coin will land heads is a half; for me, such probabilities are based on statistical knowledge - in this case, knowledge that half the tosses of coins yield heads. It is not based on any knowledge I have about the frequency with which this coin lands heads; and it need not be based on frequencies concerning coin tosses at all. For example, I might assign the probability 0.9 to heads on the next toss of this coin on the grounds that Swami X predicts heads, and he is right 90% of the time. The probability is based on a frequency, but it is not the frequency in any class mentioned in the probability assertion. The probability that I will go to the movies next Saturday may be a half, not because I go to the movies half the time on Saturdays, but because I have decided to toss a coin Saturday night, and to go to the movies if and only if it lands heads.

Now consider the probability that a toss of a coin will land heads. I regard such probabilities as equally epistemic; talk about their being unknown, or changing, or being different, is to be construed as talk about various rational corpora subject to various conditions. But they are different from the previous statements in a significant way. The use of the indefinite article directs our attention to a specific reference class: tosses of coins, in the example. Furthermore, if we suppose that the indefinite article in 'a toss' has the sense of 'a described toss which is random relative to K', we can show that the probability of 'an A is a B' is the interval (p, q) relative to K if and only if K contains a statement to the effect that the measure of B's among A's lies between p and q, and contains no stronger statement. So we might be tempted to call these statements 'stochastic' as opposed to 'properly epistemic'.

2. AN ALTERNATIVE STRUCTURE

Let K be a set of statements in a language L. We construe K as a rational corpus and assume that it is consistent and deductively closed:

A-1 $Cn(K) \subseteq K$;

A-2 $\neg(\ulcorner 0 = 1 \urcorner \in K)$.

We define the expansion of K by the statement S, K and S, to be the deductive closure of $K \cup \{S\}$, if S is consistent with K, and the empty set otherwise:

D-1 $K \text{ and } S = Cn(K \cup \{S\})$ if $\neg(\ulcorner 0 = 1 \urcorner \in Cn(K \cup \{S\}))$

$K \text{ and } S = \emptyset$ otherwise.

L does not contain a counterfactual connective. It may contain intensional expressions in the form of statistical laws, but it need not. It does contain frequency statements, of course, but most of these can be construed as perfectly extensional. Intensional or not, statistical statements will have the form $\ulcorner S(A, B, p, q) \urcorner$, and be interpreted as 'the frequency with which (propensity with which) A's are B's lies between p and q'. We want to focus on the strongest such statements in K; we write:

D-2 $\ulcorner S(A, B, p, q) \urcorner^{*} \in K$ if and only if $\ulcorner S(A, B, p, q) \urcorner \in K$ and, if $\ulcorner S(A, B, r, s) \urcorner \in K$, then $\ulcorner (p, q) \subseteq (r, s) \urcorner$ is a theorem.
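As an illustration of D-2, the following minimal sketch picks out the strongest interval among a finite list of intervals known for a fixed pair of terms. It assumes the corpus can be represented, for that pair, as a plain list of numeric (p, q) pairs; the names `contains` and `strongest` are hypothetical and not part of the formal system.

```python
from typing import Iterable, Optional, Tuple

Interval = Tuple[float, float]


def contains(outer: Interval, inner: Interval) -> bool:
    """True when `inner` is a subinterval of `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def strongest(intervals: Iterable[Interval]) -> Optional[Interval]:
    """Return the interval included in every listed interval, or None.

    This mirrors the starred statement of D-2: (p, q) is strongest when
    (p, q) is included in (r, s) for every known (r, s).
    """
    known = list(intervals)
    for candidate in known:
        if all(contains(other, candidate) for other in known):
            return candidate
    return None


# Hypothetical intervals known for the frequency of B's among A's.
known = [(0.0, 1.0), (0.3, 0.7), (0.4, 0.6)]
print(strongest(known))  # (0.4, 0.6)
```

The function returns None when no listed interval is included in all the others; whether that situation can arise in a deductively closed corpus is not settled by this sketch.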

For reasons that by now should be well understood, the terms that may appear in the place of A and B in $\ulcorner S(A, B, p, q) \urcorner$ are limited to a certain recursively defined class: we don't want to bother our heads about the frequency of tails in the union of the set of tosses that yield tails with the set of tosses with prime ordinal numbers.

Following the example of Harper and Gibbard, I shall simply appeal to the reader's 'ordinary understanding' of the relation: a is a random member of B with respect to C relative to the corpus K, written RAN(a, B, C, K). As a guide to that understanding, note that the relation obtains just when B is an appropriate reference class for assessing the probability of '$a \in C$' given the background knowledge K.

Probability is defined for equivalence classes of statements:

D-3 Prob(K, S) = (p, q) if and only if there exist terms a, B and C of L such that:

(1) $\ulcorner a \in C \leftrightarrow S \urcorner \in K$;

(2) $\ulcorner S(B, C, p, q) \urcorner^{*} \in K$;

(3) RAN(a, B, C, K).

We introduce into L a special operator $\alpha$. Combined with terms appropriate to the position of A in $\ulcorner S(A, B, p, q) \urcorner$ - i.e., terms that may plausibly be taken to denote reference sets - it forms terms that are taken to be random members of those sets, with respect to membership in any set denoted by a term appropriate to the place of B in $\ulcorner S(A, B, p, q) \urcorner$. More informally, it corresponds to one use of the indefinite article in English - that employed in the sentence 'the probability that a coin toss will land heads is a half', as well as in 'a tiger is a carnivore', or at least one interpretation of the latter sentence. There are two axioms characterizing the operator $\alpha$:

A-3 If B and C are terms appropriate to $\ulcorner S(B, C, p, q) \urcorner$, then $RAN(\alpha B, B, C, K)$.

A-4 If B, $\ulcorner A \cap B \urcorner$, and C are appropriate terms, and $\ulcorner \alpha A \in B \urcorner \in K$, then $RAN(\alpha(\ulcorner A \cap B \urcorner), \ulcorner A \cap B \urcorner, C, K)$.

Conditional probability is defined in the obvious way: the probability of S given T relative to K is just the probability of S relative to K and T:

D-4 Prob(K, T, S) = Prob(K and T, S).

The distinction between stochastic and epistemic probabilities relative to K is now a simple syntactic one: if S has the form $\ulcorner \alpha A \in B \urcorner$ then Prob(K, S) is stochastic; otherwise it is epistemic. (Note that both are relativized to a body of knowledge K.) If S has the form $\ulcorner \alpha A \in B \urcorner$ and T has the form $\ulcorner \alpha A \in C \urcorner$, then Prob(K, S, T) is a stochastic conditional probability; if the operator $\alpha$ does not occur in either S or T it is epistemic. If the operator $\alpha$ occurs, but not in the way first described, we shall suppose that Prob(K, S, T) is neither epistemic nor stochastic; this case will not arise in what follows.

Two simple theorems, whose proof would depend on a more detailed characterization of randomness, will help to clarify matters:

T-1 $Prob(K, \ulcorner \alpha A \in B \urcorner) = (p, q)$ if and only if $\ulcorner S(A, B, p, q) \urcorner^{*} \in K$.

T-2 $Prob(K, \ulcorner \alpha A \in B \urcorner, \ulcorner \alpha A \in C \urcorner) = (p, q)$ if and only if $\ulcorner S(A \cap B, C, p, q) \urcorner^{*} \in K$.

This theorem follows from T-1 with the help of D-4 and A-4.
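To make T-1 and T-2 concrete, here is a small sketch under a strong simplifying assumption: the corpus knows the exact frequencies in a finite reference class, so every interval degenerates to (p, p). The eight-member class and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical reference class A: each member is tagged with the sets it belongs to.
A = [
    {"B", "C"}, {"B", "C"}, {"B"}, {"C"},
    set(), {"B", "C"}, {"C"}, set(),
]


def measure(reference, target):
    """Exact relative frequency of `target` among the members of `reference`."""
    if not reference:
        raise ValueError("empty reference class")
    hits = sum(1 for member in reference if target in member)
    return Fraction(hits, len(reference))


def stochastic_prob(reference, target):
    """T-1 analogue: the interval for 'an A is a B'; degenerate here (p == q)."""
    p = measure(reference, target)
    return (p, p)


def stochastic_conditional(reference, given, target):
    """T-2 analogue: measure of `target` among the members of A that are in `given`."""
    restricted = [member for member in reference if given in member]
    return stochastic_prob(restricted, target)


print(stochastic_prob(A, "C"))              # measure of C among A
print(stochastic_conditional(A, "B", "C"))  # measure of C among A-and-B
```

The conditional case simply changes the reference class from A to the intersection of A and B, which is what T-2 asserts for stochastic statements.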

It will be argued that in deliberation stochastic conditional probability plays the role that conditional probability plays for Harper and Gibbard, and that epistemic probability plays the role of 'the probability of the conditional'. But there are differences as well.

We say that S is independent of T, given K, if the probability of S is unaffected by the addition of T to K:

D-5 S Ind(K, T) if and only if Prob(K, S) = Prob(K, T, S).

It should be observed that independence is not generally symmetrical: 'this coin is biassed 0.6 for heads' is independent of 'the tenth toss of this coin yielded heads', since a single toss does not provide enough evidence to alter the probability of the statement concerning bias. But equally clearly, to add 'this coin is biassed 0.6 in favor of heads' to your body of knowledge will alter the probability of 'the tenth toss of this coin yielded heads' from 0.5 to 0.6.

But we may distinguish between proper epistemic independence and stochastic independence. If S and T have the forms $\ulcorner \alpha A \in B \urcorner$ and $\ulcorner \alpha A \in C \urcorner$, respectively, then we may interpret S Ind(K, T) as asserting stochastic independence. On the basis of T-2 we may prove that stochastic independence is symmetrical:

T-3 $\ulcorner \alpha A \in B \urcorner$ Ind(K, $\ulcorner \alpha A \in C \urcorner$) if and only if $\ulcorner \alpha A \in C \urcorner$ Ind(K, $\ulcorner \alpha A \in B \urcorner$).
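The symmetry claimed in T-3 can be checked numerically in the special case where the known frequencies are exact (point-valued) and nonzero. The sketch below assumes a joint frequency table over membership in B and C among the A's; the two tables and the function name are hypothetical.

```python
from fractions import Fraction as F


def independent(joint, x, y):
    """True when the frequency of set x is unchanged by restricting to set y.

    `joint` maps (in_B, in_C) -> exact frequency among the A's;
    x and y are positions in that key (0 for B, 1 for C).
    Assumes the frequency of y is nonzero.
    """
    p_x = sum(f for key, f in joint.items() if key[x])
    p_y = sum(f for key, f in joint.items() if key[y])
    p_xy = sum(f for key, f in joint.items() if key[x] and key[y])
    return p_xy / p_y == p_x


# Frequency of B is 1/2, of C is 1/4, and the joint cell is their product.
flat = {(True, True): F(1, 8), (True, False): F(3, 8),
        (False, True): F(1, 8), (False, False): F(3, 8)}

# B and C occur together more often than their separate frequencies suggest.
skewed = {(True, True): F(3, 8), (True, False): F(1, 8),
          (False, True): F(1, 8), (False, False): F(3, 8)}

for table in (flat, skewed):
    # The two directions always agree, as T-3 asserts.
    print(independent(table, 0, 1), independent(table, 1, 0))
```

This only covers the stochastic case; the coin-bias example in the text shows that properly epistemic independence need not behave this way.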

Neither sort of independence has anything to do with 'causal' independence; but it will turn out that we have no need to introduce the notion of causality, except as it appears in the guise of statistical knowledge in K.

3. UTILITY

Since probabilities are interval valued, so will expected utilities be interval valued. This fact will play a role in a general decision theory, but is of relatively minor importance here. For notational convenience, we suppose Prob is vector valued and Des (desirability) is scalar valued. Utility will thus be vector valued. Suppose that $A = \{A_j\}$ represents the alternative types of actions and $O = \{O_i\}$ the alternative types of outcomes.

Relative to A and O, we may define expected utility as follows:

D-6 $U_{A,O}(\ulcorner d \in A_j \urcorner, K) = \sum_i Des(\ulcorner d \in O_i \cap A_j \urcorner) \, Prob(K, \ulcorner d \in A_j \urcorner, \ulcorner d \in O_i \urcorner)$.
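D-6 can be sketched as a componentwise sum, following the convention that Prob is vector valued and Des is scalar valued. The desirabilities and probability intervals below are hypothetical, and the function name is an assumption made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def expected_utility(des: List[float], probs: List[Interval]) -> Interval:
    """Componentwise sum of Des_i * (p_i, q_i) over the outcome types O_i."""
    if len(des) != len(probs):
        raise ValueError("one desirability per outcome type is required")
    low = sum(d * p for d, (p, _) in zip(des, probs))
    high = sum(d * q for d, (_, q) in zip(des, probs))
    return (low, high)


# Two outcome types for a fixed action: desirabilities 10 and 0,
# with conditional probability intervals (0.25, 0.5) and (0.5, 0.75).
print(expected_utility([10, 0], [(0.25, 0.5), (0.5, 0.75)]))  # (2.5, 5.0)
```

The result is the vector-valued utility of the definition; the sketch makes no claim that these are the tightest bounds obtainable from the intervals.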

Again we may distinguish between proper epistemic and stochastic utility. $U_{A,O}(\ulcorner d \in A_j \urcorner, K)$ is stochastic if d has the form $\ulcorner \alpha P \urcorner$; it is properly epistemic otherwise.

The basic decision rule is to act in such a way as to maximize expected utility. Clearly what is intended here is the proper epistemic utility of the agent. Since utility is interval-valued, this rule will not solve all problems - but the ways of choosing actions when this rule does not provide guidance are not our concern here.

The sure thing principle will be a consequence of the basic decision rule under certain circumstances which we will characterize in due course.

4. DAVID AND BATHSHEBA

Harper and Gibbard do not seem to distinguish between specific actions and types or classes of actions, or between specific outcomes and types or classes of outcomes. I shall suppose that the actions open to the agent in a situation d can be represented as making true statements of the form $\ulcorner d \in A_j \urcorner$, where $A_j$ represents a type or class of action. Similarly, the entities on which utility functions will be defined are statements of the form $\ulcorner d \in O_j \urcorner$, where $O_j$ represents a type of outcome. We may also consider instances of types of situations, represented by terms of the form $\ulcorner \alpha S \urcorner$.

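Let d be David's particular situation, P the set of similar political situations, $A = \{B, \bar{B}\}$ the set of actions open to David (sending or not sending for the woman at the well), and $O = \{R, \bar{R}\}$ the set of outcomes to be contemplated (revolution or not). We suppress A and O in the notation, since they do not play a role in the analysis of this case. $K_D$ we suppose to be David's corpus of knowledge. Relative to David's corpus, anyone, including David, can compute the stochastic utilities of the two courses of action:

\[ U(\ulcorner \alpha P \in B \urcorner, K_D) = Des(\ulcorner \alpha P \in B \cap R \urcorner) \, Prob(K_D, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in B \cap \bar{R} \urcorner) \, Prob(K_D, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner). \]

\[ U(\ulcorner \alpha P \in \bar{B} \urcorner, K_D) = Des(\ulcorner \alpha P \in \bar{B} \cap R \urcorner) \, Prob(K_D, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in \bar{B} \cap \bar{R} \urcorner) \, Prob(K_D, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner). \]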

Since $\alpha$ is an operator that insures randomness, the probabilities are just the conditional measures taken to be known in $K_D$. A political situation in which the woman is sent for is more desirable than one in which she is not, given the assignment of values made by Harper and Gibbard, just in case the frequency with which revolution then ensues, plus the frequency with which revolution fails to ensue following abstention, is less than 10/9. These are straightforward conditional measures, assumed to be known by David.

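What David wants to know, however, is not the utility of an act of sending for the woman at the well, but the utility of his act of sending for the woman at the well. He wants to know $U(\ulcorner d \in B \urcorner, K_D)$ and $U(\ulcorner d \in \bar{B} \urcorner, K_D)$. These will be equal to $U(\ulcorner \alpha P \in B \urcorner, K_D)$ and $U(\ulcorner \alpha P \in \bar{B} \urcorner, K_D)$ respectively, just in case two conditions are met:

(i) $RAN(d, \ulcorner P \cap B \urcorner, R, K_D \text{ and } \ulcorner d \in B \urcorner)$, from which it follows that $RAN(d, \ulcorner P \cap B \urcorner, \bar{R}, K_D \text{ and } \ulcorner d \in B \urcorner)$.

(ii) $RAN(d, \ulcorner P \cap \bar{B} \urcorner, R, K_D \text{ and } \ulcorner d \in \bar{B} \urcorner)$, from which it follows that $RAN(d, \ulcorner P \cap \bar{B} \urcorner, \bar{R}, K_D \text{ and } \ulcorner d \in \bar{B} \urcorner)$.

(In general (i) neither entails nor is entailed by (ii); usually both will be true or both false, however.)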

Are these two conditions met? In the story as told by Harper and Gibbard it seems that they are. This can ordinarily be insured by choosing P in such a way that they are met. Stochastic and proper epistemic utility can thus be made to coincide.

5. SOLOMON

Let s be Solomon's particular situation, P, B and R as before. C is the subset of P in which the leader is charismatic, $\bar{C}$ its complement in P. $K_S$ is Solomon's rational corpus, except that it also includes the knowledge that revolts depend largely on the lack of charisma of the leader, and not on the performance of unjust acts, such as B, as well as the knowledge that charismatic kings tend to act justly and uncharismatic kings tend to act unjustly.

(Note parenthetically that this latter information does not entail, as Gibbard and Harper seem to suppose, that justice is evidence of charisma. Suppose that 20% of kings are charismatic and just; 10% charismatic and unjust; 40% uncharismatic and unjust; and 30% uncharismatic and just. Then charismatic kings tend to act justly (0.2/(0.2 + 0.1) > 0.1/(0.2 + 0.1)) and uncharismatic kings tend to act unjustly (0.4/(0.3 + 0.4) > 0.3/(0.3 + 0.4)), but justice is not evidence of charisma, but of the lack of it (0.2/(0.2 + 0.3) < 0.3/(0.2 + 0.3))! This same blunder, as committed by another author, was pointed out in Levi [3].)
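The three ratios in the parenthetical example can be recomputed directly from the four stated percentages. The sketch below uses exact fractions; the dictionary layout and the helper name `share` are assumptions made for illustration.

```python
from fractions import Fraction as F

# Joint frequencies from the example: (charismatic?, just?) -> share of kings.
kings = {
    (True, True): F(20, 100),
    (True, False): F(10, 100),
    (False, False): F(40, 100),
    (False, True): F(30, 100),
}


def share(charismatic=None, just=None):
    """Total frequency of the cells matching a (possibly partial) description."""
    return sum(
        f for (c, j), f in kings.items()
        if (charismatic is None or c == charismatic)
        and (just is None or j == just)
    )


just_given_char = share(True, True) / share(charismatic=True)          # 0.2 / 0.3
unjust_given_unchar = share(False, False) / share(charismatic=False)   # 0.4 / 0.7
char_given_just = share(True, True) / share(just=True)                 # 0.2 / 0.5

print(just_given_char, 1 - just_given_char)          # 2/3 against 1/3
print(unjust_given_unchar, 1 - unjust_given_unchar)  # 4/7 against 3/7
print(char_given_just, 1 - char_given_just)          # 2/5 against 3/5
```

The first two comparisons come out in favor of the 'tendency' and the third against it, matching the inequalities in the text.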

Let us add the constraint clearly intended, that the frequency of charisma among Bathsheba abstaining kings is greater than that among non-abstaining kings and that the frequency of non-charismaticity among Bathsheba prone kings is greater than that among abstaining kings.

Furthermore, the relation between justness and sending for Bathsheba is not clear. One unjust act does not unjustice make. Even if unjustness is a sign of lack of charisma, a single instance of unjustness might well not be regarded as significant evidence regarding lack of charisma. For example, if unjustness is a tendency to act unjustly on at least 50% of the occasions on which one is presented with the choice between justness and unjustness, a single instance of unjustness will not make that hypothesis probable, much less acceptable. This is an instance of the epistemic asymmetry of independence mentioned earlier.

Since in this case having charisma and having a revolution are independent of the act of sending for Bathsheba (but not necessarily conversely - kings lacking charisma might be known to be prone to covet other men's wives), both the stochastic expectation of the generic act of sending for another man's wife, and the particular act of Solomon's sending for Bathsheba, are given by

\[ U(\ulcorner \alpha P \in B \urcorner, K_S) = Des(\ulcorner \alpha P \in B \cap R \urcorner) \, Prob(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in B \cap \bar{R} \urcorner) \, Prob(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner). \]

$U(\ulcorner \alpha P \in \bar{B} \urcorner, K_S)$ may be computed similarly.

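Note that charisma doesn't enter into the computation at all, even in this general case. What Solomon is interested in, of course, is the particular case: he is interested in $U(\ulcorner s \in B \urcorner, K_S)$ and $U(\ulcorner s \in \bar{B} \urcorner, K_S)$, and if P is chosen wisely so that $RAN(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$ and $RAN(s, \ulcorner P \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$ hold, these will be the same as the stochastic utilities just computed.

Under the assumption that $Prob(K_S \text{ and } \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) = Prob(K_S \text{ and } \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) = Prob(K_S, \ulcorner \alpha P \in R \urcorner)$, and that $RAN(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$, we also have $Prob(K_S, \ulcorner s \in B \urcorner, \ulcorner s \in R \urcorner) = Prob(K_S, \ulcorner s \in R \urcorner)$: the eventuality $\ulcorner s \in R \urcorner$ is properly epistemically independent of the decision to make $\ulcorner s \in B \urcorner$ true.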

To obtain an example that will illustrate the point at issue, we may suppose that it is given to each king to make a Bathsheba decision once in his reign. B denotes the set of instances in which the other man's wife is sent for, and not merely the general character of unjustness. Let us represent the statistical knowledge of the class of situations P by the following table:

$B \cap C \cap R$ : $p_1$

$\bar{B} \cap C \cap R$ : $p_2$

$B \cap \bar{C} \cap R$ : $p_3$

$\bar{B} \cap \bar{C} \cap R$ : $p_4$

$B \cap C \cap \bar{R}$ : $p_5$

$\bar{B} \cap C \cap \bar{R}$ : $p_6$

$B \cap \bar{C} \cap \bar{R}$ : $p_7$

$\bar{B} \cap \bar{C} \cap \bar{R}$ : $p_8$.

Keeping in mind that these frequencies can, in reality, only be known approximately, the assumed body of knowledge incorporates the following constraints:

(i) $(p_3 + p_4)/(p_3 + p_4 + p_7 + p_8)$ is high (uncharismatic kings are frequently revolted against) (this entails that $p_7 + p_8$ is small relative to $p_3 + p_4$).

(ii) $(p_2 + p_6)/(p_2 + p_6 + p_1 + p_5) > p_2 + p_4 + p_6 + p_8$ (charismatic kings are prone to make the just Bathsheba decision).

(iii) $(p_3 + p_7)/(p_3 + p_7 + p_4 + p_8) > p_1 + p_3 + p_5 + p_7$ (uncharismatic kings are prone to make the unjust Bathsheba decision).

(iv) $(p_2 + p_6)/(p_2 + p_6 + p_4 + p_8) > p_1 + p_2 + p_5 + p_6$ (kings who make the just Bathsheba decision tend to be charismatic).

(v) $(p_3 + p_7)/(p_1 + p_3 + p_5 + p_7) > p_3 + p_4 + p_7 + p_8$ (kings who make the unjust Bathsheba decision tend to be uncharismatic).

It follows that

(vi) $(p_1 + p_3)/(p_1 + p_3 + p_5 + p_7) > p_1 + p_2 + p_3 + p_4$ (kings who fail the Bathsheba test are more often revolted against than others).

(vii) $(p_2 + p_4)/(p_2 + p_4 + p_6 + p_8) < p_1 + p_2 + p_3 + p_4$ (kings who pass the Bathsheba test are less often revolted against than others).
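The constraints can be evaluated mechanically for any candidate assignment of the eight frequencies. The following sketch uses one hypothetical assignment, chosen only so that all the constraints come out true; neither the numbers nor the function name `check` come from the text.

```python
# Hypothetical frequencies p[1]..p[8], in the order of the table above:
# (B,C,R), (not-B,C,R), (B,not-C,R), (not-B,not-C,R),
# (B,C,not-R), (not-B,C,not-R), (B,not-C,not-R), (not-B,not-C,not-R).
p = [None, 0.02, 0.08, 0.32, 0.08, 0.08, 0.32, 0.08, 0.02]  # p[0] is unused


def check(p):
    """Return the ratio in (i) and the truth values of (ii)-(vii)."""
    b = p[1] + p[3] + p[5] + p[7]        # frequency of B
    not_b = p[2] + p[4] + p[6] + p[8]    # frequency of the complement of B
    c = p[1] + p[2] + p[5] + p[6]        # frequency of C
    not_c = p[3] + p[4] + p[7] + p[8]    # frequency of the complement of C
    r = p[1] + p[2] + p[3] + p[4]        # frequency of R
    return {
        "i": (p[3] + p[4]) / not_c,      # 'high' is not quantified in the text
        "ii": (p[2] + p[6]) / c > not_b,
        "iii": (p[3] + p[7]) / not_c > b,
        "iv": (p[2] + p[6]) / not_b > c,
        "v": (p[3] + p[7]) / b > not_c,
        "vi": (p[1] + p[3]) / b > r,
        "vii": (p[2] + p[4]) / not_b < r,
    }


for label, value in check(p).items():
    print(label, value)
```

With these numbers the ratio in (i) is about 0.8 and constraints (ii) through (vii) all hold; other assignments can be tested by replacing the list.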

Now Harper and Gibbard are talking about subjective probabilities and not about frequencies. Nevertheless, every assignment of subjective probability must be consistent with some set of frequencies, on pain of incoherence. (We leave to one side the probabilities of the counterfactuals, attending only to the indicative sentences.) A natural assumption, given the claim that 'Unjust acts themselves ... do not cause successful revolts', is that the outcome of the Bathsheba test is irrelevant to the frequency of revolt; put in other words, $p_1 = p_2$, $p_3 = p_4$, $p_5 = p_6$, and $p_7 = p_8$. Since this is flatly inconsistent with the claim that 'charismatic kings tend to be just' (ii) together with 'uncharismatic kings tend to be unjust' (iii); and with the (implied) claim that 'just kings tend to be charismatic' (iv) and 'unjust kings tend to be uncharismatic' (v), we must assume that Harper and Gibbard do not intend that these probabilities ($p_1$ and $p_2$, $p_3$ and $p_4$, etc.) are equal. And there is no reason why they should: their probabilities are subjective, so there is no reason why one frequency cannot be known to be equal to another (revolts under charismatic kings who fail the Bathsheba test and revolts under charismatic kings who pass the Bathsheba test, for example), and at the same time a higher probability be assigned to one (revolt under a specific king who fails the test) than to the other (revolt under a king who passes the test). This seems to me to be patently irrational, but that may merely reflect my prejudices about probability and rational belief.


So far so good, until we are given the distinction between an 'indication' and a 'cause'. We are told that abstention 'for this reason' (i.e., to bring about an indication of charisma) would be useless in avoiding a revolt. But in the earlier discussion there was no mention of motivation one way or the other. If it is significant, it should be taken account of. It is quite true that if we are thinking of frequencies, to bring about an indication of an outcome is not 'in any way' to bring about the outcome itself. But surely if this is the case we cannot take the probability of the outcome given the indication to be different from the probability of the outcome. In Bayesian terms this contradicts the previous assertion that the probability of the indication given the outcome is greater than the prior probability of the indication. Relevance, for the Bayesian, must be symmetrical: if A is positively relevant to B then B is positively relevant to A. Rather than try to sort out what the authors have in mind, let us consider both cases.

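Case (a): Solomon knows that the frequency of revolt is independent of the frequency of positive results on the Bathsheba test: $p_1 = p_2$, $p_3 = p_4$, $p_5 = p_6$, and $p_7 = p_8$. This appears to be the situation that Harper and Gibbard have in mind when they talk of causal independence. But then we may easily compute that the frequency of revolution in B, $(p_1 + p_3)/(p_1 + p_3 + p_5 + p_7)$, is just the same as that in $\bar{B}$, $(p_2 + p_4)/(p_2 + p_4 + p_6 + p_8)$, and indeed in P in general, $(p_1 + p_2 + p_3 + p_4)$:

\[ U(\ulcorner \alpha P \in B \urcorner, K_S) = Des(\ulcorner \alpha P \in B \cap R \urcorner) \, Prob(K_S \text{ and } \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in B \cap \bar{R} \urcorner) \, Prob(K_S \text{ and } \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner) \]

\[ = Des(\ulcorner \alpha P \in B \cap R \urcorner) \, Prob(K_S, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in B \cap \bar{R} \urcorner) \, Prob(K_S, \ulcorner \alpha P \in \bar{R} \urcorner). \]

\[ U(\ulcorner \alpha P \in \bar{B} \urcorner, K_S) = Des(\ulcorner \alpha P \in \bar{B} \cap R \urcorner) \, Prob(K_S, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in \bar{B} \cap \bar{R} \urcorner) \, Prob(K_S, \ulcorner \alpha P \in \bar{R} \urcorner). \]

Since in both cases the same probability is involved, and whether or not there is a revolution Solomon prefers having Bathsheba to not having her, he should send for her forthwith provided $RAN(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$ and $RAN(s, \ulcorner P \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$. The analysis boils down to that for David, so long as revolution is independent of Bathsheba both in the presence of charisma and in the absence of charisma.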
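The claim of Case (a), that the pairwise equalities make the frequency of revolution the same among senders, abstainers, and kings at large, can be checked on any assignment that satisfies the equalities. The sketch below uses one hypothetical assignment with exact fractions; it also shows that, under the same equalities, the frequency of charisma among abstainers equals the overall frequency of charisma, so a strict inequality such as (iv) cannot hold.

```python
from fractions import Fraction as F

# Hypothetical frequencies with p1 = p2, p3 = p4, p5 = p6, p7 = p8 (p[0] unused).
p = [None,
     F(5, 100), F(5, 100),     # p1, p2
     F(25, 100), F(25, 100),   # p3, p4
     F(15, 100), F(15, 100),   # p5, p6
     F(5, 100), F(5, 100)]     # p7, p8

# Frequency of revolution among senders, among abstainers, and overall.
revolt_in_b = (p[1] + p[3]) / (p[1] + p[3] + p[5] + p[7])
revolt_in_not_b = (p[2] + p[4]) / (p[2] + p[4] + p[6] + p[8])
revolt_overall = p[1] + p[2] + p[3] + p[4]

# Left- and right-hand sides of constraint (iv).
charisma_given_not_b = (p[2] + p[6]) / (p[2] + p[4] + p[6] + p[8])
charisma_overall = p[1] + p[2] + p[5] + p[6]

print(revolt_in_b, revolt_in_not_b, revolt_overall)   # all three coincide
print(charisma_given_not_b, charisma_overall)         # equal, so (iv) fails
```

The equalities force the frequency of B to be exactly one half, which is why each conditional frequency collapses to the corresponding overall frequency.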

Case (b): $p_1 \neq p_2$ or $p_3 \neq p_4$ or $p_5 \neq p_6$ or $p_7 \neq p_8$. This is to say that we abandon the assumption that injustice is known to be irrelevant to revolt and replace it with knowledge of relevance. But that does not make the case uninteresting. Consider first the stochastic expectation of the generic decision $\ulcorner \alpha P \in B \urcorner$. Under the new circumstances we have

\[ U(\ulcorner \alpha P \in B \urcorner, K_S) = Des(\ulcorner \alpha P \in B \cap R \urcorner) \, Prob(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in B \cap \bar{R} \urcorner) \, Prob(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner). \]

\[ U(\ulcorner \alpha P \in \bar{B} \urcorner, K_S) = Des(\ulcorner \alpha P \in \bar{B} \cap R \urcorner) \, Prob(K_S, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) + Des(\ulcorner \alpha P \in \bar{B} \cap \bar{R} \urcorner) \, Prob(K_S, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner). \]

Since we are now assuming that

$Prob(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) > Prob(K_S, \ulcorner \alpha P \in R \urcorner)$, and

$Prob(K_S, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) < Prob(K_S, \ulcorner \alpha P \in R \urcorner)$,

it is clear that the new utilities may not be the same as the previous utilities. That is, under the new assumptions the relative stochastic expectations of $\ulcorner \alpha P \in B \urcorner$ and $\ulcorner \alpha P \in \bar{B} \urcorner$ may be reversed. Remember, though, that we are talking of the expected utility of a king who fails (or passes) the Bathsheba test, and the probabilities that enter into these computations are simply the counterparts of the general relative frequencies or propensities known in Solomon's corpus $K_S$.

Next let us look at the utilities of $\ulcorner s \in B \urcorner$ and $\ulcorner s \in \bar{B} \urcorner$ evaluated from an outsider's point of view. By this I mean that we consider Solomon's corpus $K_S$ and add to it merely $\ulcorner s \in B \urcorner$ or $\ulcorner s \in \bar{B} \urcorner$. Solomon naturally knows much more than this: he will know, for example, whether it would be lust that would tempt him to make $\ulcorner s \in B \urcorner$ true, or whether it would be selfless consideration for Uriah whom he knows to be unhappy with Bathsheba. (David knows much more than $K_D$ and $\ulcorner d \in B \urcorner$, too, but in his case we assumed that knowledge to be irrelevant to his deliberations.) From the outsider's point of view, we may well have $RAN(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$ and $RAN(s, \ulcorner P \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$. That is, relative to an outsider's point of view, Solomon's situation s may well be a random member of $\ulcorner P \cap B \urcorner$ or $\ulcorner P \cap \bar{B} \urcorner$ as the case may be with respect to R. For the outsider, then, the stochastic utilities may correspond to the proper epistemic utilities.


Next, suppose that Solomon himself truncates his corpus - i.e., he brackets the motives, noble and ignoble, that incline him one way or the other. When he does this, he is adopting the point of view of the outsider: the randomness conditions are still met; the utilities of $\ulcorner s \in B \urcorner$ and $\ulcorner s \in \bar{B} \urcorner$ are still the same as those of the general events $\ulcorner \alpha P \in B \urcorner$ and $\ulcorner \alpha P \in \bar{B} \urcorner$. It seems quite appropriate to regard these utilities, as Harper and Gibbard do, as the values of $\ulcorner s \in \bar{B} \urcorner$ and $\ulcorner s \in B \urcorner$ 'as news'; the deliberative element has been wiped out. These would be the appropriate epistemic utilities for Solomon if he were to be told what he was going to decide to do. But he is not then deciding what to do: the utilities are not relevant to his decision, since they represent utilities based on the hypothesis that he has made one or another of the decisions.

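Now let us allow Solomon to deliberate. We need not follow the course of deliberations, but may abbreviate what he learns in the course of those deliberations by 'cc'. Thus what Solomon really wants to evaluate are

$U(\ulcorner s \in B \urcorner, K_S \text{ and } cc)$ and

$U(\ulcorner s \in \bar{B} \urcorner, K_S \text{ and } cc)$.

It is unlikely that s is a random member of either $\ulcorner P \cap B \urcorner$ or $\ulcorner P \cap \bar{B} \urcorner$ with respect to R relative to $K_S \text{ and } cc \text{ and } \ulcorner s \in B \urcorner$ or relative to $K_S \text{ and } cc \text{ and } \ulcorner s \in \bar{B} \urcorner$. In fact with respect to determining the probability of $\ulcorner s \in R \urcorner$ relative to $K_S \text{ and } cc \text{ and } \ulcorner s \in B \urcorner$, it might turn out that the relevant randomness relation obtained between three entirely new objects: X, Y, and Z. That is, we might have:

$\ulcorner X \in Z \leftrightarrow s \in R \urcorner \in K_S \text{ and } cc \text{ and } \ulcorner s \in B \urcorner$;

$RAN(X, Y, Z, K_S \text{ and } cc \text{ and } \ulcorner s \in B \urcorner)$;

$\ulcorner S(Y, Z, p, q) \urcorner^{*} \in K_S \text{ and } cc \text{ and } \ulcorner s \in B \urcorner$;

and therefore

$Prob(K_S \text{ and } cc \text{ and } \ulcorner s \in B \urcorner, \ulcorner s \in R \urcorner) = (p, q)$.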

There is yet another way of dealing with this problem. Suppose that Solomon's conscious motives in making his decision are a mixture of the noble and the ignoble, so that whichever decision he makes, s will belong both to a class in which revolution is rare and to a class in which it is frequent; and it may well be that Solomon does not have the information to determine the frequency of revolution in the intersection of these classes. There is, nevertheless, a course of action open to him which will leave his expectations unchanged: suppose he declines to make the decision between $\ulcorner s \in B \urcorner$ and $\ulcorner s \in \bar{B} \urcorner$, and instead employs a chance mechanism with frequency of 'Go' equal to $p_1 + p_3 + p_5 + p_7$. Since this is the overall frequency of B in the reference class P, it will not indicate anything positive or negative about the likelihood of revolution. The outcome, by definition, is independent of whether or not he has charisma, and thus of the occurrence of revolution. This may explain the propensity of many people to flip a coin to decide between two courses of action, one of which would be more fun, and one of which would reveal strength of character (to go to the movies or to study).

6. DELIBERATION AND CHOICE

Finally, let us suppose that $K_S$ contains just Solomon's knowledge that he is deliberating. Let us further take this as implying that he believes that he has freedom of choice, and that therefore there is no connection between how he decides and whether or not he has charisma. Now he cannot assume that this is true of all the kings involved in the reference class P - since in that class there is revealed a connection between passing the Bathsheba test and being revolted against. But he is perfectly free to believe that there is a subset of P, say FP, in which the frequency of revolt is the same for $\ulcorner FP \cap B \urcorner$ and for $\ulcorner FP \cap \bar{B} \urcorner$. Furthermore, since his knowledge of the statistical relations between C and R is perfectly general, we may suppose that it applies to FP as well as to P. (We could justify this assumption through more detailed considerations of the weight of inductive evidence, but that would lead us astray from the main point.) We are then supposing that Solomon's corpus $K_S$ contains

$\ulcorner S(FP, R, q, q) \urcorner$,

$\ulcorner S(FP \cap B, R, q, q) \urcorner$,

$\ulcorner S(FP \cap \bar{B}, R, q, q) \urcorner$,

where $q = p_1 + p_2 + p_3 + p_4$. Relative to this corpus, it may well be the case that

$RAN(s, \ulcorner FP \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$, and

$RAN(s, \ulcorner FP \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$,

and, therefore, that $\ulcorner s \in R \urcorner$ is properly epistemically independent of $\ulcorner s \in B \urcorner$.

If we compute the utilities in this case it is clear that we will find that the utility of sending for Bathsheba is greater than the utility of abstaining.

7. REBOAM

Let r be Reboam's special situation, P the set of like situations, C those in

which the king is charismatic, S those in which the king is severe, and D

those in which the king is deposed. We suppose enough data so that the following frequencies may be accepted as practically certain by Reboam.

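$C \cap S \cap D$   0.16

$C \cap S \cap \bar{D}$   0.24

$C \cap \bar{S} \cap D$   0.02

$C \cap \bar{S} \cap \bar{D}$   0.08

$\bar{C} \cap S \cap D$   0.08

$\bar{C} \cap S \cap \bar{D}$   0.02

$\bar{C} \cap \bar{S} \cap D$   0.22

$\bar{C} \cap \bar{S} \cap \bar{D}$   0.18.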

Consider first the general case:

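$$U(\ulcorner \alpha_P \in S \urcorner, K_R) = \mathrm{Des}(\ulcorner \alpha_P \in D \cap S \urcorner)\,\mathrm{Prob}(K_R \text{ and } \ulcorner \alpha_P \in S \urcorner, \ulcorner \alpha_P \in D \urcorner) + \mathrm{Des}(\ulcorner \alpha_P \in \bar{D} \cap S \urcorner)\,\mathrm{Prob}(K_R \text{ and } \ulcorner \alpha_P \in S \urcorner, \ulcorner \alpha_P \in \bar{D} \urcorner)$$

$$= 4.8 + 52 = 56.8.$$

$$U(\ulcorner \alpha_P \in \bar{S} \urcorner, K_R) = \mathrm{Des}(\ulcorner \alpha_P \in D \cap \bar{S} \urcorner)\,\mathrm{Prob}(K_R \text{ and } \ulcorner \alpha_P \in \bar{S} \urcorner, \ulcorner \alpha_P \in D \urcorner) + \mathrm{Des}(\ulcorner \alpha_P \in \bar{D} \cap \bar{S} \urcorner)\,\mathrm{Prob}(K_R \text{ and } \ulcorner \alpha_P \in \bar{S} \urcorner, \ulcorner \alpha_P \in \bar{D} \urcorner)$$

$$= 0 + 41.6 = 41.6.$$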


This is as it should be, and corresponds to the application of the stochastic sure-thing principle described by Harper and Gibbard.
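The two stochastic utilities can be recomputed from the frequency table. The sketch below is illustrative only. It assumes the eight table entries are read, with the lost overbars restored, in the order C∩S∩D, C∩S∩not-D, C∩not-S∩D, C∩not-S∩not-D, followed by the same four cells for not-C, and it assumes desirabilities of 10, 100, 0 and 80, which are inferred from the printed products (4.8, 52, 0, 41.6) rather than stated in this section. The names `FREQ`, `DES`, `freq_deposed_given` and `stochastic_utility` are hypothetical.

```python
# Joint frequencies in P, keyed by (charismatic, severe, deposed).
# The overbar placement is an assumed reading of the table.
FREQ = {
    (True, True, True): 0.16,
    (True, True, False): 0.24,
    (True, False, True): 0.02,
    (True, False, False): 0.08,
    (False, True, True): 0.08,
    (False, True, False): 0.02,
    (False, False, True): 0.22,
    (False, False, False): 0.18,
}

# Desirabilities keyed by (severe, deposed); inferred values, not given in the text.
DES = {(True, True): 10, (True, False): 100, (False, True): 0, (False, False): 80}


def freq_deposed_given(severe):
    """Frequency of deposition among the kings in P who are (or are not) severe."""
    total = sum(f for (c, s, d), f in FREQ.items() if s == severe)
    deposed = sum(f for (c, s, d), f in FREQ.items() if s == severe and d)
    return deposed / total


def stochastic_utility(severe):
    """Expected desirability of the act for a random member of P."""
    p = freq_deposed_given(severe)
    return DES[(severe, True)] * p + DES[(severe, False)] * (1 - p)


print(round(stochastic_utility(True), 1))   # severe
print(round(stochastic_utility(False), 1))  # not severe
```

On this reading the frequency of deposition is 0.48 whether or not the king is severe, which is why a stochastic sure-thing argument goes through for the general case.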

But it does not apply to Reboam, because r is not a random member of $\ulcorner P \cap S \urcorner$ with respect to D, relative to Reboam's own corpus. This is (as in

Solomon's case) a result of the fact that he is deliberating and choosing. Let

the corpus that reflects the course of his deliberations be $K_R$ and C. The

probability that Reboam is charismatic is 1/2, since before he chooses

whether or not to be severe, he is just another king, and half of them are

charismatic. The same is true for us.

Let Reboam now choose to be severe. For us, he is a random member of the kings who thus choose, of which 80% are charismatic. For Reboam this is

not so: that he chooses to be severe or not to be severe has no effect on his

degree of belief in his charisma. He knows (given the story that Gibbard and

Harper tell) that his choice cannot affect his degree of charisma. We can express this by saying that r belongs to a subclass of $\ulcorner P \cap S \urcorner$ (or of $\ulcorner P \cap \bar{S} \urcorner$), even when $\ulcorner r \in S \urcorner$ is known (when $\ulcorner r \in \bar{S} \urcorner$ is known), in which the frequency

of C is the same as it is in P in general: namely, the class in which S is freely chosen. Surely Reboam regards his choice as free, even if we do not (else he

would not be deliberating). Furthermore, that it is freely chosen implies that

it cannot be taken as evidence of a prior state: that Reboam freely chooses $\ulcorner r \in S \urcorner$ implies that the epistemic probability that he is charismatic is unchanged - that $\ulcorner r \in C \urcorner$ is epistemically independent of $\ulcorner r \in S \urcorner$. Of course we don't know what proportion of $\ulcorner P \cap S \urcorner$ represents the free choice of

S - but we do know that however small this class may be, it must contain the same proportion of C as P itself, else epistemic independence must fail.

Let the class of instances in which S or $\bar{S}$ is freely chosen be PF; then Reboam will regard his situation r as a random member of PF, where he knows that the frequency of C among both $\ulcorner PF \cap S \urcorner$ and $\ulcorner PF \cap \bar{S} \urcorner$ is 1/2. He also knows that the frequency of deposition given C and S, given C and $\bar{S}$, given $\bar{C}$ and S, and given $\bar{C}$ and $\bar{S}$ is as stipulated earlier: F has no known bearing on these frequencies. Therefore, we may compute:

$$\mathrm{Prob}(K_R \text{ and } \ulcorner r \in S \urcorner, \ulcorner r \in D \urcorner) = \mathrm{Prob}(K_R \text{ and } \ulcorner r \in S \urcorner, \ulcorner r \in (D \cap C) \cup (D \cap \bar{C}) \urcorner).$$

Since $\mathrm{RAN}(r, \ulcorner PF \cap S \urcorner, D, K_R \text{ and } \ulcorner r \in S \urcorner)$, we may compute:


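$$S(PF \cap S, (D \cap C) \cup (D \cap \bar{C}), 0.6, 0.6);$$

$$S(PF \cap S, (\bar{D} \cap C) \cup (\bar{D} \cap \bar{C}), 0.4, 0.4);$$

$$S(PF \cap \bar{S}, (D \cap C) \cup (D \cap \bar{C}), 0.375, 0.375);$$

$$S(PF \cap \bar{S}, (\bar{D} \cap C) \cup (\bar{D} \cap \bar{C}), 0.625, 0.625).$$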

Thus: $U(\ulcorner r \in S \urcorner, K_R) = 0.6 \times 10 + 0.4 \times 100 = 46$;

$U(\ulcorner r \in \bar{S} \urcorner, K_R) = 0.375 \times 0 + 0.625 \times 80 = 50$.
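A minimal sketch of this epistemic computation follows. It assumes that the frequency of charisma in the free-choice class PF is 1/2 under either act, that the conditional frequencies of deposition are those obtained from the frequency table (0.4 given C and S, 0.8 given not-C and S, 0.2 given C and not-S, 0.55 given not-C and not-S), and that the desirabilities are 10, 100, 0 and 80. The conditional frequencies and desirabilities are inferred from the figures in the text, not quoted from it, and all names are hypothetical.

```python
# Frequency of deposition keyed by (charismatic, severe); assumed to carry over to PF.
P_DEPOSED = {
    (True, True): 0.40,
    (False, True): 0.80,
    (True, False): 0.20,
    (False, False): 0.55,
}

# Frequency of charisma in PF, the same whichever act is freely chosen (assumption).
P_CHARISMA = 0.5

# Desirabilities keyed by (severe, deposed); inferred values.
DES = {(True, True): 10, (True, False): 100, (False, True): 0, (False, False): 80}


def epistemic_prob_deposed(severe):
    """Mix the conditional deposition frequencies with weight 1/2 on charisma."""
    return (P_CHARISMA * P_DEPOSED[(True, severe)]
            + (1 - P_CHARISMA) * P_DEPOSED[(False, severe)])


def epistemic_utility(severe):
    """Expected desirability of the act relative to the agent's own corpus."""
    p = epistemic_prob_deposed(severe)
    return DES[(severe, True)] * p + DES[(severe, False)] * (1 - p)


print(round(epistemic_utility(True), 2))   # severe
print(round(epistemic_utility(False), 2))  # not severe
```

Under these assumptions abstaining from severity comes out ahead (50 against 46), the reverse of the ordering obtained for a random member of P.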

In the former case we could apply a sure-thing principle because the probabilities of the outcomes of the acts were independent of the acts. Here matters are a little more complicated: there is a circumstance (C or $\bar{C}$) independent of the act which serves as a basis for the application of the sure-thing principle. Note that C is not independent of the act under the stochastic analysis, but that D is; under the epistemic analysis C is independent of the act but D is not. The distinction between purely epistemic utility and stochastic utility, which reflects the difference between purely epistemic probability and (indefinite) stochastic probability, provides for precisely the distinction that Harper and Gibbard want to draw attention to, but requires neither a peculiar counterfactual, nor a variety of sure-thing principles.
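The two independence claims can be checked against the figures of this section. The worked equations below are a sketch: they assume the frequency table is read with its overbars in the order C∩S∩D, C∩S∩not-D, C∩not-S∩D, C∩not-S∩not-D and then the same four cells for not-C, and the symbols $f$ (relative frequency in P) and $f_{PF}$ (relative frequency in the free-choice class) are notation introduced here for illustration.

```latex
% Stochastic analysis: frequencies in P
f(C \mid S) = \frac{0.16 + 0.24}{0.50} = 0.8 \neq 0.5 = f(C),
\qquad
f(D \mid S) = \frac{0.16 + 0.08}{0.50} = 0.48 = \frac{0.02 + 0.22}{0.50} = f(D \mid \bar{S}).

% Epistemic analysis: frequencies in the free-choice class PF
f_{PF}(C \mid S) = f_{PF}(C \mid \bar{S}) = 0.5,
\qquad
f_{PF}(D \mid S) = 0.5(0.4) + 0.5(0.8) = 0.6 \neq 0.375 = 0.5(0.2) + 0.5(0.55) = f_{PF}(D \mid \bar{S}).
```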

8. NEWCOMB

Let s be the subject's situation, N the set of Newcomb situations, M the set in

which there is a million dollars in the right-hand box, and R the set in which

the right-hand box alone is taken. We had best forget about the demon or the

psychologist, because that introduces a competitive element which might

distort our intuitions. We merely suppose that we and the subject know that all (almost all) R's are M's and that none (almost none) of the $\bar{R}$'s are M's. The

stochastic expectations are easily computed:

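$$U(\ulcorner \alpha_N \in R \urcorner, K_N) = \$10^6 \times \mathrm{Prob}(K_N \text{ and } \ulcorner \alpha_N \in R \urcorner, \ulcorner \alpha_N \in M \urcorner) \approx \$10^6;$$

$$U(\ulcorner \alpha_N \in \bar{R} \urcorner, K_N) = \$10^6 \times \mathrm{Prob}(K_N \text{ and } \ulcorner \alpha_N \in \bar{R} \urcorner, \ulcorner \alpha_N \in M \urcorner) + \$1000 \approx \$1000.$$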

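But the subject of the experiment is not concerned with stochastic probability, but with proper epistemic probability:

$$U(\ulcorner s \in R \urcorner, K_N) = \$10^6 \times \mathrm{Prob}(K_N \text{ and } \ulcorner s \in R \urcorner, \ulcorner s \in M \urcorner);$$

$$U(\ulcorner s \in \bar{R} \urcorner, K_N) = \$10^6 \times \mathrm{Prob}(K_N \text{ and } \ulcorner s \in \bar{R} \urcorner, \ulcorner s \in M \urcorner) + \$1000.$$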

We need to evaluate these probabilities. Now since to make $\ulcorner s \in R \urcorner$ or $\ulcorner s \in \bar{R} \urcorner$ true is in the power of the subject (and if we want to talk about a demon or a psychologist, it is not within her power to alter the truth of $\ulcorner s \in M \urcorner$ at the time of the subject's decision), he must regard his situation s as falling in a subset FN of N in which M is statistically independent of R; he may know relatively little about this frequency, say merely that it lies between 0.1 and 0.5. This is to say that in $K_N$ are the statements:

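$$\ulcorner S(N, M, 0.1, 0.5) \urcorner;$$

$$\ulcorner S(FN \cap R, M, 0.1, 0.5) \urcorner;$$

$$\ulcorner S(FN \cap \bar{R}, M, 0.1, 0.5) \urcorner.$$

We may suppose that $\mathrm{RAN}(s, \ulcorner FN \cap R \urcorner, M, K_N \text{ and } \ulcorner s \in R \urcorner)$ and $\mathrm{RAN}(s, \ulcorner FN \cap \bar{R} \urcorner, M, K_N \text{ and } \ulcorner s \in \bar{R} \urcorner)$. Therefore $\mathrm{Prob}(K_N \text{ and } \ulcorner s \in R \urcorner, \ulcorner s \in M \urcorner) = \mathrm{Prob}(K_N \text{ and } \ulcorner s \in \bar{R} \urcorner, \ulcorner s \in M \urcorner) = (0.1, 0.5)$, and $U(\ulcorner s \in R \urcorner, K_N) = (100\,000,\ 500\,000)$, $U(\ulcorner s \in \bar{R} \urcorner, K_N) = (101\,000,\ 501\,000)$.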

Note that we obtain the same result if the subject is totally ignorant of the frequency with which a million dollars is put in the right-hand box; let $K^*$ contain:

$$\ulcorner S(N, M, 0, 1) \urcorner;$$

$$\ulcorner S(FN \cap R, M, 0, 1) \urcorner;$$

$$\ulcorner S(FN \cap \bar{R}, M, 0, 1) \urcorner;$$

$$U(\ulcorner s \in R \urcorner, K^*) = (0,\ 10^6);$$

$$U(\ulcorner s \in \bar{R} \urcorner, K^*) = (1000,\ 10^6 + 1000).$$

Finally, the subject may consider a set of rational corpora $K_{Np}$ which are like his except that they contain the knowledge of the frequency with which M occurs in N; $K_{Np}$ contains:

$$\ulcorner S(N, M, p, p) \urcorner;$$

$$\ulcorner S(FN \cap R, M, p, p) \urcorner;$$

$$\ulcorner S(FN \cap \bar{R}, M, p, p) \urcorner;$$

$$U(\ulcorner s \in R \urcorner, K_{Np}) = p \times 10^6;$$

$$U(\ulcorner s \in \bar{R} \urcorner, K_{Np}) = p \times 10^6 + 1000.$$

In any event, we see that taking both boxes is more profitable than taking

one box.
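The three cases (bounds 0.1 to 0.5, total ignorance, and a known frequency p) share one piece of arithmetic, sketched below. The sketch assumes that an interval-valued probability for M yields an interval-valued utility by evaluating the payoff at each endpoint, and that the frequency of M is the same whichever act is chosen; the function name `newcomb_utilities` and the constants are hypothetical.

```python
MILLION = 10**6   # contents of the right-hand box when M holds
BONUS = 1000      # extra amount obtained by taking both boxes


def newcomb_utilities(p_low, p_high):
    """Return (one_box, two_box) utility intervals for a frequency of M in [p_low, p_high]."""
    one_box = (p_low * MILLION, p_high * MILLION)
    two_box = (p_low * MILLION + BONUS, p_high * MILLION + BONUS)
    return one_box, two_box


for bounds in [(0.1, 0.5), (0.0, 1.0), (0.3, 0.3)]:
    one_box, two_box = newcomb_utilities(*bounds)
    print(bounds, one_box, two_box)
```

The two intervals overlap in the first two cases, so the comparison is best read endpoint by endpoint: for any single frequency of M inside the bounds, taking both boxes is worth exactly 1000 more than taking one.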

9. THE ROAD TO DAMASCUS

This is strikingly different from Newcomb's problem because of the competitive element that enters into it. We suppose that Death is trying specifically to outwit the traveller. (If there is an appointment book made up in advance, the previous analyses go through and the probability of death for the traveller is the same whatever he does: it is epistemically independent of his decision.) Furthermore, we suppose that Death has psychic powers, so that the decision to go to Aleppo will, with high probability, be psychically perceived by Death and the decision to go to Damascus will, with high probability, be perceived by Death, in either case with unfortunate results. What is the traveller to do?

Let t be his situation, F the set of such situations, A the subset in which the traveller decides to go to Aleppo, $\bar{A}$ the subset in which he decides to go to Damascus, $D_A$ the set in which Death seeks him in Aleppo and $D_{\bar{A}}$ the set in which Death seeks him in Damascus. We have supposed that the traveller's corpus $K_t$ contains:

$$\ulcorner S(F \cap A, D_A, 0.9, 0.9) \urcorner;$$

$$\ulcorner S(F \cap \bar{A}, D_{\bar{A}}, 0.9, 0.9) \urcorner.$$

That is, due to Death's psychic powers the frequency with which Death seeks in Aleppo a traveller who has decided to go to Aleppo is high, and similarly for Damascus. Thus:

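$$U(\ulcorner \alpha_F \in A \urcorner, K_t) = \mathrm{Des}(\ulcorner \alpha_F \in A \wedge \alpha_F \in D_A \urcorner)\,\mathrm{Prob}(K_t \text{ and } \ulcorner \alpha_F \in A \urcorner, \ulcorner \alpha_F \in D_A \urcorner) + \mathrm{Des}(\ulcorner \alpha_F \in A \wedge \alpha_F \in D_{\bar{A}} \urcorner)\,\mathrm{Prob}(K_t \text{ and } \ulcorner \alpha_F \in A \urcorner, \ulcorner \alpha_F \in D_{\bar{A}} \urcorner)$$

$$= -100 \times 0.9 = -90.$$

Similarly,

$$U(\ulcorner \alpha_F \in \bar{A} \urcorner, K_t) = -90.$$

Since Death is psychic, the same holds for the specific situation t:

$$U(\ulcorner t \in A \urcorner, K_t) = -90;$$

$$U(\ulcorner t \in \bar{A} \urcorner, K_t) = -90.$$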

But the traveller has a way out: since Death is depending on his psychic

powers to perceive the traveller's decision, the traveller can improve his

chances by not deciding either to go to Damascus or to go to Aleppo. How?

By tossing a coin! We have not supposed that Death is totally prescient, but

merely that he can, with high probability, read the traveller's mind. Let T be

the statement that the traveller goes to Aleppo or Damascus according as his

toss lands heads or tails; TF the set of F situations in which a coin toss is

used. Then

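$$U(T, K_t) = \mathrm{Des}(\ulcorner T \wedge t \in A \wedge t \in D_A \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in A \wedge t \in D_A \urcorner) + \mathrm{Des}(\ulcorner T \wedge t \in \bar{A} \wedge t \in D_A \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in \bar{A} \wedge t \in D_A \urcorner) + \mathrm{Des}(\ulcorner T \wedge t \in A \wedge t \in D_{\bar{A}} \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in A \wedge t \in D_{\bar{A}} \urcorner) + \mathrm{Des}(\ulcorner T \wedge t \in \bar{A} \wedge t \in D_{\bar{A}} \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in \bar{A} \wedge t \in D_{\bar{A}} \urcorner).$$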

We suppose that in general Death spends half his time in Damascus and half

his time in Aleppo; we have already supposed that he has no foreknowledge

of the outcomes of coin tosses. This means that we may plausibly suppose the

traveller to have the following statistical knowledge:

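$$\ulcorner S(TF, A \cap D_A, 1/4, 1/4) \urcorner;$$

$$\ulcorner S(TF, A \cap D_{\bar{A}}, 1/4, 1/4) \urcorner;$$

$$\ulcorner S(TF, \bar{A} \cap D_A, 1/4, 1/4) \urcorner;$$

$$\ulcorner S(TF, \bar{A} \cap D_{\bar{A}}, 1/4, 1/4) \urcorner.$$

So far as the traveller is concerned,

$$\mathrm{RAN}(t, TF, \ulcorner A \cap D_A \urcorner, K_t),$$

$$\mathrm{RAN}(t, TF, \ulcorner A \cap D_{\bar{A}} \urcorner, K_t),$$

$$\mathrm{RAN}(t, TF, \ulcorner \bar{A} \cap D_A \urcorner, K_t),$$

$$\mathrm{RAN}(t, TF, \ulcorner \bar{A} \cap D_{\bar{A}} \urcorner, K_t).$$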

We may thus compute $U(T, K_t) = -50$. The situation is not only stable,

but the traveller has improved his expected utility.
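The comparison between deciding and tossing a coin can be sketched numerically. The sketch assumes a desirability of -100 for meeting Death and 0 for escaping him (values inferred from the totals -90 and -50, not stated outright), a 0.9 chance that Death perceives a deliberate decision, and four equally frequent combinations of destination and Death's whereabouts under the coin toss. All names are hypothetical.

```python
DES_MEET = -100   # assumed desirability of meeting Death
DES_ESCAPE = 0    # assumed desirability of escaping

# Under the coin toss: frequency of each (traveller's city, Death's city) pair in TF.
COIN_TOSS_CELLS = {
    ("Aleppo", "Aleppo"): 0.25,
    ("Aleppo", "Damascus"): 0.25,
    ("Damascus", "Aleppo"): 0.25,
    ("Damascus", "Damascus"): 0.25,
}


def utility_of_deciding(p_perceived=0.9):
    """Utility of deliberately choosing either city when Death reads the decision."""
    return DES_MEET * p_perceived + DES_ESCAPE * (1 - p_perceived)


def utility_of_coin_toss():
    """Utility of letting a coin pick the city, which Death cannot foresee."""
    return sum(
        freq * (DES_MEET if traveller == death else DES_ESCAPE)
        for (traveller, death), freq in COIN_TOSS_CELLS.items()
    )


print(utility_of_deciding())    # deliberate choice of either city
print(utility_of_coin_toss())   # randomized choice
```

The improvement comes entirely from removing the 0.9 link between the traveller's state of mind and Death's location; the desirabilities themselves are untouched.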

10. CONCLUSIONS

What is the upshot of all this? First of all, that an analysis of these seemingly problematic decision situations can be provided in which no use is made of counterfactuals, and in particular in which no use is made of Stalnaker's implausible principle that one or the other of

If A were true, B would be true, or

If A were true, B would be false,

holds. Although we may want the object language to contain an intensional statistical predicate, there is no need for this to be the case, and we have little reason, in most non-theoretical contexts, to introduce such a predicate.

In the metalanguage, similarly, everything is extensionalized: we do not

talk about what Solomon's corpus would be if he were to decide to make

$\ulcorner s \in B \urcorner$ true; we can actually refer to a certain set of sentences (namely $K_S$ and $\ulcorner s \in B \urcorner$), which exists at least as unproblematically as any other

set. (This may not be very unproblematic.)

We need only one decision rule, that of maximizing expected proper

epistemic utility. (This rule breaks down in some situations, due to the

fact that utilities should be regarded as interval valued, which fact in turn

is a consequence of the fact that probabilities are interval valued. But this

refinement does not enter into any of the examples discussed in this paper.)

In those cases in which, in general, a proposition is an indicator of a

property known to have a bearing on the outcome of the act of making that

proposition true, we can in general suppose that the agent, insofar as he

regards himself as an agent, must regard himself in that act as belonging to a

subset of the general class of situations, in which subset the property in


question is properly epistemically independent of the proposition that that agent is contemplating making true.

We can define expected stochastic utility, with which some of the examples have been concerned, but this is of interest to an agent only when it has the same value as expected proper epistemic utility - that is, when the agent's

situation is a random member of the class of situations with which the sto-

chastic utility is concerned.

We can derive a sure thing principle from our decision rule; it is simply a

consequence of that rule, and has the form:

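If $\ulcorner s \in R \urcorner$ is properly epistemically independent of $\ulcorner s \in C \urcorner$, relative to K, and $\ulcorner s \in \bar{R} \urcorner$ is properly epistemically independent of $\ulcorner s \in C \urcorner$, relative to K, and $U(T, K \text{ and } \ulcorner s \in C \urcorner)$ is greater than $U(\sim T, K \text{ and } \ulcorner s \in C \urcorner)$, and $U(T, K \text{ and } \ulcorner s \in \bar{C} \urcorner)$ is greater than $U(\sim T, K \text{ and } \ulcorner s \in \bar{C} \urcorner)$, then $U(T, K)$ is greater than $U(\sim T, K)$.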

But this is not a very interesting principle - it merely, sometimes, simplifies

some calculations.
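Why the principle follows from the decision rule can be sketched in one line. The derivation below is illustrative and rests on assumptions not made in the text: that the relevant probabilities are point-valued, that they obey the ordinary law of total probability, and that independence means the probability $p$ of $\ulcorner s \in C \urcorner$ is the same whether T or its negation is made true. The symbol $p$ is introduced here for the purpose.

```latex
U(T, K)
  = p \, U(T, K \text{ and } \ulcorner s \in C \urcorner)
    + (1 - p) \, U(T, K \text{ and } \ulcorner s \in \bar{C} \urcorner)
  > p \, U(\sim T, K \text{ and } \ulcorner s \in C \urcorner)
    + (1 - p) \, U(\sim T, K \text{ and } \ulcorner s \in \bar{C} \urcorner)
  = U(\sim T, K).
```

The strict inequality holds because each term on the left is at least as large as the corresponding term on the right, and at least one of the weights $p$ and $1 - p$ is positive.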

We can make sense of situations in which the analysis of Harper and

Gibbard yields unstable conditional utilities by means of randomization.

According to the analysis provided by Harper and Gibbard, U-maximization sometimes leads to instability, and V-maximization simply isn't defined once a person knows what he is going to do. Neither of these problems besets the analysis offered here.

Finally, the semantics for all that is employed here requires the use of only

one possible world, unless we want to give an intensional interpretation to

some of the statistical statements involved. (For example, we might suggest that the traveller toss a fair coin, which would bring in a number of theoretical

considerations.) One world, classical sentential connectives, one notion of epistemic probability and one of utility (of which, as special cases, we may distinguish between stochastic and properly epistemic), one principle of decision, and yet we can respond to the most common intuitions about the cases examined. We have to give up the general Bayesian principle of condition-

alization, but I would argue that even it has its advantages.

The University of Rochester


BIBLIOGRAPHY

[1] Gibbard, Allen and Harper, William L., 'Counterfactuals and Two Kinds of Expected Utility', in Hooker, Leach and McClennen (eds.), Foundations and Applications of Decision Theory, Vol. I, D. Reidel Publishing Co., Dordrecht, Holland, 1977.

[2] Kyburg, Henry E., Jr., The Logical Foundations of Statistical Inference, D. Reidel Publishing Co., Dordrecht, Holland, 1974.

[3] Levi, Isaac, 'Newcomb's Many Problems', Theory and Decision 6 (1975), 161-175.