THE TWO-ENVELOPE PARADOX AND THE FOUNDATIONS OF RATIONAL DECISION THEORY

Terry Horgan

University of Memphis

You are given a choice between two envelopes. You are told, reliably, that each envelope has some money in it—some whole number of dollars, say—and that one envelope contains twice as much money as the other. You don’t know which has the higher amount and which has the lower. You choose one, but are given the opportunity to switch to the other. Here is an argument that it is rationally preferable to switch: Let x be the quantity of money in your chosen envelope. Then the quantity in the other is either 1/2x or 2x, and these possibilities are equally likely. So the expected utility of switching is 1/2(1/2x) + 1/2(2x) = 1.25x, whereas that for sticking is only x. So it is rationally preferable to switch.

There is clearly something wrong with this argument. For one thing, it is obvious that neither choice is rationally preferable to the other: it’s a tossup. For another, if you switched on the basis of this reasoning, then the same argument could immediately be given for switching back; and so on, indefinitely. For another, there is a parallel argument for the rational preferability of sticking, in terms of the quantity y in the other envelope. But the problem is to provide an adequate account of how the argument goes wrong. This is the two-envelope paradox.

In an earlier paper (Horgan 2000) I offered a diagnosis of the paradox. I argued that the flaw in the argument is considerably more subtle and interesting than is usually believed, and that an adequate diagnosis reveals important morals about both probability and the foundations of decision theory. One moral is that there is a kind of expected utility, not previously noticed as far as I know, that I call nonstandard expected utility. I proposed a general normative principle governing the proper application of nonstandard expected utility in rational decision-making. But this principle is inadequate in several respects, some of which I acknowledged in a note added in press and some of which I have meanwhile discovered. The present paper undertakes the task of formulating a more adequate general normative principle for nonstandard expected utility. After preliminary remarks in section 1, and a summary in section 2 of the principal claims and ideas in Horgan 2000, I take up the business at hand in sections 3-6.

1.         Preliminaries.

To begin with, the paradoxical argument is an expected-utility argument. In decision theory, the notion of expected utility is commonly articulated in something like the following way (e.g., Jeffrey 1983). Let acts A1,…,Am be open to the agent, and let the agent know this. Let states S1,…,Sn be mutually exclusive and jointly exhaustive possible states of the world, and let the agent know this. For each act Ai and each state Sj, let the agent know that if Ai were performed and Sj obtained, then the outcome would be Oij, and let the agent assign to each outcome Oij a desirability DOij. These conditions define a matrix formulation of a decision problem. If the states are independent of the acts—probabilistically, counterfactually, and causally—then the expected utility of each act Ai is this:

U(Ai) = Σj pr(Sj)×DOij

I.e., the expected utility of Ai is the weighted sum of the desirabilities of the respective possible outcomes of Ai, as weighted by the probabilities of the respective possible states S1,…,Sn.
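The definition can be illustrated with a short Python sketch. The function name `expected_utility` and the sample numbers are hypothetical choices made only for illustration; they are not drawn from the text. The sketch assumes, as the definition does, that the states are independent of the acts, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction

def expected_utility(probabilities, desirabilities):
    """Return the sum over states Sj of pr(Sj) x DOij for a single act Ai."""
    if len(probabilities) != len(desirabilities):
        raise ValueError("need exactly one desirability per state")
    if sum(probabilities) != 1:
        raise ValueError("state probabilities must sum to 1")
    return sum(p * d for p, d in zip(probabilities, desirabilities))

# Hypothetical two-state problem; the numbers are illustrative assumptions.
probs = [Fraction(1, 4), Fraction(3, 4)]
u_a1 = expected_utility(probs, [8, 0])  # act A1: outcomes worth 8 and 0
u_a2 = expected_utility(probs, [4, 4])  # act A2: outcome worth 4 either way
print(u_a1, u_a2)  # 2 4
```

On these made-up numbers, A2 has the higher expected utility even though A1 has the single best outcome, because the weighting is by the probabilities of the states.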

Second, the conditions characterizing a matrix formulation of a decision problem are apparently satisfied in the two-envelope situation, in such a way that the paradoxical argument results by applying the definition of expected utility to the relevant matrix. The states are characterized in terms of x, the quantity (whatever it is) in the agent’s chosen envelope. Letting the chosen envelope be M (for ‘mine’) and the non-chosen one be O (for ‘other’), we have two possible states of nature, two available acts, and outcomes for each act under each state, expressible this way:

                    O contains 1/2x          O contains 2x

Stick               Get x                    Get x

Switch              Get 1/2x                 Get 2x

Matrix 1

Each of the two states of nature evidently has probability 1/2. So, letting the desirability of the respective outcomes be identical to their numerical values, we can plug into our definition of expected utility:

U(Stick) = [pr(O contains 1/2x)×D(Get x)] + [pr(O contains 2x)×D(Get x)]

= 1/2×D(Get x) + 1/2×D(Get x)

= 1/2x + 1/2x

= x

U(Switch) = [pr(O contains 1/2x)×D(Get 1/2x)] + [pr(O contains 2x)×D(Get 2x)]

= 1/2×D(Get 1/2x) + 1/2×D(Get 2x)

= 1/2×1/2x + 1/2×2x

= 1/4x + x

= 5/4x

Third, the operative notion of probability, in the paradoxical argument and in decision theory generally, is epistemic in the following important sense: it is tied to the agent’s total available information. So I will henceforth call it ‘epistemic probability’. Although I remain neutral about the philosophically important question of the nature of epistemic probability, lessons that emerge from the two-envelope paradox yield some important constraints on an adequate answer to that question. [1]

Fourth, below it will be useful to illustrate various points by reference to the following special case of the two-envelope decision situation, which I will call the urn case. Here we stipulate that the agent knows that the dollar-amounts of money in the two envelopes were determined by randomly choosing a slip of paper from an urn full of such slips; that on each slip of paper in the urn was written an ordered pair of successive numbers from the set {1,2,4,8,16,32}; that there was an equal number of slips in the urn containing each of these ordered pairs; and that the first number on the randomly chosen slip went into the envelope the agent chose and the second went into the other one. Under these conditions, the acts, states, and outcomes are represented by the following matrix:

State                                 Stick        Switch

M contains 1 and O contains 2         Get 1        Get 2
M contains 2 and O contains 1         Get 2        Get 1
M contains 2 and O contains 4         Get 2        Get 4
M contains 4 and O contains 2         Get 4        Get 2
M contains 4 and O contains 8         Get 4        Get 8
M contains 8 and O contains 4         Get 8        Get 4
M contains 8 and O contains 16        Get 8        Get 16
M contains 16 and O contains 8        Get 16       Get 8
M contains 16 and O contains 32       Get 16       Get 32
M contains 32 and O contains 16       Get 32       Get 16

Matrix 2

Since each of the 10 state-specifications in Matrix 2 has epistemic probability 1/10,

U(Stick) = 1/10(1+2+2+4+4+8+8+16+16+32) = 9.3

U(Switch) = 1/10(2+1+4+2+8+4+16+8+32+16) = 9.3
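These two figures can be checked mechanically. The following Python sketch rebuilds the ten states of Matrix 2 from the set {1,2,4,8,16,32} and applies the definition of expected utility with probability 1/10 per state. The names `AMOUNTS`, `states`, `u_stick`, and `u_switch` are illustrative choices, not notation from the text; exact fractions are used, so 9.3 appears as 93/10.

```python
from fractions import Fraction

AMOUNTS = [1, 2, 4, 8, 16, 32]

# Each state is an ordered pair (quantity in M, quantity in O) of successive
# numbers from AMOUNTS, taken in both orders.
states = []
for lower, higher in zip(AMOUNTS, AMOUNTS[1:]):
    states.append((lower, higher))
    states.append((higher, lower))

pr = Fraction(1, len(states))  # equal epistemic probability for each state

u_stick = sum(pr * m for m, o in states)   # sticking yields the quantity in M
u_switch = sum(pr * o for m, o in states)  # switching yields the quantity in O

print(len(states), u_stick, u_switch)  # 10 93/10 93/10
```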

Fifth, below I will occasionally refer to the following variant of the original two-envelope decision situation. You are given an envelope M, and there is another envelope O in front of you. You are reliably informed that M has a whole-dollar amount of money in it that was chosen by a random process; that thereafter a fair coin was flipped; and that if the coin came up heads then twice the quantity in M was put into O, whereas if the coin came up tails then half the quantity in M was put into O. I will call this the coin-flipping situation, in contrast to the original situation that generates the two-envelope paradox. In this coin-flipping situation, you ought rationally to switch—as has been correctly observed by those who have discussed it (e.g., Cargile 1992, 212-13, Jackson et al. 1994, 44-45, and McGrew et al. 1997, 29).

Finally, it also will be useful to have before us the following special case of the coin-flipping situation, which I will call the coin-flipping urn case. Here we stipulate that the agent knows that the whole-dollar amount in his own envelope M was determined by randomly choosing a slip of paper from an urn full of such slips; that on each slip in the urn was written one of the numbers in the set {2,4,8,16,32}; and that there was an equal number of slips in the urn containing each of these numbers. The agent also knows that after the quantity in M was thus determined, the quantity in O was then determined by a fair coin-flip, with twice the quantity in M going into O if the coin turned up heads, and half the quantity in M going into O if the coin turned up tails. Under these conditions, the expected utilities are calculated on the basis of the following matrix:

State                                 Stick        Switch

M contains 2 and O contains 1         Get 2        Get 1
M contains 2 and O contains 4         Get 2        Get 4
M contains 4 and O contains 2         Get 4        Get 2
M contains 4 and O contains 8         Get 4        Get 8
M contains 8 and O contains 4         Get 8        Get 4
M contains 8 and O contains 16        Get 8        Get 16
M contains 16 and O contains 8        Get 16       Get 8
M contains 16 and O contains 32       Get 16       Get 32
M contains 32 and O contains 16       Get 32       Get 16
M contains 32 and O contains 64       Get 32       Get 64

Matrix 3

Since the probability is 1/10 for each of the states in Matrix 3, the expected utilities are

U(Stick) = 1/10(2+2+4+4+8+8+16+16+32+32) = 1/10(124) = 12.4

U(Switch) = 1/10(1+4+2+8+4+16+8+32+16+64) = 1/10(155) = 15.5
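As a check on these two sums, the following Python sketch generates the ten states of the coin-flipping urn case directly from the stipulated procedure (a slip drawn from {2,4,8,16,32}, then a fair coin) and computes both expected utilities. The variable names are illustrative assumptions. The sketch also reports the ratio of the two values, which comes out at 5/4.

```python
from fractions import Fraction

M_AMOUNTS = [2, 4, 8, 16, 32]
half = Fraction(1, 2)

# Each state pairs a quantity in M with the coin's result: tails puts half of
# M's quantity into O, heads puts twice M's quantity into O.
states = []
for m in M_AMOUNTS:
    states.append((m, m // 2))  # tails
    states.append((m, 2 * m))   # heads

pr = Fraction(1, len(M_AMOUNTS)) * half  # 1/5 for the slip times 1/2 for the coin

u_stick = sum(pr * m for m, o in states)   # sticking yields the quantity in M
u_switch = sum(pr * o for m, o in states)  # switching yields the quantity in O

print(u_stick, u_switch, u_switch / u_stick)  # 62/5 31/2 5/4
```

Here 62/5 is 12.4 and 31/2 is 15.5, in agreement with the figures above.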

2.         Diagnosis and Theoretical Implications.

Discussions of the two-envelope paradox (e.g., Nalebuff 1989, Cargile 1992, Castell and Batens 1994, Jackson et al. 1994, Broome 1995, Arntzenius and McCarthy 1997, Scott and Scott 1997, Chalmers unpublished) typically claim that there is something wrong with the probability assignments in the paradoxical argument—although there are differences of opinion about exactly how the probabilities are supposed to be mistaken. I disagree. Consider the urn case, for example. On my construal of the paradoxical reasoning, the symbol ‘x’ goes proxy for a rigid definite description, which we can render as ‘the actual quantity in M’ (where ‘actual’ is construed as a rigidifying operator). With respect to the urn case, the following list of statements constitutes a fine-grained specification—expressed in terms of the rigid singular term ‘the actual quantity in M’—of the epistemic possibilities concerning the contents of envelopes M and O:

1.         The actual quantity in M = 1 & O contains 2

2.         The actual quantity in M = 2 & O contains 1

3.         The actual quantity in M = 2 & O contains 4

4.         The actual quantity in M = 4 & O contains 2

5.         The actual quantity in M = 4 & O contains 8

6.         The actual quantity in M = 8 & O contains 4

7.         The actual quantity in M = 8 & O contains 16

8.         The actual quantity in M = 16 & O contains 8

9.         The actual quantity in M = 16 & O contains 32

10.        The actual quantity in M = 32 & O contains 16

Each statement on this list has epistemic probability 1/10. Hence, since the statements are mutually exclusive, the disjunction of the five even-numbered statements on the list has probability 1/2, and the disjunction of the five odd-numbered ones also has probability 1/2. But the epistemic probability of the statement

O contains 1/2(the actual quantity in M)

is just the epistemic probability of the disjunction of the even-numbered statements on the list, since each even-numbered disjunct specifies one of the epistemically possible ways that this statement could be true. Likewise, the epistemic probability of the statement

O contains 2(the actual quantity in M)

is just the epistemic probability of the disjunction of the odd-numbered statements on the list, since each of the odd-numbered statements specifies one of the epistemically possible ways that this statement could be true. Therefore, in the urn case, the statements

pr(O contains 1/2(the actual quantity in M)) = 1/2

pr(O contains 2(the actual quantity in M)) = 1/2

are true. In both, the constituent statement within the scope of ‘pr’ expresses a coarse-grained epistemic possibility, a possibility subsuming exactly half of the ten equally probable fine-grained epistemic possibilities corresponding to the statements on the above list. Each of these two coarse-grained epistemic possibilities does indeed have probability 1/2, since each possibility is just the disjunction of half of the ten equally probable fine-grained epistemic possibilities. Moreover, these points about the urn case generalize straightforwardly to the original two-envelope situation. So, since the symbol ‘x’ in the paradoxical argument goes proxy for ‘the actual quantity in M’, the probability assignments employed in the argument are correct.

How then does the paradoxical argument go wrong? To come to grips with this question, we need to appreciate several crucial facts about epistemic probability and about the concept of expected utility—facts that the argument helps bring into focus.

First, epistemic probability is intensional, in the sense that the sentential contexts created by the epistemic-probability operator do not permit unrestricted substitution salva veritate of co-referring singular terms. Consider the urn case, for example, and suppose that (unbeknownst to the agent, of course) the actual quantity in M is 16. Then the first of the following two statements is true and the second is false, even though the second is obtained from the first by substitution of a co-referring singular term:

pr(M contains the actual quantity in M) = 1

pr(M contains 16) = 1.

Likewise, the first of the following two statements is true and the second false, even though the second is obtained from the first by substitution of a co-referring singular term:

pr(O contains 1/2(the actual quantity in M)) = 1/2

pr(O contains 8) = 1/2.

It should not be terribly surprising, upon reflection, that epistemic probability is intensional in the way belief is, since epistemic probability is tied to available information in much the same way as is rational belief. (This certainly should not be surprising to those who think that epistemic probability is just rational degree of belief.)
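The contrast between the two members of each pair can be made concrete for the urn case. The Python sketch below is an illustration under the urn-case stipulations, with variable names chosen here rather than taken from the text. It sums the probabilities of the fine-grained states that verify each statement. The canonically specified statements ‘M contains 16’ and ‘O contains 8’ each come out at 1/5, whereas the statement that O contains half the quantity in M, whatever that quantity is, comes out at 1/2.

```python
from fractions import Fraction

AMOUNTS = [1, 2, 4, 8, 16, 32]

# The ten fine-grained urn-case states, as (quantity in M, quantity in O).
states = []
for lower, higher in zip(AMOUNTS, AMOUNTS[1:]):
    states.append((lower, higher))
    states.append((higher, lower))
pr = Fraction(1, len(states))

# Canonically specified statements: a definite number is named.
pr_m_is_16 = sum(pr for m, o in states if m == 16)
pr_o_is_8 = sum(pr for m, o in states if o == 8)

# Noncanonically specified statement: O holds half of whatever M holds.
pr_o_is_half_m = sum(pr for m, o in states if 2 * o == m)

print(pr_m_is_16, pr_o_is_8, pr_o_is_half_m)  # 1/5 1/5 1/2
```

So even when the actual quantity in M happens to be 16, substituting ‘8’ for ‘1/2(the actual quantity in M)’ changes the probability value from 1/2 to 1/5.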

Second, it is important to distinguish between two ways of specifying states, outcomes, and desirabilities in matrix formulations of decision problems. On one hand are canonical specifications: the items, as so specified, are epistemically determinate for the agent, given the total available information—i.e., the agent knows what item the specification refers to. On the other hand are noncanonical specifications of states, outcomes, and desirabilities: the items, as so specified, are epistemically indeterminate for the agent. The paradoxical two-envelope argument employs noncanonical specifications of states and of outcomes/desirabilities; for, the specifications employ the symbol ‘x’ which goes proxy for the noncanonical referring expression ‘the actual quantity in M’, and the quantity referred to is epistemically indeterminate (as so specified) for the agent. (The canonical/noncanonical distinction is discussed at greater length in Horgan 2000.)

Third, it needs to be recognized that because expected utility involves epistemic probabilities, and because epistemic-probability contexts are intensional, the available acts in a given decision problem can have several different kinds of expected utility. On one hand is standard expected utility, calculated by applying the definition of expected utility to a matrix employing canonical specifications of states, outcomes, and desirabilities. On the other hand are various kinds of nonstandard expected utility, calculated by applying the definition to matrices involving various kinds of noncanonical specifications.

Take the urn version of the two-envelope problem, for instance, and suppose that (unbeknownst to the agent, of course) M contains 16 and O contains 32. The standard expected utilities, for sticking and for switching, are calculated on the basis of a matrix employing canonical state-specifications, like Matrix 2 (in section 1). As mentioned above, since each of the 10 state-specifications in Matrix 2 has epistemic probability 1/10,

U(Stick) = 1/10(1+2+2+4+4+8+8+16+16+32) = 9.3

U(Switch) = 1/10(2+1+4+2+8+4+16+8+32+16) = 9.3

On the other hand, one nonstandard kind of expected utility for the acts of sticking and switching, which I will call x-based nonstandard utility and I will denote by ‘Ux’, is calculated by letting ‘x’ go proxy for ‘the actual quantity in M’ and then applying the definition of expected utility to a matrix with noncanonical state-specifications formulated in terms of x, viz., Matrix 1 (in section 1). Since each of the two state-specifications in Matrix 1 has epistemic probability 1/2, and since (unbeknownst to the agent) M contains 16,

Ux(Stick) = x = 16

Ux(Switch) = 1.25x = 20

Another nonstandard kind of expected utility for the acts of sticking and switching, which I will call y-based nonstandard utility and I will denote by ‘Uy’, is calculated by letting ‘y’ go proxy for ‘the actual quantity in O’ and then applying the definition of expected utility to a matrix with noncanonical state-specifications formulated in terms of y, viz.,

                    M contains 1/2y          M contains 2y

Stick               Get 1/2y                 Get 2y

Switch              Get y                    Get y

Matrix 4

Since each of the two state-specifications in Matrix 4 has epistemic probability 1/2, and since (unbeknownst to the agent) O contains 32,

Uy(Stick) = 1.25y = 40

Uy(Switch) = y = 32

There is nothing contradictory about these various incompatible expected-utility values for sticking and switching in this decision problem, since they involve three different kinds of expected utility—the standard kind U, and the two nonstandard kinds Ux and Uy.
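The x-based and y-based values can be reproduced side by side. The Python sketch below evaluates Matrix 1 and Matrix 4 at the supposed actual quantities (16 in M, 32 in O). The variable names are illustrative assumptions, and the actual quantities are supplied only to evaluate x and y; by hypothesis the agent does not know them.

```python
from fractions import Fraction

half = Fraction(1, 2)
actual_m, actual_o = 16, 32  # unknown to the agent; used only to evaluate x and y

# Ux: Matrix 1, states "O contains 1/2x" and "O contains 2x", with x = actual_m.
ux_stick = half * actual_m + half * actual_m
ux_switch = half * (half * actual_m) + half * (2 * actual_m)

# Uy: Matrix 4, states "M contains 1/2y" and "M contains 2y", with y = actual_o.
uy_stick = half * (half * actual_o) + half * (2 * actual_o)
uy_switch = half * actual_o + half * actual_o

print(ux_stick, ux_switch)  # 16 20
print(uy_stick, uy_switch)  # 40 32
```

Ux ranks switching above sticking and Uy ranks sticking above switching, while the standard values computed from Matrix 2 are equal at 9.3.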

Fourth, since a distinction has emerged between standard expected utility and various types of nonstandard expected utility, it now becomes crucial to give a new, more specific, articulation of the basic normative principle in decision theory—the principle of expected-utility maximization, prescribing the selection of an action with maximum expected utility. This principle needs to be understood as asserting that rationality requires choosing an action with maximum standard expected utility. Properly interpreted, therefore, the expected-utility maximization principle says nothing whatever about the various kinds of nonstandard expected utility that an agent’s available acts might also happen to possess.

Having extracted these important morals about epistemic probability and expected utility from consideration of the paradoxical argument, we are now in a position to diagnose how the argument goes wrong. Since the kind of expected utility to which the argument appeals is Ux—i.e., x-based nonstandard expected utility—the principal flaw in the argument is its implicit reliance on a mistaken normative assumption, viz., that in the two-envelope decision problem, rationality requires Ux-maximization. Thus, given that Ux is the operative notion of expected utility in the paradoxical argument, the reasoning is actually correct up through the penultimate conclusion that the expected utilities of sticking and switching, respectively, are x and 1.25x. But the mistake is to infer from this that one ought to switch.

Equivocation is surely at work too. Since the unvarnished expression ‘the expected utility’ is employed throughout, the paradoxical argument effectively trades on the presumption that the kind of expected utility being described is standard expected utility. This presumption makes it appear that the normative basis for the final conclusion is just the usual principle that one ought rationally to perform an action with maximal expected utility. But since that principle applies to standard expected utility, whereas the argument is really employing a nonstandard kind, the argument effectively equivocates on the expression ‘the expected utility’.

In light of this diagnosis of the paradoxical argument, and the distinction that has emerged between standard expected utility and various kinds of nonstandard expected utility, important new questions emerge for the foundations of rational decision theory: Is it sometimes normatively appropriate to require the maximization of certain kinds of nonstandard expected utility? If so, then under what circumstances?

Such questions are of interest for at least two reasons. First, the use of an appropriate kind of nonstandard expected utility sometimes provides a suitable shortcut-method for deciding on a rationally appropriate action in a given decision situation. A correctly applicable kind of nonstandard expected utility typically employs a much more coarse-grained set of states, thereby simplifying calculation.

Second (and more important), maximizing a certain kind of nonstandard expected utility sometimes is rationally appropriate in a given decision situation even though the available acts lack standard expected utilities—i.e., even though the total available information does not determine a uniquely rationally eligible standard probability distribution over a suitable set of exclusive and exhaustive states of the world. (By a standard distribution of epistemic probabilities, I mean a probability distribution over states as canonically specified.) In such decision situations it is rationally appropriate to maximize a certain kind of nonstandard expected utility even though the agent’s total available information makes it rationally inappropriate to adopt any standard probability distribution, because numerous candidate-distributions all are equally rationally eligible.[2]

The two-envelope situation itself is a case in point. The official description of the situation does not provide enough information to uniquely fix a standard probability distribution that generates standard expected utilities for sticking and switching.  (In this respect, the original problem differs from our special case, the urn version.) Nevertheless, the following is a perfectly sound expected-utility argument for the conclusion that sticking and switching are rationally on a par.  Let z be the lower of the two quantities in the envelopes, so that 2z is the higher of the two. Then the epistemic possibilities for states and outcomes are described by the following matrix:

                    M contains z and O contains 2z          M contains 2z and O contains z

Stick               Get z                                   Get 2z

Switch              Get 2z                                  Get z

Matrix 5

The two state-specifications in Matrix 5 both have probability 1/2. Hence the expected utility of sticking is 1/2z + 1/2(2z) = 3/2z, whereas the expected utility of switching is 1/2(2z) + 1/2z = 3/2z. So, since these two acts have the same expected utility, they are rationally on a par.

The soundness of this argument is commonly acknowledged in the literature. What is not commonly acknowledged or noticed, however, is that the notion of expected utility employed here is a nonstandard kind. I will call it z-based nonstandard expected utility, and I will denote it by Uz. In order to illustrate the fact that Uz differs from standard expected utility U, return to the urn case, and suppose that (unbeknownst to the agent, of course) the actual lower quantity z in the envelopes is 16 (and hence the actual higher quantity in the envelopes, 2z, is 32). Then, as calculated on the basis of Matrix 2 (in section 1), U(Stick) = U(Switch) = 9.3. However,

Uz(Stick) = 1/2z + 1/2(2z) = 1/2×16 + 1/2×32 = 24

Uz(Switch) = 1/2(2z) + 1/2z = 1/2×32 + 1/2×16 = 24

And with respect to the original two-envelope situation (as opposed to the urn case), there are no such quantities as U(Stick) and U(Switch), since there is not any single, uniquely correct, distribution of standard probabilities over canonically-specified epistemic possibilities.

The coin-flipping version of the original decision problem also illustrates the rational applicability of a suitable kind of nonstandard expected utility—in this case, Ux. The form of reasoning employed in the original two-envelope paradox not only yields the correct conclusion, but in this situation also appears to be a perfectly legitimate way to reason one’s way to that conclusion. This fact is acknowledged in the literature; but once again, it is not commonly acknowledged or noticed that the notion of expected utility employed here is nonstandard. In order to illustrate the fact that Ux does indeed differ from standard expected utility, consider the coin-flipping urn case, and suppose that (unbeknownst to the agent, of course) the actual quantity in M is 16. Then although U(Stick) = 12.4 and U(Switch) = 15.5, as explained in section 1 above (with reference to Matrix 3), the x-based nonstandard expected utilities are:

Ux(Stick) = x = 16

Ux(Switch) = 1/2(1/2x) + 1/2(2x) = 1/2×8 + 1/2×32 = 20

And with respect to the original coin-flipping two-envelope situation (as opposed to our urn version of it), there are no such quantities as U(Stick) and U(Switch), since there is not any single, uniquely correct, distribution of standard probabilities over canonically-specified epistemic possibilities.

In light of these observations, in Horgan 2000 I proposed the following general normative principle for the maximization of various kinds of nonstandard expected utility in various decision situations. For a given decision problem, let d be a singular referring expression that is epistemically indeterminate given the total available information, and hence is noncanonical. Let Ud be a form of nonstandard expected utility, applicable to the available acts in the decision situation, that is calculated on the basis of a matrix employing noncanonical state-specifications, outcome-specifications, and desirability-specifications formulated in terms of d. Suppose that for the given decision situation, the following existence condition obtains:

(E.C.)   There is at least one rationally eligible standard probability distribution over epistemically possible states of nature.

Under these circumstances,

(A)      Rationality requires choosing an act that maximizes Ud just in case there is a unique ratio-scale ordering O of available acts such that (i) for every rationally eligible standard probability distribution D over epistemically possible states of nature for the given decision situation, UD ranks the available acts according to O, and (ii) Ud ranks the acts in an epistemically determinate way, and according to O.[3]

Here, UD is the standard expected utility as calculated on the basis of D. To say that several standard probability assignments are “rationally eligible” does not mean, of course, that each of them is one that the agent is rationally permitted to adopt; rather, essentially it means that none of them conflict with the total available information. Insofar as they are all equally rationally eligible, it would be rationally inappropriate to adopt any one of them, over against the others.
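Clause (i) of principle (A) can be spot-checked numerically, though not proved, by computing UD for a few candidate distributions. The Python sketch below does this under loudly stated assumptions: the three candidate distributions are hypothetical examples, not ones given in the text; for the original situation each candidate is read as a distribution over the lower quantity z, with M assumed equally likely to hold z or 2z; for the coin-flipping situation each candidate is read as a distribution over the quantity x in M. The function names are likewise illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)

def ratio_original(pr_z):
    """U_D(Switch) / U_D(Stick) when pr_z is a distribution over the lower
    quantity z and M is equally likely to hold z or 2z (assumed symmetry)."""
    stick = sum(p * (half * z + half * 2 * z) for z, p in pr_z.items())
    switch = sum(p * (half * 2 * z + half * z) for z, p in pr_z.items())
    return switch / stick

def ratio_coin_flip(pr_x):
    """U_D(Switch) / U_D(Stick) when pr_x is a distribution over the quantity
    x in M and a fair coin then puts 2x or x/2 into O."""
    stick = sum(p * x for x, p in pr_x.items())
    switch = sum(p * (half * 2 * x + half * half * x) for x, p in pr_x.items())
    return switch / stick

# Hypothetical candidate distributions, chosen only for illustration.
candidates = [
    {2: Fraction(1, 2), 4: Fraction(1, 2)},
    {2: Fraction(1, 10), 8: Fraction(3, 10), 32: Fraction(6, 10)},
    {4: Fraction(1)},
]

original_ratios = [ratio_original(d) for d in candidates]
coin_flip_ratios = [ratio_coin_flip(d) for d in candidates]
print([str(r) for r in original_ratios])   # ['1', '1', '1']
print([str(r) for r in coin_flip_ratios])  # ['5/4', '5/4', '5/4']
```

On these samples the standard expected utilities stand in the ratio 1:1 in the original situation, matching the ranking given by Uz, and in the ratio 5:4 in the coin-flipping situation, matching the ranking given by Ux. A finite sample of this kind cannot establish the universally quantified clause (i), and it says nothing about clause (ii).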

The proposed normative principle (A) dictates Uz-maximization in the original two-envelope situation, but not Ux-maximization or Uy-maximization. It dictates Ux-maximization in the coin-flipping version of the two-envelope situation, but not Uy-maximization or Uz-maximization. It has applications not only as an occasional short-cut method for rational decision-making that is simpler than calculating standard expected utility, but also (and much more importantly) as a method for rational decision-making in certain situations where the available acts have no standard expected utilities at all.

3.         A Residual Theoretical Issue.

I now think that the proposed principle (A) is inadequate, in four specific ways. I will explain the first problem in this section, and the second in section 4. In section 5 I will propose a new normative principle in place of (A), one that overcomes these two problems. Then in section 6 I will introduce the third and fourth problems, and I will address them by proposing yet another principle, a generalization of the one proposed in section 5.

The first problem is that condition (A) applies only to decision situations for which there is at least one rationally eligible standard probability distribution over epistemically possible states of nature (i.e., situations where (E.C.), the existence condition, holds). Yet there are decision situations for which (i) rationality requires choosing an act that maximizes a given kind of nonstandard expected utility, but (ii) there is no rationally eligible standard probability distribution over epistemically possible states of nature—i.e., no probability distribution over canonically specified states that satisfies all the conditions of the given decision situation. Hence, there is a need to generalize principle (A) in order to cover such decision situations.

We obtain a case in point by elaborating the original two-envelope decision situation in the following way. You are told, reliably, that the actual quantity in M has this feature: if you were to learn what it is, then you would consider it equally likely that O contains either twice that amount or half that amount; and likewise, the actual quantity in O is such that if you were to learn what it is, then you would consider it equally likely that M contains either twice that amount or half that amount. I will call this the expanded version of the two-envelope situation.

The expanded version remains a coherent decision problem. For this case too, no less than the original version, rationality requires choosing an act that maximizes Uz—which means that sticking and switching are rationally on a par. And, as in the original version, neither act has a standard expected utility. However, the reason why not is different than before. In the original version, the lack of standard expected utility was due to the fact that there were numerous rationally eligible standard probability distributions over epistemically possible states of nature—so that there is no rational reason to adopt any one of them, over against the others. In the expanded version, however, there is no rationally eligible standard probability distribution over the relevant states of nature. Why not? Because the following argument looms:

1. If I were to learn that M contained the minimum amount 1, then I would not consider it equally likely that O contains either twice that amount or half that amount (because I would know that O contains 2). Hence, M does not contain 1. By parallel reasoning, O does not contain 1.

2. Since neither M nor O contains 1, if I were to learn that M contained 2, then I would not consider it equally likely that O contains either twice that amount or half that amount (because I would know that O does not contain 1). Hence, M does not contain 2. By parallel reasoning, O does not contain 2.

.

.

.

n. Since neither M nor O contains 2^(n-1), if I were to learn that M contained 2^n, then I would not consider it equally likely that O contains either twice that amount or half that amount (because I would know that O does not contain 2^(n-1)). Hence, M does not contain 2^n. By parallel reasoning, O does not contain 2^n.

Etc.

This argument has a familiar structure: it is a version of the so-called “surprise examination paradox.” Presumably it is flawed in some way—in whatever way constitutes the proper diagnosis of the surprise examination paradox. However, be that as it may, the fact that such a paradox arises from the conditions specified in the expanded two-envelope decision situation has this consequence: no standard probability distribution—i.e., no probability distribution over canonically specified potential states of envelopes M and O—is fully consistent with these specified conditions. For, the canonical state-specifications

M contains 1 and O contains 2

O contains 1 and M contains 2

each would have to be assigned probability zero, and hence the canonical state-specifications

M contains 2 and O contains 4

O contains 2 and M contains 4

each would have to be assigned probability zero, and so forth for all potential quantities in M and in O—whereas the sum of the probabilities constituting a probability distribution must be 1. So in the case of the expanded two-envelope situation, there are no rationally eligible standard probability distributions over epistemically possible states of nature.
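The elimination steps behind this conclusion can be mimicked mechanically. The following Python sketch is only an illustration, not part of the argument itself. It assumes a hypothetical finite cap on the candidate whole-dollar amounts (the decision situation imposes none), it treats an odd amount like the minimum amount 1 (since half of an odd amount is not a whole number of dollars), and it records the step at which each amount is ruled out. The function name rule_out_amounts is hypothetical.

```python
def rule_out_amounts(cap):
    """Map each amount in 1..cap to the step at which the argument rules it out.

    Assumption: an amount is ruled out as soon as half of it is not a live
    whole-dollar candidate, because the agent could then no longer regard
    'twice' and 'half' as equally likely for the other envelope.
    """
    live = set(range(1, cap + 1))
    ruled_out_at = {}
    step = 0
    while live:
        step += 1
        doomed = {a for a in live if a % 2 == 1 or a // 2 not in live}
        if not doomed:  # defensive guard; the smallest live amount is always doomed
            break
        for a in doomed:
            ruled_out_at[a] = step
        live -= doomed
    return ruled_out_at


steps = rule_out_amounts(64)
print(steps[1], steps[2], steps[4])  # 1 2 3
print(len(steps) == 64)              # True: no candidate survives
```

Under these assumptions every candidate amount is eventually ruled out, with 1, 2, and 4 falling at steps 1, 2, and 3, which is why the canonical state-specifications involving them would each have to receive probability zero.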

Similar remarks apply, mutatis mutandis, to an expanded version of the coin-flipping situation that includes this additional condition: you are told, reliably, that the actual quantity in O has this feature: if you were to learn what it is, then you would consider it equally likely that M contains either twice that amount or half that amount. (Presumably it is already true, even for the earlier-described coin-flipping situation, that the actual quantity in M has the corresponding feature vis-à-vis O.) In this informationally enriched decision situation, as in the official coin-flipping situation, rationality requires choosing the act that maximizes Ux, viz., switching. But once again there is no rationally eligible standard probability distribution over canonically specified potential states of nature, because the conditions of the decision situation collectively have a surprise-examination structure.

I will not propose a solution to the surprise-examination paradox, nor is one required for present purposes. The crucial points are these. First, the expanded versions of the original two-envelope situation and the coin-flipping situation are coherent decision problems, despite the fact that they have a surprise-examination structure that precludes any rationally eligible standard probability distribution over canonically-specified potential states of nature. Second, in each of these situations, rationality requires choosing an act that maximizes a certain kind of nonstandard expected utility—viz., Uz in the expanded version of the original situation, and Ux in the expanded version of the coin-flipping situation. Third, the general normative principle (A) in Horgan 2000, stating when rationality requires choosing an act that maximizes a given kind of nonstandard expected utility in a given decision situation, does not apply to the cases lately described, because principle (A) applies only when (E.C.) is satisfied—i.e., only when there is at least one rationally eligible standard probability distribution over epistemically possible states of nature. Thus arises the following theoretical issue for the foundations of rational decision theory: articulating a normative principle, to govern the application of nonstandard expected utility, that is more general than (A)—a principle that does subsume decision situations like those I have described in this section.[4]

4.         A Second Residual Theoretical Issue.

A second problem arises from the fact that principle (A) is supposed to specify the conditions under which rationality requires the maximization of a given kind of nonstandard expected utility Ud. In a footnote to (A) in Horgan 2000, I remarked:

Saying that rationality “requires the maximization” of Ud means more than saying that rationality requires choosing an available act that happens to have a maximal Ud-value. It also means that having a maximal Ud-value is itself a reason why Ud-maximization is rationally obligatory. The idea is that Ud accurately reflects the comparative rational worth (given the agent’s available information) of the available acts.

Suppose, however, that Ud turns out to generate the right ratio-scale rankings of the actions, but for purely accidental and coincidental reasons. Then Ud will not “accurately reflect” those rankings in the sense intended; it will not be a guaranteed, non-accidental, indicator of them. And having maximal Ud-value will not be a “reason for rational obligatoriness,” in the sense intended, to choose a Ud-maximizing act. What is wanted, then, is something stronger than clause (ii) of principle (A). Ud should have some feature guaranteeing that it generates the appropriate ranking of the available acts.

5.         Ratio-Scale Comparative Rational Worth and a New Normative Principle.

One important pre-theoretic idea about rationality is that for some decision problems, the agent’s total information (including desirabilities of various potential outcomes of available acts) confers upon each of the available acts some epistemically determinate, quantitatively measurable, absolute rational worth. This idea gets explicated in decision theory in terms of the familiar notion of expected utility—i.e., what I have here called standard expected utility. The available acts in a given decision problem have absolute rational worth, for the agent, just in case they have standard expected utilities; and the absolute worth of each act just is its standard expected utility. Thus, having absolute rational worth requires that there be a set of epistemically determinate state-specifications such that (a) the agent has an epistemically determinate probability distribution over these state-specifications and (b) for each available act Ai and each state-specification, Ai has an epistemically determinate outcome and epistemically determinate desirability under the state as so specified.

Another important pre-theoretic idea about rationality is that for some decision problems, the agent’s total information determines, for the set of available acts, an epistemically determinate, ratio-scale, ranking of comparative rational worth. When the acts each have an absolute rational worth (i.e., a standard expected utility), this will automatically confer comparative rational worth as well: the standard expected utilities fix a corresponding ratio-scale ranking of the acts. For some decision problems, however, the agent’s total information determines a specific ratio-scale ranking of comparative rational worth for the available acts, independently of any specific probability distribution over epistemically determinate states of nature. Sometimes this happens even though there is also a uniquely correct standard probability distribution, so that the acts have standard expected utilities too (e.g., the urn case, and the coin-flipping urn case). Sometimes it happens when there is not a uniquely correct standard probability distribution, so that the acts do not have standard expected utilities—either (a) because the total information is consistent with more than one rationally eligible standard probability distribution over the relevant canonically specified states (e.g., the original two-envelope situation, and the coin-flipping situation), or (b) because the total available information has a “surprise examination” structure that actually precludes any rationally eligible probability distribution over the relevant canonically specified states (e.g., the extended versions of the original two-envelope situation and the coin-flipping situation).

Although nonstandard expected utilities are specific numerical quantities, they are epistemically indeterminate for the agent. Thus, they are not a measure of absolute rational worth. Nevertheless, in decision situations like those discussed above, nonstandard expected utility does generate an epistemically determinate ratio-scale ranking of the available acts (even though the nonstandard expected utilities themselves are epistemically indeterminate). Moreover, for each of these decision situations, the available acts stand in a unique ratio-scale ranking of comparative rational worth, independently of any specific probability distribution over canonically specified states of nature. As I will put it, the acts stand in a unique ratio-scale ranking of SPD-independent comparative rational worth (i.e., comparative rational worth that is independent of any specific standard probability distribution). So in such decision situations, the normatively appropriate kind of nonstandard expected utility is a kind that is guaranteed to rank the available acts in accordance with their SPD-independent ratio-scale comparative rational worth. In the original two-envelope situation and its urn variant and its extended variant, Uz does this (but Ux and Uy do not), whereas in the coin-flipping situation and its urn variant and its extended variant, Ux does this (but Uy and Uz do not).

In effect, clause (i) of principle (A) is an attempt to characterize the relevant kind of SPD-independent ratio-scale comparative rational worth, and clause (ii) is an attempt to specify how a given type of nonstandard expected utility Ud must be linked to this feature in order for Ud-maximization to be rationally required. But clause (i) is unsatisfactory, because it fails to apply to relevant situations with a “surprise examination” structure. And clause (ii) is unsatisfactory too, because it does not preclude the possibility that Ud happens to rank the acts in accordance with their SPD-independent ratio-scale comparative rational worth for purely fortuitous and accidental reasons. What we need, then, is a normative principle that (1) is applicable to decision situations for which there is no rationally eligible standard probability distribution over the epistemically possible states of nature (e.g., the extended two-envelope situation, and the extended coin-flipping situation), and (2) articulates the conditions under which a specific kind of nonstandard expected utility non-accidentally ranks the available acts in a given decision problem by SPD-independent ratio-scale comparative rational worth.

Consider the original two-envelope situation, the extended version of the original situation, and the urn case. Why is it mistaken to use Ux in these decision situations? The fundamental problem is the following. On one hand, the state-specifications employed in calculating Ux, viz.,

O contains 1/2x

O contains 2x

hold constant the epistemically indeterminate quantity x in envelope M, while allowing the content of O to vary between the two epistemically indeterminate quantities 1/2x and 2x. But on the other hand, this asymmetry, with respect to the fixity or variability of epistemically indeterminate features of the actual situation, does not reflect any corresponding asymmetry in the agent’s total available information. Yet the effect of the asymmetry is that Ux(Switch) = 5/4Ux(Stick). Thus, since switching and sticking are rationally on a par, Ux fails to order these acts by their ratio-scale comparative rational worth.

By contrast, why is it correct to use Uz in the original two-envelope situation, in the extended version of it, and in the urn case? Because on one hand, the two state-specifications employed in calculating Uz, viz.,

M contains z and O contains 2z

M contains 2z and O contains z

are symmetric with respect to matters of fixity and variability concerning the two epistemically indeterminate quantities z and 2z. The quantities themselves (viz., the lower and the higher of the two actual quantities in the two envelopes) are both held fixed; and the locations of these two quantities vary in a symmetrical way, across the two epistemically indeterminate states. On the other hand, this symmetry with respect to fixity and variability reflects the symmetry of the agent’s available information concerning the contents of envelopes M and O. The result is that Uz(Switch) = Uz(Stick), so that Uz accurately ranks the acts in accordance with their ratio-scale comparative rational worth.
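The contrast between Ux and Uz in these situations comes down to a few lines of arithmetic. The Python sketch below is illustrative only. It assumes, as in the paradoxical argument, that each of the two state-specifications receives probability 1/2, and it represents each epistemically indeterminate outcome by its coefficient on x or on z. The helper name nonstandard_eu is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def nonstandard_eu(coefficients):
    """Average the outcome coefficients over two equally weighted state-specifications."""
    return sum(HALF * c for c in coefficients)


# Ux: the quantity x in M is held fixed; O varies between (1/2)x and 2x.
# Coefficients are multiples of x.
ux_stick = nonstandard_eu([Fraction(1), Fraction(1)])
ux_switch = nonstandard_eu([Fraction(1, 2), Fraction(2)])

# Uz: the states are (M = z, O = 2z) and (M = 2z, O = z).
# Coefficients are multiples of z.
uz_stick = nonstandard_eu([Fraction(1), Fraction(2)])
uz_switch = nonstandard_eu([Fraction(2), Fraction(1)])

print(ux_switch / ux_stick)  # 5/4
print(uz_switch / uz_stick)  # 1
```

On these assumptions Ux makes switching worth 5/4 of sticking, while Uz assigns both acts the same value of (3/2)z.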

Consider now the coin-flipping version of the two-envelope situation, the extended coin-flipping version, and the coin-flipping urn case. Why is it correct to use Ux in these cases? Because on one hand, the two state-specifications employed in calculating Ux, viz.,

O contains 1/2x

O contains 2x

hold constant the epistemically indeterminate quantity x in envelope M, while allowing the content of O to vary between the two epistemically indeterminate quantities 1/2x and 2x. On the other hand, this asymmetry, with respect to the fixity and variability of epistemically indeterminate features of the actual situation, directly reflects a corresponding asymmetry in the agent’s total available information: the agent knows that the quantity x in envelope M was selected first, and then either 1/2x or 2x was placed in envelope O, depending on the outcome of a fair coin-toss. That informational asymmetry renders switching 5/4 as rationally valuable as sticking. So, since the asymmetry is reflected in the fact that the state-specifications hold fixed the quantity x in envelope M while allowing the quantity in envelope O to vary between 1/2x and 2x, Ux accurately ranks switching and sticking by their ratio-scale comparative rational worth: Ux(Switch) = 5/4Ux(Stick).

By contrast, why is it incorrect to use Uz in the coin-flipping version of the two-envelope situation, the extended coin-flipping version, and the coin-flipping urn case? Because on one hand, the two state-specifications employed in calculating Uz, viz.,

M contains z and O contains 2z

M contains 2z and O contains z

are symmetric with respect to matters of fixity and variability concerning the two epistemically indeterminate quantities z and 2z. On the other hand, these state-specifications thereby fail to reflect the crucial asymmetry in the agent’s information about the contents of envelopes M and O, with the result that Uz fails to accurately rank switching and sticking by their ratio-scale comparative rational worth of 5 to 4, and instead ranks them equally.
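For the versions of these situations in which standard probability distributions are available at all, one way to see that the ranking does not depend on which distribution is adopted is to try several. The Python sketch below is an illustration under stated assumptions. The priors are hypothetical finite-support distributions invented for the example (the decision situations themselves specify none), and the function names are hypothetical too. In the coin-flipping procedure a quantity x is drawn for M, and O then receives 1/2x or 2x on a fair coin toss. In the symmetric procedure, which is one assumed way of modeling the original situation, a lower quantity z is drawn and a fair coin decides which envelope receives z and which receives 2z.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def coin_flip_ratio(prior):
    """E[O] / E[M] when x ~ prior goes into M and O gets x/2 or 2x by a fair coin."""
    e_m = sum(p * x for x, p in prior.items())
    e_o = sum(p * (HALF * Fraction(x, 2) + HALF * 2 * x) for x, p in prior.items())
    return e_o / e_m


def symmetric_ratio(prior):
    """E[O] / E[M] when z ~ prior and a fair coin decides which envelope holds 2z."""
    e_m = sum(p * (HALF * z + HALF * 2 * z) for z, p in prior.items())
    e_o = sum(p * (HALF * 2 * z + HALF * z) for z, p in prior.items())
    return e_o / e_m


# Hypothetical priors, mapping a dollar quantity to its probability.
priors = [
    {2: Fraction(1)},
    {2: Fraction(1, 2), 10: Fraction(1, 2)},
    {4: Fraction(1, 10), 8: Fraction(3, 10), 100: Fraction(6, 10)},
]

for prior in priors:
    print(coin_flip_ratio(prior), symmetric_ratio(prior))  # 5/4 1 on each line
```

For every prior tried, the first ratio is 5/4 and the second is 1, matching the rankings given by Ux in the coin-flipping cases and by Uz in the symmetric cases. A finite sample of priors is not a proof; it is linearity of expectation that secures the same result for any prior with a finite mean.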

These observations point the way toward the general normative principle we are seeking, concerning the rational appropriateness or inappropriateness of using a specific kind of nonstandard expected utility in a given decision situation. For a given decision problem, let d be a singular referring expression that denotes some numerical quantity and is epistemically indeterminate given the total available information, and hence is noncanonical. Let Ud be a form of nonstandard expected utility, applicable to the available acts in the decision situation, that is calculated on the basis of a matrix employing noncanonical state-specifications, outcome-specifications, and desirability-specifications formulated in terms of d. We will say that the set of state-specifications employed to calculate Ud is symmetry and asymmetry reflecting, with respect to fixity and variability of features of the decision situation (for short, SARf/v) just in case any symmetries or asymmetries in these state-specifications reflect corresponding symmetries and asymmetries in the agent’s total available information. Then

(B)       Rationality requires choosing an act that maximizes Ud if (i) Ud employs state-specifications that are SARf/v, and (ii) Ud generates an epistemically determinate ratio-scale ranking of the available acts.[5]

When the conditions in (B) are met, the available acts do indeed possess SPD-independent ratio-scale comparative rational worth, and Ud is guaranteed to rank the acts in a way that accurately reflects their comparative worth. For, the very symmetries and asymmetries in the agent’s total information that fix determinate ratio-scale comparative worth for the acts, independently of any specific probabilities for canonical state-specifications, are directly reflected in the fixity/variability structure of the noncanonical state-specifications employed by Ud.

6.         Ordinal-Scale Rational Worth and a More General Normative Principle.

Although the two problems with principle (A) described in sections 3 and 4 have now been dealt with, two further problems need to be addressed; both also arise for principle (B) and hence will prompt modifications of (B) in turn. The third problem is that there are decision problems for which (i) the available acts stand in an ordinal-scale, but not a ratio-scale, ordering of SPD-independent comparative rational worth, and (ii) there is a suitable kind of nonstandard expected utility that rationally ought to be maximized (because it is guaranteed to reflect the ordinal-scale comparative rational worth of the acts).

Here is a simple example. You are given a choice between two envelopes E1 and E2, after being reliably informed that first some whole-dollar quantity of money of $2 or more was chosen by some random process and placed in E1, and then the square of that quantity was placed into E2. Assuming that the desirability of an outcome is just the dollar-amount obtained, there is a kind of nonstandard expected utility definable for this decision situation that ought rationally to be maximized, viz. Uw, where w = the actual quantity in E1. Since

Uw(Choose E1) = w

Uw(Choose E2) = w^2

and since w^2 > w for all potential values of w, rationality requires the Uw-maximizing act, viz., choosing E2. However, since the epistemically possible quantities in E2 are a non-linear function of the corresponding epistemically possible quantities in E1, the two acts do not stand in an SPD-independent ratio-scale ranking of comparative rational worth, but only in an SPD-independent ordinal-scale ranking of comparative rational worth. (Accordingly, Uw generates only an epistemically determinate ordinal-scale ranking of the acts.)
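A quick numerical check makes the ordinal-versus-ratio point concrete. The Python sketch below is illustrative. The range of w values is a hypothetical sample (the situation says only that w is a whole-dollar quantity of $2 or more), and desirability is taken to be the dollar amount, as in the text.

```python
from fractions import Fraction


def uw_choose_e1(w):
    """Uw of choosing E1, which holds w dollars."""
    return w


def uw_choose_e2(w):
    """Uw of choosing E2, which holds the square of w."""
    return w ** 2


sample = range(2, 1001)  # hypothetical sample of permissible values of w

# The ordering never changes across the sample...
e2_always_better = all(uw_choose_e2(w) > uw_choose_e1(w) for w in sample)

# ...but the ratio between the two utilities is different for every w.
ratios = {Fraction(uw_choose_e2(w), uw_choose_e1(w)) for w in sample}

print(e2_always_better)           # True
print(len(ratios) == len(sample)) # True
```

The ordering is the same at every sampled value, while no single ratio holds across values, which is the sense in which the ranking here is ordinal-scale only.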

The fourth problem is that rationality sometimes requires maximizing a more general version of nonstandard expected utility than has so far been discussed, a version involving several noncanonical number-denoting terms rather than just one. Consider the following decision situation, for example. You are given a choice of two envelopes E1 and E2. Envelope E1 has two slots S1E1 and S2E1, and envelope E2 has two slots S1E2 and S2E2. Each slot in E1 contains some dollar-quantity of money, selected by some random process. (The two selections were independent of one another.) Slot S1E2 of E2 contains either half or twice the quantity in slot S1E1 of E1, depending on the outcome of a fair coin-flip. Slot S2E2 of E2 contains either one fourth of, or four times, the quantity in slot S2E1 of E1, depending on the outcome of an independent fair coin-flip.

Letting x be the actual quantity in S1E1 and y be the actual quantity in S2E1, there is a nonstandard expected utility Ux,y definable for this decision problem that yields epistemically indeterminate expected utilities expressed as mathematical functions of x and y. Assuming that the desirabilities of the potential outcomes are just their dollar amounts,

Ux,y(Choose E1) = 1/4[(x+y) + (x+y) + (x+y) + (x+y)] = x+y

Ux,y(Choose E2) = 1/4[(1/2x+1/4y) + (2x+1/4y) + (1/2x+4y) + (2x+4y)] = 5/4x + 17/8y.

Ux,y is guaranteed to reflect the acts’ SPD-independent comparative ordinal-scale rational worth, because (5/4x + 17/8y) > (x + y) for any permissible values of x and y. Thus, rationality dictates the maximization of Ux,y in this decision situation. (Notice that the third problem too is illustrated by this case. The stated conditions fix an SPD-independent ordinal-scale comparative rational worth for the two acts, without fixing any unique ratio-scale ordering: choosing E2 is rationally preferable to choosing E1, but not by any specific, probability-independent, ratio.)
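The two values of Ux,y can be recomputed by enumerating the four equally likely pairs of coin-flip outcomes. The Python sketch below is illustrative. It tracks each expected utility as a pair of coefficients (on x, on y), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

QUARTER = Fraction(1, 4)

# Multipliers that the two independent coin flips apply to x and to y in E2.
X_MULTIPLIERS = [Fraction(1, 2), Fraction(2)]
Y_MULTIPLIERS = [Fraction(1, 4), Fraction(4)]


def uxy_coefficients(envelope):
    """Return (coefficient on x, coefficient on y) of Ux,y for choosing the envelope."""
    cx = cy = Fraction(0)
    for mx, my in product(X_MULTIPLIERS, Y_MULTIPLIERS):
        if envelope == "E1":
            mx, my = Fraction(1), Fraction(1)  # E1 holds x + y in every state
        cx += QUARTER * mx
        cy += QUARTER * my
    return cx, cy


e1 = uxy_coefficients("E1")
e2 = uxy_coefficients("E2")
print(e1)  # (Fraction(1, 1), Fraction(1, 1))
print(e2)  # (Fraction(5, 4), Fraction(17, 8))


def ratio(x, y):
    """Ux,y(Choose E2) / Ux,y(Choose E1) for hypothetical values of x and y."""
    return (e2[0] * x + e2[1] * y) / (e1[0] * x + e1[1] * y)


print(ratio(1, 1))   # 27/16
print(ratio(10, 1))  # 117/88
```

Both sample ratios exceed 1, so choosing E2 comes out ahead at each, but the ratios differ from one pair of values to another, which illustrates why the ranking is ordinal-scale rather than ratio-scale.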

So for some decision problems, a certain kind of nonstandard expected utility reflects SPD-independent ordinal-scale comparative rational worth of the available acts, even when they lack SPD-independent ratio-scale comparative rational worth. Moreover, for some decision problems, SPD-independent comparative rational worth is reflected by a kind of nonstandard expected utility based on several noncanonical number-denoting terms rather than one. Thus a normative principle more general than (B) is needed, to govern the rationally appropriate use of nonstandard expected utility in such cases.

The needed principle can be articulated by generalizing (B) in the following way. For a given decision problem, let d1,...,dm be singular referring expressions that denote numerical quantities and are epistemically indeterminate given the total available information, and hence are noncanonical. Let Ud1,...,dm be a form of nonstandard expected utility, applicable to the available acts in the decision situation, that is calculated on the basis of a matrix employing noncanonical state-specifications, outcome-specifications, and desirability-specifications formulated in terms of d1,...,dm. Then

(C)       Rationality requires choosing an act that maximizes Ud1,...,dm just in case (i) Ud1,...,dm employs state-specifications that are SARf/v, and (ii) Ud1,...,dm generates an epistemically determinate ordinal-scale ranking of the available acts.[6]

When these conditions are met, the available acts do indeed possess SPD-independent ordinal-scale comparative rational worth, and Ud1,...,dm is guaranteed to rank the available acts in a way that accurately reflects their comparative rational worth. For, the very symmetries and asymmetries in the agent’s total information that fix determinate ordinal-scale comparative worth for the acts, independently of any specific probabilities for canonical state-specifications, are directly reflected in the fixity/variability structure of the noncanonical state-specifications employed by Ud1,...,dm. So we have arrived at a general normative principle governing the maximization of nonstandard expected utility, a principle that overcomes all four problems faced by principle (A).
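Clause (ii) of (C) can be illustrated with a small check on whether a candidate nonstandard expected utility orders two acts the same way at every permissible value of its noncanonical term. The Python sketch below is only a finite spot check over hypothetical samples of values, not a proof, and the function name is hypothetical. It contrasts the squared-quantity example from this section with the example given in note 6.

```python
def ordinal_ranking_is_determinate(u_a, u_b, values):
    """True if u_a and u_b compare the same way (<, =, or >) at every sampled value."""
    signs = set()
    for w in values:
        diff = u_a(w) - u_b(w)
        signs.add((diff > 0) - (diff < 0))  # +1, 0, or -1
    return len(signs) == 1


# Section 6 example: E1 holds w (at least 2), E2 holds w squared.
square_case = ordinal_ranking_is_determinate(
    lambda w: w, lambda w: w ** 2, range(2, 200)
)

# Note 6 example: E1 holds w (divisible by 3), E2 holds (w/3) squared.
note6_case = ordinal_ranking_is_determinate(
    lambda w: w, lambda w: (w // 3) ** 2, range(3, 200, 3)
)

print(square_case)  # True: one ordering at every sampled value
print(note6_case)   # False: the ordering flips around w = 9
```

In the first case the comparison never changes, so the ranking is determinate on the sample; in the second the comparison depends on the unknown value of w, so clause (ii) fails even though the state-specifications are unobjectionable.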

Principle (B), which states only a sufficient condition for the rationality of maximizing a given kind of nonstandard expected utility (rather than a sufficient and necessary condition), remains in force. In effect, it is a special case of our more general normative principle (C).

Let me make several final observations about principles (C) and (B) and the key notion they employ, viz., the feature SARf/v. First, I take it that the failure to be SARf/v is a feature that can be exhibited only by state-specifications of the kind that figure in nonstandard expected utility, viz., epistemically indeterminate state-specifications. Only when relevant features of the actual situation are specified in epistemically indeterminate ways does it become possible to fix or vary them in ways not reflective of one’s total information, within a set of state-specifications that are mutually exclusive and jointly exhaustive.

Second, the feature of being SARf/v is evidently clear enough to be useful and applicable in concrete decision situations like those I have described in this paper. Often in such situations, one can tell by inspection whether or not the state-specifications employed by a given kind of nonstandard expected utility are SARf/v. Indeed, it is evidently very common in practice—in betting decisions, for example—to rely on calculations of nonstandard expected utilities that are SARf/v.

But third, being SARf/v also has been characterized somewhat vaguely, in terms of several vague ideas: (1) symmetries and asymmetries in one’s total information, (2) symmetries and asymmetries in a set of noncanonical state-specifications, and (3) a relation of “reflection” between the latter and the former kinds of symmetries and asymmetries. It would be theoretically desirable to explicate these notions further, and to employ the explicated versions to articulate a sharpened normative principle that would replace and explicate the vague normative principles (C) and (B).

Fourth, the notion of SPD-independent comparative rational worth is also somewhat vague, as so far characterized. It would be theoretically desirable to provide a direct explication of it too, and to explicitly articulate its connection to explicated versions of principles (C) and (B). These tasks of further explication and articulation I leave for a future occasion.[7]

REFERENCES

Arntzenius, F. and McCarthy, D. 1997 “The Two envelope Paradox and Infinite Expectations,” Analysis, 57, 42-50.

Broome, J. 1995 “The two-envelope paradox,” Analysis, 55, 6-11.

Cargile, J. 1992 “On a Problem about Probability and Decision,” Analysis, 52, 211-16.

Castell, P. and Batens, D. 1994 “The Two-Envelope Paradox: The Infinite Case,” Analysis, 54, 46-49.

Chalmers, D. Unpublished “The Two-Envelope Paradox: A Complete Analysis?”

Horgan, T. 2000 “The Two-Envelope Paradox, Nonstandard Expected Utility, and the Intensionality of Probability,” Nous, in press.

Jeffrey, R. 1983 The Logic of Decision, Second Edition, Chicago: University of Chicago Press.

Jackson, F., Menzies, P., and Oppy, G. 1994 “The Two Envelope ‘Paradox’,” Analysis, 54, 43-45.

McGrew, T., Shier, D. and Silverstein, H. 1997 “The Two-Envelope Paradox Resolved,” Analysis, 57, 28-33.

Nalebuff, B. 1989 “The Other Person’s Envelope is Always Greener,” Journal of Economic Perspectives, 3, 171-81.

Scott, A. and Scott, M. 1997 “What’s in the Two Envelope Paradox?” Analysis, 57, 34-41.

[1] Epistemic probability, as understood here, must conform to the axioms of probability theory. Although the term ‘epistemic probability’ has sometimes been used for subjective degrees of belief that can collectively fail to conform to these axioms, I think it is important to reclaim the term from those who have employed it that way. I would maintain that there are objective facts about the kind of probability that is tied to the agent’s available information—i.e., about what I am calling epistemic probability. One objective fact is that epistemic probability obeys the axioms of probability theory.

[2] According to some construals of epistemic probability, rationality permits the initial adoption of virtually any standard probability distribution that obeys the axioms of probability and also is consistent with the agent’s total available information—provided that that one then updates one’s prior standard probabilities, on the basis of new evidence, in accordance with Bayes’ theorem. In my view this tolerant attitude toward prior standard probabilities is mistaken, precisely because rationality prohibits the adoption of any single specific standard probability distribution when numerous candidate-distributions are all equally eligible. But even those who take the tolerant approach to prior standard probabilities can agree about the theoretical importance of nonstandard expected utilities, vis-à-vis decision situations in which the comparative rational worth of the available acts is independent of any specific standard probability distribution over epistemically determinate states of nature.

[3] This formulation improves upon the version in Horgan 2000 by explicitly building into clause (ii) a feature that the earlier version effectively took for granted, but should have articulated: viz., that the ratio-scale ranking of available acts generated by Ud is epistemically determinate for the agent (even though the Ud-quantities themselves are epistemically indeterminate).

[4] Note that it would not suffice merely to drop (E.C.) from the specification of the circumstances under which principle (A) applies, and leave (A) otherwise intact. For, clause (i) of principle (A) would then be vacuously satisfied in the expanded two-envelope situation by each of Ux, Uy, and Uz. Principle (A) would thus require the maximization of all three of these kinds of nonstandard expected utility, in the expanded two-envelope situation—a requirement that is not only normatively inappropriate, but is impossible to fulfill.

[5] Condition (B) is stated merely as a sufficient condition for the rationality of Ud-maximization, rather than a sufficient and necessary condition, because it is still not general enough to cover all cases. See section 6.

[6] Clause (ii) is non-redundant, because there are decision situations in which clause (i) is satisfied but clause (ii) is not. Here is an example. You are given a choice between two envelopes E1 and E2, each of which contains some whole-dollar quantity of money. You are told that some quantity n, evenly divisible by 3, was first selected by a random process and placed into E1, and that the quantity (n/3)^2 was then placed into E2. Letting w = the actual quantity in E1, Uw(Choose E1) = w, whereas Uw(Choose E2) = (w/3)^2. In this situation Uw is a form of nonstandard expected utility that satisfies clause (i) of principle (C). However, Uw does not generate an epistemically determinate ordinal-scale ranking of the available acts, and hence does not satisfy clause (ii) of (C). For, Uw(Choose E1) > Uw(Choose E2) if w < 9, Uw(Choose E1) = Uw(Choose E2) if w = 9, and Uw(Choose E1) < Uw(Choose E2) if w > 9.

[7] I dedicate this paper to my wife Dianne, who has patiently endured my envelope obsession. She plans to put my ashes into two envelopes, and then put one envelope on the mantel and sprinkle the other’s contents into the wind at the U.S. Continental Divide.