2/10 Extremals and convex cones with constraints

Advanced math

Before beginning today’s lecture, consider some of the applications of the tools that we have learned (and will learn) this semester, at least in theory.

  • In order to use perturbations, we need continuity, but only in the immediate neighborhood of the point around which we perturb. Thus, as in the example, you just need to find a place where the derivative is well defined and work there.
  • Extremals and convex cones are useful for problems (or parts of problems) that can be represented by a linear relationship. It is necessary for there to be a linear relationship to use them, but expected utility gives us exactly that: the expectation is linear in the probabilities.
  • Monotone comparative statics is useful where there are discontinuities. Some examples where this is useful are general equilibrium analysis of the core, auctions, and rational expectations contract theory.

The goal of today’s lecture is to apply the concept of extremals and convex cones to the situation in which there are constraints. Of course, a convex cone that is partitioned by one (or even n) linear constraints is simply divided into more convex cones.

Today, we will begin by deriving the necessary and sufficient conditions for a particular type of relationship to hold. (Remember from the end of last lecture: we are going to be comparing types of risk preferences.) After we have derived the mathematical relationship, we plug in and derive a couple of sample relationships.

In order to get warmed up, consider the simplest extremal and convex cone example. What does it take for:

Ef(Xº)≤0 for all Xº?

Well, we need f(x)≤0 for all x. Done.
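A quick numerical sketch illustrates both directions of this claim (the payoff function f below is a made-up example, not from the lecture):

```python
import random

# f is a made-up payoff function for illustration; f(x) <= 0 exactly on [-1, 1].
f = lambda x: x**2 - 1.0

# If f(x) <= 0 pointwise on the support, then E f(X) <= 0 for ANY gamble on it.
support = [-1.0, -0.5, 0.0, 0.5, 1.0]
for _ in range(1000):
    weights = [random.random() for _ in support]
    total = sum(weights)
    probs = [w / total for w in weights]
    ef = sum(p * f(x) for p, x in zip(probs, support))
    assert ef <= 1e-12

# Conversely, a point mass (a degenerate gamble) at any x0 with f(x0) > 0
# gives E f = f(x0) > 0, so f(x) <= 0 everywhere is also necessary.
x0 = 2.0
assert f(x0) > 0
```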

We did a few problems of intermediate difficulty in the last lecture. Let’s add a degree of difficulty, albeit one that will turn out not to be all that large: let’s add a constraint. Before looking at the specific problem, let’s consider what this will do geometrically.

The constraint will be a line stating that ‘something must be true’. That something will hold for a portion of the convex cone. As long as the ‘something must be true’ is a linear constraint, we easily see that we are left with a convex set. Here’s the picture (of course, the triangle is the convex set of possible three-dimensional random events):

Now let’s consider the exact form of the problem that we are going to look at. We want to know when the following statement will hold:

If Ef(Xº)≤f(0) then Eg(Xº)≤g(0)

But wait! This statement is really saying: “consider the cases in which the ‘if’ part of the statement is true; under what conditions will the ‘then’ part also hold?” We are looking for a condition that is both necessary and sufficient. Note that we will have to show both that the condition implies the statement, and that the statement cannot hold without the condition.

Bringing that discussion back to the graph, the if-statement splits the full simplex of random variables into those that satisfy the if-clause and those that do not. By linearity of expected utility in the probabilities, the constraint is linear, as pictured above. The key is that, because we have a linear constraint, we maintain the convex cone, and thus the ability to analyze things using extremals.

Adding another mode of intuition for the above graph: suppose there are three possible states:

Then, Ef(Xº)≤f(0) is stating:

P1[f(x1)-f(0)]+P2[f(x2)-f(0)]+P3[f(x3)-f(0)]≤0

I.e., f(xi)-f(0) is the coefficient that goes along with the probability Pi. The constraint is the linear set where the inequality holds with equality.
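As a concrete illustration of this constraint, here is a minimal numerical sketch (the payoff function f, the three states, and the probabilities are all made-up examples, not from the lecture):

```python
# f, the states x_i, and the probabilities P_i are illustrative choices.
f = lambda x: x - x**2
f0 = f(0)                        # the certain outside option
xs = [-1.0, 0.5, 2.0]            # three possible states
ps = [0.2, 0.5, 0.3]             # their probabilities

# E f(X) - f(0) = P1[f(x1)-f(0)] + P2[f(x2)-f(0)] + P3[f(x3)-f(0)]
slack = sum(p * (f(x) - f0) for p, x in zip(ps, xs))
satisfies = slack <= 0           # True: this gamble lies in the constrained cone
```

The constraint boundary is exactly the set of probability vectors for which `slack` equals zero.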

Once again, we are going to concentrate on the extremals:

No matter what the constraint, the extremals will look like the above case (although we will not prove it explicitly). Some of the extremals are the ones we considered last time: complete certainty regarding the outcome (not much of a gamble).

On the other hand, there are two other extremals as well, due to the addition of the constraint. Both of these come from:

  • Being on the constraint
  • The random process leads to only two possible outcomes. (This is true in the graph above, but it will also be true for uncertainty in n dimensions.)

As always, we will consider all of the extremals. First, we consider random variables in which the outcome is certain. This will only give us limited information, but it will be helpful. Then, we consider the class of gambles for which the constraint binds and the uncertainty is regarding which of two possible outcomes will arise.

Just as before, after finding the conditions which lead our proposal to be true for the extremals, we are done: the relationship is linear, and all points within the convex cone can be created from convex combinations of the extremals.

Consider what would occur if we had two constraints. Then we would need to consider mixes over three certain events. Here’s the graphical representation of how it would look:

It is clear that the circle is the fifth extremal. In order to get to that point, you need to mix three certain events—and thus you need to consider all mixes of three certain events. (Specifically, you would need to consider the set of random variables over two states in which one constraint binds and the other constraint is satisfied. In addition, you would need to consider points in which there is uncertainty over three states of the world and both constraints bind.)

Another intuition for only needing two points is Jensen’s inequality: the valuation of a mixture lies on or below the line connecting the two points mixed over. Therefore, you only need two points in order to trace out the line between them.

Now let’s begin the math. Our goal is to show the conditions under which the following is true:

If Ef(Xº)≤f(0), then Eg(Xº)≤g(0)

In other words, ‘f is indifferent to the gamble or really doesn’t like it’ defines the set of random variables we are going to consider for a person of type g. The method that serves to analyze this is the Kuhn-Tucker method. (Remember, this is the process most frequently used when doing maximization subject to a constraint.) Thus, the first step is to write out the Lagrangian:

E[g(Xº)-g(0)]-mE[f(Xº)-f(0)]

When you do maximization problems, you realize that the constraint acts to create a separating hyperplane. The same is true for this problem. The difference between this problem and the maximization problem is simply the direction of the proof. For maximization problems, you solve for the separating hyperplane (the values of the Lagrange multipliers, i.e. prices) while finding the points that characterize the set of maxima. In this problem, we simply solve for the separating hyperplane that allows the condition to hold in general.

It takes a bit of intuition to solve this problem. In other words, you need to guess the solution, as there is no explicit way of solving for it. However, if you remember the Kuhn-Tucker method and you see how similar this problem is to that one, you can guess the general form of the answer quite easily:

g(x)-g(0)≤m[f(x)-f(0)]

In other words, g gains less value above the baseline at each point x than m times what f gains, for the gambles that satisfy the constraint.

Let’s begin by showing that the above is the sufficient condition. Well, simply take the expectation of both sides over the possible realizations x of Xº, and you are done. The result is as predicted (assuming that we can show that m≥0, which we will do quite shortly). Taking expectations on both sides gives:

E[g(Xº)-g(0)]≤mE[f(Xº)-f(0)]

In other words:

If Ef(Xº)≤f(0), then Eg(Xº)≤g(0)
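This sufficiency step is easy to illustrate numerically. A minimal sketch, assuming an arbitrary pair f(x)=x, g(x)=1-exp(-x) with m=1 (these satisfy the pointwise inequality; they are not from the lecture):

```python
import math
import random

# Illustrative pair: g(x)-g(0) = 1 - exp(-x) <= x = f(x)-f(0), so m = 1 works.
f = lambda x: x
g = lambda x: 1 - math.exp(-x)
m = 1.0

support = [random.uniform(-2, 2) for _ in range(50)]
# Check the pointwise condition on the support (with a numerical tolerance).
assert all(g(x) - g(0) <= m * (f(x) - f(0)) + 1e-12 for x in support)

# Any gamble with E f(X) <= f(0) then also satisfies E g(X) <= g(0).
for _ in range(200):
    weights = [random.random() for _ in support]
    s = sum(weights)
    probs = [w / s for w in weights]
    ef = sum(p * (f(x) - f(0)) for p, x in zip(probs, support))
    eg = sum(p * (g(x) - g(0)) for p, x in zip(probs, support))
    if ef <= 0:                      # the 'if' clause of the statement
        assert eg <= 1e-12           # then the 'then' clause must hold
```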

Proving that this is a necessary condition is the topic of the Gollier and Kimball paper “New Methods in the Classical Economics of Uncertainty: Characterizing Utility Functions”. The diffidence theorem states that there exists some m such that, for all x:

g(x)-g(0)≤m[f(x)-f(0)] for all x ↔ If Ef(Xº)≤f(0), then Eg(Xº)≤g(0)

We have now proved sufficiency; we leave necessity to the paper and turn to characterizing m.

By the method of extremals and convex cones, we know that all we need to do is check the extremals, so let’s begin with the simplest ones (those where the constraint holds, possibly with strict inequality, and the random variable leads to a certain outcome).

Write out the condition we are playing with one more time:

If Ef(Xº)≤f(0) then Eg(Xº)≤g(0)

Then g(x)-g(0)≤m[f(x)-f(0)] for all x

The top statement says that type g must not strictly prefer any certain outcome (from initial point 0) that type f does not prefer. Use this information in the second statement, and we see that m must be ≥0. In other words, we have shown that the Lagrange multiplier must be non-negative. However, this does not prove that it exists, nor does it fully characterize it. To do that, we need to consider the extremals which are on the constraint and mix two realizations.

Before we begin the actual algebra (which is quite simple), let’s try and get some graphical intuition for what we are trying to find:

We want to show that a line with slope m exists, and characterize the values along that line (hyperplane). The line will certainly be a function of both f and g. This line will live in the dimensions of ‘amount by which type f, and amount by which type g, prefer a particular gamble to their initial position’. We will only look for it among gambles over two possible outcomes, but it will generalize to randomizations over more. Thus, we put all of the possible realizations of the randomization Xº (which type f does not like any more than their outside option) as small circles on a plane:

Of course, if one of the circles is within the first quadrant, then we have a gamble (with a certain realization which f dislikes but g likes) that would leave type f worse off but type g better off. Thus, if there are any points in that quadrant, then the proposition must be false (disliked by f does not mean disliked by g).

The line of slope m separates all realizations that can be randomized over while leaving the proposition true. In other words, the line with slope m makes certain that no convex combination of the points that satisfy the constraint can lead to a point within the first quadrant. If that were possible, then a combination of those two points would be preferred by g but not by f. The X’s are unacceptable points (the connected ones, because you could mix them into a gamble and obtain a point in the first quadrant).

As you can see, this is really just the separating hyperplane theorem: the realizations and values of x on which a gamble Xº satisfying the constraint can place probability form a convex set disjoint from the forbidden quadrant. Thus, there is a separating hyperplane separating these from the points that would be preferred by g but not by f. Graphically, we are looking at:

Now that we have the intuition well and muddled in our heads, let’s look at the algebra, which is quite straightforward. Since we only need to consider two-point uncertainty, write it that way. In addition, we are looking for the extremal, and thus for the point where the constraint holds with equality. It can only hold with equality if f(x1)<f(0) and f(x2)>f(0). (It would also hold if both points gave exactly f(0), but that is not really a two-point gamble.) Denote P≡P1. Then being on the border for type f means:

P[f(x1)-f(0)]=-(1-P)[f(x2)-f(0)]

Combine this with only the slightest algebra and we get:

P=[f(x2)-f(0)]/[f(x2)-f(x1)]

Which leads to:

P∈(0,1)

And similarly:

1-P=[f(0)-f(x1)]/[f(x2)-f(x1)]

Despite being on the border, it must still satisfy the condition:

P[g(x1)-g(0)]+(1-P)[g(x2)-g(0)]≤0

Plug in for P and 1-P, and we get:

[f(x2)-f(0)][g(x1)-g(0)]+[f(0)-f(x1)][g(x2)-g(0)]≤0
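Such boundary gambles are easy to construct numerically. A minimal sketch, with illustrative choices of f, g, x1, and x2 satisfying f(x1)<f(0)<f(x2) (none of these functions are from the lecture):

```python
import math

# Illustrative choices with f(x1) < f(0) < f(x2), as the boundary requires.
f = lambda x: x**3
g = lambda x: 1 - math.exp(-x)
x1, x2 = -1.0, 2.0
f0, g0 = f(0), g(0)

# P solves P[f(x1)-f(0)] + (1-P)[f(x2)-f(0)] = 0, i.e. the constraint binds.
P = (f(x2) - f0) / (f(x2) - f(x1))
assert 0 < P < 1
assert abs(P * (f(x1) - f0) + (1 - P) * (f(x2) - f0)) < 1e-12

# The requirement on this extremal: type g must (weakly) dislike it too.
lhs = P * (g(x1) - g0) + (1 - P) * (g(x2) - g0)
assert lhs <= 0
```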

Because we have not limited ourselves to the point where g and f types are both indifferent to the gamble, we do not necessarily have a straight line. Instead, it will look like:

Even so, we see that we have something that looks like the slope m. We are almost there. A little algebra (dividing through by the negative product [f(x1)-f(0)][f(x2)-f(0)], which flips the inequality) leads to:

[g(x2)-g(0)]/[f(x2)-f(0)] ≤ [g(x1)-g(0)]/[f(x1)-f(0)]

In other words, as drawn in the graph, the above equation states that the ray to the point for x2 is no steeper than the ray to the point for x1.

Note that the supremum (sup) is the equivalent of the maximum, except that we allow for the maximum not to be attained. For a random variable, it would be the upper limit of the support of the distribution. Similarly, the infimum (inf) acts in the same manner for the minimum.

Thus, since the inequality must hold for every admissible x1 and x2, any valid m must lie between the two sides, and we get:

sup_{x: f(x)>f(0)} [g(x)-g(0)]/[f(x)-f(0)] ≤ m ≤ inf_{x: f(x)<f(0)} [g(x)-g(0)]/[f(x)-f(0)]

Note that we used the supremum and infimum on purpose, as we allow the possible states to be discrete. (In addition, we use them because, even for differentiable functions, the sup and the inf coincide only in the limit at 0; in other words, at a gamble over one point, not two.) If f and g are not continuous, then there is a range of m that will work. Why this is so is easily seen from the following one-dimensional graph:

The 0’s are the points to be separated, and the thick lines are some of the possible hyperplanes separating them.

The equation above with the sup and the inf is what we use to generate the actual equation for the separating hyperplane slope m. That equation fully characterizes it, but we will still need to do a bit of algebra in order to use it. Of course, for non-continuous functions, there will be the possibility of multiple hyperplanes. Thus, to simplify, we assume that f and g are twice differentiable. However, this is really overkill. All we really need is that f maps to the real line. Anything beyond that is simply a simplification.

Just as in a maximization problem, we find the separating hyperplane from the first and second order conditions. Therefore, we write out the ‘value’ of the Lagrangian, which we use to define m:

ξ(x)≡g(x)-g(0)-m[f(x)-f(0)]≤0

In other words, ξ(x) is the valuation (above g(0)) that type g assigns to x, adjusted using the information about how much more type f prefers x to 0.

Of course, ξ(0)=0, as both terms vanish at x=0. By differentiability, we also know that ξ is continuous in x. Combine this with the fact that m≥0 and the condition we began with (g gains no more than m times what f gains at each x), and we see that ξ(x)≤0 for all x:

We know from the problem that we are building that ξ is maximized at 0 (although it could also attain that maximum at other points). Furthermore, by differentiability of a function maximized at an interior point, we know that the FOC there is =0 and the SOC is ≤0. We are going to use these FOC and SOC to build the mathematical tool that we will apply to actual equations. Take the FOC and SOC of ξ at 0:

FOC: ξ’(0)=g’(0)-mf’(0)=0

SOC:ξ”(0)=g”(0)-mf”(0)≤0

Of particular interest is the FOC as it states that:

m=g'(0)/f'(0)
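The FOC ratio is easy to recover numerically by finite differences. A minimal sketch, again assuming the arbitrary illustrative pair f(x)=x, g(x)=1-exp(-x), for which m should equal 1:

```python
import math

# Illustrative pair (not from the lecture): f'(0) = 1 and g'(0) = 1, so m = 1.
f = lambda x: x
g = lambda x: 1 - math.exp(-x)

# Central-difference estimates of the derivatives at 0.
h = 1e-6
f_prime_0 = (f(h) - f(-h)) / (2 * h)
g_prime_0 = (g(h) - g(-h)) / (2 * h)
m = g_prime_0 / f_prime_0
assert abs(m - 1.0) < 1e-6

# With this m, the pointwise condition g(x)-g(0) <= m[f(x)-f(0)] holds:
assert all(g(x) - g(0) <= m * (f(x) - f(0)) + 1e-9
           for x in [k / 10 for k in range(-20, 21)])
```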

That’s it! We’re done (up to a normalization.) You can see that the condition that leads to:

If Ef(Xº)≤f(0), then Eg(Xº)≤g(0) (using differentiability, g' exists and f'(0)≠0)

Is simply that:

1. g(x)-g(0)≤[g'(0)/f'(0)][f(x)-f(0)] for all x

by substituting in for m.

Now, let’s do two examples:

Example 1: f(x)=u1(w+x), g(x)=u2(w+x)

Plug this into equation 1:

u2(w+x)-u2(w)≤[u2'(w)/u1'(w)][u1(w+x)-u1(w)]

Unfortunately, the above equation does not have much economic meaning—remember that utility is ordinal rather than cardinal. Therefore, let’s normalize the slope of each utility function at w to equal 1. Dividing through by u2'(w) leads to the normalization and we have:

[u2(w+x)-u2(w)]/u2'(w) ≤ [u1(w+x)-u1(w)]/u1'(w)

This is the necessary and sufficient condition for:

If EU1(w+x)≤U1(w), then EU2(w+x)≤U2(w)
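The condition can be verified for a concrete family of utilities. A minimal sketch using CARA utilities u_i(y)=-exp(-a_i·y), an illustrative choice not from the lecture; a higher coefficient a makes a player more risk averse, and hence more diffident here:

```python
import math

# Illustrative CARA family: u_i(y) = -exp(-a_i * y).
a1, a2 = 0.5, 2.0            # type 2 is more risk averse than type 1
w = 1.0

def normalized_gain(a, x):
    """[u(w+x) - u(w)] / u'(w) for CARA utility with coefficient a."""
    u = lambda y: -math.exp(-a * y)
    u_prime_w = a * math.exp(-a * w)
    return (u(w + x) - u(w)) / u_prime_w

# The necessary and sufficient condition for diffidence, checked pointwise:
for x in [k / 10 for k in range(-30, 31)]:
    assert normalized_gain(a2, x) <= normalized_gain(a1, x) + 1e-12
```

For CARA the normalized gain simplifies to (1-exp(-a·x))/a, which is decreasing in a, so the inequality holds at every x.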

The above comparison is useful for ‘take it or leave it’ choices. It gives the necessary and sufficient condition for a type two player not to accept any gamble that a type one player would not. The term for this (courtesy of Miles Kimball) is diffidence. (The more diffident player rejects more risks.)

Graphically, you see the following comparison:

The solid line is the utility of player one (in excess of the no risk option) divided by the slope of the utility curve at w. The curved line is the same for U2.

Note that diffidence is not as strong a condition as global risk aversion. In fact, it only gives information at a given level of wealth (as w is the outside option).

Example 2: f(x)=xu1'(w+x), g(x)=xu2'(w+x)

In other words, type two wants less of the risky asset than does type one—for given wealth.

The mapping of terms is:

f'(x)=u1'(w+x)+xu1''(w+x)

g'(x)=u2'(w+x)+xu2''(w+x)

f''(x)=2u1''(w+x)+xu1'''(w+x)

g''(x)=2u2''(w+x)+xu2'''(w+x)

f'(0)=u1'(w)

g'(0)=u2'(w)

Plug into our magic formula, and you have:

xu2'(w+x)/u2'(w) ≤ xu1'(w+x)/u1'(w)

This is the necessary and sufficient condition for greater central risk aversion. While similar, this statement is weaker than a statement of global risk aversion. (Note that if u1'(w)=0, then you have some difficulty, as you are dividing by zero. However, 1. l'Hôpital's rule should give you a fine fraction, and 2. marginal utility equal to zero at w? Come on.)

We can do some useful algebra on the above condition. Dividing by x (which flips the inequality for x less than 0), we have:

u2'(w+x)/u2'(w) ≤ u1'(w+x)/u1'(w) for x>0

u2'(w+x)/u2'(w) ≥ u1'(w+x)/u1'(w) for x<0

In other words, we have a single-crossing property on marginal utility.
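The single-crossing property is easy to check for a concrete family. A minimal sketch using CRRA marginal utility u'(y)=y^(-γ), an illustrative choice; taking γ2>γ1 makes type two more risk averse:

```python
# Illustrative CRRA marginal utility: u_i'(y) = y**(-gam_i).
gam1, gam2 = 1.0, 3.0        # type 2 is more risk averse than type 1
w = 1.0

def ratio(gamma, x):
    """u'(w+x) / u'(w) for CRRA with relative risk aversion gamma."""
    return (w + x) ** (-gamma) / w ** (-gamma)

# Normalized marginal utilities cross exactly once, at x = 0:
for x in [k / 20 for k in range(-15, 41) if k != 0]:   # keep w + x > 0
    if x > 0:
        assert ratio(gam2, x) <= ratio(gam1, x) + 1e-12
    else:
        assert ratio(gam2, x) >= ratio(gam1, x) - 1e-12
```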