
Faculty of Economics and Business Administration

Department of Economics

Notes on

Maximization in Economics

by

Ragnar Arnason

DRAFT

January 2001


Preface

These notes contain a reasonably straightforward and concise outline of elementary maximization techniques in economics. The notes are designed to be of practical use to undergraduate students in economics and business. Therefore the notes are primarily concerned with describing techniques for solving economic problems and providing examples. They contain no formal analysis. Although to a significant degree self-contained, the notes unavoidably assume some familiarity with elementary calculus and a minimal knowledge of the concepts of matrix algebra.

University of Iceland

08.01.01

Ragnar Arnason

Contents

0. Introduction

1. Mathematical Preliminaries

1.1 Derivatives

1.2 Total Differentials

1.3 Concave, Convex Functions

1.4 Definite Hessian Matrices

2. Maximization

2.1 The Simple Problem: No Constraints

2.2 Equality Constraints

2.2.1 The Substitution Approach

2.2.2 The Lagrange Multiplier Approach

2.3 Inequality Constraints

3. Applications to Microeconomics

3.1 Derivation of Demand Functions

3.2 Derivation of the Slutsky Equation

Further Readings

0. Introduction

Consider a function, f(x,z), where x is a (1×n) vector and z is a (1×k) vector. This function maps x and z into a real number representing the value of the function.

A common problem in economics is to identify the x's that correspond to the maximum value of the function f(x,z). In that case f(x,z) is referred to as an objective function and the elements of the vector x as control variables.

The allowable range of the control variables is sometimes constrained. The constraints may be functions in the form of equalities or inequalities as follows:

bg(x),

where b is a constant and g is a function. A very simple example of this type of constraints is the inequality 0-x1 which of course translates into the more convenient expression x10.

When the functions f(x,z) and g(x) are twice continuously differentiable, a well developed theory for finding the maximizing levels of the control variables is available. The following discussion is restricted to that case.

1. Mathematical Preliminaries

1.1 Derivatives

Consider the function y=f(x)f(x1,x2,...,xn). By definition the first partial derivative of f( ) with respect to xi is[1]:

f(x)/xi= [f(x1,..,xi+h,...,xn)-f(x1,x2,...,xn)]/h.

This first derivative of f(x) with respect to xi is interchangeably referred to as

y/xi  f(x)/xi  fi fxi  fi(x)  fxi(x).

The function fi(x) is called a partial derivative because it measures the change in the variable y with respect to xi when the other variables contained in the vector x are kept constant. The derivative in the alternative case, i.e. where the other variables in x are also allowed to change in response to a change in xi (and thus exerting a secondary influence on y), is referred to as a total derivative denoted by

dy/dxidf(x)/dxi.

Not surprisingly, total derivatives are more complex and difficult to find than partial derivatives.

The first partial derivative (as well as the total derivative) is generally a function of the same variables as the original function. Consequently it may be differentiated again yielding:

fi/xj= [fi(x1,..,xi+h,...,xn)-fi(x1,x2,...,xn)]/h.

A derivative of a derivative is called a second derivative and is interchangeably denoted by:

f2(x)/xixj  y2/xixj  fij  fxixj  fij(x)  fxixj(x).

If i is not equal to j, $f_{ij}$ is often referred to as a cross derivative of the function f(x). Concerning cross derivatives, the following useful theorem is available:

Young's theorem: If f(x) is twice continuously differentiable, then $f_{ij}(x) = f_{ji}(x)$ for all i and j, i.e. the order of differentiation does not matter.

Clearly the above-described process of differentiation can in principle be extended to higher derivatives.

Example:

The first partial derivatives of the Cobb-Douglas function, $y = x^a z^b$, are:

$\partial y/\partial x = a x^{a-1} z^b,$

$\partial y/\partial z = b x^a z^{b-1}.$

The second partial derivatives are:

$\partial^2 y/\partial x^2 \equiv f_{11} = a(a-1)\, x^{a-2} z^b,$

$\partial^2 y/\partial z^2 \equiv f_{22} = b(b-1)\, x^a z^{b-2},$

$\partial^2 y/\partial x \partial z \equiv f_{12} \equiv f_{21} = ab\, x^{a-1} z^{b-1}.$

In maximization theory, points in the domain of f(x) at which the first derivatives of the function vanish, i.e. $f_i = 0$ for all i, are of special interest as they may correspond to a maximum of the function. Consequently they are often referred to as critical points.

1.2 Total Differentials

In order to find derivatives of a function (and for several other purposes) it is often convenient to take total differentials of the function.

Consider the function y = f(x), where x is a (1×n) vector. The total differential of the function is defined as:

(1) $df(x) = f_1 dx_1 + f_2 dx_2 + \dots + f_n dx_n \equiv \sum_i f_i dx_i,$

where $dx_i$ represents an infinitesimal change in $x_i$.

Now, (1) can be used to easily obtain total and partial derivatives. Thus the total derivative of y with respect to x1, say, is:

$df(x)/dx_1 = f_1 + f_2\, dx_2/dx_1 + \dots + f_n\, dx_n/dx_1.$

Note that the terms dxi/dx1 for any i represent the change in xi when x1 is altered.

To find the first partial derivative of y with respect to $x_1$ we must keep the other x variables (i.e. $x_2, x_3, \dots, x_n$) constant. This means that $dx_2 = dx_3 = \dots = dx_n = 0$ and the total derivative symbol 'd' should be replaced by the partial derivative symbol '$\partial$'. Hence (1) immediately yields:

$\partial f(x)/\partial x_1 = f_1.$

Example:

Consider again the Cobb-Douglas function $y = x^a z^b$. The total differential of this function is:

$dy = (\partial y/\partial x)\,dx + (\partial y/\partial z)\,dz = (a x^{a-1} z^b)\,dx + (b x^a z^{b-1})\,dz.$

Thus, the total derivative of y with respect to z is immediately found to be:

$dy/dz = (a x^{a-1} z^b)\,dx/dz + b x^a z^{b-1}.$

Restricting x to be constant reduces this expression to:

$\partial y/\partial z = b x^a z^{b-1}.$

1.3 Concave, Convex Functions

Consider the function y = f(x), where x is a (1×n) vector. This function is strictly concave in x if and only if it satisfies the following condition:

For any two vectors $x^1$ and $x^2$ in the domain of f(x) and any $\lambda$ such that $1 > \lambda > 0$,

$f(\lambda x^1 + (1-\lambda)x^2) > \lambda f(x^1) + (1-\lambda)f(x^2).$

This simply means that a straight line between any two points on the function f(x) lies below the function. This then is the defining characteristic of a concave function.

A weakly concave function f(x) is defined in the same way except that the strict inequality, i.e. the symbol '>', is replaced by the weak inequality, i.e. '$\ge$'.

A concave function is illustrated in Figure 1.a below.

The definition of a strictly convex function, f(x), is as follows:

For any two vectors $x^1$ and $x^2$ in the domain of f(x) and any $\lambda$ such that $1 > \lambda > 0$,

$f(\lambda x^1 + (1-\lambda)x^2) < \lambda f(x^1) + (1-\lambda)f(x^2).$

This simply means that a straight line between any two points on the function f(x) lies above the function. This then is the defining characteristic of a convex function.

A weakly convex function f(x) is defined in the same way except that the strict inequality, i.e. the symbol '<', is replaced by the weak inequality, i.e. '$\le$'.

A convex function is illustrated in Figure 1.b below.

It should be noticed that Figure 1 suggests that critical points of concave functions, points where all partial derivatives equal zero, correspond to a maximum and critical points of convex functions to a minimum.

Finally, the following rules concerning concave and convex functions are useful:

1. The sum of two concave (convex) functions is concave (convex).

2. If f(x,z) is concave (convex), then −f(x,z) is convex (concave).
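The defining chord condition can also be tested numerically on sample points. The following Python sketch (an illustration; the function and test points are arbitrary choices, not from the notes) checks whether the chord lies below the function, as concavity requires:

import numpy as np

def chord_below(f, x1, x2, num=25):
    """Check f(t*x1 + (1-t)*x2) >= t*f(x1) + (1-t)*f(x2) on sampled t.
    A sampled (necessary, not sufficient) check of concavity between two points."""
    for t in np.linspace(0.01, 0.99, num):
        lhs = f(t * x1 + (1 - t) * x2)
        rhs = t * f(x1) + (1 - t) * f(x2)
        if lhs < rhs - 1e-12:
            return False
    return True

f = lambda v: np.sqrt(v[0]) + np.sqrt(v[1])   # a concave function
print(chord_below(f, np.array([1.0, 4.0]), np.array([9.0, 1.0])))  # True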

1.4 Definite Hessian Matrices

Consider the function y = f(x), where x is a (1×n) vector. The following (n×n) matrix composed of the second derivatives of f(x) is called the Hessian matrix of f(x):

$$H = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{bmatrix}.$$

The Hessian matrix, H, is not only square. It is also symmetric, as $f_{ij} = f_{ji}$ by Young's theorem. It is important to realize that the elements of H are generally functions of x and, consequently, the values of these elements will generally alter from one point in the domain of f(x) to another.

The Hessian matrix, H, is said to be negative definite if the determinants of its leading principal minors alternate in sign, the first one being negative (i.e. $|H_1| < 0$, $|H_2| > 0$, $|H_3| < 0$, and so on).

Conversely, the matrix H is said to be positive definite if the determinants of all its leading principal minors are positive.
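The leading-principal-minor test is straightforward to implement. The sketch below (illustrative; the example matrix is an arbitrary choice) classifies a symmetric matrix accordingly:

import numpy as np

def classify_definiteness(H, tol=1e-12):
    """Classify a symmetric matrix by the signs of its leading principal minors."""
    n = H.shape[0]
    minors = [np.linalg.det(H[:k, :k]) for k in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "positive definite"
    # Negative definite: minors alternate in sign, the first being negative
    if all((m < -tol) if k % 2 == 0 else (m > tol)
           for k, m in enumerate(minors)):
        return "negative definite"
    return "neither (indefinite or semidefinite)"

print(classify_definiteness(np.array([[-2.0, 1.0], [1.0, -3.0]])))  # negative definite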

The following result is crucial: if the Hessian matrix of f(x) is negative definite for all x, then f(x) is strictly concave; if it is positive definite for all x, then f(x) is strictly convex.

2. Maximization

A typical economic problem is to find the level of a set of control variables that maximizes an objective function subject to some constraints. Formally this problem may be written as:

(I) Max f(x,z)

x

Subject to: $b \ge g(x)$,

where x is a (1×n) vector of control variables, z a (1×k) vector of exogenous variables, b an (m×1) vector of constants and g an (m×1) vector of constraint functions. The problem, in other words, has n control variables, k exogenous variables and m constraints. The constraints determine the set of allowable x vectors from which the optimal one must be chosen.

It should be noted that the formulation of the constraints in (I) is quite general, including equality constraints[2] as well as strict inequality constraints.

The case of minimization is also included in (I), for minimizing a function f(x,z) subject to the constraints in (I) is equivalent to maximizing −f(x,z) subject to the same constraints.

2.1 The Simple Problem: No Constraints

Consider now this simple variant of (I):

(II) Max f(x,z),

x

i.e. an unconstrained maximization problem.

The first step towards solving this problem is to locate the critical points. They are given by the x satisfying the following conditions:

f(x,z)/x1  f1 = 0,

f(x,z)/x2  f2 = 0,

. . . .

. . . .

f(x,z)/xn  fn = 0.

These conditions are also known as the necessary or first order conditions for solving (II). Thus the necessary condition for solving (II) is that all first derivatives of the objective function with respect to the control variables be simultaneously zero. This of course is highly intuitive: at the very peak of a mountain the ground is level (slopes equal to zero).

Zero slopes, however, are found at minima as well as maxima, and at other stationary points such as saddle points. Therefore further checks are usually required in order to verify that a maximum has in fact been located. These checks take the form of additional conditions usually referred to as sufficient or second order conditions. Sufficient conditions in this simple case may be formulated as follows:

The necessary conditions correspond to a global maximum if

either (a) the function f(x,z) is concave in x

or (b) the Hessian matrix for f(x,z) is negative definite for all x.

Moreover, if either the function f(x,z) is concave or the Hessian negative definite in the neighbourhood of a critical point, that critical point corresponds to a maximum.

If on the other hand the objective function is convex (or the Hessian positive definite) a critical point corresponds to a minimum.

Example

Consider the following problem:

Max f(x,y)=ax+by+cx2 + dy2

x,y

where a, b > 0 and c, d <0.

The necessary conditions are:

f1=a+2cx=0,

f2=b+2dy=0,

which yield the following candidates for a solution:

x=-a/2c,

y=-b/2d.

Checking the second order conditions, the Hessian matrix is:

$$H = \begin{bmatrix} 2c & 0 \\ 0 & 2d \end{bmatrix},$$

which is clearly negative definite since c and d are both negative: the leading principal minors are $2c < 0$ and $4cd > 0$. Hence we conclude that the first order conditions have in fact located a maximum.
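The solution can also be confirmed numerically, for instance with scipy. In the sketch below (the parameter values are arbitrary illustrative choices satisfying a, b > 0 and c, d < 0), maximizing f by minimizing −f reproduces x = −a/2c and y = −b/2d:

import numpy as np
from scipy.optimize import minimize

a, b, c, d = 2.0, 3.0, -1.0, -0.5   # illustrative values with a, b > 0 and c, d < 0
f = lambda v: a*v[0] + b*v[1] + c*v[0]**2 + d*v[1]**2

res = minimize(lambda v: -f(v), x0=np.zeros(2))   # maximize f by minimizing -f
print(res.x)                  # numerical optimum, approx. (1.0, 3.0)
print(-a/(2*c), -b/(2*d))     # analytical solution: x = -a/2c, y = -b/2d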

2.2 Equality Constraints

Consider now the following variant of the standard problem (I):

(III) Max f(x,z),

x

Subject to: b = g(x).

So, in this case the maximization is subject to constraints but the constraints are all equality constraints. It may be helpful to write the constraints out in full. For m constraints, the full set is:

$b_1 = g^1(x),$

$b_2 = g^2(x),$

$\vdots$

$b_m = g^m(x).$

Example

An example of problem (III) is the simple utility maximization problem in economics:

Max U(x,y)

x,y

Subject to: pxx+pyy=,

where U(x,y) is a person's utility function dependent on the consumption of the commodities x and y. pxand pyare the market prices of x and y respectively and  represents the person's available money. Thus, the constraint is the person's budget line.

There are essentially two approaches to solving problems of type (III): (i) the Lagrange multiplier method and (ii) the method of substitution. Although the Lagrange multiplier approach is a general method, far superior to the substitution approach, there are cases where the latter is more convenient than the former. It therefore deserves a brief description.

2.2.1 The Substitution Approach

In the substitution approach all the constraints are solved for a subset of the x's. If this is possible, which may very well not be the case, explicit functions for m of the control variables are obtained. Substituting these equations into the objective function transforms the initial maximization problem into an unconstrained one.

Example

It is easy to check that the utility maximization problem in the previous example may be rewritten as the following equivalent problem:

Max U((-pyy)/px,y)

y

Refer to the solution to this problem, i.e. the value of y that maximizes utility, as y*. Then the optimal level of x is given by the equation:

$x = (m - p_y y^*)/p_x.$

2.2.2 The Lagrange Multiplier Approach

The first step in this approach is to form the Lagrange function:

(2)L=f(x,z)+i i(bi-gi(x)),

where the i's are the Lagrange multipliers. There is one for each constraint. Notice that provided the constraints are not violated the Lagrange function is actually just the objective function.

Now, this Lagrange function may be regarded as an unconstrained objective function. The problem is to find critical points of this Lagrange function with respect to x and the $\lambda_i$'s. This can be done in the usual manner outlined in section 2.1. The main point, however, is that the critical points for the x's obtained by this approach will also be the critical points of the initial problem (i.e. (III)).

The critical points so located correspond to a first order condition for solving (III). The following second order or sufficient condition is available: if the Lagrange function is concave in x, the critical points solve (III).

For applying this condition, the following results are very useful: a linear function is both weakly concave and weakly convex, and the sum of concave functions is concave. Thus, for instance, if f(x,z) is concave and the constraint functions $g^i(x)$ are linear, the Lagrange function is concave in x.

Example

Consider again the utility maximization problem discussed in the previous two examples. The problem is:

Max U(x,y)

x,y

Subject to: pxx+pyy=,

The Lagrange function for this problem is:

L=U(x,y)+(-pxx-pyy).

The first order conditions are:

L=-pxx-pyy=0,

Lx=Ux-px=0,

Ly=Uy-py=0.

If U(x,y) is concave, these first order conditions are sufficient, i.e. they solve the problem.

More generally, if the determinant of the following bordered Hessian, H*, is positive, then the Lagrange function is concave in x and y at the critical point and the first order conditions solve the problem:

$$H^* = \begin{bmatrix} 0 & -p_x & -p_y \\ -p_x & U_{xx} & U_{xy} \\ -p_y & U_{yx} & U_{yy} \end{bmatrix}.$$

A very important feature of the Lagrange approach is that the Lagrange multipliers can be interpreted as the value, in terms of the objective function, of a marginal relaxation of the constraining parameters, b. More precisely:

i=f(x*,z)/bi, where x* is the level of x that solves the maximization problem.

Thus the i’s measures the highest price the maximizer would be willing to pay for a small increase in bi. Since the 's have many of the characteristics of prices they are generally referred to as shadow prices. You may find it interesting that in a perfect market economy all prices equal shadow prices.

2.3 Inequality Constraints

We now turn to the general maximization problem stated in (I) at the outset of chapter 2. This problem, we have already noticed, includes the case of equality constraints as a special case. The general problem, however, comprises both the inequality and equality constraints. Not surprisingly, therefore, the conditions for solving this problem are both more complicated and offer less assistance in actually identifying the solution. Here, essentially for completeness, we simply list the fundamental conditions due to Kuhn and Tucker for solving the general problem.

It is convenient to restate problem (I) as follows[3]:

(IV) Max f(x,z)

x

Subject to: (a) $b \ge g(x)$,

(b) $x \ge 0$,

where, as before, x is a (1×n) vector of control variables, z a (1×k) vector of exogenous variables, b an (m×1) vector of constants and g an (m×1) vector of constraint functions.

As before, the solution method involves forming the Lagrange function:

L=f(x,z)+ii(bi-gi(x)),

where the i's are Lagrange multipliers. In what follows it will be assumed that the expressions in parentheses are all nonnegative, i.e. not written in the form gi(x)-bi.

Assuming that the constraints satisfy certain regularity conditions, usually referred to as constraint qualification, the following provide the necessary (first order) conditions for solving the problem:

(a) $L_j = f_j(x,z) - \sum_i \lambda_i\, g^i_j(x) \le 0$, $x_j \ge 0$, $x_j L_j = 0$, $j = 1,2,\dots,n$,

(b) $L_{\lambda_i} = b_i - g^i(x) \ge 0$, $\lambda_i \ge 0$, $\lambda_i L_{\lambda_i} = 0$, $i = 1,2,\dots,m$,

where $g^i_j(x) \equiv \partial g^i(x)/\partial x_j$.

In the case of no nonnegativity constraints on the control variables, conditions (a) reduce to the much simpler Lagrange conditions specified in section 2.2. In the case of equality constraints, conditions (b) likewise reduce to the usual Lagrange conditions of section 2.2.

The sufficient conditions are essentially the same as in the previous section. In particular, if the Lagrange function is concave in the control variables and the constraint qualification is satisfied, the necessary conditions (a) and (b) are also sufficient.
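In practice such problems are often handed to a numerical solver that enforces the Kuhn-Tucker conditions internally. The sketch below (the objective, bounds and constraint are illustrative choices, not from the notes) uses scipy's SLSQP method on a small problem of the form (IV):

import numpy as np
from scipy.optimize import minimize

# Illustrative problem: max f = ln(1+x1) + ln(1+x2)
# subject to x1 + x2 <= 4 and x1, x2 >= 0
f = lambda v: np.log(1 + v[0]) + np.log(1 + v[1])
cons = [{'type': 'ineq', 'fun': lambda v: 4 - v[0] - v[1]}]   # b - g(x) >= 0
bnds = [(0, None), (0, None)]                                  # x >= 0

res = minimize(lambda v: -f(v), x0=np.array([1.0, 1.0]),
               bounds=bnds, constraints=cons, method='SLSQP')
print(res.x)              # optimum: x1 = x2 = 2 (the constraint binds)
print(4 - res.x.sum())    # slack is ~0, consistent with a positive multiplier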

3. Applications to Microeconomics

A fundamental axiom in microeconomic theory is that economic agents strive to maximize their objective functions. Frequently, moreover, this maximization is subject to various constraints. Therefore, not surprisingly, it has been found that the mathematical theory of maximization outlined in chapter 2, provides an indispensable tool for analysing and predicting microeconomic behaviour. To illustrate this we will now briefly consider examples of the application of these methods to simple microeconomic problems.

3.1 Derivation of Demand Functions

Consider a consumer with well-behaved preferences represented by the concave utility function U(x), where x represents a (1×n) vector of commodities. The consumer seeks to maximize his utility but is constrained in this endeavour by the budget constraint:

$m \ge \sum_i p_i x_i,$

where $p_i$ denotes the unit price of commodity i and m the consumer's available money.

Since utility maximization implies that the budget constraint is actually binding (why?) we may rewrite this problem as follows:

Max U(x)

x

Subject to: =i pixi.

To solve this problem we form the Lagrange function:

L=U(x)+(-ipixi).

The necessary and sufficient (why?) conditions for solving this problem are:

(i)Ui(x)=pi, i=1,2,...,n.

(ii)=i pixi.

It is worth noticing that conditions (i) immediately imply

$U_i(x)/U_j(x) = p_i/p_j$, for all i and j.

That is, the marginal rate of substitution between goods i and j (MRS(i,j)) equals their price ratio.

Solving the n+1 necessary conditions for the x's and $\lambda$ as functions of the other variables of the problem, i.e. p and m, we find:

(3) $x_i = X^i(p,m),\ i = 1,2,\dots,n,$

$\lambda = F(p,m).$

The first set of equations is this consumer's demand functions for the commodities, x. The second simply gives the shadow value of available money.

Notice that with the appropriate specifications of the commodities in the vectors x and p, the demand functions in (3) can also represent labour supply and demand for future commodities.

With the help of techniques called comparative statics additional information about the first derivatives of the demand functions can be obtained. Moreover, given an explicit utility function, explicit forms of the corresponding demand functions may be obtained.

Example

Consider now the simple case of two commodities x and y. Let the utility function be:

u=axbyc,

where a, b and c are positive constants. This function will be concave if b+c<1.

The budget constraint in obvious notation is:

=pxx+pyy.

The Lagrange function for the utility maximization problem is:

L=axbyc+(-pxx-pyy)

The necessary and sufficient conditions for maximum are:

abxb-1yc=px,

acxbyc-1=py,

=pxx+pyy.

Solving these three equations for x and y yields the following explicit form for the demand functions:

x=b/((b+c)px),

y=c/((b+c)py).

3.2 Derivation of the Slutsky Equation

Consider an n commodity economy. According to our previous analysis, a typical consumer's demand function for commodity i in this economy may be written as:

$x_i = X^i(p,m) \equiv X^i(p, \sum_j p_j x_j),$

since $m = \sum_j p_j x_j$.

Let the original commodity bundle chosen by the consumer be x*. Consider now a function that traces out the demand for commodity i for different prices on the assumption that income, i.e. m, is adjusted for every price combination so that the consumer is always just able to purchase his original bundle x*. This hypothetical function is a special type of demand function usually referred to as a compensated demand function. Its important feature from our point of view is that this function does not incorporate any income effects, only substitution effects. Call this compensated demand function for commodity i $h_i$. Clearly, $h_i = H^i(p, x^*)$.

Now at the original commodity bundle:

H(p,x*)X(p,).

Differentiating this equation with respect to the ith price, i.e. $p_i$, we find:

$\partial H^i(p,x^*)/\partial p_i = \partial X^i(p,m)/\partial p_i + (\partial X^i(p,m)/\partial m)(\partial m/\partial p_i),$

or

$\partial H^i(p,x^*)/\partial p_i = \partial X^i(p,m)/\partial p_i + (\partial X^i(p,m)/\partial m)\,x_i,$

since $\partial m/\partial p_i = x_i$ (income must rise by $x_i$ for every unit increase in $p_i$ if the original bundle is to remain just affordable).

Rearranging, we obtain:

X(p,)/pi=H(p,x*)/pi-(X(p,)/)xi,

which is the Slutsky equation. The term H(p,x*)/pi is the substitution effect and the term -(X(p,)/)xi the income effect.
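The Slutsky decomposition can be verified numerically for the Cobb-Douglas demands just derived. The sketch below (same illustrative parameters as above) computes both sides of the equation by finite differences:

import numpy as np

b_, c_ = 0.3, 0.4
px, py, m = 2.0, 1.0, 10.0

X = lambda px_, m_: b_ * m_ / ((b_ + c_) * px_)   # Marshallian demand for x

x_star = X(px, m)                                 # original bundle
y_star = c_ * m / ((b_ + c_) * py)

h = 1e-5
# Ordinary (uncompensated) price derivative dX/dpx
dX_dp = (X(px + h, m) - X(px - h, m)) / (2 * h)
# Compensated demand: income adjusted so the original bundle stays just affordable
Hc = lambda px_: X(px_, px_ * x_star + py * y_star)
dH_dp = (Hc(px + h) - Hc(px - h)) / (2 * h)
# Income derivative dX/dm
dX_dm = (X(px, m + h) - X(px, m - h)) / (2 * h)

print(dX_dp, dH_dp - dX_dm * x_star)   # the two sides of the Slutsky equation agree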

Further Readings

Binmore, K.G. 1983. Calculus. Cambridge University Press.

Chiang, A.C. 1984. Fundamental Methods of Mathematical Economics. McGraw-Hill.

Glaister, S. 1984. Mathematical Methods for Economists. Blackwell.

Intriligator, M.D. 1971. Mathematical Optimization and Economic Theory. Prentice Hall.