An Informal Proof/Demonstration of the Second Derivative Test

(on Second-Degree Two-Variable Polynomials)

Math 211, Arizona State University

By Scott Surgent

A second-degree two-variable polynomial has this general form:

$$f(x, y) = Ax^2 + By^2 + Cxy + Dx + Ey + F$$

The letters A through F represent the coefficients of the function, and the only requirement is that both A and B be non-zero (otherwise the surface reduces to a simpler shape that loses its appeal for this discussion).

Depending on the values and signs of A, B and C, the surface may be a paraboloid that opens up (hence, its vertex is a minimum), a paraboloid that opens down (the vertex is a maximum) or a saddle-shaped surface (the vertex is the saddle-point). The values D, E and F simply govern the shift of the surface, not its intrinsic shape. As we will see, only the values and the signs of A, B and C will be important for this demonstration.

More formally, the following conditions generate the following shapes:

Case I: A and B are both positive and C is “small”: The surface is a paraboloid and opens up.

Case II: A and B are both negative and C is “small”: The surface is a paraboloid and opens down.

Case III: A and B have the same sign but C is “large”: The surface is a saddle.

Case IV: A and B have different signs: the surface is a saddle, regardless of C.

In the demonstration that follows, exact conditions on C will be imposed.

Before we start with the demonstration, we will assume that the vertex point (the critical point) is located at $x = 0$ and $y = 0$. If it is not, then simple shifts can be imposed to move the critical point over the origin. In other words, once we show the second derivative test to be true at $x = 0$ and $y = 0$, then it is true at any critical point by taking into account the shifts that can be imposed at will. We do not lose any generality this way.

The basic idea is this: if the function has a critical point at $x = 0$ and $y = 0$, then it may be a minimum, maximum or saddle depending on the signs of the second derivatives. Visually, if the critical point is a maximum, we want it to be a maximum for all cross-sections through the origin, not just for the x-axis and y-axis cross-sections. Similarly, if the critical point is a minimum, it is a minimum for all cross-sections. However, if the vertex is a saddle, then it is a maximum for some cross-sections and a minimum for others.

Take the two first partial derivatives:

$$f_x = 2Ax + Cy + D, \qquad f_y = 2By + Cx + E$$

Now take the four second partial derivatives:

$$f_{xx} = 2A, \qquad f_{yy} = 2B, \qquad f_{xy} = C, \qquad f_{yx} = C$$

Make note of these values as they will become important much later in the discussion. Also, note that only the values A, B and C “survive” to the second derivative step.
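As a side illustration of the shift to the origin mentioned earlier, the critical point can be located by setting both first partial derivatives, $f_x = 2Ax + Cy + D$ and $f_y = 2By + Cx + E$, equal to zero and solving the resulting pair of linear equations. The sketch below does this with Cramer's rule. The function name `critical_point` and the sample coefficients are made up for this example, and the sketch assumes $4AB - C^2 \neq 0$.

```python
def critical_point(A, B, C, D, E):
    """Solve 2Ax + Cy + D = 0 and Cx + 2By + E = 0 for (x, y).

    Hypothetical helper for illustration; assumes 4AB - C^2 is non-zero.
    """
    det = 4 * A * B - C * C  # determinant of the 2x2 linear system
    if det == 0:
        raise ValueError("4AB - C^2 = 0: no unique critical point")
    x0 = (C * E - 2 * B * D) / det
    y0 = (C * D - 2 * A * E) / det
    return x0, y0


# Made-up example: f(x, y) = x^2 + 2y^2 + xy - 4x - 9y
x0, y0 = critical_point(1, 2, 1, -4, -9)
print(x0, y0)  # prints 1.0 2.0
```

Replacing x by x + 1 and y by y + 2 in this sample function would move its critical point to the origin without changing A, B or C.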

Now, refer back to the original function:

$$f(x, y) = Ax^2 + By^2 + Cxy + Dx + Ey + F$$

We want to look at any cross section through the origin. This can be done by defining a line of slope k on the x-y plane that passes through the origin (akin to a constraint path). Let this line be $y = kx$. Therefore, substituting this into the function, we get:

$$f(x, kx) = Ax^2 + B(kx)^2 + Cx(kx) + Dx + E(kx) + F = (Bk^2 + Ck + A)x^2 + (D + Ek)x + F$$

The result is a cross-section function in x only, and it is a parabola. Using normal single-variable calculus steps, the first and second derivatives are:

$$\frac{d}{dx} f(x, kx) = 2(Bk^2 + Ck + A)x + (D + Ek), \qquad \frac{d^2}{dx^2} f(x, kx) = 2(Bk^2 + Ck + A)$$

Although the second derivative is complicated looking, it is just a constant that depends on the values and signs of A, B, C and k.
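The claim that the cross-section's second derivative is the constant $2(Bk^2 + Ck + A)$ can be checked numerically. The sketch below compares that formula with a central-difference estimate along the line $y = kx$. The function names and sample coefficients are made up for this example, and the step size h = 1 is chosen only because a central difference is exact for a quadratic.

```python
def f(x, y, A, B, C, D=0.0, E=0.0, F=0.0):
    """The second-degree two-variable polynomial from the text."""
    return A * x * x + B * y * y + C * x * y + D * x + E * y + F


def cross_section_second_derivative(A, B, C, k, h=1.0):
    """Central-difference estimate of d^2/dx^2 of f(x, kx) at x = 0."""
    def g(x):
        return f(x, k * x, A, B, C)  # cross-section along y = kx
    return (g(h) - 2 * g(0.0) + g(-h)) / (h * h)


def formula(A, B, C, k):
    """Closed form of the same second derivative: 2(Bk^2 + Ck + A)."""
    return 2 * (B * k * k + C * k + A)


# Made-up coefficients; each pair of printed values should agree.
for k in (-2.0, 0.0, 0.5, 3.0):
    print(k, cross_section_second_derivative(1, 2, 1, k), formula(1, 2, 1, k))
```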

The idea now is to study this second derivative equation and whether its signs change or stay the same, and under what conditions the signs may change. The “2” can be ignored; we’ll only deal with the expression $Bk^2 + Ck + A$.

In a sense, we are letting k act as a variable since we are looking at all possible cross-sections through the origin, and each k represents one possible cross-section (except for the y-axis itself). So we look at this expression and treat it as a parabola: $y = Bk^2 + Ck + A$. From algebra, the “k”-coordinate of the vertex of a parabola is found by $k = -\frac{C}{2B}$. The full vertex would be

$$\left( -\frac{C}{2B}, \ \frac{4AB - C^2}{4B} \right)$$

where the second coordinate was found by substituting $k = -\frac{C}{2B}$ into $Bk^2 + Ck + A$ and reducing the expression into a single fraction.
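For readers who want the reduction spelled out, the substitution can be worked through step by step. Nothing is assumed here beyond the expression $Bk^2 + Ck + A$ and the vertex location $k = -\frac{C}{2B}$.

```latex
\begin{align*}
B\left(-\frac{C}{2B}\right)^2 + C\left(-\frac{C}{2B}\right) + A
  &= \frac{C^2}{4B} - \frac{C^2}{2B} + A \\
  &= \frac{C^2 - 2C^2 + 4AB}{4B} \\
  &= \frac{4AB - C^2}{4B}
\end{align*}
```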

Very important fact:

The vertex is always the highest or lowest point of a parabola. Therefore, one of the following two facts will be true (depending on B):

$$Bk^2 + Ck + A \ge \frac{4AB - C^2}{4B} \ \text{ if } B > 0 \qquad \text{or} \qquad Bk^2 + Ck + A \le \frac{4AB - C^2}{4B} \ \text{ if } B < 0$$

If B is positive, the expression on the left is a parabola that opens up so therefore its “y”-values (the full expression on the left) must be greater than or equal to the “y”-value of its vertex point (the full expression on the right). Analogously if B is negative.
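The two inequalities can be spot-checked by sampling many slopes k and comparing each value of $Bk^2 + Ck + A$ with the vertex value $\frac{4AB - C^2}{4B}$. This is a minimal sketch rather than a proof. The names `q` and `vertex`, the sample coefficients and the grid of slopes are all made up for illustration.

```python
def q(k, A, B, C):
    """The expression B*k^2 + C*k + A, viewed as a parabola in k."""
    return B * k * k + C * k + A


def vertex(A, B, C):
    """Vertex (k, value) of the parabola in k; assumes B is non-zero."""
    return -C / (2 * B), (4 * A * B - C * C) / (4 * B)


slopes = [i / 10 for i in range(-100, 101)]  # k from -10 to 10

# B > 0: every sampled value sits at or above the vertex value.
k_v, q_v = vertex(1, 2, 1)
print(k_v, q_v, all(q(k, 1, 2, 1) >= q_v for k in slopes))

# B < 0: every sampled value sits at or below the vertex value.
k_v, q_v = vertex(-1, -2, 1)
print(k_v, q_v, all(q(k, -1, -2, 1) <= q_v for k in slopes))
```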


So now we run through all possible cases. Remember from the first page the cases were:

Case I: A and B are both positive and C is “small”: The surface is a paraboloid and opens up.

Case II: A and B are both negative and C is “small”: The surface is a paraboloid and opens down.

Case III: A and B have the same sign but C is “large”: The surface is a saddle.

Case IV: A and B have different signs: the surface is a saddle, regardless of C.

Now we plan to be more formal with the discussion of the cases:

Case I: A and B are both positive and C is “small”: The surface is a paraboloid and opens up.

Since A and B are positive, the values $Bk^2 + Ck + A$ are always greater than or equal to the value $\frac{4AB - C^2}{4B}$:

$$Bk^2 + Ck + A \ge \frac{4AB - C^2}{4B}$$

Think of $Bk^2 + Ck + A$ as a parabola (in variable k) that opens up because B is positive, and therefore all its “y”-values will always be greater than or equal to the y-value of the vertex. As long as this vertex value is positive, then the second derivative of the original surface is always a positive quantity for all k (for all cross-sections). Hence, all cross-sections are concave up, and the vertex (0,0) from the original surface is a minimum.

This is true only if C is “small” so that $\frac{4AB - C^2}{4B} > 0$. This allows us now to formally define what C can be: we want $4AB - C^2 > 0$, so solving for C we get $C^2 < 4AB$, that is, $|C| < 2\sqrt{AB}$. As long as this is true, then case I holds true. If $C^2 > 4AB$, then the expression $\frac{4AB - C^2}{4B}$ is negative, and we have a case where the second derivative is sometimes positive (for some k) and sometimes negative (for some other k). This is a saddle (case III).
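A small worked example may help make “small” and “large” concrete. The coefficients below are chosen only for illustration: A = B = 1 in both examples, with C = 1 in the first and C = 3 in the second.

```latex
\begin{align*}
&A = 1,\ B = 1,\ C = 1: & 4AB - C^2 &= 3 > 0,
  & k^2 + k + 1 &\ge \tfrac{3}{4} > 0 \ \text{for all } k \quad \text{(minimum)} \\
&A = 1,\ B = 1,\ C = 3: & 4AB - C^2 &= -5 < 0,
  & k^2 + 3k + 1 &= 1 \ \text{at } k = 0, \quad = -1 \ \text{at } k = -1 \quad \text{(saddle)}
\end{align*}
```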

Case II: A and B are both negative and C is “small”: The surface is a paraboloid and opens down.

Since A and B are negative, the values $Bk^2 + Ck + A$ are always less than or equal to the value $\frac{4AB - C^2}{4B}$:

$$Bk^2 + Ck + A \le \frac{4AB - C^2}{4B}$$

Furthermore, the value $\frac{4AB - C^2}{4B}$ will be negative as long as C is “small” (the bottom will be negative no matter what, while the top will be positive if C is small). A similar argument as in case I ensues: the second derivative values are always less than or equal to a known negative quantity $\frac{4AB - C^2}{4B}$, so therefore the second derivative is always negative for all k (for all cross-sections). Therefore, the vertex at (0,0) is a maximum.

If C is sufficiently large so that it makes the numerator of $\frac{4AB - C^2}{4B}$ negative (and hence, the whole expression positive), then we get another case where the second derivative expression can sometimes be positive (for some k) and sometimes negative (for other k). This is a saddle and another example of case III. The same restrictions on C are imposed as from case I.


Case III: A and B have the same sign but C is “large”: The surface is a saddle.

We have just discussed this case: when A and B have the same sign and $C^2 < 4AB$, then either case I or case II is true. Otherwise we get this case, and we have a saddle point.

Case IV: A and B have different signs: the surface is a saddle, regardless of C.

Again we look at the relationship between $Bk^2 + Ck + A$ and $\frac{4AB - C^2}{4B}$. If A and B have different signs then 4AB is always negative (do you see why?), so subtracting $C^2$ keeps the numerator negative anyway. In the case where B is positive and A negative, then the expression $\frac{4AB - C^2}{4B}$ is negative while $Bk^2 + Ck + A$ opens up, so therefore $Bk^2 + Ck + A$ is sometimes positive and sometimes negative, depending on k. Hence, a saddle, just as in case III. In the case where B is negative and A positive, then $Bk^2 + Ck + A$ opens down but $\frac{4AB - C^2}{4B}$ is positive: again, a case where $Bk^2 + Ck + A$ is sometimes positive and sometimes negative, depending on k, which results in a saddle.
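One way to see the sign change concretely is to find the slopes k at which $Bk^2 + Ck + A$ equals zero. By the quadratic formula these exist as two distinct real numbers exactly when $C^2 - 4AB > 0$, which always holds when A and B have different signs. The sketch below computes them. The function name `sign_change_slopes` and the sample coefficients are made up for this example.

```python
import math


def sign_change_slopes(A, B, C):
    """Slopes k where B*k^2 + C*k + A crosses zero, in increasing order.

    Returns an empty list when the expression never changes sign.
    Hypothetical helper; assumes B is non-zero.
    """
    disc = C * C - 4 * A * B  # discriminant of the quadratic in k
    if disc <= 0:
        return []  # no crossing (disc == 0 only touches zero)
    root = math.sqrt(disc)
    return sorted([(-C - root) / (2 * B), (-C + root) / (2 * B)])


# Made-up saddle example: A = -1, B = 1, C = 0 gives k^2 - 1.
print(sign_change_slopes(-1, 1, 0))  # prints [-1.0, 1.0]

# Made-up minimum example: A = 1, B = 2, C = 1 never changes sign.
print(sign_change_slopes(1, 2, 1))   # prints []
```

In the first example, cross-sections with slope between -1 and 1 are concave down and the rest are concave up, which is the saddle behavior described above.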

Lastly, note that the expression $4AB - C^2$ played a fundamental role in the study of each case. Remember, these were actually second derivatives of the original function:

$$f_{xx} = 2A, \qquad f_{yy} = 2B, \qquad f_{xy} = C$$

Hence, $4AB - C^2 = f_{xx} f_{yy} - (f_{xy})^2$. This is the familiar expression for the second derivative test in multi-variable functions. The “4” appears only because, for the one type of surface we studied (the second-degree two-variable case), $f_{xx} = 2A$ and $f_{yy} = 2B$, so their product is $4AB$.
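The four cases can be collected into a short classifier based on the sign of $4AB - C^2$ and the sign of A. This is a minimal sketch under the handout's conventions (A and B non-zero, critical point at the origin). The function name `classify` is made up, and the "inconclusive" label for the boundary case $4AB - C^2 = 0$ is an assumption, since that case is not treated in this demonstration.

```python
def classify(A, B, C):
    """Classify the critical point of Ax^2 + By^2 + Cxy + Dx + Ey + F.

    Uses 4AB - C^2, which equals f_xx * f_yy - f_xy**2 for this surface.
    """
    disc = 4 * A * B - C * C
    if disc > 0:
        # A and B share a sign here, so checking A alone is enough.
        return "minimum" if A > 0 else "maximum"
    if disc < 0:
        return "saddle"
    return "inconclusive"  # assumption: boundary case not covered in the text


print(classify(1, 2, 1))    # Case I: prints minimum
print(classify(-1, -2, 1))  # Case II: prints maximum
print(classify(1, 1, 3))    # Case III: prints saddle
print(classify(-1, 1, 0))   # Case IV: prints saddle
```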

Conclusion

This demonstration was by no means a complete proof since we did it for the specific case of a second-degree two-variable function and not for general functions in three dimensions. The nature of this demonstration also made it impossible to include the case where the cross section is the y-axis itself (since k would have to be infinite). However, we can handle that case on its own as we have before. The full proof requires vectors and total derivatives, topics we do not cover in Math 211. Even so, hopefully this helps explain why the second derivative form has the appearance that it does.
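For completeness, the y-axis cross-section left out above can be sketched directly by setting $x = 0$ in the original function. Nothing is assumed here beyond the general form of f.

```latex
\begin{align*}
f(0, y) &= By^2 + Ey + F \\
\frac{d^2}{dy^2} f(0, y) &= 2B
\end{align*}
```

In cases I and II, B has the same sign as A, so this cross-section is concave in the same direction as all the others.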