Using Secant Parabolas to Find Roots
Historical Background One of the most important and longest-standing problems in mathematics is that of finding the roots of equations. Solving any linear equations is trivial. Quadratic equations are more challenging, though the quadratic formula was essentially known to the ancient Babylonians, some 3000 years ago. It took about 1500 years until the Italian mathematician Niccolò Tartaglia discovered a formula (in the 1530’s) that takes about half of a printed page to display for finding the three roots of any cubic equation, though he attempted to keep the result a secret. Several years later, the Italian mathematician Lodovico Ferrari discovered a method for solving for the four roots of any quartic equation (this takes two full printed pages). In 1824, the Norwegian mathematician Niels Abel proved that it is not possible for any such formula to exist for every polynomials of degree five or higher; it became known as the Abel Impossibility Theorem.
If the equation is not polynomial, then the chances of being able to find its roots algebraically become far less likely. For instance, the transcendental equation has infinitely many real roots (look at the graphs of the two functions), but it is impossible to solve for any of them in closed form (meaning, via an algebraic expression for x). Instead, over the last 200 years, mathematicians and other users developed various numerical techniques for approximating the roots of any polynomial or other equation to any desired degree of accuracy. Efforts to find more effective (meaning faster) methods are still a major on-going research effort.
Some Root-Finding Methods We look briefly at some of the simplest, most famous, and useful, methods for approximating roots of equations , assuming is continuous. Perhaps the simplest is the bisection method that starts by finding, (either trial-and-error or the graph), two values of x, call them and , for which and have opposite signs. Consequently, and lie on either side of an unknown real root r, as shown in Figure 1. The interval is then split in half at . The desired root r must then lie in either the left-hand subinterval if and have opposite signs, or in the right-hand subinterval otherwise. This process is then repeated iteratively until the distance between the bracketing points is as small as desired and so the final approximation is as accurate as one wants.
For example, suppose we seek the intersection point of and (see Figure 2a). This point is clearly somewhere in the interval , roughly near . Instead of using the two curves, we consider the function (see Figure 2b). The intersection point therefore corresponds to the real root of .
Using the bisection method, the midpoint of is where . Consequently, the root is in the right half, so we consider the subinterval , whose midpoint is . Since is positive, is negative, and is negative, we conclude that the root must be in the left-hand subinterval . The midpoint is at ; since is positive, the root must be between 0.625 and 0.75. In turn, the midpoint of this subinterval is , where , so r must be between 0.6875 and 0.75. The midpoint of this subinterval is . The process can easily be continued to zero in on the root, which is actually , correct to eight decimal places. However, notice that after four iterations, we have only determined the first decimal place, so the process certainly seems rather slow. We show the values obtained in Table 1, which demonstrates that the method is clearly converging to the root 0.70346742. This is reinforced by looking at the errors – the differences between the approximation and the correct answer – that are in the last column. The errors tend to zero and are reduced by about 1/10 after every three iterations, which indicates a slow speed for convergence.
This observation suggests that it is possible to assess the accuracy of each subsequent approximation. At each stage, both the actual root r and the approximation lies within some subinterval, say , and therefore the maximum possible error, which is the difference between r and , is at most . Thus, in the above example, the first estimate is off by at most 0.5, the width of that subinterval. The second estimate, , is off by at most 0.25, which is half of the preceding maximum error, and so forth. Therefore, after any three successive estimates, the maximum error is reduced by a factor of 1/8, which means that, roughly, every three iterations of the bisection method reduces the error by about 1/10, so that we essentially obtain an extra decimal place of accuracy.
The author has developed an exploratory dynamic Excel spreadsheet that allows interested readers and their students to experiment with the bisection method by changing the function and the initial interval to see the convergence both graphically and numerically. It, as well as dynamic Excel spreadsheets for the other methods discussed in this article, can be downloaded from ….
A second, more effective, method for approximating roots of equations is the secant method. It similarly starts by bracketing a real root r between two values and , which determine the points and on the curve. The line connecting these points is constructed and the point where it intersects the horizontal axis is calculated. See Figure 3. Then the process is repeated using and to obtain , and so on, until the final approximation is as accurate as one wants.
Using the same example , we again start with the initial interval , as shown in Figure 2b. The secant line through the endpoints and has the equation . This line intersects the x-axis at , correct to six decimal places, as shown in Figure 4. We then construct the secant line joining and . The equation of this line is and it intersects the x-axis at . A third repetition of the method has us find the secant line through and . The equation of the line is , which intersects the x-axis at . Notice that after three iterations, we have already determined the first three decimal places of the root r, compared to the bisection method that required four iterations to gain one decimal place accuracy. So the secant method converges more rapidly than the bisection method.
We show the various values obtained in Table 2, which demonstrate that the method is clearly converging to the root. Notice how rapidly the error values decrease to zero; it is much faster than with the bisection method. However, the secant method may fail if we do not have a good initial interval to begin with. On the other hand, the bisection method uses the shrinking bracketing intervals that always contain the root and hence will always work.
A dynamic Excel spreadsheet is available to experiment with the secant method for any function and any initial interval to see the convergence both graphically and numerically.
A third method, called the Regula Falsi (False Position) Method, is a combination of the secant method and the bisection method. It starts with the secant method to bracket a real root r between two values and , which determine the points and on the curve. The line connecting these points is constructed and the point where it intersects the horizontal axis is calculated. Then, as in the bisection method, the desired root r must lie in either the left-hand subinterval if and have opposite signs, or in the right-hand subinterval otherwise. The process is repeated iteratively until the length of the subinterval between the bracketing points is as small as desired and so the final approximation is as accurate as one wants.
Using the same example , we again start with the initial interval . The secant line through the endpoints and has equation ; it intersects the x-axis at , correct to six decimal places. Since , the root is between 0.612700 and 1. We then construct the secant line from to ; its equation is and it intersects the x-axis at , where . A third repetition of the method has us find the secant line through and , whose equation is . Again this line intersects the x-axis at, where . We show the various values obtained in Table 3, which demonstrates that the method is clearly converging to the root. Notice how rapidly the error values decrease; it is faster than with the bisection method, but slower than with the secant method. Unlike the secant method that may fail if the we do not have a good initial interval, the false position method always works because it keeps the solution inside the initial interval. Notice that each successive calculated value for y – first 0.166485, then 0.018994, and finally 0.002061 – is considerably smaller than the previous one. This suggests that the points we are using for the approximations are ever closer to the root.
A dynamic Excel spreadsheet is also available for experimenting with the false position method for any function and any initial interval to see the convergence both graphically and numerically.
A fourth, calculus-based, root-finding method is Newton’s method that appears in almost every calculus text. It starts with a single initial approximation to the root r. The tangent line to the curve at the point is constructed and the point where the line crosses the x-axis is then taken as the subsequent approximation, as shown in Figure 5. The process is continued repetitively until the desired level of accuracy is achieved. If is sufficiently close to r, the process converges exceptionally rapidly; typically, once a sufficiently accurate approximation is determined, each subsequent approximation essentially doubles the number of correct digits. Unfortunately, if is not close enough to r, the process could shoot way off to a very distant intersection point and meander slowly back toward r, or it could converge to a different root, or it could even not converge at all.
We suggest that interested readers apply Newton's Method to the same example we used previously to compare the rate of convergence.
A Root-Finding Method Using Quadratic Interpolation We now consider a slightly more sophisticated approach that extends the secant method by using the secant parabola to the curve determined by three points in the interpolation sense. Any three non-collinear points (points that do not lie on a line) uniquely determine a quadratic polynomial. Suppose that the root of the equation is r and that we bracket r between and . For our third point , choose the midpoint of the interval , which is . If the two initial points are reasonably close to r, the parabola determined by the points , , and will lie extremely close to the curve of the function. See Figure 6. We then use the quadratic formula to determine the zeros of this parabola; if the initial points are close enough to r, then the approximating parabola will have a pair of real zeros and one of them should be very close to r, as seen in Figure 6. We select whichever of the two zeros lies within the interval , call it , and then determine the parabola through the three points , , and . The following set of points will be , , and , and so on.
We can actually anticipate which of the two zeros should be used if the portion of the curve being bracketed is either strictly increasing or decreasing with fixed concavity. If it is increasing and concave up, then the larger root will be inside the interval and close to r; if it is increasing and concave down, then the smaller root will be inside the interval and close to r. If the curve is decreasing and concave up, we should use the smaller root and if it is decreasing and concave down, then the larger root works best. However, if the direction and/or the concavity change, we cannot anticipate in advance; instead, we should zoom in more closely about r.
We again use the same example as before to illustrate the convergence of this method. The initial interval is , so that and the three points are , , and . The parabola through these points is (you can use a graphing calculator to find this) and its roots are and , both correct to eight decimal places (from the quadratic formula or various methods on a graphing calculator). Since is within the interval , we take . In comparison, the actual root is roughly , so this single iteration has produced an approximation that is accurate to two decimal places. Incidentally, we do not show the actual graphs because they are so close to one another that they are indistinguishable.
For the next iteration, we use the values , , and , so that the three points are , , and . Notice how close to 0 we are (0.00410845) at the middle point. The parabola through these points is and its roots are and , so we use as the next approximation. Note that this approximation is correct to four decimal places, so we have effectively increased the number of correct digits by 2. Also, the associated value of y is 0.00004137, which is extremely close to 0. We show the results of the first three iterations of the method in Table 4, which shows just how quickly the successive approximations converge to the root r, how fast the associated heights at these points converge to 0, and how rapidly the error approaches 0. In fact, after three iterations, we effectively have determined the solution correctly to six decimal places, a gain of two more correct digits. We can expect that the next iteration adds at least two additional correct digits.