Nonlinear Programming (ORI 391Q)
Homework 6 (Steepest Descent)
1. Consider the problem
Minimize 5x2 + 5y2 - xy - 11x + 11y + 11.
a. Find a point satisfying the first-order necessary conditions for a solution.
b. Show that this point is a global minimum.
c. What would be the rate of convergence of steepest descent for this problem?
d. Starting at x = y = 0, how many steepest descent iterations would it take (at most) to reduce the function value to 10-11?
2. Let e1, e2, . . . , en denote the eigenvectors of the symmetric positive definite n ´ n matrix Q. For the quadratic problem: min{f(x) = xTQx – xTb}, suppose x0 is chosen so that g0 = Qx0 – b belongs to a subspace M spanned by a subset of the ei’s. Show that for the method of steepest descent gk Î M for all k. Also, find the rate of convergence for this case.
Hint: Let M = {y Î Rn : y = , where bi are scalars}, and use induction. In your proof, you might need to make use of the eigenvalue equation: Qei = liei, where li is the ith eigenvalue, i = 1,…,n.
3. Let f(x,y) = x2 + y2+ xy - 3x.
a. Find an unconstrained local minimum point of f.
b. Why is the solution to part (a) actually a global minimum point?
c. Find the minimum point of f subject to x ³ 0, y ³ 0.
d. If the method of steepest descent were applied to part (a), what would be the rate of convergence of the objective function?
4. Stopping criterion: A question that arises in using an algorithm such as steepest descent to minimize an objective function f is when to stop the iterative process, or, in other words, how can one tell when the current point is close to a solution. If, as with steepest descent, it is known that convergence is linear, this knowledge can be used to develop a stopping criterion. Let be the sequence of values obtained by the algorithm. We assume that fk ® f* linearly, but both f* and the convergence ratio b are unknown. However, we know that, at least approximately,
fk+1 - f* = b ( fk - f*)
and
fk - f* = b (fk-1 - f*).
These two equations can be solved for b and f*.
a. Show that
.
b. Motivated by the above we form the sequence defined by
as the original sequence is generated. (This procedure of generating from {fk} is called the Aitken d2-process.) If = bk + o(bk) show that = o(bk) which means that converges to f* faster than { fk} does. The iterative search for the minimum of f can then be terminated when fk - is smaller than some prescribed tolerance.
Hint: For convenience and without loss of generality, assume f* = 0. Also recall that the notation h(x) = o(x) means that limxॠh(x)/x = 0. Now consider /bk.
- 1 -