FW853 Topic 13. Dynamic Programming

FW853--Topic 13. Dynamic Programming

Much different than past methods, since there isn=t a Ageneral@ algorithm. The method and basic principles, however, are very flexible, and can accomodate nonlinear constraints and objective functions. Also applicable to situations where have probabilities, and other stochastic problems.
Most often used in problems where there is a sequence of choices to be evaluated. Best seen through and example.
Example
Stagecoach problem-basically, a traveler wants to get from point 1 to point 10 with the least likelihood of getting killed (or less dramatically, at the least cost). This problem presumes 4 stages. This problem also has 10 states. The entire problem can be neatly summarized in a figure showing stages and costs of different routes. This problem has 18 possible routes, so it could be done by complete enumeration. Larger problems with more stages or states can have far more possibilities than could be evaluated by complete enumeration.
Notice that a Agreedy@ algorithm is not the best-if go 1->2->6->9->10 this costs 13 whereas 1-4-6-9-10 costs 11
Question is Awhat is the algorithm@-Begin at the end. Note this is a simple case since have only one possible end point
At stage 4, you can only be at state 10, and cost is zero. Now need some notation and concepts-

(1)let n=stage. Here, stages go from 1 to 4. Since only have one start, x1=1, and x4=10.

(2)let x=decision variable. Thus, xn (n=1,2,3,4) are immediate destination on stage n.

(3)let s=state.

(4)let fn(s,xn)=total cost of the best overall policy for remaining stages, given that you are in state s, ready to start stage n, and selects xn as the next destination-go to example

(1)If you are in state 8, ready to go to stage 4, and select state 10 as immediate destination, then, value of fn=3

(2)This will become more clear on next stage.

(5)Given s and n, let x be the Abest@ choice of new destination. Also let f (s) be the corresponding value. Thus,

(1)f (s)=min fn(s,xn) across all xn = fn(s,x )

(2)Put another way, fn(s,xn)=immediate cost for stage n + minimum future costs (stages n+1 onward) Formula fn(s,xn)=csxn + f (xn)

(3)Begin with tableaus

(6)Process works backwards until reach stage 1-problem is then Asolved@

(7)Advantages-

(1)Once have solved problem, you also have a solution for all possible starting points.

(2)If get off track-since you have all solutions, you can readily make best decision from there on

(3)This can also be used as a sort of sensitivity analysis

(4)Definitions of states and stages is very flexible-example in Hillier and Lieberman on distribution of effort (medical teams)

Principle of optimality-this is a necessary condition (i.e., assumption) for dynamic programming to work

(1)Given the current state of the system, an optimum policy for the remaining stages is independent of any policy adopted in previous stages

(2)Put another way (my words) It doesn=t matter how you got where you are, only worry about doing the best you can from here on out.

(3)Note that how you define state becomes very important if past history carries over in a way that is not represented.

Another example-allocation of effort
Copy pages from Hilier and Lieberman-World health organization
Key point is how states are defined