Rensselaer Polytechnic Institute

35476 Computer Applications Laboratory

EXPERIMENTS IN OPTIMAL CONTROL

Number of Sessions – 4

INTRODUCTION

In modern control systems design it is sometimes necessary to design controllers that not only effectively control the behavior of a system, but also minimize or maximize some user defined criteria such as energy or time conservation, or obey other physical or time constraints imposed by the environment. Optimal Control Theory provides the mathematical tools for solving problems like these, either analytically or through computer iterative methods, by formulating the user criteria into a cost function and using the state equation representation for the system dynamics.

The optimal control experiments in this lab involve the use of a PC-AT to illustrate the efficacy of microprocessors as system controllers. Two types of control will be considered:

1) Continuous (analog)

2) Discrete (digital)

The general formulation of the optimal control problem is as follows:

Let the system states (i.e., the state variables) be represented by an n-dimensional vector x, where n is the order of the system. Let the control variables (input) be represented by an m-dimensional vector u, and the system output by an r-dimensional vector y. The system can, in general, be modelled by a set of differential equations of the form:

ẋ = a(x, u) , x(t0) = c

y = b (x, u) (1)

where a and b are general (possibly nonlinear and time varying) expressions of x and u, t0 is the initial time and c is an n-dimensional set of initial conditions. The objective is to find a control input u(t) that will drive the system from the initial point c ( =x(t0) ) in the state space to the final point x(tf), and at the same time minimize a cost functional J (formally called performance index/criterion) given by equation (2):

J = h(x(tf), tf) + ∫[t0, tf] g(x(t), u(t), t) dt (2)

where:

tf represents the end of the control interval,

h and g are user defined penalty expressions.

PART I – CONTINUOUS CONTROL

MATHEMATICAL FORMULATION – RESULTS

Even though the general problem is very hard to solve, there are very useful simplifications that lead to closed form (analytic) solutions. The most common simplification (and the one used for this experiment) is the LQR problem, where the system is Linear and the controller (Regulator) must satisfy a Quadratic cost functional. Assuming that the system is time invariant as well, then state and output equations (1) become:

ẋ(t) = Ax(t) + Bu(t) , x(0) = x0

y(t) = Cx(t) + Du(t) (3)

where:

x(t) is the (n x 1) state vector,

x0 is the initial state vector,

u(t) is the (m x 1) control input vector,

y(t) is the (r x 1) output vector,

A is the (n x n) state dynamics matrix,

B is the (n x m) control dynamics matrix,

C is the (r x n) state-output matrix,

D is the (r x m) input-output matrix (for all practical purposes assumed zero hereafter).

Corresponding to the system is a performance index represented in a quadratic form as:

J(u(t), x(t), t) = x'(tf)Hx(tf) + ∫[0, tf] [ x'(t)Qx(t) + u'(t)Ru(t) ] dt (4)

where:

H is the (n x n) terminal state penalty matrix,

Q is the (n x n) state penalty matrix,

R is the (m x m) control penalty matrix.

It is important to emphasize that the H, Q, and R matrices are user selectable, and it is through the proper selection of these that the various environment constraints are to be satisfied. If H, Q, and R are positive definite and symmetric, then it can be shown that a closed loop optimal control function u(t) exists (called u*(t)), and is uniquely given by:

u*(t) = -R⁻¹B'P(t)x(t) = G(t)x(t)

G(t) = -R⁻¹B'P(t) (5)


where:

G(t) is the (m x n) optimal feedback gain matrix,

P(t) is an (n x n) symmetric and positive definite matrix that satisfies the continuous matrix differential Riccati equation given by:

Ṗ(t) = -P(t)A - A'P(t) + P(t)BR⁻¹B'P(t) - Q , P(tf) = H (6)

From the above it is obvious that, even for the LQR problem, the set of coupled differential equations (6) must be solved before the controller gains can be found. It should be noted that these gains are functions of time. The significance of the result of equation (5) is that the gains can be computed ahead of time, and the optimal controller is then easily implemented on any digital computer. In all but the simplest real-life applications, computer iterative methods are utilized to pre-calculate and store the P matrix and the gains, resulting in very efficient and robust open and closed loop controllers.
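The backward sweep of equation (6) is easy to sketch for a first-order plant. The fragment below (an illustration only, not part of the experiment) integrates the scalar Riccati equation backward in time for the single-integrator plant ẋ = u (a = 0, b = 1) with assumed penalties q = r = 1 and terminal weight h = 0, storing the gain schedule as it goes; the analytic solution p(tf - τ) = tanh(τ) shows p settling to its steady-state value 1 far from tf.

```python
# Backward sweep of the scalar differential Riccati equation (6)
# for the single-integrator plant xdot = u (a = 0, b = 1),
# with assumed penalties q = r = 1 and terminal weight h = 0.

a, b, q, r, h = 0.0, 1.0, 1.0, 1.0, 0.0
tf, dt = 10.0, 1.0e-4

p = h                      # boundary condition P(tf) = H
gains = []                 # gain schedule g(t) = -b*p(t)/r, stored backward
t = tf
while t > 0.0:
    pdot = -2.0 * a * p + (p * p * b * b) / r - q   # equation (6)
    p = p - dt * pdot      # one Euler step backward in time
    t -= dt
    gains.append(-b * p / r)

p_ss = p                   # approaches the steady-state solution, here 1
```

Stored this way, the gains would simply be played back in forward time during the run, exactly as described above.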

A further simplification to the above solution takes place when tf approaches ∞, or in more realistic terms when the control is applied for a very long period of time. In this case it can be shown that if (1) the system is controllable, (2) H is zero, and (3) A, B, Q, and R are all constant matrices, then the P(t) matrix converges to a constant symmetric, real, positive definite matrix Pss. Obviously Ṗ is then zero, and the matrix differential Riccati equation (6) is transformed into a matrix algebraic equation (called the Steady State Riccati Equation) given by:

0 = -PA - A'P + PBR⁻¹B'P - Q (7)

The steady state solution Pss need not be unique. However, only one real positive definite solution can exist. Since P is of dimensions (n x n), we obtain n² equations to solve for the n² components of P. Yet because P is symmetric, the number of computations is greatly reduced. Note that the steady state Riccati equation for the continuous case is easier to solve without the aid of a computer than its discrete equivalent, to be considered later, because no inverse of the unknown P appears in the equation. Once the P matrix is determined, the optimal control gain matrix G can be found from equation (5). Of course the gains are now constants, as expected. Solving the general (non steady state) LQR problem requires substantial storage for the gains, depending on tf and the sampling interval, and exceeds the scope of this experiment, which will hereafter focus on the steady state case.

Another obvious but nevertheless important result of the steady state case is that the system states must converge to zero regardless of their initial condition vector x0. An intuitive proof by contradiction can be derived using the following notion: the performance index is always positive, as a sum of positive terms, since all matrices are positive definite. If it is to be minimized over an infinitely large time, then this minimum must be finite. Yet if the states do not approach zero, then from equation (5) neither does the control, and we end up summing two positive quantities inside the integral for an infinite time. This can never give a finite (minimum) value; therefore the states must go to zero.
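This regulation behaviour is easy to confirm numerically. The sketch below (an illustration only) simulates the closed loop ẋ = (A + BG)x with forward Euler, using the plant and steady-state gains worked out in the example of the next section, and shows the states decaying to zero from an arbitrarily chosen initial state.

```python
import numpy as np

# Closed-loop simulation u = G x for the example plant
# x1dot = x2, x2dot = -x1 + u, with the steady-state LQR
# gains G = [-0.4142, -1.352] derived in the example below.

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
G = np.array([[-0.4142, -1.352]])

Acl = A + B @ G                 # closed-loop dynamics matrix
x = np.array([1.0, -2.0])       # arbitrary initial condition
dt, T = 1.0e-3, 15.0
for _ in range(int(T / dt)):
    x = x + dt * (Acl @ x)      # forward-Euler step

final_norm = float(np.linalg.norm(x))   # should be essentially zero
```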

Applications exploiting this result are controllers designed to keep the states always near zero, counteracting random abrupt changes in the state values (or any other causes that can be modelled as impulses). The following example sets up a typical infinite time LQR problem and analytically solves the steady state Riccati equation, giving the student sufficient experience to deal with the actual experiment.


EXAMPLE

Consider the following second order system:

[ ẋ1 ; ẋ2 ] = [ 0 1 ; -1 0 ][ x1 ; x2 ] + [ 0 ; 1 ]u(t)

y(t) = [ 1 0 ][ x1 ; x2 ]

with the corresponding performance index given by

J = ∫[0, ∞] ( x1²(t) + x2²(t) + u²(t) ) dt (that is, Q = I and R = 1)

The continuous steady state Riccati equation (7) can be written as:

PBR⁻¹B'P = PA + A'P + Q (8)

and using this example's data it becomes:

P[ 0 ; 1 ](1)[ 0 1 ]P = P[ 0 1 ; -1 0 ] + [ 0 -1 ; 1 0 ]P + [ 1 0 ; 0 1 ]

The matrix equation can be rewritten as three simultaneous algebraic equations (remember P12 = P21):

P12² = 1 - 2P12

P12P22 = P11 - P22

P22² = 1 + 2P12

Solving the three equations, two real solutions for P are obtained:

P = [ 1.912 0.4142 ; 0.4142 1.352 ] , P = [ -1.912 0.4142 ; 0.4142 -1.352 ]

By checking the principal minors of each of the two matrices, it is found that only the first of these is positive definite; therefore it is the accepted solution.

The feedback gain matrix is then determined from equation (5) to be:

G = [ -0.4142 -1.352 ]

Thus, the optimal control input u(t) is given by:

u(t) = -0.4142x1(t) - 1.352x2(t) (9)

The student is expected to work through this example prior to attempting the problem for the experiment, to ensure a thorough understanding of the procedure.
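The algebra above can be double-checked numerically. The sketch below (pure Python, no toolbox assumed) solves the three scalar equations in closed form, substitutes the positive definite solution back to confirm zero residuals, and recomputes the gains from equation (5).

```python
import math

# Example system: A = [ 0 1 ; -1 0 ], B = [ 0 ; 1 ], Q = I, R = 1.
# Closed-form solution of the three simultaneous equations:
p12 = math.sqrt(2.0) - 1.0          # p12^2 = 1 - 2*p12, positive root
p22 = math.sqrt(1.0 + 2.0 * p12)    # p22^2 = 1 + 2*p12
p11 = p22 * (1.0 + p12)             # p12*p22 = p11 - p22

# Residuals of the three equations -- all should be essentially zero
r1 = p12 * p12 - (1.0 - 2.0 * p12)
r2 = p12 * p22 - (p11 - p22)
r3 = p22 * p22 - (1.0 + 2.0 * p12)

# Gains from equation (5): G = -R^-1 B' P = -[ p12 p22 ]
g1, g2 = -p12, -p22
```

The computed g1 and g2 should reproduce the values -0.4142 and -1.352 quoted above.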

FIGURE 1. Block diagram of the open loop system.

PROBLEM FORMULATION

This experiment is based on the steady state case of the LQR problem. Here the plant to be controlled is a 2nd order linear and time invariant system implemented on the Comdyna analog computer; the penalty matrices Q and R are constant and H is zero. The optimal control input u(t) will drive the system states x1 and x2 from their initial values of -3 volts and 0 volts respectively to zero, and it will be generated either using the Comdyna analog computer or a PC-AT running the appropriate program. The second order system to be controlled is modeled by the transfer function:

H(s) = Y(s)/U(s) = 1/(s(s + 1)) = 1/(s² + s) (10)

which is to be implemented on the Comdyna analog computer following a suitably designed simulation diagram.

The state variable representation of the system transfer function (10) is of the same general form as equations (3) with D equal to zero, and is given by:

[ ẋ1 ; ẋ2 ] = [ 0 1 ; 0 -1 ][ x1 ; x2 ] + [ 0 ; 1 ]u(t) , [ x1(0) ; x2(0) ] = [ -3 ; 0 ]

y(t) = [ 1 0 ][ x1 ; x2 ] (11)

Given the state variable equations, the open loop system block diagram can be obtained as in FIG. 1 and the actual analog computer simulation as in FIG. 2. The student is urged to verify both results, paying particular attention to the initial conditions in FIG. 2. Note that the integrators invert the voltage on the IC input, so +3 volts there will give the desired initial output voltage of -3 volts.

FIGURE 2. Analog computer simulation of the open loop system.
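One way to verify that the state variable model realizes the plant transfer function is to evaluate H(s) = C(sI - A)⁻¹B at a few test points. The sketch below does this for the realization of equations (11), with the transfer function assumed here to be 1/(s(s + 1)).

```python
import numpy as np

# Check H(s) = C (sI - A)^{-1} B against the assumed plant
# transfer function 1/(s(s+1)) at a few complex test frequencies.

A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

max_err = 0.0
for s in (0.5 + 0.0j, 1.0 + 1.0j, -0.3 + 2.0j):
    Hs = (C @ np.linalg.solve(s * np.eye(2) - A, B))[0, 0]
    expected = 1.0 / (s * (s + 1.0))
    max_err = max(max_err, abs(Hs - expected))
```

A vanishing maximum error confirms that the simulation diagram and the transfer function describe the same plant.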

The performance index for this system is given in accordance with equation (4) by:

J = ∫[0, ∞] [ x'(t)Qx(t) + u'(t)Ru(t) ] dt (12)

EXPERIMENTAL PROCEDURE

The aim of the experiment is to design a closed loop optimal controller that will drive the states to zero using the feedback gains obtained via the equations (5). For this purpose the following steps must be sequentially implemented:

1) The open loop system is to be built on the Comdyna analog computer in accordance with the simulation diagram of FIG. 2. Special care must be taken when implementing the initial conditions and the gains because of the sign inversions at the output of the amplifiers. The strip chart recorder must be thoroughly calibrated and its various scales mastered before any useful work can be done. Remember: during setup the Comdyna dial must be in the Pot Set position and the push button labeled IC pressed in; during operation the dial must always be in the Oper position, and the push button labeled OP must be pressed in just before each run starts.

2) The impulse response of the system is to be simulated using the Comdyna and the PC-AT, and the natural (uncontrolled) modes of the states x1, x2 are to be plotted on the strip chart. The concept behind this is that when an asymptotically stable system is excited with an impulse function δ(t), the states converge to zero regardless of their initial values, producing the same final result as the optimal controller does. Hence the plots of the state trajectories obtained in this part are to be compared with those produced by applying the closed loop optimal control. Before running this part it is necessary to verify that both states are indeed stable by solving the system differential equations. It should be noted, however, that this system is not strictly asymptotically stable because of the pole at the origin.

To implement the impulse function, connect the D/A0 port of the PC-AT to the input of the system, and select the Uncontrolled State Response option from the program menu (details on running the PC-AT program are provided later). It is important to understand that since the δ(t) function is a theoretical pulse of infinite amplitude and infinitesimal duration, no physically realizable input can reproduce it. Hence a +10 volt step function is applied for a period of 296 msec to produce the same net result. The duration is found experimentally and is valid only for the -3 volt initial condition of state x1. You may also observe that state x1 reaches the desired zero as theoretically expected, yet then continues to increase linearly with time. This should be attributed to leakage within the Comdyna integrators (from the first to the second integrator) rather than to an inconsistency in the theory.
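The approximate-impulse run can be previewed in simulation. The sketch below (assuming the plant of equations (11) to be ẋ1 = x2, ẋ2 = -x2 + u) applies the +10 volt step for 296 msec starting from x1 = -3 V, x2 = 0 V; the pulse area of 10 × 0.296 = 2.96 V·s returns x1 very nearly to zero, matching the experimental observation.

```python
import numpy as np

# Approximate-impulse response: +10 V applied for 296 msec to the
# assumed plant x1dot = x2, x2dot = -x2 + u, from x = (-3, 0) volts.

A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
x = np.array([-3.0, 0.0])      # initial conditions in volts

dt, T, pulse = 1.0e-4, 8.0, 0.296
t = 0.0
while t < T:
    u = 10.0 if t < pulse else 0.0      # finite pulse standing in for d(t)
    x = x + dt * (A @ x + B[:, 0] * u)  # forward-Euler step
    t += dt

x1_final = float(x[0])         # should end close to 0 V
```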

FIGURE 3. Block diagram of the closed loop system.

3) The algebraic Riccati equation for the above system is to be solved, and the P matrix elements and the optimal feedback gains g1, g2 are to be computed using equation (5) and following the example. Even though MATLAB or any other equivalent math package can be used for verification purposes, a detailed analytical solution for the P matrix elements and the gains is mandatory. Having found the gains, the optimal controller is given by equation (13) as:

u*(t) = g1x1(t) + g2x2(t) (13)

The block diagram of the complete closed loop system is given in FIG. 3, and its simulation diagram is given in FIG. 4. The control calculation (equation (13)) of FIG. 4 is to be implemented either on the Comdyna by the student, or through the PC-AT program upon selecting the relevant menu option.

Figure 4. Analog computer simulation of the closed loop system.

4) The complete closed loop system, both the plant and the feedback controller, is to be built on the Comdyna analog computer following FIG. 4, and a strip chart recording of the state trajectories obtained for comparison with the results from the other runs. This controller should produce the best results for a given set of dynamics and penalties. (Why?) The implementation on the Comdyna can be done either using the multipliers, or by passing the states individually through the .1 inputs of adders to effectively increase them by a factor of 10, and then through potentiometers set to .1 times the gains. Selecting the latter method requires the use of two analog computers.
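The whole design procedure can be rehearsed in software before patching the Comdyna. The sketch below assumes the plant of equations (11) to be ẋ1 = x2, ẋ2 = -x2 + u, and for illustration uses unit penalty weights Q = I, R = 1 (the actual weights of equation (12) should be substituted when working the experiment): it solves the three scalar steady-state Riccati equations by hand, forms the gains from equation (5), and simulates the closed loop from the experiment's initial state.

```python
import math
import numpy as np

# Steady-state Riccati solution for the assumed plant
# x1dot = x2, x2dot = -x2 + u, with illustrative weights Q = I, R = 1.
p12 = 1.0                              # p12^2 = 1, positive root
p22 = -1.0 + math.sqrt(4.0)            # p22^2 + 2*p22 - 3 = 0, positive root
p11 = p12 + p12 * p22                  # p12*p22 = p11 - p12

g1, g2 = -p12, -p22                    # equation (5): G = -[ p12 p22 ]

# Simulate u*(t) = g1*x1 + g2*x2 from the experiment's initial state
A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
G = np.array([[g1, g2]])
Acl = A + B @ G                        # closed-loop dynamics matrix
x = np.array([-3.0, 0.0])
dt = 1.0e-3
for _ in range(int(20.0 / dt)):
    x = x + dt * (Acl @ x)             # forward-Euler step

final_norm = float(np.linalg.norm(x))  # states driven to zero
```

The resulting trajectories can then be compared against the strip chart recordings of step 4.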