Abstract
In this article, we discuss a special class of time-optimal control problems for dynamic systems, where the final state of a system lies on a hyper-surface. In time domain, this endpoint constraint may be given by a scalar equation, which we call transversality condition. It is well known that such problems can be transformed to a two-point boundary value problem, which is usually hard to solve, and requires an initial guess close to the optimal solution. Hence, we propose a new gradient-based iterative solution strategy instead, where the gradient of the cost functional, i.e., of the final time, is computed with the adjoint method. Two formulations of the adjoint method are presented in order to solve such control problems. First, we consider a hybrid approach, where the state equations and the adjoint equations are formulated in time domain but the controls and the gradient formula are transformed to a spatial variable with fixed boundaries. Second, we introduce an alternative approach, in which we carry out a complete elimination of the time coordinate and utilize a formulation in the space domain. Both approaches are robust with respect to poor initial controls and yield a shorter final time and, hence, an improved control after every iteration. The presented method is tested with two classical examples from satellite and vehicle dynamics. However, it can also be extended to more complex systems, which are used in industrial applications.
1 Introduction
Among all optimal control problems, the solution of time-optimal problems is probably the most challenging ones. In this paper, we consider a dynamic system described by a set of minimal coordinates and given initial values. We are looking for the optimal control of the system such that the time for a prescribed process or maneuver becomes a minimum.
In order to solve such problems, fundamental books on optimal control as, e.g., Refs. [1] and [2] discuss direct and indirect methods. In summary, direct methods, also addressed as nonlinear programing, transform the dynamic problem to a static problem by a parameterization of the control variables, see Ref. [3], for example. Numerous methods follow a direct approach, e.g., interior point algorithms, active set, sequential quadratic programing, pseudo-spectral method, to name but a few. Indirect methods are based on the solution of the optimality conditions stated by Pontryagin resulting in a two-point boundary value problem. Solution strategies for the boundary value problem are discussed by various authors [4–7] and are already well established in a wide range in engineering sciences. The boundary value problem can, e.g., be solved with shooting methods or with discretization techniques in which both states and controls are discretized. For more details to direct and indirect methods, a good overview of both is given by Bianco et al. [8]. In a previous work, Bottasso et al. [9] presented a combined indirect approach for solving inverse and trajectory optimization problems in multibody dynamics, which also corresponds to the ideas presented in Ref. [5]. The latter mentioned work deals with the time-optimal control of a vehicle using an indirect approach to solve the boundary value problem.
A first significant formulation of the time-optimal control problem of a general vehicle maneuver has been stated by Hendrikx et al. [10]. This was followed by further articles concerning motorcycle lap time simulations in Ref. [4] and car lap time simulations discussed in Refs. [6] and [7]. Basically, all the mentioned authors have solved the boundary value problems associated with the optimal control problems. Time-optimal control problems play a significant role as well in robotic applications, see, e.g., Ref. [11] for the time-optimal control problem with specified trajectory. More recently, in Ref. [12] time-optimal motions following a predefined end-effector path are considered for kinematically redundant robots as well as for nonredundant robots. Here, a multiple shooting approach is used to solve the boundary value problem.
An indirect alternative to solving the underlying boundary value problem is given by gradient methods, independently developed by Kelley [13] and by Bryson and Denham [14] already in the sixties of the past century.
Here, the basic idea is to find a control history leading to a minimum of a cost functional. Numerous strategies are available to compute the control for which a functional attains a minimum, as, e.g., the method of the steepest descent, the conjugate gradient method, the gradient projection method, the Gauss-Newton method, or quasi-Newton methods like the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm when estimating the Hessian, see, e.g., Ref. [15] for an overview and comparison of performance behavior. In any cases, the gradient of the cost functional with respect to the control history has to be computed. A control signal is often time discretized in order to obtain a finite dimensional problem. In case of a time discretized control variable, the computational efforts are extremely high due to the numerical evaluation of the gradient at every time-step. However, the effort of the gradient computation can be significantly reduced by using the adjoint method, see, e.g., the pioneering work by Bryson and Ho [16]. The basic idea of this approach is the introduction of additional adjoint variables determined by a set of adjoint differential equations from which the gradient can be computed more efficiently than, e.g., with a numerical (direct) differentiation. To determine the partial derivatives with respect to each time-discrete control parameter by means of a numerical differentiation, the entire forward system must be solved once for a variation of every discretization parameter. With the adjoint method, however, the gradient can be computed very efficiently by solving two systems of ordinary differential equations only, i.e., by the solutions of the system equations and of the so-called adjoint equations. With the gradient information, the control input can be improved iteratively converging to an optimal control. Even by choosing a suboptimal initial guess for the control inputs, in general, the proposed method converges to the (local) optimal solution.
Various authors have already utilized this method in the field of inverse dynamics of multibody systems more than 20 years ago, as, e.g., Refs. [17] and [18], but nevertheless, the topic still provides open research potential.
More recently, Refs. [19] and [20] show how the adjoint method can be applied efficiently to a multibody system described by differential-algebraic equations of index three. A piecewise adjoint method is presented in Ref. [21], in which the coordinate partitioning underlying ordinary differential equations are formulated as a boundary value problem, and are solved by multiple shooting methods. It has to be emphasized that the adjoint method can be utilized to solve even highly nonlinear systems with a vast number of degrees-of-freedom in a straight forward way, as, e.g., in the application of second-order adjoint sensitivity analysis of large multibody systems in Ref. [22], in which a comparison between the adjoint method and direct differentiation shows a significant reduction in computational costs.
In this paper, a gradient-based solution approach using the adjoint method for time-optimal control problems for dynamic systems is presented, which has not yet been published in the open literature so far. The main novelty lies in the adaption of the adjoint gradient computation to a functional with variable integration limits. For that purpose, the adjoint method is formulated for two different domain approaches concerning a transformation between the time coordinate and a coordinate within a fixed interval essential for free final time problems. On the one hand, the adjoint method in a hybrid domain gives the advantage of solving the system and adjoint equations in time domain using classical solver and step size strategies, but defining the control and the gradient in the space domain. On the other hand, the adjoint method is presented in complete state space domain, in which the state and adjoint equations, cost functional, and control undergo a total transformation to a coordinate with fixed boundaries. Both methods arise as innovative tools for the efficient computation of the gradient of a cost functional for all kinds of optimization methods. Due to the novel consistent formulation, both presented methods even allow the application of preferable quasi-Newton methods using the BFGS-update for the Hessian. Therefore, time-optimal control problems can now be considered in an efficient way concerning convergence properties.
2 The Adjoint Gradient Method in Time Domain
where is a proper penalty function which is zero if the states and control variables satisfy the constraint conditions, and increases rapidly if they are not satisfied. Moreover, P should be continuously differentiable at least once. Unless violating the constraints, the cost functional J is simply the length of the time interval . The time-optimal control problem is then formulated as the problem of finding controls in the time interval which minimizes the functional J.
So far, we have not made any restrictions to the adjoint variables, now we shall find it convenient to choose such that all terms multiplied with vanish. Thus, the adjoint variables are defined by the linear final value problem
where ε must be a sufficiently small positive number.
3 Transformation to a Coordinate With Fixed Boundaries
A problem with the presented adjoint method arises because the modified control also causes a change of the time interval to . If , which should be the case for a control variation of the form like in Eq. (14), the improved control can be truncated at although the information about is available also for . However, if ε was selected too large, the terminal condition may be eventually met at a later time instance . In this case, we have no information about for .
which ranges between the fixed boundaries and . Based on this transformation, two solution strategies concerning the adjoint gradient method are pursued in this paper: (a) a hybrid formulation, where the state and adjoint equations are solved in the original time domain and only the controls and the gradient are defined in the fixed interval, and (b) the complete transformation of the time coordinate to a fixed interval. Both methods are discussed in detail below.
3.1 The Adjoint Method in a Hybrid Domain.
To eliminate the influence of all terms multiplied with , we now define the adjoint variables in the following way:
where ε is a sufficiently small positive number.
With this result, we are ready to apply a gradient-based optimization procedure for finding a minimum of J, which can be summarized as follows:
Choose an initial guess for the control history which causes the states to satisfy the terminal condition in Eq. (2).
Solve Eq. (1) for the state variables until Eq. (2) is satisfied. During the numerical integration of the state equations, the argument s in must be replaced by the original time t with Eq. (15).
Solve Eq. (21) for the adjoint variables by a backward integration starting at t = tf.
Select a value for ε and compute the control variation for every s from Eq. (25) by involving the inverse function of s(t). Alternatively, a quasi-Newton method can be applied to the update formula in Eq. (25), e.g., the BFGS-method.
Update the control history by setting . With the updated control , the cost functional should be reduced, if ε was selected sufficiently small.
Vary ε, resolve the state equations and evaluate the cost functional until an optimal value for ε is found, which minimizes the cost functional along the search direction. For that purpose, a line search algorithm as described in Ref. [24] can be used.
Replace the control history by the updated control and repeat steps 2–6. Terminate the optimization procedure when Eq. (24) is sufficiently small (zero gradient).
In summary, the advantage of the presented method is that the equations of motion and also the adjoint system can be solved in the time domain and only the control variables and the gradient formula are transformed into the state space. Unfortunately, this requires the derivative of the controls with respect to s which can become infinite for bang–bang controls and could cause an undesirable excitation term in the adjoint equations in Eq. (21a). In the next paragraph, we show that this problem can be overcome by an elimination of the time coordinate also in the state equations in Eq. (1a). However, the price to pay is that a standard multibody simulation software is probably no longer applicable.
3.2 The Adjoint Method in s-Domain.
The adjoint variables and can now be again defined such that all terms with and disappear. This is the case if and , i.e., for , and if
in which ε is again a small positive number determining the size of the control update.
Finally, we summarize the main steps of the algorithm for the s-domain:
Select a control history in the s-interval .
Solve the state equations in Eq. (27) for and the differential equation in Eq. (28) for t(s), simultaneously.
Solve the adjoint differential equations in Eq. (34) for the adjoint variables backward in s.
Compute the control update for every s by using Eq. (36). Note, instead of using the update directly, Eq. (36) can be used to apply a quasi-Newton method.
Improve the control history by setting . If ε was selected sufficiently small, the updated control reduces the cost functional in Eq. (29), in which the final time is given by .
Find a suitable factor ε, e.g., by using a line search algorithm.
Repeat steps 2–6 and stop the algorithm when in Eq. (35) is sufficiently small.
4 Numerical Examples
Now we consider two numerical examples in order to apply the proposed methods for computing the time-optimal control for dynamic systems. For the first example, the adjoint method in a hybrid domain is used for trajectory planning of a space flight, while in the second example the adjoint method in the s-domain is utilized for computing time-optimal vehicle maneuvers.
4.1 Time-Optimal Space Flight to the Sphere of Influence.
As a first example, we consider the time-optimal flight to the sphere of influence. The sphere of influence is a domain around a celestial body that indicates where the gravity of the body has a dominating effect on the dynamics of other space objects [25]. We are interested in the optimal control of a space vehicle so that we reach the sphere of influence in minimal time. We consider only planar motions described in polar coordinates r(t) and , see Fig. 3.
i.e., the length of the time interval .
For the numerical example, the following parameters are used: The specific impulse , the thrust force , the radius of the Earth , the gravitational acceleration , and the radius of the sphere of influence . Initially, the space vehicle is in a circular Earth orbit with altitude . Hence, the initial conditions at are given by: and . We consider two different initial controls, which are both far away from the optimal solution: (a) u(r) = 0 and (b) . In the first case, the thrust always acts in tangential direction. In the latter case, the thrust acts tangentially at the beginning and radially at the end, i.e., the sphere of influence is crossed in normal direction. The controls u(r) are discretized by 500 data points in time. Notice that a traditional gradient computation by direct differentiation would require 500 additional forward solutions of the state equations, whereas with our method the gradient is obtained already from one solution of the adjoint system equations. Figure 4 depicts the decrease of the final time with an increasing number of quasi-Newton iterations for both initial controls. The final time for the initial control (a) and for the initial control (b) are reduced to after the optimization process.
The time-optimal control angles u(r) computed from the initial controls (a) and (b) are shown in Fig. 5. The state histories r(t) and for the space vehicle are summarized in Fig. 6 and the flight trajectory of the space vehicle is plotted in Fig. 7. They are almost identical for both initial controls. The spacecraft moves nearly on a straight flight path. Notice that for the initial control (a) some small oscillations remain in the control u(r) at the end of the maneuver. The reason for this numerical artifact is that the final value is never updated by the gradient formula in Eq. (41) due to the zero values for p3 and p4 at t = tf and, hence, the control keeps its original value u = 0. Clearly, the final control value has no influence on the flight time. For the initial control (a), the algorithm converges to a step function at which requires a higher number of iterations. However, as it can be seen from the convergence plot in Fig. 4, the cost functional no longer decreases already after 20 iterations. This can be explained by the fact that the control history close to t = tf has only small influence on the cost functional. Hence, the oscillations in u(r) do not change the flight time and the flight path very much. However, Fig. 4 shows that choosing the initial control (b), for which the final value at t = tf coincides with the expected value , increases the convergence rate and the oscillations at the end of the optimal control history in Fig. 5 vanish.
4.2 Time-Optimal Control of a Racing Car.
As a second example, we consider a mechanical model of a car for which accelerating, braking, and steering should be determined such that the time for a vehicle maneuver becomes minimal.
4.2.1 Equations of Motion and Problem Formulation.
To describe the physical behavior of the vehicle, we use a single-track model and a Pacejka tire model for the road to wheel contact, see Fig. 8. The position of the vehicle is described by the absolute coordinates x and y of its center of gravity (CoG) and by the orientation angle ψ with respect to the x-axis. Moreover, m is the total mass of the vehicle and J denotes its moment of inertia. Both wheels have the effective radius R and the moment of inertia Jw about their rotation axis. The lengths a and b are the distances measured from the center of gravity to the rear axle and to the front axle. The control inputs of the model are the steering angle u1 of the front axis and the driving/braking torque u2 on the rear axis.
Since, for the Pacejka tire model, the tire forces , and must be described in the body-fixed frame of the chassis, it is convenient to formulate the equations of motion with respect to body-fixed coordinates, as proposed in Refs. [6] and [26].
where is the angular velocity of the chassis about the vertical and ω1 and ω2 denote the angular velocities of the wheels.
with . The numerical values for the tire parameters ai, bi, ci, μi, ki, di, and ei are shown in Table 1.
Parameter | Longitudinal i = X | Lateral i = Y |
---|---|---|
ai | 0.362 | −1.11 |
bi | 11.1 | 9.30 |
ci | 1.69 | 1.19 |
di | 12.4 | 6.46 |
ei | −10.8 | 4.20 |
ki | 1.09 | 1.08 |
μi | 1.20 | 0.961 |
Parameter | Longitudinal i = X | Lateral i = Y |
---|---|---|
ai | 0.362 | −1.11 |
bi | 11.1 | 9.30 |
ci | 1.69 | 1.19 |
di | 12.4 | 6.46 |
ei | −10.8 | 4.20 |
ki | 1.09 | 1.08 |
μi | 1.20 | 0.961 |
where and are the coordinates of the centerline in arc length description and primes denote derivatives with respect to s. Using, for example, a Newton scheme, s and r can be expressed by x and y from Eq. (46).
Here, , where w is the half road width and h is a small thickness parameter to activate the penalty function before the track constraint is violated. The penalty function is zero if and increases rapidly if . Note, that is quadratic in r if , whereas it increases linearly if . Hence, is a once continuously differentiable function which is necessary to avoid jumps in the adjoint equations.
Here, is the maximum driving torque and separates the domains of quadratic and linear increase.
where tf is the time to drive the vehicle through the road track.
4.2.2 Transformation to the s-Domain.
For the solution of the problem posed above, we use the adjoint method in s-domain as described in Sec. 3.2. For that purpose, the equations of motion are first transformed from time to s-domain, where s is the arc length along the road centerline which has been introduced in Sec. 4.2.1. Notice that the boundaries for s are fixed, since we know s = s0 for the initial state and s = sf for the terminal state when the finish line is crossed.
At this point, it is advantageous to replace the description of the vehicle position by (x, y) and use the curvilinear coordinates (s, r) instead. Since s is our independent variable now, we only need one state equation for r which is provided by Eq. (52b) after substituting again for and from the original state equations.
from which we may determine .
4.2.3 Numerical Results.
For the numerical computations, we consider a road centerline which is composed of straights and Clothoids.3 The vehicle starts at the centerline of the track with r = 0. Moreover, we prescribe the initial velocities with , VY = 0, , and at . The tire parameters and the vehicle parameters for the numerical computations are summarized in Tables 1 and 2.
Parameter | Abbr. | Value |
---|---|---|
Vehicle mass | m | 1000 kg |
Vehicle yaw inertia | J | |
Wheel inertia | Jw | |
Wheel radius | R | 0.3 m |
Front axle distance | a | 1.40 m |
Rear axle distance | b | 1.35 m |
Gravitational force | G | 9810 N |
Half road width | w | 2 m |
Thickness parameter | h | 0.3 |
Track penalty, weighting factor | α1 | 2 |
Track penalty, quadratic region | 1.7 | |
Track penalty, linear region | 3 | |
Control penalty, weighting factor | α2 | |
Control penalty, quadratic region | 1798 | |
Control penalty, linear region | 1900 |
Parameter | Abbr. | Value |
---|---|---|
Vehicle mass | m | 1000 kg |
Vehicle yaw inertia | J | |
Wheel inertia | Jw | |
Wheel radius | R | 0.3 m |
Front axle distance | a | 1.40 m |
Rear axle distance | b | 1.35 m |
Gravitational force | G | 9810 N |
Half road width | w | 2 m |
Thickness parameter | h | 0.3 |
Track penalty, weighting factor | α1 | 2 |
Track penalty, quadratic region | 1.7 | |
Track penalty, linear region | 3 | |
Control penalty, weighting factor | α2 | |
Control penalty, quadratic region | 1798 | |
Control penalty, linear region | 1900 |
In order to start the optimization procedure, we have to find a reasonable initial guess for the control of the vehicle. Therefore, we use a simple driver model in order to follow the road centerline and to apply suitable velocity profile. Various authors have already discussed the problem of path planning strategies [27,28], and the application of velocity profiles [27]. The goal of the first example is to identify the control of a vehicle driving through a single 90 deg curve. In Fig. 10, the convergence of the cost functional is shown. After approximately 988 quasi-Newton iterations, the total time can be reduced from to . The optimization was stopped due to a sufficient small first-order optimality measure. The initial and the optimized trajectories are depicted in Fig. 11.
The optimal control signals, i.e., the steering angle and the drive torque are shown in Fig. 12 and compared to the initial controls and . According to Pontryagin's minimum principle [29], see also Ref. [1], we expect the following cases for the optimal drive torque, which appears linear in the Hamiltonian:
where and . If the adjoint variable passes through zero, we identify a switching point of the control . However, particular attention shall be paid to finite time intervals in which the adjoint variable remains zero. Such intervals are called singular intervals. In that case, the necessary condition, which minimizes the Hamiltonian, provides no information about the optimal control. Here, the maximum transmitted force of the tire is the physical bound for the optimal control, which is observed by the method itself.
To circumvent the singular interval problem, usually the following traditional strategies are pursed: On the one hand, a singular control can be computed from differentiation of the switching condition and involving the state and costate equations, see Ref. [1]. On the other hand, as a numerical approach, a regularization term of the form, e.g., can be added to the cost functional. However, both strategies require either massive manipulations of the state and costate equations or an unphysical modification of the cost functional, whereas our method provides a natural approach to handle the singular case. With the presented approach, the control for a singular interval can be detected without any further modification of the optimization process. Figure 12 shows that the numerical result is in agreement with the theoretical principle.
Finally, we are looking for the time-optimal control of the single-track vehicle driving through a hairpin turn. The reduction of the cost functional over iterations is shown Fig. 13. After approximately 2254 quasi-Newton iterations, we stop the optimization procedure. Note that for industrial applications in most cases a much lower number of iterations is sufficient; however, to demonstrate Pontryagin's theory on singular intervals, a high accuracy is necessary. The total time can be reduced from to . In Fig. 14, we have once more established the result of the preprocessed driver model and the final solution of the time-optimal control problem. The control signals are shown in the first two diagrams in Fig. 15. In the third diagram, we encounter the theory of Pontryagin, too, where the control behavior of the driving torque prevails in a similar manner as discussed before.
5 Conclusion
Based on the adjoint gradient computation, we have proposed a new way to solve time-optimal control problems. The novelty of the presented approach lies in the adaption of the adjoint method to a cost functional with variable integration limits. With this method, an analytical formula for the gradient is obtained, for which only one additional system of adjoint equations has to be solved. For a numerical evaluation, the gradient formula can be discretized as fine as desired without increasing the computational effort. Further benefits of the adjoint approach are summarized as follows:
By using an efficient gradient-based approach, the solution of a two-point boundary value problem, which often requires the application of sophisticated homotopy strategies, can be avoided.
An improved control which reduces the cost functional is obtained after every successful control update.
The method can also be applied to more complex systems to find time-optimal controls. If standard simulation software, e.g., multibody simulation programs, is used, the gradient computation could be automatized if a module for the adjoint equations is implemented.
Even singular time intervals, where the optimal control cannot be determined from Pontryagin's minimum principle directly, can be identified without any further modification of the method.
Both, the hybrid formulation discussed in Sec. 3.1 and the complete formulation in s-domain introduced in Sec. 3.2 were successfully applied to two examples. The convergence of the orbital example was quite fast. However, the computation of the minimum time vehicle maneuver shows a rather slow convergence near the optimum, which is expectable for gradient-based methods, in particular, if the solution converges to a bang–bang control. Nevertheless, the method converges despite a poor initial guess for the control signals.
Footnotes
A Clothoid is a flat curve that is defined by the condition that the radius of curvature decreases inversely proportional to the arc length.
Acknowledgment
The financial support by the Austrian Federal Ministry for Digital and Economic Affairs and the National Foundation for Research, Technology and Development is gratefully acknowledged. Karin Nachbagauer acknowledges support from the Austrian Science Fund (FWF): T733-N30, and from the Technical University of Munich—Institute for Advanced Study.