Major updates since the last semester, setting Julia engine, new refe…

…rences.
hurak · Feb 2, 2025 · 79776cd · 79776cd
1 parent 307c25b
commit 79776cd
Show file tree

Hide file tree

Showing 80 changed files with 5,656 additions and 633 deletions.
diff --git a/Manifest.toml b/Manifest.toml
diff --git a/Project.toml b/Project.toml
@@ -1,6 +1,20 @@
 [deps]
+BlockArrays = "8e7c35d0-a365-5155-bbbb-fb81a777f24e"
+COSMO = "1e616198-aa4e-51ec-90a2-23f7fbd31d8d"
+CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
 ControlSystems = "a6e380b2-a6ca-5380-bf3e-84a91bcd477e"
+Convex = "f65535da-76fb-5f13-bab9-19810c17039a"
 DifferentialEquations = "0c46a032-eb83-5123-abaf-570d42b7fbaa"
+Gurobi = "2e9cd046-0924-5485-92f1-d5272153d98b"
+HiGHS = "87dc4568-4c63-4d18-b0c0-bb2238e4078b"
+JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
+LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
+MosekTools = "1ec41992-ff65-5c91-ac43-2df89e9693a4"
 NLsolve = "2774e3e8-f4cf-5e23-947b-6d7e65073b56"
+OSQP = "ab2f91bb-94b4-55e3-9ba0-7f65df51de79"
 Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
+Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
+QDLDL = "bfc457fd-c171-5ab7-bd9e-d5dbfc242d63"
 Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665"
+SCS = "c946c3f1-0d1f-5ce8-9dea-7daa1f7e2d13"
+SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
diff --git a/_quarto.yml b/_quarto.yml
@@ -3,12 +3,18 @@ project:
 
 website:
   title: "B(E)3M35ORR – Optimal and Robust Control"
-  page-footer: "Copyright 2024, Zdeněk Hurák"
+  page-footer: "Copyright 2025, Zdeněk Hurák"
   page-navigation: true
   sidebar:
     style: "docked"
     search: true
     contents:
+      - section: "0. Introduction"
+        contents:
+          - intro.qmd
+          - intro_outline.qmd
+          - intro_software.qmd
+          - intro_references.qmd
       - section: "1. Optimization – theory"
         contents:
           - opt_theory_problems.qmd
@@ -43,6 +49,7 @@ website:
           - dynamic_programming.qmd
           - dynamic_programming_tabular.qmd
           - dynamic_programming_LQR.qmd
+          - dynamic_programming_DDP.qmd
           - dynamic_programming_references.qmd
       - section: "6. More on MPC"
         contents:

diff --git a/cont_dp_HJB.qmd b/cont_dp_HJB.qmd
@@ -9,7 +9,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 In the previous sections we investigated both direct and indirect approaches to the optimal control problem. Similarly as in the discrete-time case, complementing the two approaches is the dynamic programming. Indeed, the key Bellmans's idea, which we previously formulated in discrete time, can be extended to continuous time as well.
@@ -23,10 +23,15 @@ $$
 J(\bm x(t_\mathrm{i}), \bm u(\cdot), t_\mathrm{i}) = \phi(\bm x(t_\mathrm{f}),t_\mathrm{f}) + \int_{t_\mathrm{i}}^{t_\mathrm{f}}L(\bm x(t),\bm u(t),t)\, \mathrm d t.
 $$
 
-Optionally we can also consider constraints on the state at the final time (be it a particular value or some set of values)
+The final time can be fixed to a particular value $t_\mathrm{f}$, in which case the state at the final time $\bm x(t_\mathrm{f})$ is either free (unspecified but penalized through $\phi(\bm x(t_\mathrm{f}))$), or it is fixed (specified and not penalized, that is, $\bm x(t_\mathrm{f}) = \mathbf x^\mathrm{ref}$). 
+
+The final time can also be free (regarded as an optimization variable itself), in which case general constraints on the state at the final time can be expressed as
 $$
-\psi(\bm x(t_\mathrm{f}),t_\mathrm{f})=0.
+\psi(\bm x(t_\mathrm{f}),t_\mathrm{f})=0
 $$
+or possibly even using an inequality, which we will not consider here.
+
+The final time can also be considered infinity, that is, $t_\mathrm{f}=\infty$, but we will handle this situation later separately.
 
 ## Hamilton-Jacobi-Bellman (HJB) equation
 
@@ -61,20 +66,62 @@ $$
 
 This is obviously a partial differential equation (PDE) for the optimal cost function $J^\star(\bm x,t)$. 
 
-And since this is a differential equation, boundary value(s) must be specified to determine a unique solution. In particular, since the equation is first-order with respect to both time and state, specifying the value of the optimal cost function at the final state and the final time is enough. With the general final-state constraints we have introduced above, the boundary value condition reads 
+### Boundary conditions for the HJB equation
+
+Since the HJB equation is a differential equation, initial/boundary value(s) must be specified to determine a unique solution. In particular, since the equation is first-order with respect to both time and state, specifying the value of the optimal cost function at the final state and the final time is enough. 
+
+For a fixed-final-time, free-final-state, the optimal cost at the final time is 
+$$
+J^\star (\bm x(t_\mathrm{f}),t_\mathrm{f}) = \phi(\bm x(t_\mathrm{f}),t_\mathrm{f}).
+$$
+
+For a fixed-final-time, fixed-final-state, since the component of the cost function corresponding to the terminal state is zero, the optimal cost at the final time is zero as well
+$$
+J^\star (\bm x(t_\mathrm{f}),t_\mathrm{f}) = 0.
+$$
+
+With the general final-state constraints introduced above, the boundary value condition reads 
 $$
 J^\star (\bm x(t_\mathrm{f}),t_\mathrm{f}) = \phi(\bm x(t_\mathrm{f}),t_\mathrm{f}),\qquad \text{on the hypersurface } \psi(\bm x(t_\mathrm{f}),t_\mathrm{f}) = 0. 
 $$
 
-Note that this includes as special cases the fixed-final-state and free-final-state cases.
+### Optimal control using the optimal cost (-to-go) function
+
+Assume now that the solution $J^\star (\bm x(t),t)$ to the HJB equation is available. We can then find the optimal control by the minimization
+$$\boxed
+{\bm u^\star(t) = \arg\min_{\bm u(t)}\left[L(\bm x(t),\bm u(t),t)+(\nabla_{\bm x} J^\star (\bm x(t),t))^\top \bm f(\bm x(t),\bm u(t),t)\right].}
+$$
+
+For convenience, the minimized function is often labelled as
+$$
+Q(\bm x(t),\bm u(t),t) = L(\bm x(t),\bm u(t),t)+(\nabla_{\bm x} J^\star (\bm x(t),t))^\top \bm f(\bm x(t),\bm u(t),t)
+$$
+and called just *Q-function*. The optimal control is then
+$$
+\bm u^\star(t) = \arg\min_{\bm u(t)} Q(\bm x(t),\bm u(t),t).
+$$
 
-## HJB equation and Hamiltonian
+## HJB equation formulated using a Hamiltonian
 
 Recall the definition of Hamiltonian $H(\bm x,\bm u,\bm \lambda,t) = L(\bm x,\bm u,t) + \boldsymbol{\lambda}^\top \mathbf f(\bm x,\bm u,t)$. The HJB equation can also be written as
 $$\boxed
 {-\frac{\partial J^\star (\bm x(t),t)}{\partial t} = \min_{\bm u(t)}H(\bm x(t),\bm u(t),\nabla_{\bm x} J^\star (\bm x(t),t),t).}
 $$
 
-What we have just derived is one of the most profound results in optimal control – Hamiltonian must be minimized by the optimal control. We will exploit it next for some derivations. 
+What we have just derived is one of the most profound results in optimal control – Hamiltonian must be minimized by the optimal control. We will exploit it next as a tool for deriving some theoretical results. 
+
+## HJB equation vs Pontryagin's principle of maximum (minimum)
+
+Recall also that we have already encountered a similar results that made statements about the necessary maximization (or minimization) of the Hamiltonian with respect to the control – the celebrated Pontryagin's principle of maximum (or minimum). Are these two related? Equivalent?
 
-Recall also that we have already encountered a similar results that made statements about the necessary maximization (or minimization) of the Hamiltonian with respect to the control – the celebrated Pontryagin's principle of maximum (or minimum).
+## HJB equation for an infinite time horizon  
+
+When both the system and the cost function are time-invariant, and the final time is infinite, that is, $t_\mathrm{f}=\infty$, the optimal cost function $J^\star()$ must necessarily be independent of time, that is, it's partial derivative with respect to time is zero, that is, $\frac{\partial J^\star (\bm x(t),t)}{\partial t} = 0$. The HJB equation then simplifies to 
+
+$$\boxed{
+0 = \min_{\bm u(t)}\left[L(\bm x(t),\bm u(t))+(\nabla_{\bm x} {J^\star (\bm x(t),t)})^\top \bm f(\bm x(t),\bm u(t))\right],}
+$$
+or, using a Hamiltonian
+$$\boxed
+{0 = \min_{\bm u(t)}H(\bm x(t),\bm u(t),\nabla_{\bm x} J^\star (\bm x(t))).}
+$$
diff --git a/cont_dp_LQR.qmd b/cont_dp_LQR.qmd
@@ -9,7 +9,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 As we have already discussed a couple of times, in the LQR problem we consider a linear time invariant (LTI) system modelled by
@@ -61,7 +61,7 @@ J^\star(\bm x(t),t) = \frac{1}{2}\bm x^\top (t)\mathbf S(t)\bm x(t).
 $$
 
 :::{.callout-note}
-Recall that we did something similar when making a *sweep* assumption to derive a Riccati equation following the indirect approach – we just make an inspired guess and see if it works. Here the inspiration comes from the observation made elsewhere, that the optimal cost function in the LQR problem is quadratic in $\bm x$.
+Recall that we did something similar when making a *sweep* assumption to derive a Riccati equation following the indirect approach – we just make an inspired guess and see if it solves the equation. Here the inspiration comes from the observation made elsewhere, that the optimal cost function in the LQR problem is quadratic in $\bm x$.
 :::
 
 We now aim at substituting this into the HJB equation. Observe that $\frac{\partial J^\star}{\partial t}=\bm x^\top(t) \dot{\mathbf{S}}(t) \bm x(t)$ and $\nabla_{\bm x} J^\star = \mathbf S \bm x$. Upon substitution to the HJB equation, we get

diff --git a/cont_indir_CARE.qmd b/cont_indir_CARE.qmd
@@ -9,5 +9,5 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
diff --git a/cont_indir_LQR_fin_horizon.qmd b/cont_indir_LQR_fin_horizon.qmd
@@ -9,5 +9,5 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
diff --git a/cont_indir_LQR_inf_horizon.qmd b/cont_indir_LQR_inf_horizon.qmd
@@ -9,5 +9,5 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
diff --git a/cont_indir_Pontryagin.qmd b/cont_indir_Pontryagin.qmd
@@ -9,5 +9,5 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
diff --git a/cont_indir_calculus_of_variations.qmd b/cont_indir_calculus_of_variations.qmd
@@ -11,7 +11,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 
diff --git a/cont_indir_constrained.qmd b/cont_indir_constrained.qmd
@@ -9,5 +9,5 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
diff --git a/cont_indir_overview.qmd b/cont_indir_overview.qmd
@@ -11,7 +11,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 Through this chapter we are entering into the realm of continuous-time optimal control – we are going to consider dynamical systems that evolve in continuous time, and we are going to search for control that also evolves in continuous time.

diff --git a/cont_indir_references.qmd b/cont_indir_references.qmd
@@ -11,7 +11,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 Indirect approach to optimal control is based on *calculus of variations* (and its later extension in the form of *Pontryagin's principle of maximum*). Calculus of variations is an advanced mathematical discipline that requires non-trivial foundations and effort to master. In our course, however, we take the liberty of aiming for intuitive understanding rather than mathematical rigor. At roughly the same level, the calculus of variations is introduced in books on optimal control, such as the classic and affordable [@kirkOptimalControlTheory2004], the popular and online available [@lewisOptimalControl2012], or very accessible and also freely available online [@liberzonCalculusVariationsOptimal2011]. 

diff --git a/cont_indir_time_optimal.qmd b/cont_indir_time_optimal.qmd
@@ -9,5 +9,5 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
diff --git a/cont_indir_trajectory_stabilization.qmd b/cont_indir_trajectory_stabilization.qmd
@@ -11,7 +11,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 Indirect methods for optimal control reformulate the optimal control problem into a set of equtions – boundary value problems with differential and algebraic equations in the case of continuous-time systems – and by solving these (typically numerically) we obtain the optimal state and control trajectories. Practical usefullness of these is rather limited as such optimal control trajectory constitutes an open-loop control – there is certainly no need to advocate the importance of feedback in this advanced control course. 

diff --git a/cont_indir_via_calculus_of_variations.qmd b/cont_indir_via_calculus_of_variations.qmd
@@ -10,6 +10,6 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
diff --git a/cont_numerical_direct.qmd b/cont_numerical_direct.qmd
@@ -9,7 +9,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 We have already seen that direct methods for discrete-time optimal control problems essentially just regroup and reshape the problem data so that a nonlinear programming solver can accept them. In order to follow the same approach with continous-time optimal control problems, some kind of discretization is inevitable in order to formulate a nonlinear program. Depending of what is discretized and how, several groups of methods can be identified

diff --git a/cont_numerical_indirect.qmd b/cont_numerical_indirect.qmd
@@ -13,8 +13,7 @@ format:
 execute:
     enabled: true
     warning: false
-jupyter: julia-1.10
-#engine: julia
+engine: julia
 ---
 
 The indirect approach to optimal control reformulates the optimal control problem as a system of differential and algebraic equations (DAE) with the values of some variables specified at both ends of the time interval – the two-point boundary value problem (TP–BVP). It is only in special cases that we are able to reformulate the TP–BVP as an initial value problem (IVP), the prominent example of which is the LQR problem and the associate differential Riccati equation solved backwards in time. However, generally we need to solve the TP–BVP DAE and the only way to do it is by numerical methods. Here we give some.

diff --git a/cont_numerical_references.qmd b/cont_numerical_references.qmd
@@ -11,7 +11,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 The indirect approach to the continuous-time optimal control problem (OCP) formulates the necessary conditions of optimality as a two-point boundary value problem (TP-BVP), which generally requires *numerical methods*. The direct approach to the continuous-time OCP relies heavily on numerical methods too, namely the methods for solving nonlinear programs (NLP) and methods for solving ordinary differential equations (ODE). Numerical methods for both approaches share a lot of common principles and tools, and these are collectively presented in the literature as called *numerical optimal control*. A recommendable (and freely online available) introduction to these methods is [@grosNumericalOptimalControl2022]. Shorter version of this is in chapter 8 of [@rawlingsModelPredictiveControl2017], which is also available online. A more comprehensive treatment is in [@bettsPracticalMethodsOptimal2020]. 

diff --git a/cont_numerical_software.qmd b/cont_numerical_software.qmd
@@ -11,7 +11,7 @@ format:
 execute:
     enabled: false
     warning: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 The methods studied in this chapter are already quite mature and well described. Software implementations exist. Here we enumerate some of them. 

diff --git a/discr_dir_LQR.qmd b/discr_dir_LQR.qmd
@@ -6,7 +6,7 @@ format:
         code-fold: true
 execute:
     enabled: false
-jupyter: julia-1.10
+engine: julia
 ---
 
 Here we specialize the general procedure from the previous section to the case of a linear system and a quadratic cost. we start by considering a simple problem of *regulation*, wherein the goal is to bring the system either exactly or approximately to zero final state, that is, $\mathbf x^\text{ref}=\mathbf 0$ and we want $\bm x_N=\mathbf x^\text{ref}$ or $\bm x_N\approx\mathbf x^\text{ref}$, respectively. 
@@ -75,36 +75,44 @@ $$
 \end{aligned}}
 $$
 
-```{julia}
+<!-- ```{julia}
+using BlockArrays
+using LinearAlgebra
+using LinearSolve
+using QDLDL
+using SparseArrays
+
 function direct_dlqr_simultaneous(A,B,x₀,Q,R,S,N)
-    Qbar = BlockArray(spzeros(N*n,N*n),repeat([n],N),repeat([n],N))
+    n = 2 #size(A)[1]
+    m = 1 #size(B)[2]
+    Q̄ = BlockArray(spzeros(N*n,N*n),repeat([n],N),repeat([n],N))
     for i=1:(N-1)
-        Qbar[Block(i,i)] = Q
+        Q̄[Block(i,i)] = Q
     end
-    Qbar[Block(N,N)] = S
-    Rbar = BlockArray(spzeros(N*m,N*m),repeat([m],N),repeat([m],N))
+    Q̄[Block(N,N)] = S
+    R̄ = BlockArray(spzeros(N*m,N*m),repeat([m],N),repeat([m],N))
     for i=1:N
-        Rbar[Block(i,i)] = R
+        R̄[Block(i,i)] = R
     end
-    Qtilde = blockdiag(sparse(Qbar),sparse(Rbar))                               # The matrix defining the quadratic cost.
-    Bbar = BlockArray(spzeros(N*n,N*m),repeat([n],N),repeat([m],N))
+    Q̃ = blockdiag(sparse(Q̄),sparse(R̄))                                  # The matrix defining the quadratic cost.
+    B̄ = BlockArray(spzeros(N*n,N*m),repeat([n],N),repeat([m],N))
     for i=1:N
-        Bbar[Block(i,i)] = B
+        B̄[Block(i,i)] = B
     end
-    Abar = BlockArray(sparse(-1.0*I,n*N,n*N),repeat([n],N),repeat([n],N))
+    Ā = BlockArray(sparse(-1.0*I,n*N,n*N),repeat([n],N),repeat([n],N))
     for i=2:N
-        Abar[Block(i,(i-1))] = A
+        Ā[Block(i,(i-1))] = A
     end
-    Atilde = sparse([Abar Bbar])                                                # The matrix defining the linear (affine) equation.
-    A0bar = spzeros(n*N,n)
-    A0bar[1:n,1:n] = A
-    btilde = A0bar*sparse(x₀)                                                   # The constant offset for the linear (affine) equation.
-    K = [Qtilde Atilde'; Atilde spzeros(size(Atilde,1),size(Atilde,1))]         # Sparse KKT matrix.
-    F = qdldl(K)                                                                # KKT matrix LDL factorization.
-    k = [spzeros(size(Atilde,1)); -btilde]                                      # Right hand side of the KKT system
-    xtildeλ = solve(F,k)                                                        # Solving the KKT system using the factorization.
-    xopt = reshape(xtildeλ[1:(n*N)],(n,:))
-    uopt = reshape(xtildeλ[(n*N+1):(n+m)*N+1],(m,:))
+    Ã = sparse([Ā B̄])                               # The matrix defining the linear (affine) equation.
+    Ā₀ = spzeros(n*N,n)
+    Ā₀[1:n,1:n] = A
+    b̃ = Ā₀*sparse(x₀)                               # The constant offset for the linear (affine) equation.
+    K = [Q̃ Ã'; Ã spzeros(size(Ã,1),size(Ã,1))]      # Sparse KKT matrix.
+    k = [spzeros(size(Q̃,1)); -b̃]                    # Right hand side of the KKT system
+    prob = LinearProblem(K,k)                       # The KKT system as a linear problem.
+    z̃λ = LinearSolve.solve(prob)                    # Solving the KKT system. Ready for trying various solvers.
+    xopt = reshape(z̃λ[1:(n*N)],(n,:))
+    uopt = reshape(z̃λ[(n*N+1):(n+m)*N],(m,:))
     return xopt,uopt
 end
 
@@ -136,7 +144,7 @@ xlabel!("k")
 ylabel!("x")
 
 plot(p1,p2,layout=(2,1))
-```
+``` -->
 
 This constrained optimization problem can still be solved without invoking a numerical solver for solving quadratic programs (QP). We do it by introducing a vector $\boldsymbol\lambda$ of Lagrange multipliers to form the Lagrangian function
 $$