Skip to content

Conversation

jrmaddison
Copy link
Contributor

@jrmaddison jrmaddison commented Sep 19, 2025

Description

Trick for mesh-independent optimization with (at least some) optimizers which lack Riesz map support.

Requires dolfin-adjoint/pyadjoint#222

Problem

A source of mesh dependence in some Firedrake+pyadjoint optimizers is the use of an $l_2$ Riesz map. e.g. in a standard gradient-descent step optimizing over some FEM space $U$ we use

$$\tilde{u}_{n + 1} = \tilde{u}_n - \alpha_n \tilde{g}_n,$$

where $\tilde{u}_n$ are dofs for a function $u_n \in U$ and $\tilde{g}_n$ are dofs for a derivative $g_n \in U^*$. The linear combination on the RHS mixes primal and dual space dofs.

Instead we should use a Riesz map

$$\tilde{u}_{n + 1} = \tilde{u}_n - \alpha_n M^{-1} \tilde{g}_n,$$

where e.g. for the $L^2$ Riesz map $M$ is the mass matrix.

If we try to fix this by applying a Riesz map to the derivative then we instead introduce errors when evaluating directional derivatives (dolfin-adjoint/pyadjoint#153),

$$g_n(u) = \tilde{g}_n^* \tilde{u} \left[ \ne \tilde{g}_n^* M^{-1} \tilde{u} ~~ \text{in general} \right].$$

The trick

This PR uses a basis where the mass matrix is an identity.

With

  • $U$: Finite element control space: we want an optimal solution in $U$.
  • $V$: A DG space containing $U$ as a subspace.
  • $J : U \rightarrow \mathbb{R}$: Cost functional to minimize.
  • $\Pi: V \rightarrow U$: $L^2$ projection from the DG space onto the control space.

instead of minimizing $J$, we minimize $J \circ \Pi$, and then choose an $L^2$ orthonormal basis for the DG space $V$. Since $V$ is DG we can find an $L^2$ orthonormal basis with element-wise support (see reference below).

Once we have a solution $m_*$ in $V$ to the larger problem, we map to obtain a solution in $U$, $\Pi ( m_* )$.

Non-uniqueness

In the case where $V$ is larger than $U$ we are optimizing over a larger space. Derivatives in directions $L^2$ orthogonal to $U$ are zero so e.g. for a gradient-based optimizer we might not 'wander out' of $U$. However we can optionally add an extra penalty term,

$$\frac{1}{2} \alpha \left\| m - \Pi ( m ) \right\|_{L^2}^2.$$

What this looks like in code

The L2TransformedFunctional is an AbstractReducedFunctional which wraps a ReducedFunctional, performing the basis transformation and $L^2$ projection, and applying the chain rule (manually) for derivatives.

# J: AdjFloat defining the minimization problem
# c: Control

J_hat = L2TransformedFunctional(J, c, alpha=1e-5)

problem = MinimizationProblem(J_hat)
solver = TAOSolver(problem, parameters)
m_opt = solver.solve()

# m_opt contains the result using the L^2 orthonormal basis in the DG space.
# We map to the FEM basis in the control space.
m_opt = J_hat.map_result(m_opt)

Limitations

  • Dof-based constraints would apply to the DG control using the $L^2$ orthonormal basis, which is probably not useful.
  • A RieszMap is used to define $\Pi$, e.g. to supply solver parameters and for caching (specifically using a L2RieszMap subclass). A Projector might be better, but adjoint projection would need to be added (simple but would need to decide on an API).
  • The basis transformation is performed using left and right applications of a PETSc Cholesky PC. This uses undocumented PETSc behaviour, and there might be more efficient ways to do this.
  • Dofs in the $L^2$ orthonormal basis are stored in Functions / Cofunctions, which is a bit misleading since the FEM basis is not used.

Reference

The transformation is related to the factorization of a mass matrix in section 4.1 of https://doi.org/10.1137/18M1175239. Their matrix $H$ is the composition of

  1. a change of basis, defined by the lower Cholesky factor $C$ of the mass matrix $M_D$ for a DG space, followed by
  2. adjoint $L^2$ projection, defined by their matrix $L^T$ (equivalent to adjoint interpolation, where defined).

Precisely: the transformation used here is obtained by applying the $L^2$ Riesz map to the result, with a simplification $M_D^{-1} C = C^{-T}$ so that only Cholesky factor inverse actions are needed.

@jrmaddison jrmaddison force-pushed the jrmaddison/transformed_functional branch from 9eabd47 to 1d9a838 Compare September 22, 2025 08:55
@jrmaddison jrmaddison force-pushed the jrmaddison/transformed_functional branch from c3c5a86 to 90903a7 Compare September 23, 2025 18:45
@jrmaddison
Copy link
Contributor Author

Example using SciPy+L-BFGS-B: preconditioned_optimization.py
pc

@jrmaddison jrmaddison marked this pull request as ready for review September 24, 2025 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant