derivatives of more matrix functions #764
Thanks for the issue. Indeed, Higham in Section 3 of https://eprints.maths.manchester.ac.uk/id/eprint/2754 lists as the recommended algorithm for the Fréchet derivative of an arbitrary matrix function either an approximate one (the complex-step method) or precisely the approach you give, which is elegant if not that efficient. The reference given is section 7.3 of https://doi.org/10.1017/S0962492910000036, where the only qualification on […]

In terms of derivatives of matrix functions, the most important one we are missing is the matrix logarithm algorithm of https://doi.org/10.1137/120885991, because with this and our […]

Here are some potential downsides to the block triangular approach:
> Higham (2010) section 7.3, in turn, cites Higham (2008) […]

> The Mathias reference is Roy Mathias (1996), which seems to be the earliest cited source for this result. Mathias presents it as an original result: […]
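For reference, the complex-step method mentioned above is easy to sketch: for real A, E and a function f with real Taylor coefficients, Lf(A, E) ≈ Im(f(A + ihE))/h, with no subtractive cancellation, so h can be tiny. A minimal NumPy/SciPy illustration (Python here purely for illustration; the package under discussion is Julia), checked against SciPy's specialized `expm_frechet`:

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# Complex-step Fréchet derivative of expm: Im(exp(A + i*h*E)) / h.
# h can be extremely small because there is no subtractive cancellation.
h = 1e-20
L_cs = np.imag(expm(A + 1j * h * E)) / h

# Reference value from SciPy's dedicated algorithm.
_, L_ref = expm_frechet(A, E)
print(np.allclose(L_cs, L_ref))
```

Note this only works when f's implementation never takes complex conjugates or absolute values of the perturbed entries, which holds for Padé-based `expm`.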
I notice that we currently only implement `matfun_frechet` (the Fréchet derivative of matrix functions) for `exp(A)`, in #331 by @sethaxen. However, I came across a beautifully simple result based on Magnus et al. (2021) (proposition 2) that looks like it should allow us to define `matfun_frechet` for (at least) any function with a Taylor series. In particular, it is easy to show by induction that the Fréchet derivative d(Aⁿ) = (Aⁿ)′[dA] of f(A) = Aⁿ is given by:

d(Aⁿ) = Σ_{k=0}^{n−1} Aᵏ (dA) Aⁿ⁻¹⁻ᵏ
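Assuming the product-rule expansion d(Aⁿ)[E] = Σ_{k=0}^{n−1} Aᵏ E Aⁿ⁻¹⁻ᵏ, here is a quick NumPy sanity check against a finite difference (illustrative Python only, not the Julia `matfun_frechet` API):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 5
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# Fréchet derivative of f(A) = A^p via the product-rule sum:
#   d(A^p)[E] = sum_{k=0}^{p-1} A^k E A^{p-1-k}
dAp = sum(np.linalg.matrix_power(A, k) @ E @ np.linalg.matrix_power(A, p - 1 - k)
          for k in range(p))

# Central finite difference of t -> (A + t*E)^p at t = 0.
h = 1e-6
fd = (np.linalg.matrix_power(A + h * E, p)
      - np.linalg.matrix_power(A - h * E, p)) / (2 * h)
print(np.allclose(dAp, fd, atol=1e-4))
```

By linearity, the same sum extends term by term to any f with a Taylor series.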
It then follows, for any analytic matrix function f(A) (i.e. with a Taylor series), that

f( [A E; 0 A] ) = [f(A) f′(A)[E]; 0 f(A)]

That is, you augment the matrix A with E and compute f on the augmented block-triangular matrix: the diagonal blocks are then f(A), and the upper-right block is the Fréchet derivative f′(A) applied to E.
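The block-triangular trick can be demonstrated directly for f = exp, comparing against SciPy's specialized `expm_frechet` (again Python/NumPy for illustration only):

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# Augmented block matrix [[A, E], [0, A]].
M = np.block([[A, E], [np.zeros((n, n)), A]])
FM = expm(M)

fA = FM[:n, :n]   # diagonal block: f(A)
LfE = FM[:n, n:]  # upper-right block: Fréchet derivative f'(A)[E]

# Reference values from SciPy's dedicated algorithm.
expA, L_ref = expm_frechet(A, E)
print(np.allclose(fA, expA), np.allclose(LfE, L_ref))
```

The same two-line recipe works for any matrix function available on the augmented matrix (e.g. `scipy.linalg.logm` or `sqrtm`), which is exactly why it is attractive as a generic fallback.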
It's not as efficient as an algorithm specialized for a particular function f(A); e.g. it's probably slower than our current exp implementation, since doubling the matrix size increases the cost of a dense O(n³) algorithm by roughly a factor of 8 (or less, if the algorithm can exploit the block-triangular structure). But it's better than nothing.