Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InterpolateBlock doesn't stash interpolators #1962

Open
ReubenHill opened this issue Jan 25, 2021 · 27 comments
Open

InterpolateBlock doesn't stash interpolators #1962

ReubenHill opened this issue Jan 25, 2021 · 27 comments

Comments

@ReubenHill
Copy link
Contributor

The adjoint InterpolateBlock creates a new interpolator every time it is evaluated, e.g. for calculating the hessian.

Extra interpolators should probably be stashed on the Interpolator as necessary - though this is perhaps not the only solution. I believe @nbouziani has worked on a similar fix elsewhere - comments welcome.

@nbouziani
Copy link
Contributor

Sorry for the delay! I have commented what I think is the same issue, cf. #1981.

@ReubenHill
Copy link
Contributor Author

ReubenHill commented Feb 23, 2021

(Closed the duplicate issue)

@nbouziani
Copy link
Contributor

I have not dealt with similar issues in #1674 since the interpolate is part of the external operator evaluation process (happens inside the external operator), that is there is no additional InterpolateBlock induced.

Just to be sure I understand correctly: You want to stash Interpolator(expr, space) right ? Is it not just when you interpolate via Interpolator(expr, space).interpolate(...) that there is a computational cost ? If yes, then I don't understand the problem with defining a new Interpolator every time something needs interpolating.

@ReubenHill
Copy link
Contributor Author

If expr and space are the same multiple times (as I think typically they will be when doing hessian evaluations as part of a firedrake_adjoint.minimize(Ĵ, method='Newton-CG')) you really ought to stash and reuse the Interpolator to save compile time costs

@nbouziani
Copy link
Contributor

Ok I see. If this is just for adjoint purposes, then the information about which interpolator has already been made could also be stashed directly to the InterpolateBlock class. Although I am not sure which option is the most judicious.

@dham
Copy link
Member

dham commented Feb 23, 2021

That doesn't quite work, because you get a new InterpolateBlock for every interpolate operation.

@nbouziani
Copy link
Contributor

nbouziani commented Feb 23, 2021

I meant having a kind of cache mechanism via shared attributes as it is done for forms with TSFCKernel.

@dham
Copy link
Member

dham commented Feb 23, 2021

Global caches for interpolation matrices are going to get very expensive very fast.

@nbouziani
Copy link
Contributor

Ok I see.

@wence-
Copy link
Contributor

wence- commented Feb 23, 2021

But kernel caching should be implemented for interpolation as we do for form assembly

@dham
Copy link
Member

dham commented Feb 23, 2021

We should do that. However we can also avoid reassembling the matrices associated with a single interpolator object.

@wence-
Copy link
Contributor

wence- commented Feb 23, 2021

If we tie the lifetime of these things to the lifetime of the object in the forward model then that should DTRT.

@wence-
Copy link
Contributor

wence- commented Feb 23, 2021

Actually, that doesn't work

@ReubenHill
Copy link
Contributor Author

Why doesn't it work?

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

Imagine you have the following setup:

def forward(mesh):
    # set up forward model
    interpolator = Interpolator(...)
    while ...:
        interpolator.interpolate()
    return something_that_is_not_the_interpolator

J = forward(mesh)

compute_gradient(J)

So the interpolator doesn't live past the end of the forward model run, but we need it in the adjoint if that's where we're stashing the other thing.

@ReubenHill
Copy link
Contributor Author

Is this an edge case where stashing on the Interpolator would at least help?

Another idea, how about a global cache but only for the Interpolators in the InterpolateBlock to avoid the expense of caching all interpolators all the time?

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

Is this an edge case where stashing on the Interpolator would at least help?

No? The interpolator is the thing we're trying to keep alive.

Another idea, how about a global cache but only for the Interpolators in the InterpolateBlock to avoid the expense of caching all interpolators all the time?

How do you know this has happened?

@ReubenHill
Copy link
Contributor Author

How do you know this has happened?

By doing the caching in the InterpolateBlock, then when a new InterpolateBlock is made, it looks in the cache of existing Interpolate objects for InterpolateBlocks to see if there's one it can reuse

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

How do those get collected?

@ReubenHill
Copy link
Contributor Author

Well I don't know the details how caching is done at the moment, but I'm imagining some runtime-only global dictionary that's filled via an annotation of an Interpolate

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

what I mean is, how do you decide to throw those interpolate objects away from the interpolateblock?

@ReubenHill
Copy link
Contributor Author

I don't think I understand your question? Do you mean getting rid of the cached objects? Presumably if they lived in a dictionary in some suitably global scope, they would only exist as long as a script took to run?

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

That sounds like too long. Each Interpolator, if it is reusable, carries a sparse matrix around with it.

@ReubenHill
Copy link
Contributor Author

Solution: Cache all pyop2 Interpolation kernels created in firedrake/interpolation.py based on a hash of expression and function space (@wence- please correct if wrong)

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

Yes, I think that is right. ufl.core.Expr needs to gain a .signature() method that is similar to that for ufl.Form (which is stable wrt coefficient numbering). Then we can do something similar to the TSFCKernelCache stuff.

@ReubenHill
Copy link
Contributor Author

Do function spaces have a suitable hash?

@wence-
Copy link
Contributor

wence- commented Feb 24, 2021

No, but their UFL element does (or should).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants