Integrate previous version of asdfghjkl directly into Laplace #238

Draft · aleximmer wants to merge 7 commits into main
Conversation

aleximmer
Owner

First version integrating the old asdfghjkl into Laplace directly. This allows us to keep it alongside the latest asdl and to further integrate existing extensions, such as end-to-end differentiability and support for other loss functions. This version does not change any behavior and integrates only the functions of asdfghjkl that are actually used. The only exception is kernel.py, which is not yet used but will be worth integrating.

Points to discuss:

  • What about documentation for asdfghjkl_src?
  • Should we first make sure asdfghjkl can be the default by enabling regression?
  • If asdfghjkl becomes the default, what is a sensible way to integrate it? I think we could have curvature/asd for the core utilities and merge the interfaces/default backends in curvature.py with asdfghjkl.py, deprecate the AsdfghjklXYZ classes, and just go with GGN, EF, and Hessian for them (see the sketch after this list).
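
A rough sketch of what that layout and deprecation path could look like; all names here are placeholders for discussion, not the final API:

```python
# Hypothetical layout (placeholder names, not the actual laplace API):
#   laplace/curvature/asd/          core utilities forked from asdfghjkl
#   laplace/curvature/curvature.py  CurvatureInterface + merged defaults
import warnings


class GGN:
    """Stand-in for a merged default GGN backend."""

    def __init__(self, model, likelihood):
        self.model, self.likelihood = model, likelihood


class AsdfghjklGGN(GGN):
    """Deprecated alias kept while users migrate to the merged GGN."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "AsdfghjklGGN is deprecated; use GGN instead.", DeprecationWarning
        )
        super().__init__(*args, **kwargs)
```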

@wiseodd
Collaborator

wiseodd commented Sep 2, 2024

Thanks for the progress!

  • Documentation: Since we are going to integrate it into laplace's core, we should follow laplace's code standards, in particular tests, type hinting, and docstrings. I think the last two are enough for documentation.
  • Minimum required features: I vote to make it the default, since all the current backends have one limitation or another (see Efficient, universal, standalone Jacobian backend #203). To do so, I believe these are the requirements:
    • Support for regression
    • Support for backprop through the Jacobians & the GLM predictive (for continuous Bayesian optimization; a sketch follows this list).
    • Support for LoRA + Huggingface models + reward modeling. I think these are automatically supported, at least that is the case for the new ASDL; in any case, we need to provide tests for them (a test sketch follows this list). The keys here are:
      • Support for arbitrary inputs (x: Union[Tensor, MutableMapping[str, Any]]).
      • Support for subset-of-weights (the Hessian & Jacobians are computed only for parameters with requires_grad == True).
  • Integration: I agree with you. We can put the forked asdfghjkl code in something like laplace.curvature.default_backend and use it directly in laplace.curvature.CurvatureInterface to provide default implementations (see the last sketch below). So, if another backend doesn't implement some functionality, it is covered by this default backend. Note that this default functionality is currently provided by torch.func, which is not efficient.
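
For the Jacobian-backprop point, a minimal sketch of what the GLM predictive needs from the backend; `backend.jacobians` and its return convention are assumptions for illustration, not the actual laplace API:

```python
import torch


def glm_predictive(x: torch.Tensor, backend, posterior_cov: torch.Tensor):
    # Assumption: backend.jacobians(x) -> (Js, f_map) with Js of shape
    # (batch, n_outputs, n_params), and the call itself supports backprop.
    Js, f_map = backend.jacobians(x)
    # Functional variance of the linearized model: J @ Sigma @ J^T.
    f_var = torch.einsum("ncp,pq,ndq->ncd", Js, posterior_cov, Js)
    return f_map, f_var
```

For continuous Bayesian optimization, one would then differentiate an acquisition value through (f_map, f_var) back to x, which only works if the Jacobian computation itself is differentiable.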
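And a sketch of the two LoRA/Huggingface-related tests; `compute_jacobians` is a hypothetical stand-in for whichever backend function ends up under test:

```python
import torch
from torch import nn


def test_subset_of_weights(model: nn.Module, x: torch.Tensor, compute_jacobians):
    # Freeze everything except the last child module, LoRA-style
    # (assumes the model's head is its last child).
    for p in model.parameters():
        p.requires_grad = False
    for p in list(model.children())[-1].parameters():
        p.requires_grad = True
    n_active = sum(p.numel() for p in model.parameters() if p.requires_grad)
    Js, f = compute_jacobians(model, x)
    # Jacobians must cover only the trainable parameters.
    assert Js.shape[-1] == n_active


def test_arbitrary_inputs(model: nn.Module, batch: dict, compute_jacobians):
    # Huggingface-style input: a MutableMapping such as
    # {"input_ids": ..., "attention_mask": ...}, forwarded unchanged.
    Js, f = compute_jacobians(model, batch)
    assert f.ndim == 2  # (batch, n_outputs)
```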
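Finally, the fallback pattern itself; `DefaultBackend` stands in for the forked asdfghjkl code, and all names are illustrative:

```python
class DefaultBackend:
    """Placeholder for laplace.curvature.default_backend (forked asdfghjkl)."""

    def __init__(self, model, likelihood):
        self.model, self.likelihood = model, likelihood

    def jacobians(self, x):
        raise NotImplementedError  # universal implementation would live here


class CurvatureInterface:
    def __init__(self, model, likelihood):
        self.model, self.likelihood = model, likelihood
        self._default = DefaultBackend(model, likelihood)

    def jacobians(self, x):
        # Subclasses override this when they have a faster implementation;
        # anything they don't implement falls back to the default backend.
        return self._default.jacobians(x)
```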

@aleximmer
Owner Author

Thanks, sounds good to me. I think asdfghjkl is well suited to be the universal backend we need, and it is much faster than torch.func for Jacobians, etc. I will take care of the points you mentioned.
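
For reference, a minimal example of the torch.func per-sample Jacobian recipe mentioned above; the model and shapes are illustrative:

```python
import torch
from torch import nn
from torch.func import functional_call, jacrev, vmap

model = nn.Linear(3, 2)
params = dict(model.named_parameters())


def f(p, xi):
    # Stateless call on a single example; add and drop a batch dim of one.
    return functional_call(model, p, (xi.unsqueeze(0),)).squeeze(0)


x = torch.randn(8, 3)
# Per-sample Jacobians w.r.t. every parameter: a dict with one entry per
# parameter, each Jacobian materialized in full -- one reason this route
# gets expensive for larger models.
jacs = vmap(jacrev(f), in_dims=(None, 0))(params, x)
```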
