Skip to content

serialization: apply new scheme for package (breaking change) #457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 5, 2025

Conversation

vpratz
Copy link
Collaborator

@vpratz vpratz commented May 2, 2025

Closes #451.

  • introduces new policy for consistent naming for serilization (see [BUG] serializable decorator produces wrong package for serialization #451
    for a discussion): standard is the path of a module a class resides
    in, truncated at depth two. So for a class in bayesflow.networks, we
    set package="bayesflow.networks", even if they live in the
    bayesflow.networks.mlp submodule.
  • The serializable decorator checks this and errors if this is not
    followed. The check can be disabled for certain cases (e.g., classes
    in the experimental module, that might eventually live somewhere
    else).
  • After this commit, previously saved models will not be loadable. As we
    introduced a bug regarding this anyway ([BUG] serializable decorator produces wrong package for serialization #451), we will accept this and
    should inform users about it.
  • usage of direct calls to keras.saving.register_keras_serializable
    were replaced with our custom decorator.
  • update docs and add a disclaimer to README regarding potential breaking changes until 2.1 release (feel free to adapt)

Todo for next patch release

  • as this breaks loading models from the previous patch versions, make a brief announcement in our community channels, so that people are aware and can pin their installation, if necessary
  • include changes from Add projectors to DeepSet #453 (if we want to include them), which are breaking as well

vpratz added 3 commits May 2, 2025 17:27
- introduces new policy for consistent naming for serilization (see bayesflow-org#451
  for a discussion): standard is the path of a module a class resides
  in, trucated at depth to. So for all class in bayesflow.networks, we
  set package="bayesflow.networks", even if the live in the
  bayesflow.networks.mlp submodule.
- The `serializable` decorator checks this and errors if this is not
  followed. The check can be disabled for certain cases (e.g., classes
  in the experimental module, that might eventually live somewhere
  else).
- After this commit, previously saved models will not be loadable. As we
  introduced a bug regarding this anyway (bayesflow-org#451), we will accept this and
  should inform users about it.
- usage of direct calls to `keras.saving.register_keras_serializable`
  were replaced with our custom decorator.
@vpratz vpratz requested review from stefanradev93 and LarsKue May 2, 2025 18:04
Copy link
Contributor

@LarsKue LarsKue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should think about whether we want the current module check to be a flag in the serializable decorator. If users are supposed to use our pipeline rather than the one from keras, the check must be off by default.

I think one of the following options is better:

  1. We add a check in any test_serialize_deserialize for the module structure, but remove the flag from the decorator altogether.
  2. We leave the flag in the current serializable decorator, but turn it off by default. Internal modules can use something like from bayesflow.internal import serializable which is a partial(bayesflow.utils.serializable, disable_module_check=False)

I am in favor of 1, since it avoids the issue of 2 that new contributors may accidentally use the "wrong" kind of decorator.

@vpratz
Copy link
Collaborator Author

vpratz commented May 4, 2025

Thanks for your comments. I like the idea of moving the test to the test suite, but as far as I can tell (correct me if I'm wrong) the test_serialize_deserialize test only explicitly covers a subset of classes. Many others are implicitly part of the test, but I do not know if we can access those for the test we have in mind.
I agree with your concern regarding 2.
A third option would be to limit the check to the case where the module name indicates a class is part of BayesFlow. I will add a commit that implements this, but we can still decide to go with one of the other options.

Would you say that we actually want users to to use our decorator instead of the Keras-provided one? Currently the only thing it does is to supply this check, which is only relevant internally, so we could also think about making it private. Or do we expect to encounter future scenarios where we want to have additional functionality in this decorator? As we cannot rule out that users use the Keras decorator instead, I think this would not be so desirable anyways.

What do you think?

This should ensure that users that try to use our decorator with
external classes do not encounter the error. Possible edge case: they
also name their module "bayesflow".
@stefanradev93 stefanradev93 merged commit e8d2f2c into bayesflow-org:dev May 5, 2025
9 checks passed
@LarsKue
Copy link
Contributor

LarsKue commented May 5, 2025

Thanks for the update. I think users should use our decorator, as this gives us the most flexibility in terms of maneuvering around keras.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants