Conversation

@vpratz (Collaborator) commented Aug 4, 2025

For the optimizer to be used, the `approximator.compile` function has to be called. This was not the case for repeated calls to `fit_...`, even with `keep_optimizer` set to `False`. I adapted the `setup_optimizer` function to match the description in its docstring and made the compilation conditional on its output, which now indicates whether a new optimizer was configured.
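
Roughly, the new control flow looks like this (an illustrative sketch with assumed names and defaults, not the exact BayesFlow implementation):

import keras


def setup_optimizer(workflow, epochs, num_batches, keep_optimizer=False):
    """Configure a default optimizer if needed; return True if a new one was created."""
    if workflow.optimizer is not None and keep_optimizer:
        return False
    schedule = keras.optimizers.schedules.CosineDecay(
        initial_learning_rate=5e-4, decay_steps=epochs * num_batches
    )
    workflow.optimizer = keras.optimizers.Adam(schedule)
    return True


def fit_with_optional_compile(workflow, epochs, num_batches, **fit_kwargs):
    # compile only when a new optimizer was configured
    if setup_optimizer(workflow, epochs, num_batches):
        workflow.approximator.compile(optimizer=workflow.optimizer)
    return workflow.approximator.fit(epochs=epochs, **fit_kwargs)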

To reproduce, you can run the following code. Without the fix, the first and second prints show the same value (the second fit does not update the weights), and only the third differs. With the fix, all three prints show different values, as expected.

import bayesflow as bf
import keras

epochs = 2
num_batches_per_epoch = 10
batch_size = 4

simulator = bf.simulators.TwoMoons()
validation_data = simulator.sample(100)

workflow = bf.BasicWorkflow(
    simulator=simulator,
    standardize="all",
    inference_conditions="observables",
    inference_variables="parameters",
)

# first fit: the approximator is compiled and trains as expected
workflow.fit_online(
    num_batches_per_epoch=num_batches_per_epoch, epochs=epochs, validation_data=validation_data, batch_size=batch_size
)
print("first", workflow.approximator.trainable_weights[0][0])

# second fit: without the fix, the weights do not change ("second" equals "first")
workflow.fit_online(
    num_batches_per_epoch=num_batches_per_epoch, epochs=epochs, validation_data=validation_data, batch_size=batch_size
)
print("second", workflow.approximator.trainable_weights[0][0])

# after an explicit compile, the third fit trains again and the value changes
workflow.approximator.compile(optimizer=keras.optimizers.Adam())
workflow.fit_online(
    num_batches_per_epoch=num_batches_per_epoch, epochs=epochs, validation_data=validation_data, batch_size=batch_size
)
print("third", workflow.approximator.trainable_weights[0][0])

For the optimizer to be used, the approximator.compile function has to
be called. This was not the case. I adapted the `setup_optimizer`
function to match the description in its docstring, and made the
compilation conditional on its output. The output indicates if a new
optimizer was configured.
@vpratz requested review from LarsKue and stefanradev93, August 4, 2025 11:53
@vpratz added the "fix" label (pull request that fixes a bug), Aug 4, 2025
@stefanradev93 (Contributor) commented Aug 4, 2025

The serialization tests seem to be suddenly failing for the multimodal network? Apart from that, the fix looks good to me.

@vpratz (Collaborator, Author) commented Aug 4, 2025

Thanks for taking a look! I will check if I can reproduce the issue locally with updated dependencies and try to fix the failure.

@vpratz (Collaborator, Author) commented Aug 4, 2025

The error concerns all summary networks and all three backends. I could only reproduce it after updating my dependencies; my suspicion is that the handling of DType in Keras might have changed, but I have not verified this yet. I will take a look...

The FusionNetwork just happens to be the first network and JAX the fastest backend in the tests...

@vpratz (Collaborator, Author) commented Aug 4, 2025

The issue does not exist with Keras 3.10 and appears when upgrading to Keras 3.11.

@vpratz (Collaborator, Author) commented Aug 4, 2025

This commit introduces the regression: keras-team/keras@24f104e; more specifically, the code that was removed from this block and not moved to the Layer class.

@vpratz (Collaborator, Author) commented Aug 4, 2025

Reintroducing the removed code into Keras fixes the issue. How do we want to proceed? Do we want to try to get a fix included in Keras? With the changes in #500 we should also be able to work around the issue... We might also want to pin keras <= 3.10 until the issue is resolved.

diff --git a/keras/src/ops/operation.py b/keras/src/ops/operation.py
index edda5a01d..355492a83 100644
--- a/keras/src/ops/operation.py
+++ b/keras/src/ops/operation.py
@@ -128,6 +128,17 @@ class Operation(KerasSaveable):
         arg_names = inspect.getfullargspec(cls.__init__).args
         kwargs.update(dict(zip(arg_names[1 : len(args) + 1], args)))
 
+        # Explicitly serialize `dtype` to support auto_config
+        dtype = kwargs.get("dtype", None)
+        if dtype is not None and isinstance(dtype, dtype_policies.DTypePolicy):
+            # For backward compatibility, we use a str (`name`) for
+            # `DTypePolicy`
+            if dtype.quantization_mode is None:
+                kwargs["dtype"] = dtype.name
+            # Otherwise, use `dtype_policies.serialize`
+            else:
+                kwargs["dtype"] = dtype_policies.serialize(dtype)
+
         # For safety, we only rely on auto-configs for a small set of
         # serializable types.
         supported_types = (str, int, float, bool, type(None))

@stefanradev93 (Contributor) commented
Wait, does this break Keras serialization in general, or is it an artifact of using our custom monkey patch? As far as I can see, the commit simply removes the dtype from the constructor and simplifies the serialization of dtypes.

The extra call causes the DTypePolicy to be deserialized. It is then
passed as an object and cannot be handled by auto-config, leading to the
error discussed in
#549
@vpratz (Collaborator, Author) commented Aug 5, 2025

Good call, @stefanradev93. The problem arises here because we deserialize the config before passing it to the constructor. This instantiates the DTypePolicy as an object. As the autoconf mechanism doesn't handle objects of that type, this leads to the error.

Removing the deserialize call from the SummaryNetwork seems to fix the problem.
However, this changes the default from_config for summary networks and could cause problems downstream for anyone who overrides get_config but not from_config. To counter this, we could check whether get_config was overridden (in which case auto-config is not used anyway) and only apply deserialize to the config in that case, like this:

    @classmethod
    def from_config(cls, config, custom_objects=None):
        # get_config is still the Keras default (marked with `_is_default`):
        # the config contains only plain types and can be passed through as-is
        if hasattr(cls.get_config, "_is_default") and cls.get_config._is_default:
            return cls(**config)
        # get_config was overridden: keep deserializing the config as before
        return cls(**deserialize(config, custom_objects=custom_objects))
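
For context, a minimal round-trip that surfaced the error (a sketch; it assumes a summary network that can be constructed with default arguments, e.g. a DeepSet, and the exact call where the error shows up may differ):

import keras
import bayesflow as bf

net = bf.networks.DeepSet()
config = keras.saving.serialize_keras_object(net)
restored = keras.saving.deserialize_keras_object(config)
# before the fix (with Keras 3.11), `restored` held a DTypePolicy object for
# its dtype, because from_config deserialized the whole config; serializing it
# again then failed the auto-config type check discussed above
config_again = keras.saving.serialize_keras_object(restored)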

codecov bot commented Aug 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
bayesflow/networks/summary_network.py 91.89% <100.00%> (-2.40%) ⬇️
bayesflow/workflows/basic_workflow.py 69.94% <100.00%> (+0.17%) ⬆️

@vpratz (Collaborator, Author) commented Aug 5, 2025

Merging this now to make the state of the dev branch fully functional again. If I interpret it correctly, not many users will encounter the issue, as it should only arise if you load a model and then save it again. Nevertheless, I think it would be good to do a bugfix release soon.
What do you think, @stefanradev93 and @LarsKue?

@vpratz merged commit 952862c into dev, Aug 5, 2025 (9 checks passed)
@vpratz deleted the fix-workflow-optimizer branch August 5, 2025 08:14
stefanradev93 added a commit that referenced this pull request Aug 27, 2025
* fix trainable parameters in distributions (#520)

* Improve numerical precision in MVNScore.log_prob

* add log_gamma diagnostic (#522)

* add log_gamma diagnostic

* add missing export for log_gamma

* add missing export for gamma_null_distribution, gamma_discrepancy

* fix broken unit tests

* rename log_gamma module to sbc

* add test_log_gamma unit test

* add return information to log_gamma doc string

* fix typo in docstring, use fixed-length np array to collect log_gammas instead of appending to an empty list

* Breaking changes: Fix bugs regarding counts in standardization layer (#525)

* standardization: add test for multi-input values (failing)

This test reveals two bugs in the standardization layer:

- count is updated multiple times
- batch_count is too small, as the sizes from reduce_axes have to be
  multiplied

* breaking: fix bugs regarding count in standardization layer

Fixes #524

This fixes the two bugs described in c4cc133:

- count was accidentally updated, leading to wrong values
- count was calculated wrongly, as only the batch size was used. Correct
  is the product of all reduce dimensions (e.g., for input of shape
  (batch, set_size, dim) standardized over the first two axes, the count
  per update is batch * set_size, not batch). This led to wrong standard
  deviations

While the batch dimension is the same for all inputs, the size of the
second dimension might vary. For this reason, we need to introduce an
input-specific `count` variable. This breaks serialization.

* fix assert statement in test

* rename log_gamma to calibration_log_gamma (#527)

* simple fix

* Hotfix: numerical stability of non-log-stabilized sinkhorn plan (#531)

* fix numerical stability issues in sinkhorn plan

* improve test suite

* fix ultra-strict convergence criterion in log_sinkhorn_plan

* update dependencies

* add comment about convergence check

* update docstring to reflect fixes

* sinkhorn_plan now returns a transport plan with uniform marginal distributions

* add unit test for sinkhorn_plan

* fix sinkhorn function by sampling from the logits of the transpose of the plan, instead of the plan directly

* sinkhorn(x1, x2) now samples from log(plan) to receive assignments such that x2[assignments] matches x1

* re-enable test_assignment_is_optimal() for method='sinkhorn'

* log_sinkhorn now correctly uses log_plan instead of keras.ops.exp(log_plan), log_sinkhorn_plan returns logits of the transport plan

* add unit tests for log_sinkhorn_plan

* fix faulty indexing with tensor for tensorflow backend

* re-add numItermax for ot pot test

---------

Co-authored-by: Daniel Habermann <133031176+daniel-habermann@users.noreply.github.com>

* isinstance sequence

* Pass correct training stage in compute_metrics (#534)

* Pass correct training stage in CouplingFlow.compute_metrics

* Pass correct training stage in CIF and PointInferenceNetwork

* Custom test quantity support for calibration_ecdf (#528)

* Custom test quantity support for calibration_ecdf

* rename variable [no ci]

* Consistent defaults for variable_keys/names in calibration_ecdf with test quantiles

* Tests for calibration_ecdf with test_quantities

* Remove redundant and simplify comments

* Fix docstrings and typehints

---------

Co-authored-by: stefanradev93 <stefan.radev93@gmail.com>

* Log gamma test fix (#535)

* fix test_calibration_log_gamma_end_to_end unit test failing more often than expected

* set alpha to 0.1% in binom.ppf

* fix typo in comment

* Stateless adapters (#536)

* Remove stateful adapter features

* Fix tests

* Fix typo

* Remove nnpe from adapter

* Bring back notes [skip ci]

* Remove unnecessary restriction to kwargs only [skip ci]

* Remove old super call [skip ci]

* Robustify type [skip ci]

* remove standardize from multimodal sim notebook [no ci]

* add draft module docstring to augmentations module [no ci]

Feel free to modify.

* adapt and run neurocognitive modeling notebook [no ci]

* adapt cCM playground notebook [no ci]

* adapt signature of Adapter.standardize

* add parameters missed in previous commit

* Minor NNPE polishing

* remove stage in docstring from OnlineDataset

---------

Co-authored-by: Lasse Elsemüller <60779710+elseml@users.noreply.github.com>
Co-authored-by: Valentin Pratz <git@valentinpratz.de>

* Fix training strategies in BasicWorkflow

* move multimodal data notebook to regular examples [no ci]

* make pip install call on homepage more verbose [no ci]

* remove deprecated summaries function

The function was renamed to summarize in v2.0.4.

* detail subsampling behavior docs for SIR simulator [no ci]

fixes #518

* move DiffusionModel from experimental to networks

Stabilizes the DiffusionModel class. A deprecation warning for the
DiffusionModel class in the experimental module was added.

* Add citation for resnet (#537) [no ci]

* added citation for resnet

* minor formatting

---------

Co-authored-by: Valentin Pratz <git@valentinpratz.de>

* Bump up version [skip ci]

* Allow separate inputs to subnets for continuous models (#521)

Introduces easy access to the different inputs x, t and conditions, to allow for specialized processing of each input, which can be beneficial for more advanced use cases.

---------

Co-authored-by: Valentin Pratz <git@valentinpratz.de>

* Auto-select backend (#543)

* add automatic backend detection and selection

* Fix typo

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add priority ordering of backends

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: stefanradev93 <stefan.radev93@gmail.com>

* Breaking: parameterize MVNormalScore by inverse cholesky factor to improve stability (#545)

* breaking: parameterize MVNormalScore by inverse cholesky factor

The log_prob can be computed entirely from the inverse Cholesky factor
L^{-1}. Using this parameterization also stabilizes the initial loss and
speeds up computation.

This commit also contains two optimizations: moving the computation of
the precision matrix into the einsum, and using the sum of logs instead
of the log of a product.

As the parameterization changes, this is a breaking change.
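
For reference, with L the lower-triangular Cholesky factor of the
covariance, the log-density can be written purely in terms of L^{-1}:

    \log p(x) = -\tfrac{1}{2}\lVert L^{-1}(x - \mu)\rVert^2
                + \sum_i \log\bigl(L^{-1}\bigr)_{ii}
                - \tfrac{d}{2}\log(2\pi)

since \Sigma^{-1} = L^{-\top}L^{-1} and
\log\det\Sigma = -2\sum_i \log\bigl(L^{-1}\bigr)_{ii}.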

* Add right_side_scale_inverse and test [no ci]

The transformation necessary to undo standardization for a Cholesky
factor of the precision matrix is x_ij = x_ij' / sigma_j, which is now
implemented by a right_side_scale_inverse transformation_type.

* Stop skipping MVN tests

* Remove stray keyword argument in fill_triangular_matrix

* Rename cov_chol_inv to precision_chol and update docstrings [no ci]

* rename precision_chol to precision_cholesky_factor

to improve clarity.

* rename cov_chol to covariance_cholesky_factor

* remove check_approximator_multivariate_normal_score function [no ci]

---------

Co-authored-by: han-ol <g@hans.olischlaeger.com>

* fix unconditional sampling in ContinuousApproximator (#548)

- batch shape was calculated from inference_conditions even if they are
  known to be None
- add approximator test for unconditional setting

* Test quantities Linear Regression Starter notebook (#544)

* Implementation of log-lik test quantity for SBC in starter notebook

* update data-dependent test-quantities example

* Small typo fixes in linear regression notebook

---------

Co-authored-by: Paul-Christian Bürkner <paul.buerkner@gmail.com>

* fix: optimizer was not used in workflow with multiple fits

For the optimizer to be used, the approximator.compile function has to
be called. This was not the case. I adapted the `setup_optimizer`
function to match the description in its docstring, and made the
compilation conditional on its output. The output indicates if a new
optimizer was configured.

* fix: remove extra deserialize call for SummaryNetwork

The extra call causes the DTypePolicy to be deserialized. It is then
passed as an object and cannot be handled by auto-config, leading to the
error discussed in
#549

* Compatibility: deserialize when get_config was overridden

* unify log_prob signature in PointApproximator [no ci]

ContinuousApproximator and BasicWorkflow allow passing the data
positionally; we can allow the same for the PointApproximator.

* Tutorial on spatial data with Gaussian Random Fields (#540) [no ci]

The tutorial uses the experimental ResNet class to build a summary
network for spatial data.

* Support non-array data in test_quantity calibration ecdf [no ci]

Simulator outputs are allowed to be of type int or float, and
consequently have no batch dimension. This needs to be considered
in the broadcasting of inference_conditions for data-based SBC
test quantities.

"examples/Linear_Regression_Starter.ipynb" contains an example where this is
necessary, where N is a non-batchable integer.

* import calibration_log_gamma in diagnostics namespace [no ci]

* Add wrapper around scipy.integrate.solve_ivp for integration

* minor fixes and improvements to the pairs plot functions

- pass target color to legend
- do not use common norm, so that prior stays visible in kdeplots
- do not share y on the diagonal, so that all marginal distributions
  stay visible, even if one is very peaked

* fix: layers were not deserialized for Sequential and Residual

As layers were passed with the `*layers` syntax, they could not be
passed as keyword arguments. In `from_config`, however, this was
attempted, which caused the layers to be ignored during re-serialization.
This commit fixes this by taking the layers from `kwargs` if they are
passed as a keyword argument.
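
A minimal sketch of the pattern (illustrative only, not the actual
BayesFlow code): accept layers positionally for direct construction and
via the "layers" key when restored from a config.

import keras


class SequentialSketch(keras.layers.Layer):
    def __init__(self, *layers, **kwargs):
        # from_config passes the layers as a keyword argument,
        # direct construction passes them positionally
        if not layers and "layers" in kwargs:
            layers = kwargs.pop("layers")
        super().__init__(**kwargs)
        self.inner_layers = list(layers)

    def get_config(self):
        config = super().get_config()
        config["layers"] = [
            keras.saving.serialize_keras_object(layer) for layer in self.inner_layers
        ]
        return config

    @classmethod
    def from_config(cls, config, custom_objects=None):
        config["layers"] = [
            keras.saving.deserialize_keras_object(layer, custom_objects=custom_objects)
            for layer in config["layers"]
        ]
        return cls(**config)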

* add serialization tests for Sequential and Residual

* Fix: ensure checkpoint filepath exists before training

Previously, choosing a non-existent directory as checkpoint_filepath
would lead to silently not saving at all.

* Revert 954c16c since it was unnecessary

The alleged issue didn't exist, and checkpoint folders are created by
the keras callback automatically already.

I misread tests on this and didn't catch that the problem I was seeing
was caused by a different part of my pipeline.

* improvements to diagnostic plots (#556)

* improvements to diagnostics plots

add markersize parameter, add tests, support dataset_id for
pairs_samples

Fixes #554.

* simplify test_calibration_ecdf_from_quantiles

* Add pairs plot for arbitrary quantities (#550)

Add pairs_quantity and plot_quantity functions that allow plotting of quantities that can be calculated for each individual dataset. Currently, for the provided metrics this is only useful for posterior contraction, but could be useful for posterior z-score and other quantities as well.

* minor fix in diffusion edm schedule (#560)

* minor fix in diffusion edm schedule

* DeepSet: Adapt output dimension of invariant module inside the equivariant module (#557) (#561)

* adapt output dim of invariant module in equivariant module

See #557. The DeepSet showed bad performance and was not able to learn
diverse summary statistics. Reducing the dimension of the output of the
invariant module inside the equivariant module improves this, probably
because the individual information of each set member gains importance
compared to the shared information provided by the invariant module.

There might be better settings for this, so we might update the default
later on. However, this is already an improvement over the previous
setting.

* DeepSet: adapt docstring to reflect code

* pairs_posterior: inconsistent type hint fix (#562)

* allow exploding variance type in EDM schedule

* fix type hint

* Bump up version [skip ci]

* Fix instructions for backend spec [skip ci]

* Add New Flow Matching Schedules (#565)

* add fm schedule

* add fm schedule

* add comments

* expose time_power_law_alpha

* Improve doc [skip ci]

---------

Co-authored-by: stefanradev93 <stefan.radev93@gmail.com>

* change default integration method to rk45

for DiffusionModel and FlowMatching. Euler shows significant deviations
when computing the log-prob, which risks misleading users regarding the
performance of the networks.

rk45 is slower, but the problem is heavily reduced with this method.

* fix nan to num inverse

* fix setting markersize in lotka volterra notebook

* fix: actually set KERAS_BACKEND to chosen backend

Add warning if KERAS_BACKEND and actually loaded backend do not match.
This can happen if keras is imported before BayesFlow.

* Fix warning msg

---------

Co-authored-by: Valentin Pratz <112951103+vpratz@users.noreply.github.com>
Co-authored-by: han-ol <g@hans.olischlaeger.com>
Co-authored-by: Daniel Habermann <133031176+daniel-habermann@users.noreply.github.com>
Co-authored-by: Valentin Pratz <git@valentinpratz.de>
Co-authored-by: arrjon <jonas.arruda@uni-bonn.de>
Co-authored-by: Lars <lars@kuehmichel.de>
Co-authored-by: Hans Olischläger <106988117+han-ol@users.noreply.github.com>
Co-authored-by: Lasse Elsemüller <60779710+elseml@users.noreply.github.com>
Co-authored-by: Leona Odole <88601208+eodole@users.noreply.github.com>
Co-authored-by: Jonas Arruda <69197639+arrjon@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Paul-Christian Bürkner <paul.buerkner@gmail.com>
Co-authored-by: The-Gia Leo Nguyen <Leo.Nguyen@gmx.de>