Fix: optimizer was not used in workflow with multiple fits #549

Conversation
For the optimizer to be used, the `approximator.compile` function has to be called. This was not the case for repeated calls to `fit_...`, even with `keep_optimizer` set to `False`. I adapted the `setup_optimizer` function to match the description in its docstring, and made the compilation conditional on its output. The output indicates whether a new optimizer was configured.

To reproduce, you can run the following code. In the version without the fix, one and two will give the same value, and three a different one. With the fix, all three produce different values, as expected.
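The reproduction script referenced above is not included in this excerpt. As a stand-in, here is a minimal sketch of the control flow the fix establishes, assuming a simplified `setup_optimizer` signature and a hypothetical `approximator` object; the real function lives in the BayesFlow workflow code and may differ in detail:

```python
import keras


def setup_optimizer(approximator, keep_optimizer):
    """Sketch: configure an optimizer if needed, report whether a new one was set up."""
    if approximator.optimizer is not None and keep_optimizer:
        # Reuse the existing optimizer; no recompilation required.
        return False
    # Hypothetical default; the real code derives the optimizer from the fit arguments.
    approximator.optimizer = keras.optimizers.Adam()
    return True


def fit(approximator, keep_optimizer=False):
    # The idea of the fix: only (re)compile when setup_optimizer actually configured
    # a new optimizer, so repeated fit calls with keep_optimizer=False pick up a
    # fresh optimizer instead of silently reusing the old state.
    new_optimizer = setup_optimizer(approximator, keep_optimizer)
    if new_optimizer:
        approximator.compile(optimizer=approximator.optimizer)
    # ... training loop would follow here ...
```

With such a structure, calling `fit` repeatedly with `keep_optimizer=False` configures and compiles a fresh optimizer each time, which is what the three-value check in the description exercises.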
The serialization tests seem to be suddenly failing for the multimodal network? Apart from that, the fix looks good to me.
Thanks for taking a look! I will check if I can reproduce the issue locally with updated dependencies and try to fix the failure.
The error concerns all summary networks and all three backends. I could only reproduce it after updating my dependencies; my suspicion is that the handling of the `dtype` policy changed.
The issue does not exist with Keras 3.10 and appears when upgrading to Keras 3.11.
This commit introduces the regression: keras-team/keras@24f104e, more specifically the code that was removed in this block and not moved to the Layer class. |
Reintroducing the removed code into Keras fixes the issue. How do we want to proceed? Do we want to try to get a fix included in Keras? With the changes in #500 we should also be able to work around the issue... We might also want to set ...

```diff
diff --git a/keras/src/ops/operation.py b/keras/src/ops/operation.py
index edda5a01d..355492a83 100644
--- a/keras/src/ops/operation.py
+++ b/keras/src/ops/operation.py
@@ -128,6 +128,17 @@ class Operation(KerasSaveable):
         arg_names = inspect.getfullargspec(cls.__init__).args
         kwargs.update(dict(zip(arg_names[1 : len(args) + 1], args)))
 
+        # Explicitly serialize `dtype` to support auto_config
+        dtype = kwargs.get("dtype", None)
+        if dtype is not None and isinstance(dtype, dtype_policies.DTypePolicy):
+            # For backward compatibility, we use a str (`name`) for
+            # `DTypePolicy`
+            if dtype.quantization_mode is None:
+                kwargs["dtype"] = dtype.name
+            # Otherwise, use `dtype_policies.serialize`
+            else:
+                kwargs["dtype"] = dtype_policies.serialize(dtype)
+
         # For safety, we only rely on auto-configs for a small set of
         # serializable types.
         supported_types = (str, int, float, bool, type(None))
```
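For context, here is a short illustration (not part of the PR) of what the two branches in the removed code produce; it relies only on the public `keras.dtype_policies` API:

```python
from keras import dtype_policies

# A plain (non-quantized) policy is stored as its name string, which passes the
# auto-config supported_types check.
policy = dtype_policies.DTypePolicy("mixed_float16")
print(policy.quantization_mode)          # None
print(policy.name)                       # "mixed_float16"

# A quantized policy has a quantization_mode set, so the removed code would fall
# back to the full serialized form instead of a bare string.
print(dtype_policies.serialize(policy))  # dict with "class_name", "config", ...
```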
Wait, does this break Keras serialization in general, or is it an artifact of using our custom monkey patch? As far as I can see, the commit simply removes the `dtype` handling shown above.
The extra call leads to the `DTypePolicy` being deserialized. It is then passed as a `DTypePolicy` object and cannot be handled by the auto-config, leading to the error discussed in #549.
Good call, @stefanradev93. The problem arises here because we deserialize the config before passing it to the constructor, which instantiates the `DTypePolicy`. Removing the deserialize call from the `from_config` shown below works around it:

```python
@classmethod
def from_config(cls, config, custom_objects=None):
    if hasattr(cls.get_config, "_is_default") and cls.get_config._is_default:
        return cls(**config)
    return cls(**deserialize(config, custom_objects=custom_objects))
```
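To make the failure mode concrete, here is a small illustration (not code from BayesFlow or Keras, and using made-up config values) of why a deserialized policy trips the auto-config check:

```python
from keras import dtype_policies

# A config as it comes out of a saved model: the dtype entry is still a simple value.
config = {"units": 16, "dtype": "mixed_float16"}

# What the extra deserialize call effectively does to the dtype entry:
config["dtype"] = dtype_policies.get(config["dtype"])  # now a DTypePolicy instance

# The auto-config built in Operation.__new__ only accepts simple types, so the
# policy object is rejected (this mirrors the supported_types check in the diff above).
supported_types = (str, int, float, bool, type(None))
for key, value in config.items():
    if not isinstance(value, supported_types):
        print(f"auto-config cannot handle {key!r} of type {type(value).__name__}")
```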
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Merging this now to make the state of the ...