Transformed Variable not trainable in Keras model #946
Try adding `self.foo = self.distribution.variables`. Unfortunately, Keras doesn't recognize the variables inside a `tf.Module`. No idea why.
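Applied to the model from the report quoted below, the workaround would look roughly like this (a sketch; `foo` is just an arbitrary attribute name whose only purpose is to put the variables where Keras's attribute tracking can see them):

class Model(tf.keras.Model):
  def __init__(self):
    super(Model, self).__init__()
    self.loc = tf.Variable(tf.ones(shape=[5]), name="loc")
    self.scale = tfp.util.TransformedVariable(
        tf.ones([5]), bijector=tfp.bijectors.Softplus(), name="scale")
    self.distribution = tfp.distributions.Normal(loc=self.loc, scale=self.scale)
    # Workaround: assigning the distribution's variables to a plain
    # attribute makes Keras track them, so model.trainable_weights now
    # includes the pretransformed 'scale' variable as well.
    self.foo = self.distribution.variables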
…On Fri, May 22, 2020, 9:09 PM Keyon Vafa ***@***.***> wrote:
Hi,
I am trying to train a positive variable using
tfp.util.TransformedVariable as an attribute of a tf.keras.Model object.
However, the model does not recognize the object as a trainable variable,
and it does not receive gradients. This behavior holds for
tensorflow_probability==0.10.0 and tensorflow==2.2.0, as well as for the
nightly builds of both.
Here is a colab notebook illustrating this behavior:
https://colab.research.google.com/drive/1XGCcm8l0OGRiy35lr3XcHAyZMuBNpsIB?usp=sharing
In this example, we are trying to train both an unconstrained variable (`loc`) and a constrained variable (`scale`). Only the `loc` variable updates.
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

class Model(tf.keras.Model):
  def __init__(self):
    super(Model, self).__init__()
    self.loc = tf.Variable(tf.ones(shape=[5]), name="loc")
    self.scale = tfp.util.TransformedVariable(
        tf.ones([5]),
        bijector=tfp.bijectors.Softplus(),
        name="scale")
    self.distribution = tfp.distributions.Normal(loc=self.loc, scale=self.scale)

  def call(self, inputs):
    samples = self.distribution.sample()
    assigned_means = tf.gather(samples, inputs)
    return tfp.distributions.Normal(loc=assigned_means, scale=1.)

model = Model()
print(model.trainable_weights)  # only 'loc' shows up

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
loss = lambda x, rv: -tf.reduce_sum(rv.log_prob(x))
inputs = np.array([0, 1, 2, 3, 4]).astype(np.int32)
outputs = np.array([0., 1., 2., 3., 4.]).astype(np.float32)
dataset = tf.data.Dataset.from_tensor_slices((inputs, outputs))
dataset = dataset.batch(5)
model.compile(optimizer=optimizer, loss=loss)
model.fit(dataset, epochs=100, verbose=0)

# Check if the location parameters have moved from their original values.
assert not np.isclose(model.loc.numpy(), np.ones(5)).all()  # Passes

# Check if the scale parameters have moved from their original values.
assert not np.isclose(model.scale.numpy(), np.ones(5)).all()  # Fails
Thanks!
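A quick way to see the problem directly (a sketch, not part of the original report; it reuses the Model class defined above): compare what Keras tracks with what the `TransformedVariable` itself holds.

model = Model()
x = np.array([0, 1, 2, 3, 4]).astype(np.int32)
y = np.array([0., 1., 2., 3., 4.]).astype(np.float32)
with tf.GradientTape() as tape:
  rv = model(x)
  nll = -tf.reduce_sum(rv.log_prob(y))
# Keras only reports 'loc', so 'scale' never reaches the optimizer:
print([v.name for v in model.trainable_weights])
# The gradient itself is fine if you ask for it explicitly, since
# TransformedVariable is a tf.Module with its own variable list:
print(tape.gradient(nll, model.scale.trainable_variables))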
Thank you. That works for me. Hopefully it will eventually be possible for Keras to recognize variables inside a `tf.Module`.
Hello, I have the same problem with a `tf.variable_creator_scope`:

# Assumes the usual aliases, e.g.:
#   import tensorflow as tf
#   import tensorflow_probability as tfp
#   tfd = tfp.distributions
#   tfn = tfp.experimental.nn

def random_variable_scope(next_creator, **kwargs):
  iv = kwargs['initial_value']
  if callable(iv):
    return tfn.util.RandomVariable(tfd.Normal(tf.Variable(iv()), 1))
  return next_creator(**kwargs)

with tf.variable_creator_scope(random_variable_scope):
  d = tf.keras.layers.Dense(2)
  d(tf.zeros([3, 4]))

Evaluating `[type(v) for v in d.variables]` gives: [output not preserved]

The `foo` trick kind of works, but it pollutes the variables list:

d.foo = [v.variables for v in d.variables]
[type(v) for v in d.variables]

[output not preserved]
@keyonvafa this is a core TensorFlow issue (see tensorflow/tensorflow#47264). You can use the `TrackableLayer` workaround:

class Model(tf.keras.Model, TrackableLayer):
  def __init__(self):
    super().__init__()  # this is the recommended style in Python 3 anyway
    ...  # rest of __init__ unchanged from the original example

With your original Model definition, only the `loc` variable shows up in `model.trainable_weights` (output listings not preserved); when inheriting from `TrackableLayer`, both variables show up, as expected.

@krzysztofrusek you can use a similar workaround for your own issue as well:

class TrackableDense(tf.keras.layers.Dense, TrackableLayer):
  pass

though I'm not sure it resolves your duplicated-variable issue. 🤔
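For intuition, the gist of such a `TrackableLayer` mixin (a minimal sketch, under the assumption that all it has to do is surface variables held by `tf.Module` attributes that pre-2.5 Keras fails to track; the real implementation referenced above may differ):

class TrackableLayer(tf.keras.layers.Layer):
  @property
  def trainable_weights(self):
    # Start from whatever Keras already tracks.
    weights = list(super().trainable_weights)
    seen = {id(w) for w in weights}
    # Then walk plain attributes and pick up tf.Module-held variables
    # (e.g. a tfp.distributions.Normal or a TransformedVariable).
    for attr in vars(self).values():
      if isinstance(attr, tf.Module):
        for v in attr.trainable_variables:
          if id(v) not in seen:  # avoid double-counting
            seen.add(id(v))
            weights.append(v)
    return weights

A complete version would do the same for `non_trainable_weights`.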
Thank you @st--! I'll give that a try.
Also, it looks like this issue will finally be fixed in TensorFlow 2.5: tensorflow/tensorflow#47264 (comment)
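If that lands as described, on TensorFlow >= 2.5 the original reproduction should pass without any workaround; a quick check (a sketch, assuming the fix behaves as that comment suggests):

model = Model()  # the original Model, no `foo` attribute and no TrackableLayer
print([v.name for v in model.trainable_weights])
# Expected on TF >= 2.5: both 'loc' and the pretransformed 'scale'
# variable are listed, so model.fit updates both.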