Graphing Model Created with Model Subclassing #3527

Open
ryanmaxwell96 opened this issue Apr 17, 2020 · 6 comments

@ryanmaxwell96

I'm trying to plot a model in TF2 that was made with the model subclassing method, but I have yet to find a working way to plot a model built this way. (The code I've been modifying constructs its models via subclassing, and I have not been able to find any way to plot the fractal model.)

Is there any way to plot a model if it has been made with the subclassing method?

Thanks,

Ryan

@caisq
Contributor

caisq commented Apr 18, 2020

One way I know of is to use get_concrete_function() to get the forward pass of the model, and then use summary_ops_v2.graph() to write it to the logdir. See the example below:

import numpy as np
import tensorflow as tf
from tensorflow.python.ops import summary_ops_v2

class SubclassedModel(tf.keras.Model):

  def __init__(self):
    super(SubclassedModel, self).__init__()
    self.layer_1 = tf.keras.layers.Dense(
        1, activation="sigmoid", input_shape=[3])

  @tf.function
  def call(self, inputs):
    return self.layer_1(inputs)


model = SubclassedModel()
model.compile(loss="binary_crossentropy", optimizer="adam")

logdir = "/tmp/subclassed_model_logdir"
xs = np.ones([4, 3])
ys = np.zeros([4, 1])
model.fit(xs, ys, epochs=1, 
          callbacks=tf.keras.callbacks.TensorBoard(logdir))

writer = summary_ops_v2.create_file_writer_v2(logdir)
with writer.as_default():
  summary_ops_v2.graph(model.call.get_concrete_function(xs).graph)  # <--
  writer.flush()
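Once the writer flushes, pointing TensorBoard at the logdir should show the model graph under the Graphs tab:

tensorboard --logdir /tmp/subclassed_model_logdir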

Does this suit your needs, @ryanmaxwell96?

@ryanmaxwell96
Author

ryanmaxwell96 commented Apr 18, 2020

I'm getting this error now:

python3 train.py AntBulletEnv-v0
pybullet build time: Mar 24 2020 20:06:11
2020-04-18 10:07:07.535777: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library libnvinfer.so.6; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/ryan/.mujoco/mujoco200/bin
2020-04-18 10:07:07.535866: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library libnvinfer_plugin.so.6; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/ryan/.mujoco/mujoco200/bin
2020-04-18 10:07:07.535876: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Value Params -- lr: 0.000884
2020-04-18 10:07:08.508923: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library libcuda.so.1; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/ryan/.mujoco/mujoco200/bin
2020-04-18 10:07:08.508946: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
2020-04-18 10:07:08.508960: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ryan-ThinkPad-X1-Carbon-6th): /proc/driver/nvidia/version does not exist
2020-04-18 10:07:08.509105: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-18 10:07:08.531574: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1999965000 Hz
2020-04-18 10:07:08.532399: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x49a93a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-18 10:07:08.532456: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Policy Params -- lr: 7.95e-05, logvar_speed: 16
Traceback (most recent call last):
  File "train.py", line 375, in <module>
    main(**vars(args))
  File "train.py", line 310, in main
    policy = Policy(bl, c, layersizes, dropout, deepest, obs_dim, kl_targ, init_logvar)
  File "/home/ryan/trpo_fractal1NN_3/trpo/policy.py", line 51, in __init__
    self.trpo.fit(xs, ys, epochs=1, callbacks=tf.keras.callbacks.TensorBoard(logdir))
  File "/home/ryan/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ryan/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 216, in fit
    optimizer=model.optimizer)
  File "/home/ryan/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/distribute/distributed_training_utils.py", line 254, in validate_callbacks
    for callback in input_callbacks:
TypeError: 'TensorBoard' object is not iterable
Here is the relevant code:
"""NN Policy with KL Divergence Constraint

Written by Patrick Coady (pat-coady.github.io)
"""
import tensorflow.keras.backend as K
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Layer
from tensorflow.keras.optimizers import Adam
import numpy as np

import tensorflow as tf
tf.config.experimental_run_functions_eagerly(True)  # works around "tried to create variables on non-first call"

from fractalnet_regularNN import *

from tensorflow.python.ops import summary_ops_v2


class Policy(object):
    def __init__(self, bl, c, layersizes, dropout, deepest, obs_dim, kl_targ, init_logvar):
        """
        Args:
            obs_dim: num observation dimensions (int)
            act_dim: num action dimensions (int)
            kl_targ: target KL divergence between pi_old and pi_new
            hid1_mult: size of first hidden layer, multiplier of obs_dim (default is 10)
            init_logvar: natural log of initial policy variance
        """
        self.beta = 1.0  # dynamically adjusted D_KL loss multiplier
        eta = 50  # multiplier for D_KL-kl_targ hinge-squared loss
        self.kl_targ = kl_targ
        self.epochs = 20
        self.lr_multiplier = 1.0  # dynamically adjust lr when D_KL out of control
        self.trpo = TRPO(bl, c, layersizes, dropout, deepest, obs_dim, kl_targ, init_logvar, eta)
        self.policy = self.trpo.get_layer("policy_nn")  # output layer which gives the means (the variances are not produced by the NN; still unsure how Keras handles them)
        self.lr = self.policy.get_lr()  # lr calculated based on size of PolicyNN
        self.trpo.compile(optimizer=Adam(self.lr * self.lr_multiplier))  # configures model for training

        logdir = "/subclassed_model_logdir"
        xs = np.ones([29,])
        ys = np.zeros([8,])
        self.trpo.fit(xs, ys, epochs=1, callbacks=tf.keras.callbacks.TensorBoard(logdir))
        writer = summary_ops_v2.create_file_writer_v2(logdir)
        with writer.as_default():
            summary_ops_v2.graph(self.trpo.call.get_concrete_function(xs).graph)
            writer.flush()

        self.logprob_calc = LogProb()

    def sample(self, obs):
        """Draw sample from policy."""
        act_means, act_logvars = self.policy(obs)
        act_stddevs = np.exp(act_logvars / 2)
        temp = np.random.normal(act_means, act_stddevs).astype(np.float32)
        return temp

    def update(self, observes, actions, advantages, logger, disc_sum_rew):
        K.set_value(self.trpo.optimizer.lr, self.lr * self.lr_multiplier)
        K.set_value(self.trpo.beta, self.beta)
        old_means, old_logvars = self.policy(observes)
        old_means = old_means.numpy()
        old_logvars = old_logvars.numpy()
        old_logp = self.logprob_calc([actions, old_means, old_logvars])
        old_logp = old_logp.numpy()
        loss, kl, entropy = 0, 0, 0
        for e in range(self.epochs):
            loss = self.trpo.train_on_batch([observes, actions, advantages,
                                             old_means, old_logvars, old_logp])
            kl, entropy = self.trpo.predict_on_batch([observes, actions, advantages,
                                                      old_means, old_logvars, old_logp])
            kl, entropy = np.mean(kl), np.mean(entropy)
            if kl > self.kl_targ * 4:  # early stopping if D_KL diverges badly
                break
        # adapt beta and learning rate depending on how far KL is from the target
        if kl > self.kl_targ * 2:
            self.beta = np.minimum(35, 1.5 * self.beta)
            if self.beta > 30 and self.lr_multiplier > 0.1:
                self.lr_multiplier /= 1.5
        elif kl < self.kl_targ / 2:
            self.beta = np.maximum(1 / 35, self.beta / 1.5)
            if self.beta < (1 / 30) and self.lr_multiplier < 10:
                self.lr_multiplier *= 1.5

        logger.log({"PolicyLoss": loss,
                    "PolicyEntropy": entropy,
                    "KL": kl,
                    "Beta": self.beta,
                    "_lr_multiplier": self.lr_multiplier})


class PolicyNN(Layer):
    """Neural net for policy approximation function.

    Policy parameterized by Gaussian means and variances. NN outputs mean
    action based on observation. Trainable variables hold log-variances
    for each action dimension (i.e. variances not determined by NN).
    Variances can be calculated just from the vector containing all of
    the probabilities and the mean from the NN.
    """
    def __init__(self, bl, c, layersizes, dropout, deepest, obs_dim, init_logvar, **kwargs):
        super(PolicyNN, self).__init__(**kwargs)
        self.bl = bl
        self.c = c
        self.layersizes = layersizes
        self.dropout = dropout
        self.deepest = deepest
        self.batch_sz = None
        self.init_logvar = init_logvar
        hid1_units = 512
        hid3_units = 32  # 10 empirically determined
        hid2_units = 128
        self.lr = 9e-4 / np.sqrt(hid2_units)  # 9e-4 empirically determined

        self.dense1 = Dense(512, activation="relu", input_shape=(29,))
        self.dense2 = Dense(128, activation="relu", input_shape=(512,))  # hid1_units = 270 for halfcheetah
        self.dense3 = Dense(32, activation="relu", input_shape=(128,))  # hid2_units = 127 for halfcheetah
        self.dense4 = Dense(8, input_shape=(32,))  # hid3_units = 60 for halfcheetah

        logvar_speed = (10 * hid3_units) // 48
        logvar_speed = 16  # overrides the computed value
        self.logvars = self.add_weight(shape=(logvar_speed, int(self.layersizes[4][0])),
                                       trainable=True, initializer="zeros")
        print("Policy Params -- lr: {:.3g}, logvar_speed: {}"
              .format(self.lr, logvar_speed))

    def build(self, input_shape):
        self.batch_sz = input_shape[0]  # input_shape = (1,27)

    def call(self, inputs, **kwargs):
        means = fractal_net(self, bl=self.bl, c=self.c, layersizes=self.layersizes,
                            drop_path=0.15, dropout=self.dropout,
                            deepest=self.deepest)(inputs)
        logvars = K.sum(self.logvars, axis=0, keepdims=True) + self.init_logvar
        logvars = K.tile(logvars, (self.batch_sz, 1))

        return [means, logvars]

    def get_lr(self):
        return self.lr


class KLEntropy(Layer):
    """Layer calculates:
    1. KL divergence between old and new policy distributions
    2. Entropy of present policy

    https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Kullback.E2.80.93Leibler_divergence
    https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Entropy
    """
    def __init__(self, **kwargs):
        super(KLEntropy, self).__init__(**kwargs)
        self.act_dim = None

    def build(self, input_shape):
        self.act_dim = input_shape[0][1]

    def call(self, inputs, **kwargs):
        old_means, old_logvars, new_means, new_logvars = inputs
        # KL between two diagonal Gaussians, computed from log-variances
        log_det_cov_old = K.sum(old_logvars, axis=-1, keepdims=True)
        log_det_cov_new = K.sum(new_logvars, axis=-1, keepdims=True)
        trace_old_new = K.sum(K.exp(old_logvars - new_logvars), axis=-1, keepdims=True)
        kl = 0.5 * (log_det_cov_new - log_det_cov_old + trace_old_new +
                    K.sum(K.square(new_means - old_means) /
                          K.exp(new_logvars), axis=-1, keepdims=True) -
                    np.float32(self.act_dim))
        entropy = 0.5 * (np.float32(self.act_dim) * (np.log(2 * np.pi) + 1.0) +
                         K.sum(new_logvars, axis=-1, keepdims=True))

        return [kl, entropy]


class LogProb(Layer):
    """Layer calculates log probabilities of a batch of actions."""
    def __init__(self, **kwargs):
        super(LogProb, self).__init__(**kwargs)

    def call(self, inputs, **kwargs):
        actions, act_means, act_logvars = inputs
        logp = -0.5 * K.sum(act_logvars, axis=-1, keepdims=True)
        logp += -0.5 * K.sum(K.square(actions - act_means) / K.exp(act_logvars),
                             axis=-1, keepdims=True)

        return logp


class TRPO(Model):
    def __init__(self, bl, c, layersizes, dropout, deepest, obs_dim, kl_targ, init_logvar, eta, **kwargs):
        super(TRPO, self).__init__(**kwargs)
        self.kl_targ = kl_targ
        self.eta = eta
        self.beta = self.add_weight("beta", initializer="zeros", trainable=False)
        self.policy = PolicyNN(bl, c, layersizes, dropout, deepest, obs_dim, init_logvar)
        self.logprob = LogProb()
        self.kl_entropy = KLEntropy()

    def call(self, inputs):
        # inputs assembled in Policy.update() for train_on_batch and predict_on_batch
        obs, act, adv, old_means, old_logvars, old_logp = inputs
        new_means, new_logvars = self.policy(obs)  # PolicyNN call: new means and log-vars under current policy
        new_logp = self.logprob([act, new_means, new_logvars])
        kl, entropy = self.kl_entropy([old_means, old_logvars,
                                       new_means, new_logvars])
        loss1 = -K.mean(adv * K.exp(new_logp - old_logp))  # surrogate policy-gradient loss
        loss2 = K.mean(self.beta * kl)  # KL penalty
        loss3 = self.eta * K.square(K.maximum(0.0, K.mean(kl) - 2.0 * self.kl_targ))  # hinge-squared KL loss
        self.add_loss(loss1 + loss2 + loss3)

        return [kl, entropy]

@ryanmaxwell96
Author

Sorry, that code is really hard to read. I have uploaded the applicable file, with your suggested code added to the Policy class at lines 48-55: https://github.com/ryanmaxwell96/trpo_fractal1NN_3/blob/master/policy.py

@ryanmaxwell96
Author

ryanmaxwell96 commented Apr 18, 2020

My bad, I forgot to include the @tf.function part. But after adding it (now on my GitHub), I get the same error. The code is now on lines 136 to 150 of policy.py, with a call to the appropriate function at line 311 of train.py.

@caisq
Contributor

caisq commented Apr 19, 2020

The error you referred to above (related to libnvinfer_plugin) seems to be because you're calling summary_ops_v2.graph() inside a tf.function. Can you do this outside, as in the example I provided?
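A minimal sketch of that arrangement, assuming a Policy instance named policy built as in train.py (so the model is policy.trpo) and the same xs as above; untested against this setup:

writer = summary_ops_v2.create_file_writer_v2(logdir)
with writer.as_default():
    # plain top-level Python here, not traced by any @tf.function
    summary_ops_v2.graph(policy.trpo.call.get_concrete_function(xs).graph)
    writer.flush()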

@ryanmaxwell96
Author

I was running into the same problem when I tried to rearrange things as you suggested, so I decided to try running your code, and I'm getting the same error as well:

TypeError: 'TensorBoard' object is not iterable

I have the following installed:
tensorflow version is 2.1.0
keras version is 2.3.1
tensorboard version is 3.1.0

I'm not sure if that helps at all or not.
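For what it's worth, the traceback above ends in validate_callbacks iterating over the callbacks argument (for callback in input_callbacks:), so this TF version appears to expect a list of callbacks. A minimal sketch of that change, applied to the fit call from the earlier example (untested here):

model.fit(xs, ys, epochs=1,
          callbacks=[tf.keras.callbacks.TensorBoard(logdir)])  # list, not a bare callback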
