Unsupported loss function in seq2seq model. #163
Description
I am exploring the following TensorFlow example: https://github.com/googledatalab/notebooks/blob/master/samples/TensorFlow/LSTM%20Punctuation%20Model%20With%20TensorFlow.ipynb which is apparently written in TF v1, so I upgraded it with the v2 upgrade script, and it reported three main inconsistencies:
ERROR: Using member tf.contrib.rnn.DropoutWrapper in deprecated module tf.contrib. tf.contrib.rnn.DropoutWrapper cannot be converted automatically. tf.contrib will not be distributed with TensorFlow 2.0, please consider an alternative in non-contrib TensorFlow, a community-maintained repository such as tensorflow/addons, or fork the required code.
ERROR: Using member tf.contrib.legacy_seq2seq.sequence_loss_by_example in deprecated module tf.contrib. tf.contrib.legacy_seq2seq.sequence_loss_by_example cannot be converted automatically. tf.contrib will not be distributed with TensorFlow 2.0, please consider an alternative in non-contrib TensorFlow, a community-maintained repository such as tensorflow/addons, or fork the required code.
ERROR: Using member tf.contrib.framework.get_or_create_global_step in deprecated module tf.contrib. tf.contrib.framework.get_or_create_global_step cannot be converted automatically. tf.contrib will not be distributed with TensorFlow 2.0, please consider an alternative in non-contrib TensorFlow, a community-maintained repository such as tensorflow/addons, or fork the required code.
So for compatibility I manually replaced framework.get_or_create_global_step with tf.compat.v1.train.get_or_create_global_step, and rnn.DropoutWrapper with tf.compat.v1.nn.rnn_cell.DropoutWrapper.
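For reference, the two substitutions look like this (lstm_cell and keep_prob are placeholders standing in for the notebook's actual cell and dropout value):

# tf.contrib.framework.get_or_create_global_step ->
global_step = tf.compat.v1.train.get_or_create_global_step()

# tf.contrib.rnn.DropoutWrapper ->
cell = tf.compat.v1.nn.rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=keep_prob)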
But I was unable to find a way to handle the tf.contrib.legacy_seq2seq.sequence_loss_by_example method, since I cannot find a backwards-compatible alternative. I tried installing TensorFlow Addons and using its seq2seq loss function, but I was not able to figure out how to adapt it to work with the rest of the code.
I stumbled across errors like Consider casting elements to a supported type. and Logits must be a [batch_size x sequence_length x logits] tensor, probably because I am not implementing something correctly; the second error suggests the Addons function expects a single 3-D logits tensor rather than the list of 2-D tensors that the legacy function took.
My question: how can I implement a supported TensorFlow v2 alternative to this loss function so that it behaves like the code below?
output = tf.reshape(tf.concat(axis=1, values=outputs), [-1, size])
softmax_w = tf.compat.v1.get_variable("softmax_w", [size, len(TARGETS)], dtype=tf.float32)
softmax_b = tf.compat.v1.get_variable("softmax_b", [len(TARGETS)], dtype=tf.float32)
logits = tf.matmul(output, softmax_w) + softmax_b
self._predictions = tf.argmax(input=logits, axis=1)
self._targets = tf.reshape(input_.targets, [-1])
# attempted drop-in replacement, keeping the list-style arguments that
# tf.contrib.legacy_seq2seq.sequence_loss_by_example used to take
loss = tfa.seq2seq.sequence_loss(
    [logits],
    [tf.reshape(input_.targets, [-1])],
    [tf.ones([batch_size * num_steps], dtype=tf.float32)])
self._cost = cost = tf.reduce_sum(input_tensor=loss) / batch_size
self._final_state = state
Full code here.
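For reference, the direction I have been experimenting with (only a sketch, not a verified fix): tfa.seq2seq.sequence_loss appears to want a single 3-D logits tensor plus 2-D targets and weights instead of the legacy single-element lists, and disabling both averaging flags should yield per-position losses the way sequence_loss_by_example did. Variable names (outputs, size, batch_size, num_steps, TARGETS, input_) are the notebook's:

import tensorflow as tf
import tensorflow_addons as tfa

# Reshape to the shapes tfa.seq2seq.sequence_loss documents:
# logits [batch_size, sequence_length, num_classes],
# targets and weights [batch_size, sequence_length].
logits_3d = tf.reshape(logits, [batch_size, num_steps, len(TARGETS)])
targets_2d = tf.reshape(input_.targets, [batch_size, num_steps])
weights_2d = tf.ones([batch_size, num_steps], dtype=tf.float32)

# With both averaging flags off, the result is a [batch_size, num_steps]
# tensor of per-position losses, so reduce_sum / batch_size should give
# the same cost the legacy code computed.
loss = tfa.seq2seq.sequence_loss(
    logits_3d, targets_2d, weights_2d,
    average_across_timesteps=False,
    average_across_batch=False)
cost = tf.reduce_sum(loss) / batch_size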
My proposal: when this is resolved, please update the notebook with a newer-version example.