Description
Hi,
I'm training the Transformer on a custom dataset on CPU. I used SPM encoding and followed the instructions to a T, but training always fails with the error trace below. The same error occurs whether I use BPE, SPM, or raw encoding. Kindly help!
step: 400, loss: 5.6811
step: 500, loss: 5.3085
Traceback (most recent call last):
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
[[{{node sinusoid_posisiton_embedder/embedding_lookup}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "transformer_main.py", line 308, in <module>
main()
File "transformer_main.py", line 293, in main
step = _train_epoch(sess, epoch, step, smry_writer)
File "transformer_main.py", line 273, in _train_epoch
_eval_epoch(sess, epoch, mode='eval')
File "transformer_main.py", line 193, in _eval_epoch
fetches_ = sess.run(fetches, feed_dict=feed_dict)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
[[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]
Caused by op 'sinusoid_posisiton_embedder/embedding_lookup', defined at:
File "transformer_main.py", line 308, in <module>
main()
File "transformer_main.py", line 96, in main
src_pos_embeds = pos_embedder(sequence_length=src_seq_len)
File "/home/karthik/installs/texar/texar/module_base.py", line 116, in __call__
return self._template(*args, **kwargs)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 360, in __call__
return self._call_func(args, kwargs)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 311, in _call_func
result = self._func(*args, **kwargs)
File "/home/karthik/installs/texar/texar/modules/embedders/position_embedders.py", line 327, in _build
outputs = tf.nn.embedding_lookup(embedding, inputs)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 316, in embedding_lookup
transform_fn=None)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 133, in _embedding_lookup_and_transform
result = _clip(array_ops.gather(params[0], ids, name=name),
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 3273, in gather
return gen_array_ops.gather_v2(params, indices, axis, name=name)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3748, in gather_v2
"GatherV2", params=params, indices=indices, axis=axis, name=name)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): indices[55,128] = 128 is not in [0, 128)
[[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]
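A note on reading the error: `indices[55,128] = 128 is not in [0, 128)` says that row 55 of the batch asked for position index 128, while the position embedding table only accepts indices 0 through 127. One possible explanation, not confirmed by the trace, is that at least one sequence in the eval data is longer than 128 tokens. The sketch below is plain Python and does not use Texar or TensorFlow. The constant `MAX_POSITIONS` and the helpers `out_of_range_positions`, `find_long_examples`, and `truncate` are hypothetical names introduced only for illustration.

```python
# Assumed table size, read off the "[0, 128)" range in the error message.
MAX_POSITIONS = 128


def out_of_range_positions(seq_len, max_positions=MAX_POSITIONS):
    """Return the position indices of a sequence that the table cannot serve."""
    return [p for p in range(seq_len) if p >= max_positions]


def find_long_examples(dataset, max_positions=MAX_POSITIONS):
    """Return the indices of examples that have more tokens than the table has rows."""
    return [i for i, example in enumerate(dataset) if len(example) > max_positions]


def truncate(token_ids, max_positions=MAX_POSITIONS):
    """Keep at most max_positions tokens so every position index stays valid."""
    return token_ids[:max_positions]


if __name__ == "__main__":
    # Toy dataset of token-id lists: lengths 10, 129, and 128.
    dataset = [[0] * 10, [0] * 129, [0] * 128]
    for i in find_long_examples(dataset):
        bad = out_of_range_positions(len(dataset[i]))
        print("example", i, "needs invalid positions", bad)
        dataset[i] = truncate(dataset[i])
    print("remaining long examples:", find_long_examples(dataset))
```

If a check like this finds long examples in the eval or test files, the options are to truncate or filter them during preprocessing, or to raise the maximum position size in the model config so it covers the longest sequence.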