Skip to content

Transformer Training fails on custom dataset #174

Closed
@karthikb23

Description

@karthikb23

Hi,
I'm training Transformer on a custom dataset on CPU. I've used spm encoding and have followed instructions to a T, but the training always fails with the below error trace. The same error occurs regardless of BPE, SPM or raw encodings. Kindly help!

step: 400, loss: 5.6811
step: 500, loss: 5.3085
Traceback (most recent call last):
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
	 [[{{node sinusoid_posisiton_embedder/embedding_lookup}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "transformer_main.py", line 308, in <module>
    main()
  File "transformer_main.py", line 293, in main
    step = _train_epoch(sess, epoch, step, smry_writer)
  File "transformer_main.py", line 273, in _train_epoch
    _eval_epoch(sess, epoch, mode='eval')
  File "transformer_main.py", line 193, in _eval_epoch
    fetches_ = sess.run(fetches, feed_dict=feed_dict)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
	 [[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]

Caused by op 'sinusoid_posisiton_embedder/embedding_lookup', defined at:
  File "transformer_main.py", line 308, in <module>
    main()
  File "transformer_main.py", line 96, in main
    src_pos_embeds = pos_embedder(sequence_length=src_seq_len)
  File "/home/karthik/installs/texar/texar/module_base.py", line 116, in __call__
    return self._template(*args, **kwargs)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 360, in __call__
    return self._call_func(args, kwargs)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 311, in _call_func
    result = self._func(*args, **kwargs)
  File "/home/karthik/installs/texar/texar/modules/embedders/position_embedders.py", line 327, in _build
    outputs = tf.nn.embedding_lookup(embedding, inputs)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 316, in embedding_lookup
    transform_fn=None)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 133, in _embedding_lookup_and_transform
    result = _clip(array_ops.gather(params[0], ids, name=name),
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 3273, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3748, in gather_v2
    "GatherV2", params=params, indices=indices, axis=axis, name=name)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): indices[55,128] = 128 is not in [0, 128)
	 [[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]

Metadata

Metadata

Assignees

Labels

questionFurther information is requested

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions