
Input Data Size placeholder missing when using reduce and repeat #1102

Closed
Atticus1806 opened this issue Aug 29, 2022 · 12 comments · Fixed by #1104

Comments

@Atticus1806
Collaborator

When using a reduce layer together with a repeat layer, RETURNN crashes with the following error if the reduction runs over the output time dim of the repeat:
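A minimal sketch of the failing setup, assuming layer and extern-data names for illustration (the actual experiment config is not shown in this report): the repeat layer creates a new dynamic time dim from per-position durations, and the reduce layer then reduces over that repeat output time dim.

```python
# Hypothetical minimal RETURNN net dict reproducing the reported pattern.
# Extern data keys ("phonemes", "duration_data") and layer names are
# illustrative assumptions, not taken from the original config.
network = {
    # "repeat" expands each position by its repetition count,
    # creating a new dynamic time dim on its output.
    "repeat": {
        "class": "repeat",
        "from": "data:phonemes",
        "repetitions": "data:duration_data",
    },
    # Reducing over that repeat output time dim is what triggers
    # the missing-placeholder crash described below.
    "reduce": {
        "class": "reduce",
        "mode": "mean",
        "from": "repeat",
        "axes": ["T"],
    },
    "output": {"class": "copy", "from": "reduce"},
}
```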

TensorFlow exception: 2 root error(s) found.
  (0) Invalid argument: You must feed a value for placeholder tensor 'extern_data/placeholders/audio_features/audio_features_dim0_size' with dtype int32 and shape [?]
         [[node extern_data/placeholders/audio_features/audio_features_dim0_size (defined at work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py:5628) ]]
  (1) Invalid argument: You must feed a value for placeholder tensor 'extern_data/placeholders/audio_features/audio_features_dim0_size' with dtype int32 and shape [?]
         [[node extern_data/placeholders/audio_features/audio_features_dim0_size (defined at work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py:5628) ]]
         [[_arg_extern_data/placeholders/audio_features/audio_features_0_0/_44]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'extern_data/placeholders/audio_features/audio_features_dim0_size':
  File "u/hilmes/experiments/tts_new_sis/work/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/rnn.py", line 11, in <module>
    main()
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 657, in main
    execute_main_task()
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 456, in execute_main_task
    engine.train()
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1451, in train
    self.init_train_epoch()
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1476, in init_train_epoch
    self.init_new_network(new_network_desc)
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1386, in init_new_network
    self._init_network(net_desc)
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1297, in _init_network
    extern_data.init_from_config(config=self.config, auto_create_placeholders=not use_dataset_pipeline)
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 68, in init_from_config
    self.data[key] = Data(name=key, auto_create_placeholders=auto_create_placeholders, **init_args)
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 2710, in __init__
    _auto_create_size_placeholders_on_dim_tags(name=name, dim_tags=dim_tags)
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 5767, in _auto_create_size_placeholders_on_dim_tags
    _create_size_placeholder(name=name, axis_wo_b=axis_wo_b, tag=tag, batch_dim=batch_dim_)
  File "work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 5628, in _create_size_placeholder
    dyn_size = tf_compat.v1.placeholder(
  File "work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 3100, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/gen_array_ops.py", line 6808, in placeholder
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 742, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3477, in _create_op_internal
    ret = Operation(
  File "work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

Exception InvalidArgumentError() in step 0. (pid 3683)
Failing op: <tf.Operation 'extern_data/placeholders/audio_features/audio_features_dim0_size' type=Placeholder>
Used by: [<tf.Operation 'reduce/audio_features_dim0_size_copy_add_dim_by_tag/ExpandDims' type=ExpandDims>, <tf.Operation 'extern_data/placeholders/audio_features/SequenceMask/ExpandDims' type=ExpandDims>, <tf.Operation 'extern_data/placeholders/audio_features/SequenceMask/Max' type=Max>]
  File "/u/hilmes/experiments/tts_new_sis/work/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/rnn.py", line 11, in <module>
    main()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 657, in main
    execute_main_task()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 456, in execute_main_task
    engine.train()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1451, in train
    self.init_train_epoch()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1476, in init_train_epoch
    self.init_new_network(new_network_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1386, in init_new_network
    self._init_network(net_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1303, in _init_network
    self.network, self.updater = self.create_network(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1344, in create_network
    network.construct_from_dict(net_dict)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 646, in construct_from_dict
    self.construct_layer(net_dict, name, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 959, in construct_layer
    layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 363, in transform_config_dict
    super(CopyLayer, cls).transform_config_dict(d, network=network, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/base.py", line 616, in transform_config_dict
    d["sources"] = [
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/base.py", line 617, in <listcomp>
    get_layer(src_name)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 3312, in __call__
    return get_layer.network.construct_layer(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 966, in construct_layer
    return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 1128, in add_layer
    layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 1042, in _create_layer
    layer = layer_class(**layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 5921, in __init__
    self.output.placeholder = self.reduce(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 5995, in reduce
    seq_len_bc = x.get_sequence_lengths_broadcast(axis=axis)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 5142, in get_sequence_lengths_broadcast
    return tag.dyn_size_ext.copy_compatible_to(self, check_dtype=False, check_sparse=False).placeholder
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 3664, in copy_compatible_to
    v = v.copy_add_dim_by_tag(data.get_dim_tag(target_axis), axis=new_v_axis, unbroadcast=unbroadcast_axis)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 3514, in copy_add_dim_by_tag
    placeholder = tf.expand_dims(self.placeholder, axis)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 435, in expand_dims_v2
    return gen_array_ops.expand_dims(input, axis, name)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2322, in expand_dims
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 742, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3477, in _create_op_internal
    ret = Operation(
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

  File "/u/hilmes/experiments/tts_new_sis/work/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/rnn.py", line 11, in <module>
    main()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 657, in main
    execute_main_task()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 456, in execute_main_task
    engine.train()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1451, in train
    self.init_train_epoch()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1476, in init_train_epoch
    self.init_new_network(new_network_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1386, in init_new_network
    self._init_network(net_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1303, in _init_network
    self.network, self.updater = self.create_network(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1344, in create_network
    network.construct_from_dict(net_dict)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 646, in construct_from_dict
    self.construct_layer(net_dict, name, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 959, in construct_layer
    layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 363, in transform_config_dict
    super(CopyLayer, cls).transform_config_dict(d, network=network, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/base.py", line 616, in transform_config_dict
    d["sources"] = [
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/base.py", line 617, in <listcomp>
    get_layer(src_name)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 3312, in __call__
    return get_layer.network.construct_layer(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 966, in construct_layer
    return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 1128, in add_layer
    layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 1042, in _create_layer
    layer = layer_class(**layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 5921, in __init__
    self.output.placeholder = self.reduce(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 5975, in reduce
    mask = x.get_sequence_mask_broadcast(axis=axis)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 5099, in get_sequence_mask_broadcast
    seq_mask = sequence_mask(size)  # (B,T)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/basic.py", line 1901, in sequence_mask
    mask = tf.sequence_mask(lengths, name=name)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 4198, in sequence_mask
    matrix = gen_math_ops.cast(expand_dims(lengths, -1), maxlen.dtype)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 365, in expand_dims
    return expand_dims_v2(input, axis, name)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 435, in expand_dims_v2
    return gen_array_ops.expand_dims(input, axis, name)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2322, in expand_dims
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 742, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3477, in _create_op_internal
    ret = Operation(
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

  File "/u/hilmes/experiments/tts_new_sis/work/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/rnn.py", line 11, in <module>
    main()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 657, in main
    execute_main_task()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/__main__.py", line 456, in execute_main_task
    engine.train()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1451, in train
    self.init_train_epoch()
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1476, in init_train_epoch
    self.init_new_network(new_network_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1386, in init_new_network
    self._init_network(net_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1303, in _init_network
    self.network, self.updater = self.create_network(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/engine.py", line 1344, in create_network
    network.construct_from_dict(net_dict)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 646, in construct_from_dict
    self.construct_layer(net_dict, name, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 959, in construct_layer
    layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 363, in transform_config_dict
    super(CopyLayer, cls).transform_config_dict(d, network=network, get_layer=get_layer)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/base.py", line 616, in transform_config_dict
    d["sources"] = [
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/base.py", line 617, in <listcomp>
    get_layer(src_name)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 3312, in __call__
    return get_layer.network.construct_layer(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 966, in construct_layer
    return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 1128, in add_layer
    layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/network.py", line 1042, in _create_layer
    layer = layer_class(**layer_desc)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 5921, in __init__
    self.output.placeholder = self.reduce(
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/layers/basic.py", line 5975, in reduce
    mask = x.get_sequence_mask_broadcast(axis=axis)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/data.py", line 5099, in get_sequence_mask_broadcast
    seq_mask = sequence_mask(size)  # (B,T)
  File "/work/asr3/rossenbach/hilmes/sisyphus_experiments/new_tts/i6_core/tools/git/CloneGitRepositoryJob.XNz9DTYuolX2/output/repository/returnn/tf/util/basic.py", line 1901, in sequence_mask
    mask = tf.sequence_mask(lengths, name=name)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 4182, in sequence_mask
    maxlen = gen_math_ops._max(lengths, _all_dimensions(lengths))
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5719, in _max
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 742, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3477, in _create_op_internal
    ret = Operation(
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

Input to output:
[<tf.Operation 'extern_data/placeholders/audio_features/audio_features_dim0_size' type=Placeholder>,
 <tf.Operation 'reduce/audio_features_dim0_size_copy_add_dim_by_tag/ExpandDims' type=ExpandDims>,
 <tf.Operation 'reduce/audio_features_dim0_size_copy_add_dim_by_tag_1/ExpandDims' type=ExpandDims>,
 <tf.Operation 'reduce/Cast' type=Cast>,
 <tf.Operation 'reduce/truediv' type=RealDiv>,
 <tf.Operation 'reduce/Sum' type=Sum>,
 <tf.Operation 'objective/loss/loss/Sum' type=Sum>]
Step meta information:
{'seq_idx': [0,
             1,
             2,
             3,
             4,
             5,
             6,
             7,
             8,
             9,
             10,
             11,
             12,
             13,
             14,
             15,
             16,
             17,
             18,
             19,
             20,
             21,
             22,
             23,
             24,
             25,
             26,
             27,
             28,
             29,
             30,
             31,
             32,
             33,
             34,
             35,
             36,
             37,
             38,
             39,
             40,
             41,
             42,
             43,
             44],
 'seq_tag': ['train-clean-100/5339-14134-0071/5339-14134-0071',
             'train-clean-100/1963-142393-0028/1963-142393-0028',
             'train-clean-100/1263-138246-0006/1263-138246-0006',
             'train-clean-100/8629-261139-0007/8629-261139-0007',
             'train-clean-100/887-123290-0030/887-123290-0030',
             'train-clean-100/254-27760-0003/254-27760-0003',
             'train-clean-100/2989-138028-0005/2989-138028-0005',
             'train-clean-100/3240-131231-0060/3240-131231-0060',
             'train-clean-100/26-496-0026/26-496-0026',
             'train-clean-100/1116-132847-0017/1116-132847-0017',
             'train-clean-100/2136-5143-0038/2136-5143-0038',
             'train-clean-100/3879-174923-0052/3879-174923-0052',
             'train-clean-100/4640-19188-0038/4640-19188-0038',
             'train-clean-100/3807-4923-0047/3807-4923-0047',
             'train-clean-100/7190-90543-0043/7190-90543-0043',
             'train-clean-100/481-123719-0015/481-123719-0015',
             'train-clean-100/4014-186175-0041/4014-186175-0041',
             'train-clean-100/328-129766-0000/328-129766-0000',
             'train-clean-100/4267-72637-0017/4267-72637-0017',
             'train-clean-100/2289-152253-0021/2289-152253-0021',
             'train-clean-100/5867-48852-0061/5867-48852-0061',
             'train-clean-100/8747-293952-0101/8747-293952-0101',
             'train-clean-100/7517-100429-0032/7517-100429-0032',
             'train-clean-100/2989-138035-0072/2989-138035-0072',
             'train-clean-100/1963-142393-0007/1963-142393-0007',
             'train-clean-100/200-124140-0012/200-124140-0012',
             'train-clean-100/2989-138035-0009/2989-138035-0009',
             'train-clean-100/198-209-0031/198-209-0031',
             'train-clean-100/7148-59157-0022/7148-59157-0022',
             'train-clean-100/4014-186175-0017/4014-186175-0017',
             'train-clean-100/5750-35690-0026/5750-35690-0026',
             'train-clean-100/2989-138028-0064/2989-138028-0064',
             'train-clean-100/8063-274116-0016/8063-274116-0016',
             'train-clean-100/1098-133695-0015/1098-133695-0015',
             'train-clean-100/8975-270782-0029/8975-270782-0029',
             'train-clean-100/7113-86041-0044/7113-86041-0044',
             'train-clean-100/254-27760-0010/254-27760-0010',
             'train-clean-100/1116-132851-0001/1116-132851-0001',
             'train-clean-100/26-496-0006/26-496-0006',
             'train-clean-100/3982-178459-0007/3982-178459-0007',
             'train-clean-100/2989-138035-0048/2989-138035-0048',
             'train-clean-100/6531-61334-0076/6531-61334-0076',
             'train-clean-100/2136-5143-0007/2136-5143-0007',
             'train-clean-100/125-121124-0003/125-121124-0003',
             'train-clean-100/6078-54013-0024/6078-54013-0024']}
Feed dict:
  <tf.Tensor 'extern_data/placeholders/audio_features/audio_features:0' shape=(?, ?, 80) dtype=float32>: shape (45, 261, 80), dtype float32, min/max -3.9974585/3.4060714, mean/stddev -0.0086959805/0.88941413, Data{'audio_features', [B,T|'audio_features_time'[B],F|F'audio_features_feature'(80)]}
  <tf.Tensor 'extern_data/placeholders/duration_data/duration_data:0' shape=(?, ?, 1) dtype=int32>: shape (45, 54, 1), dtype int32, min/max 0/40, Data{'duration_data', [B,T|'duration_data_time'[B],F|F'duration_data_feature'(1)], dtype='int32'}
  <tf.Tensor 'extern_data/placeholders/duration_data/duration_data_dim0_size:0' shape=(?,) dtype=int32>: shape (45,), dtype int32, min/max 15/54, ([15 25 20 30 27 31 35 30 27 28 28 33 24 36 37 40 32 36 21 31 22 36 32 36 36 34 46 45 34 53 31 39 49 37 41 39 38 52 35 54 42 50 52 51 47])
  <tf.Tensor 'extern_data/placeholders/phonemes/phonemes:0' shape=(?, ?) dtype=int32>: shape (45, 54), dtype int32, min/max 0/42, Data{'phonemes', [B,T|'phonemes_time'[B]], dtype='int32', sparse_dim=Dim{F'phonemes_indices'(44)}}
  <tf.Tensor 'extern_data/placeholders/phonemes/phonemes_dim0_size:0' shape=(?,) dtype=int32>: shape (45,), dtype int32, min/max 15/54, ([15 25 20 30 27 31 35 30 27 28 28 33 24 36 37 40 32 36 21 31 22 36 32 36 36 34 46 45 34 53 31 39 49 37 41 39 38 52 35 54 42 50 52 51 47])
  <tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>: bool(True)
  <tf.Tensor 'repeat/Sum:0' shape=(?,) dtype=int32>: shape (45,), dtype int32, min/max 145/261, ([145 146 147 149 152 152 153 158 158 160 162 167 168 173 174 173 178 180 184 184 185 188 190 191 195 195 196 198 198 199 212 213 215 220 221 225 229 232 237 239 249 251 253 261 261])
EXCEPTION
Traceback (most recent call last):
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1365, in BaseSession._do_call
    line: return fn(*args)
    locals:
      fn = <local> <function BaseSession._do_run.<locals>._run_fn at 0x7f8319a54c10>
      args = <local> ({<tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f8319618bf0>: array([[[-2.7788811e+00, -2.8116775e+00, -2.9827762e+00, ...,
                              -2.5127969e+00, -2.6725872e+00, -2.5381758e+00],
                             [-2.2800264e+00, -2.1537054e+00, -1.9116858e+00, ...,
                              -2.1909335e+00, -2.3426156e+0..., _[0]: {len = 7}
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1349, in BaseSession._do_run.<locals>._run_fn
    line: return self._call_tf_sessionrun(options, feed_dict, fetch_list,
                                          target_list, run_metadata)
    locals:
      self = <local> <tensorflow.python.client.session.Session object at 0x7f83197b3cd0>
      self._call_tf_sessionrun = <local> <bound method BaseSession._call_tf_sessionrun of <tensorflow.python.client.session.Session object at 0x7f83197b3cd0>>
      options = <local> None
      feed_dict = <local> {<tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f8319618bf0>: array([[[-2.7788811e+00, -2.8116775e+00, -2.9827762e+00, ...,
                                   -2.5127969e+00, -2.6725872e+00, -2.5381758e+00],
                                  [-2.2800264e+00, -2.1537054e+00, -1.9116858e+00, ...,
                                   -2.1909335e+00, -2.3426156e+00..., len = 7
      fetch_list = <local> [<tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f83195a4170>, <tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f83195a1a30>, <tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f82f5b992f0>, <tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f82f5b99330>]
      target_list = <local> [<tensorflow.python._pywrap_tf_session.TF_Operation object at 0x7f82f5b952f0>]
      run_metadata = <local> None
  File "/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1441, in BaseSession._call_tf_sessionrun
    line: return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
                                                  fetch_list, target_list,
                                                  run_metadata)
    locals:
      tf_session = <global> <module 'tensorflow.python.client.pywrap_tf_session' from '/work/asr3/rossenbach/hilmes/librosa_tf/lib/python3.8/site-packages/tensorflow/python/client/pywrap_tf_session.py'>
      tf_session.TF_SessionRun_wrapper = <global> <built-in method TF_SessionRun_wrapper of PyCapsule object at 0x7f84052928a0>
      self = <local> <tensorflow.python.client.session.Session object at 0x7f83197b3cd0>
      self._session = <local> <tensorflow.python._pywrap_tf_session.TF_Session object at 0x7f83196af830>
      options = <local> None
      feed_dict = <local> {<tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f8319618bf0>: array([[[-2.7788811e+00, -2.8116775e+00, -2.9827762e+00, ...,
                                   -2.5127969e+00, -2.6725872e+00, -2.5381758e+00],
                                  [-2.2800264e+00, -2.1537054e+00, -1.9116858e+00, ...,
                                   -2.1909335e+00, -2.3426156e+00..., len = 7
      fetch_list = <local> [<tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f83195a4170>, <tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f83195a1a30>, <tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f82f5b992f0>, <tensorflow.python._pywrap_tf_session.TF_Output object at 0x7f82f5b99330>]
      target_list = <local> [<tensorflow.python._pywrap_tf_session.TF_Operation object at 0x7f82f5b952f0>]
      run_metadata = <local> None
InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: You must feed a value for placeholder tensor 'extern_data/placeholders/audio_features/audio_features_dim0_size' with dtype int32 and shape [?]
         [[{{node extern_data/placeholders/audio_features/audio_features_dim0_size}}]]
  (1) Invalid argument: You must feed a value for placeholder tensor 'extern_data/placeholders/audio_features/audio_features_dim0_size' with dtype int32 and shape [?]
         [[{{node extern_data/placeholders/audio_features/audio_features_dim0_size}}]]
         [[_arg_extern_data/placeholders/audio_features/audio_features_0_0/_44]]

The (reduced) network looks like this:

class NARTTSModel(nn.Module):

  def __init__(
    self,
  ):
    super(NARTTSModel, self).__init__()
    self.vae_embedding = nn.Linear(nn.FeatureDim("vae_embedding", 512))
    self.embedding = nn.Linear(nn.FeatureDim("embedding_dim", 512))

  def __call__(
    self,
    text: nn.Tensor,
    durations: nn.Tensor,
    target_speech: nn.Tensor,
    time_dim: nn.Dim,
    speech_time: nn.Dim,
    duration_time: nn.Dim,
    ) -> nn.Tensor:

    durations, _ = nn.reinterpret_new_dim(
        durations, in_dim=duration_time, out_dim=time_dim
      )
    durations = nn.squeeze(durations, axis=durations.feature_dim)
    duration_int = nn.cast(durations, dtype="int32")

    vae_speaker_embedding = self.vae_embedding(target_speech)
    x = nn.reduce(vae_speaker_embedding, mode="mean", axis=speech_time, use_time_mask=True)
    x.mark_as_loss()
    emb = self.embedding(text)
    rep, rep_dim = nn.repeat(
      emb, axis=time_dim, repetitions=duration_int, out_dim=speech_time
    )
    return rep

and I uploaded a full config with data here:
https://gist.github.com/Atticus1806/3795c7193b1f022b5a1b107f4c1f28c9
Removing x here and instead marking vae_speaker_embedding as the loss does not produce this error, so the reduce layer seems to be interfering.

@albertz (Member) commented Aug 30, 2022

I don't understand why you need/use reinterpret_new_dim. If you know that duration_time is the same as time_dim, why do you need duration_time at all?

I also don't understand, why do the durations have a feature dim?

@Atticus1806 (Collaborator, Author)

This is left over from the original full config. In general it could be removed, but with the way the data is set up right now it is required.

@albertz (Member) commented Aug 30, 2022

But for reproducing the test case, is this needed? Have you prepared a standalone test case?

@Atticus1806 (Collaborator, Author)

Right now, yes, it is needed to make the test fail "successfully".
I am currently working on a standalone test case, but I have not had time to finish it yet.

@albertz (Member) commented Aug 30, 2022

Test on RETURNN-common side:

def test_reduce_repeat_1102():
  # https://github.com/rwth-i6/returnn/issues/1102
  class _NARTTSModel(nn.Module):
    # noinspection PyShadowingNames
    def __call__(
      self,
      emb: nn.Tensor,
      durations: nn.Tensor,
      target_speech: nn.Tensor,
      time_dim: nn.Dim,
      speech_time: nn.Dim,
    ) -> nn.Tensor:
      x = nn.reduce(target_speech, mode="mean", axis=speech_time)
      x.mark_as_loss()
      rep, rep_dim = nn.repeat(emb, axis=time_dim, repetitions=durations, out_dim=speech_time)
      return rep

  nn.reset_default_root_name_ctx()
  time_dim = nn.SpatialDim("time")
  speech_time = nn.SpatialDim("speech")
  emb = nn.get_extern_data(nn.Data('emb', dim_tags=[nn.batch_dim, time_dim, nn.FeatureDim('F', 1)]))
  durations = nn.get_extern_data(
    nn.Data('durations', dim_tags=[nn.batch_dim, time_dim], dtype="int32"))
  target_speech = nn.get_extern_data(
    nn.Data('target_speech', dim_tags=[nn.batch_dim, speech_time, nn.FeatureDim('speech-feat', 3)]))
  net = _NARTTSModel()
  out = net(emb, durations, target_speech, time_dim, speech_time)
  out.mark_as_default_output()
  config = nn.get_returnn_config().get_complete_py_code_str(net)

  def _make_feed_dict(extern_data):
    d = extern_data.data
    return {
      d["emb"].placeholder: [[[1.], [2.], [0.]]],
      d["emb"].size_placeholder[0]: [3],
      d["durations"].placeholder: [[1, 2, 1]],
      d["target_speech"].placeholder: [[[1., 2., 3.], [4., 5., 6.], [1., 2., 3.], [1., 2., 3.]]],
      d["target_speech"].size_placeholder[0]: [4],
    }

  dummy_run_net_single_custom(config, make_feed_dict=_make_feed_dict, eval_flag=True)

@Atticus1806 (Collaborator, Author)

Ah, now I see. When producing the test I was not sure how to manually change the feed dict, but this makes a lot of sense.

@albertz (Member) commented Aug 30, 2022

Test on RETURNN side:

def test_reduce_repeat_1102():
  # https://github.com/rwth-i6/returnn/issues/1102
  from returnn.tf.util.data import batch_dim, SpatialDim, FeatureDim

  time_dim = SpatialDim('time')
  F_dim = FeatureDim('F', 1)
  speech_dim = SpatialDim('speech')
  speech_feat_dim = FeatureDim('speech-feat', 3)

  config = Config(dict(extern_data={
    'emb': {
      'dim_tags': (batch_dim, time_dim, F_dim),
      'dtype': 'float32',
      'available_for_inference': True
    },
    'durations': {
      'dim_tags': (batch_dim, time_dim),
      'dtype': 'int32',
      'available_for_inference': True
    },
    'target_speech': {
      'dim_tags': (batch_dim, speech_dim, speech_feat_dim),
      'dtype': 'float32',
      'available_for_inference': True
    }
  }))

  net_dict = {
    'nartts_model_reduce': {
      'class': 'copy',
      'from': 'reduce',
      'loss': 'as_is',
      'out_shape': {batch_dim, speech_feat_dim}
    },
    'output': {
      'class': 'copy',
      'from': 'repeat',
      'out_shape': {batch_dim, F_dim, speech_dim}
    },
    'reduce': {
      'class': 'reduce',
      'from': 'data:target_speech',
      'mode': 'mean',
      'axis': speech_dim,
      'out_shape': {batch_dim, speech_feat_dim}
    },
    'repeat': {
      'class': 'repeat',
      'from': 'data:emb',
      'repetitions': 'data:durations',
      'axis': time_dim,
      'out_dim': speech_dim,
      'out_shape': {batch_dim, F_dim, speech_dim}
    }
  }

  with make_scope() as session:
    net = TFNetwork(config=config, eval_flag=True)
    net.construct_from_dict(net_dict)

    d = net.extern_data.data
    feed_dict = {
      d["emb"].placeholder: [[[1.], [2.], [0.]]],
      d["emb"].size_placeholder[0]: [3],
      d["durations"].placeholder: [[1, 2, 1]],
      d["target_speech"].placeholder: [[[1., 2., 3.], [4., 5., 6.], [1., 2., 3.], [1., 2., 3.]]],
      d["target_speech"].size_placeholder[0]: [4],
    }
    fetches = net.get_fetches_dict()
    session.run(fetches, feed_dict=feed_dict)
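
For intuition on why these feed values are consistent: with durations = [1, 2, 1], each of the 3 input steps of emb is repeated that many times, so the repeated output has time length sum(durations) = 4, which is exactly the length fed for target_speech (speech_time). A minimal pure-Python sketch of this per-step repetition (illustrative only; not RETURNN's actual RepeatLayer implementation):

```python
# Per-step repetition as performed conceptually by RepeatLayer,
# using the same values as the feed dict above (batch size 1, F=1).
emb = [1.0, 2.0, 0.0]   # one embedding value per input step (T=3)
durations = [1, 2, 1]   # repetitions per step ("data:durations")

# Repeat each step's value according to its duration.
rep = [v for v, d in zip(emb, durations) for _ in range(d)]

print(rep)        # [1.0, 2.0, 2.0, 0.0]
print(len(rep))   # 4 == sum(durations), the fed target_speech time length
```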

@albertz (Member) commented Aug 30, 2022

My current hypothesis:

The repeat layer out_dim might overwrite the speech_dim. So the order of layer creation matters, which is already bad. This should be fixed somehow.

So, probably the reduce layer was created first and used the original speech_dim (via extern data), then the repeat layer overwrote speech_dim, and in the end, when preparing the feed dict, there was no ref to the original speech_dim dyn_size anymore.

@albertz (Member) commented Aug 30, 2022

Note that this is also somewhat ambiguous (wrong? not well defined?) in the original code. In nn.repeat, you are setting out_dim to an already existing different dim. The more correct code would be:

rep, rep_dim = nn.repeat(emb, axis=time_dim, repetitions=duration_int)

Basically, you should almost never set out_dim or similar arguments, or only set them to a new, undefined dim; RETURNN-common would automatically create one for you anyway.
The new rep_dim is also still a different tag than speech_time. If you know they must be the same, and need that for further calculations, you can then do:

rep_dim.declare_same_as(speech_time)

But, as said, I'm not really sure if this code would actually behave just the same as your current code.

@albertz (Member) commented Aug 30, 2022

It is also somewhat ambiguous (wrong? not well defined?) in the original code. In nn.repeat, you are setting out_dim to an already existing different dim.

So, the question is, what should actually happen in this case? I.e. some layer which produces a new dim (like RepeatLayer) gets some existing defined out_dim. Options:

  • Automatically create a new dim internally and call declare_same_as on it with the given out_dim. I think some (most?) other layers already behave this way.
  • Throw an error?

Actually, I think the first option is mostly what we already have, so let's keep it that way. But that just moves the question over to declare_same_as: what should happen there when it would merge two already-defined dims? Which dyn_size will be used?

Looking at RepeatLayer, it currently does:

    if isinstance(repetitions, int):
      out_dim_ = tag * repetitions
    else:
      out_dim_ = Dim(description="repeated:%s" % name, kind=tag.kind, derived_from_tag=tag, auto_generated=True)
    if out_dim:
      out_dim_.declare_same_as(out_dim)

Maybe the order of the declare_same_as call matters? If the argument (out_dim) is already defined, it would use that? This would be the correct thing here. So then it is the responsibility of the layer implementation. Looking at ConvLayer, we have:

    ...
      out_spatial_dims_ = output.dim_tags[num_batch_dims + 1:]
    ...
    if out_spatial_dims:
      assert len(out_spatial_dims_) == len(out_spatial_dims)
      for i, (out_spatial_dim_, out_spatial_dim) in enumerate(zip(out_spatial_dims_, out_spatial_dims)):
        out_spatial_dim_.declare_same_as(out_spatial_dim)

So again it is correct.
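
To make this policy concrete, here is a toy sketch of the intended declare_same_as behavior (hypothetical class and names, not RETURNN's actual Dim implementation): when merging two dims, the side that already has a defined dyn_size becomes the base, so the placeholder registered for extern data stays reachable:

```python
class Dim:
    """Toy stand-in for a RETURNN dim tag (illustrative only)."""

    def __init__(self, description, dyn_size=None):
        self.description = description
        self.dyn_size = dyn_size  # e.g. the seq-lengths placeholder
        self.same_as = None

    def get_same_base(self):
        # Follow the same_as chain to the canonical dim.
        d = self
        while d.same_as is not None:
            d = d.same_as
        return d

    def declare_same_as(self, other):
        base_self = self.get_same_base()
        base_other = other.get_same_base()
        if base_self is base_other:
            return
        # Policy discussed above: prefer the already-defined side as the
        # base, so its dyn_size (placeholder) is never dropped.
        if base_other.dyn_size is not None or base_self.dyn_size is None:
            base_self.same_as = base_other
        else:
            base_other.same_as = base_self


# speech_time comes from extern data and already has a dyn_size placeholder.
speech_time = Dim("speech", dyn_size="speech_lengths_placeholder")
# RepeatLayer creates a fresh dim, then declares it equal to out_dim.
rep_dim = Dim("repeated:repeat")
rep_dim.declare_same_as(speech_time)

assert rep_dim.get_same_base() is speech_time
assert rep_dim.get_same_base().dyn_size == "speech_lengths_placeholder"
```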

@albertz (Member) commented Aug 30, 2022

Ongoing PR is in #1104.

@albertz (Member) commented Aug 30, 2022

It should be fixed now.
