Dim tag description and identifier name inconsistent and not optimal #119

Some current examples:

```
    self.filter_size = [
      s if isinstance(s, nn.Dim) else nn.SpatialDim(f"filter-dim{i}", s)
      for i, s in enumerate(filter_size)]
```

```
    out_spatial_dims = _default_out_spatial_dims(
      description_prefix=nn.NameCtx.current_ctx().layer_abs_name_scope,
...
```

```
    out_spatial_dims = _default_out_spatial_dims(
      description_prefix=nn.NameCtx.current_ctx().get_abs_name(),
...
```

```
    out_spatial_dims = [
      nn.SpatialDim(f"{nn.NameCtx.current_ctx().layer_abs_name_scope}:out-spatial-dim{i}")
      for i, s in enumerate(self.filter_size)]
```

```
    if isinstance(num_heads, int):
      num_heads = nn.SpatialDim("num_heads", num_heads)
```

```
    expand_dim = nn.SpatialDim("self_att_expand_dim_init", 0)
```

```
    hist_dim = nn.SpatialDim(f"{nn.NameCtx.current_ctx().layer_abs_name_scope}:history")
```

The descriptions of these dim tags vary widely, which is inconsistent and not nice. It also makes debugging more difficult.

Further, and maybe even more importantly, the description is currently used to derive the Python identifier for the Python serialization of the config. This then leads to something like:
```
time_dim = SpatialDim('time')
input_dim = FeatureDim('input', 10)
_3_input_dim = 3 * input_dim
num_heads_dim = SpatialDim('num_heads', 2)
truediv_left_input__num_heads__dim = input_dim.div_left(num_heads_dim)
_3__truediv_left_input__num_heads___dim = 3 * truediv_left_input__num_heads__dim
encoder_layers_0_self_attn_history_dim = SpatialDim('encoder/layers/0/self_attn:history')
input_4_dim = input_dim * 4
encoder_layers_1_self_attn_history_dim = SpatialDim('encoder/layers/1/self_attn:history')
target_dim = FeatureDim('target', 7)
decoder_layers_0_self_attn_history_dim = SpatialDim('decoder/layers/0/self_attn:history')
decoder_layers_1_self_attn_history_dim = SpatialDim('decoder/layers/1/self_attn:history')
loop_dim = SpatialDim('loop-dim')
```
Or:
```
time_dim = SpatialDim('time')
input_dim = FeatureDim('input', 10)
dummy_input_feature_dim = FeatureDim('dummy-input-feature-dim', 1)
filter_dim0_dim = SpatialDim('filter-dim0', 3)
filter_dim1_dim = SpatialDim('filter-dim1', 3)
intermediate_out_sub_sample_dim = FeatureDim('intermediate_out_sub_sample', 14)
conv_subsample_layer_out_spatial_dim0_dim = time_dim.ceildiv_right(2)
conv_subsample_layer_out_spatial_dim1_dim = input_dim // 2
filter_dim0_0_dim = SpatialDim('filter-dim0', 3)
filter_dim1_0_dim = SpatialDim('filter-dim1', 3)
out_dim = FeatureDim('out', 14)
conv_subsample_layer_out_spatial_dim0_0_dim = conv_subsample_layer_out_spatial_dim0_dim.ceildiv_right(2)
conv_subsample_layer_out_spatial_dim1_0_dim = conv_subsample_layer_out_spatial_dim1_dim.ceildiv_right(2)
conv_subsample_layer_out_dim = SpatialDim('conv_subsample_layer:out_dim')
ff_dim = FeatureDim('ff', 17)
_3_out_dim = 3 * out_dim
num_heads_dim = SpatialDim('num_heads', 2)
truediv_left_out__num_heads__dim = out_dim.div_left(num_heads_dim)
_3__truediv_left_out__num_heads___dim = 3 * truediv_left_out__num_heads__dim
layers_0_self_att_history_dim = SpatialDim('layers/0/self_att:history')
_2_out_dim = 2 * out_dim
filter_dim0_1_dim = SpatialDim('filter-dim0', 32)
out__14_dim = out_dim // 14
layers_1_self_att_history_dim = SpatialDim('layers/1/self_att:history')
```
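
To illustrate how such identifiers can come out of the descriptions, here is a minimal sketch. It is not the actual serialization code; the function name `identifier_from_description` and the exact sanitization and de-duplication rules are assumptions, chosen only so that the result resembles the output above.

```python
import re
from typing import Set


def identifier_from_description(description: str, used: Set[str]) -> str:
    """Hypothetical helper: derive a unique Python identifier from a dim tag description."""
    # Replace every character that is not valid in an identifier by "_".
    base = re.sub(r"[^A-Za-z0-9_]", "_", description)
    # An identifier must not be empty and must not start with a digit.
    if not base or base[0].isdigit():
        base = "_" + base
    name = base + "_dim"
    # On a name clash, insert a running counter before the "_dim" suffix.
    i = 0
    while name in used:
        name = f"{base}_{i}_dim"
        i += 1
    used.add(name)
    return name


used_names: Set[str] = set()
for desc in ["filter-dim0", "filter-dim0", "encoder/layers/0/self_attn:history", "3*input"]:
    print(identifier_from_description(desc, used_names))
```

Under these assumptions, two distinct dims with the same description (`filter-dim0`) can only be told apart by a counter suffix, and any change to a description also changes the identifier in the serialized config.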

As long as we have not dealt with explicit hashing (#51), this is a problem for Sisyphus hashing, because the logic of this code (names, descriptions) will probably still change.

Some other issues:

* Some of the dim tag descriptions (names) lack the context in which they were created.
* There is no good way to have a consistent context, because a dim tag can be created in a module `__init__` (which is not a call), in a module `__call__`, or in a plain functional API (e.g. `pool`). Thus we have both `nn.NameCtx.current_ctx().layer_abs_name_scope` and `nn.NameCtx.current_ctx().get_abs_name()` as dim tag description prefixes.
* The default description of dim tag arithmetic (e.g. `3 * input_dim`) is just the expression itself, which is reasonable. However, we should probably override it explicitly where it is used, to add the context and meaning back. E.g. in self-attention, this one is `qkv`.
* We could derive a description from the attribute name of a module, if the dim tag was assigned to a module. But this is not always the case.
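
One possible direction, sketched here only to make the points above concrete: build all descriptions through a single helper, so that the context prefix and the role name always follow the same pattern. The helper `dim_description` and the `<scope>:<role><index>` convention are assumptions for illustration, not an existing API.

```python
from typing import Optional


def dim_description(scope: str, role: str, index: Optional[int] = None) -> str:
    """Hypothetical helper: build a dim tag description as '<scope>:<role><index>'."""
    # Normalize the role so that "out-spatial" and "out_spatial" give the same description.
    role = role.replace("-", "_")
    if index is not None:
        role = f"{role}{index}"
    # Without a scope (e.g. a dim created outside of any name context), use the role alone.
    return f"{scope}:{role}" if scope else role


print(dim_description("encoder/layers/0/self_attn", "history"))
print(dim_description("conv_subsample_layer", "out-spatial", 1))
print(dim_description("encoder/layers/0/self_attn", "qkv"))
```

The scope string would still have to come from somewhere (e.g. one of the two `nn.NameCtx` variants above), so this only addresses the formatting, not the question of which context to use.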

