Redesign of network construction logic #1128

Open
albertz opened this issue Sep 22, 2022 · 0 comments
Related is the flat net construction logic (#992). However, I think its current implementation is too convoluted and too messy: it uses exceptions to fill the queue of layers to construct.

Rather, I think that in the layer transform_config_dict function, get_layer would always return a template layer (maybe the existing _TemplateLayer class), and at the same time make an entry in the layer construction queue. Then, in the outermost loop, we would get the next layer from the queue, and repeat until the queue is empty. At that point we have built up the complete layer graph.
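A minimal sketch of this queue-based template construction. All names here (TemplateLayer, construct_templates, the simplified handling of the net dict and its "from" entries) are hypothetical stand-ins for the real RETURNN internals such as _TemplateLayer and transform_config_dict:

```python
from collections import deque


class TemplateLayer:
    """Hypothetical placeholder for RETURNN's _TemplateLayer."""

    def __init__(self, name, layer_dict):
        self.name = name
        self.layer_dict = layer_dict
        self.deps = []  # dependencies, resolved via get_layer


def construct_templates(net_dict, output_name="output"):
    """Build the complete template layer graph, starting from the output."""
    templates = {}
    queue = deque()

    def get_layer(name):
        # Always return a template layer; enqueue it on first request.
        # Returning the existing template for an already-seen name also
        # handles circular dependencies (e.g. "prev:..." in a rec layer):
        # the layer is simply not enqueued again.
        if name not in templates:
            templates[name] = TemplateLayer(name, net_dict[name])
            queue.append(templates[name])
        return templates[name]

    get_layer(output_name)
    while queue:
        tmpl = queue.popleft()
        # Stand-in for transform_config_dict: resolve "from" entries into
        # dependencies, which enqueues new layers as a side effect.
        for src_name in tmpl.layer_dict.get("from", []):
            tmpl.deps.append(get_layer(src_name))
    return templates
```

Note that no shapes or TF ops are involved here; the loop only discovers the graph.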

Remember that transform_config_dict is mostly there to resolve the dependencies and to do some light preparation of the layer config dict.

Also remember that this recurses into subnetworks (SubnetworkLayer, RecLayer, CondLayer), which makes it more difficult.

This builds up the layer graph. The actual shapes etc. are all irrelevant at this point. But now that we have the layer graph, we also know in which order we need to start the actual construction.

Note that circular dependencies via transform_config_dict, e.g. due to "prev:..." in the rec layer, are not really a problem at this point. If we see that some layer is already in the construction queue, or that its template has been constructed before, we simply skip over it.

We should try to avoid calling transform_config_dict multiple times, as it might create bigger sub-structures such as SubnetworkRecCell or Subnetwork. When we actually get to the point of constructing a layer, we would replace the template layer by a real instance.

It's a bit unclear when to handle get_out_data_from_opts. We could do a second pass, still purely based on the templates, to resolve that. Here we run into the problem of circular dependencies in the rec layer, and we need heuristics similar to what we have currently. Simplifying these heuristics for the rec layer subnet template construction is a topic of its own, and maybe the redesign here does not really influence it much (I'm not sure; separate issue: #1129). For the rec layer, we also need to have called get_out_data_from_opts on all layers of its subnet such that we know the shapes of the rec state. So this second pass through the network (now forward instead of backward) would call all get_out_data_from_opts and fill in the Data on the template layers.
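As a rough sketch of this second pass, assuming template objects with name and deps attributes from a first pass, and treating get_out_data_from_opts as a plain callback: a depth-first forward traversal that fills in the output Data, marking each template as visited before recursing so that "prev:"-style cycles do not loop forever. All names are hypothetical simplifications:

```python
def fill_out_data(templates, get_out_data_from_opts):
    """Second pass: compute output Data for every template layer.

    templates: dict name -> template (with .name and .deps attributes).
    get_out_data_from_opts: callback computing the output for one template.
    """
    visited = set()

    def visit(tmpl):
        if tmpl.name in visited:
            return
        # Mark before recursing: a back edge (circular dependency, e.g.
        # via "prev:" in a rec layer) is then skipped, which is exactly
        # where the current-style heuristics would have to kick in,
        # since that dependency's Data may not be filled in yet.
        visited.add(tmpl.name)
        for dep in tmpl.deps:
            visit(dep)
        tmpl.out_data = get_out_data_from_opts(tmpl)

    for tmpl in templates.values():
        visit(tmpl)
```

For an acyclic graph this yields a plain post-order, i.e. every layer's dependencies have their Data before the layer itself is resolved.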

So far, this is all template logic, and no actual TF operation has been created or touched. Actually, this logic is completely independent of the backend (Theano, TF, PyTorch). So when we do this implementation, maybe we can make it backend-independent directly (thus also relevant for PyTorch, #1120).

Then, a third pass would actually construct the layers.
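A sketch of what that third pass could look like, assuming the templates already carry their resolved deps, and a hypothetical make_real_layer factory standing in for the backend-specific layer construction. Since the graph is known from the template passes, each layer's sources can be constructed before the layer itself (cycle handling via "prev:" is omitted here for brevity):

```python
def construct_real_layers(templates, make_real_layer):
    """Third pass: replace template layers by real layer instances.

    make_real_layer: factory taking (template, real_sources) and
    returning the actual (backend-specific) layer object.
    """
    real = {}

    def construct(tmpl):
        if tmpl.name not in real:
            # Construct all source layers first, then the layer itself.
            sources = [construct(dep) for dep in tmpl.deps]
            real[tmpl.name] = make_real_layer(tmpl, sources)
        return real[tmpl.name]

    for tmpl in templates.values():
        construct(tmpl)
    return real
```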

The recent problem with too slow net construction (#1127) led to this issue, although #1127 will maybe be solved in a different way, since a redesign as proposed here would probably be a larger undertaking.
