Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign of rec layer subnet template construction heuristics #1129

Closed
albertz opened this issue Sep 22, 2022 · 6 comments
Closed

Redesign of rec layer subnet template construction heuristics #1129

albertz opened this issue Sep 22, 2022 · 6 comments

Comments

@albertz
Copy link
Member

albertz commented Sep 22, 2022

Related is the redesign of the main net construction (#1128). However, for the rec layer subnet, we need some special handling in any case, due to potential circular dependencies (via "prev:...").

The current heuristics are written in a very complicated and somewhat messy way, and it seems they can lead to very strange behavior, like very slow construction, or maybe infinite loops (#1127). Such redesign is intended to clean up and simplify the code and also solve #1127 and similar problems.

How to do it?

@albertz
Copy link
Member Author

albertz commented Sep 22, 2022

It's not really so obvious how to do it. It's even not so obvious what cases to cover. The code has grown kind of evolutionary and just works now for all existing net dicts, but behaves bad (non-obvious) when there are bugs, or even buggy / extremely unusable slow (#1127).

@albertz
Copy link
Member Author

albertz commented Sep 22, 2022

Maybe one problem currently: transform_config_dict and get_out_data_from_opts is done jointly. We need transform_config_dict to figure out the dependencies, and get_out_data_from_opts to figure out the Data templates (shapes). E.g. the ChoiceLayer could ignore the input and just infer the shape (when target is given) but that would then not correctly define the dependencies (for beam search). But getting a dummy input would also be fine, so that's not really a problem in principle. So not sure.

albertz added a commit that referenced this issue Sep 22, 2022
Also make it much faster in case of exceptions,
to not keep trying again and again.

Related: #1129.
Fix #1127.
albertz added a commit that referenced this issue Sep 22, 2022
Also make it much faster in case of exceptions,
to not keep trying again and again.

Related: #1129.
Fix #1127.
@albertz
Copy link
Member Author

albertz commented Sep 22, 2022

#1130 actually did some redesign. Maybe we can close this for now.

@albertz albertz closed this as completed Sep 22, 2022
@patrick-wilken
Copy link
Contributor

This maybe a bit naive as this whole loop construction code is complicated and I don't have a good understanding of it, but wouldn't it be a good long term goal to get rid of this try-except approach to layer construction and make it more explicit? So shouldn't it be possible to formulate the conditions under which a layer can be constructed and if it doesn't this is a fatal error in any case? This condition would be that all dependency layers are already constructed in my mind (which should be the ones where get_layer() is used in tranform_config_dict()?), but surely it's more complicated. 😅
Right now, because we catch any exception as part of the construction logic debugging recurrent nets that don't build is really pain. You get a huge output of tracebacks and have to figure out what are expected errors and what are the actual bugs that have to be solved. (I actually regularly get error reports where people point to an irrelevant part of the error log....)

@albertz
Copy link
Member Author

albertz commented Sep 27, 2022

wouldn't it be a good long term goal to get rid of this try-except approach to layer construction and make it more explicit

Yes sure. With returnn-common, we already kind of have that. Always initial outputs (initial state) are explicit, and it always sets initial_output.

Then of course, we never want to break old configs. So it means we must somehow keep this logic.

So shouldn't it be possible to formulate the conditions under which a layer can be constructed and if it doesn't this is a fatal error in any case?

The problem is a bit different. You have circular dependencies. And in the beginning, everything is unknown, no Data is really known. So where do you start, and where can you use incomplete information?

You get a huge output of tracebacks ...

Yes. But actually when you go through them, some (many) of them are valid errors, and we can fix those, and reduce the number of exceptions. It's getting less and less. It's mostly that no-one (except me) so far has cared to improve and work on this. Maybe you want to do this?

@albertz
Copy link
Member Author

albertz commented Sep 30, 2022

@patrick-wilken Actually, regarding getting rid of it, we can do that when the net dict is prepared accordingly, which is the case for returnn_common (but you could also do it by hand), see #1138 for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants