Redesign of rec layer subnet template construction heuristics #1129
Comments
It's not really obvious how to do it. It's not even obvious which cases to cover. The code has grown in an evolutionary way and now works for all existing net dicts, but it behaves badly (non-obviously) when there are bugs, or is even buggy or unusably slow (#1127).
Maybe one problem currently:
#1130 actually did some redesign. Maybe we can close this for now.
This may be a bit naive, as this whole loop construction code is complicated and I don't have a good understanding of it, but wouldn't it be a good long-term goal to get rid of this try-except approach to layer construction and make it more explicit? Shouldn't it be possible to formulate the conditions under which a layer can be constructed, so that a failure to construct it is a fatal error in any case? In my mind, this condition would be that all dependency layers are already constructed (which should be the ones where
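To make the "explicit conditions" idea concrete, here is a minimal sketch on a toy net dict. It is not RETURNN's actual construction code: the helper `construction_order` is hypothetical, it only looks at the `"from"` key, and it treats any unknown dependency or any cycle as a fatal error instead of retrying.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def construction_order(net_dict):
    """Hypothetical helper: return layer names so that every layer comes after
    all layers listed in its "from" entry. Unknown or circular dependencies
    are fatal errors; there is no try-except retry loop."""
    graph = {}
    for name, layer in net_dict.items():
        deps = layer.get("from", [])
        if isinstance(deps, str):
            deps = [deps]
        for dep in deps:
            if dep not in net_dict:
                raise KeyError(f"layer {name!r} depends on unknown layer {dep!r}")
        graph[name] = set(deps)
    # static_order() raises CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


# Toy example; the layer names and the "class" values are placeholders.
toy_net = {
    "a": {"class": "toy", "from": []},
    "b": {"class": "toy", "from": "a"},
    "c": {"class": "toy", "from": ["a", "b"]},
}
order = construction_order(toy_net)
```

With this formulation, a layer is constructed only once all of its dependencies exist, and anything else is reported immediately. It deliberately ignores circular dependencies, which is exactly the case that needs special handling.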
Yes sure. With returnn-common, we already kind of have that. Always initial outputs (initial state) are explicit, and it always sets Then of course, we never want to break old configs. So it means we must somehow keep this logic.
The problem is a bit different. You have circular dependencies. And in the beginning, everything is unknown, no
Yes. But actually, when you go through them, some (many) of them are valid errors, and we can fix those and reduce the number of exceptions. It's getting less and less. It's mostly that no one (except me) has so far cared to improve and work on this. Maybe you want to do this?
@patrick-wilken Actually, regarding getting rid of it, we can do that when the net dict is prepared accordingly, which is the case for returnn_common (but you could also do it by hand), see #1138 for this.
Related is the redesign of the main net construction (#1128). However, for the rec layer subnet, we need some special handling in any case, due to potential circular dependencies (via "prev:...").
The current heuristics are written in a very complicated and somewhat messy way, and it seems they can lead to very strange behavior, like very slow construction or maybe infinite loops (#1127). Such a redesign is intended to clean up and simplify the code, and also to solve #1127 and similar problems.
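As a rough illustration of why "prev:..." needs special handling, the sketch below splits the dependencies of a toy rec subnet into same-step edges and previous-step edges. The function `plan_rec_subnet` and the toy layer names are assumptions made for illustration, not RETURNN code. The idea shown is only that "prev:" edges do not constrain the order within one step, but the layers they point to need a known template (shape, initial output) before anything else can be constructed.

```python
from graphlib import TopologicalSorter  # Python 3.9+

PREV_PREFIX = "prev:"


def plan_rec_subnet(net_dict):
    """Hypothetical helper: return (order, needs_template).

    order: layer names sorted by same-step dependencies only.
    needs_template: layers referenced via "prev:...", whose output template
    must be known up front because it is read before the layer is built.
    """
    same_step = {}
    needs_template = set()
    for name, layer in net_dict.items():
        deps = layer.get("from", [])
        if isinstance(deps, str):
            deps = [deps]
        same_step[name] = set()
        for dep in deps:
            if dep.startswith(PREV_PREFIX):
                # Edge into the previous time step: it breaks the cycle.
                needs_template.add(dep[len(PREV_PREFIX):])
            else:
                same_step[name].add(dep)
    # Raises graphlib.CycleError if a cycle remains without any "prev:" edge.
    order = list(TopologicalSorter(same_step).static_order())
    return order, needs_template


# Toy subnet: "s" reads the previous "output", and "output" reads "s".
toy_subnet = {
    "x": {"from": []},
    "s": {"from": ["prev:output", "x"]},
    "output": {"from": "s"},
}
order, needs_template = plan_rec_subnet(toy_subnet)
```

In this toy case, only "output" needs a template up front, and the remaining layers can be ordered without any trial and error. The hard part in practice is that the template itself may not be known without constructing other layers first, which this sketch does not attempt to solve.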
How to do it?