Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_rnn_size vs custom ActorCritic cores #279

Open
alex404 opened this issue Jul 20, 2023 · 1 comment
Open

get_rnn_size vs custom ActorCritic cores #279

alex404 opened this issue Jul 20, 2023 · 1 comment

Comments

@alex404
Copy link
Contributor

alex404 commented Jul 20, 2023

So I've been using my own custom ActorCritic class, and one thing I've noticed is that the get_rnn_size function plays a low-level role initializing the learners. This makes it a bit tricky to create custom cores, because they have to play nice with the Config variables like use_rnn and rnn_size.

In practice this isn't too hard to work around, but intuitively it would be nicer if get_rnn_size could be e.g. a method of a Core. The problem is of course that get_rnn_size is accessed before the ActorCritics are initialized, so it's not entirely trivial to solve. Do you think this could be improved? If you point out a way to make this idiomatic with your code I can write it up.

@alex-petrenko
Copy link
Owner

Good point!

The rnn size (which should really be called "hidden state size") is called early to allocate a proper amount of space for the tensor to store all the hidden states of all agents.

I believe this function implements some hacky logic where we return rnn_size for GRU and 2*rnn_size for LSTM or something like that.

I don't know what's the best way to work around that, but how's that for an idea:
Make this function a method of a base "core" class, by default just call the current function. This will allow custom code to override core the API.

Then whenever you need get_rnn_size before the actor critic initialization, maybe just instantiate a local version of the core really quickly just to call this method? Maybe not ideal but it solves the problem?

Alternatively (maybe much easier to implement), maybe make an additional parameter hidden_state_size which should be -1 by default, and if it is -1 then we call the existing function calculating hidden state from rnn_size.
Otherwise we use the value of this parameter. Let me know if this works. I'll be happy to review the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants