-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add evaluation worker #43
Comments
This would involve adding an evaluation/render TaskType? |
The simplest thing I can think of is to just make one of the actor workers a "special" one. I.e. add some sort of flag that will also save environment frames to some memory location, or to disk, instead of just sending them to the learner. It can do it every x minutes (instead of every trajectory) to avoid using too many resources. For the environments that do not have pixel observations we can additionally do something like render(mode='rgb_array') to get the rendered frames to begin with. Once the episode trajectory is collected, the actor worker can send it to the master process which currently writes all the summaries. Gifs can be generated with https://github.com/alex-petrenko/sample-factory/blob/master/utils/gifs.py |
Hi, I found this thread looking for any details on how to periodically evaluate the model on a separate evaluation environment during training, so we can compare training and validation on the same tensorboard. Would you comment on how you'd go about it and I could submit a PR? |
@signalprime An important requirement is that users should not pay for the evaluation worker if they don't use it. So if the option is disabled there should be an absolute minimum of overhead for supporting this feature. The existing mechanism for sending summaries to the main process can also be reused (see report_queue). The reports from the evaluation worker can be marked with a special flag such that summaries can be written with separate keys (I'd just add eval_ prefix to everything to have a completely separate TB section). An alternative option is to just turn one of the existing workers into an evaluation worker using a set of flags. This can probably be a lot more simple than adding another class and another process.
The above options can be switched on/off separately to allow for a more flexible configuration. |
hi @alex-petrenko, The second idea does sound better. I've not forgotten about this thread, there are a few elements I'm trying to understand for the changes. I've forked the project and have been reviewing what needs to be done. Inside APPO.init():
So we have an additional summary dir that will overlap training summaries. Creating the evaluation ActorWorker:
Things get a little more complex after this. I'm questioning if we need to add logic inside the following definitions, or create new ones with the 'eval' prefix or suffix:
It seems the following need modified copies:
Could you talk about what items, other than cfg for environment variables, need changes inside the ActorWorker class? Thanks! |
Hi @signalprime ! Thank you for looking into this! I guess if you like the second idea of just repurposing the ActorWorker class, maybe it makes sense to reuse the existing infrastracture for creating the ActorWorkers, as well as shared buffers, queues, etc. Again, if you do this, you probably don't need to worry about
|
Perfect, yes, so if the evaluation flag is enabled, we'll assume the last worker is for evaluation. |
It's called on line 461 in appo.py in the function init_workers():
self.actor_workers = []
max_parallel_init = int(1e9) # might be useful to limit this for some envs
worker_indices = list(range(self.cfg.num_workers))
for i in range(0, self.cfg.num_workers, max_parallel_init):
workers = self.init_subset(worker_indices[i:i +
max_parallel_init], actor_queues)
self.actor_workers.extend(workers)
init_workers is called in the run() function in the same file
вт, 5 янв. 2021 г. в 12:02, signalprime <notifications@github.com>:
… Hi @alex-petrenko <https://github.com/alex-petrenko> , happy 2021! I have
been studying the code for the changes needed, specifically inside APPO
<https://github.com/alex-petrenko/sample-factory/blob/master/algorithms/appo/appo.py>,
and I can't seem to find where init_subset
<https://github.com/alex-petrenko/sample-factory/blob/bc8d1cc5b9733566d7919ea5979c8871abc5275d/algorithms/appo/appo.py#L328>
is called externally. We see here
<https://github.com/alex-petrenko/sample-factory/blob/bc8d1cc5b9733566d7919ea5979c8871abc5275d/algorithms/appo/appo.py#L461>
it's referenced from inside itself, but that's the only reference listed.
Can you explain how it's
<https://github.com/alex-petrenko/sample-factory/blob/bc8d1cc5b9733566d7919ea5979c8871abc5275d/algorithms/appo/appo.py#L328>
being called originally? Thanks!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#43 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABJ6HLZRJAKAC2RR23PFWATSYNV7BANCNFSM4PWOYKTQ>
.
|
Thank you Alex. I deleted my post after finding the location as well. My greatest compliments go to you because Sample Factory is truly a work of art. I find it to be 10x faster than other implementations. |
No worries! Thank you, I'm glad you find SF helpful! I wish I had more time
to work on it and turn it into a proper framework, but it is what it is
now. I understand the code is convoluted and hard to read (the price you
pay for optimization)
чт, 7 янв. 2021 г. в 08:39, signalprime <notifications@github.com>:
… Thank you Alex. I deleted my post after finding the location as well. My
greatest compliments go to you because Sample Factory is truly a work of
art. I find it to be 10x faster than other implementations.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#43 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABJ6HL7TM2HRYVAPDK2MUUDSYXPSRANCNFSM4PWOYKTQ>
.
|
This issue is stale because it has been open for 30 days with no activity. |
This issue was closed because it has been inactive for 14 days since being marked as stale. |
Re opened to determine requirements |
Hello, is this planned for implementation. This is a very good/important feature to have |
Yes, I would like to have this in 2.1.0 Any feedback would be very helpful. I.e. what's your use case, how would you use this feature, what implementation details are important. There is no definite timeline for implementing this, but most likely in the next couple of months. |
Ideally, we want a worker that can render an episode every x minutes and post a gif animation or video on Tensorboard.
The best solution (I think) is to repurpose the existing ActorWorker for this. We can just add a regime where it also saves the environment frames (rendering them if necessary).
The text was updated successfully, but these errors were encountered: