Hello,
For the moment, I have to do each of these steps manually in my own package with this kind of function:

```julia
function Flux.testmode!(lh::LearnedHeuristic, mode = true)
    lh.agent.policy.explorer.is_training = !mode  # freeze the explorer's epsilon value
    lh.trainMode = !mode                          # stop filling the trajectory with evaluation samples
    Flux.testmode!(lh.agent, mode)                # stop updating the weights and biases
end
```

I am certainly doing something wrong, can someone help me?
Replies: 1 comment 1 reply
Unfortunately, no. And this is by design. Because in RL, `testmode!` is kind of vague. For example, in some cases we may still want to explore the action space with a small epsilon value, not simply set it to zero.

Assuming you've read the tutorial, you'll see an `Agent` is a wrapper of an `AbstractPolicy`, and it is in the training mode naturally. So to test the `Agent`, I usually extract the inner policy and use it to interact with an environment (of course, I still need to do some extra work here, like modifying the exploration rate and setting the model to `testmode!`).

For example:

ReinforcementLearning.jl/src/ReinforcementLearningExperiments/deps/experiments/experiments/DQN/Dopamine_DQN_Atari.jl

Note that the second line created another instance, though I reused the symbol

So back to your question, I would remove the constraint of

Let me know if you are still unsure how to do it.
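To make the "extract the inner policy and run it against the environment" idea above concrete, here is a minimal sketch for a DQN-style agent. It assumes the common field layout (`agent.policy` being a `QBasedPolicy` with `learner` and `explorer` fields, and the learner exposing its Flux network via `approximator.model`); `evaluate` and `my_env` are placeholder names, and exact field names may differ between versions of ReinforcementLearning.jl.

```julia
# Minimal sketch of evaluating a trained Agent by pulling out its inner policy.
# Field names (policy, learner, explorer, approximator.model) follow the common
# DQN setup in ReinforcementLearning.jl but may differ between versions;
# `my_env` stands for whatever AbstractEnv you trained on.
using ReinforcementLearning
using Flux

function evaluate(agent::Agent, my_env; n_episodes = 10)
    # Take the inner policy out of the Agent, so no trajectory is filled
    # and no parameter update happens during evaluation.
    policy = agent.policy

    # Build an evaluation policy with a greedy explorer instead of the
    # (possibly still decaying) training explorer. Swap in an
    # EpsilonGreedyExplorer with a small stable ϵ if you still want
    # a little exploration, as mentioned above.
    eval_policy = QBasedPolicy(
        learner = policy.learner,
        explorer = GreedyExplorer(),
    )

    # Put the underlying Flux model into test mode
    # (freezes Dropout/BatchNorm behaviour).
    Flux.testmode!(policy.learner.approximator.model)

    # Run the evaluation policy against the environment and record rewards.
    hook = TotalRewardPerEpisode()
    run(eval_policy, my_env, StopAfterEpisode(n_episodes), hook)
    return hook.rewards
end
```

The point of building `eval_policy` as a separate instance is that the trained learner is shared, so evaluation never touches the training loop's trajectory or updates.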