Hello,
For the moment, I have to do each of these steps manually in my own package with this kind of function:

```julia
function Flux.testmode!(lh::LearnedHeuristic, mode = true)
    lh.agent.policy.explorer.is_training = !mode  # freeze the explorer's epsilon value
    lh.trainMode = !mode                          # stop filling the trajectory with evaluation samples
    Flux.testmode!(lh.agent, mode)                # stop updating the weights and biases
end
```

I am certainly doing something wrong, can someone help me?
Replies: 1 comment 1 reply
Unfortunately, no. And this is by design. Because in RL, `testmode!` is kind of vague. For example, in some cases we may still want to explore the action space with a small epsilon value, not simply set it to zero.

Assuming you've read the tutorial, you'll see an `Agent` is a wrapper of an `AbstractPolicy`, and it is in the training mode naturally. So to test the `Agent`, I usually extract the inner policy and use it to interact with an environment (of course, I still need to do some extra work here, like modifying the exploration rate and setting the model to `testmode!`).

For example:

ReinforcementLearning.jl/src/ReinforcementLearningExperiments/deps/experiments/experiments/DQN/Dopamine_DQN_Atari.jl

Note that the second line created another instance, though I reused the symbol

So back to your question, I would remove the constraint of

Let me know if you are still unsure how to do it.
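To make the "extract the inner policy and run it against the environment" idea above concrete, here is a minimal sketch for a DQN-style agent. It assumes the common field layout (`agent.policy` being a `QBasedPolicy` with `learner` and `explorer` fields, and the learner exposing its Flux network via `approximator.model`); `evaluate` and `my_env` are placeholder names, and exact field names may differ between versions of ReinforcementLearning.jl.

```julia
# Minimal sketch of evaluating a trained Agent by pulling out its inner policy.
# Field names (policy, learner, explorer, approximator.model) follow the common
# DQN setup in ReinforcementLearning.jl but may differ between versions;
# `my_env` stands for whatever AbstractEnv you trained on.
using ReinforcementLearning
using Flux

function evaluate(agent::Agent, my_env; n_episodes = 10)
    # Take the inner policy out of the Agent, so no trajectory is filled
    # and no parameter update happens during evaluation.
    policy = agent.policy

    # Build an evaluation policy with a greedy explorer instead of the
    # (possibly still decaying) training explorer. Swap in an
    # EpsilonGreedyExplorer with a small stable ϵ if you still want
    # a little exploration, as mentioned above.
    eval_policy = QBasedPolicy(
        learner = policy.learner,
        explorer = GreedyExplorer(),
    )

    # Put the underlying Flux model into test mode
    # (freezes Dropout/BatchNorm behaviour).
    Flux.testmode!(policy.learner.approximator.model)

    # Run the evaluation policy against the environment and record rewards.
    hook = TotalRewardPerEpisode()
    run(eval_policy, my_env, StopAfterEpisode(n_episodes), hook)
    return hook.rewards
end
```

The point of building `eval_policy` as a separate instance is that the trained learner is shared, so evaluation never touches the training loop's trajectory or updates.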