As mentioned in JuliaReinforcementLearning/ReinforcementLearningZoo.jl#93 (comment), I'd like to write down some thoughts regarding the network handling in this framework. Maybe this is also relevant to https://github.com/JuliaReinforcementLearning/ReinforcementLearningCore.jl.
- I would like to have a small collection of commonly employed network styles in RL, such as a `GaussianNetwork` (used in VPG, PPO, and SAC) or a twin Q network (as in TD3 and SAC). These could then be enhanced with basic structural integrity asserts (e.g. that the output sizes of the mu and sigma layers are identical) or with convenience functions (e.g. returning a test or train action from a Gaussian network); see the first sketch after this list.
- I'm really unhappy with the definition of target networks. At the moment, these networks are commonly defined as `NeuralNetworkApproximator`s including a dedicated optimizer, even though they are never directly trained on. Maybe it would make sense to implement a `TargetNetwork` struct which can be constructed by just passing the original network to it and which offers functions for e.g. Polyak averaging or hard updates (recommended in some MuJoCo environments); see the second sketch after this list. I have never seen an implementation in which target networks differ from their source ones...
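To make the first point more concrete, here is a minimal sketch of what such a `GaussianNetwork` could look like, assuming plain Flux chains. All the names here (`GaussianNetwork`, `test_action`, `train_action`, the `state_dim` argument) are just illustrative, not anything that exists in the package today:

```julia
using Flux

# Proposed GaussianNetwork; a sketch, not an existing API of this package.
struct GaussianNetwork{P,M,S}
    pre::P       # shared trunk
    mu::M        # mean head
    logsigma::S  # log standard deviation head
end

Flux.@functor GaussianNetwork

# Constructor with a basic structural-integrity assert: feeding a dummy
# state through both heads must yield identically sized outputs.
function GaussianNetwork(pre, mu, logsigma, state_dim::Int)
    h = pre(randn(Float32, state_dim, 1))
    @assert size(mu(h)) == size(logsigma(h)) "mu and sigma heads must have identical output sizes"
    GaussianNetwork(pre, mu, logsigma)
end

# Deterministic "test" action: just the mean of the Gaussian.
test_action(n::GaussianNetwork, s) = n.mu(n.pre(s))

# Stochastic "train" action via the reparameterization trick.
function train_action(n::GaussianNetwork, s)
    h = n.pre(s)
    m, sigma = n.mu(h), exp.(n.logsigma(h))
    m .+ sigma .* randn(Float32, size(m))
end

# Usage:
net = GaussianNetwork(
    Chain(Dense(4, 64, relu)),  # trunk
    Dense(64, 2),               # mu head
    Dense(64, 2),               # log sigma head
    4,                          # state dimension, only used for the assert
)
a = train_action(net, rand(Float32, 4, 1))
```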
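And a minimal sketch of the `TargetNetwork` idea, again with assumed names (`TargetNetwork`, `soft_update!`, `hard_update!`): the wrapper holds a copy of the source network, carries no optimizer of its own, and is only ever updated via Polyak averaging or hard copies:

```julia
using Flux

# Proposed TargetNetwork wrapper; a sketch, not an existing API.
struct TargetNetwork{N}
    source::N
    target::N
end

# Construct by passing only the original network; the target starts
# out as an exact copy.
TargetNetwork(source) = TargetNetwork(source, deepcopy(source))

# Polyak averaging ("soft" update): target <- (1 - tau) * target + tau * source.
function soft_update!(tn::TargetNetwork; tau = 0.005f0)
    for (dest, src) in zip(Flux.params(tn.target), Flux.params(tn.source))
        dest .= (1 - tau) .* dest .+ tau .* src
    end
end

# Hard update: copy the source parameters verbatim, e.g. every fixed
# number of steps (recommended in some MuJoCo environments).
function hard_update!(tn::TargetNetwork)
    Flux.loadparams!(tn.target, Flux.params(tn.source))
end

# Usage:
q  = Chain(Dense(4, 64, relu), Dense(64, 2))
tq = TargetNetwork(q)
soft_update!(tq; tau = 0.01f0)  # after each gradient step
hard_update!(tq)                # or periodically, all at once
```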
I'm not sure if it would be reasonable to implement these changes in ReinforcementLearningCore.jl or ReinforcementLearningZoo.jl, as they are very DRL-related.
Any thoughts on this?