Open
Description
According to the docstring below, t_max in PCL is equivalent to update_interval in DQN.
chainerrl/chainerrl/agents/pcl.py
Line 49 in 4ca100c
| t_max (int): The model is updated after every t_max local steps |
However, t_max is None by default (and also in the example script), so the condition on line 388 (below) is never true and the update there will never be executed.
chainerrl/chainerrl/agents/pcl.py
Line 388 in 4b051a4
| if self.t - self.t_start == self.t_max: |
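To illustrate why the condition cannot fire, here is a minimal standalone sketch of the same comparison. It is not the agent's code: the function name `count_updates` and the reset of `t_start` after each update are assumptions made only for this example.

```python
def count_updates(t_max, n_steps=10):
    """Count how often a line-388-style condition fires over n_steps."""
    t = 0
    t_start = 0
    updates = 0
    for _ in range(n_steps):
        t += 1
        # Same shape as the check in pcl.py: an int is never equal to None.
        if t - t_start == t_max:
            updates += 1
            t_start = t  # assumed: the local step counter restarts after an update
    return updates


print(count_updates(t_max=None))  # the condition is never true
print(count_updates(t_max=5))     # fires every 5 steps
```

With `t_max=None` the comparison is always False, so no update is counted however many steps are taken.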
In addition, when sampling from the replay buffer, I think max_len should be set by a separate parameter, or at least not by t_max unless we change the docstring. (I also think the original paper places no such limit on episode length.)
chainerrl/chainerrl/agents/pcl.py
Line 284 in 4b051a4
| self.batchsize, max_len=self.t_max) |
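One possible shape for the separation, as a rough sketch only: the update interval and the replay truncation length become two independent settings. Everything here is hypothetical, including the `replay_max_len` name, the `ToyPCLConfig` class, and the toy `sample_episodes` function, which is a stand-in and not ChainerRL's replay buffer.

```python
import random


def sample_episodes(episodes, batchsize, max_len=None, seed=0):
    """Toy stand-in for an episodic sampler (not ChainerRL's implementation)."""
    rng = random.Random(seed)
    batch = rng.sample(episodes, min(batchsize, len(episodes)))
    if max_len is None:
        return batch  # no truncation: whole episodes are returned
    truncated = []
    for ep in batch:
        if len(ep) > max_len:
            # Keep a random contiguous window of max_len transitions.
            start = rng.randrange(len(ep) - max_len + 1)
            ep = ep[start:start + max_len]
        truncated.append(ep)
    return truncated


class ToyPCLConfig:
    """Hypothetical config: t_max and replay_max_len are independent."""

    def __init__(self, t_max=None, replay_max_len=None):
        self.t_max = t_max                    # on-policy update interval
        self.replay_max_len = replay_max_len  # hypothetical replay truncation


episodes = [list(range(20)), list(range(7))]
cfg = ToyPCLConfig(t_max=5, replay_max_len=None)
batch = sample_episodes(episodes, 2, max_len=cfg.replay_max_len)
```

With this split, setting t_max to control update frequency would no longer shorten the episodes drawn from the replay buffer.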