Argument t_max in PCL misused #236

@lyx-x

According to the docstring below, t_max in PCL is equivalent to update_interval in DQN.

t_max (int): The model is updated after every t_max local steps

However, t_max is None by default (and also in the example script), so the condition on line 388 (below) is never true and the update there will never be executed.

if self.t - self.t_start == self.t_max:
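To illustrate why this condition cannot fire, here is a minimal standalone sketch. It is not the library's code: `should_update` is a hypothetical helper that only mirrors the comparison quoted above, with plain integers standing in for `self.t` and `self.t_start`.

```python
def should_update(t, t_start, t_max):
    """Hypothetical helper mirroring the quoted condition (not library code)."""
    return t - t_start == t_max


# With t_max=None the comparison is `int == None`, which is always False,
# so no step in this range triggers an update.
fired_none = [t for t in range(1, 11) if should_update(t, 0, None)]

# With an integer t_max the condition is true once t - t_start reaches it.
fired_int = [t for t in range(1, 11) if should_update(t, 0, 5)]

print(fired_none)
print(fired_int)
```

In this sketch `t_start` is never advanced, so the integer case fires only once; in the agent it would presumably be reset after each update.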

In addition, when sampling from the replay buffer, I think max_len should be set by a separate parameter, or at least not by t_max unless we change the docstring. (I also think the original paper places no such limit on episode length.)

self.batchsize, max_len=self.t_max)
