Set up short-term replay buffer #32

@philkuz

Description

We need to be able to do batched conditionals in TensorFlow.

Currently, we aren't calculating the gamma loss with the reward function.
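
For reference, one common way to combine a discount factor gamma with the reward is a one-step TD target. The sketch below assumes that is what is meant here; it is not the project's actual loss, and `td_targets`, `next_values`, and `dones` are hypothetical names. It also shows the per-sample conditional (terminal vs. non-terminal) that a batched version would need.

```python
def td_targets(rewards, next_values, dones, gamma=0.99):
    """Return r + gamma * V(s') per transition (assumed one-step TD target).

    For terminal transitions (done is True) the bootstrap term is dropped,
    so the target is just the reward.
    """
    return [
        r if done else r + gamma * v_next
        for r, v_next, done in zip(rewards, next_values, dones)
    ]


# Two transitions: the first bootstraps from V(s'), the second is terminal.
targets = td_targets([1.0, 2.0], [10.0, 10.0], [False, True], gamma=0.5)
```

A loss would then compare these targets against the critic's predictions, for example with a squared error.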

Add a replay_memory to the subcritic network instead of the polynomial critic network.

Train on mini-batches of 64 instead of 1 (switch from online to mini-batch updates).
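
A minimal sketch of what such a replay memory could look like, assuming transitions are stored as `(state, action, reward, next_state, done)` tuples and sampled uniformly. The class name `ReplayMemory`, its methods, and the default capacity are hypothetical and not taken from the repository; only the batch size of 64 comes from this issue.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of transitions (hypothetical helper)."""

    def __init__(self, capacity=10000, seed=None):
        # deque with maxlen evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Uniform sampling without replacement; the caller must make sure
        # at least batch_size transitions have been stored.
        batch = self.rng.sample(list(self.buffer), batch_size)
        # Transpose the list of transitions into one tuple per field.
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

In a training loop, each environment step would call `add(...)`, and an update would run only once `len(memory) >= 64`, feeding the sampled fields to the subcritic network as a batch.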
