This is a general remark after I ran into memory issues with PCL on Atari games. I noticed that losses are sometimes appended to a list (as shown below) and a weighted loss is computed (much) later. This is correct in theory but memory-inefficient, because Chainer retains all intermediate computation results in memory for every loss still held in the list. For example, when we apply a network to 2 inputs, there will be 2 copies of the intermediate results and 2 losses.
pi_losses.append(C_pi ** 2)
Since Functions and Links accumulate gradients and gradients are linearly additive, I would suggest we call backward on each (weighted) loss immediately after computing it instead of saving the losses to a list. cleargrads can then be called once in some high-level function such as act_and_train.
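To illustrate the linearity argument, here is a minimal sketch in plain Python, without Chainer. It assumes a toy model with a single scalar parameter and a squared-error loss. The function names, samples and weights are made up for the illustration and are not taken from ChainerRL. In Chainer terms, the accumulating loop would roughly correspond to calling backward on each weighted loss, with cleargrads called once per update.

```python
# Toy model: one scalar parameter w, per-sample loss (w * x - y) ** 2.
# All names below are hypothetical and only serve the illustration.

def loss_grad(w, x, y):
    """Gradient of (w * x - y) ** 2 with respect to w."""
    return 2.0 * (w * x - y) * x


def deferred_gradient(w, samples, weights):
    """Keep every per-sample term in a list, then differentiate the weighted sum."""
    stored = list(samples)  # stands in for the losses (and their graphs) kept in a list
    return sum(c * loss_grad(w, x, y) for c, (x, y) in zip(weights, stored))


def immediate_gradient(w, samples, weights):
    """Accumulate each weighted gradient as soon as its loss is available."""
    grad = 0.0  # analogous to clearing gradients once per update
    for c, (x, y) in zip(weights, samples):
        # Analogous to calling backward right away: nothing is stored per sample.
        grad += c * loss_grad(w, x, y)
    return grad


samples = [(1.0, 2.0), (3.0, -1.0), (0.5, 0.25)]
weights = [0.5, 0.25, 0.25]
w = 1.5

# Both strategies yield the same gradient because differentiation is linear.
print(deferred_gradient(w, samples, weights))
print(immediate_gradient(w, samples, weights))
```

The sketch only checks that the two strategies agree on the gradient. The memory saving itself comes from not holding per-loss computation graphs, which this toy code does not model.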