reward leak fix
nikhilbarhate99 authored Sep 20, 2019
1 parent d02da8d commit 6c9a2ef
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions PPO.py
@@ -90,10 +90,10 @@ def update(self, memory):
         rewards = []
         discounted_reward = 0
         for reward, is_terminal in zip(reversed(memory.rewards), reversed(memory.is_terminals)):
-            discounted_reward = reward + (self.gamma * discounted_reward)
-            rewards.insert(0, discounted_reward)
             if is_terminal:
                 discounted_reward = 0
+            discounted_reward = reward + (self.gamma * discounted_reward)
+            rewards.insert(0, discounted_reward)
 
         # Normalizing the rewards:
         rewards = torch.tensor(rewards).to(device)
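
The effect of moving the reset can be checked on a toy rollout. The sketch below is not part of the commit: `discounted_returns`, its `reset_before_update` flag, the sample rewards, and `gamma=0.5` are all hypothetical, chosen only to compare the two orderings of the loop body. It assumes, as the hunk suggests, that episodes are stored back to back and that `is_terminal` marks the last step of each one.

```python
def discounted_returns(rewards, is_terminals, gamma, reset_before_update=True):
    """Scan a rollout backwards and return the discounted return of each step.

    reset_before_update=True mirrors the ordering after this commit;
    False mirrors the ordering before it. (Hypothetical helper.)
    """
    returns = []
    running = 0.0
    for reward, is_terminal in zip(reversed(rewards), reversed(is_terminals)):
        if reset_before_update and is_terminal:
            # New order: discard the following episode's return first.
            running = 0.0
        running = reward + gamma * running
        returns.insert(0, running)
        if not reset_before_update and is_terminal:
            # Old order: the reset comes after the terminal step was computed.
            running = 0.0
    return returns


# Two episodes of two steps each, stored back to back (made-up data).
rewards = [1.0, 1.0, 10.0, 10.0]
is_terminals = [False, True, False, True]

old = discounted_returns(rewards, is_terminals, gamma=0.5, reset_before_update=False)
new = discounted_returns(rewards, is_terminals, gamma=0.5, reset_before_update=True)

print(old)  # [1.0, 6.0, 10.0, 10.0]
print(new)  # [1.5, 1.0, 15.0, 10.0]
```

With the old ordering, the terminal step of the first episode gets 6.0 rather than 1.0 because it absorbs half of the second episode's return, and the late reset also cuts each terminal reward off from the earlier steps of its own episode. With the new ordering, each episode's returns depend only on that episode's rewards.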