Commit 3ef67f0

fix a bug in computing return
1 parent 467895a commit 3ef67f0

1 file changed, 1 insertion(+), 2 deletions(-)

crowd_nav/policy/model_predictive_rl.py

@@ -250,8 +250,7 @@ def V_planning(self, state, depth, width):
 next_state_est = self.state_predictor(state, action)
 reward_est = self.estimate_reward(state, action)
 next_value, next_traj = self.V_planning(next_state_est, depth - 1, self.planning_width)
-# TODO: verify this equation
-return_value = current_state_value / depth + (depth - 1) / depth * (reward_est + next_value)
+return_value = current_state_value / depth + (depth - 1) / depth * (self.get_normalized_gamma() * next_value + reward_est)
 
 returns.append(return_value)
 trajs.append([(state, action, reward_est)] + next_traj)
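
The effect of the change can be seen in isolation with a minimal sketch. It assumes only what the diff shows: the return blends the direct value estimate (weight `1 / depth`) with a one-step lookahead (weight `(depth - 1) / depth`). The function names and the plain `gamma` argument are hypothetical stand-ins; in particular, `gamma` replaces `self.get_normalized_gamma()`, whose definition is not part of this commit.

```python
def blended_return_old(current_state_value, reward_est, next_value, depth):
    """Pre-fix formula from the diff: next_value is not discounted."""
    lookahead = reward_est + next_value
    return current_state_value / depth + (depth - 1) / depth * lookahead


def blended_return_new(current_state_value, reward_est, next_value, depth, gamma):
    """Post-fix formula from the diff: next_value is scaled by gamma.

    `gamma` is an assumed stand-in for self.get_normalized_gamma().
    """
    lookahead = reward_est + gamma * next_value
    return current_state_value / depth + (depth - 1) / depth * lookahead


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the repository.
    args = dict(current_state_value=1.0, reward_est=0.5, next_value=2.0, depth=2)
    print("old:", blended_return_old(**args))
    print("new:", blended_return_new(**args, gamma=0.5))
```

With `gamma = 1` the two formulas coincide, and with `depth = 1` both reduce to `current_state_value`, so the fix only changes results when planning more than one step ahead with a discount below 1.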
