Commit 0cc226c

fix treasure_on_right choose_action bug
1 parent e6e59bc commit 0cc226c

1 file changed with 1 addition and 1 deletion

contents/1_command_line_reinforcement_learning/treasure_on_right.py

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ def build_q_table(n_states, actions):
 def choose_action(state, q_table):
     # This is how to choose an action
     state_actions = q_table.iloc[state, :]
-    if (np.random.uniform() > EPSILON) or (state_actions.all() == 0):  # act non-greedy or state-action have no value
+    if (np.random.uniform() > EPSILON) or (not state_actions.any()):  # act non-greedy or state-action have no value
         action_name = np.random.choice(ACTIONS)
     else:  # act greedy
         action_name = state_actions.idxmax()  # replace argmax to idxmax as argmax means a different function in newer version of pandas
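
The one-line change matters because the two conditions disagree when a state has a learned value for only some of its actions. The sketch below is an illustration, not part of the commit: the three Series and the helper names old_check and new_check are hypothetical stand-ins for a Q-table row and for the old and new conditions.

```python
import pandas as pd

# Hypothetical Q-table rows for a single state; columns are the two actions.
untrained = pd.Series([0.0, 0.0], index=["left", "right"])  # nothing learned yet
partial = pd.Series([0.0, 0.5], index=["left", "right"])    # only "right" has a value
trained = pd.Series([0.2, 0.5], index=["left", "right"])    # both actions have values


def old_check(state_actions):
    # Condition before the commit: .all() is False as soon as one entry is zero,
    # so this is True whenever at least one action value is zero.
    return bool(state_actions.all() == 0)


def new_check(state_actions):
    # Condition after the commit: True only when every action value is zero.
    return not state_actions.any()


for name, row in [("untrained", untrained), ("partial", partial), ("trained", trained)]:
    print(name, old_check(row), new_check(row))
# untrained True True
# partial True False
# trained False False
```

In the "partial" case the old condition still forces a random action, so the learned value for "right" is ignored; the new condition lets the greedy branch use it.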
