Commit 63e9b9e

Added in decaying egreedy. Also printing out kwargs for each progress bar.
1 parent 1e26f76 commit 63e9b9e

2 files changed: +7 -9 lines changed


rl/agents/policy/decaying_egreedy_policy_agent.py

Lines changed: 6 additions & 8 deletions
@@ -30,13 +30,7 @@ def act(self, state: numpy.ndarray, available_actions: numpy.ndarray) -> int:
         :param available_actions: A list of available possible actions (positions on the board to mark)
         :return: an action
         """
-        action, state = self.egreedy_policy(state, available_actions)
-        value = self.value_model(action)
-
-        if value < self.previous_value:
-            self.reset_exploratory_rate()
-
-        return action
+        return self.egreedy_policy(state, available_actions)

     def egreedy_policy(self, state: numpy.ndarray, available_actions: numpy.ndarray) -> int:
         """
@@ -50,7 +44,11 @@ def egreedy_policy(self, state: numpy.ndarray, available_actions: numpy.ndarray)
         if e < self.exploratory_rate:
             action: int = numpy.random.choice(available_actions)
         else:
-            action: int = self.greedy_action(state, available_actions)
+            action, state = self.greedy_action(state, available_actions)
+            value = self.value_model(state)
+
+            if value < self.previous_value:
+                self.reset_exploratory_rate()

         return action
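For readers without the rest of the repository at hand, the control flow this hunk moves into `egreedy_policy` can be sketched in isolation. This is a minimal illustration, not the repository's agent: the class name `DecayingEGreedySketch`, the lookup-table value model, the `decay_exploratory_rate` helper, and the point at which `previous_value` is updated are all assumptions, since the diff shows none of them.

```python
import numpy


class DecayingEGreedySketch:
    """Hypothetical stand-in for the agent in the diff (not the repository's class)."""

    def __init__(self, action_values, exploratory_rate=0.2, decay_rate=0.5, seed=0):
        # Assumption: a fixed table of action values replaces self.value_model.
        self.action_values = numpy.asarray(action_values, dtype=float)
        self.initial_exploratory_rate = exploratory_rate
        self.exploratory_rate = exploratory_rate
        # Assumption: decay_rate is a multiplicative factor applied per decay step.
        self.decay_rate = decay_rate
        self.previous_value = -numpy.inf
        self.rng = numpy.random.default_rng(seed)

    def reset_exploratory_rate(self):
        # Restore the exploration rate to its configured starting value.
        self.exploratory_rate = self.initial_exploratory_rate

    def decay_exploratory_rate(self):
        # Shrink the exploration rate; when this is called is up to the caller.
        self.exploratory_rate *= self.decay_rate

    def act(self, available_actions):
        e = self.rng.random()
        if e < self.exploratory_rate:
            # Explore: pick uniformly among the available actions.
            action = int(self.rng.choice(available_actions))
        else:
            # Exploit: pick the available action with the highest estimated value.
            values = self.action_values[available_actions]
            action = int(available_actions[int(numpy.argmax(values))])
            value = float(self.action_values[action])
            # As in the diff, a drop in greedy value resets exploration.
            if value < self.previous_value:
                self.reset_exploratory_rate()
            # Assumption: the diff does not show where previous_value is updated.
            self.previous_value = value
        return action
```

Keeping the value check inside the greedy branch means a randomly explored action never triggers a reset, which appears to be the intent of moving the check out of `act`.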

rl/book/chapter_2/bandits.yaml

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ agents: [
       learning: "WeightedAveraging",
       kwargs: {
         decay_rate: 0.5 ,
-        exploratory_rate: 0.1,
+        exploratory_rate: 0.2,
         learning_rate: 0.1,
       }
     },
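To get a feel for what doubling `exploratory_rate` does alongside `decay_rate: 0.5`, the sketch below lists the first few exploration rates under each setting. It assumes `decay_rate` is a multiplicative factor applied once per decay step, which this diff does not confirm; `exploratory_rate_schedule` is a hypothetical helper, not a function from the repository.

```python
def exploratory_rate_schedule(initial_rate, decay_rate, steps):
    """Return the exploration rate at each of `steps` successive decay steps.

    Assumption: each step multiplies the current rate by decay_rate.
    """
    rates = []
    rate = initial_rate
    for _ in range(steps):
        rates.append(rate)
        rate *= decay_rate
    return rates


old_schedule = exploratory_rate_schedule(0.1, 0.5, 4)  # before this commit
new_schedule = exploratory_rate_schedule(0.2, 0.5, 4)  # after this commit
```

Under that assumption the new setting is the old schedule shifted by one step: the agent gets one extra decay step of exploration before reaching the rates it previously started from.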
