Commit 76a33ec

Ervin T authored and anupambhatnagar committed
[bug-fix] Update the gail config for the new steps in 0.14.0 (#3475)
1 parent 7c863f1 commit 76a33ec

2 files changed: 14 additions and 14 deletions

com.unity.ml-agents/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 ### Bug Fixes
 - Fixed an issue which caused self-play training sessions to consume a lot of memory. (#3451)
 - Fixed an IndexError when using GAIL or behavioral cloning with demonstrations recorded with 0.14.0 or later (#3464)
+- Updated the `gail_config.yaml` to work with per-Agent steps (#3475)
 
 
 ## [0.14.0-preview] - 2020-02-13

config/gail_config.yaml

Lines changed: 13 additions & 14 deletions
@@ -14,27 +14,27 @@ default:
     num_layers: 2
     time_horizon: 64
     sequence_length: 64
-    summary_freq: 1000
+    summary_freq: 10000
     use_recurrent: false
     reward_signals:
         extrinsic:
             strength: 1.0
             gamma: 0.99
 
 Pyramids:
-    summary_freq: 2000
+    summary_freq: 30000
     time_horizon: 128
     batch_size: 128
     buffer_size: 2048
     hidden_units: 512
     num_layers: 2
     beta: 1.0e-2
-    max_steps: 5.0e5
+    max_steps: 1.0e7
     num_epoch: 3
     behavioral_cloning:
         demo_path: Project/Assets/ML-Agents/Examples/Pyramids/Demos/ExpertPyramid.demo
         strength: 0.5
-        steps: 10000
+        steps: 150000
     reward_signals:
         extrinsic:
             strength: 1.0
@@ -55,14 +55,14 @@ CrawlerStatic:
     time_horizon: 1000
     batch_size: 2024
     buffer_size: 20240
-    max_steps: 1e6
-    summary_freq: 3000
+    max_steps: 1e7
+    summary_freq: 30000
     num_layers: 3
     hidden_units: 512
     behavioral_cloning:
         demo_path: Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawlerSta.demo
         strength: 0.5
-        steps: 5000
+        steps: 50000
     reward_signals:
         gail:
             strength: 1.0
@@ -71,20 +71,20 @@ CrawlerStatic:
             demo_path: Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawlerSta.demo
 
 PushBlock:
-    max_steps: 5.0e4
+    max_steps: 1.5e7
     batch_size: 128
     buffer_size: 2048
     beta: 1.0e-2
     hidden_units: 256
-    summary_freq: 2000
+    summary_freq: 60000
     time_horizon: 64
     num_layers: 2
     reward_signals:
         gail:
             strength: 1.0
             gamma: 0.99
             encoding_size: 128
-            demo_path: Project/Assets/ML-Agents/Examples/PushBlock/Demos/ExpertPush.demo
+            demo_path: Project/Assets/Demonstrations/PushblockDemo.demo
 
 Hallway:
     use_recurrent: true
@@ -96,8 +96,8 @@ Hallway:
     num_epoch: 3
     buffer_size: 1024
     batch_size: 128
-    max_steps: 5.0e5
-    summary_freq: 1000
+    max_steps: 1.0e7
+    summary_freq: 10000
     time_horizon: 64
     reward_signals:
         extrinsic:
@@ -111,8 +111,7 @@ Hallway:
 
 FoodCollector:
     batch_size: 64
-    summary_freq: 1000
-    max_steps: 5.0e4
+    max_steps: 2.0e6
     use_recurrent: false
     hidden_units: 128
     learning_rate: 3.0e-4
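
The step-count changes to `config/gail_config.yaml` can be tabulated with a short script. This is only an illustrative sketch: `CHANGES` and `scale_factors` are hypothetical names that are not part of ML-Agents, and the values are copied by hand from the hunks in this commit. It suggests that the multipliers were chosen per behavior instead of following one constant factor. The `FoodCollector` `summary_freq` override is removed by the commit and therefore has no new value to compare.

```python
# (behavior, key) -> (old value, new value), copied from the diff hunks.
# The table and helper below are hypothetical, for illustration only.
CHANGES = {
    ("default", "summary_freq"): (1000, 10000),
    ("Pyramids", "summary_freq"): (2000, 30000),
    ("Pyramids", "max_steps"): (5.0e5, 1.0e7),
    ("Pyramids", "behavioral_cloning.steps"): (10000, 150000),
    ("CrawlerStatic", "max_steps"): (1e6, 1e7),
    ("CrawlerStatic", "summary_freq"): (3000, 30000),
    ("CrawlerStatic", "behavioral_cloning.steps"): (5000, 50000),
    ("PushBlock", "max_steps"): (5.0e4, 1.5e7),
    ("PushBlock", "summary_freq"): (2000, 60000),
    ("Hallway", "max_steps"): (5.0e5, 1.0e7),
    ("Hallway", "summary_freq"): (1000, 10000),
    ("FoodCollector", "max_steps"): (5.0e4, 2.0e6),
}


def scale_factors(changes):
    """Return new/old for every changed setting."""
    return {key: new / old for key, (old, new) in changes.items()}


if __name__ == "__main__":
    # Print one line per changed setting with its multiplier.
    for (behavior, key), factor in scale_factors(CHANGES).items():
        print(f"{behavior:<14} {key:<26} x{factor:g}")
```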
