Closed
Description
When attempting to initialize a run from a previous run, instead of starting at step 0 it starts at wherever the other run left off.
Looking at the Pull Request for this feature, it looks like _set_step function was defined but I don't see it used anywhere.
To Reproduce
Steps to reproduce the behavior:
- load 3DBall example scene
- train the model (but pause environment before completion)
- Attempt to train a new run using the previous run to initialize
- Expected - Starts from 0
Actual - Starts from where the previous run left off.
mlagents-learn config/trainer_config.yaml --initialize-from=first3DBallRun --run-id==second3DBallRun
2020-05-12 20:04:55.558358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From c:\users\teelr\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
▄▄▄▓▓▓▓
╓▓▓▓▓▓▓█▓▓▓▓▓
,▄▄▄m▀▀▀' ,▓▓▓▀▓▓▄ ▓▓▓ ▓▓▌
▄▓▓▓▀' ▄▓▓▀ ▓▓▓ ▄▄ ▄▄ ,▄▄ ▄▄▄▄ ,▄▄ ▄▓▓▌▄ ▄▄▄ ,▄▄
▄▓▓▓▀ ▄▓▓▀ ▐▓▓▌ ▓▓▌ ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌ ╒▓▓▌
▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓ ▓▀ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▄ ▓▓▌
▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄ ▓▓ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▐▓▓
^█▓▓▓ ▀▓▓▄ ▐▓▓▌ ▓▓▓▓▄▓▓▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▓▄ ▓▓▓▓`
'▀▓▓▓▄ ^▓▓▓ ▓▓▓ └▀▀▀▀ ▀▀ ^▀▀ `▀▀ `▀▀ '▀▀ ▐▓▓▌
▀▀▀▀▓▄▄▄ ▓▓▓▓▓▓, ▓▓▓▓▀
`▀█▓▓▓▓▓▓▓▓▓▌
¬`▀▀▀█▓
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.1.0
2020-05-12 20:04:59.737490: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From c:\users\teelr\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-05-12 20:05:01 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-12 20:05:06 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
2020-05-12 20:05:06 INFO [environment.py:342] Connected new brain:
3DBall?team=0
2020-05-12 20:05:06.290407: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-05-12 20:05:06.307036: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-12 20:05:06.361342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 with Max-Q Design computeCapability: 7.5
coreClock: 1.095GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 357.69GiB/s
2020-05-12 20:05:06.373009: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-12 20:05:06.442669: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-12 20:05:06.480843: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-12 20:05:06.504039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-12 20:05:06.563529: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-12 20:05:06.603801: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-12 20:05:06.699862: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-12 20:05:06.704810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-12 20:05:10.748173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 20:05:10.753119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-12 20:05:10.756694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-12 20:05:10.762259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6269 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-05-12 20:05:10 INFO [stats.py:130] Hyperparameters for behavior name =second3DBallRun_3DBall:
trainer: ppo
batch_size: 64
beta: 0.001
buffer_size: 12000
epsilon: 0.2
hidden_units: 128
lambd: 0.99
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e5
memory_size: 128
normalize: True
num_epoch: 3
num_layers: 2
time_horizon: 1000
sequence_length: 64
summary_freq: 12000
use_recurrent: False
vis_encode_type: simple
reward_signals:
extrinsic:
strength: 1.0
gamma: 0.99
summary_path: =second3DBallRun_3DBall
model_path: ./models/=second3DBallRun/3DBall
init_path: ./models/first3DBallRun/3DBall
keep_checkpoints: 5
2020-05-12 20:05:10.893625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 with Max-Q Design computeCapability: 7.5
coreClock: 1.095GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 357.69GiB/s
2020-05-12 20:05:10.904673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-12 20:05:10.908116: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-12 20:05:10.914349: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-12 20:05:10.917813: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-12 20:05:10.923354: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-12 20:05:10.927416: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-12 20:05:10.935354: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-12 20:05:10.938944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-12 20:05:10.944470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 20:05:10.947842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-12 20:05:10.949961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-12 20:05:10.954546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6269 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-05-12 20:05:11 INFO [tf_policy.py:118] Loading model for brain 3DBall?team=0 from ./models/first3DBallRun/3DBall.
2020-05-12 20:05:11 INFO [tf_policy.py:142] Starting training from step 0 and saving to ./models/=second3DBallRun/3DBall.
2020-05-12 20:05:11.787400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-12 20:05:22 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 276000. Time Elapsed: 22.734 s Mean Reward: 100.000. Std of Reward: 0.000. Training.
2020-05-12 20:05:32 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 288000. Time Elapsed: 33.533 s Mean Reward: 92.523. Std of Reward: 25.901. Training.
2020-05-12 20:05:43 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 300000. Time Elapsed: 43.901 s Mean Reward: 84.307. Std of Reward: 33.978. Training.
2020-05-12 20:05:55 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 312000. Time Elapsed: 56.405 s Mean Reward: 100.000. Std of Reward: 0.000. Training.
2020-05-12 20:05:59 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-12 20:05:59 INFO [trainer_controller.py:116] Learning was interrupted. Please wait while the graph is generated.
2020-05-12 20:06:00 INFO [trainer_controller.py:112] Saved Model
2020-05-12 20:06:00 INFO [model_serialization.py:221] List of nodes to export for brain :3DBall?team=0
2020-05-12 20:06:00 INFO [model_serialization.py:223] is_continuous_control
2020-05-12 20:06:00 INFO [model_serialization.py:223] version_number
2020-05-12 20:06:00 INFO [model_serialization.py:223] memory_size
2020-05-12 20:06:00 INFO [model_serialization.py:223] action_output_shape
2020-05-12 20:06:00 INFO [model_serialization.py:223] action
2020-05-12 20:06:00 INFO [model_serialization.py:223] action_probs
Converting ./models/=second3DBallRun/3DBall/frozen_graph_def.pb to ./models/=second3DBallRun/3DBall.nn
IGNORED: Cast unknown layer
IGNORED: Shape unknown layer
IGNORED: StopGradient unknown layer
GLOBALS: 'is_continuous_control', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 8] => 'sub_2'
OUT: 'action', 'action_probs'
DONE: wrote ./models/=second3DBallRun/3DBall.nn file.
2020-05-12 20:06:00 INFO [model_serialization.py:76] Exported ./models/=second3DBallRun/3DBall.nn file
Environment (please complete the following information):
- Unity Version: Unity 2018.4.22f1
- OS + version: Windows 10
- ML-Agents version: v1.0
- TensorFlow version: 2.1.0
- Environment: 3DBall (also repros in my custom enviroment)
NOTE: We are unable to help reproduce bugs with custom environments. Please attempt to reproduce your issue with one of the example environments, or provide a minimal patch to one of the environments needed to reproduce the issue.