Removing TensorFlow Trainers #4707

Merged 11 commits on Dec 15, 2020
13 changes: 5 additions & 8 deletions .github/workflows/pytest.yml
@@ -20,11 +20,8 @@ jobs:
         python-version: [3.6.x, 3.7.x, 3.8.x]
         include:
           - python-version: 3.6.x
-            pip_constraints: test_constraints_min_version.txt
           - python-version: 3.7.x
-            pip_constraints: test_constraints_max_tf1_version.txt
           - python-version: 3.8.x
-            pip_constraints: test_constraints_max_tf2_version.txt
     steps:
     - uses: actions/checkout@v2
     - name: Set up Python
@@ -37,7 +34,7 @@ jobs:
         # This path is specific to Ubuntu
         path: ~/.cache/pip
         # Look to see if there is a cache hit for the corresponding requirements file
-        key: ${{ runner.os }}-pip-${{ hashFiles('ml-agents/setup.py', 'ml-agents-envs/setup.py', 'gym-unity/setup.py', 'test_requirements.txt', matrix.pip_constraints) }}
+        key: ${{ runner.os }}-pip-${{ hashFiles('ml-agents/setup.py', 'ml-agents-envs/setup.py', 'gym-unity/setup.py', 'test_requirements.txt') }}
         restore-keys: |
           ${{ runner.os }}-pip-
           ${{ runner.os }}-
@@ -48,10 +45,10 @@ jobs:
         # pin pip to workaround https://github.com/pypa/pip/issues/9180
         python -m pip install pip==20.2
         python -m pip install --upgrade setuptools
-        python -m pip install --progress-bar=off -e ./ml-agents-envs -c ${{ matrix.pip_constraints }}
-        python -m pip install --progress-bar=off -e ./ml-agents -c ${{ matrix.pip_constraints }}
-        python -m pip install --progress-bar=off -r test_requirements.txt -c ${{ matrix.pip_constraints }}
-        python -m pip install --progress-bar=off -e ./gym-unity -c ${{ matrix.pip_constraints }}
+        python -m pip install --progress-bar=off -e ./ml-agents-envs
+        python -m pip install --progress-bar=off -e ./ml-agents
+        python -m pip install --progress-bar=off -r test_requirements.txt
+        python -m pip install --progress-bar=off -e ./gym-unity
     - name: Save python dependencies
       run: |
         pip freeze > pip_versions-${{ matrix.python-version }}.txt
1 change: 1 addition & 0 deletions com.unity.ml-agents/CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to
 ### Major Changes
 #### com.unity.ml-agents (C#)
 #### ml-agents / ml-agents-envs / gym-unity (Python)
+- TensorFlow trainers have been removed, please use the Torch trainers instead. (#4707)

Review comment (Contributor): This will have to move to the 'Unreleased' section after release 11 (unless we are aiming to get this in for the release).

 - PyTorch trainers now support training agents with both continuous and discrete action spaces. (#4702)
 ### Minor Changes
 #### com.unity.ml-agents / com.unity.ml-agents.extensions (C#)
4 changes: 1 addition & 3 deletions docs/ML-Agents-Overview.md
@@ -372,7 +372,7 @@ your agent's behavior:
   below).
 - `rnd`: represents an intrinsic reward signal that encourages exploration
   in sparse-reward environments that is defined by the Curiosity module (see
-  below). (Not available for TensorFlow trainers)
+  below).

 ### Deep Reinforcement Learning

@@ -437,8 +437,6 @@ of the trained model is used as intrinsic reward. The more an Agent visits a state, the
 more accurate the predictions and the lower the rewards which encourages the Agent to
 explore new states with higher prediction errors.

-__Note:__ RND is not available for TensorFlow trainers (only PyTorch trainers)
-
 ### Imitation Learning

 It is often more intuitive to simply demonstrate the behavior we want an agent
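
Reviewer note: the RND paragraph in the hunk above describes the intrinsic reward as the prediction error of a trained model, which shrinks as a state is revisited. A minimal Python sketch of that idea follows, assuming tiny linear "networks" and plain gradient descent. The class `TinyRND` and its methods are hypothetical names for illustration and are not part of ML-Agents.

```python
import random


def make_weights(dim, rng):
    """Draw a random weight vector (stands in for a randomly initialised network)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def dot(weights, state):
    return sum(w * s for w, s in zip(weights, state))


class TinyRND:
    """Hypothetical toy version of Random Network Distillation with linear models."""

    def __init__(self, dim, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.target = make_weights(dim, rng)     # fixed, never trained
        self.predictor = make_weights(dim, rng)  # trained to imitate the target
        self.lr = lr

    def intrinsic_reward(self, state):
        # Squared prediction error: large for unfamiliar states.
        error = dot(self.predictor, state) - dot(self.target, state)
        return error * error

    def update(self, state):
        # One gradient-descent step on the squared error for this state.
        error = dot(self.predictor, state) - dot(self.target, state)
        self.predictor = [
            w - self.lr * 2.0 * error * s for w, s in zip(self.predictor, state)
        ]


rnd = TinyRND(dim=3)
visited = [1.0, 0.0, 0.5]
novel = [0.0, 1.0, 0.0]

reward_before = rnd.intrinsic_reward(visited)
for _ in range(20):
    rnd.update(visited)  # the agent keeps visiting the same state
reward_after = rnd.intrinsic_reward(visited)

print("visited state reward dropped:", reward_after < reward_before)
print("novel state still rewarded more:", rnd.intrinsic_reward(novel) > reward_after)
```

The predictor only improves on states it has been trained on, so the reward for the repeated state decays while the unseen state keeps its larger error.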
2 changes: 1 addition & 1 deletion docs/Training-Configuration-File.md
@@ -32,7 +32,7 @@ choice of the trainer (which we review on subsequent sections).
 | `time_horizon` | (default = `64`) How many steps of experience to collect per-agent before adding it to the experience buffer. When this limit is reached before the end of an episode, a value estimate is used to predict the overall expected reward from the agent's current state. As such, this parameter trades off between a less biased, but higher variance estimate (long time horizon) and more biased, but less varied estimate (short time horizon). In cases where there are frequent rewards within an episode, or episodes are prohibitively large, a smaller number can be more ideal. This number should be large enough to capture all the important behavior within a sequence of an agent's actions. <br><br> Typical range: `32` - `2048` |
 | `max_steps` | (default = `500000`) Total number of steps (i.e., observation collected and action taken) that must be taken in the environment (or across all environments if using multiple in parallel) before ending the training process. If you have multiple agents with the same behavior name within your environment, all steps taken by those agents will contribute to the same `max_steps` count. <br><br>Typical range: `5e5` - `1e7` |
 | `keep_checkpoints` | (default = `5`) The maximum number of model checkpoints to keep. Checkpoints are saved after the number of steps specified by the checkpoint_interval option. Once the maximum number of checkpoints has been reached, the oldest checkpoint is deleted when saving a new checkpoint. |
-| `checkpoint_interval` | (default = `500000`) The number of experiences collected between each checkpoint by the trainer. A maximum of `keep_checkpoints` checkpoints are saved before old ones are deleted. Each checkpoint saves the `.onnx` (and `.nn` if using TensorFlow) files in `results/` folder.|
+| `checkpoint_interval` | (default = `500000`) The number of experiences collected between each checkpoint by the trainer. A maximum of `keep_checkpoints` checkpoints are saved before old ones are deleted. Each checkpoint saves the `.onnx` files in `results/` folder.|
 | `init_path` | (default = None) Initialize trainer from a previously saved model. Note that the prior run should have used the same trainer configurations as the current run, and have been saved with the same version of ML-Agents. <br><br>You should provide the full path to the folder where the checkpoints were saved, e.g. `./models/{run-id}/{behavior_name}`. This option is provided in case you want to initialize different behaviors from different runs; in most cases, it is sufficient to use the `--initialize-from` CLI parameter to initialize all models from the same run. |
 | `threaded` | (default = `true`) By default, model updates can happen while the environment is being stepped. This violates the [on-policy](https://spinningup.openai.com/en/latest/user/algorithms.html#the-on-policy-algorithms) assumption of PPO slightly in exchange for a training speedup. To maintain the strict on-policyness of PPO, you can disable parallel updates by setting `threaded` to `false`. There is usually no reason to turn `threaded` off for SAC. |
 | `hyperparameters -> learning_rate` | (default = `3e-4`) Initial learning rate for gradient descent. Corresponds to the strength of each gradient descent update step. This should typically be decreased if training is unstable, and the reward does not consistently increase. <br><br>Typical range: `1e-5` - `1e-3` |
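
Reviewer note: the `keep_checkpoints` and `checkpoint_interval` rows above describe a rolling window of saved models. A rough Python sketch of that retention rule follows, assuming exactly one checkpoint is written per interval. The function `simulate_checkpoints` and the `Behavior-<step>.onnx` file names are hypothetical; the real trainer's naming and its final-model export are not modelled here.

```python
from collections import deque
from typing import List


def simulate_checkpoints(
    max_steps: int, checkpoint_interval: int, keep_checkpoints: int
) -> List[str]:
    """Return the (hypothetical) checkpoint files left on disk after a run."""
    # A bounded deque drops its oldest entry when a new one is appended,
    # which mirrors "the oldest checkpoint is deleted when saving a new one".
    kept = deque(maxlen=keep_checkpoints)
    for step in range(checkpoint_interval, max_steps + 1, checkpoint_interval):
        kept.append(f"Behavior-{step}.onnx")
    return list(kept)


remaining = simulate_checkpoints(
    max_steps=500000, checkpoint_interval=100000, keep_checkpoints=3
)
print(remaining)
# ['Behavior-300000.onnx', 'Behavior-400000.onnx', 'Behavior-500000.onnx']
```

With the documented defaults (`max_steps = 500000`, `checkpoint_interval = 500000`) this sketch produces a single checkpoint, so a smaller interval is needed before `keep_checkpoints` has any effect.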
3 changes: 0 additions & 3 deletions docs/Training-ML-Agents.md
@@ -317,9 +317,6 @@ behaviors:
       save_steps: 50000
       swap_steps: 2000
       team_change: 100000
-
-    # use TensorFlow backend
-    framework: tensorflow
 ```

 Here is an equivalent file if we use an SAC trainer instead. Notice that the
17 changes: 1 addition & 16 deletions docs/Unity-Inference-Engine.md
@@ -19,19 +19,6 @@ Graphics Emulation is set to **OpenGL(ES) 3.0 or 2.0 emulation**. Also there
 might be non-fatal build time errors when target platform includes Graphics API
 that does not support **Unity Compute Shaders**.

-## Supported formats
-
-There are currently two supported model formats:
-
-- Barracuda (`.nn`) files use a proprietary format produced by the
-  [`tensorflow_to_barracuda.py`]() script.
-- ONNX (`.onnx`) files use an
-  [industry-standard open format](https://onnx.ai/about.html) produced by the
-  [tf2onnx package](https://github.com/onnx/tensorflow-onnx).
-
-Export to ONNX is used if using PyTorch (the default). To enable it
-while using TensorFlow, make sure `tf2onnx>=1.6.1` is installed in pip.
-
 ## Using the Unity Inference Engine

 When using a model, drag the model file into the **Model** field in the
@@ -56,7 +43,5 @@ If you wish to run inference on an externally trained model, you should use
 Barracuda directly, instead of trying to run it through ML-Agents.

 ## Model inference outside of Unity
-We do not provide support for inference anywhere outside of Unity. The
-`frozen_graph_def.pb` and `.onnx` files produced by training are open formats
-for TensorFlow and ONNX respectively; if you wish to convert these to another
+We do not provide support for inference anywhere outside of Unity. The `.onnx` files produced by training use the open format ONNX; if you wish to convert a `.onnx` file to another
 format or run inference with them, refer to their documentation.
2 changes: 1 addition & 1 deletion ml-agents-envs/setup.py
@@ -48,7 +48,7 @@ def run(self):
     install_requires=[
         "cloudpickle",
         "grpcio>=1.11.0",
-        "numpy>=1.14.1,<1.19.0",
+        "numpy>=1.14.1",
         "Pillow>=4.2.1",
         "protobuf>=3.6",
         "pyyaml>=3.1.0",
4 changes: 0 additions & 4 deletions ml-agents/mlagents/tf_utils/__init__.py

This file was deleted.

60 changes: 0 additions & 60 deletions ml-agents/mlagents/tf_utils/tf.py

This file was deleted.

25 changes: 19 additions & 6 deletions ml-agents/mlagents/trainers/cli_utils.py
@@ -4,6 +4,21 @@
 from mlagents.trainers.exception import TrainerConfigError
 from mlagents_envs.environment import UnityEnvironment
 import argparse
+from mlagents_envs import logging_util
+
+logger = logging_util.get_logger(__name__)
+
+
+class RaiseRemovedWarning(argparse.Action):
+    """
+    Internal custom Action to raise warning when argument is called.
+    """
+
+    def __init__(self, nargs=0, **kwargs):
+        super().__init__(nargs=nargs, **kwargs)
+
+    def __call__(self, arg_parser, namespace, values, option_string=None):
+        logger.warning(f"The command line argument {option_string} was removed.")


 class DetectDefault(argparse.Action):
@@ -171,16 +186,14 @@ def _create_parser() -> argparse.ArgumentParser:
     argparser.add_argument(
         "--torch",
         default=False,
-        action=DetectDefaultStoreTrue,
-        help="Use the PyTorch framework. Note that this option is not required anymore as PyTorch is the"
-        "default framework, and will be removed in the next release.",
+        action=RaiseRemovedWarning,
+        help="(Removed) Use the PyTorch framework.",
     )
     argparser.add_argument(
         "--tensorflow",
         default=False,
-        action=DetectDefaultStoreTrue,
-        help="(Deprecated) Use the TensorFlow framework instead of PyTorch. Install TensorFlow "
-        "before using this option.",
+        action=RaiseRemovedWarning,
+        help="(Removed) Use the TensorFlow framework.",
     )

     eng_conf = argparser.add_argument_group(title="Engine Configuration")
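
Reviewer note: a quick way to see what `RaiseRemovedWarning` does once it is wired into a parser. This standalone sketch copies the action from the hunk above but swaps `mlagents_envs.logging_util` for the standard `logging` module so it runs without ML-Agents installed. `build_parser` and the `--run-id` argument with its default are assumptions made for the example, not the real `_create_parser`.

```python
import argparse
import logging

# Assumption: plain stdlib logger in place of mlagents_envs.logging_util.
logger = logging.getLogger("cli_utils_example")


class RaiseRemovedWarning(argparse.Action):
    """Custom Action that only logs a warning when a removed flag is passed."""

    def __init__(self, nargs=0, **kwargs):
        # nargs=0 makes the option a bare flag that consumes no value.
        super().__init__(nargs=nargs, **kwargs)

    def __call__(self, arg_parser, namespace, values, option_string=None):
        logger.warning(f"The command line argument {option_string} was removed.")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical, cut-down stand-in for the real parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--run-id", default="ppo")
    for flag in ("--torch", "--tensorflow"):
        parser.add_argument(
            flag, default=False, action=RaiseRemovedWarning, help="(Removed)"
        )
    return parser


args = build_parser().parse_args(["--tensorflow", "--run-id", "demo"])
print(args.tensorflow, args.run_id)  # prints: False demo
```

Because the action never writes to the namespace, `args.tensorflow` keeps its `default=False`. Old command lines that still pass `--torch` or `--tensorflow` therefore keep parsing and only produce a warning.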
7 changes: 1 addition & 6 deletions ml-agents/mlagents/trainers/learn.py
@@ -10,7 +10,6 @@

 import mlagents.trainers
 import mlagents_envs
-from mlagents import tf_utils
 from mlagents.trainers.trainer_controller import TrainerController
 from mlagents.trainers.environment_parameter_manager import EnvironmentParameterManager
 from mlagents.trainers.trainer import TrainerFactory
@@ -21,7 +20,7 @@
     GaugeWriter,
     ConsoleWriter,
 )
-from mlagents.trainers.cli_utils import parser, DetectDefault
+from mlagents.trainers.cli_utils import parser
 from mlagents_envs.environment import UnityEnvironment
 from mlagents.trainers.settings import RunOptions

@@ -135,8 +134,6 @@ def run_training(run_seed: int, options: RunOptions) -> None:
             param_manager=env_parameter_manager,
             init_path=maybe_init_path,
             multi_gpu=False,
-            force_torch="torch" in DetectDefault.non_default_args,
-            force_tensorflow="tensorflow" in DetectDefault.non_default_args,
         )
         # Create controller and begin training.
         tc = TrainerController(
@@ -242,8 +239,6 @@ def run_cli(options: RunOptions) -> None:
         log_level = logging_util.DEBUG
     else:
         log_level = logging_util.INFO
-        # disable noisy warnings from tensorflow
-        tf_utils.set_warnings_enabled(False)

     logging_util.set_log_level(log_level)
