chore: bumpenvs reflecting tf27 not having pytorch (determined-ai#6182)
mpkouznetsov authored Mar 9, 2023
1 parent 744bd1b commit 77f8ec4
Showing 37 changed files with 187 additions and 183 deletions.
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -182,11 +182,11 @@ commands:
- when:
condition: <<parameters.tf1>>
steps:
- run: docker pull determinedai/environments:py-3.7-pytorch-1.7-tf-1.15-cpu-7aa5364
- run: docker pull determinedai/environments:py-3.7-pytorch-1.7-tf-1.15-cpu-0e4beb5
- when:
condition: <<parameters.tf2>>
steps:
- run: docker pull determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-7aa5364
- run: docker pull determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5

login-docker:
parameters:
@@ -24,11 +24,11 @@ by default in this version of Determined are described below.
+-------------+-------------------------------------------------------------------------+
| Environment | File Name |
+=============+=========================================================================+
| CPUs | ``determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-7aa5364`` |
| CPUs | ``determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5`` |
+-------------+-------------------------------------------------------------------------+
| Nvidia GPUs | ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364`` |
| Nvidia GPUs | ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5`` |
+-------------+-------------------------------------------------------------------------+
| AMD GPUs | ``determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-7aa5364`` |
| AMD GPUs | ``determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-0e4beb5`` |
+-------------+-------------------------------------------------------------------------+

See :doc:`/training/setup-guide/set-environment-images` for the images Docker Hub location, and add
@@ -349,7 +349,7 @@ platform. There may be additional per-user configuration that is required.

.. code:: bash
image=determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364
image=determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5
cd /shared/enroot/images
enroot import docker://$image
enroot create /shared/enroot/images/${image//[\/:]/\+}.sqsh
4 changes: 2 additions & 2 deletions docs/training/apis-howto/overview.rst
@@ -72,13 +72,13 @@ Determined provides prebuilt Docker images that include TensorFlow 2.8, 1.15, an

- ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0.20.1`` (default)
- ``determinedai/environments:cuda-10.2-pytorch-1.7-tf-1.15-gpu-0.20.1``
- ``determinedai/environments:cuda-11.2-pytorch-1.12-tf-2.7-gpu-0.20.1``
- ``determinedai/environments:cuda-11.2-tf-2.7-gpu-0.20.1``

We also provide lightweight CPU-only counterparts:

- ``determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0.20.1``
- ``determinedai/environments:py-3.7-pytorch-1.7-tf-1.15-cpu-0.20.1``
- ``determinedai/environments:py-3.8-pytorch-1.12-tf-2.7-cpu-0.20.1``
- ``determinedai/environments:py-3.8-tf-2.7-cpu-0.20.1``

To change the container image used for an experiment, specify :ref:`environment.image
<exp-environment-image>` in the experiment configuration file. Please see :ref:`container-images`
12 changes: 6 additions & 6 deletions e2e_tests/tests/config.py
@@ -13,12 +13,12 @@
MAX_TRIAL_BUILD_SECS = 90


DEFAULT_TF1_CPU_IMAGE = "determinedai/environments:py-3.7-pytorch-1.7-tf-1.15-cpu-7aa5364"
DEFAULT_TF2_CPU_IMAGE = "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-7aa5364"
DEFAULT_TF1_GPU_IMAGE = "determinedai/environments:cuda-10.2-pytorch-1.7-tf-1.15-gpu-7aa5364"
DEFAULT_TF2_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364"
DEFAULT_PT_CPU_IMAGE = "determinedai/environments:py-3.8-pytorch-1.12-cpu-7aa5364"
DEFAULT_PT_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-gpu-7aa5364"
DEFAULT_TF1_CPU_IMAGE = "determinedai/environments:py-3.7-pytorch-1.7-tf-1.15-cpu-0e4beb5"
DEFAULT_TF2_CPU_IMAGE = "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5"
DEFAULT_TF1_GPU_IMAGE = "determinedai/environments:cuda-10.2-pytorch-1.7-tf-1.15-gpu-0e4beb5"
DEFAULT_TF2_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5"
DEFAULT_PT_CPU_IMAGE = "determinedai/environments:py-3.8-pytorch-1.12-cpu-0e4beb5"
DEFAULT_PT_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-gpu-0e4beb5"

TF1_CPU_IMAGE = os.environ.get("TF1_CPU_IMAGE") or DEFAULT_TF1_CPU_IMAGE
TF2_CPU_IMAGE = os.environ.get("TF2_CPU_IMAGE") or DEFAULT_TF2_CPU_IMAGE
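
The `os.environ.get(...) or DEFAULT_...` lines above pick an image from the environment and fall back to the pinned default. A minimal sketch of that pattern follows; the helper name `resolve_image` is hypothetical and does not appear in `e2e_tests/tests/config.py`.

```python
import os

# Default tag copied from the new side of the diff above.
DEFAULT_TF2_CPU_IMAGE = "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5"


def resolve_image(var_name: str, default: str) -> str:
    """Return the image named by the environment variable, or the default.

    Using `or` instead of a default argument to `get` means an empty
    string in the environment also falls back to the default.
    """
    return os.environ.get(var_name) or default


if __name__ == "__main__":
    # Prints the override if TF2_CPU_IMAGE is set and non-empty,
    # otherwise the pinned default.
    print(resolve_image("TF2_CPU_IMAGE", DEFAULT_TF2_CPU_IMAGE))
```

With this pattern, a CI job could presumably test a candidate image by exporting `TF2_CPU_IMAGE` without editing the file.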
2 changes: 1 addition & 1 deletion examples/computer_vision/unets_tf_keras/const.yaml
@@ -22,4 +22,4 @@ min_validation_period:
entrypoint: model_def:UNetsTrial
scheduling_unit: 57
environment:
image: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364
image: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5
2 changes: 1 addition & 1 deletion examples/computer_vision/unets_tf_keras/distributed.yaml
@@ -25,4 +25,4 @@ min_validation_period:
scheduling_unit: 8
entrypoint: model_def:UNetsTrial
environment:
image: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364
image: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5
@@ -14,7 +14,7 @@ environment:
# You may need to modify this to match your network configuration.
- NCCL_SOCKET_IFNAME=ens,eth,ib
image:
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-7aa5364
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-0e4beb5
bind_mounts:
- host_path: /tmp
container_path: /data
@@ -14,7 +14,7 @@ environment:
# You may need to modify this to match your network configuration.
- NCCL_SOCKET_IFNAME=ens,eth,ib
image:
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-7aa5364
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-0e4beb5
bind_mounts:
- host_path: /tmp
container_path: /data
2 changes: 1 addition & 1 deletion examples/deepspeed/cifar10_moe/moe.yaml
@@ -21,7 +21,7 @@ environment:
# - NCCL_BLOCKING_WAIT=1
# - NCCL_IB_DISABLE=1
image:
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-7aa5364
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-0e4beb5
bind_mounts:
- host_path: /tmp
container_path: /data
2 changes: 1 addition & 1 deletion examples/deepspeed/cifar10_moe/zero_stages.yaml
@@ -22,7 +22,7 @@ environment:
# - NCCL_BLOCKING_WAIT=1
# - NCCL_IB_DISABLE=1
image:
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-7aa5364
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-0e4beb5
bind_mounts:
- host_path: /tmp
container_path: /data
2 changes: 1 addition & 1 deletion examples/deepspeed/gpt_neox/Dockerfile
@@ -1,4 +1,4 @@
FROM determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-gpt-neox-deepspeed-gpu-7aa5364
FROM determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-gpt-neox-deepspeed-gpu-0e4beb5

# Install deepspeed & dependencies
RUN apt-get install -y mpich
2 changes: 1 addition & 1 deletion examples/deepspeed/pipeline_parallelism/distributed.yaml
@@ -20,7 +20,7 @@ environment:
# - CUDA_LAUNCH_BLOCKING=1
# - NCCL_BLOCKING_WAIT=1
image:
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-7aa5364
gpu: determinedai/environments:cuda-11.3-pytorch-1.10-tf-2.8-deepspeed-0.7.0-gpu-0e4beb5
resources:
slots_per_trial: 2
records_per_epoch: 50000
2 changes: 1 addition & 1 deletion examples/graphs/proteins_pytorch_geometric/adaptive.yaml
@@ -32,4 +32,4 @@ searcher:
entrypoint: model_def:GraphConvTrial
environment:
image:
cuda: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364
cuda: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5
2 changes: 1 addition & 1 deletion examples/graphs/proteins_pytorch_geometric/const.yaml
@@ -18,4 +18,4 @@ searcher:
entrypoint: model_def:GraphConvTrial
environment:
image:
cuda: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364
cuda: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5
@@ -18,6 +18,6 @@ searcher:
entrypoint: model_def:GraphConvTrial
environment:
image:
cuda: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364
cuda: determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5
resources:
slots_per_trial: 4
6 changes: 3 additions & 3 deletions harness/determined/common/schemas/expconf/_v0.py
@@ -223,12 +223,12 @@ def from_dict(cls, d: Union[dict, str], prevalidated: bool = False) -> "Environm

def runtime_defaults(self) -> None:
if self.cpu is None:
self.cpu = "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-7aa5364"
self.cpu = "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5"
if self.rocm is None:
self.rocm = "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-7aa5364"
self.rocm = "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-0e4beb5"

if self.cuda is None:
self.cuda = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364"
self.cuda = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5"
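
The `runtime_defaults` hunk above fills in an image only for slots that are still `None`. A minimal sketch of that behavior, assuming a simplified stand-in class: `ImageConfig` and the `DEFAULT_*` constants are hypothetical names for illustration, not the schema class in `_v0.py`.

```python
from dataclasses import dataclass
from typing import Optional

# Default tags copied from the new side of the diff above.
DEFAULT_CPU = "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5"
DEFAULT_ROCM = "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-0e4beb5"
DEFAULT_CUDA = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5"


@dataclass
class ImageConfig:  # hypothetical stand-in for the real schema class
    cpu: Optional[str] = None
    rocm: Optional[str] = None
    cuda: Optional[str] = None

    def runtime_defaults(self) -> None:
        # Only slots the user left unset (None) receive a default;
        # explicit values are never overwritten.
        if self.cpu is None:
            self.cpu = DEFAULT_CPU
        if self.rocm is None:
            self.rocm = DEFAULT_ROCM
        if self.cuda is None:
            self.cuda = DEFAULT_CUDA


if __name__ == "__main__":
    cfg = ImageConfig(cuda="example/custom:tag")
    cfg.runtime_defaults()
    print(cfg)
```

Because each slot is checked separately, a user who overrides only the CUDA image would still pick up the bumped CPU and ROCm defaults.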


class EnvironmentVariablesV0(schemas.SchemaBase):
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/efs.yaml
@@ -3,35 +3,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-0c7cb70d3eb61492b
Agent: ami-09c9ea05363b4fe2d
Agent: ami-07ff017fdcb873d21
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-003bb1772f36a39a3
# Agent: ami-07b13ec47230370d5
# Agent: ami-0f4325d6bb0693ea8
# ap-southeast-1:
# Master: ami-09f03fa5572692399
# Agent: ami-07affc5a669acb69f
# Agent: ami-08e217c995685b256
# ap-southeast-2:
# Master: ami-06139e5e22cc2f7b1
# Agent: ami-0ef38d26d9f61a0db
# Agent: ami-032ab618bc592a2f3
eu-central-1:
Master: ami-0b81e95bb0a06ea8c
Agent: ami-0c09eb6495c53c53c
Agent: ami-0108d2301feb1d1a6
eu-west-1:
Master: ami-029cfca952b331b52
Agent: ami-06de67607d7b543dd
Agent: ami-051b575f3d38c142f
# eu-west-2:
# Master: ami-035469b606478d63d
# Agent: ami-0130ec158c7260bf1
# Agent: ami-086a85853ca268f19
us-east-1:
Master: ami-0b93ce03dcbcb10f6
Agent: ami-027145fb15545cd96
Agent: ami-03ec29f0d7c14e20c
us-east-2:
Master: ami-0cbea92f2377277a4
Agent: ami-0fe46b7b5fdba7b51
Agent: ami-06332e9495a7eebda
us-west-2:
Master: ami-0d31d7c9fc9503726
Agent: ami-03bad0b25097fe67d
Agent: ami-016612711ba734ade

Parameters:
VpcCIDR:
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/fsx.yaml
@@ -3,35 +3,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-0c7cb70d3eb61492b
Agent: ami-09c9ea05363b4fe2d
Agent: ami-07ff017fdcb873d21
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-003bb1772f36a39a3
# Agent: ami-07b13ec47230370d5
# Agent: ami-0f4325d6bb0693ea8
# ap-southeast-1:
# Master: ami-09f03fa5572692399
# Agent: ami-07affc5a669acb69f
# Agent: ami-08e217c995685b256
# ap-southeast-2:
# Master: ami-06139e5e22cc2f7b1
# Agent: ami-0ef38d26d9f61a0db
# Agent: ami-032ab618bc592a2f3
eu-central-1:
Master: ami-0b81e95bb0a06ea8c
Agent: ami-0c09eb6495c53c53c
Agent: ami-0108d2301feb1d1a6
eu-west-1:
Master: ami-029cfca952b331b52
Agent: ami-06de67607d7b543dd
Agent: ami-051b575f3d38c142f
# eu-west-2:
# Master: ami-035469b606478d63d
# Agent: ami-0130ec158c7260bf1
# Agent: ami-086a85853ca268f19
us-east-1:
Master: ami-0b93ce03dcbcb10f6
Agent: ami-027145fb15545cd96
Agent: ami-03ec29f0d7c14e20c
us-east-2:
Master: ami-0cbea92f2377277a4
Agent: ami-0fe46b7b5fdba7b51
Agent: ami-06332e9495a7eebda
us-west-2:
Master: ami-0d31d7c9fc9503726
Agent: ami-03bad0b25097fe67d
Agent: ami-016612711ba734ade

Parameters:
VpcCIDR:
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/secure.yaml
@@ -4,44 +4,44 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-0c7cb70d3eb61492b
Agent: ami-09c9ea05363b4fe2d
Agent: ami-07ff017fdcb873d21
Bastion: ami-0c7cb70d3eb61492b
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-003bb1772f36a39a3
# Agent: ami-07b13ec47230370d5
# Agent: ami-0f4325d6bb0693ea8
# Bastion: ami-003bb1772f36a39a3
# ap-southeast-1:
# Master: ami-09f03fa5572692399
# Agent: ami-07affc5a669acb69f
# Agent: ami-08e217c995685b256
# Bastion: ami-09f03fa5572692399
# ap-southeast-2:
# Master: ami-06139e5e22cc2f7b1
# Agent: ami-0ef38d26d9f61a0db
# Agent: ami-032ab618bc592a2f3
# Bastion: ami-06139e5e22cc2f7b1
eu-central-1:
Master: ami-0b81e95bb0a06ea8c
Agent: ami-0c09eb6495c53c53c
Agent: ami-0108d2301feb1d1a6
Bastion: ami-0b81e95bb0a06ea8c
eu-west-1:
Master: ami-029cfca952b331b52
Agent: ami-06de67607d7b543dd
Agent: ami-051b575f3d38c142f
Bastion: ami-029cfca952b331b52
# eu-west-2:
# Master: ami-035469b606478d63d
# Agent: ami-0130ec158c7260bf1
# Agent: ami-086a85853ca268f19
# Bastion: ami-035469b606478d63d
us-east-1:
Master: ami-0b93ce03dcbcb10f6
Agent: ami-027145fb15545cd96
Agent: ami-03ec29f0d7c14e20c
Bastion: ami-0b93ce03dcbcb10f6
us-east-2:
Master: ami-0cbea92f2377277a4
Agent: ami-0fe46b7b5fdba7b51
Agent: ami-06332e9495a7eebda
Bastion: ami-0cbea92f2377277a4
us-west-2:
Master: ami-0d31d7c9fc9503726
Agent: ami-03bad0b25097fe67d
Agent: ami-016612711ba734ade
Bastion: ami-0d31d7c9fc9503726

Parameters:
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/simple.yaml
@@ -5,35 +5,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-0c7cb70d3eb61492b
Agent: ami-09c9ea05363b4fe2d
Agent: ami-07ff017fdcb873d21
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-003bb1772f36a39a3
# Agent: ami-07b13ec47230370d5
# Agent: ami-0f4325d6bb0693ea8
# ap-southeast-1:
# Master: ami-09f03fa5572692399
# Agent: ami-07affc5a669acb69f
# Agent: ami-08e217c995685b256
# ap-southeast-2:
# Master: ami-06139e5e22cc2f7b1
# Agent: ami-0ef38d26d9f61a0db
# Agent: ami-032ab618bc592a2f3
eu-central-1:
Master: ami-0b81e95bb0a06ea8c
Agent: ami-0c09eb6495c53c53c
Agent: ami-0108d2301feb1d1a6
eu-west-1:
Master: ami-029cfca952b331b52
Agent: ami-06de67607d7b543dd
Agent: ami-051b575f3d38c142f
# eu-west-2:
# Master: ami-035469b606478d63d
# Agent: ami-0130ec158c7260bf1
# Agent: ami-086a85853ca268f19
us-east-1:
Master: ami-0b93ce03dcbcb10f6
Agent: ami-027145fb15545cd96
Agent: ami-03ec29f0d7c14e20c
us-east-2:
Master: ami-0cbea92f2377277a4
Agent: ami-0fe46b7b5fdba7b51
Agent: ami-06332e9495a7eebda
us-west-2:
Master: ami-0d31d7c9fc9503726
Agent: ami-03bad0b25097fe67d
Agent: ami-016612711ba734ade

Parameters:
Keypair:
2 changes: 1 addition & 1 deletion harness/determined/deploy/gcp/constants.py
@@ -3,7 +3,7 @@ class defaults:
AUX_AGENT_INSTANCE_TYPE = "n1-standard-4"
COMPUTE_AGENT_INSTANCE_TYPE = "n1-standard-32"
DB_PASSWORD = "postgres"
ENVIRONMENT_IMAGE = "det-environments-7aa5364"
ENVIRONMENT_IMAGE = "det-environments-0e4beb5"
GPU_NUM = 8
GPU_TYPE = "nvidia-tesla-k80"
MASTER_INSTANCE_TYPE = "n1-standard-2"
@@ -39,9 +39,9 @@
},
"force_pull_image": false,
"image": {
"cpu": "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-7aa5364",
"cuda": "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364",
"rocm": "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-7aa5364"
"cpu": "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5",
"cuda": "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5",
"rocm": "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-0e4beb5"
},
"pod_spec": null,
"ports": {},
@@ -39,9 +39,9 @@
},
"force_pull_image": false,
"image": {
"cpu": "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-7aa5364",
"cuda": "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-7aa5364",
"rocm": "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-7aa5364"
"cpu": "determinedai/environments:py-3.8-pytorch-1.12-tf-2.8-cpu-0e4beb5",
"cuda": "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.8-gpu-0e4beb5",
"rocm": "determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-0e4beb5"
},
"pod_spec": null,
"ports": {},