Fix some serious bugs in the NGC image (cresset-template#129)
* Fix bug in the NGC Docker image where the year and month variables were not being passed from the docker-compose.yaml file.

* Update default NGC image to the one released in April 2023.

* Remove redundant `pytest` requirement.

* Fix serious bug where `--ignore-installed` was mistaken for an option that always leaves existing packages as-is. Unfortunately, `pip` has no such option.
By default, `pip` updates an installed package only if a version incompatibility is found.

* Reformat the README.md file.
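
The first two items concern how the NGC base image tag is assembled from `NGC_YEAR` and `NGC_MONTH`. As a rough illustration only, the Python sketch below mimics the `${VAR:-default}` interpolation that `docker-compose.yaml` uses after this commit (defaults `23` and `04`). The helper names `expand_default` and `ngc_base_image` are hypothetical and are not part of the repository.

```python
import os


def expand_default(env, name, default):
    """Mimic `${NAME:-default}`: fall back when NAME is unset or empty."""
    value = env.get(name, "")
    return value if value else default


def ngc_base_image(env):
    """Build the NGC PyTorch image tag the way the compose defaults would.

    The defaults mirror the values in this commit's docker-compose.yaml;
    this function is an illustrative stand-in, not code from the repository.
    """
    year = expand_default(env, "NGC_YEAR", "23")
    month = expand_default(env, "NGC_MONTH", "04")
    return f"nvcr.io/nvidia/pytorch:{year}.{month}-py3"


if __name__ == "__main__":
    # Show which tag the current shell environment would select.
    print(ngc_base_image(dict(os.environ)))
```

Running the script with `NGC_YEAR` and `NGC_MONTH` exported in the shell shows which tag would be selected, which may help confirm that overrides actually reach the build.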
veritas9872 authored May 1, 2023
1 parent e6f196f commit cc26208
Showing 4 changed files with 14 additions and 8 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -498,7 +498,7 @@ which is useful if `sudo` permissions are unavailable on the host.
 Also, when one user switches between multiple Cresset-based containers
 on a single machine, VSCode may not be able to find the container workspace.
 This is because the `docker-compose.yaml` file mounts the host's
-`~/.vscode-server` directory to the `/home/${USR}/.vscode-server` directory
+`~/.vscode-server` directory to the `/home/${USR}/.vscode-server` directory
 of all containers to preserve VSCode extensions between containers.
 To fix this issue, create a new directory on the host
 to mount the containers' `.vscode-server` directories.
@@ -529,8 +529,8 @@ For other VSCode problems, try deleting `~/.vscode-server` on the host.
 networking issues during installation. Updating git submodules is
 [not fail-safe](https://stackoverflow.com/a/8573310/9289275).

-4. `torch.cuda.is_available()` will return a
-`... UserWarning: CUDA initialization:...`
+4. `torch.cuda.is_available()` will return a
+`... UserWarning: CUDA initialization:...`
 error or the image will simply not start if the host CUDA driver is
 incompatible with the CUDA version on the Docker image.
 Either upgrade the host CUDA driver or downgrade the CUDA version of the image.
2 changes: 1 addition & 1 deletion docker-compose.yaml
@@ -161,7 +161,7 @@ services:
       dockerfile: dockerfiles/ngc.Dockerfile
       args:
         NGC_YEAR: ${NGC_YEAR:-23}
-        NGC_MONTH: ${NGC_MONTH:-03}
+        NGC_MONTH: ${NGC_MONTH:-04}

   hub: # Service based on the official PyTorch Docker images from Docker Hub.
     extends: # Available images: https://hub.docker.com/r/pytorch/pytorch/tags
10 changes: 7 additions & 3 deletions dockerfiles/ngc.Dockerfile
@@ -1,9 +1,11 @@
 # syntax = docker/dockerfile:1
 # The top line is used by BuildKit. _**DO NOT ERASE IT**_.

+ARG NGC_YEAR
+ARG NGC_MONTH
 ARG INTERACTIVE_MODE
 ARG GIT_IMAGE=bitnami/git:latest
-ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:${NGC_YEAR:-23}.${NGC_MONTH:-03}-py3
+ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:${NGC_YEAR}.${NGC_MONTH}-py3

 ########################################################################
 FROM ${GIT_IMAGE} AS stash
@@ -45,11 +47,13 @@ RUN --mount=type=bind,from=stash,source=/tmp/apt,target=/tmp/apt \
     rm -rf /var/lib/apt/lists/*

 # Use `sudo` to install new `pip` packages during development if necessary.
-# Previous installations are preserved via the `--ignore-installed` flag.
+# Note that new `pip` packages may overwrite existing packages if incompatible.
+# Check the installed packages before and after `pip` installation and minimize
+# the number of requirements to keep overwriting to a minimum.
 ARG PIP_CACHE_DIR=/root/.cache/pip
 RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
     --mount=type=bind,from=stash,source=/tmp/req,target=/tmp/req \
-    python -m pip install --ignore-installed -r /tmp/req/requirements.txt && ldconfig
+    python -m pip install -r /tmp/req/requirements.txt && ldconfig

 # Enable Intel MKL optimizations on AMD CPUs.
 # https://danieldk.eu/Posts/2020-08-31-MKL-Zen.html
4 changes: 3 additions & 1 deletion reqs/ngc-pip.requirements.txt
@@ -1,2 +1,4 @@
+# Pre-existing Python packages may be overwritten by new packages.
+# Minimize the number of requirements and check the installed packages
+# before and after `pip` installation to find any discrepancies.
 hydra-core
-pytest
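
The new comments recommend checking the installed packages before and after `pip` installation. One possible way to do that, sketched here under assumptions, is to snapshot `pip freeze` twice and compare the results. The helpers `parse_freeze`, `diff_packages`, and `snapshot` are hypothetical names, and the version numbers in the usage note are made up for illustration.

```python
import subprocess
import sys


def parse_freeze(text):
    """Parse `pip freeze` output into {name: version}.

    Only `name==version` lines are kept; comments, editable installs, and
    `name @ url` entries are skipped in this simplified sketch.
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_packages(before, after):
    """Return (added, removed, changed) between two package snapshots."""
    added = {n: v for n, v in after.items() if n not in before}
    removed = {n: v for n, v in before.items() if n not in after}
    changed = {
        n: (before[n], after[n])
        for n in before
        if n in after and before[n] != after[n]
    }
    return added, removed, changed


def snapshot():
    """Capture the current environment by running `python -m pip freeze`."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_freeze(result.stdout)
```

A typical use would be to call `snapshot()` before and after installing the requirements and pass both results to `diff_packages`. For example, with made-up snapshots `{"numpy": "1.22.2"}` and `{"numpy": "1.24.3", "hydra-core": "1.3.2"}`, the `changed` mapping would report `numpy` and the `added` mapping would report `hydra-core`.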
