Error when trying to use the latest-gpu container inside a GitHub Actions workflow #1428

Open

Description

Workflow is here:

https://github.com/iterative/example-get-started-experiments/blob/main/.github/workflows/dvc-studio.yml

Example failure is here:

https://github.com/iterative/example-get-started-experiments/actions/runs/6310277365/job/17131981606

  Status: Downloaded newer image for iterativeai/cml:latest-gpu
  docker.io/iterativeai/cml:latest-gpu
  /usr/bin/docker create --name d36559e92e4847fcb5d0a04521f541f1_iterativeaicmllatestgpu_f5f4d0 --label 70c3d0 --workdir /__w/example-get-started-experiments/example-get-started-experiments --network github_network_5dc1361cfcd641c69071c53a762bc452 --gpus all --ipc host -e "HOME=/github/home" -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work":"/__w" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/externals":"/__e":ro -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_temp":"/__w/_temp" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_actions":"/__w/_actions" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_tool":"/__w/_tool" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_temp/_github_home":"/github/home" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_temp/_github_workflow":"/github/workflow" --entrypoint "tail" iterativeai/cml:latest-gpu "-f" "/dev/null"
  e45972c2305532a031a11328f784587dd0ec9b98581fdfab529d350955e6a2ba
  /usr/bin/docker start e45972c2305532a031a11328f784587dd0ec9b98581fdfab529d350955e6a2ba
  Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
  nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown
  Error: failed to start containers: e45972c2305532a031a11328f784587dd0ec9b98581fdfab529d350955e6a2ba
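
The `nvml error: driver not loaded` line usually indicates that the NVIDIA kernel driver is not loaded on the host running Docker (here, the self-hosted runner), rather than a problem inside the image itself. As a rough way to confirm that on the runner before the job container starts, here is a minimal sketch. The function names `nvidia_driver_loaded` and `gpu_preflight` are hypothetical and not part of CML; the check assumes a Linux host where the loaded driver exposes `/proc/driver/nvidia/version`.

```python
import shutil
from pathlib import Path

# Assumption: on Linux, the NVIDIA kernel module exposes this file once loaded.
DEFAULT_PROC_PATH = "/proc/driver/nvidia/version"


def nvidia_driver_loaded(proc_path=DEFAULT_PROC_PATH):
    """Return True if the NVIDIA kernel driver appears to be loaded on the host."""
    return Path(proc_path).is_file()


def gpu_preflight(proc_path=DEFAULT_PROC_PATH, smi_name="nvidia-smi"):
    """Collect human-readable problems that would likely make `--gpus all` fail."""
    problems = []
    if not nvidia_driver_loaded(proc_path):
        problems.append(f"NVIDIA driver not loaded ({proc_path} is missing)")
    if shutil.which(smi_name) is None:
        problems.append(f"{smi_name} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = gpu_preflight()
    if issues:
        for issue in issues:
            print("GPU preflight:", issue)
    else:
        print("GPU preflight: driver and nvidia-smi look available")
```

Running this on the runner host (not inside the container) should show whether the driver is missing there, which would be consistent with the failure above.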