lfu77 commented Oct 23, 2025

This PR adds a Makefile recipe to build a Determined base image with CUDA 12.9.1 and PyTorch 2.8.0.

I encountered some issues trying to build the image with PyTorch 2.9.0. I believe we should be overriding this anyway when we build the actual augment images, though.

I was also unable to add 10.0 to TORCH_CUDA_ARCH_LIST. I think 10.0 should be the Blackwell compute capability.
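For anyone checking which architectures a given TORCH_CUDA_ARCH_LIST value covers, here is a minimal sketch. It assumes the usual format of semicolon- or space-separated compute capabilities with an optional `+PTX` suffix. The helper names `parse_arch_list` and `has_arch` and the example value are hypothetical and are not taken from this PR's Makefile.

```python
import re


def parse_arch_list(value: str) -> list[str]:
    """Split a TORCH_CUDA_ARCH_LIST-style string into bare capability strings.

    Entries may be separated by semicolons or whitespace, and an entry may
    carry a "+PTX" suffix, which is dropped here.
    """
    entries = [e for e in re.split(r"[;\s]+", value.strip()) if e]
    return [e.removesuffix("+PTX") for e in entries]


def has_arch(value: str, capability: str) -> bool:
    """Return True if the given capability (e.g. "10.0") appears in the list."""
    return capability in parse_arch_list(value)


# Hypothetical value for illustration only; not the list used in this PR.
example = "8.0;9.0+PTX"
print(parse_arch_list(example))
print(has_arch(example, "10.0"))
```

This only inspects the string. It does not tell you whether the installed CUDA toolkit or PyTorch source tree can actually compile for that architecture.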

TESTED:

  • Ran make build-gpt-neox-deepspeed-gpu-torch-280, then docker run -it 77824367d1e6 /bin/bash, and checked that the nvcc version is 12.9.1
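The manual check above could be scripted roughly as follows. This is a sketch under assumptions: it assumes `nvcc` is on the PATH inside the image and that the image's entrypoint lets a command be passed through `docker run`. The function names are hypothetical. Note that the `release` field in `nvcc --version` output typically carries only major.minor, so the sketch compares against 12.9 rather than 12.9.1.

```python
import re
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'release X.Y' field of nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_release_in_image(image: str) -> Optional[Tuple[int, int]]:
    """Run nvcc --version inside a throwaway container and parse the release.

    Assumes Docker is installed and that nvcc is on the PATH in the image.
    """
    result = subprocess.run(
        ["docker", "run", "--rm", image, "nvcc", "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_nvcc_release(result.stdout)


# Illustrative line in the general shape nvcc prints; not captured from this image.
sample = "Cuda compilation tools, release 12.9"
print(parse_nvcc_release(sample) == (12, 9))
```

With the image from the test above, the call would be `nvcc_release_in_image("77824367d1e6")`.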

lfu77 requested a review from mmonaco October 23, 2025 18:58
lfu77 closed this Oct 24, 2025