PyTorch CUDA Upgrade to 11.7 and Decommission 11.3 and 10.2 #1042

Open
@atalman

Description

This issue tracks progress on adding CUDA 11.7 support and decommissioning legacy CUDA versions.

CUDA Support Matrix as of PyTorch 1.12

CUDA   cuDNN      Additional details
10.2   7.6.5.32   Legacy CUDA release, to be decommissioned (issue)
11.3   8.3.2.44   Stable CUDA release
11.6   8.3.2.44   Latest CUDA release
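
For scripting against this matrix, the rows above can be written down as a small lookup table. This is only a sketch: the dictionary layout, the status labels and the helper name `cudnn_for` are assumptions made for illustration and are not taken from the builder repository.

```python
# Support matrix as listed in this issue for PyTorch 1.12.
# The layout and the helper below are illustrative assumptions, not builder code.
SUPPORT_MATRIX_1_12 = {
    "10.2": {"cudnn": "7.6.5.32", "status": "legacy"},
    "11.3": {"cudnn": "8.3.2.44", "status": "stable"},
    "11.6": {"cudnn": "8.3.2.44", "status": "latest"},
}


def cudnn_for(cuda_version: str) -> str:
    """Return the cuDNN version paired with a CUDA version in the matrix."""
    try:
        return SUPPORT_MATRIX_1_12[cuda_version]["cudnn"]
    except KeyError:
        # Versions not in the matrix (for example 11.7 before the upgrade)
        # are reported explicitly instead of being guessed.
        raise ValueError(f"CUDA {cuda_version} is not in the 1.12 support matrix") from None


if __name__ == "__main__":
    for version, row in SUPPORT_MATRIX_1_12.items():
        print(f"CUDA {version}: cuDNN {row['cudnn']} ({row['status']})")
```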

Pre CUDA 11.7 Upgrade

This work is required to move CUDA 11.6 to Stable, and we want to address it before the CUDA 11.7 upgrade.

  • Follow up on cudatoolkit usage across PyTorch projects: pytorch#69691 (Conda-forge dependency for 11.6 for cudatoolkit). In short: since CUDA 11.5, cudatoolkit is only available on the conda-forge channel. We should migrate from cudatoolkit to cuda and drop the conda-forge dependency from pytorch, torchvision and torchaudio. This work should be scheduled and addressed as soon as we cut release 1.12 for pytorch and all domain libraries.

Decommission CUDA 10.2

We ultimately want to address this before 11.7, but it can also be done in parallel with the CUDA 11.7 upgrade.

Upgrade CUDA 11.7

As per https://github.com/pytorch/builder/blob/main/CUDA_UPGRADE_GUIDE.MD

  • Installing to conda-builder and libtorch containers
    • Push pytorch/conda-builder
    • Push the libtorch image
  • Add setup to manywheels
    • Push pytorch/manylinux-builder
  • Update MAGMA
    • Push magma-cuda117 to conda
    • Add magma for windows into our S3
  • Add Windows builder for 11.7
    • Check if driver needs to be updated
    • Add any fixes that come up
  • Include CUDA 11.7 into our nightly matrix
    • Update conda build_pytorch.sh script and add conda binaries
    • Windows
    • Linux
    • MacOS
    • Add any fixes that come up
  • Create 11.7 CI
    • Windows
    • Linux + add MAGMA to CI conda
  • Add 11.7 to torchvision CI
  • Add 11.7 to torchaudio CI
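
The "include CUDA 11.7 into our nightly matrix" step can be pictured as adding one version to a per-platform list of CUDA builds. The sketch below is hypothetical: the starting matrix, the platform names and the functions `add_cuda_version` and `expand` are made up for illustration and do not mirror the actual builder scripts or CI configuration. Which platforms actually receive CUDA builds is left to the caller.

```python
from typing import Dict, List, Tuple

Matrix = Dict[str, List[str]]


def add_cuda_version(matrix: Matrix, cuda_version: str, platforms: List[str]) -> Matrix:
    """Return a copy of the matrix with cuda_version added for each platform."""
    updated = {platform: list(versions) for platform, versions in matrix.items()}
    for platform in platforms:
        versions = updated.setdefault(platform, [])
        if cuda_version not in versions:  # keep the operation idempotent
            versions.append(cuda_version)
    return updated


def expand(matrix: Matrix) -> List[Tuple[str, str]]:
    """Flatten the matrix into (platform, cuda_version) build jobs."""
    return [(platform, version) for platform in sorted(matrix) for version in matrix[platform]]


if __name__ == "__main__":
    # Assumed starting point: the three versions from the 1.12 support matrix.
    nightly = {"linux": ["10.2", "11.3", "11.6"], "windows": ["10.2", "11.3", "11.6"]}
    nightly = add_cuda_version(nightly, "11.7", ["linux", "windows"])
    for job in expand(nightly):
        print(job)
```

Returning a new dictionary rather than mutating the input keeps the pre-upgrade matrix available for comparison while the 11.7 builds are still being validated.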

Past Issues to be Resolved by upgrade (needs to be retested)

Post CUDA 11.7 Upgrade

Target End State

CUDA 11.6 - Stable, CUDA 11.7 - Latest Experimental
CUDA 10.2 and CUDA 11.3 Decommissioned
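
As a rough illustration, the difference between the 1.12 support matrix and this target end state can be computed as a set of actions. The status labels and the function name `plan_transition` are assumptions chosen for this sketch and are not part of any PyTorch tooling.

```python
from typing import Dict, List

# Status per CUDA version, taken from the support matrix and the
# target end state in this issue; the label strings are illustrative.
CURRENT_STATE = {"10.2": "legacy", "11.3": "stable", "11.6": "latest"}
TARGET_STATE = {"11.6": "stable", "11.7": "latest experimental"}


def plan_transition(current: Dict[str, str], target: Dict[str, str]) -> Dict[str, List[str]]:
    """List which versions are decommissioned, added, or change status."""
    return {
        "decommission": sorted(set(current) - set(target)),
        "add": sorted(set(target) - set(current)),
        "change_status": sorted(
            version
            for version in current
            if version in target and current[version] != target[version]
        ),
    }


if __name__ == "__main__":
    for action, versions in plan_transition(CURRENT_STATE, TARGET_STATE).items():
        print(f"{action}: {', '.join(versions)}")
```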

BE tasks for Meta Team

cc @ptrblck @malfet @seemethere @ezyang @pytorch/pytorch-dev-infra @ngimel
