KEP-2401: Determine the tag for torchtune trainer & Add support for multiple accelerators #2518

Open
@Electronic-Waste

What you would like to be added?

As we discussed in #2516 (comment), we need to:

  • Decide what the torchtune-trainer image tag should encode, such as the versions of PyTorch, torchtune, CUDA, and cuDNN.
  • Add support for multiple accelerators in torchtune-trainer, such as AMD GPUs and Ascend NPUs, and reflect the accelerator in the image tag.
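
To make the two items above easier to discuss, here is a minimal sketch of one possible tag scheme. Everything in it is an assumption for illustration: the `TrainerImageSpec` class, the field order, the accelerator names (`cuda`, `rocm`, `cann`), and the version numbers are hypothetical placeholders, not a decided format. The only fixed rule it relies on is the Docker tag constraint (ASCII letters, digits, `_`, `.`, `-`; no leading `.` or `-`; at most 128 characters).

```python
import re
from dataclasses import dataclass

# Docker tag rule: letters, digits, '_', '.', '-'; must not start with
# '.' or '-'; at most 128 characters in total.
_TAG_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


@dataclass(frozen=True)
class TrainerImageSpec:
    """Hypothetical description of what a torchtune-trainer image contains."""

    torchtune: str            # torchtune version (placeholder values below)
    pytorch: str              # PyTorch version
    accelerator: str          # assumed names: "cuda", "rocm", "cann"
    accelerator_version: str  # version of the accelerator toolkit

    def tag(self) -> str:
        """Build a tag that encodes the library versions and the accelerator."""
        tag = (
            f"torchtune{self.torchtune}"
            f"-torch{self.pytorch}"
            f"-{self.accelerator}{self.accelerator_version}"
        )
        if not _TAG_RE.fullmatch(tag):
            raise ValueError(f"not a valid Docker image tag: {tag!r}")
        return tag


if __name__ == "__main__":
    # Placeholder versions, chosen only to show the shape of the tag.
    spec = TrainerImageSpec("0.6.0", "2.6.0", "cuda", "12.4")
    print(spec.tag())
```

A scheme like this keeps the accelerator as a single suffix, so adding another backend would only add a new suffix value rather than change the tag layout. Whether cuDNN should also appear in the tag is left open here.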

/cc @kubeflow/wg-training-leads @astefanutti @deepanker13 @saileshd1402 @seanlaii

Why is this needed?

This will give users clear information about what our torchtune-trainer image contains.

Love this feature?

Give it a 👍. We prioritize the features with the most 👍.
