GPU detection: Not only check for runtime, but also number of GPUs #10307

Open
chrmarti opened this issue Sep 24, 2024 · 3 comments
@chrmarti
Contributor

DevContainers v0.386.0 (pre-release)

Hello,

It seems that this feature is still broken (v0.386.0). If I create a remote machine (GCP) with a GPU and a fully installed NVIDIA stack, I can build and run the dev container using

    "hostRequirements": {
        "gpu": "optional"
    },

But if I remove the GPU from my remote machine, I can no longer start the Docker container: the extension claims to have detected a GPU even though none is attached.

The dev container console output is:

    [21551 ms] Start: Run: docker info -f {{.Runtimes.nvidia}}
    [21755 ms] GPU support found, add GPU flags to docker call.
    ...

If I run the command used in your TypeScript sources on the machine (which no longer has a GPU), I get:

    {nvidia-container-runtime [] }

I think you are only checking whether the nvidia-container-runtime is available, not whether an actual GPU is attached.

const runtimeFound = result.stdout.includes('nvidia-container-runtime');

So,

    export async function extraRunArgs(common: ResolverParameters, params: DockerResolverParameters, config: DevContainerFromDockerfileConfig | DevContainerFromImageConfig) {
        const extraArguments: string[] = [];
        if (config.hostRequirements?.gpu) {
            if (await checkDockerSupportForGPU(params)) {
                common.output.write(`GPU support found, add GPU flags to docker call.`);
                extraArguments.push('--gpus', 'all');
            } else {
                if (config.hostRequirements?.gpu !== 'optional') {
                    common.output.write('No GPU support found yet a GPU was required - consider marking it as "optional"', LogLevel.Warning);
                }
            }
        }
        return extraArguments;
    }

This will add `--gpus all` whenever the runtime is available, even if no GPU is attached. Unfortunately, the container won't start if `--gpus all` is given but no GPU is attached to the machine. Am I missing something here?
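
To make the gap concrete, here is a minimal sketch of a decision that also takes the number of GPUs into account. It is not the extension's code: `countGpus`, `gpuRunArgs`, and the `GpuRequirement` type are hypothetical names, and the sketch assumes the caller has already captured the output of `nvidia-smi -L` (which prints one `GPU <index>: ...` line per device) and passes an empty string when that command fails or is missing.

```typescript
// Hypothetical helpers for illustration; not the extension's actual code.

// Assumed shape of `hostRequirements.gpu` as used in the snippet above.
type GpuRequirement = boolean | 'optional' | undefined;

// Count device lines ("GPU 0: ...") in captured `nvidia-smi -L` output.
// Empty output (for example when the command failed) counts as zero GPUs.
function countGpus(nvidiaSmiListOutput: string): number {
	return nvidiaSmiListOutput
		.split(/\r?\n/)
		.filter(line => /^GPU \d+:/.test(line.trim()))
		.length;
}

interface GpuDecision {
	args: string[];
	warning?: string;
}

// Only add `--gpus all` when the runtime is present AND at least one GPU was counted.
function gpuRunArgs(requirement: GpuRequirement, runtimeFound: boolean, gpuCount: number): GpuDecision {
	if (!requirement) {
		return { args: [] };
	}
	if (runtimeFound && gpuCount > 0) {
		return { args: ['--gpus', 'all'] };
	}
	if (requirement !== 'optional') {
		return {
			args: [],
			warning: 'No GPU support found yet a GPU was required - consider marking it as "optional"',
		};
	}
	return { args: [] };
}
```

With this shape, a machine that has the runtime installed but no GPU attached (`runtimeFound` true, `gpuCount` 0) and `"gpu": "optional"` would get no extra arguments, so the container could still start.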

Originally posted by @maro-otto in #9385

chrmarti self-assigned this Sep 24, 2024
chrmarti added the bug (Issue identified by VS Code Team member as probable bug) and containers (Issue in vscode-remote containers) labels Sep 24, 2024
chrmarti added this to the September 2024 milestone Sep 24, 2024
@chrmarti
Contributor Author

@maro-otto Could you share the output of `docker info --format '{{json .}}'` when you have a GPU installed? I think we might additionally have to check what the default runtime is.
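
For reference, a rough sketch of how the JSON output could be inspected for the default runtime. It assumes the JSON contains a `DefaultRuntime` string and a `Runtimes` object keyed by runtime name; `summarizeRuntimes` and `DockerInfoSubset` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch; reads only the two fields it assumes are present
// in the output of `docker info --format '{{json .}}'`.
interface DockerInfoSubset {
	DefaultRuntime?: string;
	Runtimes?: Record<string, unknown>;
}

interface RuntimeSummary {
	defaultRuntime: string | undefined;
	hasNvidiaRuntime: boolean;
	nvidiaIsDefault: boolean;
}

function summarizeRuntimes(dockerInfoJson: string): RuntimeSummary {
	const info = JSON.parse(dockerInfoJson) as DockerInfoSubset;
	const runtimeNames = Object.keys(info.Runtimes ?? {});
	return {
		defaultRuntime: info.DefaultRuntime,
		// A registered "nvidia" runtime says nothing about attached GPUs.
		hasNvidiaRuntime: runtimeNames.includes('nvidia'),
		nvidiaIsDefault: info.DefaultRuntime === 'nvidia',
	};
}
```

Note that this only reports what Docker has registered; whether a GPU is actually attached would still need a separate check.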

chrmarti modified the milestones: September 2024, October 2024 Sep 26, 2024
@maro-otto

@chrmarti `docker info --format '{{json .}}'` gives me (no GPU attached):

    {nvidia-container-runtime [] }

@chrmarti
Contributor Author

@maro-otto This looks like the output from `docker info -f {{.Runtimes.nvidia}}`. Could you also run `docker info --format '{{json .}}'` with the GPU present?
