@landerlini
Collaborator

The workflow can easily be parallelized across 4 workers by Snakemake, but natively there is no logic to assign different GPUs to different workers.
I have created a small gpu_picker module that keeps track of which GPUs are allocated and never assigns the same GPU to two jobs.
I'll first try to run on a single worker with GPU allocation via gpu_picker enabled; if that works, I'll plug in the 4-GPU runner.
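
For illustration only, here is a minimal sketch of how an allocator of this kind could work. It is not the actual gpu_picker code: the class name `GpuPicker`, its methods, and the lock-file layout are all assumptions made for this example. The idea is that each GPU index is claimed by atomically creating a lock file in a directory shared by the workers, so two jobs can never hold the same index at once.

```python
import os
import tempfile
from contextlib import contextmanager


class GpuPicker:
    """Hypothetical allocator: one lock file per GPU index in a shared directory."""

    def __init__(self, n_gpus, lock_dir):
        self.n_gpus = n_gpus
        self.lock_dir = lock_dir
        os.makedirs(lock_dir, exist_ok=True)

    def _lock_path(self, index):
        return os.path.join(self.lock_dir, f"gpu{index}.lock")

    def acquire(self):
        """Claim the first free GPU and return its index, or None if all are busy."""
        for index in range(self.n_gpus):
            try:
                # O_CREAT | O_EXCL fails if the file already exists,
                # so creating the file is an atomic claim on this index.
                fd = os.open(self._lock_path(index),
                             os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            except FileExistsError:
                continue  # already taken by another job, try the next index
            with os.fdopen(fd, "w") as handle:
                handle.write(str(os.getpid()))  # record the owner for debugging
            return index
        return None

    def release(self, index):
        """Free a previously claimed GPU index."""
        os.remove(self._lock_path(index))

    @contextmanager
    def gpu(self):
        """Hold a GPU for the duration of a `with` block."""
        index = self.acquire()
        if index is None:
            raise RuntimeError("no free GPU")
        try:
            yield index
        finally:
            self.release(index)


# Example: a job claims a GPU and would expose only that device to its process.
picker = GpuPicker(n_gpus=4, lock_dir=tempfile.mkdtemp())
with picker.gpu() as index:
    job_env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(index))
```

In this sketch a job would pass `job_env` to the subprocess that runs the training, so the framework sees only the assigned device. A real implementation would also need to handle stale lock files left behind by crashed jobs, which is omitted here.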

@github-actions

🤖 A new training is being planned.

  • Name: pp-2016-MU-Sim10b-gha_multi_gpu
  • Repository sub-dir: pidgan
  • Snakemake targets: cache_container validate_all
  • Selected runner: aiinfn-lamarrsim-gpu

At the end of the training, the models will be released and this PR will be notified again.

@github-actions

🚀 Models for pp-2016-MU-Sim10b-gha_multi_gpu were released

You can review the models developed in this PR in Release pp-2016-MU-Sim10b-gha_multi_gpu-2025-08-29T12h05m47

@github-actions

🤖 A new training is being planned.

  • Name: pp-2016-MU-Sim10b-gha_multi_gpu
  • Repository sub-dir: pidgan
  • Snakemake targets: cache_container validate_all
  • Selected runner: aiinfn-lamarrsim-4gpus

At the end of the training, the models will be released and this PR will be notified again.

@github-actions

🚀 Models for pp-2016-MU-Sim10b-gha_multi_gpu were released

You can review the models developed in this PR in Release pp-2016-MU-Sim10b-gha_multi_gpu-2025-08-29T13h33m54

@github-actions

🤖 A new training is being planned.

  • Name: pp-2016-MU-Sim10b-gha_multi_gpu
  • Repository sub-dir: pidgan
  • Snakemake targets: cache_container validate_all
  • Selected runner: aiinfn-lamarrsim-4gpus

At the end of the training, the models will be released and this PR will be notified again.

@github-actions

github-actions bot commented Sep 1, 2025

🤖 A new training is being planned.

  • Name: pp-2016-MU-Sim10b-gha_multi_gpu
  • Repository sub-dir: pidgan
  • Snakemake targets: cache_container validate_all
  • Selected runner: aiinfn-lamarrsim-4gpus

At the end of the training, the models will be released and this PR will be notified again.

@github-actions

github-actions bot commented Sep 1, 2025

🚀 Models for pp-2016-MU-Sim10b-gha_multi_gpu were released

You can review the models developed in this PR in Release pp-2016-MU-Sim10b-gha_multi_gpu-2025-09-01T11h28m17
