Wait method for jobs / higher level job API #240

Closed
@JonaOtto

Description

Hello pyslurm developers,
I work on an HPC performance tool for my university. We want the tool to dispatch measurement runs of a target code to our cluster, which uses SLURM, and ideally we would use pyslurm for this.
We need a way to:

  1. Dispatch jobs to the cluster. This is already possible with job.submit_batch_job.
  2. Wait for a job to finish so that we can examine the results. Ideally there would be a blocking method such as job.wait(job_id) that returns once the job referenced by job_id has finished.
    I'm a pyslurm newbie, but as far as I understand, there is no such method in pyslurm at the moment. It looks like this behavior could be built in several ways by combining the find, find_id and get methods of the job class.
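
To make the request in item 2 concrete, here is a minimal sketch of a blocking wait built as a polling loop. It is not an existing pyslurm API: `wait_for_job`, `TERMINAL_STATES`, `pyslurm_state` and all parameter names are hypothetical. The loop takes the state lookup as a callable so it does not depend on any particular pyslurm version. The `pyslurm_state` adapter assumes that `pyslurm.job().find_id(...)` returns a list of dicts with a `job_state` key, which should be checked against the installed release.

```python
import time

# SLURM state names treated as "finished" in this sketch (an assumption;
# adjust the set if requeued or preempted jobs should keep being waited on).
TERMINAL_STATES = {
    "COMPLETED", "FAILED", "CANCELLED", "TIMEOUT",
    "NODE_FAIL", "BOOT_FAIL", "DEADLINE", "OUT_OF_MEMORY",
}


def wait_for_job(job_id, get_state, poll_interval=10.0, timeout=None,
                 sleep=time.sleep):
    """Block until get_state(job_id) reports a terminal state and return it.

    get_state is any callable mapping a job id to a state string.
    Raises TimeoutError if `timeout` seconds pass without a terminal state.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        state = get_state(job_id)
        if state in TERMINAL_STATES:
            return state
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still in state {state!r}")
        sleep(poll_interval)


def pyslurm_state(job_id):
    """Hypothetical adapter: look up the job state through pyslurm.

    Assumes find_id returns a list of dicts with a 'job_state' key.
    """
    import pyslurm  # imported lazily so the loop above works without pyslurm
    return pyslurm.job().find_id(str(job_id))[0]["job_state"]
```

With such a helper, the tool would call `wait_for_job(job_id, pyslurm_state)` after submitting. One caveat of polling: the controller only keeps finished jobs in memory for a limited time, so a long `poll_interval` could miss the final state and the lookup would then fail instead of returning it.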

What do you think would be the best approach? Would it make sense to build this behavior into pyslurm, or is it something our tool should handle itself?

I still have to dive deeper into the code, but if there is anything on this topic I can help with, I would be happy to do so. In general, we would like to contribute back whatever we learn along the way, whether as code or otherwise. Another option would be to first see how it turns out on our side and then contribute the code/interface we developed, or just some notes for others on how we did it.

Thanks for this great project, I'm excited to hear your thoughts!

Best,
Jonathan
