PBSProProvider doesn't detect jobs completing #3906

Open
@WardLT

Describe the bug
PBSProProvider fails to detect a job finishing, which means blocks stay active and Parsl will not start new ones.

The error in the logs is:

1752246824.600159 2025-07-11 15:13:44 MainProcess-931080 JobStatusPoller-Timer-Thread-140649297520800-140648986441472 parsl.providers.pbspro.pbspro:103 _status WARNING: qstat failed with retcode:35 STDOUT:{
    "timestamp":1752246824,
    "pbs_version":"2024.1.2.20241017100211",
    "pbs_server":"polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov"
} STDERR:qstat: 5487654.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov Job has finished, use -x or -H to obtain historical job information
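
For illustration, one way a status poller could recover from this is to read the "Job has finished" lines on STDERR and treat those job IDs as completed, instead of discarding the whole poll when the return code is non-zero. The sketch below is not Parsl's implementation. The helper name `finished_job_ids` and the regular expression are hypothetical, and it assumes the message format matches the log line above.

```python
import re

# Hypothetical helper, not part of Parsl. Assumes qstat reports each finished
# job on STDERR as "qstat: <job id> Job has finished, ...", as in the log above.
FINISHED_RE = re.compile(r"^qstat: (\S+) Job has finished", re.MULTILINE)


def finished_job_ids(stderr: str) -> set:
    """Return the job IDs that qstat says have already finished."""
    return set(FINISHED_RE.findall(stderr))


# STDERR text copied from the log message in this report
sample_stderr = (
    "qstat: 5487654.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov "
    "Job has finished, use -x or -H to obtain historical job information"
)

for job_id in finished_job_ids(sample_stderr):
    # A poller could mark these jobs as completed so the block is released
    print(job_id, "-> COMPLETED")
```

With something like this, a failed `qstat` call would still yield a terminal state for the jobs it names, so the strategy could scale out a replacement block.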

To Reproduce
Run Parsl with the PBSProProvider and a short walltime, so that a block's job finishes while the workflow is still running.

Expected behavior
New blocks are submitted after old ones finish.

Environment

  • Python version: 3.12
  • Parsl version: 2025.07.07

Distributed Environment
ALCF's Polaris. DFK resides on the head node
