My workflow failed and I got no logs from any of the tasks.
When I manually added --batchLogsDir logs/ I got a vaguely informative error:
[2024-07-03T13:31:46-0700] [MainThread] [W] [toil.leader] The batch system left an empty file logs/toil_5afe8eb3-2e69-47a4-aa37-6802bd7a6bd2.6.3865603.out.log
[2024-07-03T13:31:46-0700] [MainThread] [W] [toil.leader] The batch system left a non-empty file logs/toil_5afe8eb3-2e69-47a4-aa37-6802bd7a6bd2.6.3865603.err.log:
[2024-07-03T13:31:46-0700] [MainThread] [W] [toil.leader] Log from job "kind-WDLTaskJob/instance-g25utj5q" follows:
=========>
Traceback (most recent call last):
File "/private/home/anovak/workspace/toil/venv/bin/_toil_worker", line 33, in <module>
sys.exit(load_entry_point('toil', 'console_scripts', '_toil_worker')())
File "/private/home/anovak/workspace/toil/src/toil/worker.py", line 772, in main
job_store = Toil.resumeJobStore(options.jobStoreLocator)
File "/private/home/anovak/workspace/toil/src/toil/common.py", line 1036, in resumeJobStore
jobStore.resume()
File "/private/home/anovak/workspace/toil/src/toil/jobStores/fileJobStore.py", line 130, in resume
raise NoSuchJobStoreException(self.jobStoreDir, "file")
toil.jobStores.abstractJobStore.NoSuchJobStoreException: The job store 'file:/data/tmp/tmpjn0glxf6/tree' does not exist, so there is nothing to restart.
<=========
The error message shouldn't say "there is nothing to restart" when the failure happens because a worker is trying to connect to the job store; it should say something else.
Also, we should make it harder for the user to get into this situation where Toil has selected a job store path that can't work. Maybe when the batch system is Slurm or one of the other grid engine ones, toil-wdl-runner should pick a default job store in the current directory?
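As a rough sketch of that default-selection idea (not Toil's actual logic): the helper name `default_job_store_dir`, the `SHARED_FS_BATCH_SYSTEMS` set, and the `toil-jobstore-` directory prefix are all hypothetical, and the batch system names are assumptions about which backends run workers on other nodes.

```python
import os
import tempfile
import uuid

# Assumption: batch systems whose workers run on other nodes and therefore
# need the job store on a filesystem shared with the leader. This set is
# illustrative, not taken from Toil.
SHARED_FS_BATCH_SYSTEMS = {"slurm", "grid_engine", "lsf", "torque", "htcondor"}


def default_job_store_dir(batch_system: str, cwd: str) -> str:
    """Hypothetical helper: choose a default directory for a file job store."""
    name = f"toil-jobstore-{uuid.uuid4()}"
    if batch_system in SHARED_FS_BATCH_SYSTEMS:
        # The current directory is more likely to be visible from the
        # cluster nodes than a node-local temp directory such as /data/tmp.
        return os.path.join(cwd, name)
    # For local execution a temp directory on the leader is fine.
    return os.path.join(tempfile.gettempdir(), name)
```

With something like this, a Slurm run started from a shared working directory would get a job store under that directory instead of under a leader-local temp path.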
We could also give the worker a special magic exit code to use to complain specifically that it can't reach the job store, prompting a useful error message from the leader.
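A minimal sketch of how the exit code idea could work, assuming a reserved code shared by worker and leader. Everything here is hypothetical: the value `97`, the names `JOB_STORE_UNREACHABLE_EXIT_CODE`, `run_worker`, and `describe_worker_exit`, and the `JobStoreUnreachable` class, which stands in for `NoSuchJobStoreException` so the example is self-contained.

```python
import sys

# Hypothetical reserved exit code; the value is arbitrary for illustration.
JOB_STORE_UNREACHABLE_EXIT_CODE = 97


class JobStoreUnreachable(Exception):
    """Stand-in for toil's NoSuchJobStoreException in this sketch."""


def run_worker(connect) -> int:
    """Worker side: try to connect, and map a connection failure to the code."""
    try:
        connect()
    except JobStoreUnreachable as e:
        print(f"Worker could not reach the job store: {e}", file=sys.stderr)
        return JOB_STORE_UNREACHABLE_EXIT_CODE
    return 0


def describe_worker_exit(exit_code: int, job_store_locator: str) -> str:
    """Leader side: turn the worker's exit code into a user-facing message."""
    if exit_code == JOB_STORE_UNREACHABLE_EXIT_CODE:
        return (
            f"A worker could not access the job store '{job_store_locator}'. "
            "Is it on a filesystem shared with the worker nodes?"
        )
    return f"Worker exited with code {exit_code}."
```

The leader could then report the shared-filesystem hint even when the batch system returns no usable worker logs.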
┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1611