
Conversation

@SGallagherMet commented Sep 18, 2025

Use custom cylc triggers to skip 'baking' when 'parbake' didn't find any recipes to configure.
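For readers unfamiliar with the pattern, custom task outputs in Cylc 8 look roughly like this. This is a minimal sketch using the task and output names discussed later in the thread, not the actual flow.cylc from this PR; `recipes_found` is a hypothetical placeholder for the real check:

```ini
[scheduling]
    [[graph]]
        P1D = """
            # Only run baking when parbaking found recipes to configure.
            parbake_recipes:start_baking? => bake_recipes?
            parbake_recipes:skip_baking? | bake_recipes? => cycle_complete
        """
[runtime]
    [[parbake_recipes]]
        script = """
            if recipes_found; then
                cylc message -- "start baking"
            else
                cylc message -- "skip baking"
            fi
        """
        [[[outputs]]]
            start_baking = "start baking"
            skip_baking = "skip baking"
```

The `?` markers make the outputs optional, so the scheduler does not treat an unsatisfied branch as a failure.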

Contribution checklist

Aim to have all relevant checks ticked off before merging. See the developer's guide for more detail.

  • Documentation has been updated to reflect the change.
  • New code has tests, and affected old tests have been updated.
  • All tests and CI checks pass.
  • Ensured the pull request title is descriptive.
  • Conda lock files have been updated if dependencies have changed.
  • Attributed any Generative AI, such as GitHub Copilot, used in this PR.
  • Marked the PR as ready to review.

@SGallagherMet SGallagherMet self-assigned this Sep 18, 2025
@SGallagherMet (Author)

@jfrost-mo - I've not asked for a review yet because I haven't tested it properly, but can you see any problems with this?

@jfrost-mo (Member)

Seems like a sensible approach. Compared to just making the bake task succeed when it has no recipes, this avoids requesting a large job only to not use it.

@jfrost-mo jfrost-mo added this to the CSET v25.11.0 milestone Oct 28, 2025
@jfrost-mo (Member) left a comment


Some minor suggestions, but otherwise looks good. After some testing I'd be happy for this to be merged.

@jfrost-mo jfrost-mo marked this pull request as ready for review November 11, 2025 09:05
@jfrost-mo (Member)

I'll give this a test and merge it if it looks good, ahead of our next release.

@jfrost-mo (Member)

Giving this a test today: the bake_recipes task still remains in the graph and doesn't block the flow, but it does block the workflow from finishing.

[Screenshot from 2025-11-25 14-28-53]

@jfrost-mo (Member) commented Nov 25, 2025

I'm a bit tempted to go for an alternative implementation where we unconditionally run the baking tasks. It would be less efficient, but it should be simpler in the workflow.

It depends on the cost of getting a job that doesn't do anything through the queue.

@jfrost-mo (Member)

For long trials I can see the null jobs spending a fair amount of time queuing, as you will have one per cycle.

@jfrost-mo (Member) commented Nov 25, 2025

I can get it working with suicide triggers, but I'd like to avoid them if possible, as they are more of a Cylc 7 thing.
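For context, the suicide-trigger variant being referred to would look roughly like this in Cylc graph notation (a sketch with the task and output names assumed from the rest of the thread; `!task` removes the task from the graph at that point):

```ini
[scheduling]
    [[graph]]
        P1D = """
            parbake_recipes:start_baking? => bake_recipes
            # If parbaking found nothing, remove bake_recipes from this cycle.
            parbake_recipes:skip_baking? => !bake_recipes
        """
```

Suicide triggers work, but Cylc 8's optional outputs are generally preferred because the graph then states the intent directly rather than retroactively removing tasks.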

@jfrost-mo (Member) commented Nov 25, 2025

The issue is that bake_recipes is not an optional task, so although the flow continues onwards past it, it still remains in the graph, causing an error.

This is mainly to ensure we have validated our environment before using it.
@jfrost-mo (Member)

I've done some testing around this, and the current behaviour (since we introduced parbaking) is actually that the bake/bake_aggregation tasks will succeed even if they have no recipes. Instead, the parbaking task has its own check that at least one recipe is enabled, even if it is not the aggregation type being produced.

So the use case of disabling all non-aggregation tasks is already supported, though it does run some baking jobs that immediately finish.

The open question is how big a problem that is. I'm tempted to close this pull request on the basis that I'm not sure the added complexity in the workflow is worth the savings. In my testing today, the Slurm job for the baking only queued for about 10 seconds and finished within 2 seconds when there were no recipes.

However, I haven't been using much compute over the past few days, so my fair-share is in good standing.

@jfrost-mo jfrost-mo removed this from the CSET v25.11.0 milestone Nov 25, 2025
@SGallagherMet (Author)

So, from https://cylc.github.io/cylc-doc/stable/html/user-guide/writing-workflows/scheduling.html#dependencies-with-multiple-optional-outputs:
"If one task succeeds and the other fails, then the task ... will be left with one satisfied and one unsatisfied dependency. This will cause the workflow to stall."
With my original graphing, bake_recipes is left with fetch_complete:succeeded satisfied and parbake_recipes:start_baking unsatisfied.
If you wanted to avoid a suicide trigger, you could move parbake_recipes to after fetch_complete:

fetch_complete => parbake_recipes:start_baking? => bake_recipes?
parbake_recipes:skip_baking? | bake_recipes? => cycle_complete

@SGallagherMet (Author) commented Nov 25, 2025

With regards to whether this is worth it, though: the problem is with a long trial (say 3*30*24 = 2160 cycles), where you end up making a lot of Slurm requests that do nothing. If you have a large domain where the resource requests get big, that could start causing problems with fair-share. (I don't have direct evidence of it happening in CSET, but we do see it happen in the RES sometimes.)

Thinking about it again, I'm not convinced this is the correct solution. You don't really want the workflow to run at all if it's not going to do anything (or you just want it to run the fetch and aggregation). Some version of #1720 that checks the recipes before the suite is even installed is probably a better solution.

@jfrost-mo (Member)

It probably is worth it then, but yes, there may be better ways.

Given we do want the data to be fetched, the remaining tasks running per-cycle would be:

  • fetch_data - cheap (1 CPU, 2GiB RAM), fast when fetching from the filesystem, slow when fetching from MASS.
  • parbake_recipes - cheap (1 CPU, 2GiB RAM), fast (sub 10 second), and with no inter-cycle dependency.
  • bake_recipe - expensive (32 CPU, 112 GiB RAM), fast (sub 10 seconds) with no recipes.
  • A couple of skip-mode tasks used for scheduling. These wouldn't be a problem even in their thousands.

So the tasks we want to eliminate are the parbaking and especially the baking. In principle there is no reason the parbaking has to happen in the cycle it is for, as it has no dependency on other tasks (other than the initial environment setup). So we could move to parbaking for all cycles in the first cycle, which would save us a slurm job per cycle.

The same thing could be said about the baking task, TBH. Once we have parbaked the recipes, the only cycle-specific thing it does is look in the appropriately dated folder for the recipes to bake, and that is only so we don't end up baking the same recipe multiple times. We could ditch cycling entirely and just chunk the work up for parallelism, regardless of which cycle it was for. That might make it harder to support running in real time, though I'm not convinced there is much of a driving motivation to do that with CSET.
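A non-cycling version of that idea could use Cylc task parameters for the parallel chunks. This is a sketch with invented task names (`collate_results`, the `chunk` parameter), not the actual CSET workflow:

```ini
[task parameters]
    chunk = 1..8
[scheduling]
    [[graph]]
        R1 = """
            fetch_data => parbake_recipes
            # Each bake<chunk> task takes a share of all parbaked recipes,
            # regardless of which cycle point the data was for.
            parbake_recipes => bake<chunk> => collate_results
        """
```

The `R1` recurrence runs everything once, and `bake<chunk>` expands into eight parallel tasks, so parallelism comes from parameterisation rather than from cycling.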
