fix[next]: fix for missing state info in the toolchain cache computations #2104

egparedes · 2025-06-26T12:06:49Z

Description

Requirements

All fixes and/or new features come with corresponding tests.
Important design decisions have been documented in the appropriate ADR inside the docs/development/ADRs/ folder.

If this PR contains code authored by new contributors please make sure:

The PR contains an updated version of the AUTHORS.md file adding the names of all the new contributors.

Copilot

Pull Request Overview

This PR enhances caching by including workflow-internal state in cache key computations, ensuring cache invalidation when relevant settings change.

Add a workflow_state_id property to GTFNTranslationStep and other workflow classes.
Update CachedStep to use a composite cache_key (input hash + workflow state).
Modify tests to use cache_key instead of the old fingerprint function.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File	Description
tests/next_tests/.../test_gtfn_module.py	Replaced `stages.fingerprint_compilable_program` with `cache_key`.
src/gt4py/next/program_processors/codegens/gtfn/gtfn_module.py	Imported `utils` and added `workflow_state_id` to `GTFNTranslationStep`.
src/gt4py/next/otf/workflow.py	Introduced `workflow_state_id` in multiple workflow classes and updated imports.

Comments suppressed due to low confidence (3)

src/gt4py/next/program_processors/codegens/gtfn/gtfn_module.py:57

Consider adding a docstring for workflow_state_id to clarify which internal settings are included in the hash and how it impacts caching behavior.

    @functools.cached_property

tests/next_tests/unit_tests/program_processor_tests/codegens_tests/gtfn_tests/test_gtfn_module.py:150

It would be helpful to add a test verifying that changing the translation step's internal state (e.g., device_type or language_settings) results in a different cache_key, ensuring cache invalidation works correctly.

    cache_key = cached_gtfn_translation_step.cache_key(compilable_program)

src/gt4py/next/otf/workflow.py:174

The implementation of workflow_state_id is duplicated across multiple classes (ReplaceEnabledWorkflowMixin, MultiWorkflow, CachedStep, SkippableStep). Consider extracting this into a shared mixin or base class to reduce code duplication.

    @functools.cached_property

havogt · 2025-07-03T20:44:28Z

before we forget:
I was able to trigger the problem with this icon4py test

pytest -sv --benchmark-disable --datatest-only --backend=gtfn_cpu --level=integration model/driver/tests/driver_tests/test_timeloop.py::test_run_timeloop_single_step -k False-mch_ch_r04b09_dsl-1-2-1-2 -s

by filling the cache with either gtfn_cpu or gtfn_gpu and then switching to the other. With the run for the other backend translation cache created the program for the first backend (can be seen in the hash of the build directory).

havogt · 2025-07-07T12:23:12Z

Additionally next.config might influence the toolchain...

egparedes changed the title ~~Include workflow state (if any) in CachedStep hash computations.~~ fix[next]: fix for missing state info in cache computations Jun 26, 2025

egparedes changed the title ~~fix[next]: fix for missing state info in cache computations~~ fix[next]: fix for missing state info in the toolchain cache computations Jun 26, 2025

egparedes added 2 commits June 26, 2025 14:10

Include workflow state (if any) in CachedStep hash computations.

1f734bb

Run pre-commit

a1cf05e

egparedes force-pushed the fix-caching-issues-for-backend-switch branch from 5a0a1eb to a1cf05e Compare June 26, 2025 12:11

egparedes added 3 commits June 26, 2025 14:23

Add opt-in workflow state hash attribute to common workflow wrappers

2503aab

Use stable hash

41173f9

Document protocol and fix broken test

f78bd92

egparedes requested a review from Copilot June 26, 2025 13:24

Copilot AI reviewed Jun 26, 2025

View reviewed changes

havogt mentioned this pull request Jul 10, 2025

feat[next]: Integration of concat_where #1713

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix[next]: fix for missing state info in the toolchain cache computations #2104

fix[next]: fix for missing state info in the toolchain cache computations #2104

Uh oh!

egparedes commented Jun 26, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

havogt commented Jul 3, 2025

Uh oh!

havogt commented Jul 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix[next]: fix for missing state info in the toolchain cache computations #2104

Are you sure you want to change the base?

fix[next]: fix for missing state info in the toolchain cache computations #2104

Uh oh!

Conversation

egparedes commented Jun 26, 2025

Description

Requirements

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

havogt commented Jul 3, 2025

Uh oh!

havogt commented Jul 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants