@iluise (Collaborator) commented Dec 12, 2025

Description

Model development asked for a way to identify which variable degrades first.
This PR introduces two new plots:

  • ratio plots: complementary to score cards; they show the ratio of each score with respect to a baseline run
  • heat maps: show the score value on a color scale for each channel and forecast step, to check which variable degrades fastest.
[Screenshots: two example plots, captured 2025-12-12]
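
To make the two plot types concrete, here is a minimal sketch of the underlying data preparation. It is not the code from this PR: the function names `score_ratios` and `heat_map_grid`, the dictionary layout, and the numbers are all hypothetical and chosen only for illustration.

```python
import math


def score_ratios(scores, baseline):
    """Ratio of each score to the baseline run, per channel.

    Hypothetical helper: channels missing from the baseline, or with a
    zero baseline score, are skipped because the ratio is undefined.
    """
    ratios = {}
    for channel, value in scores.items():
        ref = baseline.get(channel)
        if ref is None or ref == 0:
            continue
        ratios[channel] = value / ref
    return ratios


def heat_map_grid(scores_by_step, channels):
    """Arrange scores as one row per channel and one column per forecast step.

    Hypothetical helper: missing (channel, step) pairs become NaN so that
    a plotting library can render them as empty cells.
    """
    steps = sorted(scores_by_step)
    grid = [
        [scores_by_step[step].get(channel, math.nan) for step in steps]
        for channel in channels
    ]
    return steps, grid


# Made-up scores, for illustration only.
run = {"t_850": 2.0, "q_500": 3.0, "z_500": 1.0}
base = {"t_850": 4.0, "q_500": 3.0}
ratios = score_ratios(run, base)

by_step = {12: {"t_850": 1.5, "q_500": 0.5}, 6: {"t_850": 1.0}}
steps, grid = heat_map_grid(by_step, ["q_500", "t_850"])
```

For a lower-is-better metric such as RMSE, a ratio below 1 means the run improves on the baseline for that channel. The grid can be passed to a 2-D color plot (for example matplotlib's `imshow`), with channels on one axis and forecast steps on the other.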

Issue Number

closes #945
closes #1451

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hedgedoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@iluise iluise self-assigned this Dec 12, 2025
@iluise iluise added the eval anything related to the model evaluation pipeline label Dec 12, 2025
@iluise (Collaborator, Author) commented Dec 12, 2025

Results with the latest stable run (buydgjm5):

metrics: FROCT, RMSE, MAE, TROCT

[Figures: heat maps for FROCT, MAE, RMSE, and TROCT; run buydgjm5, ERA5]

@clessig (Collaborator) commented Dec 13, 2025

@iluise: can we sort the vertical levels in numerical order? Currently we have q_400, q_50, q_500, etc.
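
The ordering above (q_400, q_50, q_500) is what plain string sorting produces. A minimal sketch of a numeric ordering follows; it is not the fix merged in this PR, and `channel_sort_key` and the channel list are hypothetical. It assumes channel names follow a `<variable>_<level>` pattern with an integer level, and sorts everything else by name alone.

```python
import re


def channel_sort_key(name):
    """Sort key: variable name first, then the vertical level as an integer.

    Hypothetical helper. Names without a trailing "_<digits>" suffix
    (surface fields, for example) get level -1 so they still sort
    consistently by name.
    """
    match = re.fullmatch(r"(.+?)_(\d+)", name)
    if match is None:
        return (name, -1)
    return (match.group(1), int(match.group(2)))


# Illustrative channel names only.
channels = ["q_400", "q_50", "q_500", "t_850", "2t", "t_100"]
ordered = sorted(channels, key=channel_sort_key)
print(ordered)
# ['2t', 'q_50', 'q_400', 'q_500', 't_100', 't_850']
```

Grouping by variable first keeps all levels of one variable adjacent on the heat-map axis, which makes the degradation pattern across levels easier to read.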

@clessig (Collaborator) commented Dec 13, 2025

For the heat maps it would be nice to have IFS as reference.

@clessig (Collaborator) commented Dec 13, 2025

@iluise: could you also generate ratio plots for buydgjm5?

@SavvasMel (Contributor)

It works fine for me! The variables now also appear in numerical order with respect to vertical levels. I approve.

@SavvasMel SavvasMel merged commit 8d8ae06 into develop Dec 18, 2025
5 checks passed
TillHae pushed a commit to TillHae/WeatherGenerator that referenced this pull request Dec 25, 2025
* modified ratio plot

* heat-map

* fix cosmetics

* lint

* change config

* add metric name to heatmap

* multiply forecasts by step_hrs

* remove breakpoints

* fix variable order

* lint

* Fix a minor bug on score cards (ecmwf#1492)

* Correct a minor bug regarding score cards

* Linting

---------

Co-authored-by: Savvas Melidonis <79579567+SavvasMel@users.noreply.github.com>

Projects

Status: Done

Development

Successfully merging this pull request may close these issues:

  • restore CSVReader in fast evaluation
  • implement ratio plot in FastEvaluation package
