Evaluation fails if inference .zarr file not present #1202

@Jubeku

What happened?

Currently, evaluation fails if the .zarr file is no longer present, even though the metrics (or at least some of them) have already been calculated. Would it make sense for the tool to run even when the .zarr file no longer exists? One use case is plotting against a baseline run whose metrics were calculated previously. We now often delete the .zarr files to free up inodes.
(If this is too complicated, MLflow might be a solution in the future, because we could just select previous runs to compare against.)
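
To make the request concrete, here is a minimal sketch of the fallback I have in mind. It is not the current `weathergen.evaluate` code: `choose_evaluation_source` and its return values are hypothetical, and treating any file inside `evaluation/` as stored metrics is an assumption.

```python
from pathlib import Path


def choose_evaluation_source(results_dir: Path, zarr_name: str) -> str:
    """Hypothetical helper, not part of weathergen.evaluate.

    Returns "zarr" if the inference output is present, "stored_metrics" if
    only previously computed metrics remain, and raises if neither exists.
    """
    zarr_path = results_dir / zarr_name
    metrics_dir = results_dir / "evaluation"  # folder name taken from this report

    if zarr_path.is_dir():
        # Inference output is available, so metrics can be (re)computed.
        return "zarr"

    # Assumption: any file inside evaluation/ counts as stored metrics.
    if metrics_dir.is_dir() and any(p.is_file() for p in metrics_dir.iterdir()):
        return "stored_metrics"

    raise FileNotFoundError(
        f"Neither Zarr store {zarr_path} nor stored metrics in {metrics_dir} were found."
    )
```

With something like this, the reader would only need to raise when both the .zarr store and the stored metrics are missing, and plotting against a baseline could proceed from the stored metrics alone.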

Error message:

ERROR:weathergen.evaluate.io_reader:Zarr file /capstor/store/cscs/userlab/ch17/shared_work/results/jrpidu2s/validation_epoch00000_rank0000.zarr does not exist or is not a directory.
Traceback (most recent call last):
  File "/users/jkuehner/CODE/WeatherGenerator/.venv/bin/evaluate", line 10, in <module>
    sys.exit(evaluate())
             ^^^^^^^^^^
  File "/users/jkuehner/CODE/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 50, in evaluate
    evaluate_from_args(sys.argv[1:])
  File "/users/jkuehner/CODE/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 87, in evaluate_from_args
    evaluate_from_config(OmegaConf.load(config), mlflow_client)
  File "/users/jkuehner/CODE/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 114, in evaluate_from_config
    reader = WeatherGenReader(run, run_id, private_paths)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/users/jkuehner/CODE/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/io_reader.py", line 353, in __init__
    raise FileNotFoundError(
FileNotFoundError: Zarr file /capstor/store/cscs/userlab/ch17/shared_work/results/jrpidu2s/validation_epoch00000_rank0000.zarr does not exist or is not a directory.

What are the steps to reproduce the bug?

Run evaluate with an inference id that points to a results/ folder in which you have the evaluation/ folder but not the .zarr file.
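
For anyone without such a folder at hand, the layout can be mocked up roughly as follows. This is only a sketch: `make_results_without_zarr` and the placeholder metrics file are hypothetical (the real metric file names and format are not shown in this report); the run id and .zarr name are the ones from the error message above.

```python
import tempfile
from pathlib import Path


def make_results_without_zarr(root: Path, run_id: str) -> Path:
    """Create <root>/<run_id>/evaluation/ with a placeholder file and no .zarr store."""
    run_dir = root / run_id
    evaluation_dir = run_dir / "evaluation"
    evaluation_dir.mkdir(parents=True)
    # Placeholder only: stands in for whatever the evaluation step actually wrote.
    (evaluation_dir / "placeholder_metrics.json").write_text("{}")
    return run_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = make_results_without_zarr(Path(tmp), "jrpidu2s")
        zarr_path = run_dir / "validation_epoch00000_rank0000.zarr"
        print((run_dir / "evaluation").is_dir())  # True
        print(zarr_path.is_dir())  # False
```

Pointing the evaluation config at a results folder in this state should then hit the `FileNotFoundError` shown above.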

Hedgedoc link to logs and more information. This ticket is public, do not attach files directly.

No response

Metadata

Labels

eval: anything related to the model evaluation pipeline

Type

No type

Projects

Status

Done
