feat(training): support torch losses #398

Draft: wants to merge 4 commits into base: main

@JPXKQX JPXKQX commented Jul 3, 2025

Description

This PR adds a TorchLoss class that wraps any loss class from torch.nn, so these losses no longer need to be reimplemented inside Anemoi. This would allow us to delete mae.py, huber.py, and the new smooth_l1.py proposed in #367.

We would move from

training_loss:
  _target_: anemoi.training.losses.MAELoss
  scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights']
  ignore_nans: False

to

training_loss:
  _target_: anemoi.training.losses.TorchLoss
  loss:
    _target_: torch.nn.L1Loss
  scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights']
  ignore_nans: False
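
To illustrate the idea behind the nested `loss` entry, here is a minimal sketch of how a wrapper around a `torch.nn` loss could work. It is not the implementation in this PR: the class name `TorchLossSketch` and the `weights` argument (a stand-in for Anemoi's scalers) are assumptions made for illustration only. The sketch asks the wrapped loss for per-element values, applies the weights by broadcasting, and only then reduces to a scalar.

```python
import torch
from torch import nn


class TorchLossSketch(nn.Module):
    """Hypothetical wrapper around a torch.nn loss (not the PR's TorchLoss)."""

    def __init__(self, loss: nn.Module, ignore_nans: bool = False) -> None:
        super().__init__()
        # Request per-element losses so weights can be applied before reducing.
        # Standard torch.nn losses such as L1Loss read this attribute in forward().
        loss.reduction = "none"
        self.loss = loss
        self.ignore_nans = ignore_nans

    def forward(self, pred, target, weights=None):
        out = self.loss(pred, target)  # same shape as pred
        if weights is not None:
            out = out * weights  # broadcast over the trailing dimension(s)
        # nanmean skips NaN entries; mean propagates them.
        return torch.nanmean(out) if self.ignore_nans else out.mean()


# Toy tensors with shape (bs, ensemble, gridpoints, n_outputs) = (1, 1, 2, 2).
pred = torch.tensor([[[[1.0, 2.0], [3.0, 4.0]]]])
target = torch.zeros_like(pred)
weights = torch.tensor([1.0, 0.5])  # hypothetical per-variable weights

loss_fn = TorchLossSketch(nn.L1Loss())
value = loss_fn(pred, target, weights)
```

With this pattern, swapping `nn.L1Loss()` for `nn.HuberLoss()` or `nn.SmoothL1Loss()` needs no new wrapper code, which is the kind of duplication the config change above is meant to remove.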

What problem does this change solve?

This PR reduces code duplication by avoiding the reimplementation of functionality that is already available in Torch. It also removes the need for future PRs that add and test new loss functions inside Anemoi.

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@github-project-automation github-project-automation bot moved this to Now In Progress in Anemoi-dev Jul 10, 2025

@ssmmnn11 ssmmnn11 left a comment

Nice!

One question: what tests did you run?

Parameters
----------
pred : torch.Tensor
    Prediction tensor, shape (bs, ensemble, lat*lon, n_outputs)

Minor: do we write lat*lon everywhere? Would "gridpoints" be better?

@github-project-automation github-project-automation bot moved this from Now In Progress to Under Review in Anemoi-dev Jul 12, 2025
@mchantry mchantry added the ATS Approval Not Needed label Aug 14, 2025
@mchantry

@JPXKQX is this labelled draft for a reason?

Labels: ATS Approval Not Needed (No approval needed by ATS), training
Project status: Under Review
3 participants