Handling of diagnostic and forcing datasets should happen in masker and not in tokenizer #1682

### What happened?

Diagnostic datasets (no source channels) and forcing datasets (no target channels) are currently handled in the tokenizer:

https://github.com/ecmwf/WeatherGenerator/blob/8f7f240b5772ac7f2602141e4f6e60cf96c744d1/src/weathergen/datasets/tokenizer_masking.py#L130 

https://github.com/ecmwf/WeatherGenerator/blob/8f7f240b5772ac7f2602141e4f6e60cf96c744d1/src/weathergen/datasets/tokenizer_masking.py#L167

As a result, the masks are inconsistent with the actual data, e.g. here:

https://github.com/ecmwf/WeatherGenerator/blob/8f7f240b5772ac7f2602141e4f6e60cf96c744d1/src/weathergen/train/loss_modules/loss_module_physical.py#L242

Hence, forcing and diagnostic datasets should be handled in the masker, which should generate an empty target mask for forcing datasets and an empty source mask for diagnostic datasets.
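A rough sketch of what this could look like, to make the proposal concrete. Everything here is hypothetical: `StreamInfo`, `make_masks`, the per-token boolean mask layout, and the complementary source/target sampling are assumptions for illustration and do not reflect the actual masker or tokenizer API in the repository.

```python
import random
from dataclasses import dataclass


@dataclass
class StreamInfo:
    # Hypothetical stand-in for a stream/dataset config, not the real class.
    name: str
    num_source_channels: int
    num_target_channels: int

    @property
    def is_diagnostic(self) -> bool:
        # Diagnostic: no source channels, so it can only act as a target.
        return self.num_source_channels == 0

    @property
    def is_forcing(self) -> bool:
        # Forcing: no target channels, so it can only act as a source.
        return self.num_target_channels == 0


def make_masks(
    stream: StreamInfo, num_tokens: int, masking_rate: float, rng: random.Random
) -> tuple[list[bool], list[bool]]:
    """Return (source_mask, target_mask); True means the token is used.

    Assumption: for regular streams, each token is either a target
    (masked out of the source) or a source, never both.
    """
    target_mask = [rng.random() < masking_rate for _ in range(num_tokens)]
    source_mask = [not t for t in target_mask]

    if stream.is_diagnostic:
        # Empty source mask: nothing from this stream is fed to the model.
        source_mask = [False] * num_tokens
    if stream.is_forcing:
        # Empty target mask: nothing from this stream enters the loss.
        target_mask = [False] * num_tokens

    return source_mask, target_mask
```

With the masks produced this way, downstream code (tokenizer, loss modules) could rely on the masks alone and would not need its own special-casing for these stream types.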

CC @wael-mika @shmh40 

### What are the steps to reproduce the bug?

_No response_

### Hedgedoc link to logs and more information. This ticket is public, do not attach files directly.

_No response_
