Data Transfer from CPU to GPU is not optimized #1399

@javak87

Description

What happened?

The initial investigation shows that training on the A100 (JWB) is faster than on the GH200 (Santis). Here is how to reproduce the result:

```shell
git checkout d24c4b6800b45bd1f859e61d8b29eab5a540c176
../WeatherGenerator-private/hpc/launch-slurm.py --time 180 --nodes=1
```

Here is the result:

| run_id | HPC | PR | Ingested samples per GPU |
|---|---|---|---|
| pyizojg7 | Santis | develop (1 node) (180 mins) | 6684 |
| cc0xrzbm | JWB | develop (1 node) (180 mins) | 7688 |
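The gap can be quantified as a relative throughput difference. A minimal sketch using the sample counts from the table above and the 180-minute run window (the per-minute rates and percentage are derived here, not taken from the logs):

```python
# Ingested samples per GPU over the 180-minute runs (from the table above).
RUNTIME_MIN = 180
samples = {"JWB (A100)": 7688, "Santis (GH200)": 6684}

def throughput_per_min(n_samples: int, minutes: int = RUNTIME_MIN) -> float:
    """Samples ingested per GPU per minute of wall-clock training time."""
    return n_samples / minutes

a100 = throughput_per_min(samples["JWB (A100)"])
gh200 = throughput_per_min(samples["Santis (GH200)"])

# Relative slowdown of the GH200 run compared to the A100 run.
slowdown = 1.0 - gh200 / a100
print(f"A100:  {a100:.1f} samples/min/GPU")
print(f"GH200: {gh200:.1f} samples/min/GPU")
print(f"GH200 ingests {slowdown:.1%} fewer samples than A100")
```

This puts the GH200 run roughly 13% behind the A100 run, which is consistent with a host-to-device transfer bottleneck dominating over raw GPU compute.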

What are the steps to reproduce the bug?

No response

Hedgedoc link to logs and more information. This ticket is public, do not attach files directly.

No response

Metadata

Assignees

Labels

performance: Work related to performance improvements
