More comprehensive integration tests in training

### Is your feature request related to a problem? Please describe.

We have added infrastructure for integration tests in training, along with tests that cover most use cases and a few additional scenarios (restart, restart from an existing checkpoint, use of an existing graph, etc.). However, some aspects of training are untested or tested for only a couple of use cases, and some problems are missed because the datasets used for these tests have fewer parameters.

### Describe the solution you'd like

A couple of things we could think about adding:
- [ ] more comprehensive tests for checkpoints / checkpoint migrations -- currently only testing gnn global
- [ ] tests for rollout -- currently not tested
- [ ] add multi-GPU tests to test sharding for different models (probably partially covered in benchmark tests)
- [ ] review datasets used for testing: can we keep them small and still catch more of the potential problems?
- [ ] tests for forking runs (potentially better placed in system-level tests)

Depending on how comprehensively we want to test these, it might be enough to add a few tests, or it might be better to revisit the existing structure of fixtures to make them more reusable.
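
To make the "more reusable fixtures" option concrete, here is a minimal sketch of one possible direction: a shared base config plus a small matrix of models and scenarios that expands into named test cases. Everything in it is hypothetical (the config keys, the model names other than `gnn`, the scenario names, and the helpers `deep_update` and `build_cases`); none of it reflects the existing fixtures.

```python
from copy import deepcopy
from itertools import product

# Hypothetical base config shared by all integration tests.
# The keys are illustrative only and do not mirror the real training config.
BASE_CONFIG = {
    "model": "gnn",
    "training": {"max_epochs": 1, "rollout": {"start": 1, "max": 1}},
    "hardware": {"num_gpus_per_model": 1},
}


def deep_update(base, overrides):
    """Return a copy of `base` with nested `overrides` applied (inputs are not mutated)."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical test axes: placeholder model names and scenario overrides.
MODELS = ["gnn", "model_b", "model_c"]
SCENARIOS = {
    "default": {},
    "rollout": {"training": {"rollout": {"max": 3}}},
    "sharded": {"hardware": {"num_gpus_per_model": 2}},
}


def build_cases():
    """Expand every model/scenario pair into a named, independent config."""
    cases = {}
    for model, (name, overrides) in product(MODELS, SCENARIOS.items()):
        cases[f"{model}-{name}"] = deep_update(BASE_CONFIG, {"model": model, **overrides})
    return cases


if __name__ == "__main__":
    for case_id in build_cases():
        print(case_id)
```

With pytest, the ids and configs returned by `build_cases()` could feed `pytest.mark.parametrize`, so a new scenario (for example checkpoint migration or forking) would be one more entry in `SCENARIOS` rather than a new hand-written fixture.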

