Open
Labels: bug (Something isn't working)
Bug Description
Convergence is disrupted at chunk boundaries when using warm clones, producing inflection points and reversals in the convergence curves. The issue occurs consistently across different model types and quantization settings, but only affects warm-started configurations.
To Reproduce
Steps to reproduce the behavior:
1. Configure a DPO notebook experiment with chunks=4
2. Set up warm clone configurations (as shown in runs 3, 5, 6, and 7 in the attached screenshot)
3. Execute the training run
4. Monitor the convergence plots during training
5. Observe inflection points and convergence reversals at each chunk boundary
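The last step — spotting reversals at chunk boundaries — can be sketched as a simple check on logged loss values. This is a minimal illustration, not RapidFire AI code: the function name, the toy loss curves, and the assumption that one logged value per training step is available are all hypothetical.

```python
def boundary_reversals(losses, chunk_len):
    """Return chunk-boundary indices where the loss was falling just
    before the boundary but rises immediately after it (a reversal)."""
    reversals = []
    for b in range(chunk_len, len(losses), chunk_len):
        was_falling = losses[b - 1] < losses[b - 2]
        now_rising = losses[b] > losses[b - 1]
        if was_falling and now_rising:
            reversals.append(b)
    return reversals

# Toy curves (illustrative values only):
# smooth decay, as seen in initial / clone-modify runs
smooth = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
# warm-clone-like curve: loss jumps back up after the 4-step chunk
disrupted = [1.0, 0.9, 0.8, 0.7, 0.9, 0.8, 0.7, 0.6]

print(boundary_reversals(smooth, 4))     # -> []
print(boundary_reversals(disrupted, 4))  # -> [4]
```

Applied to the real logged metrics, a non-empty result at every multiple of the chunk size would match the behavior reported here for the warm-clone runs.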
Expected Behavior
Convergence should remain smooth and monotonic across chunk boundaries, similar to the behavior observed in initial configs and clone modify configs (non-warm-started configurations). The convergence graph should not show inflection points or reversals at chunk transitions.
Screenshots
Screenshots show:
- Configuration definitions for various runs (first screenshot)
- Comparison between affected warm clone runs (3, 5, 6, 7) and unaffected configurations for various metrics
Environment
- OS: Ubuntu
- Python version: 3.12
- RapidFire AI version: 0.12.6
- Browser (if applicable): Chrome
Additional Context
- Issue occurs with both quantized and non-quantized models
- Affects both RapidFire model and Mistral base model
- Problem is isolated to warm clone configurations only
- Initial configurations and clone modify configurations (non-warm-started) converge smoothly without this issue
- Chunk size used: 4
Error Logs
No error logs were produced.