Conversation

xingyaoww
Contributor

Multi-turn Conversation Fine-tuning Support

Overview

This PR adds support for fine-tuning models on multi-turn conversations, including proper chat template handling and loss masking for assistant responses. The implementation includes support for the OpenHands SFT dataset and handles conversations up to 32k tokens.

Key Features

  • New MultiTurnSFTDataset class for handling multi-turn conversations
  • Proper chat template integration using HuggingFace's apply_chat_template
  • Smart loss masking that targets only assistant responses
  • Support for both single-turn and multi-turn training in the same trainer
  • Token length limiting and conversation filtering

Implementation Details

  1. Dataset:

    • Uses proper chat templates from the model's tokenizer (a loss-masking sketch follows this list)
    • Handles system, user, and assistant messages
    • Supports conversations with multiple turns
    • Filters conversations exceeding token limits
  2. Training:

    • Added use_multiturn flag in config
    • Added messages_key for multi-turn data format
    • Maintains backward compatibility with single-turn training
    • Works with existing features (FSDP, sequence parallel, etc.)
  3. Examples and Tests:

    • Added OpenHands SFT dataset preprocessing script
    • Added multi-turn training example
    • Added comprehensive unit tests
    • Moved tests to appropriate locations
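
To make the loss-masking idea concrete, here is a minimal sketch (not the PR's actual code) of assistant-only masking built on HuggingFace's `apply_chat_template`: it tokenizes each conversation prefix and marks only the token span contributed by assistant turns. The helper name `build_loss_mask` and the incremental-prefix strategy are illustrative assumptions; `MultiTurnSFTDataset` may compute the spans differently.

```python
import torch
from transformers import AutoTokenizer


def build_loss_mask(tokenizer, messages, max_length=32000):
    """Tokenize a conversation with the chat template; mask assistant tokens only."""
    full_ids = tokenizer.apply_chat_template(messages, tokenize=True)
    loss_mask = torch.zeros(len(full_ids), dtype=torch.long)

    prev_len = 0
    for i, msg in enumerate(messages):
        # Tokenizing the prefix that ends at message i tells us which token
        # positions this message (plus its template markup) contributed.
        prefix_len = len(tokenizer.apply_chat_template(messages[: i + 1], tokenize=True))
        if msg["role"] == "assistant":
            loss_mask[prev_len:prefix_len] = 1
        prev_len = prefix_len

    input_ids = torch.tensor(full_ids[:max_length], dtype=torch.long)
    return input_ids, loss_mask[:max_length]


tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},
]
input_ids, loss_mask = build_loss_mask(tokenizer, messages)
print(tokenizer.decode(input_ids[loss_mask.bool()]))  # assistant turn only
```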

Usage Example

```yaml
# Config for multi-turn training
data:
  use_multiturn: true
  messages_key: messages
  max_length: 32000
  truncation: right
```
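
For reference, a row of the multi-turn parquet file might look like the following under `messages_key: messages`. This is a hedged sketch; the exact columns emitted by the OpenHands preprocessing script may differ, and the conversation content below is made up for illustration.

```python
import pandas as pd

# One conversation per row; the "messages" column name matches messages_key
# in the config above.
rows = [{
    "messages": [
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "The test suite fails on test_parse; please fix it."},
        {"role": "assistant", "content": "Let me look at the parser first."},
        {"role": "user", "content": "OBSERVATION: TypeError in parse() at line 42"},
        {"role": "assistant", "content": "parse() needs a None check; here is the patch."},
    ],
}]
pd.DataFrame(rows).to_parquet("multiturn_train.parquet")
```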

Testing

  • Unit tests for dataset functionality (sketched after this list)
  • Integration with existing training pipeline
  • Example scripts tested with OpenHands dataset
  • Coverage for both single-turn and multi-turn modes
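
The kind of unit check described above could look like the following. It reuses the hypothetical `build_loss_mask` helper from the earlier sketch; the PR's actual tests may be structured differently.

```python
from transformers import AutoTokenizer


def test_loss_mask_targets_assistant_only():
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
    messages = [
        {"role": "user", "content": "Please greet me."},
        {"role": "assistant", "content": "Hello there!"},
    ]
    input_ids, loss_mask = build_loss_mask(tokenizer, messages)

    # Decode only the masked positions: the assistant's reply must appear,
    # and the user's request must not.
    masked_text = tokenizer.decode(input_ids[loss_mask.bool()])
    assert "Hello there!" in masked_text
    assert "Please greet me." not in masked_text
```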

Documentation

  • Added comments explaining multi-turn specific features
  • Updated config defaults with multi-turn options
  • Added example scripts with documentation
  • Added preprocessing script with dataset-specific handling

OK, this is another PR mostly done by OpenHands, with me messaging it about 10 times.

It is still WIP as I'm testing it on a training job now; I'll report back if it works.

openhands-agent and others added 10 commits February 3, 2025 15:52
- Add MultiTurnSFTDataset class for handling multi-turn conversations
- Support different roles (system, user, assistant) with role-specific prefixes
- Set loss mask to 1 for assistant responses only
- Add comprehensive test suite for the new dataset class
- Replace custom chat formatting with HuggingFace chat template
- Use Qwen tokenizer for testing
- Fix tensor indexing and loss mask generation
- Update test to verify proper tokenization
- Use HuggingFace chat template instead of custom formatting
- Add comprehensive tests for loss mask behavior
- Verify both assistant and non-assistant content
- Add debug output for test failures
- Add separate workflow for unit tests
- Run tests in tests/soft directory
- Generate and upload coverage reports
- Use same container as e2e tests
- Move tests from tests/soft to tests/sft/unit for consistency
- Update CI workflow paths
- Keep all SFT-related tests under tests/sft
- Update trainer to support both single-turn and multi-turn datasets
- Add example script for multi-turn training
- Add data preprocessing script for multi-turn conversations
- Use proper chat template for multi-turn data
- Add use_multiturn flag (default: false)
- Add messages_key for multi-turn mode (default: messages)
- Group single-turn and multi-turn settings
- Add OpenHands SFT dataset preprocessing script
- Add token length limit (32k) for conversations
- Move multi-turn example to tests/sft
- Add train/test split and statistics
@xingyaoww xingyaoww changed the title feat: Add multi-turn SFT support [WIP] feat: Add multi-turn SFT support Feb 4, 2025
@CLAassistant

CLAassistant commented Feb 26, 2025

CLA assistant check
All committers have signed the CLA.

@xingyaoww xingyaoww changed the title [WIP] feat: Add multi-turn SFT support feat: Add multi-turn SFT support Mar 17, 2025
@xingyaoww xingyaoww marked this pull request as ready for review March 17, 2025 23:37
@xingyaoww
Contributor Author

Actually, I think this PR is ready. I've been using it for a while and trained a couple of models without issue. Would love a review here!

Also, do we really need to sign CLA for openhands-agent? 🤣

@eric-haibin-lin
Collaborator

Oh sorry I missed this PR. Will take a look

@xingyaoww
Contributor Author

Sorry! Must be some copy-pasta :(

@OpenHands can you help me first merge from main - then help me address these review comments?

Collaborator

@eric-haibin-lin eric-haibin-lin left a comment


one last comment. otherwise looks good to me!

@eric-haibin-lin
Collaborator

Is there any benchmark result you want to share on specific datasets?

@eric-haibin-lin
Collaborator

Could you merge main to pick up the fix for the Megatron tests, and also fix lint with format.sh?

@xingyaoww
Contributor Author

@eric-haibin-lin Actually, OpenHands LM 32B was trained using this PR :)
and it got decent performance on SWE-Bench Verified

https://www.all-hands.dev/blog/introducing-openhands-lm-32b----a-strong-open-coding-agent-model

@eric-haibin-lin eric-haibin-lin merged commit fb03941 into volcengine:main Apr 4, 2025
25 checks passed
yuchenwang3 pushed a commit to yuchenwang3/verl that referenced this pull request Apr 25, 2025
histmeisah pushed a commit to SJTU-IAAR/verl that referenced this pull request Apr 27, 2025