LLM Training Project

Description

This project provides a comprehensive framework for training and fine-tuning Large Language Models (LLMs) using various methods including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and quantization techniques.

Project Structure

llm_training/
├── data/                    # Training and evaluation datasets
├── gkd/                     # Generalized Knowledge Distillation related code
├── lightning_trainer/       # PyTorch Lightning-based training framework
├── models/                  # Model definitions and configurations
├── quantization/            # Model quantization utilities
├── rlhf/                    # Reinforcement Learning from Human Feedback training
├── sft/                     # Supervised Fine-Tuning with llama_recipes
├── summary_format/          # Summary formatting utilities
├── training_utils/          # Common training utilities
├── main.py                  # Main inference script
├── pyproject.toml           # Poetry configuration
└── requirements.txt         # Python dependencies

Installation

Prerequisites

  • Python 3.12 or higher
  • CUDA-compatible GPU (recommended)
  • uv or Poetry (for dependency management; the setup below uses uv)

Setup

  1. Clone the repository:
git clone <repository-url>
cd llm_training
  2. Install uv:
pip install uv
  3. Install dependencies:
uv pip install -r requirements.txt --no-build-isolation --index-strategy unsafe-best-match
  • If running on a GCP VM, format and mount the local SSD for scratch storage:

    sudo lsblk -o NAME,SIZE,TYPE,MOUNTPOINT | grep nvme0n1
    sudo mkfs.ext4 -F /dev/nvme0n1
    sudo mkdir -p /mnt/disks/local-ssd
    sudo mount /dev/nvme0n1 /mnt/disks/local-ssd
    sudo chmod a+w /mnt/disks/local-ssd
    UUID=$(sudo blkid -s UUID -o value /dev/nvme0n1)
    echo "UUID=$UUID /mnt/disks/local-ssd ext4 discard,defaults,nofail 0 2" | sudo tee -a /etc/fstab

Usage

Model Inference

Run the main inference script to test a fine-tuned model:

python main.py
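
For reference, a minimal inference sketch in the spirit of main.py, assuming the fine-tuned model is stored as a Hugging Face checkpoint; the model path and prompt below are placeholders, not values taken from this repository:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path to a fine-tuned checkpoint produced by one of the recipes below
model_path = "path/to/your-finetuned-model"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # GPU-friendly dtype; use float32 on CPU
    device_map="auto",           # place weights on the available GPU(s)
)

prompt = "Summarize the following text: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))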

Supervised Fine-Tuning (SFT)

  1. Navigate to the SFT directory:
cd sft
  2. Configure your training parameters in the config files and run:
python llama_finetuning.py
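
llama_finetuning.py builds on llama_recipes; as a rough, hedged illustration of the same idea, the sketch below fine-tunes a causal LM with the plain Hugging Face Trainer. The model name, data file, and hyperparameters are placeholders, not this project's actual configuration:

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder dataset: a JSONL file with a "text" column of formatted prompts/responses
dataset = load_dataset("json", data_files="data/train.jsonl", split="train")
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft_output", per_device_train_batch_size=4,
                           num_train_epochs=1, bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM labels
)
trainer.train()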

Lightning Trainer

  1. Navigate to the lightning trainer directory:
cd lightning_trainer
  2. Set up environment variables:
export CUDA_VISIBLE_DEVICES=0
export CUDA_LAUNCH_BLOCKING=1
export WANDB_API_KEY=<your_wandb_api_key>
export HF_SECRET_KEY=<your_huggingface_token>
export HF_DATASETS_CACHE=<your_cache_directory>
  3. Log in to the required services:
huggingface-cli login --token $HF_SECRET_KEY
wandb login --relogin $WANDB_API_KEY
  4. Start training with tmux (recommended):
tmux new -s lightning -d
tmux attach -t lightning

python trainer.py fit \
    --trainer.fast_dev_run false \
    --trainer.max_epochs 5 \
    --model.learning_rate 3e-3 \
    --data.train_batch_size 4 \
    --data.eval_batch_size 4
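
The --model.* and --data.* flags above are parsed by Lightning's CLI from the module signatures. Below is a minimal sketch of how trainer.py could be wired up; the class names and module internals are illustrative placeholders, not the repository's actual implementation:

import lightning as L
import torch
from lightning.pytorch.cli import LightningCLI


class LMModule(L.LightningModule):
    def __init__(self, learning_rate: float = 3e-3):  # becomes --model.learning_rate
        super().__init__()
        self.save_hyperparameters()                    # keeps learning_rate in self.hparams
        self.model = torch.nn.Linear(10, 10)           # stand-in for the actual LLM

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.model(x), y)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        self.log("val_loss", torch.nn.functional.mse_loss(self.model(x), y))

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.learning_rate)


class LMDataModule(L.LightningDataModule):
    def __init__(self, train_batch_size: int = 4, eval_batch_size: int = 4):  # --data.* flags
        super().__init__()
        self.train_batch_size = train_batch_size
        self.eval_batch_size = eval_batch_size

    def train_dataloader(self):
        data = torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randn(64, 10))
        return torch.utils.data.DataLoader(data, batch_size=self.train_batch_size)

    def val_dataloader(self):
        data = torch.utils.data.TensorDataset(torch.randn(16, 10), torch.randn(16, 10))
        return torch.utils.data.DataLoader(data, batch_size=self.eval_batch_size)


if __name__ == "__main__":
    LightningCLI(LMModule, LMDataModule)  # `python trainer.py fit ...` maps to the `fit` subcommand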

RLHF Training

  1. Navigate to the RLHF directory:
cd rlhf
  2. Set up environment variables:
export CUDA_VISIBLE_DEVICES="0,1"
export CUDA_LAUNCH_BLOCKING=1
export PYTORCH_CUDA_ALLOC_CONF='expandable_segments:False'
export TORCH_USE_CUDA_DSA=1
export WANDB_API_KEY=<your_wandb_api_key>
export HF_SECRET_KEY=<your_huggingface_token>
export HF_DATASETS_CACHE=<your_cache_directory>
  3. Log in to the required services:
huggingface-cli login --token $HF_SECRET_KEY
wandb login --relogin $WANDB_API_KEY
  4. Start training with accelerate:
tmux new -s rlhf -d
tmux attach -t rlhf

accelerate launch \
    --config_file "accelerate_config.yaml" \
    train.py
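
accelerate reads accelerate_config.yaml (and CUDA_VISIBLE_DEVICES) to decide how many processes to launch and how to wrap the model for multi-GPU training. The hedged sketch below shows the general shape of a script run this way; the model, data, and loss are placeholders rather than the project's RLHF objective:

import torch
from accelerate import Accelerator

accelerator = Accelerator()  # picks up the settings passed by `accelerate launch`

# Placeholder model, optimizer, and data; the real train.py optimizes an RLHF objective
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
dataset = torch.utils.data.TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=8)

# prepare() wraps everything for distributed training and mixed precision
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for x, y in dataloader:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)  # replaces loss.backward() so gradients sync across GPUs
    optimizer.step()
    optimizer.zero_grad()

accelerator.wait_for_everyone()
if accelerator.is_main_process:
    print("training finished")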

Features

  • Multiple Training Methods: SFT, RLHF, and Lightning-based training
  • Model Quantization: Support for efficient model compression (see the sketch after this list)
  • Distributed Training: Multi-GPU support with accelerate
  • Monitoring: Integration with Weights & Biases for experiment tracking
  • Flexible Configuration: Easy-to-modify configuration files
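
For the quantization feature, a common route is 4-bit NF4 loading through bitsandbytes and Transformers. The sketch below is illustrative only; the model name is a placeholder and the utilities in quantization/ may implement compression differently:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit weights with bfloat16 compute; requires the bitsandbytes package and a CUDA GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder model
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             quantization_config=bnb_config,
                                             device_map="auto")
print(model.get_memory_footprint())  # rough check of the compressed size in bytes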

Dependencies

Key dependencies include:

  • PyTorch 2.4.0+ (with CUDA support)
  • Transformers (latest from GitHub)
  • LangChain 0.2.5+
  • Lightning AI framework
  • Hugging Face ecosystem (datasets, tokenizers, etc.)

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This project is licensed under the MIT License.

