This project provides a comprehensive framework for training and fine-tuning Large Language Models (LLMs) using methods including Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), along with model quantization utilities.
llm_training/
├── data/ # Training and evaluation datasets
├── gkd/ # Generalized Knowledge Distillation related code
├── lightning_trainer/ # PyTorch Lightning-based training framework
├── models/ # Model definitions and configurations
├── quantization/ # Model quantization utilities
├── rlhf/ # Reinforcement Learning from Human Feedback training
├── sft/ # Supervised Fine-Tuning with llama_recipes
├── summary_format/ # Summary formatting utilities
├── training_utils/ # Common training utilities
├── main.py # Main inference script
├── pyproject.toml # Poetry configuration
└── requirements.txt # Python dependencies
- Python 3.12 or higher
- CUDA-compatible GPU (recommended)
- uv (for dependency management; installed in the steps below)
- Clone the repository:
git clone <repository-url>
cd llm_training
- Install uv:
pip install uv
- Install dependencies:
uv pip install -r requirements.txt --no-build-isolation --index-strategy unsafe-best-match
- If running on a GCP VM, format and mount the local SSD for fast scratch storage:
sudo lsblk -o NAME,SIZE,TYPE,MOUNTPOINT | grep nvme0n1
sudo mkfs.ext4 -F /dev/nvme0n1
sudo mkdir -p /mnt/disks/local-ssd
sudo mount /dev/nvme0n1 /mnt/disks/local-ssd
sudo chmod a+w /mnt/disks/local-ssd
UUID=$(sudo blkid -s UUID -o value /dev/nvme0n1)
echo "UUID=$UUID /mnt/disks/local-ssd ext4 discard,defaults,nofail 0 2" | sudo tee -a /etc/fstab
Run the main inference script to test a fine-tuned model:
python main.py
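The internals of main.py are not reproduced here; as a rough sketch of what loading a fine-tuned checkpoint for inference looks like with the Hugging Face stack (the model path, prompt, and generation settings below are placeholders, not values from this repo):

# Hypothetical inference sketch: load a fine-tuned causal LM and generate a completion.
# The checkpoint path and prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "path/to/your/fine-tuned-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Summarize the following text: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))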
- Navigate to the SFT directory:
cd sft
- Configure your training parameters in the config files and run:
python llama_finetuning.py
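The real entry point is llama_recipes' llama_finetuning.py together with its config files; purely to illustrate what supervised fine-tuning does, here is a minimal, self-contained SFT loop (model name, toy data, and hyperparameters are placeholders, not the project's settings):

# Minimal SFT sketch (not the project's llama_finetuning.py): fine-tune a small
# causal LM on prompt/response pairs with the standard causal-LM loss.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the project targets Llama-style models
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

pairs = [("Question: 2+2?\nAnswer:", " 4"), ("Question: capital of France?\nAnswer:", " Paris")]
texts = [prompt + response + tokenizer.eos_token for prompt, response in pairs]

def collate(batch):
    enc = tokenizer(batch, return_tensors="pt", padding=True)
    enc["labels"] = enc["input_ids"].clone()
    enc["labels"][enc["attention_mask"] == 0] = -100  # ignore padding in the loss
    return enc

loader = DataLoader(texts, batch_size=2, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

for batch in loader:
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {loss.item():.4f}")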
- Navigate to the lightning trainer directory:
cd lightning_trainer
- Set up environment variables:
export CUDA_VISIBLE_DEVICES=0
export CUDA_LAUNCH_BLOCKING=1
export WANDB_API_KEY=<your_wandb_api_key>
export HF_SECRET_KEY=<your_huggingface_token>
export HF_DATASETS_CACHE=<your_cache_directory>
- Login to required services:
huggingface-cli login --token $HF_SECRET_KEY
wandb login --relogin $WANDB_API_KEY
- Start training with tmux (recommended):
tmux new -s lightning -d
tmux attach -t lightning
python trainer.py fit \
--trainer.fast_dev_run false \
--trainer.max_epochs 5 \
--model.learning_rate 3e-3 \
--data.train_batch_size 4 \
--data.eval_batch_size 4
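The --model.* and --data.* flags above are handled by Lightning's LightningCLI, which maps them onto the __init__ arguments of the LightningModule and LightningDataModule classes. The repo's trainer.py is not reproduced here; a stripped-down sketch of that wiring, assuming the unified lightning 2.x package and using toy class and argument names, looks like:

# Illustrative LightningCLI wiring (not the repo's trainer.py).
# --model.learning_rate and --data.train_batch_size map to the __init__
# parameters of the classes passed to LightningCLI below.
import lightning as L
import torch
from torch.utils.data import DataLoader, TensorDataset
from lightning.pytorch.cli import LightningCLI

class ToyLM(L.LightningModule):
    def __init__(self, learning_rate: float = 3e-3):
        super().__init__()
        self.save_hyperparameters()
        self.layer = torch.nn.Linear(16, 16)

    def training_step(self, batch, batch_idx):
        x, = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), x)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.learning_rate)

class ToyData(L.LightningDataModule):
    def __init__(self, train_batch_size: int = 4, eval_batch_size: int = 4):
        super().__init__()
        self.train_batch_size = train_batch_size
        self.eval_batch_size = eval_batch_size  # val_dataloader omitted for brevity

    def train_dataloader(self):
        return DataLoader(TensorDataset(torch.randn(64, 16)), batch_size=self.train_batch_size)

if __name__ == "__main__":
    LightningCLI(ToyLM, ToyData)  # provides the fit subcommand and the --model.*/--data.* flags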
- Navigate to the RLHF directory:
cd rlhf
- Set up environment variables:
export CUDA_VISIBLE_DEVICES="0,1"
export CUDA_LAUNCH_BLOCKING=1
export PYTORCH_CUDA_ALLOC_CONF='expandable_segments:False'
export TORCH_USE_CUDA_DSA=1
export WANDB_API_KEY=<your_wandb_api_key>
export HF_SECRET_KEY=<your_huggingface_token>
export HF_DATASETS_CACHE=<your_cache_directory>
- Login to required services:
huggingface-cli login --token $HF_SECRET_KEY
wandb login --relogin $WANDB_API_KEY
- Start training with accelerate:
tmux new -s rlhf -d
tmux attach -t rlhf
accelerate launch \
--config_file "accelerate_config.yaml" \
train.py
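train.py itself is not shown here; as a rough sketch of how a script launched this way typically uses Accelerate for multi-GPU training (the model, data, and loss below are placeholders):

# Hypothetical sketch of an accelerate-launched training loop (not the repo's train.py).
# accelerate launch spawns one process per GPU; Accelerator handles device
# placement and gradient synchronization across processes.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

def main():
    accelerator = Accelerator()
    model = torch.nn.Linear(32, 1)  # placeholder model
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    data = TensorDataset(torch.randn(256, 32), torch.randn(256, 1))
    loader = DataLoader(data, batch_size=8, shuffle=True)

    # prepare() moves everything to the right device and wraps the model for distributed training
    model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

    model.train()
    for x, y in loader:
        loss = torch.nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)  # replaces loss.backward() so gradients sync correctly
        optimizer.step()
        optimizer.zero_grad()

    if accelerator.is_main_process:
        print("training finished")

if __name__ == "__main__":
    main()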
- Multiple Training Methods: SFT, RLHF, and Lightning-based training
- Model Quantization: Support for efficient model compression (see the sketch after this list)
- Distributed Training: Multi-GPU support with accelerate
- Monitoring: Integration with Weights & Biases for experiment tracking
- Flexible Configuration: Easy-to-modify configuration files
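The quantization/ utilities are not reproduced here; a common pattern they likely build on is 4-bit loading with bitsandbytes through transformers. The following is a generic sketch, not the repo's code, and the model name is a placeholder:

# Generic 4-bit quantized loading sketch (bitsandbytes + transformers);
# not taken from this repo's quantization/ module. The model name is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
print(f"memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")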
Key dependencies include:
- PyTorch 2.4.0+ (with CUDA support)
- Transformers (latest from GitHub)
- LangChain 0.2.5+
- Lightning AI framework
- Hugging Face ecosystem (datasets, tokenizers, etc.)
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License.