T-REX: Mixture-of-Rank-One-Experts with semantic-aware Intuition for Multi-task Large Language Model Finetuning
# Create a new conda environment
conda create -n trex python=3.10
conda activate trex
# Navigate to the project directory
cd ./trex
Install PyTorch 2.5.1 for your CUDA version, following the instructions on the official PyTorch website.
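As a rough sketch of the PyTorch step, the helper below prints a pip command for a given CUDA wheel tag so it can be reviewed before running. The function name `torch_install_cmd` is hypothetical and not part of this project, and `cu121` is only an assumed default. Check the PyTorch website for the tag that matches your driver.

```shell
# Hypothetical helper (not part of this repo): print the pip command
# for PyTorch 2.5.1 built against a given CUDA wheel tag.
# The default tag cu121 is an assumption; pick the one matching your system.
torch_install_cmd() {
  cuda_tag="${1:-cu121}"
  echo "pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/${cuda_tag}"
}

# Print the command for review, then copy and run it yourself
torch_install_cmd cu121
```

After installing, `python -c "import torch; print(torch.__version__, torch.cuda.is_available())"` is a quick way to confirm the version and that the GPU is visible.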
Install dependencies and project packages:
# Install requirements
pip install -r requirements.txt
# Install local packages in development mode
pip install -e ./transformers
pip install -e ./peft
Download the Trex dataset from Hugging Face:
# Download dataset
huggingface-cli download --repo-type dataset --resume-download leoboy20/trex_dataset --local-dir datasets
# Extract validation datasets
cd ./datasets
tar -xzvf npys_val_datasets.tar.gz
Configure your training settings:
- Open the JSON configuration file in the train_args directory
- Modify the model_name_or_path parameter to your desired pre-trained model
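If you prefer to script the configuration step rather than edit the file by hand, something like the following may work. It is a minimal sketch that assumes the config is a flat JSON object with a top-level model_name_or_path key. The helper name `set_model_path`, the file name, and the model path in the usage comment are placeholders, not names from this project.

```shell
# Hypothetical helper (not part of this repo): set model_name_or_path
# in a JSON training config, leaving all other keys untouched.
# Usage: set_model_path <config.json> <model name or local path>
set_model_path() {
  python3 - "$1" "$2" <<'PY'
import json
import sys

config_path, model = sys.argv[1], sys.argv[2]

# Load the existing config (assumed to be a top-level JSON object)
with open(config_path) as f:
    cfg = json.load(f)

cfg["model_name_or_path"] = model

# Write the config back with readable indentation
with open(config_path, "w") as f:
    json.dump(cfg, f, indent=2)
PY
}

# Example (both arguments are placeholders):
# set_model_path train_args/your_config.json /path/to/base-model
```

Rewriting the file through a JSON parser avoids the quoting mistakes that are easy to make with a plain text substitution.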
Start training:
bash trex_train.sh
Configure your evaluation settings:
- Modify adapter_name_or_path in trex_eval.sh to point to your trained adapter in the output directory
- Update model_name_or_path to match your base model
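Before launching evaluation, it can help to confirm that the adapter path actually points at a saved adapter. The check below is a sketch under one assumption: that the bundled peft package writes an adapter_config.json file into the adapter directory, as upstream PEFT does. The helper name `check_adapter_dir` and the example path are hypothetical.

```shell
# Hypothetical pre-flight check (not part of this repo): confirm that a
# directory looks like a saved PEFT adapter.
# Assumption: the adapter directory contains adapter_config.json,
# as it does with upstream PEFT.
check_adapter_dir() {
  if [ -f "$1/adapter_config.json" ]; then
    echo "adapter found: $1"
  else
    echo "no adapter_config.json in: $1" >&2
    return 1
  fi
}

# Example (path is a placeholder):
# check_adapter_dir output/your_run && bash trex_eval.sh
```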
Run evaluation:
bash trex_eval.sh