SuperAdapters

Finetune ALL LLMs with ALL Adapeters on ALL Platforms!

Support

Model	LoRA	QLoRA	AdaLoRA	Prefix Tuning	P-Tuning	Prompt Tuning
Bloom	✅	✅	✅	✅	✅	✅
LLaMA	✅	✅	✅	✅	✅	✅
LLaMA2	✅	✅	✅	✅	✅	✅
LLaMA3/3.1/3.2	✅	✅	✅	✅	✅	✅
ChatGLM	✅	✅	✅	☑️	☑️	☑️
ChatGLM2	✅	✅	✅	☑️	☑️	☑️
Qwen	✅	✅	✅	✅	✅	✅
Baichuan	✅	✅	✅	✅	✅	✅
Mixtral	✅	✅	✅	✅	✅	✅
Phi	✅	✅	✅	✅	✅	✅
Phi3	✅	✅	✅	✅	✅	✅
Gemma	✅	✅	✅	✅	✅	✅

You can Finetune LLM on

Windows
Linux
Mac M1/2

You can Handle train / test Data with

Terminal
File
DataBase

You can Do various Task

CausalLM (default)
SequenceClassification

P.S. Unfortunately, SuperAdapters do not support qlora on Mac, please use lora/adalora instead.

Requirement

CentOS:

yum install -y xz-devel

Ubuntu:

apt-get install -y liblzma-dev

MacOS:

brew install xz

P.S. Maybe you should recompile the python with xz

CPPFLAGS="-I$(brew --prefix xz)/include" pyenv install 3.10.0

If you want to use gpu on Mac, Please read How to use GPU on Mac

P.S. Please Make sure your MacOS version > 14.0 !

pip uninstall torch torchvision torchaudio
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
pip install -r requirements.txt

LLMs

Model	Download Link
Bloom	https://huggingface.co/bigscience/bloom-560m
LLaMA	https://huggingface.co/openlm-research/open_llama_3b_600bt_preview
LLaMA2	https://huggingface.co/meta-llama/Llama-2-13b-hf
LLaMA3	https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
Vicuna	https://huggingface.co/lmsys/vicuna-7b-delta-v1.1
ChatGLM	https://huggingface.co/THUDM/chatglm-6b
ChatGLM2	https://huggingface.co/THUDM/chatglm2-6b
Qwen	https://huggingface.co/Qwen/Qwen-7B-Chat
Mixtral	https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
Phi	https://huggingface.co/microsoft/phi-2
Phi3	https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
Gemma	https://huggingface.co/alpindale/gemma-2b-it

Finetune Data Format

Here is an example

Usage

ChatGLM with lora

python finetune.py --model_type chatglm --data "data/train/" --model_path "LLMs/chatglm/chatglm-6b/" --adapter "lora" --output_dir "output/chatglm"

python inference.py --model_type chatglm --instruction "Who are you?" --model_path "LLMs/chatglm/chatglm-6b/" --adapter_weights "output/chatglm" --max_new_tokens 32

LLaMa with lora

python finetune.py --model_type llama --data "data/train/" --model_path "LLMs/open-llama/open-llama-3b/" --adapter "lora" --output_dir "output/llama"

python inference.py --model_type llama --instruction "Who are you?" --model_path "LLMs/open-llama/open-llama-3b" --adapter_weights "output/llama" --max_new_tokens 32

Qwen with lora

python finetune.py --model_type qwen --data "data/train/" --model_path "LLMs/Qwen/Qwen-7b-chat" --adapter "lora" --output_dir "output/Qwen"

python inference.py --model_type qwen --instruction "Who are you?" --model_path "LLMs/Qwen/Qwen-7b-chat" --adapter_weights "output/Qwen" --max_new_tokens 32

Other LLMs are some usage of the above.

Use Classify Mode

You need to specify task_type('classify') and labels

python finetune.py --model_type llama --data "data/train/alpaca_tiny_classify.json" --model_path "LLMs/open-llama/open-llama-3b" --adapter "lora" --output_dir "output/llama" --task_type classify --labels '["0", "1"]' --disable_wandb

python inference.py --model_type llama --data "data/train/alpaca_tiny_classify.json" --model_path "LLMs/open-llama/open-llama-3b" --adapter_weights "output/llama" --task_type classify --labels '["0", "1"]' --disable_wandb

Use DataBase

You need to install a MySQL, and put the db config into the system env.

Eg.

export LLM_DB_HOST='127.0.0.1'
export LLM_DB_PORT=3306
export LLM_DB_USERNAME='YOURUSERNAME'
export LLM_DB_PASSWORD='YOURPASSWORD'
export LLM_DB_NAME='YOURDBNAME'

create the necessary tables

Here is the sql files

source xxxx.sql

db_iteration: [train/test] The record's set name.
db_type: [test] The record is whether "train" or "test".
db_test_iteration: [test] The record's test set name.

finetune (use chatglm for example)

python finetune.py --model_type chatglm --fromdb --db_iteration xxxxxx --model_path "LLMs/chatglm/chatglm-6b/" --adapter "lora" --output_dir "output/chatglm" --disable_wandb

eval

python inference.py --model_type chatglm --fromdb --db_iteration xxxxxx --db_type 'test' --db_test_iteration yyyyyyy --model_path "LLMs/chatglm/chatglm-6b/" --adapter_weights "output/chatglm" --max_new_tokens 6

Params

Finetune

usage: finetune.py [-h] [--data DATA] [--model_type {llama,llama2,llama3,chatglm,chatglm2,bloom,qwen,baichuan,mixtral,phi,gemma}] [--task_type {seq2seq,classify}] [--labels LABELS] [--model_path MODEL_PATH]
                   [--output_dir OUTPUT_DIR] [--disable_wandb] [--adapter {lora,qlora,adalora,prompt,p_tuning,prefix}] [--lora_r LORA_R] [--lora_alpha LORA_ALPHA] [--lora_dropout LORA_DROPOUT]
                   [--lora_target_modules LORA_TARGET_MODULES [LORA_TARGET_MODULES ...]] [--adalora_init_r ADALORA_INIT_R] [--adalora_tinit ADALORA_TINIT] [--adalora_tfinal ADALORA_TFINAL]
                   [--adalora_delta_t ADALORA_DELTA_T] [--num_virtual_tokens NUM_VIRTUAL_TOKENS] [--mapping_hidden_dim MAPPING_HIDDEN_DIM] [--epochs EPOCHS] [--learning_rate LEARNING_RATE]
                   [--cutoff_len CUTOFF_LEN] [--val_set_size VAL_SET_SIZE] [--group_by_length] [--logging_steps LOGGING_STEPS] [--load_8bit] [--add_eos_token]
                   [--resume_from_checkpoint [RESUME_FROM_CHECKPOINT]] [--per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE] [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS] [--fromdb]
                   [--db_iteration DB_ITERATION]

Finetune for all.

optional arguments:
  -h, --help            show this help message and exit
  --data DATA           the data used for instructing tuning
  --model_type {llama,llama2,llama3,chatglm,chatglm2,bloom,qwen,baichuan,mixtral,phi,gemma}
  --task_type {seq2seq,classify}
  --labels LABELS       Labels to classify, only used when task_type is classify
  --model_path MODEL_PATH
  --output_dir OUTPUT_DIR
                        The DIR to save the model
  --disable_wandb       Disable report to wandb
  --adapter {lora,qlora,adalora,prompt,p_tuning,prefix}
  --lora_r LORA_R
  --lora_alpha LORA_ALPHA
  --lora_dropout LORA_DROPOUT
  --lora_target_modules LORA_TARGET_MODULES [LORA_TARGET_MODULES ...]
                        the module to be injected, e.g. q_proj/v_proj/k_proj/o_proj for llama, query_key_value for bloom&GLM
  --adalora_init_r ADALORA_INIT_R
  --adalora_tinit ADALORA_TINIT
                        number of warmup steps for AdaLoRA wherein no pruning is performed
  --adalora_tfinal ADALORA_TFINAL
                        fix the resulting budget distribution and fine-tune the model for tfinal steps when using AdaLoRA
  --adalora_delta_t ADALORA_DELTA_T
                        interval of steps for AdaLoRA to update rank
  --num_virtual_tokens NUM_VIRTUAL_TOKENS
  --mapping_hidden_dim MAPPING_HIDDEN_DIM
  --epochs EPOCHS
  --learning_rate LEARNING_RATE
  --cutoff_len CUTOFF_LEN
  --val_set_size VAL_SET_SIZE
  --group_by_length
  --logging_steps LOGGING_STEPS
  --load_8bit
  --add_eos_token
  --resume_from_checkpoint [RESUME_FROM_CHECKPOINT]
                        resume from the specified or the latest checkpoint, e.g. `--resume_from_checkpoint [path]` or `--resume_from_checkpoint`
  --per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE
                        Batch size per GPU/CPU for training.
  --gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS
  --fromdb
  --db_iteration DB_ITERATION
                        The record's set name.
  --db_item_num DB_ITEM_NUM
                        The Limit Num of train/test items selected from DB.

Generate

usage: inference.py [-h] [--debug] [--web] [--api] [--instruction INSTRUCTION] [--input INPUT] [--max_input MAX_INPUT] [--test_data_path TEST_DATA_PATH]
                    [--model_type {llama,llama2,llama3,chatglm,chatglm2,bloom,qwen,baichuan,mixtral,phi,phi3,gemma}] [--task_type {seq2seq,classify}] [--labels LABELS] [--model_path MODEL_PATH]
                    [--adapter_weights ADAPTER_WEIGHTS] [--load_8bit] [--temperature TEMPERATURE] [--top_p TOP_P] [--top_k TOP_K] [--max_new_tokens MAX_NEW_TOKENS] [--vllm] [--fromdb] [--db_type DB_TYPE]
                    [--db_iteration DB_ITERATION] [--db_test_iteration DB_TEST_ITERATION] [--db_item_num DB_ITEM_NUM]

Inference for all.

optional arguments:
  -h, --help            show this help message and exit
  --debug               Debug Mode to output detail info
  --web                 Web Demo to try the inference
  --api                 API to try the inference
  --instruction INSTRUCTION
  --input INPUT
  --max_input MAX_INPUT
                        Limit the input length to avoid OOM or other bugs
  --test_data_path TEST_DATA_PATH
                        The DIR of test data
  --model_type {llama,llama2,llama3,chatglm,chatglm2,bloom,qwen,baichuan,mixtral,phi,phi3,gemma}
  --task_type {seq2seq,classify}
  --labels LABELS       Labels to classify, only used when task_type is classify
  --model_path MODEL_PATH
  --adapter_weights ADAPTER_WEIGHTS
                        The DIR of adapter weights
  --load_8bit
  --temperature TEMPERATURE
                        temperature higher, LLM is more creative
  --top_p TOP_P
  --top_k TOP_K
  --max_new_tokens MAX_NEW_TOKENS
  --vllm                Use vllm to accelerate inference.
  --fromdb
  --db_type DB_TYPE     The record is whether 'train' or 'test'.
  --db_iteration DB_ITERATION
                        The record's set name.
  --db_test_iteration DB_TEST_ITERATION
                        The record's test set name.
  --db_item_num DB_ITEM_NUM
                        The Limit Num of train/test items selected from DB.

Use vllm:

Combine the Base Model and Adapter weight

python tool.py combine --model_type llama3 --model_path "LLMs/llama3.1/" --adapter_weights "output/llama3.1/" --output_dir "output/llama3.1-combined/"

Install the dependencies and start vllm server, Help Link.
use option vllm

python inference.py --model_type llama3 --instruction "Who are you?" --model_path "/root/SuperAdapters/output/llama3.1-combined" --vllm --max_new_tokens 32

Tool

Combine Base Model and Adapter weight

usage: tool.py combine [-h] [--model_type {llama,llama2,llama3,chatglm,chatglm2,bloom,qwen,baichuan,mixtral,phi,phi3,gemma}] [--model_path MODEL_PATH] [--adapter_weights ADAPTER_WEIGHTS]
                       [--output_dir OUTPUT_DIR] [--max_shard_size MAX_SHARD_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  --model_type {llama,llama2,llama3,chatglm,chatglm2,bloom,qwen,baichuan,mixtral,phi,phi3,gemma}
  --model_path MODEL_PATH
  --adapter_weights ADAPTER_WEIGHTS
                        The DIR of adapter weights
  --output_dir OUTPUT_DIR
                        The DIR to save the model
  --max_shard_size MAX_SHARD_SIZE
                        Max size of each of the combined model weight, like 1GB,5GB,etc.

python tool.py combine --model_type llama --model_path "LLMs/open-llama/open-llama-3b/" --adapter_weights "output/llama/" --output_dir "output/combine/"

Inference Web

Add the "--web" parameter

python inference.py --model_type phi --model_path "LLMs/phi/phi-2" --web

Inference API

Add the "--api" parameter

python inference.py --model_type phi --model_path "LLMs/phi/phi-2" --api

Label Web

Classify

python web/label.py

Chat

python web/label.py --type chat

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
LLMs		LLMs
api		api
common		common
core		core
data		data
media		media
output		output
sql		sql
web		web
.gitignore		.gitignore
ErrorSolution.md		ErrorSolution.md
LICENSE		LICENSE
MacGPUEnv.md		MacGPUEnv.md
README.md		README.md
finetune.py		finetune.py
inference.py		inference.py
requirements.txt		requirements.txt
tool.py		tool.py
vLLMEnv.md		vLLMEnv.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SuperAdapters

Support

Requirement

LLMs

Finetune Data Format

Usage

ChatGLM with lora

LLaMa with lora

Qwen with lora

Use Classify Mode

Use DataBase

Params

Finetune

Generate

Tool

Combine Base Model and Adapter weight

Inference Web

Inference API

Label Web

Classify

Chat

Reference

About

Releases

Packages

Contributors 2

Languages

License

cckuailong/SuperAdapters

Folders and files

Latest commit

History

Repository files navigation

SuperAdapters

Support

Requirement

LLMs

Finetune Data Format

Usage

ChatGLM with lora

LLaMa with lora

Qwen with lora

Use Classify Mode

Use DataBase

Params

Finetune

Generate

Tool

Combine Base Model and Adapter weight

Inference Web

Inference API

Label Web

Classify

Chat

Reference

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages