
Llama pro #52

Merged: 5 commits into main on Apr 17, 2024

Conversation

KeremTurgutlu (Contributor) commented:

  • BnB and HQQ 4-bit Llama-Pro FSDP training changes.
    These can be tested with the following commands:
# create llama pro
python scripts/block_expansion.py \
--model_name meta-llama/Llama-2-7b-hf \
--output_dir /mnt/vol_b/models \
--expansion_rate 0.1
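
For reference, here is a minimal sketch of what Llama-Pro-style block expansion does (the output path name _blk_exp-32-35 below suggests 32 decoder layers expanded to 35 at expansion_rate 0.1). This is an assumption-laden illustration against a Hugging Face LlamaForCausalLM, not the exact contents of scripts/block_expansion.py:

# Hedged sketch of Llama-Pro block expansion; scripts/block_expansion.py
# may choose insertion positions and serialization differently.
import copy
import torch
from transformers import AutoModelForCausalLM

def expand_blocks(model, expansion_rate=0.1):
    layers = model.model.layers
    n = len(layers)                            # e.g. 32 for Llama-2-7B
    n_new = max(1, round(n * expansion_rate))  # 32 * 0.1 -> 3 new blocks (35 total)
    step = n // n_new                          # one copy per `step` original blocks
    expanded, added = [], 0
    for i, layer in enumerate(layers):
        expanded.append(layer)
        if added < n_new and (i + 1) % step == 0:
            blk = copy.deepcopy(layer)
            # Zero the residual-branch output projections so each new block is
            # an identity map at init: the expanded model starts out computing
            # exactly what the original did (the Llama-Pro trick).
            torch.nn.init.zeros_(blk.self_attn.o_proj.weight)
            torch.nn.init.zeros_(blk.mlp.down_proj.weight)
            expanded.append(blk)
            added += 1
    model.model.layers = torch.nn.ModuleList(expanded)
    model.config.num_hidden_layers = len(expanded)
    # Keep KV-cache bookkeeping consistent after re-stacking.
    for idx, layer in enumerate(model.model.layers):
        layer.self_attn.layer_idx = idx
    return model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf",
                                             torch_dtype=torch.bfloat16)
expand_blocks(model, expansion_rate=0.1)
model.save_pretrained("/mnt/vol_b/models/meta-llama/Llama-2-7b-hf_blk_exp-32-35")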

# train
python train.py \
--model_name meta-llama/Llama-2-7b-hf \
--dataset orca_math \
--dataset_samples 1000 \
--batch_size 8 \
--context_length 1024 \
--gradient_accumulation_steps 2 \
--train_type bnb_llama_pro \
--llama_pro_path /mnt/vol_b/models/meta-llama/Llama-2-7b-hf_blk_exp-32-35/ \
--sharding_strategy full_shard \
--use_gradient_checkpointing true \
--reentrant_checkpointing true \
--use_cpu_offload false \
--use_activation_cpu_offload false \
--log_to wandb \
--verbose true \
--project_name "fsdp-dora-tests" \
--save_model true \
--output_dir /mnt/vol_b/models/llama-7b-orca-math-1k-bnb-llama-pro
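
Assuming the llama_pro train types follow the Llama-Pro recipe, train.py updates only the newly inserted blocks while the original layers are quantized to 4-bit (BnB or HQQ) and frozen. Below is a rough sketch of that parameter split; the block indices are hypothetical (the real ones follow from how the checkpoint was expanded):

# Hedged sketch: freeze everything except the inserted Llama-Pro blocks.
def freeze_for_llama_pro(model, new_block_idxs):
    for p in model.parameters():
        p.requires_grad = False          # frozen base; this is the 4-bit part
    for idx in new_block_idxs:
        for blk_p in model.model.layers[idx].parameters():
            blk_p.requires_grad = True   # only expanded blocks receive gradients

# With the interleaving from the sketch above, the copies land at positions
# 10, 21 and 32 of the 35-layer stack (illustrative, not taken from train.py):
# freeze_for_llama_pro(model, new_block_idxs=[10, 21, 32])

One practical note: mixing trainable and frozen parameters under FSDP generally requires use_orig_params=True or wrapping frozen and trainable modules separately, a constraint worth keeping in mind when reproducing this setup.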

Results: https://wandb.ai/answerdotai/fsdp-dora-tests

@johnowhitaker (Contributor) left a comment:

Have not done a complete code review, but it looks to be working :)

@KeremTurgutlu merged commit 0f92b34 into main on Apr 17, 2024.