# Swallow TART

## Features
- DeepSpeed
  - To install it, run the script [here](../../../install-deepspeed.sh); see the sketch after this list.
- Flash-Attention
  - To install it, follow the instructions [here](https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features).
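A minimal install sketch, assuming a CUDA environment with PyTorch already set up; the Flash-Attention line follows the upstream instructions linked above:

```bash
# Install DeepSpeed with the bundled script (path is relative to this README).
bash ../../../install-deepspeed.sh

# Install Flash-Attention per its README.
pip install flash-attn --no-build-isolation
```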

## Training

```bash
$ deepspeed examples/training/swallow-tart/run_train.py \
    --hf_dataset_dir $HF_DATA \
    --data_dir $DATA \
    --task_names $TASKS \
    --max_length $MAX_LENGTH \
    --n_dev_sample $N_DEV_SAMPLE \
    --num_proc $N_PROC \
    --model_name $MODEL_NAME \
    --output_dir $OUTPUT_DIR \
    --do_train \
    --do_eval \
    --evaluation_strategy steps \
    --per_device_train_batch_size $BATCH \
    --gradient_accumulation_steps $GRAD_ACCUM \
    --learning_rate $LR \
    --num_train_epochs $EPOCH \
    --weight_decay $WEIGHT_DECAY \
    --warmup_steps $WARMUP \
    --logging_steps $LOG_STEP \
    --save_steps $SAVE_STEP \
    --save_total_limit $SAVE_LIMIT \
    --bf16 \
    --load_best_model_at_end \
    --metric_for_best_model eval_loss \
    --gradient_checkpointing \
    --use_flash_attention \
    --remove_unused_columns False \
    --deepspeed $DS_CONFIG
```
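
The command above reads its settings from shell variables. Below is a minimal sketch of example values; every path, model id, and hyperparameter here is an illustrative assumption, not a recommendation from this repository. `DS_CONFIG` (and, for LoRA, `LORA_CONFIG`) are covered in the Configs section below.

```bash
# Illustrative placeholder values only; adjust to your data, model, and hardware.
HF_DATA=/path/to/hf_dataset            # Hugging Face dataset directory (assumption)
DATA=/path/to/data                     # raw data directory (assumption)
TASKS="task_a task_b"                  # task names; exact format is an assumption
MAX_LENGTH=512
N_DEV_SAMPLE=1000
N_PROC=8
MODEL_NAME=tokyotech-llm/Swallow-7b-hf # example base model id (assumption)
OUTPUT_DIR=outputs/swallow-tart
BATCH=4
GRAD_ACCUM=8
LR=1e-5
EPOCH=3
WEIGHT_DECAY=0.01
WARMUP=100
LOG_STEP=10
SAVE_STEP=500
SAVE_LIMIT=2
```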

If you want to use LoRA, add `--peft_config_path $LORA_CONFIG` to the command.

## Configs
- DeepSpeed config
  - ZeRO-3 (CPU offload): [ds_config_zero3.json](./configs/ds_config_zero3.json); see the sketch below for passing it via `--deepspeed`.
- LoRA config
  - LoRA: [lora_config.json](./configs/lora_config.json), passed via `--peft_config_path`.
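
A minimal sketch of wiring these files into the training command above, assuming the command is launched from the repository root:

```bash
# Paths assume the training command is launched from the repository root.
DS_CONFIG=examples/training/swallow-tart/configs/ds_config_zero3.json
# Only needed when training with LoRA (--peft_config_path $LORA_CONFIG).
LORA_CONFIG=examples/training/swallow-tart/configs/lora_config.json
```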
