Supplementary xformers instructions
guanlaoda committed Sep 2, 2023
1 parent 945096f commit a6afb8c
Showing 2 changed files with 4 additions and 2 deletions.
README.md: 3 changes (3 additions, 0 deletions)
@@ -210,10 +210,13 @@ python llava/train/train_mem.py \
--report_to wandb
```
</details>

<details>
<summary>Pretrain: LLaVA-7B, 8x V100 (32G). Time: ~20 hours.</summary>

We provide a training script with DeepSpeed [here](https://github.com/haotian-liu/LLaVA/blob/main/scripts/pretrain_xformers.sh).
Tips:
- If you are using V100s, which are not supported by FlashAttention, you can use the [memory-efficient attention](https://arxiv.org/abs/2112.05682) implemented in [xFormers](https://github.com/facebookresearch/xformers). Install xformers and replace `llava/train/train_mem.py` above with [llava/train/train_xformers.py](llava/train/train_xformers.py); see the sketch after this section.
</details>
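
For reference, here is a minimal sketch of how xFormers' memory-efficient attention is typically invoked. The tensor shapes and the causal mask are illustrative assumptions, not the exact code in `llava/train/train_xformers.py`; that script routes the model's attention through a call along these lines instead of the default implementation.

```python
import torch
import xformers.ops as xops

# Illustrative shapes: (batch, seq_len, num_heads, head_dim), fp16 on GPU,
# as would be typical for V100 training.
q = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)

# Memory-efficient attention with a causal (lower-triangular) mask, the usual
# setting for autoregressive LM training. The full seq_len x seq_len attention
# matrix is never materialized, which is what saves memory on V100s.
out = xops.memory_efficient_attention(q, k, v, attn_bias=xops.LowerTriangularMask())
```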

### Visual Instruction Tuning
pyproject.toml: 3 changes (1 addition, 2 deletions)
@@ -25,8 +25,7 @@ dependencies = [
"scikit-learn==1.2.2",
"sentencepiece==0.1.99",
"einops==0.6.1", "einops-exts==0.0.4", "timm==0.6.13",
"gradio_client==0.2.9",
"xformers"
"gradio_client==0.2.9"
]

[project.urls]
