Commit

addressing comments

SalmanMohammadi committed Oct 16, 2024
2 parents e556bb0 + 6a8a027 commit 9b13790
Showing 91 changed files with 502 additions and 252 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/gpu_test.yaml
@@ -53,7 +53,7 @@ jobs:
      - name: Install remaining dependencies
        run: |
          python -m pip install -e ".[dev]"
-         python -m pip install git+https://github.com/EleutherAI/lm-evaluation-harness.git@fb963f0f0a5b28b69763590bb59676072cf43a01
+         python -m pip install lm-eval==0.4.5
      - name: Run recipe and unit tests with coverage
        run: pytest tests --with-integration --cov=. --cov-report=xml --durations=20 -vv
      - name: Upload Coverage to Codecov
2 changes: 1 addition & 1 deletion .github/workflows/recipe_test.yaml
@@ -42,7 +42,7 @@ jobs:
        run: |
          python -m pip install torch torchvision torchao
          python -m pip install -e ".[dev]"
-         python -m pip install git+https://github.com/EleutherAI/lm-evaluation-harness.git@fb963f0f0a5b28b69763590bb59676072cf43a01
+         python -m pip install lm-eval==0.4.5
      - name: Run recipe tests with coverage
        run: pytest tests -m integration_test --cov=. --cov-report=xml --durations=20 -vv
      - name: Upload Coverage to Codecov
2 changes: 1 addition & 1 deletion .github/workflows/regression_test.yaml
@@ -56,7 +56,7 @@ jobs:
      - name: Install remaining dependencies
        run: |
          python -m pip install -e ".[dev]"
-         python -m pip install lm-eval==0.4.*
+         python -m pip install lm-eval==0.4.5
      - name: Run regression tests with coverage
        run: pytest tests -m slow_integration_test --silence-s3-logs --cov=. --cov-report=xml --durations=20 -vv
      - name: Upload Coverage to Codecov
9 changes: 9 additions & 0 deletions CITATION.cff
@@ -0,0 +1,9 @@
cff-version: 1.2.0
title: "torchtune: PyTorch's finetuning library"
message: "If you use this software, please cite it as below."
type: software
authors:
- given-names: "torchtune maintainers and contributors"
url: "https//github.com/pytorch/torchtune"
license: "BSD-3-Clause"
date-released: "2024-04-14"
18 changes: 17 additions & 1 deletion README.md
@@ -7,7 +7,7 @@
![Recipe Integration Test](https://github.com/pytorch/torchtune/actions/workflows/recipe_test.yaml/badge.svg)
[![](https://dcbadge.vercel.app/api/server/4Xsdn8Rr9Q?style=flat)](https://discord.gg/4Xsdn8Rr9Q)

- [**Introduction**](#introduction) | [**Installation**](#installation) | [**Get Started**](#get-started) | [**Documentation**](https://pytorch.org/torchtune/main/index.html) | [**Community**](#community) | [**License**](#license)
+ [**Introduction**](#introduction) | [**Installation**](#installation) | [**Get Started**](#get-started) | [**Documentation**](https://pytorch.org/torchtune/main/index.html) | [**Community**](#community) | [**License**](#license) | [**Citing torchtune**](#citing-torchtune)

> [!IMPORTANT]
> Update September 25, 2024: torchtune has support for **Llama 3.2 11B Vision**, **Llama 3.2 3B**, and **Llama 3.2 1B** models! Try them out by following our installation instructions [here](#Installation), then run any of the text configs [here](recipes/configs/llama3_2) or vision configs [here](recipes/configs/llama3_2_vision).
@@ -282,3 +282,19 @@ We also want to acknowledge some awesome libraries and tools from the ecosystem:
## License

torchtune is released under the [BSD 3 license](./LICENSE). However you may have other legal obligations that govern your use of other content, such as the terms of service for third-party models.


## Citing torchtune

If you find the torchtune library useful, please cite it in your work as below.

```bibtex
@software{torchtune,
title = {torchtune: PyTorch's finetuning library},
author = {torchtune maintainers and contributors},
url = {https://github.com/pytorch/torchtune},
license = {BSD-3-Clause},
month = apr,
year = {2024}
}
```
1 change: 0 additions & 1 deletion docs/source/api_ref_modules.rst
@@ -14,7 +14,6 @@ Modeling Components and Building Blocks
    MultiHeadAttention
    FeedForward
    KVCache
-   get_cosine_schedule_with_warmup
    RotaryPositionalEmbeddings
    RMSNorm
    Fp32LayerNorm
13 changes: 13 additions & 0 deletions docs/source/api_ref_training.rst
@@ -74,6 +74,19 @@ Utilities to reduce memory consumption during training.
    create_optim_in_bwd_wrapper
    register_optim_in_bwd_hooks

.. _lr_scheduler_label:

Schedulers
----------

Utilities to control lr during the training process.

.. autosummary::
    :toctree: generated/
    :nosignatures:

    get_cosine_schedule_with_warmup

.. _metric_logging_label:

Metric Logging
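For context on the entry documented above: this commit moves the cosine-warmup scheduler from `torchtune.modules` to `torchtune.training.lr_schedulers` (hence the config updates below). A minimal usage sketch, assuming the signature mirrors the Hugging Face equivalent (`optimizer`, `num_warmup_steps`, `num_training_steps`):

```python
# Hedged sketch: illustrates the relocated scheduler utility. The signature is
# assumed to mirror the Hugging Face version; verify against the torchtune API docs.
import torch
from torchtune.training.lr_schedulers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,     # same value used in the recipe configs below
    num_training_steps=1000,  # total number of optimizer updates
)

for _ in range(1000):
    optimizer.step()   # one parameter update == one scheduler "step"
    scheduler.step()
    optimizer.zero_grad()
```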
2 changes: 1 addition & 1 deletion docs/source/tutorials/memory_optimizations.rst
@@ -128,7 +128,7 @@ For example: with ``batch_size=1`` and ``gradient_accumulation_steps=32`` we get
.. note::

    For other components in torchtune which use "steps", such as :ref:`metric logging <metric_logging_label>`, or
-   :func:`learning rate schedulers <torchtune.modules.get_cosine_schedule_with_warmup>`, a "step" is counted as a
+   :func:`learning rate schedulers <torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup>`, a "step" is counted as a
    single update to model parameters, rather than a single model forward pass with the data.
    Suppose ``gradient_accumulation_steps = 4`` and ``log_every_n_steps = 10``.
    Metrics would be logged every 10 global steps, which translates to every 40 model forward passes.
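To make the arithmetic in the note above concrete, here is a small, purely illustrative sketch of the step accounting (plain Python, not torchtune code):

```python
# Illustrative only: counts "global steps" the way the note describes.
gradient_accumulation_steps = 4
log_every_n_steps = 10

global_step = 0
for batch_idx in range(1, 81):  # 80 forward/backward passes over the data
    # loss.backward() would run here; gradients accumulate across batches
    if batch_idx % gradient_accumulation_steps == 0:
        global_step += 1        # one optimizer update == one global step
        if global_step % log_every_n_steps == 0:
            print(f"log metrics at global step {global_step} "
                  f"({batch_idx} forward passes so far)")
# Prints at global steps 10 and 20, i.e. after 40 and 80 forward passes.
```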
2 changes: 1 addition & 1 deletion recipes/configs/code_llama2/7B_lora_single_device.yaml
@@ -64,7 +64,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100
loss:
  _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
2 changes: 1 addition & 1 deletion recipes/configs/code_llama2/7B_qlora_single_device.yaml
@@ -64,7 +64,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100
loss:
  _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
2 changes: 1 addition & 1 deletion recipes/configs/gemma/2B_lora.yaml
@@ -55,7 +55,7 @@ optimizer:
  lr: 2e-5

lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 10

loss:
2 changes: 1 addition & 1 deletion recipes/configs/gemma/2B_lora_single_device.yaml
@@ -54,7 +54,7 @@ optimizer:
  lr: 2e-5

lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 10

loss:
2 changes: 1 addition & 1 deletion recipes/configs/gemma/2B_qlora_single_device.yaml
@@ -54,7 +54,7 @@ optimizer:
  lr: 2e-5

lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 10

loss:
2 changes: 1 addition & 1 deletion recipes/configs/gemma/7B_lora.yaml
@@ -57,7 +57,7 @@ optimizer:
  lr: 2e-5

lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 10

loss:
2 changes: 1 addition & 1 deletion recipes/configs/gemma/7B_lora_single_device.yaml
@@ -56,7 +56,7 @@ optimizer:
  lr: 5e-5

lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 10

loss:
2 changes: 1 addition & 1 deletion recipes/configs/gemma/7B_qlora_single_device.yaml
@@ -56,7 +56,7 @@ optimizer:
  lr: 2e-5

lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 10

loss:
39 changes: 39 additions & 0 deletions recipes/configs/gemma/evaluation.yaml
@@ -0,0 +1,39 @@
# Config for EleutherEvalRecipe in eleuther_eval.py
#
# To launch, run the following command:
#   tune run eleuther_eval --config gemma/evaluation

# Model Arguments
model:
  _component_: torchtune.models.gemma.gemma_2b

# Checkpointer
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/gemma-2b
  checkpoint_files: [
    model-00001-of-00002.safetensors,
    model-00002-of-00002.safetensors,
  ]
  output_dir: ./ # Not needed
  model_type: GEMMA

# Tokenizer
tokenizer:
  _component_: torchtune.models.gemma.gemma_tokenizer
  path: /tmp/gemma-2b/tokenizer.model

# Environment
device: cuda
dtype: bf16
seed: 1234 # It is not recommended to change this seed, b/c it matches EleutherAI's default seed

# EleutherAI specific eval args
tasks: ["truthfulqa_mc2"]
limit: null
max_seq_length: 4096
batch_size: 8
enable_kv_cache: True

# Quantization specific args
quantizer: null
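For reference, the EleutherAI-specific arguments in this new config map onto lm-eval's evaluation entry point. A rough sketch of that mapping, assuming lm-eval 0.4.x and a Hugging Face checkpoint as a stand-in (the actual `eleuther_eval` recipe wraps the torchtune model, checkpointer, and tokenizer defined above):

```python
# Rough sketch only: shows how tasks / limit / batch_size / max_seq_length map
# onto lm-eval 0.4.x. "google/gemma-2b" is an assumed stand-in checkpoint id;
# the recipe itself evaluates the torchtune model loaded from the config.
import lm_eval
from lm_eval.models.huggingface import HFLM

lm = HFLM(pretrained="google/gemma-2b", batch_size=8, max_length=4096)
results = lm_eval.simple_evaluate(
    model=lm,
    tasks=["truthfulqa_mc2"],
    limit=None,  # evaluate every example in the task
)
print(results["results"]["truthfulqa_mc2"])
```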
2 changes: 1 addition & 1 deletion recipes/configs/llama2/13B_lora.yaml
@@ -64,7 +64,7 @@ optimizer:
  weight_decay: 0.01
  lr: 2e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/13B_qlora_single_device.yaml
@@ -59,7 +59,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/70B_lora.yaml
@@ -64,7 +64,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/70B_qlora.yaml
@@ -70,7 +70,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_full_low_memory.yaml
@@ -55,7 +55,7 @@ optimizer:
  _component_: bitsandbytes.optim.PagedAdamW
  lr: 1e-5
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100
optimizer_in_bwd: True
loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_lora.yaml
@@ -61,7 +61,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_lora_dpo.yaml
@@ -58,7 +58,7 @@ optimizer:
  weight_decay: 0.05
  lr: 5e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_lora_dpo_single_device.yaml
@@ -57,7 +57,7 @@ optimizer:
  weight_decay: 0.05
  lr: 5e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_lora_single_device.yaml
@@ -59,7 +59,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_qlora.yaml
@@ -61,7 +61,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama2/7B_qlora_single_device.yaml
@@ -58,7 +58,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/70B_lora.yaml
@@ -79,7 +79,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_dora.yaml
@@ -54,7 +54,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_dora_single_device.yaml
@@ -56,7 +56,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_full_single_device.yaml
@@ -54,7 +54,7 @@ optimizer:
  _component_: bitsandbytes.optim.PagedAdamW8bit
  lr: 1e-5
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100
loss:
  _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_lora.yaml
@@ -59,7 +59,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_lora_single_device.yaml
@@ -58,7 +58,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_qdora_single_device.yaml
@@ -57,7 +57,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3/8B_qlora_single_device.yaml
@@ -57,7 +57,7 @@ optimizer:
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
2 changes: 1 addition & 1 deletion recipes/configs/llama3_1/405B_qlora.yaml
@@ -58,7 +58,7 @@ optimizer:
  lr: 3e-4
  fused: True
lr_scheduler:
-  _component_: torchtune.modules.get_cosine_schedule_with_warmup
+  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss: