You will need to install [torchtune](https://github.com/pytorch/torchtune) following [its installation instructions](https://github.com/pytorch/torchtune?tab=readme-ov-file#installation).
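At the time of writing, a released version can typically be installed with pip, though the linked instructions are the source of truth:

```console
pip install torchtune
```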
You might run into an issue with the `triton` package when installing `torchtune`. You can build `triton` locally following the [instructions in their repo](https://github.com/triton-lang/triton?tab=readme-ov-file#install-from-source).
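As a rough sketch (the exact steps may change, so follow the triton README), a source build looks like:

```console
git clone https://github.com/triton-lang/triton.git
cd triton
pip install -e python
```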
## Config Files
The directory structure of the `llm_pte_finetuning` example is:
```console
examples/llm_pte_finetuning
├── README.md
├── TARGETS
├── __init__.py
├── __pycache__
│   ├── model_loading_lib.cpython-312.pyc
│   └── training_lib.cpython-312.pyc
├── model_exporter.py
├── model_loading_lib.py
├── phi3_alpaca_code_config.yaml
├── phi3_config.yaml
├── qwen_05b_config.yaml
├── runner.py
└── training_lib.py
```
We already provide configs out of the box. The following sections explain how you can set up the config for your own model or dataset.
As mentioned in the previous section, we internally use `torchtune` APIs, so our config files follow `torchtune`'s structure. In the following sections, we go through a working example based on the `phi3_config.yaml` config file.
### Tokenizer
We first need to define the tokenizer. Let's suppose we would like to use the [Phi-3 Mini Instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) model from Microsoft. We define the tokenizer component as follows:
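A minimal sketch of that component, assuming `torchtune`'s Phi-3 tokenizer builder, with a placeholder download path and sequence length:

```yaml
tokenizer:
  _component_: torchtune.models.phi3.phi3_mini_tokenizer
  path: /tmp/Phi-3-mini-4k-instruct/tokenizer.model
  max_seq_len: 1024
```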
Torchtune supports datasets using Hugging Face dataloaders, so custom datasets could also be used.
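For illustration, a dataset entry could point at one of `torchtune`'s built-in datasets; the Alpaca dataset and the surrounding dataloader settings below are placeholder choices:

```yaml
dataset:
  _component_: torchtune.datasets.alpaca_dataset
seed: null
shuffle: True
batch_size: 1
```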
For the loss function, we use PyTorch losses. In this example we use the `CrossEntropyLoss`:
```yaml
loss:
  _component_: torch.nn.CrossEntropyLoss
```
Model parameters can also be set. In this example, we replicate the configuration used for the Phi-3 Mini Instruct benchmarks:
```yaml
model:
  _component_: torchtune.models.phi3.lora_phi3_mini
  lora_attn_modules: ['q_proj', 'v_proj']
```
Depending on how your model is defined, you will need to instantiate different components. In these examples we use checkpoints in the Hugging Face format, so we need to instantiate a `FullModelHFCheckpointer` object. We pass it the checkpoint directory, the files with the tensors, the output directory for training, and the model type:
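A sketch of such a checkpointer entry, assuming a Phi-3 Mini Instruct checkpoint downloaded to a placeholder directory (the safetensors filenames are illustrative, and in older `torchtune` releases the component lived under `torchtune.utils` rather than `torchtune.training`):

```yaml
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/Phi-3-mini-4k-instruct
  checkpoint_files: [
    model-00001-of-00002.safetensors,
    model-00002-of-00002.safetensors
  ]
  output_dir: /tmp/Phi-3-mini-4k-instruct
  model_type: PHI3_MINI
```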
0 commit comments