Commit 9ad6e14

update readmes + links (#103)
1 parent 8e91b83 commit 9ad6e14

3 files changed: +23 -28 lines changed

program-data-separation/README.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 
 This directory provides an example of the Program Data Separation APIs in ExecuTorch.
 1. Program data separation examples using a linear model with the portable operators and XNNPACK.
-2. LoRA inference example with a LoRA and non-LoRA model sharing foundation weights.
+2. LoRA inference example with multiple LoRA models sharing a single foundation weight file.
 
 ## Program Data Separation
 
@@ -16,7 +16,7 @@ PTD files are used to store data outside of the PTE file. Some use-cases:
 For more information on the PTD data format, please see the [flat_tensor](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/README.md) directory.
 
 ## Linear example
-For a demo of the program-data separation APIs using a linear model, please see [program-data-separation/cpp/linear_example](linear_example/). This example generates and runs a program-data separated linear model, with weights and bias in a separate .ptd file.
+For a demo of the program-data separation APIs using a linear model, please see [program-data-separation/cpp/linear_example](cpp/linear_example/README.md). This example generates and runs a program-data separated linear model, with the program in a .pte file and the weights and bias in a separate .ptd file.
 
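As a rough orientation, exporting with separated weights amounts to lowering the model as usual and asking ExecuTorch to emit its constants into a `.ptd` file instead of embedding them in the `.pte`. A minimal sketch for a toy linear module is below; the `external_constants` flag and `write_tensor_data_to_file` helper are assumptions about the current export API, so treat the linked example's export script as the source of truth.

```python
# Minimal sketch of program-data separation for a toy linear model.
# Assumed API surface (verify against your ExecuTorch version):
#   - ExecutorchBackendConfig(external_constants=True) tags constants for the .ptd
#   - ExecutorchProgramManager.write_tensor_data_to_file(dir) writes the .ptd
import torch
from torch.export import export
from executorch.exir import ExecutorchBackendConfig, to_edge


class Linear(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 2)

    def forward(self, x):
        return self.linear(x)


model = Linear().eval()
example_inputs = (torch.randn(1, 8),)

edge = to_edge(export(model, example_inputs))
et_program = edge.to_executorch(
    ExecutorchBackendConfig(external_constants=True)  # assumed flag name
)

with open("linear.pte", "wb") as f:
    f.write(et_program.buffer)  # program only; weights are not embedded here
et_program.write_tensor_data_to_file(".")  # assumed helper; emits the .ptd file
```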
 ## LoRA example
 A major use-case that program-data separation enables is inference with multiple LoRA adapters. LoRA is a fine-tuning technique introduced in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA fine-tuning produces lightweight 'adapter' weights that can be applied to an existing model to adapt it to a new task. LoRA adapters are typically small in comparison to LLM foundation weights, on the order of KB-MB depending on the finetuning setup and model size.
@@ -27,4 +27,4 @@ To enable LoRA, we generate:
 
 Multiple LoRA-adapted PTE files can share the same foundation weights and adding a model adapted to a new task incurs minimal binary size and runtime memory overhead.
 
-Please take a look at [program-data-separation/cpp/lora_example](lora_example/) for a demo of the program-data separation APIs with LoRA. This example generates and runs a LoRA and a non-LoRA model that share foundation weights. At runtime, we see that memory usage does not double.
+Please take a look at [program-data-separation/cpp/lora_example](cpp/lora_example/README.md) for a demo of the program-data separation APIs with LoRA. This example shows how to generate and run multiple LoRA adapter PTEs with a shared foundation weight file.

program-data-separation/cpp/linear_example/README.md

Lines changed: 6 additions & 7 deletions
@@ -1,7 +1,6 @@
 # ExecuTorch Program Data Separation Demo C++.
 
-This directory contains the C++ code to run the examples generated in [program-data-separation](../program-data-separation/README.md).
-
+This directory contains the C++ code to demo program-data separation on a linear model.
 
 ## Virtual environment setup.
 Create and activate a Python virtual environment:
@@ -10,12 +9,12 @@ python3 -m venv .venv && source .venv/bin/activate && pip install --upgrade pip
 ```
 Or alternatively, [install conda on your machine](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
 ```bash
-conda create -yn executorch-ptd python=3.10.0 && conda activate executorch-ptd
+conda create -yn executorch python=3.10.0 && conda activate executorch
 ```
 
 Install dependencies:
 ```bash
-pip install executorch==0.7.0
+pip install executorch==1.0.0
 ```
 
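A quick sanity check that the install resolved to the pinned release (a minimal sketch using only the standard library):

```python
# Confirm which executorch wheel pip actually installed.
from importlib.metadata import version

print(version("executorch"))  # expect 1.0.0, matching the pin above
```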
 ## Export the model/s.
@@ -37,7 +36,7 @@ Note:
 - PTE: contains the program execution logic.
 - PTD: contains the constant tensors used by the PTE.
 
-See [program-data-separation](../../program-data-separation/README.md) for instructions.
+See [program-data-separation](../../README.md) for instructions.
 
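Once exported, the split is easy to verify: the `.pte` should stay small (program only) while the `.ptd` carries the weights and bias. A small sketch, assuming the default output paths used by the run command at the end of this README:

```python
# Compare artifact sizes to confirm the weights landed in the .ptd file.
from pathlib import Path

for artifact in ("../../models/linear.pte", "../../models/linear.ptd"):
    size = Path(artifact).stat().st_size
    print(f"{artifact}: {size} bytes")
```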
 ## Install runtime dependencies.
 The ExecuTorch repository is configured as a git submodule at `~/executorch-examples/program-data-separation/cpp/executorch`. To initialize it:
@@ -53,15 +52,15 @@ cd ~/executorch-examples/program-data-separation/cpp/executorch
 pip install -r requirements-dev.txt
 ```
 
-## Build the runtime.
+## Build and run
 Build the executable:
 ```bash
 cd ~/executorch-examples/program-data-separation/cpp/linear_example
 chmod +x build_example.sh
 ./build_example.sh
 ```
 
-## Run the executable.
+Run the executable.
 ```
 ./build/bin/executorch_program_data_separation --model-path ../../models/linear.pte --data-path ../../models/linear.ptd
 
program-data-separation/cpp/lora_example/README.md

Lines changed: 14 additions & 18 deletions
@@ -12,11 +12,11 @@ Note:
 - There are many ways to fine-tune LoRA adapters. We will go through a few examples to create a demo.
 
 ## Table of Contents
-- [Size Savings](#size-savings)
-- [Fine-tuning](#finetune-from-scratch-with-unsloth-and-llama)
-- [Installation](#install-executorch)
-- [Export models](#export-models)
-- [Run models](#install-runtime-dependencies)
+- [Size savings](#size-savings)
+- [Finetune lora adapters from scratch with unsloth and Llama](#finetune-from-scratch-with-unsloth-and-llama)
+- [Install executorch](#install-executorch)
+- [Export lora models](#export-models)
+- [Run lora models](#install-runtime-dependencies)
 - [Demo video](#demo-video)
 
 ## Size savings
@@ -118,14 +118,10 @@ You can also run `~/executorch-examples/program-data-separation/export_lora.sh`.
 
 Example files, trained on executorch/docs/source/ and recent Nobel prize winners.
 ```bash
-# executorch docs trained adapter model.
--rw-r--r-- 1 lfq users 45555712 Oct 17 18:05 et.pte
-# foundation weight file
--rw-r--r-- 1 lfq users 5994013600 Oct 17 18:05 foundation.ptd
-# dummy lora model.
--rw-r--r-- 1 lfq users 27628928 Oct 17 14:31 llama_3_2_1B_lora.pte
-# Nobel prize winners trained adapter model.
--rw-r--r-- 1 lfq users 45555712 Oct 17 18:00 nobel.pte
+-rw-r--r-- 1 lfq users 45555712 Oct 17 18:05 executorch_lora.pte # executorch docs lora model.
+-rw-r--r-- 1 lfq users 5994013600 Oct 17 18:05 foundation.ptd # foundation weight file
+-rw-r--r-- 1 lfq users 27628928 Oct 17 14:31 llama_3_2_1B_lora.pte # dummy lora model.
+-rw-r--r-- 1 lfq users 45555712 Oct 17 18:00 nobel_lora.pte # Nobel prize winners lora model.
 ```
 
 Notice the adapter PTE files are about the same size as the `adapter_model.safetensors`/`adapter_model.pt` files generated during training. The PTE contains the adapter weights (which are not shared) and the program.
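To make the savings concrete, here is a back-of-the-envelope comparison using the byte counts listed above: with a shared foundation `.ptd`, each additional task costs roughly one adapter PTE (~45 MB), whereas bundling the foundation weights into every PTE would cost ~6 GB per task.

```python
# Rough disk-size arithmetic based on the listing above.
ADAPTER_PTE_BYTES = 45_555_712        # executorch_lora.pte / nobel_lora.pte
FOUNDATION_PTD_BYTES = 5_994_013_600  # foundation.ptd


def disk_bytes(num_adapters: int, shared_foundation: bool) -> int:
    """Total on-disk footprint for num_adapters task-specific models."""
    if shared_foundation:
        return FOUNDATION_PTD_BYTES + num_adapters * ADAPTER_PTE_BYTES
    # Without program-data separation, each model bundles its own foundation weights.
    return num_adapters * (FOUNDATION_PTD_BYTES + ADAPTER_PTE_BYTES)


for n in (1, 2, 4):
    shared_gb = disk_bytes(n, True) / 1e9
    bundled_gb = disk_bytes(n, False) / 1e9
    print(f"{n} adapter(s): {shared_gb:.1f} GB shared vs {bundled_gb:.1f} GB bundled")
```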
@@ -167,15 +163,15 @@ cd ~/executorch-examples/program-data-separation/cpp/lora_example
 DOWNLOADED_PATH=~/path/to/Llama-3.2-1B-Instruct/
 ./build/bin/executorch_program_data_separation \
 --tokenizer_path="${DOWNLOADED_PATH}" \
---model1="et.pte" \
---model2="nobel.pte" \
+--model1="executorch_lora.pte" \
+--model2="nobel_lora.pte" \
 --weights="foundation.ptd" \
 --prompt="Who were the winners of the Nobel Prize in Physics in 2025?" \
 --apply_chat_template
 ```
 Passing in the `DOWNLOADED_PATH` as the tokenizer directory will invoke the HFTokenizer, and parse additional tokenizer files: `tokenizer_config.json` and `special_tokens_map.json`. `special_tokens_map.json` tells us which bos/eos token to use, especially if there are multiple.
 
-`apply_chat_template` formats the prompt according to the LLAMA chat template, which is what the adapter was trained on.
+`apply_chat_template` formats the prompt according to the LLAMA chat template.
 
 Sample output:
 ```
@@ -202,8 +198,8 @@ cd ~/executorch-examples/program-data-separation/cpp/lora_example
 DOWNLOADED_PATH=~/path/to/Llama-3.2-1B-Instruct/
 ./build/bin/executorch_program_data_separation \
 --tokenizer_path="${DOWNLOADED_PATH}" \
---model1="et.pte" \
---model2="nobel.pte" \
+--model1="executorch_lora.pte" \
+--model2="nobel_lora.pte" \
 --weights="foundation.ptd" \
 --prompt="Help me get started with ExecuTorch in 3 steps" \
 --apply_chat_template
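The run commands above expect a local copy of Llama-3.2-1B-Instruct for `DOWNLOADED_PATH`. One way to fetch it is with `huggingface_hub` (this assumes `pip install huggingface_hub`, a Hugging Face login, and accepted access terms for the gated meta-llama repo):

```python
# Download the model/tokenizer files referenced by DOWNLOADED_PATH.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("meta-llama/Llama-3.2-1B-Instruct")
print(local_dir)  # pass this directory as --tokenizer_path
```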
