Commit 8d62dd3

Update README.md
1 parent c9e05d9 commit 8d62dd3

1 file changed: +1 -1 lines changed

README.md

@@ -42,7 +42,7 @@ The central configuration that is used across data preprocessing, training, and
 
 ## Data Preprocessing
 
-We provide a script for data preprocessing [`preprocess_data.py`](preprocess_data.py). This converts a text dataset into the same format used for model pretraining. We refrain from providing a script that prepares an instruction finetuning dataset due to different models requiring unique formatting. We also provide options for packing datasets. For more information, please consult the config documentation under [`docs/config.md`](docs/config.md).
+We provide a script for data preprocessing [`preprocess_data.py`](preprocess_data.py). This converts a text dataset into the same format used for model pretraining (causal language modeling). We refrain from providing a script that prepares an instruction finetuning dataset due to different models requiring unique formatting. We also provide options for packing datasets. For more information, please consult the config documentation under [`docs/config.md`](docs/config.md).
 
 ## Training
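
The changed line mentions packing datasets for causal language modeling. As a rough illustration of what packing usually means, here is a minimal sketch. It is not taken from `preprocess_data.py`: the function name `pack_sequences`, the `EOS_ID` value, and the choice to drop the trailing partial block are all assumptions made for this example.

```python
from typing import Iterable, Iterator, List

EOS_ID = 0  # hypothetical end-of-document token id, chosen for this sketch


def pack_sequences(
    docs: Iterable[List[int]], block_size: int, eos_id: int = EOS_ID
) -> Iterator[List[int]]:
    """Concatenate tokenized documents, separated by eos_id, into fixed-size blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    buffer: List[int] = []
    for doc in docs:
        # Append the document followed by a separator token.
        buffer.extend(doc)
        buffer.append(eos_id)
        # Emit as many full blocks as the buffer currently holds.
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]
    # Assumption: leftover tokens that do not fill a block are dropped.


if __name__ == "__main__":
    toy_docs = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]
    for block in pack_sequences(toy_docs, block_size=4):
        print(block)
```

Packing in this style avoids padding, since every block is exactly `block_size` tokens long, at the cost of letting one block span more than one document. Whether the repository's packing options behave this way should be checked against `docs/config.md`.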
