Update header photo, add eval.py to quickstart #48

Merged · 14 commits · May 5, 2023
README.md: 29 changes (18 additions, 11 deletions)
@@ -1,7 +1,6 @@
 <p align="center">
   <picture>
-    <source media="(prefers-color-scheme: dark)" srcset="./assets/loss-curve-dark.png">
-    <img alt="Compute-optimal training curves for LLMs of various sizes (125M -> 3B)." src="./assets/loss-curve-light.png" width="75%">
+    <img alt="LLM Foundry" src="./assets/llm-foundry.png" width="95%">
   </picture>
 </p>

@@ -54,14 +53,16 @@ pip install -e ".[gpu]"  # or pip install -e . if no NVIDIA GPU

# Quickstart

-Here is a simple end-to-end workflow for preparing a subset of the C4 dataset, training an MPT-1B model for 10 batches, converting the model to HuggingFace format, and generating responses to prompts.
+Here is an end-to-end workflow for preparing a subset of the C4 dataset, training an MPT-125M model for 10 batches, converting the model to HuggingFace format, evaluating the model on the Winograd challenge, and generating responses to prompts.

-If you have a write-enabled [HuggingFace auth token](https://huggingface.co/docs/hub/security-tokens), you can also upload your model to the Hub! Just export your token like this:
+If you have a write-enabled [HuggingFace auth token](https://huggingface.co/docs/hub/security-tokens), you can optionally upload your model to the Hub! Just export your token like this:
```bash
export HUGGING_FACE_HUB_TOKEN=your-auth-token
```
and uncomment the line containing `--hf_repo_for_upload ...`.
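For reference, with the token exported, the conversion step in the workflow below would look something like this (a sketch; `user-org/repo-name` is a placeholder for your own Hub repo):
```bash
# The conversion command from the quickstart with the upload flag enabled.
# user-org/repo-name is a placeholder; substitute your own Hub repo.
python inference/convert_composer_to_hf.py \
  --composer_path mpt-125m/ep0-ba10-rank0.pt \
  --hf_output_path mpt-125m-hf \
  --output_precision bf16 \
  --hf_repo_for_upload user-org/repo-name
```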

-(Remember this is just a quickstart to demonstrate the tools -- To get good responses, the model must be trained for longer than 10 batches :)
+**(Remember this is a quickstart just to demonstrate the tools -- to get good quality, the LLM must be trained for longer than 10 batches 😄)**

<!--pytest.mark.skip-->
```bash
@@ -75,24 +76,30 @@ python data_prep/convert_dataset.py \

-# Train an MPT-1B model for 10 batches
+# Train an MPT-125M model for 10 batches
 composer train/train.py \
-  train/yamls/mpt/1b.yaml \
+  train/yamls/mpt/125m.yaml \
   data_local=my-copy-c4 \
   train_loader.dataset.split=train_small \
   eval_loader.dataset.split=val_small \
   max_duration=10ba \
   eval_subset_num_batches=1 \
-  save_folder=mpt-1b
+  eval_interval=0 \
+  save_folder=mpt-125m

 # Convert the model to HuggingFace format
 python inference/convert_composer_to_hf.py \
-  --composer_path mpt-1b/ep0-ba10-rank0.pt \
-  --hf_output_path mpt-1b-hf \
+  --composer_path mpt-125m/ep0-ba10-rank0.pt \
+  --hf_output_path mpt-125m-hf \
   --output_precision bf16 \
   # --hf_repo_for_upload user-org/repo-name

+# Evaluate the model on Winograd
+python eval/eval.py \
+  eval/yamls/hf_eval.yaml \
+  icl_tasks=eval/yamls/winograd.yaml \
+  model_name_or_path=mpt-125m-hf
+
 # Generate responses to prompts
 python inference/hf_generate.py \
-  --name_or_path mpt-1b-hf \
+  --name_or_path mpt-125m-hf \
   --max_new_tokens 256 \
   --prompts \
     "The answer to life, the universe, and happiness is"
```
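The converted folder is a standard HuggingFace checkpoint, so it can also be loaded directly with `transformers`. A minimal sketch, assuming the conversion step above wrote its output to `mpt-125m-hf/` (at the time of this PR, MPT ships custom modeling code, hence `trust_remote_code=True`):
```python
# Minimal sketch: load the converted checkpoint with transformers.
# Assumes the conversion step above wrote an HF-format folder to ./mpt-125m-hf.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mpt-125m-hf")
# trust_remote_code=True allows loading MPT's custom modeling code
model = AutoModelForCausalLM.from_pretrained("mpt-125m-hf", trust_remote_code=True)

inputs = tokenizer("The answer to life, the universe, and happiness is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```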
Binary file added assets/llm-foundry.png
Binary file removed assets/loss-curve-dark.png
Binary file removed assets/loss-curve-light.png
mcli/mcli-hf-eval.yaml: 2 changes (2 additions, 0 deletions)
@@ -23,6 +23,8 @@ cluster:  # replace with your cluster here!
 parameters:
   seed: 1
   max_seq_len: 1024
+  device_eval_batch_size: 8

   model_name_or_path: huggyllama/llama-7b
   # model_name_or_path: EleutherAI/gpt-j-6b
   # model_name_or_path: EleutherAI/pythia-6.9b
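With `device_eval_batch_size` set, the config is complete enough to launch. Assuming the MosaicML CLI is installed and the `cluster:` field above is filled in, the run would be kicked off with something like:
```bash
# Sketch: submit the eval config to the MosaicML platform.
# Assumes mcli is installed and configured with a valid cluster.
mcli run -f mcli/mcli-hf-eval.yaml
```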
scripts/eval/yamls/winograd.yaml: 6 changes (6 additions, 0 deletions)
@@ -0,0 +1,6 @@
+icl_tasks:
+-
+  label: winograd
+  dataset_uri: eval/local_data/winograd_wsc.jsonl # ADD YOUR OWN DATASET URI
+  num_fewshot: [0]
+  icl_task_type: schema
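If you point `dataset_uri` at your own data, each JSONL line must match the `schema` task format. The sketch below shows the general shape for a Winograd-style example; the exact field names (`context_options`, `continuation`, `gold`) are an assumption based on common ICL schema-task datasets, so verify them against the eval documentation:
```json
{"context_options": ["The city councilmen refused the demonstrators a permit because the city councilmen", "The city councilmen refused the demonstrators a permit because the demonstrators"], "continuation": " feared violence", "gold": 0}
```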