Commit

IDC-3: Support HumanEvalPlus-Mini-v0.1.10, Support the DeepSeek Coder Model Family in examples/run_identity_chain_huggingface.py (#4)

* update README

* support deepseekcoder

* update prompt for fim

* update run_identity_chain.sh

* update HumanEvalPlus-Mini to v0.1.10

* Bump version: 0.0.1 → 0.1.0
marcusm117 authored May 4, 2024
1 parent 867b706 commit 56e1ec4
Showing 12 changed files with 553 additions and 206 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 0.0.1
+current_version = 0.1.0
commit = True
tag = False

2 changes: 2 additions & 0 deletions .gitignore
@@ -7,7 +7,9 @@ temp_files/

# unzipped data
data/EvalPlus-Mini-v0.1.6_reformatted.jsonl
+data/EvalPlus-Mini-v0.1.9_reformatted.jsonl
data/MBPP-S_test_reformatted.jsonl
+data/MbppPlus-v0.1.0_reformatted.jsonl

# Coverage Report
python_junit.xml
19 changes: 12 additions & 7 deletions README.md
@@ -41,8 +41,9 @@ Before the self-consistency evaluation, you need to make sure that one of the fo

To evaluate your model using IdentityChain, you need to prepare the followings:

-1. An evaluation dataset in the format of one of the followings (you can also use these two directly):
+1. An evaluation dataset from one of the followings (or one of your own in the same format):
- [EvalPlus-Mini-v0.1.6_reformatted.jsonl](./data/EvalPlus-Mini-v0.1.6_reformatted.jsonl.gz)
+- [EvalPlus-Mini-v0.1.10_reformatted.jsonl](./data/EvalPlus-Mini-v0.1.10_reformatted.jsonl.gz)
- [MBPP-S_test_reformatted.jsonl](./data/MBPP-S_test_reformatted.jsonl.gz)
2. An NL-to-PL prompt for your model
3. A PL-to-NL prompt for your model
@@ -51,14 +52,18 @@ To evaluate your model using IdentityChain, you need to prepare the followings:

See [run_identity_chain_openai.py](./examples/run_identity_chain_openai.py) for an example of how to use IdentityChain to evaluate OpenAI models.

+See [run_identity_chain_google.py](./examples/run_identity_chain_google.py) for an example of how to use IdentityChain to evaluate Google models.

See [run_identity_chain_huggingface.py](./examples/run_identity_chain_huggingface.py) for an example of how to use IdentityChain to evaluate HuggingFace open-source models. This example script already includes the following models:

-1. CodeLlama-Instruct-hf (7B, 13B, 34B)
-2. CodeLlama-hf (7B, 13B, 34B)
-3. starchat-beta
-4. starcoder
-5. starcoderplus
-6. starcoderbase (1B, 3B, 7B, 15B)
+1. CodeLlama-Instruct-hf (7B, 13B, 34B, 70B)
+2. CodeLlama-hf (7B, 13B, 34B, 70B)
+3. StarChat-Beta
+4. StarCoder
+5. StarCoderPlus
+6. StarCoderBase (1B, 3B, 7B, 15B)
+7. DeepSeekCoder-Instruct (1.3B, 6.7B, 33B, 7B-v1.5)
+8. DeepSeekCoder (1.3B, 6.7B, 33B, 7B-v1.5)

## Example

164 changes: 164 additions & 0 deletions data/EvalPlus-Mini-v0.1.10_reformatted.jsonl

Large diffs are not rendered by default.

Binary file added data/EvalPlus-Mini-v0.1.10_reformatted.jsonl.gz
Binary file not shown.
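
The dataset is committed both as a plain `.jsonl` file and as a `.jsonl.gz` archive, and the README links to the archive. As a minimal sketch of reading the archive without unzipping it first, the helper below streams a gzip-compressed JSON Lines file into a list of dictionaries. The function name `load_jsonl_gz` is hypothetical and not part of IdentityChain, and no assumption is made about which fields each record contains.

```python
import gzip
import json
from pathlib import Path


def load_jsonl_gz(path):
    """Read a gzip-compressed JSON Lines file into a list of dicts.

    Hypothetical helper for illustration; not part of the IdentityChain API.
    """
    records = []
    # "rt" opens the gzip stream in text mode so each line decodes as UTF-8.
    with gzip.open(Path(path), mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records
```

Assuming the repository layout shown in this commit, a call such as `load_jsonl_gz("data/EvalPlus-Mini-v0.1.10_reformatted.jsonl.gz")` would return one dictionary per task.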
2 changes: 1 addition & 1 deletion examples/run_identity_chain.sh
@@ -14,7 +14,7 @@ export IDENTITY_CHAIN_HOME=YOUR_OWN_PATH/IdentityChain # no / at the end

# for open-source models from HuggingFace, when using greedy, add the flag --greedy_early_stop to accelerate
# for OpenAI models, don't use --greedy_early_stop!!! temperature = 0 is NOT greedy!!!
-# for EvalPlus-Mini-v0.1.6_reformatted.jsonl, use --resume_task_bs 1, since HumanEval/0 is used for prompt
+# for EvalPlus-Mini-v0.1.6_reformatted.jsonl (or other versions), use --resume_task_bs 1, since HumanEval/0 is used for prompt
# for MBPP-S_test_reformatted.jsonl, use --resume_task_bs 0, since there's a separate prompt split

for MODEL in "bigcode/starcoderbase-1b" # feel free to add other supported models
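
The comments in this script encode two rules: which `--resume_task_bs` value goes with which dataset, and when `--greedy_early_stop` is safe to add. A rough sketch of those rules as a lookup is shown below. Only the two flag names and the dataset file names come from the script; the function name `suggested_flags` and the provider strings `"huggingface"` and `"openai"` are assumptions made for illustration.

```python
def suggested_flags(dataset_file, provider, greedy):
    """Return the CLI flags implied by the comments in run_identity_chain.sh.

    Hypothetical helper; `provider` values are illustrative labels only.
    """
    flags = []
    if "EvalPlus-Mini" in dataset_file:
        # HumanEval/0 is used for the prompt, so start from task index 1.
        flags += ["--resume_task_bs", "1"]
    elif "MBPP-S" in dataset_file:
        # MBPP-S has a separate prompt split, so start from task index 0.
        flags += ["--resume_task_bs", "0"]
    # Early stopping is only suggested for greedy decoding with HuggingFace
    # models; per the script, temperature = 0 on OpenAI models is not greedy.
    if provider == "huggingface" and greedy:
        flags.append("--greedy_early_stop")
    return flags
```

For example, `suggested_flags("EvalPlus-Mini-v0.1.10_reformatted.jsonl", "huggingface", True)` yields the `--resume_task_bs 1` pair followed by `--greedy_early_stop`, while the same call with `"openai"` omits the early-stop flag.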