Commit

Remove codeT results for code-davinci-002 as not comparable to other HumanEval results, due to additional explicit testing of outputs
LudwigStumpp committed May 16, 2023
1 parent b7e4ee9 commit 72edf21
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion README.md
@@ -18,7 +18,6 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
| [chatglm-6b](https://chatglm.cn/blog) | ChatGLM | yes | [985](https://lmsys.org/blog/2023-05-03-arena/) | | | | | | | | | | | | | |
| [chinchilla-70b](https://arxiv.org/abs/2203.15556v1) | DeepMind | no | | | [0.808](https://arxiv.org/abs/2203.15556v1) | | | [0.774](https://arxiv.org/abs/2203.15556v1) | | | [0.675](https://arxiv.org/abs/2203.15556v1) | | | [0.749](https://arxiv.org/abs/2203.15556v1) | | |
| [codex-12b / code-cushman-001](https://arxiv.org/abs/2107.03374) | OpenAI | no | | | | | [0.317](https://crfm.stanford.edu/helm/latest/?group=targeted_evaluations) | | | | | | | | | |
-| [code-davinci-002](https://arxiv.org/abs/2207.10397v2) | OpenAI | no | | | | | [0.658](https://arxiv.org/abs/2207.10397v2) | | | | | | | | | |
| [codegen-16B-mono](https://huggingface.co/Salesforce/codegen-16B-mono) | Salesforce | yes | | | | | [0.293](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
| [codegen-16B-multi](https://huggingface.co/Salesforce/codegen-16B-multi) | Salesforce | yes | | | | | [0.183](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
| [codegx-13b](http://keg.cs.tsinghua.edu.cn/codegeex/) | Tsinghua University | no | | | | | [0.229](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |