1 parent 61c4162 commit 76df3fe
README.md
@@ -59,6 +59,8 @@ Each LLM has an ELO score based on its results.
| 13 | **together:meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo:vision** | 1269.84 |
| 14 | anthropic:claude-3-sonnet-20240229:text | 1029.31 |
+*Note: In our experiments, Claude 3 Sonnet got a low score because it often refused to fight and had high API latency.*
+
### Win rate matrix
