Benchmark for some open source models #16

davideuler · 2024-04-06T04:47:38Z

It is amazing that the mixtral-8x7b-instruct-v0.1.Q6_K GGUF got a 25% pass rate.

carlini · 2024-04-06T06:12:12Z

Ah very nice. I should write some code that will merge together multiple independent datasets to make a larger matrix...
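A minimal sketch of what such a merge could look like, assuming each dataset is a dict mapping a model name to its per-test pass/fail results. That format, and the names `merge_results` and `pass_rate`, are hypothetical and not taken from the benchmark's code:

```python
# Hypothetical result format: {model_name: {test_name: passed}}.
Results = dict[str, dict[str, bool]]


def merge_results(datasets: list[Results]):
    """Merge independent result sets into one model-by-test matrix.

    Cells are True/False where a result exists and None where a model
    was never run on that test. Later datasets overwrite earlier ones
    if the same (model, test) pair appears twice.
    """
    merged: Results = {}
    for dataset in datasets:
        for model, outcomes in dataset.items():
            merged.setdefault(model, {}).update(outcomes)

    models = sorted(merged)
    tests = sorted({test for outcomes in merged.values() for test in outcomes})
    matrix = [[merged[model].get(test) for test in tests] for model in models]
    return models, tests, matrix


def pass_rate(row):
    """Fraction of passed tests among those actually run, or None if none were."""
    known = [cell for cell in row if cell is not None]
    return sum(known) / len(known) if known else None
```

Keeping missing cells as `None` rather than `False` means a model is not penalized for tests it was never run on.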

I guess we don't know what Mistral Medium is, but if it's some variant of Mixtral, it makes sense that they're similar-ish in score?

davideuler · 2024-04-07T13:17:44Z

I guess Mistral Medium may be the mistral 70b instruct, or some MoE like mixtral 8x7b.
If the independent datasets were merged into one large matrix, it would be perfect for checking the differences between models.
I wonder which open model is the most capable at code generation currently.

davideuler · 2024-04-10T17:46:34Z

The result for deepseek-coder-33b-instruct is a big surprise.
