Model introduction
This is a family of four models, two of which have been trained to generate 4 tokens per forward pass rather than the single token that most current LLMs produce. Multi-token prediction shows significant gains over single-token prediction on older benchmarks, so it'd be good to see how much of a gain shows up on newer benchmarks like BigCode-Bench. These models are not particularly strong, having been trained on 1T tokens or fewer.
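For anyone unfamiliar with the idea, here is a rough toy sketch of why emitting 4 tokens per forward pass reduces the number of passes needed. It is not Meta's code: `toy_model` and `generate` are hypothetical names made up for illustration, the "model" just counts upward, and a real decoder may also need to verify the extra tokens, which this ignores.

```python
from typing import List, Tuple


def toy_model(context: List[int], tokens_per_pass: int) -> List[int]:
    """Hypothetical stand-in for one forward pass.

    Returns `tokens_per_pass` next tokens; here they simply count upward
    from the last token in the context.
    """
    last = context[-1]
    return [last + i + 1 for i in range(tokens_per_pass)]


def generate(prompt: List[int], num_new_tokens: int,
             tokens_per_pass: int) -> Tuple[List[int], int]:
    """Decode `num_new_tokens` tokens and count the forward passes used."""
    tokens = list(prompt)
    passes = 0
    while len(tokens) - len(prompt) < num_new_tokens:
        remaining = num_new_tokens - (len(tokens) - len(prompt))
        # Keep only as many tokens as are still needed.
        tokens.extend(toy_model(tokens, tokens_per_pass)[:remaining])
        passes += 1
    return tokens, passes


single_tokens, single_passes = generate([0], 10, tokens_per_pass=1)
multi_tokens, multi_passes = generate([0], 10, tokens_per_pass=4)
print(single_passes, multi_passes)
```

In this toy setup both calls produce the same sequence; only the number of passes differs.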
Model URL
https://huggingface.co/facebook/multi-token-prediction
Additional instructions (Optional)
Inference currently seems to require Meta's example code.
Author
No
Security
Integrity