Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codestral Mamba | Mistral AI | Frontier AI in your hands #852

Open
1 task
ShellLM opened this issue Aug 1, 2024 · 1 comment
Open
1 task

Codestral Mamba | Mistral AI | Frontier AI in your hands #852

ShellLM opened this issue Aug 1, 2024 · 1 comment
Labels
code-generation code generation models and tools like copilot and aider llm Large Language Models MachineLearning ML Models, Training and Inference New-Label Choose this option if the existing labels are insufficient to describe the content accurately openai OpenAI APIs, LLMs, Recipes and Evals

Comments

@ShellLM
Copy link
Collaborator

ShellLM commented Aug 1, 2024

Codestral Mamba | Mistral AI | Frontier AI in your hands

Snippet

"Codestral Mamba
As a tribute to Cleopatra, whose glorious destiny ended in tragic snake circumstances, we are proud to release Codestral Mamba, a Mamba2 language model specialised in code generation, available under an Apache 2.0 license.
July 16, 2024 Mistral AI team

Following the publishing of the Mixtral family, Codestral Mamba is another step in our effort to study and provide new architectures. It is available for free use, modification, and distribution, and we hope it will open new perspectives in architecture research. Codestral Mamba was designed with help from Albert Gu and Tri Dao.

Unlike Transformer models, Mamba models offer the advantage of linear time inference and the theoretical ability to model sequences of infinite length. It allows users to engage with the model extensively with quick responses, irrespective of the input length. This efficiency is especially relevant for code productivity use cases—this is why we trained this model with advanced code and reasoning capabilities, enabling it to perform on par with SOTA transformer-based models.

We have tested Codestral Mamba on in-context retrieval capabilities up to 256k tokens. We expect it to be a great local code assistant!"

You can deploy Codestral Mamba using the mistral-inference SDK, which relies on the reference implementations from Mamba's GitHub repository. The model can also be deployed through TensorRT-LLM. For local inference, keep an eye out for support in llama.cpp. You may download the raw weights from HuggingFace. This is an instructed model, with 7,285,403,648 parameters.

For easy testing, we made Codestral Mamba available on la Plateforme (codestral-mamba-2407), alongside its big sister, Codestral 22B. While Codestral Mamba is available under the Apache 2.0 license, Codestral 22B is available under a commercial license for self-deployment or a community license for testing purposes.

Suggested labels

{'label-name': 'language-model', 'label-description': 'Models specialized in generating text, often with advanced code and reasoning capabilities.', 'confidence': 68.57}

@ShellLM ShellLM added code-generation code generation models and tools like copilot and aider llm Large Language Models New-Label Choose this option if the existing labels are insufficient to describe the content accurately openai OpenAI APIs, LLMs, Recipes and Evals labels Aug 1, 2024
@ShellLM
Copy link
Collaborator Author

ShellLM commented Aug 1, 2024

Related content

#851 similarity score: 0.89
#460 similarity score: 0.89
#311 similarity score: 0.89
#499 similarity score: 0.88
#431 similarity score: 0.87
#389 similarity score: 0.87

@irthomasthomas irthomasthomas added the MachineLearning ML Models, Training and Inference label Aug 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code-generation code generation models and tools like copilot and aider llm Large Language Models MachineLearning ML Models, Training and Inference New-Label Choose this option if the existing labels are insufficient to describe the content accurately openai OpenAI APIs, LLMs, Recipes and Evals
Projects
None yet
Development

No branches or pull requests

2 participants