
[New Model]: Codestral Mamba #6479

Closed
@K-Mistele

Description

The model to consider.

Mamba Codestral: https://huggingface.co/mistralai/mamba-codestral-7B-v0.1

Highlights:

  • SOTA 7B code model
  • theoretically unlimited context length; tested up to 256k
  • inference cost is linear in sequence length, versus quadratic for transformer self-attention (a minimal sketch of why follows this list)
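
To make the complexity claim concrete, here is a minimal sketch of a linear state-space recurrence of the kind Mamba builds on. This is a toy illustration, not Mistral's or vLLM's implementation, and every name and shape in it is made up for the example. The point is that each token updates a fixed-size state, so per-token decode cost and memory are constant, whereas transformer attention attends over all previous tokens and keeps a KV cache that grows with context.

```python
import numpy as np

def ssm_decode(x, A, B, C):
    """Toy linear state-space scan: one fixed-cost state update per token.

    x: (seq_len, d_in) inputs; A: (d_state, d_state) state transition;
    B: (d_state, d_in) input projection; C: (d_out, d_state) readout.
    """
    h = np.zeros(A.shape[0])   # fixed-size state; does not grow with seq_len
    ys = []
    for x_t in x:              # one constant-cost step per token -> O(n) total
        h = A @ h + B @ x_t    # state update replaces attention over history
        ys.append(C @ h)       # readout from the current state
    return np.stack(ys)

# The state h is the same size after 1k or 256k tokens, which is why a
# Mamba-style model can plausibly be tested at 256k context without the
# KV-cache growth a transformer would incur.
x = np.random.randn(16, 4)
out = ssm_decode(x, np.eye(8) * 0.9,
                 np.random.randn(8, 4) * 0.1,
                 np.random.randn(4, 8) * 0.1)
print(out.shape)  # (16, 4)
```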

The closest model vllm already supports.

Jamba seems to be the closest existing model, since it is partly Mamba-based (a hybrid of Mamba and attention layers): https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/jamba.py

What's your difficulty of supporting the model you want?

Mamba is a non-transformer architecture, but a partly Mamba-based model (Jamba) is already supported, so it's unclear how difficult adding this one would be; a rough sketch of what registering it might involve is below.
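
For context on what "supporting" a new architecture involves, below is a rough sketch of vLLM's model-registry pattern (vllm/model_executor/models/__init__.py), mirroring how Jamba is wired in. The `Mamba2ForCausalLM` architecture string and `mamba2` module name are assumptions for illustration, not actual entries.

```python
# Hypothetical sketch of registering a new model in
# vllm/model_executor/models/__init__.py, mirroring the Jamba entry.
# The Mamba2 names below are assumptions, not real registry entries.
_MODELS = {
    # Existing: HF config architecture name -> (module, class) under
    # vllm/model_executor/models/
    "JambaForCausalLM": ("jamba", "JambaForCausalLM"),
    # Hypothetical entry for Codestral Mamba; the implementation module
    # would need Mamba-style recurrent-state management rather than the
    # transformer KV cache that vLLM's scheduling assumes.
    "Mamba2ForCausalLM": ("mamba2", "Mamba2ForCausalLM"),
}
```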

Metadata

Labels

new-model (Requests to new models), unstale (Received activity after being labelled stale)
