[New Model]: Cohere2 (Command R7B)

### The model to consider.

https://huggingface.co/CohereForAI/c4ai-command-r7b-12-2024

### The closest model vllm already supports.

Likely either the original Cohere (for obvious reasons) or Gemma2 (as it also has a funky SWA architecture)

### What's your difficulty of supporting the model you want?

It uses sliding window attention (SWA), but this can likely be ditched to get MVP inference working, à la how Gemma 2 was done.
For some reason every 4th layer uses global attention _without_ positional embeddings? Not sure how or why that one works, tbh.
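
To make the layer layout described above concrete, here is a rough sketch in plain Python. It is not vLLM or Transformers code: `LayerPlan`, `plan_layers`, `may_attend`, and the `pattern` parameter are hypothetical names for illustration, and the assumption that the global layer is the last one in each group of four is my reading of "every 4th layer", not something checked against the checkpoint.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LayerPlan:
    index: int                # 0-based layer index
    sliding_window: bool      # True: local (SWA) attention, False: global attention
    rotary_embeddings: bool   # True: positional embeddings are applied in this layer


def plan_layers(num_layers: int, pattern: int = 4) -> List[LayerPlan]:
    """Assumed schedule: every `pattern`-th layer is global and skips positional embeddings."""
    plans = []
    for i in range(num_layers):
        is_global = (i + 1) % pattern == 0
        plans.append(LayerPlan(index=i,
                               sliding_window=not is_global,
                               rotary_embeddings=not is_global))
    return plans


def may_attend(query_pos: int, key_pos: int, window: Optional[int]) -> bool:
    """Causal mask; if `window` is set, also restrict to the most recent `window` tokens."""
    if key_pos > query_pos:
        return False          # never attend to future tokens
    if window is None:
        return True           # global layer, or SWA dropped entirely
    return query_pos - key_pos < window


if __name__ == "__main__":
    for plan in plan_layers(8):
        kind = "sliding" if plan.sliding_window else "global"
        rope = "with" if plan.rotary_embeddings else "without"
        print(f"layer {plan.index}: {kind} attention, {rope} positional embeddings")
```

Passing `window=None` for every layer corresponds to the "ditch SWA for an MVP" idea: the mask then reduces to plain causal attention, which is identical to the windowed mask as long as the sequence is no longer than the window.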

### Before submitting a new issue...

- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

[New Model]: Cohere2 (Command R7B) #11181
