Description
With https://github.com/huggingface/text-generation-inference adopting a less friendly license, this seems like a good opportunity to add best-effort support for all Hugging Face transformers models that generate text, e.g. via `AutoModelForCausalLM` and `AutoModelForSeq2SeqLM`. This would let such models take advantage of vLLM's other serving features, while specific models can retain optimized implementations or gain them as they are implemented. TGI's generic implementations illustrate the approach:
- https://github.com/huggingface/text-generation-inference/blob/ecf6dc3a5a31c1b0e1ed48988ddf2416b5e35660/server/text_generation_server/models/causal_lm.py#L451
- https://github.com/huggingface/text-generation-inference/blob/ecf6dc3a5a31c1b0e1ed48988ddf2416b5e35660/server/text_generation_server/models/seq2seq_lm.py#L501
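
For context, a minimal sketch of what such a generic fallback path boils down to on the transformers side: loading an arbitrary checkpoint through the `Auto*` classes and generating without any model-specific code. The checkpoint names here are just examples, not anything vLLM-specific:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
)

# Decoder-only path: any AutoModelForCausalLM-compatible checkpoint
# ("gpt2" is only a placeholder example).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Encoder-decoder path: any AutoModelForSeq2SeqLM-compatible checkpoint
# ("t5-small" is likewise just an example).
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The best-effort support proposed here would essentially wrap this generic path behind vLLM's scheduler and batching, while registered model-specific implementations continue to take precedence.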