getting error during inference "Unsupported head size: 32" #86

abhijithnair1 · 2023-11-30T17:38:37Z

Information

Docker
The CLI directly

Tasks

An officially supported command
My own modifications

Reproduction

I tried LoRA fine-tuning a smaller variant of the Mistral architecture, but I get the error below during inference:

GenerationError: Request failed during generation: Server error: Unsupported head size: 32

I used rank: 16, alpha: 32

https://huggingface.co/Locutusque/TinyMistral-248M-Instruct

Expected behavior

It should have worked, since the model follows the Mistral architecture. (TinyLlama worked fine.)
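For readers comparing the two models: the head size in the error is typically the hidden size divided by the number of attention heads. The sketch below shows that arithmetic. The config values are assumptions that are not stated in this thread, so check each model's `config.json` before relying on them, and `head_size` is a hypothetical helper, not part of any library.

```python
def head_size(hidden_size: int, num_attention_heads: int) -> int:
    """Per-head dimension, assuming it equals hidden_size / num_attention_heads."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return hidden_size // num_attention_heads


# Assumed config values (not taken from this thread); verify against config.json.
tiny_mistral = head_size(hidden_size=1024, num_attention_heads=32)
tiny_llama = head_size(hidden_size=2048, num_attention_heads=32)

print(tiny_mistral, tiny_llama)
```

Under those assumed values, the Mistral variant lands on a head size of 32 and TinyLlama on 64, which would be consistent with one failing and the other working even though both have a Llama-style architecture.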


tgaddair · 2023-11-30T18:31:09Z

Looks like the issue is coming from Paged Attention (vLLM): vllm-project/vllm#1455

There's a solution proposed in that thread which is pretty straightforward, but will require us to fork vllm and add a case for 32 here:

https://github.com/vllm-project/vllm/blob/1f24755bf802a2061bd46f3dd1191b7898f13f45/csrc/attention/attention_kernels.cu#L610

Happy to make this change if this model is important for you!
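Until a case for 32 exists in the kernel, a pre-flight check can catch the problem before a request reaches the server. This is only a minimal sketch: the set of supported sizes is an assumption based on a reading of the linked kernel file at that commit and may differ in other versions, and `check_head_size` is a hypothetical helper rather than a vLLM or server API.

```python
# Assumed list of head sizes the paged attention kernel dispatches on;
# confirm against the attention_kernels.cu version you actually build.
ASSUMED_SUPPORTED_HEAD_SIZES = frozenset({64, 80, 96, 112, 128, 256})


def check_head_size(head_size: int, supported=ASSUMED_SUPPORTED_HEAD_SIZES) -> int:
    """Return head_size if it is in the supported set, otherwise raise ValueError."""
    if head_size not in supported:
        raise ValueError(f"Unsupported head size: {head_size}")
    return head_size


try:
    check_head_size(32)
except ValueError as err:
    print(err)
```

Adding 32 to the set mirrors what the proposed fix would do in the kernel's dispatch: one more accepted case, with nothing else changed.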

abhijithnair1 · 2023-11-30T18:37:46Z

@tgaddair It's not that important for us right now. I was just trying it out as an alternative to GPT2 adapters (which are not working). I'll keep an eye on this thread for updates.

tgaddair added the "bug" label · Nov 30, 2023
