Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Phi-4-mini-instruct uses the Phi3ForCausalLM architecture, but conversion fails:
python D:\repos-git\llama.cpp\convert_hf_to_gguf.py --outtype f16 ..\Phi-4-mini-instruct\ --outfile Phi-4-mini-instruct-F16.gguf
INFO:hf-to-gguf:Loading model: Phi-4-mini-instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
Traceback (most recent call last):
File "D:\repos-git\llama.cpp\convert_hf_to_gguf.py", line 5112, in <module>
main()
File "D:\repos-git\llama.cpp\convert_hf_to_gguf.py", line 5106, in main
model_instance.write()
File "D:\repos-git\llama.cpp\convert_hf_to_gguf.py", line 439, in write
self.prepare_tensors()
File "D:\repos-git\llama.cpp\convert_hf_to_gguf.py", line 280, in prepare_tensors
for name, data_torch in chain(self.generate_extra_tensors(), self.get_tensors()):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\repos-git\llama.cpp\convert_hf_to_gguf.py", line 2568, in generate_extra_tensors
raise ValueError(f'The length of rope long and short factors must be {rope_dims / 2}')
ValueError: The length of rope long and short factors must be 64.0
Tested using the conversion script from build b4783.
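For reference, the ValueError above comes from a length check on the rope_scaling factor lists in the model's config.json. A minimal sketch of such a check, assuming the expected length is half the number of rotary dimensions per head; the function name and the partial_rotary_factor handling here are illustrative, not the converter's exact code:

```python
def check_rope_factors(config: dict) -> None:
    """Validate that rope_scaling long/short factor lists have the
    expected length (half the rotary dimensions per head)."""
    rs = config.get("rope_scaling")
    if rs is None:
        return  # no scaling configured; nothing to validate

    head_dim = config["hidden_size"] // config["num_attention_heads"]
    # Some Phi configs rotate only part of each head via partial_rotary_factor;
    # a converter that ignores this field computes the wrong expected length.
    rope_dims = int(head_dim * config.get("partial_rotary_factor", 1.0))

    for key in ("long_factor", "short_factor"):
        factors = rs.get(key, [])
        if len(factors) != rope_dims // 2:
            raise ValueError(
                f"The length of rope long and short factors must be {rope_dims / 2}"
            )
```

If the factor lists in config.json don't match what the converter derives from hidden_size, num_attention_heads, and partial_rotary_factor, this check fires exactly as shown in the traceback.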
After disabling rope scaling in config.json ("rope_scaling": null), conversion fails at the pre-tokenizer instead:
File "D:\repos-git\llama.cpp\convert_hf_to_gguf.py", line 716, in get_vocab_base_pre
raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
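For context, get_vocab_base_pre() recognizes a pre-tokenizer by hashing the token ids the model's tokenizer produces for a fixed test string and comparing the digest against a table of known hashes, so an unrecognized tokenizer needs a new entry (normally generated via convert_hf_to_gguf_update.py). A minimal sketch of that mechanism; the table contents and helper name here are hypothetical:

```python
import hashlib

# Hypothetical stand-in for the known-hash table; the real entries live in
# get_vocab_base_pre() and are regenerated by convert_hf_to_gguf_update.py.
KNOWN_PRE_TOKENIZERS = {
    # "sha256-hex-of-token-ids": "pre-tokenizer name",
}

def identify_pre_tokenizer(token_ids: list[int]):
    """Hash the token ids a tokenizer produces for a fixed test string and
    look the digest up in the known-hash table; None means unrecognized."""
    chkhsh = hashlib.sha256(str(token_ids).encode()).hexdigest()
    return KNOWN_PRE_TOKENIZERS.get(chkhsh), chkhsh

# token_ids would come from tokenizer.encode(...) on the real model
pre, chkhsh = identify_pre_tokenizer([15339, 1917])
if pre is None:
    print(f"BPE pre-tokenizer was not recognized (chkhsh={chkhsh})")
```

An unknown digest is why the converter raises NotImplementedError here: Phi-4-mini-instruct's tokenizer hash is simply not in the table yet.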
Motivation
The Phi series has been supported so far, and people are likely interested in Phi-4-mini-instruct as well.
Possible Implementation
No response