
[New Model]: Support Phi-3 #4306

Closed
@alexkreidler

Description


The model to consider.

https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct

The closest model vLLM already supports.

Phi-2 (which uses the same Transformers model class as Phi-1)

What's your difficulty of supporting the model you want?

Support for LongRope #3575

I tried running Phi-3-mini-128k-instruct but got this error:

langbench-vllm-1  | Traceback (most recent call last):
langbench-vllm-1  |   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
langbench-vllm-1  |     return _run_code(code, main_globals, None,
langbench-vllm-1  |   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
langbench-vllm-1  |     exec(code, run_globals)
langbench-vllm-1  |   File "/workspace/vllm/entrypoints/openai/api_server.py", line 157, in <module>
langbench-vllm-1  |     engine = AsyncLLMEngine.from_engine_args(
langbench-vllm-1  |   File "/workspace/vllm/engine/async_llm_engine.py", line 331, in from_engine_args
langbench-vllm-1  |     engine_config = engine_args.create_engine_config()
langbench-vllm-1  |   File "/workspace/vllm/engine/arg_utils.py", line 406, in create_engine_config
langbench-vllm-1  |     model_config = ModelConfig(
langbench-vllm-1  |   File "/workspace/vllm/config.py", line 125, in __init__
langbench-vllm-1  |     self.max_model_len = _get_and_verify_max_len(self.hf_text_config,
langbench-vllm-1  |   File "/workspace/vllm/config.py", line 969, in _get_and_verify_max_len
langbench-vllm-1  |     assert "factor" in rope_scaling
langbench-vllm-1  | AssertionError

because the relevant part of Phi-3's config.json is structured differently to support LongRoPE:

"rope_scaling": {
    "long_factor": [
      1.0299999713897705,
      1.0499999523162842,
      1.0499999523162842,
      1.0799999237060547,
      1.2299998998641968,
      1.2299998998641968,
      <truncated>
    ],
    "short_factor": [
      1.05,
      1.05,
      1.05,
      1.1,
      1.1,
      1.1500000000000001,
      1.2000000000000002,
      1.2500000000000002,
      <truncated>
    ],
    "type": "longrope"
  },
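A minimal sketch of how the failing check could branch on the LongRoPE schema shown above. This is a hypothetical function, not vLLM's actual `_get_and_verify_max_len` code; the function name, the fallback behavior, and the assumption that the LongRoPE context length comes from `max_position_embeddings` rather than the factor lists are all illustrative.

```python
# Hedged sketch (not vLLM's real implementation): a length check that
# tolerates LongRoPE configs, whose rope_scaling block carries
# "long_factor"/"short_factor" lists instead of a scalar "factor".

def get_scaled_max_len(rope_scaling: dict, derived_max_len: int) -> int:
    """Return the context length implied by a rope_scaling config.

    `rope_scaling` mirrors the config.json fragment above; the fallback
    branch is the plain "factor" schema the current assertion expects.
    """
    rope_type = rope_scaling.get("type")
    if rope_type == "longrope":
        # LongRoPE rescales RoPE per dimension; assumed here: the factor
        # lists do not multiply the max length, which is taken directly
        # from the model's max_position_embeddings.
        return derived_max_len
    # Classic scaling schemes (e.g. linear, dynamic) carry a scalar factor.
    assert "factor" in rope_scaling, "rope_scaling missing 'factor'"
    return int(derived_max_len * rope_scaling["factor"])


# A Phi-3-style config no longer trips the assertion:
phi3_scaling = {"type": "longrope", "long_factor": [1.03], "short_factor": [1.05]}
print(get_scaled_max_len(phi3_scaling, 131072))  # -> 131072
```

With a branch like this, the `AssertionError` above would only fire for scaling types that genuinely require a scalar `factor`, instead of rejecting every config that lacks the key.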

There may be other changes in the new modeling code that vLLM needs to support.

Labels

new-model (Requests to new models)
