Cannot start Phi-3-mini-128k-instruct from Docker #550

@annadmitrieva

Description

System Info

Using two official Docker images (latest and main).

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

After running
docker run --gpus all -p 8080:80 -v /home/ubuntu/lorax_weight_cache ghcr.io/predibase/lorax:latest --model-id microsoft/Phi-3-mini-128k-instruct --trust-remote-code

I get:

  File "/opt/conda/bin/lorax-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 83, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 309, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 243, in serve_inner
    model = get_model(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/__init__.py", line 251, in get_model
    return FlashPhi3(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/flash_phi3.py", line 88, in __init__
    model = FlashPhi3ForCausalLM(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 482, in __init__
    self.model = FlashPhi3Model(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 422, in __init__
    [
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 423, in <listcomp>
    FlashPhi3Layer(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 360, in __init__
    self.self_attn = FlashPhi3Attention(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 190, in __init__
    self.rotary_emb = PositionRotaryEmbedding.static(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/layers.py", line 879, in static
    scaling_factor = rope_scaling["factor"]
KeyError: 'factor'

After running
docker run --gpus all -p 8080:80 -v /home/ubuntu/lorax_weight_cache ghcr.io/predibase/lorax:main --model-id microsoft/Phi-3-mini-128k-instruct --trust-remote-code=True

I get:

  File "/opt/conda/bin/lorax-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 87, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 371, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 251, in serve_inner
    model = get_model(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/__init__.py", line 275, in get_model
    return FlashPhi3(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/flash_phi3.py", line 88, in __init__
    model = FlashPhi3ForCausalLM(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 482, in __init__
    self.model = FlashPhi3Model(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 422, in __init__
    [
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 423, in <listcomp>
    FlashPhi3Layer(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 360, in __init__
    self.self_attn = FlashPhi3Attention(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_phi3_modeling.py", line 190, in __init__
    self.rotary_emb = PositionRotaryEmbedding.static(
  File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/layers.py", line 479, in static
    raise NotImplementedError(f"rope scaling type {rope_type} is not implemented or invalid")
NotImplementedError: rope scaling type longrope is not implemented or invalid

The --trust-remote-code flag is not the issue: running the same commands without it produces the same errors.
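For context, both failures appear to come from the same root cause: Phi-3-mini-128k-instruct's config.json declares rope_scaling with type "longrope" and per-dimension factor lists, rather than a single scalar "factor" as used by linear/dynamic scaling. The sketch below is a hypothetical illustration of the two code paths, not the actual lorax_server source; the dict contents are abbreviated and the helper names are made up:

```python
# Abbreviated stand-in for the rope_scaling block in the model's
# config.json (real "long_factor"/"short_factor" lists have one entry
# per rotary dimension).
rope_scaling = {
    "type": "longrope",
    "long_factor": [1.0, 1.1],
    "short_factor": [1.0, 1.0],
}

def static_latest(rope_scaling):
    # Hypothetical sketch of the :latest image's path: it reads a scalar
    # "factor" unconditionally, so a longrope config raises KeyError.
    return rope_scaling["factor"]

def static_main(rope_scaling):
    # Hypothetical sketch of the :main image's path: it dispatches on the
    # scaling type, but has no branch for "longrope".
    rope_type = rope_scaling.get("type")
    if rope_type in ("linear", "dynamic"):
        return rope_scaling["factor"]
    raise NotImplementedError(f"rope scaling type {rope_type} is not implemented or invalid")
```

Under that reading, supporting this model would require a dedicated longrope branch that consumes the long/short factor lists, not just a fix to the "factor" lookup.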

Expected behavior

The server should start successfully.
