Running an mlx_server on an imported mlx model fails #140

Open

jasonnathan opened this issue Aug 7, 2024 · 2 comments

Comments

@jasonnathan commented Aug 7, 2024

Running Transformer Lab v0.4.0 on a MacBook Pro (M1).

  1. Imported a quantized model converted from Hugging Face via:
python -m mlx_lm.convert \
--hf-path mistralai/Mistral-Nemo-Instruct-2407 \
-q
  2. On the model page, I cannot run the server. I have tried restarting the app. There are no settings to see how the MLX server is run, nor can I find any logs.

  3. I have tried running the MLX server manually with the same host and port, and it works:

% python -m mlx_lm.server \
> --host localhost \
> --port 21001 \
> --model ./mlx_model 

UserWarning: mlx_lm.server is not recommended for production as it only implements basic security checks.

2024-08-07 11:08:16,215 - INFO - Starting httpd at localhost on port 21001...
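A quick way to confirm the manually started server is actually answering requests is to hit its OpenAI-style /v1/chat/completions route (sketch only; the prompt and token limit below are placeholders):

curl http://localhost:21001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 32}'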

Please see the attached screenshot of the error in the app.

@dadmobile (Member) commented

Sorry for the unhelpful message. Just to make sure I understand: you converted the model on the command line first and then imported it into Transformer Lab using the Import functionality? I just want to try to reproduce this.

Have you run other models successfully? The error seems to be a problem reaching the FastChat server behind the scenes; I'm not sure whether that's a general error or something caused specifically by this model.
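One way to rule out the converted model itself would be to generate a few tokens from it directly on the command line (rough sketch; ./mlx_model is the local path from the report above):

python -m mlx_lm.generate \
  --model ./mlx_model \
  --prompt "Hello, how are you?"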

One possible path forward might be to download mistralai/Mistral-Nemo-Instruct-2407 using the download field at the bottom of the Model Zoo page and convert it to MLX using the Export tab in Transformer Lab. Or, actually, it looks like mlx-community also has versions of the model posted with different quantizations; see the sketch below.
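For example, pointing mlx_lm.server directly at one of those community conversions should also work (the exact repo id below is an assumption, so check the mlx-community page on Hugging Face for the quantization you want):

python -m mlx_lm.server \
  --host localhost \
  --port 21001 \
  --model mlx-community/Mistral-Nemo-Instruct-2407-4bit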

Regardless, I'd still love to understand what's causing the error.

@dadmobile (Member) commented

Just checking in to see if this is still an issue. I am unable to reproduce it, although I've only converted models through the app.
