Allow passing hf config args with openai server #2547
I believe there's no fundamental reason against this. Contributions welcome! I would say you can add this to the ModelConfig class and pass it through EngineArgs. |
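A minimal sketch of that suggestion, assuming a hypothetical hf_config_overrides field; the names and plumbing below are illustrative, not actual vLLM code:

# Hypothetical sketch (not vLLM's real ModelConfig): add an
# "hf_config_overrides" dict to EngineArgs/ModelConfig and apply it
# on top of the HF config once it has been loaded.
from typing import Any, Dict, Optional
from transformers import AutoConfig

class ModelConfig:
    def __init__(self, model: str,
                 hf_config_overrides: Optional[Dict[str, Any]] = None) -> None:
        self.hf_config = AutoConfig.from_pretrained(model)
        # Apply user-supplied overrides on top of the loaded HF config.
        for key, value in (hf_config_overrides or {}).items():
            setattr(self.hf_config, key, value)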
I will take a look at this |
Does anyone have news about this? I want to use --dtype, but it doesn't work. |
@mrPsycox |
Thanks @Aakash-kaushik, I found the issue. Passing this works for me: |
Just as a workaround, I am currently doing something like this:

import os
from contextlib import contextmanager

from vllm import LLM

@contextmanager
def swap_files(file1, file2):
    # Temporarily swap the two config files so the model directory points at
    # the modified config while the engine loads, then restore the originals.
    try:
        temp_file1 = file1 + '.temp'
        temp_file2 = file2 + '.temp'
        print("Renaming Files.")
        os.rename(file1, temp_file1)
        os.rename(file2, file1)
        os.rename(temp_file1, file2)
        yield
    finally:
        # Swap back even if model loading raised an exception.
        print("Restoring Files.")
        os.rename(file2, temp_file2)
        os.rename(file1, file2)
        os.rename(temp_file2, file1)

file1 = '/path/to/original/config.json'
file2 = '/path/to/modified/config.json'

with swap_files(file1, file2):
    llm = LLM(...) |
I would love to see this as well |
@Aakash-kaushik @mrPsycox @timbmg @K-Mistele Please take a look at my PR and let me know if it serves your purpose. As @DarkLight1337 noted in my PR (#5836), what exactly do you want to accomplish using this feature that cannot otherwise be done via vLLM args? (If we don't have any situation that results in different vLLM output, what is the point of enabling this?) Once you get back to me, I'll write a test that covers that case. |
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you! |
Hi guys, just bumping this in case it's still relevant. Maybe not so much passing hf config args in general, but some examples of where this would be applicable include configuring RoPE scaling for Qwen and Llama models:
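For illustration, an override of roughly this shape is the kind of thing meant here; the exact fields and values are assumptions, not taken from the original comment:

# Hypothetical example of the HF-config override in question.
# Field names follow the HF rope_scaling convention; values are made up.
hf_overrides = {
    "rope_scaling": {
        "type": "yarn",  # e.g. YaRN-style scaling for longer context
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    }
}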
Maybe this is already implemented somewhere else? |
I proposed a similar feature in #5205, still looking for someone to implement it. |
Hi,
Is there a specific reason why we can't allow passing args from the OpenAI server to the HF config class? There are very reasonable use cases where I would want to override the existing args in a config while running the model dynamically through the server.
reference line
Simply allowing *args in the OpenAI server that are passed to this while loading the model should be enough; I believe there are internal checks that fail if anything configured is wrong anyway. Supporting documentation in the transformers library:
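As a reference for the transformers behavior being pointed at, here is a minimal sketch: AutoConfig.from_pretrained accepts extra keyword arguments, and any that match configuration attributes override the values loaded from config.json. The model name and override values below are placeholders, not from the original post.

from transformers import AutoConfig

# Keyword arguments that match config attributes override the values
# loaded from the model's config.json.
config = AutoConfig.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",                      # placeholder model name
    rope_scaling={"type": "yarn", "factor": 4.0},  # illustrative override
)
print(config.rope_scaling)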