-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core][Frontend][Doc] Initial support for LLaVA-NeXT and GPT-4V Chat Completions API #3978
Conversation
- Refactor `OpenAIServingChat` and add function for loading image - Move `pillow` dev dependency to common - Add example chat template for LLaVA model
- Add general guide for using VLMs - Add LLavA to list of supported models
- Move `ServerRunner` to common file
- Incorrect loading of config (also rename `openai_api` to `image_openai`) - Incorrect await of stream generator
- Also, use the type definitions from `openai` directly
…ions API and legacy Completions API
I have just added support for LLaVA-NeXT, with one big caveat: the size of the input image is fixed, otherwise the feature size (i.e. number of |
f66a08f
to
72eb712
Compare
72eb712
to
175b819
Compare
These force pushes consolidate the fixes to the LLaVA test and example code. |
- Note that we now load the images directly instead of from `.pt` files
175b819
to
cb19743
Compare
@ywang96 I think that this PR is suffering from scope creep. Perhaps I should break apart the changes into smaller segments to facilitate the conversation in #4194? I could split the changes as follows, with each item being its own PR:
Edit: Added links to the child PRs. |
I agree - I think OpenAI API server will be a good starting point since the interface should agree with OpenAI protocol anyways, and I'm sorry that this PR suffered :/ One suggestion I have is for a big change like this - it's probably good to have a series of PRs anyways. Take a look at Speculative decoding or Chunked Prefill - those are great examples. |
I have created the child PRs. |
- These changes are propagated to the child PRs
- Note that LLaVA-1.5 has been refactored to facilitate this
All of the child PRs have been completed, so I'm closing this now. |
To combat scope creep, this PR has been split into smaller ones.
[Frontend] Support GPT-4V Chat Completions API #4200[Frontend] Add OpenAI Vision API Support #5237The branch associated with this PR has been frozen (except for critical fixes). Once all dependencies have been merged, I will compare this branch against the merged (
main
) branch to verify that I didn't miss any changes.