
Bump vllm to v0.4.2 #7198

Merged
merged 1 commit into main from patch-1 on May 29, 2024

Conversation

@kebe7jun (Contributor) commented May 9, 2024

No description provided.

@rmccorm4 (Collaborator) commented May 9, 2024

Hi @kebe7jun, thanks for submitting the PR.

Can you elaborate on the specific models or features you're interested in that require this version upgrade?

CC @oandreeva-nv @tanmayv25

@rmccorm4 rmccorm4 requested a review from oandreeva-nv May 9, 2024 23:19
@rmccorm4 rmccorm4 added the module: backends label (Issues related to the backends) May 9, 2024
@oandreeva-nv (Contributor) commented:
Started internal CI: 14897042
@kebe7jun, by any chance, have you already submitted the Triton CLA?

@kebe7jun (Contributor, Author) commented:
I need the Llama 3 optimizations from 0.4.1, as well as Phi-3-mini support from 0.4.2. See: https://github.com/vllm-project/vllm/releases

I have already signed the CLA, and I have had a PR merged before.

@rmccorm4 (Collaborator) commented May 21, 2024

Started internal CI: 14897042
...
tritonclient.utils.InferenceServerException: [StatusCode.INVALID_ARGUMENT] load failed for model 'vllm_opt': version 1 is at UNAVAILABLE state: Internal: AttributeError: module 'pynvml' has no attribute 'nvmlDeviceGetP2PStatus'

Looks like this might need a different version of pynvml or something. CC @oandreeva-nv @pskiran1 @tanmayv25
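For reference, a minimal, hypothetical check (not part of this PR) of whether the pynvml module actually picked up at runtime exposes the attribute vLLM 0.4.2 calls; a missing attribute usually points at an older or conflicting pynvml installation:

```python
# Hypothetical diagnostic sketch: show which pynvml module is imported and
# whether it exposes nvmlDeviceGetP2PStatus, the symbol the traceback
# reports as missing.
import pynvml

print("pynvml loaded from:", pynvml.__file__)
print("has nvmlDeviceGetP2PStatus:", hasattr(pynvml, "nvmlDeviceGetP2PStatus"))
```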

@oandreeva-nv (Contributor) commented:
@rmccorm4, yes, I was working on this. The latest vLLM adds pynvml to its requirements, so it conflicts with whatever we re-install in the multi-GPU tests, and that causes the failure. I did some initial tests removing pynvml from the tests, but they still failed. I didn't get a chance to debug further because I was re-assigned.
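One way such a conflict can be surfaced (a sketch using commonly seen PyPI distribution names as assumptions, not the actual CI scripts) is to list every installed distribution that can provide the pynvml module:

```python
# Sketch: report the versions of distributions that commonly provide the
# `pynvml` module, to spot a test-installed package shadowing the one pulled
# in by vLLM's requirements. The distribution names checked here are
# assumptions, not taken from the CI configuration.
from importlib import metadata

for dist in ("pynvml", "nvidia-ml-py", "nvidia-ml-py3"):
    try:
        print(dist, metadata.version(dist))
    except metadata.PackageNotFoundError:
        print(dist, "not installed")
```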

@pskiran1 (Member) commented:
vLLM backend PR: triton-inference-server/vllm_backend#43
Latest internal CI: 15397809

@pskiran1 (Member) commented:
@kebe7jun, please rebase your branch with the latest main. Thank you.

@oandreeva-nv (Contributor) left a comment
@pskiran1, this branch has no conflicts, so a rebase is unnecessary IMO. Feel free to merge this PR along with yours.

@pskiran1 pskiran1 merged commit a83d28a into triton-inference-server:main May 29, 2024
3 checks passed
@kebe7jun kebe7jun deleted the patch-1 branch May 29, 2024 23:20
Labels
module: backends Issues related to the backends
5 participants