Bump vllm to v0.4.2 #7198
Conversation
Hi @kebe7jun, thanks for submitting the PR. Can you elaborate on the specific models or features you're interested in that require this version upgrade?
I need the Llama 3 optimizations from 0.4.1, as well as Phi-3-mini support from 0.4.2. See: https://github.com/vllm-project/vllm/releases. I have already signed the CLA, and I have had PRs merged before.
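For context, a quick way to confirm Phi-3-mini works once the bump lands is a small generation smoke test. The sketch below is only illustrative and not part of this PR; it assumes vLLM >= 0.4.2 is installed and that the `microsoft/Phi-3-mini-4k-instruct` checkpoint is reachable.

```python
# Illustrative smoke test (assumption: vLLM >= 0.4.2 and the Phi-3-mini
# checkpoint are available in the environment).
from vllm import LLM, SamplingParams

# Phi-3-mini may require trusting remote model code from the Hub.
llm = LLM(model="microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)
params = SamplingParams(temperature=0.0, max_tokens=32)

outputs = llm.generate(["What is the capital of France?"], params)
print(outputs[0].outputs[0].text)
```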
Looks like this might need a different version of
@rmccorm4, yes, I was working on this. In the latest vLLM they've added installation of
vLLM backend PR: triton-inference-server/vllm_backend#43
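One way to sanity-check that an environment actually picked up the bumped dependency is a small version guard; the snippet below is only a hedged sketch (the check itself is not part of this PR or the linked backend PR).

```python
# Illustrative check (assumption): fail fast if the installed vllm package
# is older than the version this bump targets.
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("vllm"))
required = Version("0.4.2")

if installed < required:
    raise RuntimeError(f"vllm {installed} found, but >= {required} is required")
print(f"vllm {installed} satisfies the >= {required} requirement")
```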
@kebe7jun, please rebase your branch with the latest main. Thank you.
LGTM with @pskiran1 changes: triton-inference-server/vllm_backend#43
@pskiran1, this branch has no conflicts, so a rebase is unnecessary IMO. Feel free to merge this PR along with yours.