Closed
Labels
release (Related to new version release)
Description
Anything you want to discuss about vllm.
We will make three releases over the next 3 weeks:
- v0.5.2 on Monday, July 15th.
- v0.5.3 by Tuesday, July 23rd.
- v0.6.0 after Monday, July 29th.
 
Blockers
- [Bugfix][Frontend] Fix missing `/metrics` endpoint #6463
- [CI/Build] Build on Ubuntu 20.04 instead of 22.04 #6517
- [BugFix] Fix use of per-request seed with pipeline parallel #6698
- Test that vLLM works with 405B, which has `num_kv_heads=8` instead of 16.
The reason for this pace is that we want to remove beam search (#6226), which unlocks a suite of scheduler refactorings to improve performance (for example, async scheduling to overlap scheduling with the forward pass). We want to release v0.5.2 ASAP to issue warnings and uncover new signals. We will then decide on the removal in v0.6.0. Normally we would deprecate slowly, stretching the process over a month or two. However, (1) the RFC has been open for a while, and (2) beam search is unfortunately on the critical path of the refactoring and performance enhancements.
Please feel free to add release blockers, but keep in mind that I will not slow down releases in the v0.5.* series except for critical bugs.