[v0.4.0] Release Tracker #3155
Comments
@zhuohan123 do you think the second part for int8 inference should also be added to the roadmap?
Would it be possible to request that the following be merged prior to release: #2961? It is preventing us from deploying models with vLLM in certain contexts.
This amazing MR created by @chu-tianxiang was originally supposed to be merged into the previous version. Hope it can be merged this time. The speed improvement is very noticeable.
@zhuohan123 Hope Prefix Caching with FP8 KV cache support (#3234) can be merged.
@zhuohan123 Would it be possible to request that #3233 be merged in this release?
CMake -> #2830
I'm planning to merge the usage reporting PR by Monday.
We very much hope the JAIS commit (4c07dd2) makes it into v0.3.4.
When do you plan to release v0.3.4? |
We have a hard deadline to release it this week. Sorry about the delay. |
Hi @JMHenri, please let me know if you face any issues with the JAIS models. Since we don't have a chat_template in the tokenizer, feel free to reach out with any questions. You can certainly look into this for reference on usage.
ETA: Before Mar 28th
Major changes
TBD.
PRs to be merged before the release