[Inference] Fix weight_only_int4 bug #9073
2. add llama3.1 and qwen2 ptq config
3. update quantization.md
…nto add_new_fakequant_type
Thanks for your contribution!
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

Coverage Diff
            develop     #9073      +/-
Coverage     53.56%    53.51%   -0.05%
Files           652       652
Lines        106397    105187    -1210
Hits          56987     56291     -696
Misses        49410     48896     -514
weight_quant_method appears in many places in the codebase; please check all of them across the whole repository.
LGTM
After a global check, I found that weight_quant_method was defined twice in argment.py; this PR removes the duplicate weight_quant_method field.
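For reviewers who want to repeat this kind of check, here is a minimal sketch of one way to spot a field that is defined more than once in a class body. It is not part of this PR: the helper `find_duplicate_fields`, the class name `QuantArgument`, and the default values in the sample source are all hypothetical and used only for illustration.

```python
import ast
from collections import Counter
from typing import Dict, List


def find_duplicate_fields(source: str) -> Dict[str, List[str]]:
    """Map each class name to the names assigned more than once in its body."""
    duplicates: Dict[str, List[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        names: List[str] = []
        for stmt in node.body:
            # Annotated assignment, e.g. `x: str = field(...)`
            if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                names.append(stmt.target.id)
            # Plain assignment, e.g. `x = 1`
            elif isinstance(stmt, ast.Assign):
                names.extend(t.id for t in stmt.targets if isinstance(t, ast.Name))
        repeated = sorted(name for name, count in Counter(names).items() if count > 1)
        if repeated:
            duplicates[node.name] = repeated
    return duplicates


# Hypothetical sample; the source is only parsed, never executed.
EXAMPLE = '''
from dataclasses import dataclass, field

@dataclass
class QuantArgument:
    weight_quant_method: str = field(default="example")
    quant_type: str = field(default="example")
    weight_quant_method: str = field(default="example")
'''

print(find_duplicate_fields(EXAMPLE))
# prints {'QuantArgument': ['weight_quant_method']}
```

Python silently keeps the last definition when a name is repeated in a class body, so a duplicate like this raises no error on its own; a static pass over the syntax tree is one way to surface it.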
PR types
Bug fixes
PR changes
Others
Description
Fix weight_only_int4 bug