[High Risk]Refine inference code #840
Pull Request Overview
This is a high-risk refactoring of the inference code that modernizes the backend selection system and improves code maintainability. The PR removes legacy features and refactors the quantization layer replacement logic.
- Removes the deprecated `clip` parameter from quantization schemes
- Refactors backend selection to use structured packing formats and simplifies the layer replacement logic
- Updates backend configuration with more consistent naming and format specifications
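To make the backend-selection bullet concrete, here is a minimal sketch of what choosing a backend from a structured packing format could look like. It is not the code in this PR: `BackendInfo`, `BACKENDS`, `select_backend`, the format strings, and the priorities are all hypothetical names chosen for illustration, and the real logic in `auto_round/inference/backend.py` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendInfo:
    """Hypothetical backend description; not auto_round's actual class."""

    name: str
    packing_formats: tuple[str, ...]  # packing formats this backend can load
    bits: tuple[int, ...]             # supported weight bit widths
    priority: int = 0                 # higher wins when several backends match


# Illustrative registry; the backend names and format strings are made up.
BACKENDS = [
    BackendInfo("reference_torch", ("int32_packed",), (2, 4, 8), priority=0),
    BackendInfo("fast_kernel", ("int32_packed",), (4,), priority=2),
    BackendInfo("zp_offset_kernel", ("int32_packed_zp",), (4, 8), priority=1),
]


def select_backend(packing_format: str, bits: int) -> BackendInfo:
    """Return the highest-priority backend supporting the given format and bit width."""
    candidates = [
        b for b in BACKENDS
        if packing_format in b.packing_formats and bits in b.bits
    ]
    if not candidates:
        raise ValueError(f"no backend supports {packing_format!r} at {bits} bits")
    return max(candidates, key=lambda b: b.priority)
```

Under these assumptions, matching on an explicit packing format keeps the selection a plain filter over declared capabilities, which is one way a refactor like this can simplify layer replacement.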
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| auto_round/schemes.py | Removes deprecated clip parameter from QuantizationScheme |
| auto_round/inference/convert_model.py | Major refactoring of model conversion logic with simplified backend selection and updated function signatures |
| auto_round/inference/backend.py | Updates backend configurations with new packing format structure and modernized type hints |
| auto_round/compressors/base.py | Removes warning about unsupported auto_round format loading |
| auto_round/main.py | Updates condition for model evaluation to check format type |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>