Adds support to use brevitas quantized weights for stateless_llama #179
IanNod commented Nov 17, 2023
- Modifies mm_group_quant to work with brevitas safetensors, needs work to generalize
- Changes compiler to use torch as input to enable quantization of torch ir
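For readers unfamiliar with group quantization, here is a minimal, illustrative sketch of the general idea behind a group-quantized matmul: integer weights are split into fixed-size groups along the input dimension, and each group carries its own scale and zero point. It is not taken from this patch. The function names, the `[rows][groups]` layout of `scales` and `zero_points`, and the `group_size` parameter are all assumptions made for illustration, and they may not match how mm_group_quant or the brevitas safetensors actually store things.

```python
# Illustrative only: hypothetical names and layout, not the code in this PR.

def dequantize_grouped(q_weights, scales, zero_points, group_size):
    """Dequantize a [rows][cols] integer weight matrix.

    Each run of `group_size` consecutive columns in a row shares one scale
    and one zero point, stored at scales[row][group] and
    zero_points[row][group] (an assumed layout).
    """
    out = []
    for r, row in enumerate(q_weights):
        if len(row) % group_size != 0:
            raise ValueError("row length must be a multiple of group_size")
        deq_row = []
        for c, q in enumerate(row):
            g = c // group_size  # index of the group this column belongs to
            deq_row.append((q - zero_points[r][g]) * scales[r][g])
        out.append(deq_row)
    return out


def matmul_transposed(x, w):
    """Compute x @ w.T for x of shape [m][k] and w of shape [n][k]."""
    return [[sum(a * b for a, b in zip(x_row, w_row)) for w_row in w]
            for x_row in x]


# Two output rows, four input columns, two groups of two columns per row.
q = [[0, 1, 2, 3], [4, 5, 6, 7]]
scales = [[0.5, 2.0], [1.0, 0.25]]
zeros = [[1, 2], [4, 6]]

w = dequantize_grouped(q, scales, zeros, group_size=2)
y = matmul_transposed([[1.0, 1.0, 1.0, 1.0]], w)
```

A real implementation would typically fuse the dequantization into the matmul rather than materializing the full-precision weights, which this sketch does only for clarity.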
Not exactly familiar with everything going on here, but left some thoughts/questions.
- Adds todo clarifying skipping of _params. in mm_group_quant for matching purposes
- Removes arg use in pipeline to make external use easier
LGTM, but I have not been working on this code.
Can you run black and make sure everything is formatted? A lot of whitespace changes.
Yup, already ran black, at least on the files changed in this patch.
@stellaraccident are you running default black or with
Default black. We need to add a lint check and should do what iree is doing.