microsoft/phi-2 model outputs nonsense on NPU #27
Comments
What driver version are you using?
Anyway, I can reproduce it myself. It goes away if you use fp16 inference.
I suggest you use that while I try to understand what in our quantization scheme is causing such an unacceptable accuracy drop. Many thanks for raising this issue. I'll keep you updated on this ticket.
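For readers who want to try the fp16 workaround, here is a minimal sketch. It assumes the `compile(model, dtype=...)` entry point shown in the library's README around the time of this issue (newer releases may expose a different API), and it assumes that `microsoft/phi-2` loads through the standard `transformers` classes. The function name `generate_on_npu` and its parameters are hypothetical, chosen only for illustration.

```python
# Prompt taken from the issue report below.
PROMPT = "Question: What's the distance between the Earth and the Moon?\nAnswer:"


def generate_on_npu(prompt: str = PROMPT, use_fp16: bool = True,
                    max_new_tokens: int = 32) -> str:
    """Compile phi-2 for the NPU and generate a short completion.

    Heavy imports live inside the function so that this file can be
    imported on machines without an NPU or the library installed.
    """
    import torch
    import intel_npu_acceleration_library
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/phi-2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).eval()

    # fp16 is the workaround suggested in this thread; int8 is the
    # quantized path that produced the nonsense output.
    dtype = torch.float16 if use_fp16 else torch.int8

    # Assumption: compile() accepts a `dtype` keyword, as in the README
    # at the time of this issue.
    model = intel_npu_acceleration_library.compile(model, dtype=dtype)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_on_npu()` and then `generate_on_npu(use_fp16=False)` on an affected version should make the difference between the two data types visible side by side.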
My Intel(R) AI Boost driver is version 32.0.100.2381. Thanks!
No problem. I'd like the int8/int4 version to work, especially for this PR (#20), as it will bring a significant performance boost. Here it seems like an issue in our quantization step. I'll keep you posted.
Fix in flight in #32.
Commit ae0a999 fixes your issue for phi-2.
* Add int4 support
* Fix dtypes
* Add dtypes test
* Add dtype to library
* Faster i8 to i4 compression
* hotfix
* Update the profile-llm script
* Add library
* fix script
* Update readme
* Add neural compressor and demo
* Use neural compressor as the default method
* hotfix
* Quantize only quantized models
* Add tests
* fix issue #27
Describe the bug
After I compile the microsoft/phi-2 model with intel_npu_acceleration_library, the model's output is complete nonsense. It just outputs text like: to- or in of ", as for, on, and, is,, and, are,., and,,,,, and,,,, and,,, and,,,, and,,, and,, and,,, and,,
To Reproduce
Steps to reproduce the behavior:
The output is:
Question: What's the distance between the Earth and the Moon?
Answer: to- or in of ", as for, on, and, is,, and, are,., and,,,,, and,,,, and,,, and,,,, and,,, and,, and,,, and,,, and,,, and,, and,, and,, and,, a....
Expected behavior
When running the initial model (the one compiled for CPU) the output is:
_Question: What's the distance between the Earth and the Moon?
Answer: The average distance from the Earth to the moon is about 238,855 miles._
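A rough way to flag this kind of regression automatically is to measure how much of the generated text is a single repeated word. The sketch below is a hypothetical helper, not part of intel_npu_acceleration_library; the metric and the idea of thresholding it are assumptions made for illustration, and it is no substitute for a proper accuracy check against the CPU output.

```python
import re
from collections import Counter


def top_token_share(text: str) -> float:
    """Return the fraction of word tokens taken up by the most common word.

    Punctuation is ignored and case is folded, so "and,", "and,,," and
    "And" all count as the same token. Empty input yields 0.0.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return 0.0
    most_common_count = Counter(tokens).most_common(1)[0][1]
    return most_common_count / len(tokens)


# The two answers quoted in this issue.
npu_answer = ('to- or in of ", as for, on, and, is,, and, are,., and,,,,, '
              'and,,,, and,,, and,,,, and,,, and,, and,,, and,,')
cpu_answer = ("The average distance from the Earth to the moon "
              "is about 238,855 miles.")

print(f"NPU answer: {top_token_share(npu_answer):.2f}")
print(f"CPU answer: {top_token_share(cpu_answer):.2f}")
```

On the two answers quoted here, the nonsense output scores well above the sensible one, because more than half of its words are "and". Any cutoff used in a test would need tuning for the model and prompt set in question.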
Desktop (please complete the following information):