Aot compiler fix #9634
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9634

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure as of commit ccf664e with merge base 4b8ac94. The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 10408cd to ccf664e.
Great!

phi_4_mini CI tests are failing with:

There is no error message with the run, so I don't think anything is actually failing; this test is just getting killed. Running it locally on my laptop seems to pass.
Seems fine; it doesn't seem like this change should have affected the phi test. If it's consistently getting killed, it might be OOMing.
Referenced by:

…53750)

* Update ExecuTorch pin to latest viable/strict 3/28/2025 (#150308)

  From latest viable/strict: https://hud.pytorch.org/hud/pytorch/executorch/viable%2Fstrict/1?per_page=50

  Fixes #144480. This commit has important CI stability fixes, such as pytorch/executorch#9561 and pytorch/executorch#9634.

  Pull Request resolved: #150308
  Approved by: https://github.com/jathu, https://github.com/malfet

* Use new hash from #150722
* Update executorch.txt

---------

Co-authored-by: Mergen Nachin <mnachin@meta.com>
### Summary
Changes:
1. When initializing Llama2 for the aot_compiler, since checkpoints can only be downloaded from Hugging Face, we initialize Llama2 with uninitialized weights. The problem with this is that when running quantization, we can run into errors with the histogram if the uninitialized values are NaN. We fix this by initializing the weights with zeros if no checkpoint is provided, which ensures that the quantization step still works (a sketch of the fallback follows this list).
2. Quant Type in the AoT compiler. Among the model options available to XNNPACK, everything is quantized with per-tensor static quantization. This isn't the best option for all of the available models: transformer-based models like Llama and MobileBert would likely prefer dynamically quantized per-channel weights, whereas CNNs like MobileNet would prefer statically quantized per-channel weights. We add this kind of Quant Type to the existing model options (see the quantizer sketch below, after the summary sentence). This also helps with test timeouts: per-tensor static quantization on a model like Llama can take a long time due to the introduction of MANY q/dq nodes and the complex partitions they create, so proposing partitions takes a long time due to the constant BFS to find the largest possible partition. By specifying the more apt quantization scheme, such as dynamic per-channel quantization, we avoid this complexity.
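For the first change, here is a minimal sketch of the zero-fill fallback. It assumes a generic `nn.Module`; the `init_weights_for_export` helper name is hypothetical, and the actual aot_compiler wiring differs:

```python
# Hypothetical helper illustrating change 1: zero-fill the weights when no
# checkpoint is available so observer histograms never see NaN values.
from typing import Optional

import torch
import torch.nn as nn


def init_weights_for_export(
    model: nn.Module, checkpoint_path: Optional[str] = None
) -> nn.Module:
    """Load real weights if a checkpoint exists; otherwise zero-fill them.

    Uninitialized memory can contain NaN, which makes histogram-based
    observers fail during calibration. Zero weights are numerically
    meaningless but keep the quantization flow working for CI.
    """
    if checkpoint_path is not None:
        model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    else:
        with torch.no_grad():
            for param in model.parameters():
                param.zero_()
    return model


# Usage with a stand-in module (no checkpoint, as in the CI path):
model = init_weights_for_export(nn.Linear(8, 8))
```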
Overall, this should help with the flaky [nan, nan] errors in the quantization histogram, and it should also help with CI timing out.
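For the second change, a rough sketch of selecting a quantization recipe per model family with the PT2E XNNPACKQuantizer. The `QuantType` enum and `make_quantizer` helper here are illustrative rather than the exact aot_compiler code, and the quantizer import path has moved between PyTorch/ExecuTorch releases:

```python
# Sketch of change 2: pick a quantization recipe per model family instead of
# using per-tensor static quantization everywhere. QuantType is illustrative.
from enum import Enum

from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)


class QuantType(Enum):
    STATIC_PER_TENSOR = 1    # previous default for all example models
    STATIC_PER_CHANNEL = 2   # better for CNNs like MobileNet
    DYNAMIC_PER_CHANNEL = 3  # better for transformers like Llama, MobileBert


def make_quantizer(quant_type: QuantType) -> XNNPACKQuantizer:
    is_per_channel = quant_type in (
        QuantType.STATIC_PER_CHANNEL,
        QuantType.DYNAMIC_PER_CHANNEL,
    )
    is_dynamic = quant_type is QuantType.DYNAMIC_PER_CHANNEL
    config = get_symmetric_quantization_config(
        is_per_channel=is_per_channel,
        is_dynamic=is_dynamic,
    )
    return XNNPACKQuantizer().set_global(config)


# Transformer-style models get dynamic per-channel weights:
quantizer = make_quantizer(QuantType.DYNAMIC_PER_CHANNEL)
```

Dynamic quantization computes activation qparams at runtime rather than observing every activation statically, so it tends to introduce fewer q/dq nodes, which is part of why the partition-proposal step gets cheaper.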
### Test plan
OSS XNNPACK CI for all model delegation
cc @digantdesai @cbilgin