Qualcomm AI Engine Direct - Add submodule quant config setting #9355
Conversation
chunit-quic commented Mar 18, 2025
- Add API to qnn quantizer for setting submodule quant config (see the usage sketch below)
- Refine QnnQuantizer setting functions
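For orientation, here is a rough usage sketch of the resulting setting, following the predicate-based design worked out in the review below. The import path, the setter name set_submodule_qconfig_list, and the helper get_submodule_type_predicate are assumptions drawn from that discussion, not a confirmed API.

# Hedged sketch; names below are assumptions based on the review discussion.
from executorch.backends.qualcomm.quantizer.quantizer import (
    ModuleQConfig,
    QnnQuantizer,
    get_submodule_type_predicate,
)

quantizer = QnnQuantizer()
# Nodes traced from matching submodules get the paired config; unmatched
# nodes fall back to the quantizer's default quant config.
quantizer.set_submodule_qconfig_list(
    [(get_submodule_type_predicate("Linear"), ModuleQConfig())]
)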
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9355
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit 137228a with merge base 6adff9c. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "release notes: qualcomm"
Force-pushed 4101be2 to 1c8f72e
nn_module_stack = node.meta.get("nn_module_stack")
if nn_module_stack:
    module_source_str, module_str = list(nn_module_stack.values())[-1][
Indexing by -1 here: is the order of the dict guaranteed?
Hi @sxu,
Thank you for reviewing!
Yes, based on the description of nn_module_stack on pytorch.org, the order is determined by the stack trace. The same -1 indexing logic is also applied elsewhere in the repo. A concrete sketch follows.
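To make the ordering concrete, here is a simplified sketch of nn_module_stack for an op inside a nested self.head submodule; the exact keys and value formats are assumptions and vary across torch versions.

# Simplified illustration: nn_module_stack is insertion-ordered from the
# outermost module down to the module that directly owns the op.
nn_module_stack = {
    "L__self__": ("", "my_models.Network"),                        # outermost
    "L__self___head": ("head", "torch.nn.modules.linear.Linear"),  # innermost
}
# Python dicts preserve insertion order, so [-1] picks the innermost entry.
module_source_str, module_str = list(nn_module_stack.values())[-1][-1].rsplit(".", 1)
print(module_source_str, module_str)  # torch.nn.modules.linear Linear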
module_source_str, module_str = list(nn_module_stack.values())[-1][
    -1
].rsplit(".", 1)
module_source = importlib.import_module(module_source_str)
The module might not be available for import, for example when the output of torch.export.export is produced ahead of time. If I'm not mistaken, loading the exported program doesn't require the source modules to be available.
Could you please provide a failing example? To my knowledge, you must have a Torch neural network to invoke torch.export.export, and to have a Torch neural network, you must import the corresponding modules. Therefore, it seems to me that the module is guaranteed to be available for import here.
Quick update. Thanks to your comments, we found that in some cases certain modules failed to import. Switching to a string-based comparison bypasses this issue (see the sketch below).
One more thing, could you please share more thoughts on this comment? Thank you :)
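For reference, a minimal sketch of the string-based comparison mentioned above, assuming the value tuples in nn_module_stack end with a fully qualified class-name string; this is an illustration of the idea, not the exact patch.

def is_node_from_module_type(node, qualified_type_name: str) -> bool:
    # Compare the class name recorded in nn_module_stack directly, e.g.
    # "torch.nn.modules.linear.Linear". No import happens, so this also
    # works when the source module cannot be imported at load time.
    nn_module_stack = node.meta.get("nn_module_stack")
    if not nn_module_stack:
        return False
    return list(nn_module_stack.values())[-1][-1] == qualified_type_name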
self.per_channel_quant_config = get_ptq_per_channel_quant_config()
self.use_per_channel_weight_quant_ops: Set[OpOverload] = set()
self.default_quant_config = ModuleQConfig()
self.module_qconfig_dict: Dict[torch.nn.Module, ModuleQConfig] = {}
This works for setting different qconfigs by submodule type, but another important use case is by submodule name. For example, both self.backbone and self.head contain linear layers, but we want 8-bit activation for the backbone and 16-bit for the head.
How about making this a List[Tuple[Callable[[fx.Node], bool], ModuleQConfig]]? The first element of the tuple is a predicate determining whether the qconfig should be applied to a given node. Each predicate in the list is evaluated sequentially, so earlier ones have priority over later ones. Then there can be some utilities for creating common predicates, for example:
def get_submodule_type_predicate(module_type):
    def predicate(node):
        if nn_module_stack := node.meta.get("nn_module_stack"):
            return module_type in (x[1] for x in nn_module_stack.values())
        return False
    return predicate
or
def get_submodule_name_predicate(module_name):
    def predicate(node):
        if nn_module_stack := node.meta.get("nn_module_stack"):
            # The first element of each value is the submodule's qualified name.
            return module_name in (x[0] for x in nn_module_stack.values())
        return False
    return predicate
which can be used like this:
predicate_qconfigs = [
    (get_submodule_name_predicate("self.head"), my_16a8w_qconfig),
    (get_submodule_type_predicate("torch.nn.Linear"), my_8a8w_qconfig),
    (lambda n: True, default_qconfig),
]
Thank you very much for providing the use cases and pseudocode!
They clarify some of the requirements we were unsure about. I have a few questions to ensure we fully understand the requirements:
- So basically, all submodules matching the given name or submodule type should share the same qconfig, correct?
- Is there any data structure more robust than a string? We have found it challenging to find one after export. As in your example, it seems we can only rely on the strings from nn_module_stack, which led us to use importlib for additional assurance.
- Similar to point 2, the name self.head is defined by users in nn.Module, so an arbitrary name might appear in any module, which could cause us to accidentally set a qconfig on an unintended target. Asking for more information (like the module type the user wants) can work around this, but I suspect we will need a more reliable data structure to search with in the future.
Just a quick update. We changed the mapping approach based on your comments; a sketch of the resulting dispatch is below. Please feel free to let us know of any problems. Thank you. :D
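The agreed-on dispatch can be summarized with a short sketch; the attribute and method names here are assumptions, not necessarily the merged implementation.

def _get_quant_config(self, node) -> ModuleQConfig:
    # Predicates are evaluated in registration order, so earlier entries
    # take priority; unmatched nodes fall back to the default config.
    for predicate, qconfig in self.submodule_qconfig_list:
        if predicate(node):
            return qconfig
    return self.default_quant_config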
Force-pushed 1c8f72e to 89673b6
Sorry I missed this for the last couple of days, but thanks for the follow-up! LGTM.
- Add API to qnn quantizer for setting submodule quant config
- Change to string-based way to set up qconfig for submodule
Force-pushed 89673b6 to f2ce0e7
There is a lint error, can you fix it?
Thanks for pointing that out! Fixed.
Hi @cccclai, if this PR looks good to you, could you please help merge it? Thank you. :)
Yes, sorry for being late, merging.
- Add API to qnn quantizer for setting submodule quant config
- Refine QnnQuantizer setting functions
Co-authored-by: Chun-I Tsai <chunit@qti.qualcomm.com>